Add alert rules when OpenStack services are down

The openstack-exporter generates metrics that report whether services
are up, e.g.:
- openstack_loadbalancer_up
- openstack_designate_up
- openstack_identity_up

This new rule uses a regex to identify all metric names that
start with 'openstack' and end with 'up'. If the result is 0,
meaning the service is down, it generates an individual alert for
every affected service.
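
The matching logic can be illustrated outside Prometheus. The sketch
below is only an approximation in Python, not how the rule is
evaluated: the function name services_down and the sample values are
hypothetical, and it assumes that PromQL regex matchers are fully
anchored (hence fullmatch). It extracts the service name from each
metric name, sums the values per service, and reports services whose
sum is 0.

```python
import re

# Same pattern as in the alert rule. PromQL anchors =~ matches at both
# ends, which corresponds to re.fullmatch here.
PATTERN = re.compile(r"openstack_(.+)_up")


def services_down(samples):
    """Return the sorted service names whose summed *_up value is 0.

    `samples` is a list of (metric_name, value) pairs; this helper is
    hypothetical and only mimics `sum by(service) (...) == 0`.
    """
    totals = {}
    for name, value in samples:
        match = PATTERN.fullmatch(name)
        if match is None:
            continue  # not an openstack_*_up metric
        service = match.group(1)
        totals[service] = totals.get(service, 0) + value
    return sorted(s for s, total in totals.items() if total == 0)


# Illustrative sample values, not real exporter output.
samples = [
    ("openstack_loadbalancer_up", 1),
    ("openstack_designate_up", 0),
    ("openstack_identity_up", 0),
    ("openstack_nova_agent_state", 0),  # ignored: does not end with _up
]
print(services_down(samples))
```

Each name in the returned list corresponds to one alert carrying that
value in its service label.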

Change-Id: Ia3f6aced5dcbfa124b4340ad054a43a460284019
Gabriel Cocenza
2024-07-09 11:37:31 -03:00
parent f3ab145e58
commit e3daf1ad07
2 changed files with 25 additions and 1 deletions

@@ -0,0 +1,16 @@
groups:
- name: OpenStackServices
  rules:
  - alert: OpenStackServicesDown
    expr: |
      sum by(service) (
        label_replace({__name__=~"openstack_(.+)_up"}, "service", "$1", "__name__", "openstack_(.+)_up")
      ) == 0
    for: 5m
    labels:
      severity: critical
      service: "{{ $labels.service }}"
    annotations:
      summary: OpenStack Services Down
      description: |
        The OpenStack service {{ $labels.service }} is down