Add alert rules when OpenStack services are down
The openstack-exporter generates metrics that report whether each service is up, e.g.:

- openstack_loadbalancer_up
- openstack_designate_up
- openstack_identity_up

This new rule uses a regex to match every metric name that starts with 'openstack_' and ends with '_up'. When the value is 0, meaning the service is down, the rule generates an individual alert for each affected service.

Change-Id: Ia3f6aced5dcbfa124b4340ad054a43a460284019
@@ -0,0 +1,16 @@
groups:
  - name: OpenStackServices
    rules:
      - alert: OpenStackServicesDown
        expr: |
          sum by(service) (
            label_replace({__name__=~"openstack_(.+)_up"}, "service", "$1", "__name__", "openstack_(.+)_up")
          ) == 0
        for: 5m
        labels:
          severity: critical
          service: "{{ $labels.service }}"
        annotations:
          summary: OpenStack Services Down
          description: |
            The OpenStack service {{ $labels.service }} is down
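
The behaviour described in the commit message can be checked with a promtool unit test. The following is a minimal sketch and is not part of this commit. The rule file name `openstack-services.rules.yaml` and the `instance` label value are assumptions made for illustration. It feeds one service that is down and one that is up, and expects a single alert for the service that is down once the 5m `for` period has elapsed.

```yaml
# Hypothetical promtool test file, run with: promtool test rules <this-file>
rule_files:
  - openstack-services.rules.yaml   # assumed name of the rule file added above

evaluation_interval: 1m

tests:
  - interval: 1m
    input_series:
      # identity reports 0 (down) for the whole test window
      - series: 'openstack_identity_up{instance="exporter:9180"}'
        values: '0+0x10'
      # loadbalancer reports 1 (up), so it must not produce an alert
      - series: 'openstack_loadbalancer_up{instance="exporter:9180"}'
        values: '1+0x10'
    alert_rule_test:
      # 6m is past the rule's "for: 5m", so the alert should be firing
      - eval_time: 6m
        alertname: OpenStackServicesDown
        exp_alerts:
          - exp_labels:
              severity: critical
              service: identity
            exp_annotations:
              summary: OpenStack Services Down
              description: |
                The OpenStack service identity is down
```

Because `sum by(service)` keeps only the `service` label, the expected alert carries no `instance` label. promtool compares the full set of firing alerts for the given alert name, so an unexpected alert for `loadbalancer` would make the test fail.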