Add Prometheus alerting rules for Cinder agents state

The new rules use the metric below to check whether any Cinder service is disabled or down:
- openstack_cinder_agent_state

If any series carries adminState="disabled", the warning rule fires.
If any enabled series has the value 0, meaning the service is down, the critical rule fires.

Change-Id: I77b9dd6d5fe18622998063be36a197a0aff83a10
Xuhui Zhu 2024-07-09 15:49:13 -04:00 committed by Gabriel Cocenza
parent 48460fd3d3
commit 8835290acd
2 changed files with 32 additions and 0 deletions

@@ -47,6 +47,14 @@ This charm automatically adds Prometheus alert rules using the files at
`src/prometheus_alert_rules` when related with `grafana-agent`.
The following alerts are configured by default:
- `CinderStateWarning`: This alert rule triggers when a Cinder service is disabled. The
exporter generates the metric openstack_cinder_agent_state, which checks the status of
Cinder services. Alerts will appear if any Cinder service is found to be disabled.
- `CinderStateCritical`: This alert rule triggers when a Cinder service is enabled, but down.
The exporter generates the metric openstack_cinder_agent_state, which checks the status of
Cinder services. Alerts will appear if any Cinder service is found to be down.
- `NeutronStateCritical`: This alert rule triggers when a Neutron agent is enabled, but down.
The exporter generates the metric openstack_neutron_agent_state, which checks the status
of neutron agents. Alerts will appear if any neutron agent is found to be down.

@@ -0,0 +1,24 @@
groups:
  - name: Cinder
    rules:
      - alert: CinderStateWarning
        expr: openstack_cinder_agent_state{adminState="disabled"}
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: Cinder service disabled. (Instance {{ $labels.hostname }})
          description: |
            The Cinder service is currently disabled on host {{ $labels.hostname }}.
            LABELS = {{ $labels }}
      - alert: CinderStateCritical
        expr: openstack_cinder_agent_state{adminState="enabled"} == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: Cinder service down. (Instance {{ $labels.hostname }})
          description: |
            The Cinder service is currently down on host {{ $labels.hostname }}.
            LABELS = {{ $labels }}
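
To make the selection logic of the two rules above easier to reason about, the following is a minimal Python sketch of how a single `openstack_cinder_agent_state` sample would be classified. It is not part of the commit: the `Sample` class, the `classify` function, and the `service` field are hypothetical names introduced for illustration, and it assumes that a value of 1 means the service is up (the commit only states that 0 means down). The `for: 5m` hold period is not modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Sample:
    """One openstack_cinder_agent_state sample (hypothetical model)."""
    hostname: str
    service: str       # illustrative label, not confirmed by the commit
    admin_state: str   # mirrors the adminState label: "enabled" or "disabled"
    value: int         # 0 = down; 1 = up is an assumption


def classify(sample: Sample) -> Optional[str]:
    """Return the severity the rules would select for this sample, or None."""
    # CinderStateWarning: openstack_cinder_agent_state{adminState="disabled"}
    # There is no value comparison, so any disabled series matches.
    if sample.admin_state == "disabled":
        return "warning"
    # CinderStateCritical: openstack_cinder_agent_state{adminState="enabled"} == 0
    if sample.admin_state == "enabled" and sample.value == 0:
        return "critical"
    return None


if __name__ == "__main__":
    samples = [
        Sample("host-1", "cinder-volume", "disabled", 1),
        Sample("host-2", "cinder-volume", "enabled", 0),
        Sample("host-3", "cinder-scheduler", "enabled", 1),
    ]
    for s in samples:
        print(s.hostname, classify(s))
```

Because the warning expression filters only on the `adminState` label, a disabled service matches it whether its value is 0 or 1, while the critical expression matches only enabled services reporting 0. The two rules therefore never select the same series.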