Change the expression of the defined alert in Prometheus to avoid false alerts

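There were some false alerts for volume_claim_capacity_high_utilization
because the wrong formula was used to determine the percentage of used
capacity. The old expression compared the available (free) fraction
against 0.80, so it fired on volumes that were mostly empty. The new
expression fires only when used capacity exceeds 80% of total capacity.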

Change-Id: I24afed7946f915e5e13f0ba759eca252c2598af9
Hemant 2019-06-05 14:15:07 +02:00 committed by Chris Wedgwood
parent b2f47aabb1
commit b9a9ee323b

@@ -1416,7 +1416,7 @@ conf:
     description: 'Pod {{$labels.pod}} in namespace {{$labels.namespace}} has a container terminated for more than 10 minutes'
     summary: 'Pod {{$labels.pod}} in namespace {{$labels.namespace}} in error status'
 - alert: volume_claim_capacity_high_utilization
-  expr: (kubelet_volume_stats_available_bytes / kubelet_volume_stats_capacity_bytes) > 0.80
+  expr: (kubelet_volume_stats_capacity_bytes / kubelet_volume_stats_used_bytes) < 1.25
   for: 5m
   labels:
     severity: page