ceph/monitoring/prometheus
Benoît Knecht 653c3f6682 monitoring: Fix "10% OSDs down" alert description
The alert was triggered when less than 90% of OSDs were _up_, but then the
description took that value and described it as the percentage of OSDs being
_down_. So with 12% of OSDs down, the alert description would read:

```
88% or 88 of 100 OSDs are down (>=10%).
```

which can be panic-inducing.

This commit changes the alert expression to actually compute the ratio of OSDs
being down, which makes the correct value appear in the description.
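
As a rough sketch of the idea (not copied from the commit itself), the rule could compute the percentage of OSDs that are down directly, so that `$value` in the description matches its wording. The `ceph_osd_up` metric name (1 for up, 0 for down), the group name, and the annotation text are assumptions here:

```yaml
# Sketch only: metric name, group name and annotation text are assumptions.
groups:
  - name: osd
    rules:
      - alert: 10% OSDs down
        # Before (conceptually): percentage of OSDs that are *up*, firing below 90.
        # With 12 of 100 OSDs down this evaluates to 88, which the description
        # then reported as the "down" percentage.
        # expr: sum(ceph_osd_up) / count(ceph_osd_up) * 100 < 90
        #
        # After: percentage of OSDs that are *down*, firing at 10 or above.
        # With 12 of 100 OSDs down this evaluates to 12.
        expr: count(ceph_osd_up == 0) / count(ceph_osd_up) * 100 >= 10
        annotations:
          description: "{{ $value | humanize }}% of OSDs are down (>= 10%)."
```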

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
2020-05-06 18:49:26 +02:00
alerts monitoring: Fix "10% OSDs down" alert description 2020-05-06 18:49:26 +02:00
README.md

Alerts

The monitoring/prometheus/alerts directory contains a set of Prometheus alert rules intended as sensible defaults for a Ceph cluster. Copy the rules file to whatever location the `rule_files` stanza of your Prometheus configuration points to.
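
A minimal sketch of the relevant part of `prometheus.yml`, assuming the rules file was copied into a hypothetical `/etc/prometheus/alerts/` directory (adjust the path to your own layout):

```yaml
# prometheus.yml (fragment)
# The directory below is a hypothetical location, not one mandated by Ceph.
rule_files:
  # Globs are allowed, so every rules file in the directory is loaded.
  - /etc/prometheus/alerts/*.yml
```

Prometheus only reads rule files at startup or on a configuration reload, so reload it after copying the file.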