mirror of https://github.com/ceph/ceph
653c3f6682
The alert was triggered when less than 90% of OSDs were _up_, but then the description took that value and described it as the percentage of OSDs being _down_. So with 12% of OSDs down, the alert description would read:

```
88% or 88 of 100 OSDs are down (>=10%).
```

which can be panic-inducing.

This commit changes the alert expression to actually compute the ratio of OSDs being down, which makes the correct value appear in the description.

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
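
The corrected computation can be sketched as a Prometheus rule. This is an illustration only, not the rule shipped in the repository: it assumes a gauge named `ceph_osd_up` that is 1 for an OSD that is up and 0 for one that is down, and the group name, alert name, and annotation text are hypothetical.

```yaml
groups:
  - name: ceph-osd-example            # hypothetical group name
    rules:
      - alert: ExampleOSDsDownRatio   # hypothetical alert name
        # Percentage of OSDs that are down: OSDs reporting 0 divided by all OSDs.
        # Assumes ceph_osd_up is 1 when an OSD is up and 0 when it is down.
        expr: count(ceph_osd_up == 0) / count(ceph_osd_up) * 100 >= 10
        annotations:
          # $value is now the "down" percentage, so the text matches the number.
          description: "{{ $value | humanize }}% of OSDs are down (>=10%)."
```

With 12 of 100 OSDs down, the expression evaluates to 12, so the description reports 12% rather than 88%.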
Prometheus-related bits
Alerts
In `monitoring/prometheus/alerts` you'll find a set of Prometheus alert rules
that should provide a decent set of default alerts for a Ceph cluster. To use
them, copy the rules file to a location that the `rule_files` stanza of your
Prometheus configuration points to.