mirror of
https://github.com/ceph/ceph
synced 2024-12-21 02:42:48 +00:00
2010432b50
SLOW_OPS is triggered by the op tracker and generates a health alert, but health checks do not create metrics that Prometheus can use as alert triggers. This change adds a SLOW_OPS metric and provides a simple means of extending the same approach to other relevant health checks in the future.

If extracting the value from the health check message fails, an error is logged and the metric is removed from the metric set.

In addition, the metric description has been changed to better reflect the scenarios in which SLOW_OPS can be triggered.

Signed-off-by: Paul Cuzner <pcuzner@redhat.com>
Prometheus related bits
Alerts
In monitoring/prometheus/alerts you'll find a set of Prometheus alert rules that
should provide a decent set of default alerts for a Ceph cluster. Copy the rules
file from that directory to the location your Prometheus configuration expects
(wherever its rule_files stanza points).
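
As a rough sketch, assuming the rules file has been copied to /etc/prometheus/rules/ceph_alerts.yml (a hypothetical path and file name chosen only for illustration), the matching stanza in prometheus.yml might look like this:

```yaml
# prometheus.yml (fragment)
# rule_files lists the files (globs are allowed) from which Prometheus
# loads alerting and recording rules.
rule_files:
  # Hypothetical destination; substitute the path you actually copied
  # the Ceph alert rules to.
  - /etc/prometheus/rules/ceph_alerts.yml
```

Running promtool check rules against the copied file before reloading Prometheus is a cheap way to catch syntax errors in the rules.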