ceph/monitoring/prometheus
Paul Cuzner 7ffcbd7f79 mgr/prometheus: Update rule format and enhance SNMP support
Rules now adhere to the format defined by Prometheus.io.
This changes alert naming, and each alert now includes a
summary description to provide a quick one-liner.

In addition to reformatting, some missing alerts for MDS and
cephadm have been added, along with corresponding tests.

The MIB has also been refactored so that it now passes standard
lint tests, and a README is included for devs to understand the
OID schema.

Fixes: https://tracker.ceph.com/issues/53111

Signed-off-by: Paul Cuzner <pcuzner@redhat.com>
2021-11-05 11:24:25 +13:00
alerts mgr/prometheus: Update rule format and enhance SNMP support 2021-11-05 11:24:25 +13:00
tests mgr/prometheus: Update rule format and enhance SNMP support 2021-11-05 11:24:25 +13:00
CMakeLists.txt monitoring/prometheus: Add cmake integration 2021-10-22 13:37:31 +13:00
README.md monitoring:Updated README 2021-10-06 14:32:47 +13:00

Alerts

In monitoring/prometheus/alerts you'll find a set of Prometheus alert rules that should provide a decent set of default alerts for a Ceph cluster. Copy the rules file to wherever the rule_files stanza of your Prometheus configuration points.
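
The stanza in question is rule_files in the Prometheus configuration. A minimal sketch follows, assuming the rules file has been copied to a hypothetical /etc/prometheus/alerts/ directory. The path and file name are illustrative and should be adjusted to your deployment.

```yaml
# prometheus.yml (fragment)
rule_files:
  # Hypothetical location: point this at wherever you copied the Ceph rules file.
  - /etc/prometheus/alerts/ceph_default_alerts.yml
```

Prometheus loads rule files at startup and again on a configuration reload, so a reload or restart is needed after copying the file. Running `promtool check rules <file>` beforehand is a reasonable way to catch syntax errors.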

SNMP

Ceph provides a MIB (CEPH-PROMETHEUS-ALERT-MIB.txt) to support sending Prometheus alerts through to an SNMP management platform. The translation from Prometheus alert to SNMP trap requires the Prometheus alert to contain an OID that maps to a definition within the MIB. When making changes to the Prometheus alert rules file, developers should include any necessary changes to the MIB.
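
As a rough illustration of an alert that carries an OID, the sketch below uses the standard Prometheus rule format and assumes the OID travels as a label named oid. The alert name, expression, label name, and OID value are all placeholders chosen for the example. Real values must come from the rules file and the definitions in CEPH-PROMETHEUS-ALERT-MIB.txt.

```yaml
groups:
  - name: example
    rules:
      - alert: ExampleCephHealthError        # hypothetical alert name
        expr: ceph_health_status == 2        # assumed metric and value for an error state
        for: 5m
        labels:
          severity: critical
          # Assumption: the OID is carried as a label. Placeholder value,
          # not an actual entry in the Ceph MIB.
          oid: "1.3.6.1.4.1.99999.1.1"
        annotations:
          summary: Cluster is in an error state
          description: Longer explanation of the condition and how to respond.
```

If an alert like this is added or its OID changes, the matching notification definition in the MIB needs to be added or updated in the same change, so the SNMP side can still resolve the trap.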