doc/rados/operations/health-checks: document MGR_DOWN

Signed-off-by: Sage Weil <sage@redhat.com>
2025-02-23 11:07:35 +00:00 · 2019-07-31 04:57:49 -05:00 · 2019-07-31 04:57:49 -05:00 · 078ef210d5
commit 078ef210d5
parent 7385e917bb
1 changed file with 18 additions and 0 deletions
--- a/doc/rados/operations/health-checks.rst
+++ b/doc/rados/operations/health-checks.rst
@@ -75,6 +75,24 @@ If a monitor is configured to listen for v1 connections on a non-standard port (
 Manager
 -------

+MGR_DOWN
+________
+
+All manager daemons are currently down.  The cluster should normally
+have at least one running manager (``ceph-mgr``) daemon.  If no
+manager daemon is running, the cluster's ability to monitor itself will
+be compromised, and parts of the management API will become
+unavailable (for example, the dashboard will not work, and most CLI
+commands that report metrics or runtime state will block).  However,
+the cluster will still be able to perform all I/O operations and to
+recover from failures.
+
+The down manager daemon should generally be restarted as soon as
+possible to ensure that the cluster can be monitored (e.g., so that
+the ``ceph -s`` information is up to date, and/or metrics can be
+scraped by Prometheus).
+
+
 MGR_MODULE_DEPENDENCY
 _____________________