From 91ed10bc734bd93605b60c87037393d2704a16bc Mon Sep 17 00:00:00 2001
From: Zac Dover
Date: Thu, 9 Nov 2023 20:20:20 +1000
Subject: [PATCH] doc/rados: edit t-mon "common issues" (3 of x)

Edit the second part of the section "Most Common Monitor Issues" in
doc/rados/troubleshooting/troubleshooting-mon.rst.

Follows https://github.com/ceph/ceph/pull/54417.

Signed-off-by: Zac Dover
---
 doc/dev/mon-elections.rst                     |  2 ++
 .../troubleshooting/troubleshooting-mon.rst   | 30 ++++++++++++-------
 2 files changed, 21 insertions(+), 11 deletions(-)

diff --git a/doc/dev/mon-elections.rst b/doc/dev/mon-elections.rst
index 86cfc3803e7..1f346aece4d 100644
--- a/doc/dev/mon-elections.rst
+++ b/doc/dev/mon-elections.rst
@@ -1,3 +1,5 @@
+.. _dev_mon_elections:
+
 =================
 Monitor Elections
 =================
diff --git a/doc/rados/troubleshooting/troubleshooting-mon.rst b/doc/rados/troubleshooting/troubleshooting-mon.rst
index 428f08d1b02..740de9be017 100644
--- a/doc/rados/troubleshooting/troubleshooting-mon.rst
+++ b/doc/rados/troubleshooting/troubleshooting-mon.rst
@@ -251,18 +251,26 @@ detail`` returns a message similar to the following::
   information about the proper preparation of logs.
 
 
-**What if state is ``electing``?**
+**What does it mean when a Monitor's state is ``electing``?**
 
-  This means the monitor is in the middle of an election. With recent Ceph
-  releases these typically complete quickly, but at times the monitors can
-  get stuck in what is known as an *election storm*. This can indicate
-  clock skew among the monitor nodes; jump to
-  `Clock Skews`_ for more information. If all your clocks are properly
-  synchronized, you should search the mailing lists and tracker.
-  This is not a state that is likely to persist and aside from
-  (*really*) old bugs there is not an obvious reason besides clock skews on
-  why this would happen. Worst case, if there are enough surviving mons,
-  down the problematic one while you investigate.
+  If ``ceph health detail`` shows that a Monitor's state is ``electing``, the
+  monitor is in the middle of an election. Elections typically complete
+  quickly, but sometimes the monitors can get stuck in what is known as an
+  *election storm*. See :ref:`Monitor Elections <dev_mon_elections>` for more
+  on monitor elections.
+
+  The presence of an election storm might indicate clock skew among the monitor
+  nodes. See `Clock Skews`_ for more information.
+
+  If your clocks are properly synchronized, search the mailing lists and bug
+  tracker for issues similar to your issue. The ``electing`` state is not
+  likely to persist. In versions of Ceph after the release of Cuttlefish, there
+  is no obvious reason other than clock skew that explains why an ``electing``
+  state would persist.
+
+  It is possible to investigate the cause of a persistent ``electing`` state if
+  you put the problematic Monitor into a ``down`` state while you investigate.
+  This is possible only if there are enough surviving Monitors to form quorum.
 
 **What if state is ``synchronizing``?**