ceph/doc/rados/configuration/mon-osd-interaction.rst

=====================================
 Configuring Monitor/OSD Interaction
=====================================

After you have completed your initial Ceph configuration, you may deploy and run
Ceph.  When you execute a command such as ``ceph health`` or ``ceph -s``,  the
monitor reports on the current state of the cluster. The monitor knows about the
cluster by requiring reports from each OSD, and by receiving reports from OSDs
about the status of their neighboring OSDs. If the monitor doesn't receive
reports, or if it receives reports of changes in the cluster, the monitor
updates the status of the cluster.

Ceph provides reasonable default settings for monitor/OSD interaction. However,
you may override the defaults. The following sections describe how Ceph monitors
and OSDs interact for the purposes of monitoring.


OSDs Check Heartbeats
=====================

Each OSD checks the heartbeat of other OSDs every 6 seconds. You can change the
heartbeat interval by adding an ``osd heartbeat interval`` setting under the
``[osd]`` section of your Ceph configuration file, or by setting the value at
runtime. If an OSD doesn't show a heartbeat within a 20 second grace period, the
cluster may consider the OSD ``down``. You may change this grace period by
adding an ``osd heartbeat grace`` setting under the ``[osd]`` section of your
Ceph configuration file, or by setting the value at runtime.


.. ditaa:: +---------+          +---------+
           |  OSD 1  |          |  OSD 2  |
           +---------+          +---------+
                |                    |
                |----+ Heartbeat     |
                |    | Interval      |
                |<---+ Exceeded      |
                |                    |
                |       Check        |
                |     Heartbeat      |
                |------------------->|
                |                    |
                |<-------------------|
                |   Heart Beating    |
                |                    |
                |----+ Heartbeat     |
                |    | Interval      |
                |<---+ Exceeded      |
                |                    |
                |       Check        |
                |     Heartbeat      |
                |------------------->|
                |                    |
                |----+ Grace         |
                |    | Period        |
                |<---+ Exceeded      |
                |                    |
                |----+ Mark          |
                |    | OSD 2         |
                |<---+ Down          |


OSDs Report Down OSDs
=====================

By default, an OSD must report to the monitors that another OSD is ``down``
three times before the monitors acknowledge that the reported OSD is ``down``.
You can change the minimum number of ``osd down`` reports by adding an ``osd min
down reports`` setting under the ``[osd]`` section of your Ceph configuration
file, or by setting the value at runtime. By default, only one OSD is required
to report another OSD down. You can change the number of OSDs required to report
a monitor down by adding an ``osd min down reporters`` setting under the
``[osd]`` section of your Ceph configuration file, or by setting the value at
runtime.


.. ditaa:: +---------+     +---------+
           |  OSD 1  |     | Monitor |
           +---------+     +---------+
                |               |
                | OSD 2 Is Down |
                |-------------->|
                |               |
                | OSD 2 Is Down |
                |-------------->|
                |               |
                | OSD 2 Is Down |
                |-------------->|
                |               |
                |               |----------+ Mark
                |               |          | OSD 2
                |               |<---------+ Down


OSDs Report Peering Failure
===========================

If an OSD cannot peer with any of the OSDs defined in its Ceph configuration
file, it will ping the monitor for the most recent copy of the cluster map every
30 seconds. You can change the monitor heartbeat interval by adding an ``osd mon
heartbeat interval`` setting under the ``[osd]`` section of your Ceph
configuration file, or by setting the value at runtime.

.. ditaa:: +---------+     +---------+     +-------+     +---------+
           |  OSD 1  |     |  OSD 2  |     | OSD 3 |     | Monitor |
           +---------+     +---------+     +-------+     +---------+
                |               |              |              |
                |  Request To   |              |              |
                |     Peer      |              |              |
                |-------------->|              |              |
                |<--------------|              |              |
                |    Peering                   |              |
                |                              |              |
                |  Request To                  |              |
                |     Peer                     |              |
                |----------------------------->|              |
                |                                             |
                |----+ OSD Monitor                            |
                |    | Heartbeat                              |
                |<---+ Interval Exceeded                      |
                |                                             |
                |         Failed to Peer with OSD 3           |
                |-------------------------------------------->|
                |<--------------------------------------------|
                |          Receive New Cluster Map            |


OSDs Report Their Status
========================

If an OSD doesn't report to the monitor once at least every 120 seconds, the
monitor will consider the OSD ``down``. You can change the monitor report
interval by adding an ``osd mon report interval max`` setting under the
``[osd]`` section of your Ceph configuration file, or by setting the value at
runtime. The OSD attempts  to report on its status every 30 seconds. You can
change the OSD report interval by adding an ``osd mon report interval min``
setting under the ``[osd]`` section of your Ceph configuration file, or by
setting the value at runtime.


.. ditaa:: +---------+          +---------+
           |  OSD 1  |          | Monitor |
           +---------+          +---------+
                |                    |
                |----+ Report Min    |
                |    | Interval      |
                |<---+ Exceeded      |
                |                    |
                |     Report To      |
                |      Monitor       |
                |------------------->|
                |                    |
                |----+ Report Min    |
                |    | Interval      |
                |<---+ Exceeded      |
                |                    |
                | No Report          |
                                     +----+ Report Max
                                     |    | Interval
                                     |<---+ Exceeded
                                     |
                                     +----+ Mark
                                     |    | OSD 1
                                     |<---+ Down