.. _cephfs-health-messages:

======================
CephFS health messages
======================

Cluster health checks
=====================

The Ceph monitor daemons will generate health messages in response
to certain states of the file system map structure (and the enclosed MDS maps).

Message: mds rank(s) *ranks* have failed
Description: One or more MDS ranks are not currently assigned to
an MDS daemon; the cluster will not recover until a suitable replacement
daemon starts.

Message: mds rank(s) *ranks* are damaged
Description: One or more MDS ranks have encountered severe damage to
their stored metadata, and cannot start again until the metadata is repaired.

Message: mds cluster is degraded
Description: One or more MDS ranks are not currently up and running; clients
may pause metadata IO until this situation is resolved. This includes
ranks that have failed or are damaged, and additionally includes ranks
which are running on an MDS but have not yet made it to the *active*
state (e.g. ranks currently in the *replay* state).

Message: mds *names* are laggy
Description: The named MDS daemons have failed to send beacon messages
to the monitor for at least ``mds_beacon_grace`` (default 15s), while
they are supposed to send beacon messages every ``mds_beacon_interval``
(default 4s). The daemons may have crashed. The Ceph monitor will
automatically replace laggy daemons with standbys if any are available.
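
The beacon settings in effect can be checked at runtime; for example, on
clusters that use the centralized configuration database (a sketch; adjust
to your deployment)::

    $ ceph config get mds mds_beacon_grace
    $ ceph config get mds mds_beacon_interval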

Message: insufficient standby daemons available
Description: One or more file systems are configured to have a certain number
of standby daemons available (including daemons in standby-replay) but the
cluster does not have enough standby daemons. The standby daemons not in replay
count towards any file system (i.e. they may overlap). This warning can be
configured by setting ``ceph fs set <fs> standby_count_wanted <count>``. Use
zero for ``count`` to disable.
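
For example, to require two standby daemons for a (hypothetical) file system
named ``cephfs``, or to disable the warning for it entirely::

    $ ceph fs set cephfs standby_count_wanted 2
    $ ceph fs set cephfs standby_count_wanted 0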

Daemon-reported health checks
=============================

MDS daemons can identify a variety of unwanted conditions, and
indicate these to the operator in the output of ``ceph status``.
These conditions have human-readable messages, and additionally
a unique code starting with ``MDS_``.

.. highlight:: console

``ceph health detail`` shows the details of the conditions. The following
is a typical health report from a cluster experiencing MDS-related
performance issues::

    $ ceph health detail
    HEALTH_WARN 1 MDSs report slow metadata IOs; 1 MDSs report slow requests
    MDS_SLOW_METADATA_IO 1 MDSs report slow metadata IOs
        mdsfs-01(mds.0): 3 slow metadata IOs are blocked > 30 secs, oldest blocked for 51123 secs
    MDS_SLOW_REQUEST 1 MDSs report slow requests
        mdsfs-01(mds.0): 5 slow requests are blocked > 30 secs

Here, for instance, ``MDS_SLOW_REQUEST`` is the unique code representing the
condition where requests are taking a long time to complete, and the
description that follows it shows the severity and the MDS daemons which are
serving these slow requests.

This page lists the health checks raised by MDS daemons. For the checks from
other daemons, please see :ref:`health-checks`.

* ``MDS_TRIM``

  Message
    "Behind on trimming..."
  Description
    CephFS maintains a metadata journal that is divided into
    *log segments*. The length of the journal (in number of segments) is
    controlled by the setting ``mds_log_max_segments``; when the number of
    segments exceeds that setting, the MDS starts writing back metadata so
    that it can remove (trim) the oldest segments. If this writeback is
    happening too slowly, or a software bug is preventing trimming, then this
    health message may appear. The threshold for this message to appear is
    controlled by the config option ``mds_log_warn_factor`` (default 2.0).
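
  For example, the journal length and warning threshold currently in effect
  can be inspected with ``ceph config`` (a sketch, assuming the centralized
  configuration database is in use)::

    $ ceph config get mds mds_log_max_segments
    $ ceph config get mds mds_log_warn_factor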

* ``MDS_HEALTH_CLIENT_LATE_RELEASE``, ``MDS_HEALTH_CLIENT_LATE_RELEASE_MANY``

  Message
    "Client *name* failing to respond to capability release"
  Description
    CephFS clients are issued *capabilities* by the MDS, which
    are like locks. Sometimes, for example when another client needs access,
    the MDS will request clients release their capabilities. If the client
    is unresponsive or buggy, it might fail to do so promptly or fail to do
    so at all. This message appears if a client has taken longer than
    ``session_timeout`` (default 60s) to comply.
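
  The sessions of misbehaving clients can be examined to help identify the
  culprit; a sketch, assuming a file system named ``cephfs`` with a single
  active rank::

    $ ceph tell mds.cephfs:0 session ls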

* ``MDS_CLIENT_RECALL``, ``MDS_HEALTH_CLIENT_RECALL_MANY``

  Message
    "Client *name* failing to respond to cache pressure"
  Description
    Clients maintain a metadata cache. Items (such as inodes) in the
    client cache are also pinned in the MDS cache, so when the MDS needs to
    shrink its cache (to stay within ``mds_cache_memory_limit``), it sends
    messages to clients to shrink their caches too. If the client is
    unresponsive or buggy, this can prevent the MDS from properly staying
    within its cache limits, and it may eventually run out of memory and
    crash. This message appears if a client has failed to release more than
    ``mds_recall_warning_threshold`` capabilities (decaying with a half-life
    of ``mds_recall_max_decay_rate``) within the last
    ``mds_recall_warning_decay_rate`` seconds.
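
  The recall thresholds in effect can be inspected in the same way as other
  MDS options (a sketch, assuming the centralized configuration database)::

    $ ceph config get mds mds_recall_warning_threshold
    $ ceph config get mds mds_recall_warning_decay_rate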

* ``MDS_CLIENT_OLDEST_TID``, ``MDS_CLIENT_OLDEST_TID_MANY``

  Message
    "Client *name* failing to advance its oldest client/flush tid"
  Description
    The CephFS client-MDS protocol uses a field called the
    *oldest tid* to inform the MDS of which client requests are fully
    complete and may therefore be forgotten about by the MDS. If a buggy
    client is failing to advance this field, then the MDS may be prevented
    from properly cleaning up resources used by client requests. This message
    appears if a client appears to have more than ``max_completed_requests``
    (default 100000) requests that are complete on the MDS side but haven't
    yet been accounted for in the client's *oldest tid* value.

* ``MDS_DAMAGE``

  Message
    "Metadata damage detected"
  Description
    Corrupt or missing metadata was encountered when reading
    from the metadata pool. This message indicates that the damage was
    sufficiently isolated for the MDS to continue operating, although
    client accesses to the damaged subtree will return IO errors. Use
    the ``damage ls`` admin socket command to get more detail on the damage.
    This message appears as soon as any damage is encountered.
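
  For example, to list damage entries for rank 0 of a (hypothetical) file
  system named ``cephfs``::

    $ ceph tell mds.cephfs:0 damage ls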

* ``MDS_HEALTH_READ_ONLY``

  Message
    "MDS in read-only mode"
  Description
    The MDS has gone into readonly mode and will return EROFS
    error codes to client operations that attempt to modify any metadata. The
    MDS will go into readonly mode if it encounters a write error while
    writing to the metadata pool, or if forced to by an administrator using
    the *force_readonly* admin socket command.
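
  For reference, the administrative command in question is issued through the
  daemon's admin socket (a sketch; substitute a daemon name from your
  cluster)::

    $ ceph daemon mds.<name> force_readonly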

* ``MDS_SLOW_REQUEST``

  Message
    "*N* slow requests are blocked"
  Description
    One or more client requests have not been completed promptly,
    indicating that the MDS is either running very slowly, or that the RADOS
    cluster is not acknowledging journal writes promptly, or that there is a
    bug. Use the ``ops`` admin socket command to list outstanding metadata
    operations. This message appears if any client requests have taken longer
    than ``mds_op_complaint_time`` (default 30s).
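
  For example, to list the operations currently in flight on a given MDS
  daemon via its admin socket (substitute the daemon name as appropriate)::

    $ ceph daemon mds.<name> ops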

* ``MDS_CACHE_OVERSIZED``

  Message
    "Too many inodes in cache"
  Description
    The MDS is not succeeding in trimming its cache to comply with the
    limit set by the administrator. If the MDS cache becomes too large, the
    daemon may exhaust available memory and crash. By default, this message
    appears if the actual cache size (in memory) is at least 50% greater than
    ``mds_cache_memory_limit`` (default 1GB). Modify
    ``mds_health_cache_threshold`` to set the warning ratio.
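
  The cache limit can be inspected, and raised if the host has memory to
  spare; a sketch (the 8 GiB figure is purely illustrative)::

    $ ceph config get mds mds_cache_memory_limit
    $ ceph config set mds mds_cache_memory_limit 8589934592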