======================
 Monitoring a Cluster
======================

Once you have a running cluster, you may use the ``ceph`` tool to monitor your
cluster. Monitoring a cluster typically involves checking OSD status, monitor
status, placement group status and metadata server status.

Interactive Mode
================

To run the ``ceph`` tool in interactive mode, type ``ceph`` at the command line
with no arguments. For example::

    ceph
    ceph> health
    ceph> status
    ceph> quorum_status
    ceph> mon_status

Checking Cluster Health
=======================

After you start your cluster, and before you start reading and/or
writing data, check your cluster's health first. You can check on the
health of your Ceph cluster with the following::

    ceph health

If you specified non-default locations for your configuration or keyring,
you may specify their locations::

    ceph -c /path/to/conf -k /path/to/keyring health

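If you run against the same non-default paths repeatedly, you can avoid
retyping them. The following is a minimal sketch, assuming your version of the
``ceph`` tool honors the ``CEPH_ARGS`` environment variable::

    # Assumption: CEPH_ARGS is read by the ceph CLI in your version.
    # The exported arguments apply to every ceph invocation in this shell.
    export CEPH_ARGS="-c /path/to/conf -k /path/to/keyring"
    ceph health
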
Upon starting the Ceph cluster, you will likely encounter a health warning
such as ``HEALTH_WARN XXX num placement groups stale``. Wait a few moments and
check it again. When your cluster is ready, ``ceph health`` should return a
message such as ``HEALTH_OK``. At that point, it is okay to begin using the
cluster.

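Because ``ceph health`` reports the overall state at the start of a single
line, it is easy to wrap in a simple probe script. A minimal sketch (the
script itself is illustrative, not part of Ceph)::

    #!/bin/sh
    # Exit 0 quietly when the cluster is healthy; otherwise print the
    # reported status on stderr and exit non-zero.
    status=$(ceph health)
    case "$status" in
        HEALTH_OK*) exit 0 ;;
        *) echo "ceph reports: $status" >&2; exit 1 ;;
    esac
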
Watching a Cluster
==================

To watch the cluster's ongoing events, open a new terminal. Then, enter::

    ceph -w

Ceph will print each version of the placement group map and its status. For
example, a tiny Ceph cluster consisting of one monitor, one metadata server and
two OSDs may print the following::

    health HEALTH_OK
    monmap e1: 1 mons at {a=192.168.0.1:6789/0}, election epoch 0, quorum 0 a
    osdmap e13: 2 osds: 2 up, 2 in
    pgmap v9713: 384 pgs: 384 active+clean; 8730 bytes data, 22948 MB used, 264 GB / 302 GB avail
    mdsmap e4: 1/1/1 up {0=a=up:active}

    2012-08-01 11:33:53.831268 mon.0 [INF] pgmap v9712: 384 pgs: 384 active+clean; 8730 bytes data, 22948 MB used, 264 GB / 302 GB avail
    2012-08-01 11:35:31.904650 mon.0 [INF] pgmap v9713: 384 pgs: 384 active+clean; 8730 bytes data, 22948 MB used, 264 GB / 302 GB avail
    2012-08-01 11:35:53.903189 mon.0 [INF] pgmap v9714: 384 pgs: 384 active+clean; 8730 bytes data, 22948 MB used, 264 GB / 302 GB avail
    2012-08-01 11:37:31.865809 mon.0 [INF] pgmap v9715: 384 pgs: 384 active+clean; 8730 bytes data, 22948 MB used, 264 GB / 302 GB avail

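Because ``ceph -w`` runs until interrupted, it can be convenient to keep a
record of the event stream while you watch it. A minimal sketch::

    # Watch events interactively while appending them to a log file;
    # stop with Ctrl-C.
    ceph -w | tee -a /tmp/ceph-events.log
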
Checking a Cluster's Status
===========================

To check a cluster's status, execute the following::

    ceph status

Or::

    ceph -s

In interactive mode, type ``status`` and press **Enter**. ::

    ceph> status

Ceph will print the cluster status. For example, a tiny Ceph cluster consisting
of one monitor, one metadata server and two OSDs may print the following::

    health HEALTH_OK
    monmap e1: 1 mons at {a=192.168.0.1:6789/0}, election epoch 0, quorum 0 a
    osdmap e13: 2 osds: 2 up, 2 in
    pgmap v9754: 384 pgs: 384 active+clean; 8730 bytes data, 22948 MB used, 264 GB / 302 GB avail
    mdsmap e4: 1/1/1 up {0=a=up:active}

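If you want to consume the status from a script, the ``ceph`` tool can emit
machine-readable output instead of the summary above. A minimal sketch,
assuming your version supports the ``--format`` option and that the ``jq``
utility (not part of Ceph) is installed::

    # Print only the health portion of the JSON status report.
    ceph status --format=json | jq .health
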
Checking OSD Status
===================

An OSD's status is either in the cluster (``in``) or out of the cluster
(``out``); and, it is either up and running (``up``), or it is down and not
running (``down``). If an OSD is ``up``, it may be either ``in`` the cluster
(you can read and write data) or ``out`` of the cluster. If it is ``down``, it
should also be ``out``. If an OSD is ``down`` and ``in``, there is a problem.

.. ditaa:: +----------------+        +----------------+
           |                |        |                |
           |   OSD #n In    |        |   OSD #n Up    |
           |                |        |                |
           +----------------+        +----------------+
                   ^                         ^
                   |                         |
                   |                         |
                   v                         v
           +----------------+        +----------------+
           |                |        |                |
           |   OSD #n Out   |        |   OSD #n Down  |
           |                |        |                |
           +----------------+        +----------------+

You can check OSDs to ensure they are ``up`` and ``in`` by executing::

    ceph osd stat

Or::

    ceph osd dump

You can also view OSDs according to their position in the CRUSH map. ::

    ceph osd tree

Ceph will print out a CRUSH tree with a host, its OSDs, whether they are up
and their weight. ::

    # id    weight  type name       up/down reweight
    -1      3       pool default
    -3      3               rack mainrack
    -2      3                       host osd-host
    0       1                               osd.0   up      1
    1       1                               osd.1   up      1
    2       1                               osd.2   up      1

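To spot problem OSDs at a glance, you can filter the tree for entries that are
not up. A minimal sketch using plain text matching (the column layout can vary
between versions)::

    # Show only the lines currently reported as down.
    ceph osd tree | grep -w down
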
Checking Monitor Status
=======================

If your cluster has multiple monitors (likely), you should check the monitor
quorum status after you start the cluster and before reading and/or writing
data. A quorum must be present when multiple monitors are running. You should
also check monitor status periodically to ensure that they are running.

To display the monitor map, execute the following::

    ceph mon stat

Or::

    ceph mon dump

To check the quorum status for the monitor cluster, execute the following::

    ceph quorum_status

Ceph will return the quorum status. For example, a Ceph cluster consisting of
three monitors may return the following:

.. code-block:: javascript

{ "election_epoch": 10,
|
||
|
"quorum": [
|
||
|
0,
|
||
|
1,
|
||
|
2],
|
||
|
"monmap": { "epoch": 1,
|
||
|
"fsid": "444b489c-4f16-4b75-83f0-cb8097468898",
|
||
|
"modified": "2011-12-12 13:28:27.505520",
|
||
|
"created": "2011-12-12 13:28:27.505520",
|
||
|
"mons": [
|
||
|
{ "rank": 0,
|
||
|
"name": "a",
|
||
|
"addr": "127.0.0.1:6789\/0"},
|
||
|
{ "rank": 1,
|
||
|
"name": "b",
|
||
|
"addr": "127.0.0.1:6790\/0"},
|
||
|
{ "rank": 2,
|
||
|
"name": "c",
|
||
|
"addr": "127.0.0.1:6791\/0"}
|
||
|
]
|
||
|
}
|
||
|
}
|
||
|
|
||
|
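Since ``quorum_status`` already returns JSON, its fields are easy to pick out
in a script. A minimal sketch, assuming the ``jq`` utility is installed::

    # Print the ranks of the monitors currently in quorum.
    ceph quorum_status | jq .quorum
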
Checking MDS Status
===================

Metadata servers provide metadata services for Ceph FS. Metadata servers have
two sets of states: ``up | down`` and ``active | inactive``. To ensure your
metadata servers are ``up`` and ``active``, execute the following::

    ceph mds stat

To display details of the metadata cluster, execute the following::

    ceph mds dump

Checking Placement Group States
===============================

Placement groups map objects to OSDs. When you monitor your placement groups,
you will want them to be ``active`` and ``clean``. For other PG states, see
`Placement Group States`_.

.. _Placement Group States: ../pg-states
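
Two further commands are useful for inspecting placement groups directly; a
brief sketch (both are standard ``ceph`` subcommands)::

    # One-line summary of placement group counts by state.
    ceph pg stat

    # Full, verbose details for every placement group.
    ceph pg dump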