==================
Cephadm Operations
==================

Watching cephadm log messages
=============================

Cephadm logs to the ``cephadm`` cluster log channel, meaning you can
monitor progress in real time with::

  # ceph -W cephadm

By default, it will show info-level events and above. To see
debug-level messages too::

  # ceph config set mgr mgr/cephadm/log_to_cluster_level debug
  # ceph -W cephadm --watch-debug

Be careful: the debug messages are very verbose!
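
When you are done debugging, you can drop back to the default level by
removing the override (a standard ``ceph config`` operation)::

  # ceph config rm mgr mgr/cephadm/log_to_cluster_level
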

You can see recent events with::

  # ceph log last cephadm

These events are also logged to the ``ceph.cephadm.log`` file on
monitor hosts and to the monitor daemons' stderr.

Ceph daemon logs
================

Logging to stdout
-----------------

Traditionally, Ceph daemons have logged to ``/var/log/ceph``. By
default, cephadm daemons log to stderr and the logs are
captured by the container runtime environment. For most systems, by
default, these logs are sent to journald and accessible via
``journalctl``.

For example, to view the logs for the daemon ``mon.foo`` for a cluster
with ID ``5c5a50ae-272a-455d-99e9-32c6a013e694``, the command would be
something like::

  journalctl -u ceph-5c5a50ae-272a-455d-99e9-32c6a013e694@mon.foo

This works well for normal operations when logging levels are low.
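
To follow the log as the daemon runs, ``journalctl``'s standard ``-f``
(follow) flag applies here as well::

  journalctl -u ceph-5c5a50ae-272a-455d-99e9-32c6a013e694@mon.foo -f
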

To disable logging to stderr::

  ceph config set global log_to_stderr false
  ceph config set global mon_cluster_log_to_stderr false

Logging to files
----------------

You can also configure Ceph daemons to log to files instead of stderr,
just as they have in the past. When logging to files, Ceph logs appear
in ``/var/log/ceph/<cluster-fsid>``.

To enable logging to files::

  ceph config set global log_to_file true
  ceph config set global mon_cluster_log_to_file true

We recommend disabling logging to stderr (see above), or else everything
will be logged twice::

  ceph config set global log_to_stderr false
  ceph config set global mon_cluster_log_to_stderr false

By default, cephadm sets up log rotation on each host to rotate these
files. You can configure the logging retention schedule by modifying
``/etc/logrotate.d/ceph.<cluster-fsid>``.
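
For reference, a retention policy in that file might look something like
the following sketch; treat it as illustrative only, since the exact file
cephadm generates may differ::

  /var/log/ceph/<cluster-fsid>/*.log {
      rotate 7
      daily
      compress
      missingok
      notifempty
  }
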

Data location
=============

Cephadm stores daemon data and logs in slightly different locations than
older versions of ceph:

* ``/var/log/ceph/<cluster-fsid>`` contains all cluster logs. Note
  that by default cephadm logs via stderr and the container runtime,
  so these logs are normally not present.
* ``/var/lib/ceph/<cluster-fsid>`` contains all cluster daemon data
  (besides logs).
* ``/var/lib/ceph/<cluster-fsid>/<daemon-name>`` contains all data for
  an individual daemon.
* ``/var/lib/ceph/<cluster-fsid>/crash`` contains crash reports for
  the cluster.
* ``/var/lib/ceph/<cluster-fsid>/removed`` contains old daemon
  data directories for stateful daemons (e.g., monitor, prometheus)
  that have been removed by cephadm.
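
For example, the data directory for the monitor ``mon.foo`` from the
earlier example would be ``/var/lib/ceph/<cluster-fsid>/mon.foo``, and you
can list its contents with::

  ls /var/lib/ceph/<cluster-fsid>/mon.foo
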

Disk usage
----------

Because a few Ceph daemons may store a significant amount of data in
``/var/lib/ceph`` (notably, the monitors and prometheus), we recommend
moving this directory to its own disk, partition, or logical volume so
that it does not fill up the root file system.
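
To see how much space each daemon is consuming, a plain ``du`` over the
cluster directory is usually sufficient::

  du -sh /var/lib/ceph/<cluster-fsid>/*
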

SSH Configuration
=================

Cephadm uses SSH to connect to remote hosts. SSH uses a key to authenticate
with those hosts in a secure way.

Default behavior
----------------

Cephadm stores an SSH key in the monitor that is used to
connect to remote hosts. When the cluster is bootstrapped, this SSH
key is generated automatically and no additional configuration
is necessary.

A *new* SSH key can be generated with::

  ceph cephadm generate-key

The public portion of the SSH key can be retrieved with::

  ceph cephadm get-pub-key

The currently stored SSH key can be deleted with::

  ceph cephadm clear-key

You can make use of an existing key by directly importing it with::

  ceph config-key set mgr/cephadm/ssh_identity_key -i <key>
  ceph config-key set mgr/cephadm/ssh_identity_pub -i <pub>

You will then need to restart the mgr daemon to reload the configuration with::

  ceph mgr fail
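
For example, a complete import of a freshly generated key pair might look
like this; the key type and file names here are arbitrary choices, and the
public portion must also be present in the ``authorized_keys`` file of the
SSH user on each managed host::

  ssh-keygen -t ed25519 -f ceph-ssh-key -N ""
  ceph config-key set mgr/cephadm/ssh_identity_key -i ceph-ssh-key
  ceph config-key set mgr/cephadm/ssh_identity_pub -i ceph-ssh-key.pub
  ceph mgr fail
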

Customizing the SSH configuration
---------------------------------

Cephadm generates an appropriate ``ssh_config`` file that is
used for connecting to remote hosts. This configuration looks
something like this::

  Host *
  User root
  StrictHostKeyChecking no
  UserKnownHostsFile /dev/null

There are two ways to customize this configuration for your environment:

#. Import a customized configuration file that will be stored
   by the monitor with::

     ceph cephadm set-ssh-config -i <ssh_config_file>

   To remove a customized SSH config and revert back to the default behavior::

     ceph cephadm clear-ssh-config

#. You can configure a file location for the SSH configuration file with::

     ceph config set mgr mgr/cephadm/ssh_config_file <path>

   We do *not recommend* this approach. The path name must be
   visible to *any* mgr daemon, and cephadm runs all daemons as
   containers. That means that the file either needs to be placed
   inside a customized container image for your deployment, or
   manually distributed to the mgr data directory
   (``/var/lib/ceph/<cluster-fsid>/mgr.<id>`` on the host, visible at
   ``/var/lib/ceph/mgr/ceph-<id>`` from inside the container).
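
As an example of the first approach, a customized configuration for an
environment where hosts run SSH on a non-standard port might look like
this (the port shown is illustrative)::

  Host *
  User root
  Port 2222
  StrictHostKeyChecking no
  UserKnownHostsFile /dev/null
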

Health checks
=============

CEPHADM_PAUSED
--------------

Cephadm background work has been paused with ``ceph orch pause``. Cephadm
continues to perform passive monitoring activities (like checking
host and daemon status), but it will not make any changes (like deploying
or removing daemons).

Resume cephadm work with::

  ceph orch resume

CEPHADM_STRAY_HOST
------------------

One or more hosts have running Ceph daemons but are not registered as
hosts managed by *cephadm*. This means that those services cannot
currently be managed by cephadm (e.g., restarted, upgraded, included
in `ceph orch ps`).

You can manage the host(s) with::

  ceph orch host add *<hostname>*

Note that you may need to configure SSH access to the remote host
before this will work.
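
One way to do that is to install the cluster's public SSH key in root's
``authorized_keys`` file on the new host, for example::

  ceph cephadm get-pub-key > ~/ceph.pub
  ssh-copy-id -f -i ~/ceph.pub root@<hostname>
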

Alternatively, you can manually connect to the host and ensure that
services on that host are removed or migrated to a host that is
managed by *cephadm*.

You can also disable this warning entirely with::

  ceph config set mgr mgr/cephadm/warn_on_stray_hosts false

CEPHADM_STRAY_DAEMON
--------------------

One or more Ceph daemons are running but are not managed by
*cephadm*. This may be because they were deployed using a different
tool, or because they were started manually. Those
services cannot currently be managed by cephadm (e.g., restarted,
upgraded, or included in `ceph orch ps`).

If the daemon is a stateful one (monitor or OSD), it should be adopted
by cephadm; see :ref:`cephadm-adoption`. For stateless daemons, it is
usually easiest to provision a new daemon with the ``ceph orch apply``
command and then stop the unmanaged daemon.
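
As a sketch, replacing an unmanaged MDS daemon for the file system
``<fs-name>`` might look like this; the placement count and the systemd
unit name are placeholders that depend on your deployment::

  ceph orch apply mds <fs-name> --placement=2
  # then, on the stray daemon's host:
  systemctl stop <unit-name-of-unmanaged-daemon>
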

This warning can be disabled entirely with::

  ceph config set mgr mgr/cephadm/warn_on_stray_daemons false

CEPHADM_HOST_CHECK_FAILED
-------------------------

One or more hosts have failed the basic cephadm host check, which verifies
that (1) the host is reachable and cephadm can be executed there, and (2)
that the host satisfies basic prerequisites, like a working container
runtime (podman or docker) and working time synchronization.
If this test fails, cephadm will not be able to manage services on that host.

You can manually run this check with::

  ceph cephadm check-host *<hostname>*

You can remove a broken host from management with::

  ceph orch host rm *<hostname>*

You can disable this health warning with::

  ceph config set mgr mgr/cephadm/warn_on_failed_host_check false