==================
Cephadm Operations
==================

Watching cephadm log messages
=============================

Cephadm logs to the ``cephadm`` cluster log channel, which means you can
monitor progress in real time with::

  # ceph -W cephadm

By default it will show info-level events and above.  To see
debug-level messages too::

  # ceph config set mgr mgr/cephadm/log_to_cluster_level debug
  # ceph -W cephadm --watch-debug

Be careful: the debug messages are very verbose!
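
When you are finished debugging, you can drop back to the default info
level by removing the override again::

  # ceph config rm mgr mgr/cephadm/log_to_cluster_level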

You can see recent events with::

  # ceph log last cephadm

These events are also logged to the ``ceph.cephadm.log`` file on
monitor hosts and/or to the monitor-daemon stderr.

Ceph daemon logs
================

Logging to stdout
-----------------

Traditionally, Ceph daemons have logged to ``/var/log/ceph``.  With
cephadm, by default, daemons instead log to stderr and the logs are
captured by the container runtime environment.  For most systems, by
default, these logs are sent to journald and accessible via
``journalctl``.

For example, to view the logs for the daemon ``mon.foo`` for a cluster
with id ``5c5a50ae-272a-455d-99e9-32c6a013e694``, the command would be
something like::

  journalctl -u ceph-5c5a50ae-272a-455d-99e9-32c6a013e694@mon.foo

This works well for normal operations when logging levels are low.

To disable logging to stderr::

  ceph config set global log_to_stderr false
  ceph config set global mon_cluster_log_to_stderr false

Logging to files
----------------

You can also configure Ceph daemons to log to files instead of stderr,
just like they have in the past.  When logging to files, Ceph logs appear
in ``/var/log/ceph/<cluster-fsid>``.

To enable logging to files::

  ceph config set global log_to_file true
  ceph config set global mon_cluster_log_to_file true

You probably want to disable logging to stderr (see above) or else
everything will be logged twice::

  ceph config set global log_to_stderr false
  ceph config set global mon_cluster_log_to_stderr false

By default, cephadm sets up log rotation on each host to rotate these
files.  You can configure the logging retention schedule by modifying
``/etc/logrotate.d/ceph.<cluster-fsid>``.
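
For reference, a logrotate policy for these files typically looks
something like the following.  The exact directives cephadm writes may
differ, so treat this as an illustrative sketch rather than the
generated file::

  /var/log/ceph/<cluster-fsid>/*.log {
      rotate 7
      daily
      compress
      sharedscripts
      missingok
      notifempty
  }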

Data location
=============

Cephadm stores daemon data and logs in slightly different locations than
older versions of Ceph:

* ``/var/log/ceph/<cluster-fsid>`` contains all cluster logs.  Note
  that by default cephadm logs via stderr and the container runtime,
  so these logs are normally not present.
* ``/var/lib/ceph/<cluster-fsid>`` contains all cluster daemon data
  (besides logs).
* ``/var/lib/ceph/<cluster-fsid>/<daemon-name>`` contains all data for
  an individual daemon.
* ``/var/lib/ceph/<cluster-fsid>/crash`` contains crash reports for
  the cluster.
* ``/var/lib/ceph/<cluster-fsid>/removed`` contains old daemon
  data directories for stateful daemons (e.g., monitor, prometheus)
  that have been removed by cephadm.
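
To see how much space each daemon's data directory is consuming,
standard tools work fine; for example (substituting your cluster's
fsid)::

  du -sh /var/lib/ceph/<cluster-fsid>/*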

Disk usage
----------

Because a few Ceph daemons may store a significant amount of data in
``/var/lib/ceph`` (notably, the monitors and prometheus), you may want
to move this directory to its own disk, partition, or logical volume so
that you do not fill up the root file system.
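
As an illustration, you could place ``/var/lib/ceph`` on a dedicated
logical volume before bootstrapping the cluster.  The device name
``/dev/vdb`` and the volume group and logical volume names below are
hypothetical placeholders::

  pvcreate /dev/vdb
  vgcreate ceph-data /dev/vdb
  lvcreate -l 100%FREE -n varlibceph ceph-data
  mkfs.xfs /dev/ceph-data/varlibceph
  mount /dev/ceph-data/varlibceph /var/lib/ceph

Remember to add a matching entry to ``/etc/fstab`` so the mount
persists across reboots.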

SSH Configuration
=================

Cephadm uses SSH to connect to remote hosts.  SSH uses a key to
authenticate with those hosts in a secure way.

Default behavior
----------------

Cephadm normally stores an SSH key in the monitor that is used to
connect to remote hosts.  When the cluster is bootstrapped, this SSH
key is generated automatically.  Normally, no additional configuration
is necessary.

A *new* SSH key can be generated with::

  ceph cephadm generate-key

The public portion of the SSH key can be retrieved with::

  ceph cephadm get-pub-key

The currently stored SSH key can be deleted with::

  ceph cephadm clear-key

You can make use of an existing key by directly importing it with::

  ceph config-key set mgr/cephadm/ssh_identity_key -i <key>
  ceph config-key set mgr/cephadm/ssh_identity_pub -i <pub>

You will then need to restart the mgr daemon to reload the
configuration with::

  ceph mgr fail

Customizing the SSH configuration
---------------------------------

Normally cephadm generates an appropriate ``ssh_config`` file that is
used for connecting to remote hosts.  This configuration looks
something like this::

  Host *
    User root
    StrictHostKeyChecking no
    UserKnownHostsFile /dev/null

There are two ways to customize this configuration for your environment:

#. You can import a customized configuration file that will be stored
   by the monitor with::

     ceph cephadm set-ssh-config -i <ssh_config_file>

   To remove a customized SSH config and revert back to the default
   behavior::

     ceph cephadm clear-ssh-config

#. You can configure a file location for the SSH configuration file
   with::

     ceph config set mgr mgr/cephadm/ssh_config_file <path>

   This approach is *not recommended*, however, as the path name must be
   visible to *any* mgr daemon, and cephadm runs all daemons as
   containers.  That means that the file either needs to be placed
   inside a customized container image for your deployment, or
   manually distributed to the mgr data directory
   (``/var/lib/ceph/<cluster-fsid>/mgr.<id>`` on the host, visible at
   ``/var/lib/ceph/mgr/ceph-<id>`` from inside the container).

Health checks
=============

CEPHADM_PAUSED
--------------

Cephadm background work has been paused with ``ceph orch pause``.
Cephadm will continue to perform passive monitoring activities (like
checking host and daemon status), but it will not make any changes
(like deploying or removing daemons).

You can resume cephadm work with::

  ceph orch resume

CEPHADM_STRAY_HOST
------------------

One or more hosts have running Ceph daemons but are not registered as
hosts managed by *cephadm*.  This means that those services cannot
currently be managed by cephadm (e.g., restarted, upgraded, or included
in ``ceph orch ps``).

You can manage the host(s) with::

  ceph orch host add *<hostname>*

Note that you may need to configure SSH access to the remote host
before this will work.
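
For example, you can install the cluster's public SSH key in the remote
host's ``authorized_keys`` file for the root user (the hostname is a
placeholder)::

  ceph cephadm get-pub-key > ~/ceph.pub
  ssh-copy-id -f -i ~/ceph.pub root@<hostname>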

Alternatively, you can manually connect to the host and ensure that
services on that host are removed and/or migrated to a host that is
managed by *cephadm*.

You can also disable this warning entirely with::

  ceph config set mgr mgr/cephadm/warn_on_stray_hosts false

CEPHADM_STRAY_DAEMON
--------------------

One or more Ceph daemons are running but are not managed by *cephadm*,
perhaps because they were deployed using a different tool or were
started manually.  This means that those services cannot currently be
managed by cephadm (e.g., restarted, upgraded, or included in
``ceph orch ps``).

**FIXME:** We need to implement and document an adopt procedure here.

You can also disable this warning entirely with::

  ceph config set mgr mgr/cephadm/warn_on_stray_daemons false

CEPHADM_HOST_CHECK_FAILED
-------------------------

One or more hosts have failed the basic cephadm host check, which
verifies that (1) the host is reachable and cephadm can be executed
there, and (2) the host satisfies basic prerequisites, like a working
container runtime (podman or docker) and working time synchronization.
If this test fails, cephadm will not be able to manage services on
that host.

You can manually run this check with::

  ceph cephadm check-host *<hostname>*

You can remove a broken host from management with::

  ceph orch host rm *<hostname>*

You can disable this health warning with::

  ceph config set mgr mgr/cephadm/warn_on_failed_host_check false