ceph/doc/cephadm/troubleshooting.rst
Sage Weil ce2066e623 doc: reorganize cephadm docs
- reorganized cephadm into a top-level item with a series of sub-items.
- condensed the 'install' page so that it doesn't create a zillion items
in the toctree on the left
- started updating the cephadm/install sequence (incomplete)

Signed-off-by: Sage Weil <sage@redhat.com>
2020-03-17 17:52:43 -05:00


Troubleshooting
===============

Sometimes there is a need to investigate why a cephadm command failed or why
a specific service no longer runs properly.

Because cephadm deploys daemons as containers, troubleshooting them differs
slightly from troubleshooting daemons that run directly on the host. Here are
a few tools and commands to help investigate issues.

Gathering log files
-------------------

Use ``journalctl`` to gather the log files of all daemons.

.. note:: By default cephadm now stores logs in journald. This means
   that you will no longer find daemon logs in ``/var/log/ceph/``.


To read the log file of one specific daemon, run::

    cephadm logs --name <name-of-daemon>


This only works when run on the same host where the daemon is running. To
get the logs of a daemon running on a different host, pass the ``--fsid``
option::

    cephadm logs --fsid <fsid> --name <name-of-daemon>

Here ``<fsid>`` is the cluster ID printed by ``ceph status``.


To fetch the log files of all daemons on a given host, run::

    for name in $(cephadm ls | jq -r '.[].name') ; do
      cephadm logs --fsid <fsid> --name "$name" > "$name";
    done

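
The loop above writes one file per daemon into the current directory. A
minimal sketch of a variant that collects the logs under a dedicated
directory is shown below. The function name ``collect_daemon_logs``, the
``.log`` suffix, and the ``CEPHADM`` override variable are assumptions made
for illustration and are not part of cephadm; only the ``cephadm logs``
invocation is taken from the text above.

```shell
# Sketch: save the logs of each daemon to <outdir>/<name>.log.
# Daemon names are read from standard input, one per line.
# CEPHADM is a hypothetical override (useful for dry runs); it defaults
# to the real cephadm binary.
collect_daemon_logs() (
    fsid=$1
    outdir=$2
    mkdir -p "$outdir" || exit 1
    while IFS= read -r name; do
        # Skip empty lines so they do not produce a file named ".log".
        [ -n "$name" ] || continue
        # Redirect stdin so the command cannot consume the list of names.
        "${CEPHADM:-cephadm}" logs --fsid "$fsid" --name "$name" \
            > "$outdir/$name.log" 2>&1 < /dev/null
    done
)
```

It could be fed from the same ``cephadm ls`` pipeline used above, for
example ``cephadm ls | jq -r '.[].name' | collect_daemon_logs <fsid> ./logs``.
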
Collecting systemd status
-------------------------

To print the state of a systemd unit, run::

    systemctl status "ceph-$(cephadm shell ceph fsid)@<service name>.service"


To fetch the state of all daemons on a given host, run::

    fsid="$(cephadm shell ceph fsid)"
    for name in $(cephadm ls | jq -r '.[].name') ; do
      systemctl status "ceph-$fsid@$name.service" > "$name";
    done

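
The unit names used above follow the pattern
``ceph-<fsid>@<daemon-name>.service``. As a small sketch, that pattern can be
wrapped in a helper so it is written only once. The function name
``ceph_unit_name`` is hypothetical and not part of cephadm.

```shell
# Sketch: build the systemd unit name for a cephadm-managed daemon.
# Follows the "ceph-<fsid>@<daemon-name>.service" pattern shown above.
ceph_unit_name() {
    if [ "$#" -ne 2 ] || [ -z "$1" ] || [ -z "$2" ]; then
        echo "usage: ceph_unit_name <fsid> <daemon-name>" >&2
        return 1
    fi
    printf 'ceph-%s@%s.service\n' "$1" "$2"
}
```

For example, ``systemctl status "$(ceph_unit_name "$fsid" osd.3)"`` would
query a single daemon, where ``osd.3`` is a placeholder daemon name.
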
List all downloaded container images
------------------------------------

To list all container images that are downloaded on a host:

.. note:: ``Image`` might also be called ``ImageID``

::

    podman ps -a --format json | jq '.[].Image'
    "docker.io/library/centos:8"
    "registry.opensuse.org/opensuse/leap:15.2"

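
Because the field may be named either ``Image`` or ``ImageID``, a slightly
more tolerant filter can be useful. The sketch below is an assumption-laden
variant, not the documented method: it reads the JSON produced by
``podman ps -a --format json`` on standard input, takes ``Image`` when
present and falls back to ``ImageID`` otherwise, and removes duplicates. The
function name ``container_images`` is hypothetical.

```shell
# Sketch: print each distinct image referenced in "podman ps" JSON output.
# Reads the JSON array on stdin; requires jq.
container_images() {
    # "//" is jq's alternative operator: use .Image unless it is null or
    # missing, then .ImageID, and emit nothing if neither is set.
    jq -r '.[] | .Image // .ImageID // empty' | sort -u
}
```

Usage would look like ``podman ps -a --format json | container_images``.
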
Manually running containers
---------------------------

cephadm writes small wrappers that run containers. Refer to
``/var/lib/ceph/<cluster-fsid>/<service-name>/unit.run`` for the command
that executes a container.
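
As a minimal sketch, the wrapper for one daemon can be located and printed
with a small helper before running any part of it by hand. The function name
``show_unit_run`` and its optional third argument (a base directory that
defaults to ``/var/lib/ceph``) are assumptions made for illustration.

```shell
# Sketch: print the unit.run wrapper of one daemon.
# Arguments: <cluster-fsid> <service-name> [base-dir]
# base-dir is a hypothetical override; it defaults to /var/lib/ceph.
show_unit_run() (
    fsid=$1
    name=$2
    base=${3:-/var/lib/ceph}
    file="$base/$fsid/$name/unit.run"
    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        exit 1
    fi
    cat "$file"
)
```

For example, ``show_unit_run <cluster-fsid> <service-name>`` prints the
container execution command so it can be reviewed first.
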