Ceph all-in-one (AIO) installations, whether single-node or multi-node,
do not work well with loopback mounts: a graceful system reboot
frequently ends in a deadlock.
`rbdmap.service` is already ordered in a way that is friendly to
graceful system reboots, as shown below:
[Unit]
After=network-online.target
Before=remote-fs-pre.target
Wants=network-online.target remote-fs-pre.target
[Service]
ExecStart=/usr/bin/rbdmap map
ExecReload=/usr/bin/rbdmap map
ExecStop=/usr/bin/rbdmap unmap-all
This PR introduces:
- `ceph-mon.target`: Ensure it starts after `network-online.target` and
before `remote-fs-pre.target`
- `ceph-*.target`: Ensure they start after `ceph-mon.target` and before
`remote-fs-pre.target`
- `rbdmap.service`: Once all `_netdev` mounts have been unmounted by
`remote-fs.target`, ensure all RBD images are unmapped BEFORE any Ceph
components under `ceph.target` are stopped during shutdown
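A minimal sketch of how this ordering could be expressed, reusing the directives that `rbdmap.service` already has. The fragments below are illustrative assumptions, not copied from the patch. systemd stops units in the reverse of their start order, so ordering `rbdmap.service` after `ceph.target` is what would make the unmap run before the Ceph daemons are stopped:

```ini
# ceph-mon.target (illustrative fragment)
[Unit]
After=network-online.target
Before=remote-fs-pre.target
Wants=network-online.target remote-fs-pre.target

# Any other ceph-*.target (illustrative fragment)
[Unit]
After=network-online.target ceph-mon.target
Before=remote-fs-pre.target
Wants=network-online.target remote-fs-pre.target

# rbdmap.service (illustrative fragment; After=ceph.target is an assumption)
[Unit]
After=network-online.target ceph.target
Before=remote-fs-pre.target
Wants=network-online.target remote-fs-pre.target
```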
The logic has been validated as a proof of concept in
<https://github.com/alvistack/ansible-role-ceph_common/tree/develop>;
it also works as expected with a Ceph + Kubernetes deployment using
<https://github.com/alvistack/ansible-collection-kubernetes/tree/develop>.
No more deadlocks occur during graceful system reboot, for both AIO
single-node and multi-node setups with loopback mounts.
Also see:
- <https://github.com/ceph/ceph/pull/36776>
- <https://github.com/etcd-io/etcd/pull/12259>
- <https://github.com/cri-o/cri-o/pull/4128>
- <https://github.com/kubernetes/release/pull/1504>
Fixes: https://tracker.ceph.com/issues/47528
Signed-off-by: Wong Hoi Sing Edison <hswong3i@gmail.com>
The PartOf= and WantedBy= directives in the various systemd
unit files and targets create the following logical hierarchy:
- ceph.target
  - ceph-fuse.target
    - ceph-fuse@.service
  - ceph-mds.target
    - ceph-mds@.service
  - ceph-mgr.target
    - ceph-mgr@.service
  - ceph-mon.target
    - ceph-mon@.service
  - ceph-osd.target
    - ceph-osd@.service
  - ceph-radosgw.target
    - ceph-radosgw@.service
  - ceph-rbd-mirror.target
    - ceph-rbd-mirror@.service
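As a rough illustration of how PartOf= and WantedBy= can produce one branch of this hierarchy, the fragments below sketch the mon branch. They are assumptions for illustration only, not the exact contents of the shipped unit files:

```ini
# ceph-mon.target (illustrative fragment)
[Unit]
# Stopping or restarting ceph.target propagates to this target
PartOf=ceph.target

[Install]
WantedBy=multi-user.target ceph.target

# ceph-mon@.service (illustrative fragment)
[Unit]
# Stopping or restarting ceph-mon.target propagates to every mon instance
PartOf=ceph-mon.target

[Install]
WantedBy=ceph-mon.target
```

PartOf= only propagates stop and restart requests from the parent unit; whether a unit is pulled in at start comes from the WantedBy= links created when it is enabled.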
Additionally, the ceph-{fuse,mds,mon,osd,radosgw,rbd-mirror}
targets have WantedBy=multi-user.target. This gives the
following behaviour:
- `systemctl {start,stop,restart}` of any target will restart
all dependent services (e.g.: `systemctl restart ceph.target`
will restart all services; `systemctl restart ceph-mon.target`
will restart all the mons, and so forth).
- `systemctl {enable,disable}` for the second level targets
(ceph-mon.target etc.) will cause dependent services to come
up on boot, or not (of course the individual services can
be enabled or disabled as well - for a service to start
on boot, both the service and its target must be enabled;
disabling either will cause the service to be disabled).
- `systemctl {enable,disable} ceph.target` has no effect on
whether or not services come up at boot; if the second level
targets and services are enabled, they'll start regardless of
whether ceph.target is enabled. This is due to the second
level targets all having WantedBy=multi-user.target.
- The OSDs will always start regardless of ceph-osd.target
(unless they are explicitly masked), thanks to udev magic.
So far, so good. Except, several users have encountered
services not starting with the following error:
Failed to start ceph-osd@5.service: Transaction order is
cyclic. See system logs for details.
I've not been able to reproduce this myself in such a way as to
cause OSDs to fail to start, but I *have* managed to get systemd
into that same confused state, as follows:
- Disable ceph.target, ceph-mon.target, ceph-osd.target,
ceph-mon@$(hostname).service and all ceph-osd instances.
- Re-enable all of the above.
At this point, everything is fine, but if I then subsequently
disable ceph.target, *then* try `systemctl restart ceph.target`,
I get "Failed to restart ceph.target: Transaction order is cyclic.
See system logs for details."
Explicitly adding Before=ceph.target to each second level target
prevents systemd from becoming confused in this situation.
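A sketch of what that looks like for one second level target, with the surrounding directives assumed from the hierarchy described above rather than quoted from the unit file:

```ini
# ceph-mon.target (illustrative fragment; the same Before= line would be
# added to each of the other second level targets)
[Unit]
PartOf=ceph.target
# Explicit ordering: this target is started before ceph.target
Before=ceph.target

[Install]
WantedBy=multi-user.target ceph.target
```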
Signed-off-by: Tim Serong <tserong@suse.com>