Ceph AIO installations, whether single-node or multi-node, are not
friendly to loopback mounts; in particular, they frequently hit a
deadlock during graceful system reboot.
We already have `rbdmap.service`, which handles graceful system reboot
correctly with the following ordering:
```ini
[Unit]
After=network-online.target
Before=remote-fs-pre.target
Wants=network-online.target remote-fs-pre.target

[Service]
ExecStart=/usr/bin/rbdmap map
ExecReload=/usr/bin/rbdmap map
ExecStop=/usr/bin/rbdmap unmap-all
```
This PR introduces:
- `ceph-mon.target`: Ensure it starts after `network-online.target` and
  before `remote-fs-pre.target`
- `ceph-*.target`: Ensure they start after `ceph-mon.target` and before
  `remote-fs-pre.target`
- `rbdmap.service`: Once all `_netdev` mounts have been unmounted by
  `remote-fs.target`, ensure that all RBD images are unmapped BEFORE any
  Ceph component under `ceph.target` is stopped during shutdown (see the
  ordering sketch after this list)
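A minimal sketch of how that ordering could be expressed in the unit
files; the exact directives and drop-in layout in this PR may differ,
and `ceph-osd.target` stands in here for any `ceph-*.target`:

```ini
# ceph-mon.target -- start after the network and before
# remote-fs-pre.target, mirroring the ordering rbdmap.service uses.
[Unit]
After=network-online.target
Before=remote-fs-pre.target
Wants=network-online.target

# ceph-osd.target (representative of ceph-*.target) -- start after the
# monitors, so on shutdown it is stopped before them.
[Unit]
After=ceph-mon.target
Before=remote-fs-pre.target

# rbdmap.service -- systemd stops units in the reverse of their start
# order, so ordering this unit after ceph.target means its ExecStop
# (rbdmap unmap-all) runs before anything under ceph.target is stopped,
# while Before=remote-fs-pre.target keeps the unmap after the _netdev
# unmounts handled by remote-fs.target.
[Unit]
After=network-online.target ceph.target
Before=remote-fs-pre.target
Wants=network-online.target remote-fs-pre.target
```

In this sketch it is the single `After=ceph.target` line on
`rbdmap.service`, combined with systemd's reversed stop order, that
guarantees all RBD images are unmapped before any Ceph daemon goes away.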
The logic is validated as a proof of concept by
<https://github.com/alvistack/ansible-role-ceph_common/tree/develop>;
it also works as expected with a Ceph + Kubernetes deployment via
<https://github.com/alvistack/ansible-collection-kubernetes/tree/develop>.
No more deadlocks happen during graceful system reboot, for both AIO
single-node and multi-node setups with loopback mounts.
Also see:
- <https://github.com/ceph/ceph/pull/36776>
- <https://github.com/etcd-io/etcd/pull/12259>
- <https://github.com/cri-o/cri-o/pull/4128>
- <https://github.com/kubernetes/release/pull/1504>
Fixes: https://tracker.ceph.com/issues/47528
Signed-off-by: Wong Hoi Sing Edison <hswong3i@gmail.com>
The daemon is built for future integration with both the RBD and RGW caches.
The key components are:
- domain-socket-based simple IPC
- simple LRU-policy-based promotion/demotion for the cache
- simple file-based caching store for RADOS objects with a synchronous
  I/O interface
- systemd service/target files for the daemon (see the sketch after
  this list)
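A minimal sketch of what such a templated service unit could look like;
the unit name, binary path, and flags shown are assumptions for
illustration and are not quoted from this change:

```ini
# Assumed name: ceph-immutable-object-cache@.service (one instance per client).
[Unit]
Description=Ceph immutable object cache daemon
After=network-online.target
Wants=network-online.target

[Service]
Environment=CLUSTER=ceph
# -f keeps the daemon in the foreground so systemd can supervise it;
# the binary name and flags are assumed, check the daemon's --help.
ExecStart=/usr/bin/ceph-immutable-object-cache -f --cluster ${CLUSTER} --name client.%i
Restart=on-failure

[Install]
# Real packaging may hook this under a dedicated Ceph target instead.
WantedBy=multi-user.target
```

An operator would then run one instance per cache client, e.g.
`systemctl enable --now ceph-immutable-object-cache@foo` (instance name
hypothetical).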
Signed-off-by: Dehao Shang <dehao.shang@intel.com>
Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>