There needs to be a timeout to prevent ceph-disk from hanging
forever. But there is no good reason to set it to a value that is less
than a few hours.
Each OSD activation needs to happen in sequence and not in parallel,
which is why there is a global activation lock.
It would be possible, when an OSD is using a device that is not
otherwise used by another OSD (i.e. they do not share an SSD journal
device etc.), to run all activations in parallel. It would however
require a more extensive modification of ceph-disk to avoid any chance
of races.
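For illustration, a minimal Python sketch of the idea (not the ceph-disk
implementation; the lock path and timeout value below are assumptions):
acquire a single global flock before activating, and give up only after
several hours.

    import errno
    import fcntl
    import time

    LOCK_PATH = '/var/lock/ceph-disk.activate'  # hypothetical lock path
    TIMEOUT = 3 * 60 * 60                       # a few hours, in seconds

    def activate_with_global_lock(activate):
        deadline = time.time() + TIMEOUT
        with open(LOCK_PATH, 'w') as lock_file:
            # poll with a non-blocking flock so the timeout can be enforced
            while True:
                try:
                    fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
                    break
                except IOError as e:
                    if e.errno not in (errno.EAGAIN, errno.EACCES):
                        raise
                    if time.time() >= deadline:
                        raise RuntimeError('timed out waiting for the activation lock')
                    time.sleep(1)
            try:
                activate()  # OSD activations run strictly one at a time
            finally:
                fcntl.flock(lock_file, fcntl.LOCK_UN)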
Fixes: http://tracker.ceph.com/issues/20229
Signed-off-by: Loic Dachary <loic@dachary.org>
When booting a server with 20+ HDDs udev has to process a *lot* of
events (especially if dm-crypt is used), and 2 minutes might not be
enough for that. Make it possible to override the timeout (via systemd
drop-in files), and use a longer timeout (5 minutes) by default.
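For example, assuming the unit consumes its timeout through an
environment variable (the variable name and value below are
illustrative), a drop-in could look like:

    # /etc/systemd/system/ceph-disk@.service.d/timeout.conf
    [Service]
    Environment=CEPH_DISK_TIMEOUT=900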
Fixes: http://tracker.ceph.com/issues/18740
Signed-off-by: Alexey Sheplyakov <asheplyakov@mirantis.com>
"ceph-disk trigger" invocation is currently performed in a mutually
exclusive fashion, with each call first taking an flock on the path
/var/lock/ceph-disk. On systems with a lot of osds, this leads to a
large amount of lock contention during boot-up, and can cause some
service instances to trip the 120 second timeout.
Take an flock on a device specific path instead of /var/lock/ceph-disk,
so that concurrent "ceph-disk trigger" invocations are permitted for
independent osds. This greatly reduces lock contention and consequently
the chance of service timeout. Per-device concurrency restrictions
required for http://tracker.ceph.com/issues/13160 are maintained.
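A minimal Python sketch of the per-device locking; the lock file naming
and helper names are illustrative, not the actual ceph-disk code:

    import fcntl
    import os

    def trigger_lock_path(dev):
        # /dev/sdb1 -> /var/lock/ceph-disk-sdb1: triggers for different
        # devices no longer contend on a single global lock file
        return '/var/lock/ceph-disk-' + os.path.basename(dev)

    def locked_trigger(dev, run_trigger):
        with open(trigger_lock_path(dev), 'w') as lock_file:
            # blocks only if another trigger is running for the same device
            fcntl.flock(lock_file, fcntl.LOCK_EX)
            try:
                run_trigger(dev)
            finally:
                fcntl.flock(lock_file, fcntl.LOCK_UN)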
Fixes: http://tracker.ceph.com/issues/18049
Signed-off-by: David Disseldorp <ddiss@suse.de>
A ceph udev action may be triggered before the local file systems are
mounted because there is no ordering in udev. The ceph udev action
delegates asynchronously to systemd via ceph-disk@.service which will
fail if (for instance) the LVM partition required to mount /var/lib/ceph
is not available yet. The systemd unit will retry a few times but will
eventually fail permanently. The sysadmin can run systemctl reset-failed
at a later time, start the service again, and it will succeed.
Add a dependency to ceph-disk@.service so that it waits until the local
file systems are mounted:
After=local-fs.target
Since local-fs.target depends on lvm, it will wait until the lvm
partition (as well as any dm devices) is ready and mounted before
attempting to activate the OSD. It may still fail because the
corresponding journal/data partition is not ready yet (which is
expected) but it will no longer fail because the lvm/filesystems/dm are
not ready.
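In unit-file terms, the dependency is a single ordering directive in the
[Unit] section of ceph-disk@.service (fragment, other directives
omitted):

    [Unit]
    After=local-fs.target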
Fixes: http://tracker.ceph.com/issues/17889
Signed-off-by: Loic Dachary <loic@dachary.org>
When ceph-disk runs from udev or an init script, it is in the background
and, should it block for any reason, it may keep a lock forever. All
calls to ceph-disk in these contexts are changed to time out.
The TimeoutStartSec= and TimeoutStopSec= settings, which are both set
via TimeoutSec=, do not apply to Type=oneshot services.
https://www.freedesktop.org/software/systemd/man/systemd.service.html
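One way to express this is to wrap the call with timeout(1) inside
ExecStart= (illustrative fragment; the 120 second value and exact
command line are assumptions, not a quote of the shipped unit):

    [Service]
    Type=oneshot
    ExecStart=/bin/sh -c 'timeout 120 /usr/sbin/ceph-disk trigger --sync %f'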
Fixes: http://tracker.ceph.com/issues/16580
Signed-off-by: Loic Dachary <loic@dachary.org>
The flock command may be installed elsewhere, depending on the
system. Let the PATH search figure that out.
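As a sketch in Python terms (illustrative only), the PATH resolution can
be made explicit:

    import shutil

    # Searches PATH instead of hard-coding /usr/bin/flock; may return
    # e.g. /usr/bin/flock or /bin/flock depending on the system,
    # or None if the command is missing.
    flock_path = shutil.which('flock')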
Fixes: http://tracker.ceph.com/issues/13975
Signed-off-by: Loic Dachary <loic@dachary.org>
When activating a device, ceph-disk trigger restarts the ceph-disk
systemd service. Two consecutive udev add events on the same device will
restart the ceph-disk systemd service, and the second one may kill the
first one, leaving the device half activated.
The ceph-disk systemd service is instructed to not kill an existing
process when restarting. The second run waits (via flock) for the first
one to complete before running so that they do not overlap.
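An illustrative unit fragment combining the two measures (the directive
values, lock path and command line are assumptions about how this could
be expressed, not a quote of the shipped unit):

    [Service]
    Type=oneshot
    KillMode=none
    ExecStart=/bin/sh -c 'flock /var/lock/ceph-disk /usr/sbin/ceph-disk trigger --sync %f'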
Fixes: http://tracker.ceph.com/issues/13160
Signed-off-by: Loic Dachary <ldachary@redhat.com>