RepoMirrors/ceph

mirror of https://github.com/ceph/ceph synced 2024-12-24 20:33:27 +00:00

Author	SHA1	Message	Date
Wido den Hollander	e73eb8cc1e	systemd: Restart Mon after 10s in case of failure In some situations the IP address the Monitor wants to bind to might not be available yet. This might for example be an IPv6 address which is still performing DAD or waiting for a Router Advertisement to be sent by the Router(s). Have systemd wait for 10s before starting the Mon and increase the number of times it does so to 5. This allows the system to bring up IP addresses in the meantime while systemd waits with restarting the Mon. Fixes: #18635 Signed-off-by: Wido den Hollander <wido@42on.com>	2017-01-23 08:50:08 +01:00
Mark Korenberg	2ccd02a838	Fix startup of Ceph cluster manager daemon on Debian 8 Signed-off-by: Mark Korenberg <socketpair@gmail.com>	2016-12-18 18:07:21 +05:00
John Spray	63ae8579bf	Merge pull request #11542 from batrick/systemd-ceph-fuse systemd: add ceph-fuse service file Reviewed-by: John Spray <john.spray@redhat.com>	2016-12-14 13:55:33 +00:00
Patrick Donnelly	d32d70b783	systemd: add ceph-fuse service file Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2016-12-01 19:51:37 -05:00
Loic Dachary	b3887379d6	build/ops: restart ceph-osd@.service after 20s instead of 100ms Instead of the default 100ms pause before trying to restart an OSD, wait 20 seconds instead and retry 30 times instead of 3. There is no scenario in which restarting an OSD almost immediately after it failed would get a better result. It is possible that a failure to start is due to a race with another systemd unit at boot time. For instance if ceph-disk@.service is delayed, it may start after the OSD that needs it. A long pause may give the racing service enough time to complete and the next attempt to start the OSD may succeed. This is not a sound alternative to resolve a race, it only makes the OSD boot process less sensitive. In the example above, the proper fix is to enable --runtime ceph-osd@.service so that it cannot race at boot time. The wait delay should not be minutes to preserve the current runtime behavior. For instance, if an OSD is killed or fails and restarts after 10 minutes, it will be marked down by the ceph cluster. This is not a change that could break things but it is significant and should be avoided. Refs: http://tracker.ceph.com/issues/17889 Signed-off-by: Loic Dachary <loic@dachary.org>	2016-12-01 08:28:20 +01:00
David Disseldorp	8a62cbc074	systemd/ceph-disk: reduce ceph-disk flock contention "ceph-disk trigger" invocation is currently performed in a mutually exclusive fashion, with each call first taking an flock on the path /var/lock/ceph-disk. On systems with a lot of osds, this leads to a large amount of lock contention during boot-up, and can cause some service instances to trip the 120 second timeout. Take an flock on a device specific path instead of /var/lock/ceph-disk, so that concurrent "ceph-disk trigger" invocations are permitted for independent osds. This greatly reduces lock contention and consequently the chance of service timeout. Per-device concurrency restrictions required for http://tracker.ceph.com/issues/13160 are maintained. Fixes: http://tracker.ceph.com/issues/18049 Signed-off-by: David Disseldorp <ddiss@suse.de>	2016-11-28 17:55:39 +01:00
Loic Dachary	d954de5546	ceph-disk: systemd unit must run after local-fs.target A ceph udev action may be triggered before the local file systems are mounted because there is no ordering in udev. The ceph udev action delegates asynchronously to systemd via ceph-disk@.service which will fail if (for instance) the LVM partition required to mount /var/lib/ceph is not available yet. The systemd unit will retry a few times but will eventually fail permanently. The sysadmin can systemctl reset-failed at a later time and it will succeed. Add a dependency to ceph-disk@.service so that it waits until the local file systems are mounted: After=local-fs.target Since local-fs.target depends on lvm, it will wait until the lvm partition (as well as any dm devices) is ready and mounted before attempting to activate the OSD. It may still fail because the corresponding journal/data partition is not ready yet (which is expected) but it will no longer fail because the lvm/filesystems/dm are not ready. Fixes: http://tracker.ceph.com/issues/17889 Signed-off-by: Loic Dachary <loic@dachary.org>	2016-11-22 15:23:47 +01:00
Owen Synge	639385a7f4	systemd/CMakeLists.txt:Remove ceph-create-keys cmake ceph-create-keys should not be started on boot of mons with systemd so should not exist as 'After' or 'Wants' for the ceph-mon.service Signed-off-by: Owen Synge <osynge@suse.com>	2016-11-04 23:05:44 +01:00
Owen Synge	dc5fe8d415	systemd/ceph-mon@.service:Remove ceph-create-keys for mon in systemd ceph-create-keys should not be started on boot of mons with systemd so should not exist as 'After' or 'Wants' for the ceph-mon.service Signed-off-by: Owen Synge <osynge@suse.com>	2016-11-04 23:05:26 +01:00
Owen Synge	8bcb4646b6	systemd/ceph-create-keys@.service:Remove ceph-create-keys for systemd ceph-create-keys should not be started on boot of mons with systemd so should not exist in the systemd files Signed-off-by: Owen Synge <osynge@suse.com>	2016-11-04 23:05:17 +01:00
Tim Serong	082199f69d	systemd: autogenerate ceph-mgr key during daemon startup This is a hack to inject a key for the mgr daemon, using whatever key already exists on the mon on this node to gain sufficient permissions to create the mgr key. Failure is ignored at every step (the '-' prefix) in case someone has already used some other trick to set everything up manually. Signed-off-by: Tim Serong <tserong@suse.com>	2016-09-29 17:27:08 +01:00
Tim Serong	61d779345e	systemd: encourage ceph-mgr to start in sync with ceph-mon This change introduces the following behaviour: - When ceph-mon starts, it will try to start ceph-mgr with the same instance id (Wants=), but will not fail to start if ceph-mgr doesn't start (i.e. the mon still works as it always did). - ceph-mgr will start After= ceph-mon, and will stop and start when ceph-mon stops and starts, because it's PartOf= ceph-mon. If you don't want ceph-mgr to run on the mons, you need to mask the service, i.e. `systemctl mask ceph-mgr@INSTANCE`. Hostnames are typically instance names, so `systemctl mask ceph-mgr@$(hostname)` should suffice if you wish to disable ceph-mgr on the mons. Signed-off-by: Tim Serong <tserong@suse.com>	2016-09-29 17:27:08 +01:00
Tim Serong	d8ded57a87	systemd: add ceph-mgr service and target files Signed-off-by: Tim Serong <tserong@suse.com>	2016-09-29 17:27:08 +01:00
Jason Dillaman	b1ce837a46	Merge pull request #10942 from JellevdK/master systemd: add install section to rbdmap.service file Reviewed-by: Jason Dillaman <dillaman@redhat.com>	2016-09-20 16:31:00 -04:00
Sage Weil	fba798dcad	remove autotools Signed-off-by: Sage Weil <sage@redhat.com>	2016-09-07 11:50:14 -04:00
Jelle vd Kooij	57b6f656e1	Add Install section to systemd rbdmap.service file Signed-off-by: Jelle vd Kooij <vdkooij.jelle@gmail.com>	2016-09-01 00:42:34 +02:00
Yuri Weinstein	8175ce07b8	Merge pull request #10262 from dachary/wip-16580-ceph-disk-timeout ceph-disk: timeout ceph-disk to avoid blocking forever Reviewed-by: Willem Jan Withagen <wjw@digiware.nl> Reviewed-by: Ken Dreyer (Red Hat) <kdreyer@redhat.com> Reviewed-by: Boris Ranto <branto@redhat.com> Reviewed-by: Nathan Cutler <ncutler@suse.cz>	2016-08-05 08:22:39 -07:00
Loic Dachary	bed1a5cc05	ceph-disk: timeout ceph-disk to avoid blocking forever When ceph-disk runs from udev or init script, it is in the background and should it block for any reason, it may keep a lock forever. All calls to ceph-disk in these contexts are changed to timeout. The TimeoutStartSec= and TimeoutStopSec= which are both set via TimeoutSec= do not apply to Type=oneshot services. https://www.freedesktop.org/software/systemd/man/systemd.service.html Fixes: http://tracker.ceph.com/issues/16580 Signed-off-by: Loic Dachary <loic@dachary.org>	2016-07-18 08:53:11 +02:00
Ruben Kerkhof	4179aa8d44	systemd: add osd id to service description So, instead of logging this: Jul 01 13:51:04 localhost systemd[1]: Failed to start Ceph object storage daemon. Jul 01 13:51:04 localhost systemd[1]: Failed to start Ceph object storage daemon. Jul 01 13:51:04 localhost systemd[1]: Failed to start Ceph object storage daemon. Jul 01 13:51:04 localhost systemd[1]: Failed to start Ceph object storage daemon. Jul 01 13:51:04 localhost systemd[1]: Failed to start Ceph object storage daemon. Jul 01 13:51:04 localhost systemd[1]: Failed to start Ceph object storage daemon. Jul 01 13:51:04 localhost systemd[1]: Failed to start Ceph object storage daemon. Jul 01 13:51:04 localhost systemd[1]: Failed to start Ceph object storage daemon. Jul 01 13:51:04 localhost systemd[1]: Failed to start Ceph object storage daemon. Jul 01 13:51:04 localhost systemd[1]: Failed to start Ceph object storage daemon. We see this, which is a lot more useful: Jul 01 13:59:32 localhost systemd[1]: Failed to start Ceph object storage daemon osd.27. Jul 01 13:59:32 localhost systemd[1]: Failed to start Ceph object storage daemon osd.32. Jul 01 13:59:32 localhost systemd[1]: Failed to start Ceph object storage daemon osd.29. Jul 01 13:59:32 localhost systemd[1]: Failed to start Ceph object storage daemon osd.31. Jul 01 13:59:32 localhost systemd[1]: Failed to start Ceph object storage daemon osd.23. Jul 01 13:59:32 localhost systemd[1]: Failed to start Ceph object storage daemon osd.24. Jul 01 13:59:32 localhost systemd[1]: Failed to start Ceph object storage daemon osd.25. Jul 01 13:59:32 localhost systemd[1]: Failed to start Ceph object storage daemon osd.30. Jul 01 13:59:32 localhost systemd[1]: Failed to start Ceph object storage daemon osd.28. Jul 01 13:59:32 localhost systemd[1]: Failed to start Ceph object storage daemon osd.22.	2016-07-01 14:02:36 +02:00
Kefu Chai	41061ce769	cmake: install systemd files add an option "WITH_SYSTEMD", off by default Signed-off-by: Kefu Chai <kchai@redhat.com>	2016-06-30 19:27:43 +08:00
Nathan Cutler	80be4a8cbf	systemd: fix typo in preset file Signed-off-by: Nathan Cutler <ncutler@suse.com>	2016-04-30 16:21:13 +02:00
Nathan Cutler	53b1a6799c	systemd: enable all the ceph .target services by default Some distros, like Fedora and openSUSE, have a policy that all services are disabled by default. This patch changes that default for the ceph.target and ceph-{mds,mon,osd,radosgw}.target services. Signed-off-by: Nathan Cutler <ncutler@suse.com> Signed-off-by: Boris Ranto <branto@redhat.com>	2016-04-27 14:20:29 +02:00
Nathan Cutler	df893f395e	systemd: make Ceph daemon units "want" time-sync.target Fixes: http://tracker.ceph.com/issues/15419 Signed-off-by: Nathan Cutler <ncutler@suse.com>	2016-04-23 17:48:08 +02:00
Sage Weil	dcd211cdd1	Merge pull request #8449 from javacruft/ceph-osd-prestart ceph-osd-prestart.sh: drop --setuser/--setgroup Reviewed-by: Sage Weil <sage@redhat.com>	2016-04-22 17:06:44 -04:00
Boris Ranto	62084375fa	systemd: Use the same restart limits as upstart Currently, the systemd daemons are not restarted on failure. This patch adds this functionality and sets the defaults to those defined in upstart. This resolves to 3 fails per 30 minutes for osd, mon and mds and 5 fails per 30 seconds for radosgw. Signed-off-by: Boris Ranto <branto@redhat.com>	2016-04-13 21:26:31 +02:00
James Page	05cafcf19f	Drop any systemd imposed process/thread limits If systemd has task accounting enabled, a default of 512 tasks will be applied to all systemd units. For ceph, this is way too low even for a modest cluster, so stop this restriction being applied and allow administrators to apply limits using sysctl. Signed-off-by: James Page <james.page@ubuntu.com>	2016-04-05 17:33:57 +01:00
James Page	74977f7884	Drop --setuser/--setgroup from osd prestart These are not supported by /usr/lib/ceph/ceph-osd-prestart.sh, resulting in warnings: ceph-osd-prestart.sh[23367]: getopt: unrecognized option '--setuser' ceph-osd-prestart.sh[23367]: getopt: unrecognized option '--setgroup' --setuser and --setgroup are only needed for the ceph-osd process. Signed-off-by: James Page <james.page@ubuntu.com>	2016-04-05 16:59:38 +01:00
Sage Weil	df6570c2bd	Merge pull request #8222 from SUSE/wip-14984 systemd: set up environment in rbdmap unit file Reviewed-by: Boris Ranto <branto@redhat.com>	2016-03-23 12:33:39 -04:00
Nathan Cutler	a7a36581ff	systemd: set up environment in rbdmap unit file http://tracker.ceph.com/issues/14984 Fixes: #14984 Signed-off-by: Nathan Cutler <ncutler@suse.com>	2016-03-19 06:34:07 +01:00
Jason Dillaman	8a0e47281f	systemd: new ceph-rbd-mirror scripts Signed-off-by: Jason Dillaman <dillaman@redhat.com>	2016-03-18 17:51:23 -04:00
Nathan Cutler	69291f872e	packaging: move ceph_common.sh and ceph-osd-prestart.sh to /usr/lib/ceph First, it makes sense for both ceph_common.sh and ceph-osd-prestart.sh to reside in the same directory: make it so. Second, /usr/lib exists on both RHEL/Fedora and SLE/openSUSE, whereas the latter lacks /usr/libexec. To make this less painful, package ceph_common.sh and ceph-osd-prestart.sh in /usr/lib/ceph. Third, allow e.g. FreeBSD to do its own thing by using the $(libexecdir) Autoconf variable (but set it to /usr/lib in the spec file). http://tracker.ceph.com/issues/14687 Fixes: #14687 Signed-off-by: Nathan Cutler <ncutler@suse.com>	2016-02-18 12:19:14 +01:00
Sage Weil	9da41fee1a	systemd/ceph-radosgw-prestart.sh: remove This is unpackaged and unused. Signed-off-by: Sage Weil <sage@redhat.com>	2016-02-04 17:48:16 -05:00
Patrick Donnelly	b65d9c5457	systemd: Add systemd sandboxing to services. This change makes it so the mon/osd/mds/radosgw daemons: o Cannot write to /usr, /etc, and /boot. o Cannot access /home, /root, or /run/user. o Each daemon gets its own private /tmp and /var/tmp. o All daemons get a private /dev without physical devices (exception: osd) I'm not sure if the osd daemon needs access to a full /dev so I left PrivateDevices out for ceph-osd@.service. Signed-off-by: Patrick Donnelly <batrick@batbytes.com>	2016-01-28 10:50:00 -05:00
Loic Dachary	c8f7d44c93	build/ops: systemd ceph-disk unit must not assume /bin/flock The flock command may be installed elsewhere, depending on the system. Let the PATH search figure that out. http://tracker.ceph.com/issues/13975 Fixes: #13975 Signed-off-by: Loic Dachary <loic@dachary.org>	2015-12-04 21:11:09 +01:00
Sage Weil	a12efa204e	Merge pull request #6276 from david-z/wip-systemd-finegrain-ceph-service systemd: start/stop/restart ceph services by daemon type Reviewed-by: Nathan Cutler <ncutler@suse.com> Reviewed-by: Sage Weil <sage@redhat.com> Reviewed-by: Boris Ranto <branto@redhat.com> Reviewed-by: Ken Dreyer <kdreyer@redhat.com>	2015-11-28 08:25:40 -05:00
suckowbiz	5972a44106	doc: fix message typos in systemd Signed-off-by: Tobias Suckow <tobias@suckow.biz>	2015-11-23 16:50:07 +01:00
Boris Ranto	9224ac2ad2	rbdmap: systemd support Fixes: #13374 Signed-off-by: Boris Ranto <branto@redhat.com>	2015-11-06 10:26:22 +01:00
Zhi Zhang	cfa2d0a08a	fine-grained control systemd to start/stop/restart ceph services at once Signed-off-by: Zhi Zhang <zhangz.david@outlook.com>	2015-10-26 15:13:19 +08:00
Sage Weil	fb5f058a92	Merge remote-tracking branch 'gh/infernalis'	2015-09-22 14:04:44 -04:00
Loic Dachary	f0a47578c7	ceph-disk: systemd must not kill a running ceph-disk When activating a device, ceph-disk trigger restarts the ceph-disk systemd service. Two consecutive udev add events on the same device will restart the ceph-disk systemd service and the second one may kill the first one, leaving the device half activated. The ceph-disk systemd service is instructed to not kill an existing process when restarting. The second run waits (via flock) for the first one to complete before running so that they do not overlap. http://tracker.ceph.com/issues/13160 Fixes: #13160 Signed-off-by: Loic Dachary <ldachary@redhat.com>	2015-09-22 08:46:56 +02:00
Sage Weil	ea977611c4	systemd: increase nproc ulimit We were observed to be hitting the limit on centos7 (triggering pthread_create failures) on a ~2000 OSD cluster. Increasing this resolves it! Reported-by: Dan van der Ster <daniel.vanderster@cern.ch> Signed-off-by: Sage Weil <sage@redhat.com>	2015-09-21 14:35:15 -04:00
Sage Weil	8e13d89f0f	systemd: eliminate ceph-rgw tmpfiles.d file This is for storing the rgw socket files for fastcgi, which we do not want to enable by default. Signed-off-by: Sage Weil <sage@redhat.com>	2015-09-14 14:00:26 -04:00
Sage Weil	367c794cb1	systemd: no need to preprocess ceph-osd@service This used to be necessary but now is not. Signed-off-by: Sage Weil <sage@redhat.com>	2015-09-14 14:00:26 -04:00
Sage Weil	8453a89cb2	systemd: set nofile limit in unit files Make it big so hopefully nobody has to change it. Signed-off-by: Sage Weil <sage@redhat.com>	2015-09-14 14:00:26 -04:00
Sage Weil	ea91c4ef85	systemd: tmpfiles.d in /run, not /var/run Signed-off-by: Sage Weil <sage@redhat.com>	2015-09-11 11:38:47 -04:00
Sage Weil	3aa38bc07f	make /var/run/ceph 770 ceph:ceph This allows members of the ceph group to make librados clients (like the ceph cli and qemu) create sockets in the default /var/run/ceph/* location. Signed-off-by: Sage Weil <sage@redhat.com>	2015-09-11 11:26:59 -04:00
Sage Weil	f1b80e99b0	systemd: consolidate into a single ceph-disk@.service This simple service will 'ceph-disk trigger DEV --sync'. Signed-off-by: Sage Weil <sage@redhat.com>	2015-09-01 11:22:25 -04:00
Sage Weil	8f3185bade	systemd: use --setuser and --setgroup for all daemons Allow all daemons to drop privileges themselves, instead of letting systemd do it. Among other things, this means that admins can conditionally not drop privileges by setting setuser match path = /var/lib/ceph/$type/$cluster-$id in their ceph.conf to ease the pain of upgrade. Signed-off-by: Sage Weil <sage@redhat.com> Reviewed-by: Boris Ranto <branto@redhat.com>	2015-08-26 20:34:15 -04:00
Sage Weil	c7ee798a0f	set nofile ulimit in /etc/security/limits.d/ceph only Specify the nofile ulimit in one standard place, where everyone expects it to be. Drop it from the ceph-osd unit file. Leave upstart and sysvinit untouched for the time being to avoid compat issues. Signed-off-by: Sage Weil <sage@redhat.com>	2015-08-26 20:34:15 -04:00
Sage Weil	7c9fdf44f2	systemd: make ceph-osd setuid/gid to ceph:ceph Signed-off-by: Sage Weil <sage@redhat.com>	2015-08-26 20:34:15 -04:00
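
Several of the commits above (62084375fa, b3887379d6, e73eb8cc1e) adjust how systemd restarts a failed daemon. As a rough illustration only, the OSD settings described in b3887379d6 could be expressed as a drop-in like the one below. This is a sketch, not the content of the actual unit file: the drop-in path is hypothetical, the 20 s delay and 30 attempts come from the b3887379d6 message, and the 30-minute window is assumed from the upstart-derived limits mentioned in 62084375fa.

```ini
# Hypothetical drop-in: /etc/systemd/system/ceph-osd@.service.d/restart.conf
# Values are taken from the commit messages above, not from the real unit file.
[Service]
# Restart the daemon only when it exits with a failure.
Restart=on-failure
# Wait 20 seconds before each restart instead of the 100 ms default.
RestartSec=20s
# Allow up to 30 start attempts within the rate-limit window.
StartLimitBurst=30
# Assumed window of 30 minutes (from the upstart-style limits in 62084375fa).
# Newer systemd releases document this as StartLimitIntervalSec= in [Unit].
StartLimitInterval=30min
```

After adding or changing a drop-in, `systemctl daemon-reload` makes systemd pick it up. For the monitor, e73eb8cc1e describes the same pattern with a 10 s delay and 5 attempts.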


70 Commits