ceph/systemd/ceph-osd@.service.in

[Unit]
# Include the OSD id in the description so systemd logs identify the failing
# instance, e.g. "Failed to start Ceph object storage daemon osd.27" rather
# than just "Failed to start Ceph object storage daemon."
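# A quick illustration of how the template instance fills in %i (the osd id
# 27 below is only an example, not a value from this repository):
#   systemctl start ceph-osd@27    # description becomes "... osd.27"
#   journalctl -u ceph-osd@27      # journal entries carry the osd id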
Description=Ceph object storage daemon osd.%i
After=network-online.target local-fs.target time-sync.target ceph-mon.target
Wants=network-online.target local-fs.target time-sync.target
PartOf=ceph-osd.target

[Service]
LimitNOFILE=1048576
LimitNPROC=1048576
EnvironmentFile=-@SYSTEMD_ENV_FILE@
Environment=CLUSTER=ceph
ExecStart=/usr/bin/ceph-osd -f --cluster ${CLUSTER} --id %i --setuser ceph --setgroup ceph
ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id %i
ExecReload=/bin/kill -HUP $MAINPID
LockPersonality=true
MemoryDenyWriteExecute=true
# smartctl is run via `sudo`, which needs to gain new privileges, so keep NoNewPrivileges off
NoNewPrivileges=false
ProtectControlGroups=true
ProtectHome=true
ProtectKernelModules=true
# flushing filestore requires access to /proc/sys/vm/drop_caches
ProtectKernelTunables=false
ProtectSystem=full
PrivateTmp=true
TasksMax=infinity
Restart=on-failure
# Allow at most 3 (re)starts within 30 minutes, as fast as possible. The old
# 30-restarts-per-30-minutes limit only worked around udev/ceph-disk startup
# races; with ceph-volume that race is gone, so a persistently failing OSD
# should stay down rather than keep rejoining the cluster.
# See http://tracker.ceph.com/issues/24368.
StartLimitInterval=30min
StartLimitBurst=3
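# If the start limit trips after repeated failures, it can be cleared by hand
# before retrying (osd id 27 is only an example; on recent systemd versions
# reset-failed also resets the start rate counter):
#   systemctl reset-failed ceph-osd@27
#   systemctl start ceph-osd@27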

[Install]
WantedBy=ceph-osd.target