Currently, the systemd daemons are not restarted on failure. This patch
adds this functionality and sets the defaults to those defined in
upstart. This resolves to 3 fails per 30 minutes for osd, mon and mds
and 5 fails per 30 seconds for radosgw.
Signed-off-by: Boris Ranto <branto@redhat.com>
If systemd has task accounting enabled, a default of 512 tasks
will be applied to all systemd units.
For ceph, this is way to low even for a modest cluster, so stop
this restriction being applied and allow administrators to apply
limits using sysctl.
Signed-off-by: James Page <james.page@ubuntu.com>
These are not supported by /usr/lib/ceph/ceph-osd-prestart.sh,
resulting in warnings:
ceph-osd-prestart.sh[23367]: getopt: unrecognized option '--setuser'
ceph-osd-prestart.sh[23367]: getopt: unrecognized option '--setgroup'
--setuser and --setgroup are only needed for the ceph-osd process.
Signed-off-by: James Page <james.page@ubuntu.com>
First, it makes sense for both ceph_common.sh and ceph-osd-prestart.sh to
reside in the same directory: make it so.
Second, /usr/lib exists on both RHEL/Fedora and SLE/openSUSE, whereas
the later lacks /usr/libexec. To make this less painful, package
ceph_common.sh and ceph-osd-prestart.sh in /usr/lib/ceph.
Third, allow e.g. FreeBSD to do its own thing by using the $(libexecdir)
Autoconf variable (but set it to /usr/lib in the spec file).
http://tracker.ceph.com/issues/14687Fixes: #14687
Signed-off-by: Nathan Cutler <ncutler@suse.com>
This change makes it so the mon/osd/mds/radosgw daemons:
o Cannot write to /usr, /etc, and /boot.
o Cannot access /home, /root, or /run/user.
o Each daemon gets its own private /tmp and /var/tmp.
o All daemons get a private /dev without physical devices (exception: osd)
I'm not sure if the osd daemon needs access to a full /dev so I left
ProtectDevices out for ceph-osd@.service.
Signed-off-by: Patrick Donnelly <batrick@batbytes.com>
The flock command may be installed elsewhere, depending on the
system. Let the PATH search figure that out.
http://tracker.ceph.com/issues/13975Fixes: #13975
Signed-off-by: Loic Dachary <loic@dachary.org>
systemd: start/stop/restart ceph services by daemon type
Reviewed-by: Nathan Cutler <ncutler@suse.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Boris Ranto <branto@redhat.com>
Reviewed-by: Ken Dreyer <kdreyer@redhat.com>
When activating a device, ceph-disk trigger restarts the ceph-disk
systemd service. Two consecutive udev add on the same device will
restart the ceph-disk systemd service and the second one may kill the
first one, leaving the device half activated.
The ceph-disk systemd service is instructed to not kill an existing
process when restarting. The second run waits (via flock) for the second
one to complete before running so that they do not overlap.
http://tracker.ceph.com/issues/13160Fixes: #13160
Signed-off-by: Loic Dachary <ldachary@redhat.com>
We were observed to be hitting the limit on centos7
(triggering pthread_create failures) on a ~2000 OSD cluster.
Increasing this resolves it!
Reported-by: Dan van der Ster <daniel.vanderster@cern.ch>
Signed-off-by: Sage Weil <sage@redhat.com>
This allows members of the ceph group to make librados clients (like the
ceph cli and qemu) create sockets in the default /var/run/ceph/* location.
Signed-off-by: Sage Weil <sage@redhat.com>
Allow all daemons drop privilege themselves, instead of letting
systemd do it.
Among other things, this means that admins can conditionally not
drop prives by setting
setuser match path = /var/lib/ceph/$type/$cluster-$id
in their ceph.conf to ease the pain of upgrade.
Signed-off-by: Sage Weil <sage@redhat.com>
Reviewed-by: Boris Ranto <branto@redhat.com>
Specify the nofile ulimit in one standard place, where everyone expects it
to be. Drop it from the ceph-osd unit file.
Leave upstart and sysvinit untouched for the time being to avoid compat
issues.
Signed-off-by: Sage Weil <sage@redhat.com>
The udev(7) man page states:
RUN
...
This can only be used for very short-running foreground tasks. Running
an event process for a long period of time may block all further
events for this or a dependent device.
Starting daemons or other long-running processes is not appropriate
for udev; the forked processes, detached or not, will be
unconditionally killed after the event handling has finished.
ceph-disk activate is far from a short-running task:
- check whether path is a block dev, for dirs call through to
activate_dir()
- call blkid to obtain the filesystem type for the block dev
- pull mount options from hard-coded ceph.conf file
- mount the OSD dev at a temporary path
- check the ceph magic for mounted filesystem
- read cluster uuid and locate corresponding /etc/ceph/{cluster}.conf
path
- read or generate (if missing) the OSD uuid
- create a file indicating init system usage (systemd)
- mount the device at a second (final) location
- umount (lazy) the temporary mount path
- enable the systemd ceph-osd@{osd_id} service
- start the systemd ceph-osd@{osd_id} service
This logic is therefore best left in a systemd service for execution. As
it is less limited in terms of execution time, and also allows for
improved event handling in future (fsck, dmcrypt mapping etc.).
This change sees 95-ceph-osd.rules.systemd trigger ceph-disk activate or
ceph-disk activate-journal via new ceph-disk-activate-journal@.service,
ceph-disk-activate@.service and ceph-disk-dmcrypt-activate@.service
systemd service files.
ceph-disk-dmcrypt-activate@.service makes use of the newly added
--dmcrypt parameter for ceph-disk activate.
Signed-off-by: David Disseldorp <ddiss@suse.de>
To simplify the spec file we should install as much using autotools
and as little as possible in the spec file.
Signed-off-by: Owen Synge <osynge@suse.com>
Added a radosgw systemd support and associated prestart script.
- With improved checking over first revison.
- ceph-radosgw-prestart.sh now installed in /usr/lib/ceph-radosgw
Signed-off-by: Owen Synge <osynge@suse.com>
tmpfiles.d are part of system.d and define how temporary directories are setup.
rgw needs a socket directory. To do this we template tmpfiles.d user and group
for rgw and fill in the values using autotools.
Note1: Added to spec file.
Note2: Name changed to rgw from radosgw as is preferred name by Sage.
Note3: Adds configure options
--with-rgw-user=UserName
--with-rgw-group=GroupName
Note4: Defaults set for debian
Note5: spec file overrides defaults for redhat and suse
Signed-off-by: Owen Synge <osynge@suse.com>
Before this patch, the command 'logrotate -f /etc/logrotate.d/ceph'
was generating an error "Failed to reload ceph.target: Job type reload is not
applicable for unit ceph.target".
Before we issue systemctl reload, check that there is at least
one active ceph-* service. (The hyphen is significant.)
Since we use grep, make the grep package a dependency.
http://tracker.ceph.com/issues/12173Fixes: #12173
Signed-off-by: Tim Serong <tserong@suse.com>
Signed-off-by: Lars Marowsky-Bree <lmb@suse.com>
Signed-off-by: Nathan Cutler <ncutler@suse.com>
- only start OSDs if mon daemons are also present
- adds support for mask and unmask
- removes support for cluster with non default cluster name,
as this was very limited and inconsistent
- Reapplied from a patch as could not cherry-pick
66cb46c411 from Mon Jan 12
as this produced issues with src/gmock
Signed-off-by: Owen Synge <osynge@suse.com>
The libexec path is different for different distributions.
systemd. This path is defined by a new variable on the
configure path.
This variable can be set with enviroment SYSTEMD_LIBEXEC_DIR.
The parameter --with-systemd-libexec-dir overrides the enviroment
variable.
Appropriate conditionals are set for SUSE and RHEL derivatives.
This is then used to template out systemd/ceph-osd@.service
Signed-off-by: Owen Synge <osynge@suse.com>
Under heavy load the number of file descriptors opened
by the OSD can go beyond the 64K file limit. This patch
increases the default to 128K.
Signed-off-by: Owen Synge <osynge@suse.com>
Prior to this commit, we didn't install /var/run/ceph as a normal
directory. We used the %ghost directive and created the directory with
a "mkdir" command in %post.
This was lacking in several ways:
1) Simplicy: there is no need to use %ghost; other packages (eg.
mariadb) simply use a normal %dir for their socket directory.
2) RPM does not have control over the permissions of the /var/run/ceph
directory. This does not interact well with "rpm -V". Moreover,
once Ceph itself gets unprivileged user support, RPM itself won't
be able to set the permissions of the directory for a (future)
unprivileged UID.
3) On distributions that use systemd as an init system, /var/run is a
symlink to /run, which is tmpfs. This means that /var/run/ceph does
not persist across reboots on those systems.
Remove the %ghost directive; it makes more sense for RPM to simply
install this directory like the rest of the %files.
Add a "_with_systemd" conditional so we know which distros use systemd
as their init system. Add the /etc/tmpfiles.d/ceph.conf file on those
distros. See
http://www.freedesktop.org/software/systemd/man/tmpfiles.d.html
Signed-off-by: Ken Dreyer <kdreyer@redhat.com>
Added ceph.target
Made ceph-mds, ceph-mon, ceph-osd services part of ceph.target
Made ceph-mds, ceph-mon, ceph-osd services require partitions to be available.
Added ceph init script with sysV like behaviour.
Provided by Tim Serong tserong@suse.com and Owen Synge osynge@suse.com
Signed-off-by: Owen Synge <osynge@suse.com>
This patch adds systemd service files. It is possible to start and
enable multiple instances (per monid, osdid, mds name), e.g.
# systemctl start ceph-mon@node01
# systemctl enable ceph-mon@node01
# systemctl start ceph-osd@0
# systemctl enable ceph-osd@0
The ceph cluster can be set in the system config file:
/etc/sysconfig/ceph
adding or editing the CLUSTER environment variable.
Signed-off-by: Federico Simoncelli <fsimonce@redhat.com>