Currently, we start/stop OSDs and MONs simultaneously. This may cause
problems, especially when we are shutting down the system: once a MON
goes down, it triggers a re-election, and the MONs can miss the message
from an OSD that is going down.
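One way to avoid the simultaneous start/stop is to order the OSD unit
after the MONs. A minimal sketch, not the literal diff: it assumes the
MON units are grouped under a target named ceph-mon.target, which this
message does not confirm.

```ini
# ceph-osd@.service (fragment); ceph-mon.target is an assumed target name
[Unit]
# Start this OSD only after the MON target has been started. systemd stops
# units in the reverse of their start order, so on shutdown the OSD is
# stopped before the MONs.
After=network-online.target local-fs.target ceph-mon.target
```

After= only orders the units; it does not pull the MONs in as a
dependency.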
Resolves: http://tracker.ceph.com/issues/18516
Signed-off-by: Boris Ranto <branto@redhat.com>
Instead of the default 100ms pause before trying to restart an OSD, wait
20 seconds, and retry 30 times instead of 3. There is no scenario in
which restarting an OSD almost immediately after it failed would get a
better result.
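Expressed as unit settings, the values above could look roughly like
this. It is a sketch: Restart=on-failure and the 30 minute window are
assumptions that this message does not state.

```ini
# ceph-osd@.service (fragment)
[Service]
# Assumed restart policy; not stated in this message
Restart=on-failure
# Pause between restart attempts (systemd's default is 100ms)
RestartSec=20s
# Allow up to 30 start attempts within the rate-limit window
StartLimitBurst=30
# Assumed window; not stated in this message
StartLimitInterval=30min
```

Newer systemd releases spell the window StartLimitIntervalSec= and
expect it in the [Unit] section.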
It is possible that a failure to start is due to a race with another
systemd unit at boot time. For instance if ceph-disk@.service is
delayed, it may start after the OSD that needs it. A long pause may give
the racing service enough time to complete and the next attempt to start
the OSD may succeed.
This is not a sound way to resolve a race; it only makes the OSD boot
process less sensitive to one. In the example above, the proper fix is to
enable --runtime ceph-osd@.service so that it cannot race at boot time.
The delay should not be on the order of minutes, so that the current
runtime behavior is preserved. For instance, if an OSD is killed or fails
and only restarts after 10 minutes, it will be marked down by the ceph
cluster. That would not break things, but it is a significant change and
should be avoided.
Refs: http://tracker.ceph.com/issues/17889
Signed-off-by: Loic Dachary <loic@dachary.org>
Currently, the systemd daemons are not restarted on failure. This patch
adds this functionality and sets the defaults to those defined in
upstart. This resolves to 3 failures per 30 minutes for osd, mon and mds
and 5 failures per 30 seconds for radosgw.
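A sketch of how these limits map onto unit settings. The unit file names
in the comments are assumptions; the numbers are the ones quoted above.

```ini
# ceph-osd@.service, ceph-mon@.service, ceph-mds@.service (fragment)
[Service]
Restart=on-failure
# At most 3 starts per 30 minutes, then systemd gives up
StartLimitInterval=30min
StartLimitBurst=3

# ceph-radosgw@.service (fragment)
[Service]
Restart=on-failure
# At most 5 starts per 30 seconds
StartLimitInterval=30s
StartLimitBurst=5
```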
Signed-off-by: Boris Ranto <branto@redhat.com>
If systemd has task accounting enabled, a default of 512 tasks
will be applied to all systemd units.
For ceph, this is way too low even for a modest cluster, so stop
this restriction being applied and allow administrators to apply
limits using sysctl.
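A minimal sketch of the setting that lifts the per-unit cap, using the
standard TasksMax= directive; the same line would go into each ceph
service unit.

```ini
[Service]
# Do not apply the default per-unit task limit to this service
TasksMax=infinity
```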
Signed-off-by: James Page <james.page@ubuntu.com>
--setuser and --setgroup are not supported by /usr/lib/ceph/ceph-osd-prestart.sh,
resulting in warnings:
ceph-osd-prestart.sh[23367]: getopt: unrecognized option '--setuser'
ceph-osd-prestart.sh[23367]: getopt: unrecognized option '--setgroup'
--setuser and --setgroup are only needed for the ceph-osd process.
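The resulting split could look like the sketch below. The --cluster and
--id arguments, the /usr/bin/ceph-osd path and the ceph user and group
names are assumptions for illustration.

```ini
# ceph-osd@.service (fragment)
[Service]
# The prestart script gets no --setuser/--setgroup
ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id %i
# Only the daemon itself drops privileges
ExecStart=/usr/bin/ceph-osd -f --cluster ${CLUSTER} --id %i --setuser ceph --setgroup ceph
```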
Signed-off-by: James Page <james.page@ubuntu.com>
First, it makes sense for both ceph_common.sh and ceph-osd-prestart.sh to
reside in the same directory: make it so.
Second, /usr/lib exists on both RHEL/Fedora and SLE/openSUSE, whereas
the latter lacks /usr/libexec. To make this less painful, package
ceph_common.sh and ceph-osd-prestart.sh in /usr/lib/ceph.
Third, allow e.g. FreeBSD to do its own thing by using the $(libexecdir)
Autoconf variable (but set it to /usr/lib in the spec file).
http://tracker.ceph.com/issues/14687
Fixes: #14687
Signed-off-by: Nathan Cutler <ncutler@suse.com>
This change makes it so the mon/osd/mds/radosgw daemons:
o Cannot write to /usr, /etc, and /boot.
o Cannot access /home, /root, or /run/user.
o Each daemon gets its own private /tmp and /var/tmp.
o All daemons get a private /dev without physical devices (exception: osd)
I'm not sure if the osd daemon needs access to a full /dev so I left
PrivateDevices out for ceph-osd@.service.
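The restrictions listed above correspond to the following standard
systemd sandboxing directives. This is a sketch of the mapping, not
necessarily the exact values used in the patch.

```ini
# ceph-mon@.service, ceph-mds@.service, radosgw unit (fragment)
[Service]
# /usr, /boot and /etc are mounted read-only for the service
ProtectSystem=full
# /home, /root and /run/user are made inaccessible
ProtectHome=true
# Private /tmp and /var/tmp per service
PrivateTmp=true
# Private /dev without physical devices; left out of ceph-osd@.service
PrivateDevices=yes
```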
Signed-off-by: Patrick Donnelly <batrick@batbytes.com>
We were observed to be hitting the limit on centos7
(triggering pthread_create failures) on a ~2000 OSD cluster.
Increasing this resolves it!
Reported-by: Dan van der Ster <daniel.vanderster@cern.ch>
Signed-off-by: Sage Weil <sage@redhat.com>
The libexec path is different for different distributions. For the
systemd units, this path is now defined by a new configure variable.
This variable can be set with the environment variable SYSTEMD_LIBEXEC_DIR.
The parameter --with-systemd-libexec-dir overrides the environment
variable.
Appropriate conditionals are set for SUSE and RHEL derivatives.
This is then used to template out systemd/ceph-osd@.service.
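A sketch of what the templated unit might contain. The placeholder name
@SYSTEMD_LIBEXEC_DIR@, the ceph/ subdirectory and the script arguments
are all assumptions, not taken from the patch.

```ini
# systemd/ceph-osd@.service template (fragment)
[Service]
# @SYSTEMD_LIBEXEC_DIR@ is a hypothetical placeholder that configure
# would replace with the distribution's libexec path
ExecStartPre=@SYSTEMD_LIBEXEC_DIR@/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id %i
```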
Signed-off-by: Owen Synge <osynge@suse.com>
Under heavy load the number of file descriptors opened
by the OSD can go beyond the 64K file limit. This patch
increases the default to 128K.
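As a unit setting this would be roughly the following, assuming "128K"
means 128 * 1024.

```ini
# ceph-osd@.service (fragment)
[Service]
# 128 * 1024 open file descriptors
LimitNOFILE=131072
```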
Signed-off-by: Owen Synge <osynge@suse.com>
Added ceph.target
Made ceph-mds, ceph-mon, ceph-osd services part of ceph.target
Made ceph-mds, ceph-mon, ceph-osd services require partitions to be available.
Added ceph init script with SysV-like behaviour.
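The target and service wiring described above can be sketched as
follows. The Description text and the use of local-fs.target to wait for
partitions are assumptions.

```ini
# ceph.target
[Unit]
Description=Ceph services

[Install]
WantedBy=multi-user.target

# ceph-osd@.service (fragment; same idea for ceph-mon@ and ceph-mds@)
[Unit]
# Stopping or restarting ceph.target propagates to this service
PartOf=ceph.target
# Assumed way to wait for partitions: order after local filesystems
After=local-fs.target

[Install]
# Enabled instances are started together with ceph.target
WantedBy=ceph.target
```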
Provided by Tim Serong tserong@suse.com and Owen Synge osynge@suse.com
Signed-off-by: Owen Synge <osynge@suse.com>
This patch adds systemd service files. It is possible to start and
enable multiple instances (per monid, osdid, mds name), e.g.
# systemctl start ceph-mon@node01
# systemctl enable ceph-mon@node01
# systemctl start ceph-osd@0
# systemctl enable ceph-osd@0
The ceph cluster name can be set in the system config file:
/etc/sysconfig/ceph
by adding or editing the CLUSTER environment variable.
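A sketch of how a unit might pick up that variable. The default value
"ceph", the /usr/bin/ceph-mon path and the ExecStart arguments are
assumptions for illustration.

```ini
# ceph-mon@.service (fragment)
[Service]
# Fallback cluster name, used when the config file does not set CLUSTER
Environment=CLUSTER=ceph
# The leading "-" tells systemd to ignore the file if it is missing;
# values from this file override Environment= above
EnvironmentFile=-/etc/sysconfig/ceph
# %i is the instance name, e.g. node01 for ceph-mon@node01
ExecStart=/usr/bin/ceph-mon -f --cluster ${CLUSTER} --id %i
```

With a line such as CLUSTER=mycluster (a hypothetical name) in
/etc/sysconfig/ceph, every instance started from this unit would use
that cluster.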
Signed-off-by: Federico Simoncelli <fsimonce@redhat.com>