Note: the name is important so that kclient mount can override the
distro setting.
Fixes: https://tracker.ceph.com/issues/43968
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
If we get a SIGINT or SIGTERM or are deleted from the OSDMap, do a fast
shutdown by exiting immediately. This has a few important benefits:
- We immediately stop responding (binding) to any sockets, which means
other OSDs will immediately decide we are down (and dead!). This
minimizes IO interruption.
- We avoid the complex "clean" shutdown process, which is historically a
source of bugs.
In reality, the only purpose of the "clean" shutdown is to try to tear down
everything in memory so we can do memory leak checking with valgrind. Set
this option to false for valgrind QA runs so we can still do that.
Note that with the new read leases in octopus, we rely on the default
behavior that an ECONNREFUSED is taken to mean that the OSD is fully dead,
so that we don't have to wait for any leases to time out. This works in
sane environments with normal IP networks, but that behavior could
conceivably be a bad idea if there are some weird network shenanigans
going on. If osd_fast_fail_on_connection_refused were disabled, then this
fast shutdown procedure might be *worse* than the clean shutdown because
we would have to wait for the heartbeat timeout.
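The decision described above can be sketched roughly as follows. This is
an illustration in Python, not the OSD's actual code: FAST_SHUTDOWN,
clean_shutdown and shutdown_action are hypothetical names standing in
for the option and the two shutdown paths.

```python
import os
import signal

# Hypothetical stand-in for the config option discussed above
# (set to False for valgrind leak-check runs).
FAST_SHUTDOWN = True


def clean_shutdown():
    """Placeholder for the slow path that tears down in-memory state."""
    return "clean"


def fast_shutdown():
    """Exit at once: os._exit skips atexit handlers and other cleanup,
    and the kernel closes our sockets, so peers see connection refused."""
    os._exit(0)


def shutdown_action(fast):
    """Pick the shutdown path for the given option value."""
    return fast_shutdown if fast else clean_shutdown


def handle_signal(signum, frame):
    """Signal handler: run whichever path the option selects."""
    shutdown_action(FAST_SHUTDOWN)()


def install_handlers():
    """Route SIGINT and SIGTERM to the shutdown handler."""
    for sig in (signal.SIGINT, signal.SIGTERM):
        signal.signal(sig, handle_signal)
```

Calling install_handlers() at daemon start would make both signals take
the fast path unless the option is turned off.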
Signed-off-by: Sage Weil <sage@redhat.com>
Valgrind makes the MDS slowwwww. The newish mds_heartbeat_grace config allows
us to keep sending beacons to the mons even if the internal heartbeat is slow.
This avoids spurious laggy messages, which would otherwise pollute greps for
unrelated messaging issues.
Fixes: http://tracker.ceph.com/issues/38723
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
We've disabled the "clean" shutdown in ceph-mgr due to
https://tracker.ceph.com/issues/38621
Until that is fixed, no valgrind leak checks!
Signed-off-by: Sage Weil <sage@redhat.com>
This will be followed by removing common CephFS configurations in the
ceph.conf.template in teuthology.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
This utilizes the recent feature in teuthology [1] to skip hidden files in
suites when building the job matrix.
The idea of this change is to allow referring to the top-level qa directory
in a position-independent way, so that copying a suite to another location
does not break any symlinks.
[1] https://github.com/ceph/teuthology/pull/1185
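A rough sketch of the idea, in Python. The directory layout, the ".qa"
link name and the helper names here are assumptions for illustration
only; they are not taken from this change or from teuthology's code.

```python
import os
import tempfile


def visible_entries(path):
    """List directory entries, skipping hidden ones (names starting with '.')."""
    return sorted(e for e in os.listdir(path) if not e.startswith("."))


def make_demo_tree(root):
    """Build a toy qa/ tree with relative, hidden symlinks (hypothetical layout)."""
    qa = os.path.join(root, "qa")
    suite = os.path.join(qa, "suites", "fs")
    os.makedirs(suite)
    os.makedirs(os.path.join(qa, "cephfs"))
    # qa/suites/.qa -> ..  (points at qa/)
    os.symlink("..", os.path.join(qa, "suites", ".qa"))
    # qa/suites/fs/.qa -> ../.qa  (chains up to the parent's link)
    os.symlink(os.path.join("..", ".qa"), os.path.join(suite, ".qa"))
    # The suite reaches shared config through .qa rather than ../../
    os.symlink(os.path.join(".qa", "cephfs"), os.path.join(suite, "conf"))
    return suite


def demo():
    """Return the visible suite entries and whether 'conf' resolves to qa/cephfs."""
    with tempfile.TemporaryDirectory() as root:
        suite = make_demo_tree(root)
        target = os.path.realpath(os.path.join(suite, "conf"))
        expected = os.path.realpath(os.path.join(root, "qa", "cephfs"))
        return visible_entries(suite), target == expected
```

Because the hidden link is skipped when listing the suite, it never
shows up as a job fragment, while links routed through it keep
resolving to the top-level directory.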
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Valgrind runs itself on forked children, and does its cleanup when they
complete, and this is slow... slow enough that it frequently makes the
test time out.
Valgrind lets you ignore child *processes* that you exec, but I can't
find a way to skip forked children in the same address space.
Work around this by skipping this validation when running under valgrind.
Fixes: http://tracker.ceph.com/issues/20602
Signed-off-by: Sage Weil <sage@redhat.com>
This reverts 693bd23851, which was
added in response to http://tracker.ceph.com/issues/18126. But
we updated the Ubuntu packages in sepia so it should be good to go.
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
This change happened a while back, but it got rolled back
when the generic objectstore/ dir had its filestore
entry split out into xfs and btrfs in 208675af.
Signed-off-by: John Spray <john.spray@redhat.com>
Now that we send these to the cluster log, we must
whitelist them in the tests that exercise those
unhealthy states.
Fixes: http://tracker.ceph.com/issues/19551
Signed-off-by: John Spray <john.spray@redhat.com>
These were running so few ops that they weren't
giving any meaningful exercise to a multimds
system beyond what we're already covering in
the fs suite.
Signed-off-by: John Spray <john.spray@redhat.com>