21 Commits

Author SHA1 Message Date
Matan Breizman
d580e2392f qa/suites/rados/verify/validater/valgrind: increase op thread timeout
Fixes: https://tracker.ceph.com/issues/62992

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
2023-11-14 11:30:00 +00:00
Sage Weil
dc64ccf063 qa/suites: do not use notcmalloc flavor
teuthology now knows how to run valgrind against a tcmalloc binary

Signed-off-by: Sage Weil <sage@newdream.net>
2021-02-18 10:26:28 -06:00
Sage Weil
c7244e7aad misc language changes: whitelist -> ignore etc
Signed-off-by: Sage Weil <sage@newdream.net>
2020-08-24 19:53:08 +00:00
Sage Weil
2ee9365d0b qa: log-whitelist -> log-ignorelist
Signed-off-by: Sage Weil <sage@newdream.net>
2020-08-24 19:53:08 +00:00
Sage Weil
7c19c1534b qa/suites/rados/verify/validater/valgrind: tolerate SLOW_OPS
Signed-off-by: Sage Weil <sage@redhat.com>
2020-03-17 19:32:42 -05:00
Sage Weil
baeb051910 qa/suites/rados/verify/validater/valgrind: less bluestore logging
Signed-off-by: Sage Weil <sage@redhat.com>
2020-03-17 19:32:42 -05:00
Sage Weil
4fda9d50f0 qa/suites/rados/verify/validater: increase heartbeat grace
Signed-off-by: Sage Weil <sage@redhat.com>
2020-03-17 19:32:42 -05:00
Sage Weil
12105ed9d7 Revert "qa/suites/rados/verify/validator/valgrind: debug refs = 5"
This reverts commit 65e81e6eb4.

This slows things down too much with valgrind.

Signed-off-by: Sage Weil <sage@redhat.com>
2020-03-17 19:32:42 -05:00
Sage Weil
cf352c3ac0 osd: add osd_fast_shutdown option (default true)
If we get a SIGINT or SIGTERM or are deleted from the OSDMap, do a fast
shutdown by exiting immediately.  This has a few important benefits:

 - We immediately stop responding (binding) to any sockets, which means
   other OSDs will immediately decide we are down (and dead!).  This
   minimizes IO interruption.
 - We avoid the complex "clean" shutdown process, which is historically a
   source of bugs.

In reality, the only purpose of the "clean" shutdown is to try to tear down
everything in memory so we can do memory leak checking with valgrind.  Set
this option to false for valgrind QA runs so we can still do that.

Note that with the new read leases in octopus, we rely on the default
behavior that an ECONNREFUSED is taken to mean that the OSD is fully dead,
so that we don't have to wait for any leases to time out.  This works in
sane environments with normal IP networks, but that behavior could
conceivably be a bad idea if there are some weird network shenanigans
going on.  If osd_fast_fail_on_connection_refused were disabled, then this
fast shutdown procedure might be *worse* than the clean shutdown because
we would have to wait for the heartbeat timeout.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-11-15 09:31:50 -06:00
Sage Weil
52d706c75f qa/suites/rados/verify: whitelist MON_DOWN when using valgrind
Signed-off-by: Sage Weil <sage@redhat.com>
2019-09-29 10:27:01 -05:00
Sage Weil
e79dc454db qa/suites: disable valgrind leak checks on ceph-mgr
We've disabled the "clean" shutdown in ceph-mgr due to
https://tracker.ceph.com/issues/38621

Until then, no valgrind leak checks!

Signed-off-by: Sage Weil <sage@redhat.com>
2019-03-07 13:03:28 -06:00
Sage Weil
65e81e6eb4 qa/suites/rados/verify/validator/valgrind: debug refs = 5
If we detect a leak, let's include logging so we can find it.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-02-07 12:10:34 -06:00
Sage Weil
03908113b4 qa/suites: valgrind ceph-mgr too
Signed-off-by: Sage Weil <sage@redhat.com>
2018-11-09 08:52:07 -06:00
Patrick Donnelly
716db6e2fd qa: add .qa helper link
This utilizes the recent feature in teuthology [1] to skip hidden files in
suites when building the job matrix.

The idea of this change is to enable referring to the top-level qa directory in a
position-independent way such that copies of a suite to another location do not
break any symlinks.

[1] https://github.com/ceph/teuthology/pull/1185

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2018-06-26 11:33:48 -07:00
Sage Weil
41e5a85308 qa/suites/rados/verify/validater/valgrind: whitelist PG_
Peering might be slow due to valgrind.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-12 14:18:59 -04:00
Sage Weil
960f00071f qa/suites: disable mon crush smoke test with valgrind
Valgrind runs itself on forked children, and does its cleanup when they
complete, and this is slow... slow enough that it frequently makes the
test time out.

Valgrind lets you ignore child *processes* that you exec, but I can't
find a way to skip forked children in the same address space.

Work around this by skipping this validation when running under valgrind.

Fixes: http://tracker.ceph.com/issues/20602
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-14 11:51:47 -04:00
Sage Weil
c7893283cd do all valgrind runs on centos
We are fighting two issues with valgrind on ubuntu (xenial, yakkety,
and z):

	http://tracker.ceph.com/issues/18126
	http://tracker.ceph.com/issues/20360

Revert this when it is fixed.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-06-30 09:33:18 -04:00
Greg Farnum
7d33e98bd3 qa: do not restrict valgrind runs to centos
This reverts 693bd23851, which was
added in response to http://tracker.ceph.com/issues/18126. But
we updated the Ubuntu packages in sepia so it should be good to go.

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
2017-06-23 16:25:16 -04:00
Sage Weil
aa76cf7488 Revert "qa: do not restrict valgrind runs to centos"
This reverts commit 5923961465.

See http://tracker.ceph.com/issues/20360

Signed-off-by: Sage Weil <sage@redhat.com>
2017-06-20 17:14:52 -04:00
Greg Farnum
5923961465 qa: do not restrict valgrind runs to centos
This reverts 693bd23851, which was
added in response to http://tracker.ceph.com/issues/18126. But
we updated the Ubuntu packages in sepia so it should be good to go.

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
2017-05-31 08:37:19 -07:00
Sage Weil
c01f2ee0e2 move ceph-qa-suite dirs into qa/
2016-12-14 11:29:55 -06:00