RepoMirrors/ceph

mirror of https://github.com/ceph/ceph synced 2025-02-15 23:07:55 +00:00

Author	SHA1	Message	Date
Sage Weil	4f742f200d	qa/suites/rados/verify: debug_ms = 1, osd_heartbeat_grace = 60 The rados api tests are failing WatchNotify because the OSDs are so heavily lagged.. in large part due to the high debug level of debug_ms=20 and debug_osd=25. Reduce that. Also increase the heartbeat grace so slow valgrind-y osds don't get marked down. Signed-off-by: Sage Weil <sage@redhat.com>	2020-03-11 06:57:52 -05:00
Sage Weil	1400b35858	qa/suites/rados/verity/tasks/mon_recovery: whitelist SLOW_OPS The mon can see slow ops when thrashing. Signed-off-by: Sage Weil <sage@redhat.com>	2020-03-01 07:58:11 -06:00
Sridhar Seshasayee	e527067666	qa: Whitelist 'slow request' within a bunch of tests Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>	2020-02-24 19:59:56 +05:30
Sage Weil	8e3eb592b0	qa/suites/rados/verify: debug monc = 20 Hunting https://tracker.ceph.com/issues/43882 Signed-off-by: Sage Weil <sage@redhat.com>	2020-01-29 09:53:41 -06:00
Sage Weil	344ff7f0ef	qa/suites/rados/verify: pin to specific centos The simple os_type: centos in valgrind.yaml doesn't pick a particular centos, and we end up with the teuthology default (currently 7.6). Signed-off-by: Sage Weil <sage@redhat.com>	2019-12-20 07:17:10 -06:00
Sage Weil	cf352c3ac0	osd: add osd_fast_shutdown option (default true) If we get a SIGINT or SIGTERM or are deleted from the OSDMap, do a fast shutdown by exiting immediately. This has a few important benefits: - We immediately stop responding (binding) to any sockets, which means other OSDs will immediately decide we are down (and dead!). This minimizes IO interruption. - We avoid the complex "clean" shutdown process, which is historically a source of bugs. In reality, the only purpose of the "clean" shutdown is to try to tear down everything in memory so we can do memory leak checking with valgrind. Set this option to false for valgrind QA runs so we can still do that. Note that with the new read leases in octopus, we rely on the default behavior that a ECONNREFUSED is taken to mean that the OSD is fully dead, so that we don't have to wait for any leases to time out. This works in sane environments with normal IP networks, but that behavior could conceivably be a bad idea if there are some weird network shenanigans going on. If osd_fast_fail_on_connection_refused were disabled, then this fast shutdown procedure might be worse than the clean shutdown because we would have to wait for the heartbeat timeout. Signed-off-by: Sage Weil <sage@redhat.com>	2019-11-15 09:31:50 -06:00
Sage Weil	71d74aa8c6	qa: more tries for mon tell when injecting msgr failures With failure injection the default 2 tries isn't quite enough Signed-off-by: Sage Weil <sage@redhat.com>	2019-10-11 14:16:42 -05:00
David Zafman	fdf93add0b	Merge pull request #30714 from dzafman/wip-41743 test: Ignore OSD_SLOW_PING_TIME* if injecting socket failures Reviewed-by: Neha Ojha <nojha@redhat.com>	2019-10-04 18:28:48 -07:00
David Zafman	ded58ef91d	test: Ignore OSD_SLOW_PING_TIME* if injecting socket failures Fixes: https://tracker.ceph.com/issues/41743 Signed-off-by: David Zafman <dzafman@redhat.com>	2019-10-03 09:09:10 -07:00
Sage Weil	52d706c75f	qa/suites/rados/verify: whitelist MON_DOWN when using valgrind Signed-off-by: Sage Weil <sage@redhat.com>	2019-09-29 10:27:01 -05:00
Sage Weil	e79dc454db	qa/suites: disable valgrind leak checks on ceph-mgr We've disabled the "clean" shutdown in ceph-mgr due to https://tracker.ceph.com/issues/38621 Until then, no valgrind leak checks! Signed-off-by: Sage Weil <sage@redhat.com>	2019-03-07 13:03:28 -06:00
Josh Durgin	d45f18119b	qa/suites: remove mon kv backend options rocksdb is the default, leveldb is not recommended at this point, so drop it. Signed-off-by: Josh Durgin <jdurgin@redhat.com>	2019-02-08 16:58:44 -05:00
Sage Weil	65e81e6eb4	qa/suites/rados/verify/validator/valgrind: debug refs = 5 If we detect a leak, let's include logging so we can find it. Signed-off-by: Sage Weil <sage@redhat.com>	2019-02-07 12:10:34 -06:00
Sage Weil	d518eb6cac	qa/msgr: move msgr facet into generic re-usable dir Signed-off-by: Sage Weil <sage@redhat.com>	2019-01-03 11:17:38 -06:00
Sage Weil	03908113b4	qa/suites: valgrind ceph-mgr too Signed-off-by: Sage Weil <sage@redhat.com>	2018-11-09 08:52:07 -06:00
Casey Bodley	d897b92878	osd: remove statelog from osd_class_load_list config Signed-off-by: Casey Bodley <cbodley@redhat.com>	2018-09-19 10:32:55 -04:00
Sage Weil	44de03d5e6	qa/suites: test pg merging Signed-off-by: Sage Weil <sage@redhat.com>	2018-09-07 12:09:05 -05:00
Patrick Donnelly	b39f9d06dc	qa: fix symlinks indirectly pointing at qa to .qa Building on the previous commit. Command used: $ find suites/ -type l -and -not -name .qa -execdir ~/fix.sh {} \; fix.sh: #!/bin/bash link="$(readlink "$1")" echo $link dirlink="$(dirname "$link")" baselink="$(basename "$link")" while true; do echo $dirlink if [ "$dirlink" -ef ~/ceph/qa ]; then ln -nsf ".qa/$baselink" "$1" exit else baselink="$(basename "$dirlink")/$baselink" dirlink="$(dirname "$dirlink")" if [ "$dirlink" -ef . ]; then break fi fi done Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2018-06-26 11:48:38 -07:00
Patrick Donnelly	716db6e2fd	qa: add .qa helper link This utilizes the recent feature in teuthology [1] to skip hidden files in suites when building the job matrix. Idea of this change is to enable referring to the top-level qa directory in a position-independent way such that copies of a suite to another location do not break any symlinks. [1] https://github.com/ceph/teuthology/pull/1185 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2018-06-26 11:33:48 -07:00
Kefu Chai	c237d0befb	qa/suites/rados/verify: remove random-distro$ the distro specified by random-distro$ will be overwritten by the one specified by valgrind.yaml. and teuthology-suite will give KeyError: '16.04 not a centos version or codename' when scheduling a suite involving the facets above. also, i think it's not of much value to run valgrind/lockdep with different distros. Signed-off-by: Kefu Chai <kchai@redhat.com>	2018-05-17 19:11:14 +08:00
Sage Weil	664af17b30	Merge pull request #21932 from yuriw/wip-yuriw-add-dollar-rgw tests/qa: Adding $ distro mix - rgw Reviewed-by: Casey Bodley <cbodley@redhat.com>	2018-05-15 16:15:05 -05:00
Casey Bodley	7da0fe2832	Merge pull request #21680 from cbodley/wip-rm-replica-log rgw: remove all traces of cls replica_log Reviewed-by: Orit Wasserman <owasserm@redhat.com>	2018-05-10 10:26:55 -04:00
Yuri Weinstein	c79a74a33c	tests/qa: adding rados/.. dirs Signed-off-by: Yuri Weinstein <yweinste@redhat.com>	2018-05-08 16:00:05 -07:00
Casey Bodley	f9ee48caa2	rgw: remove all traces of cls replica_log replica log was for the old radosgw sync agent, which was replaced with multisite v2 in jewel. no sense in continuing to maintain and test it Signed-off-by: Casey Bodley <cbodley@redhat.com>	2018-04-26 11:40:11 -04:00
Sage Weil	e331311b87	qa/suites/rados/verify/tasks/rados_api_tests: whitelist OBJECT_MISPLACED The api tests do some splits, which can move data. Signed-off-by: Sage Weil <sage@redhat.com>	2018-04-25 10:33:52 -05:00
Kefu Chai	cdcbd47e1e	qa/suite: whitelist PG_AVAILABILITY in rados_api_tests.yaml pg will be created when increasing pgp-num and pg-num. so at that moment, PG_AVAILABILITY is reported. so whitelist it in all tests which run rados/test.sh. that script exercises ceph_test_rados_api_list. Fixes: http://tracker.ceph.com/issues/23763 Signed-off-by: Kefu Chai <kchai@redhat.com>	2018-04-24 10:16:12 +08:00
Gregory Farnum	6d2e4c9b7b	Merge pull request #19973 from liewegas/wip-peering-fast-dispatch osd: fast dispatch of peering events and pg_map + osd sharded wq refactor Reviewed-by: Greg Farnum <gfarnum@redhat.com>	2018-04-06 11:48:11 -07:00
Joao Eduardo Luis	3997eed4db	qa: enable mon osdmap pruning on 'rados/' suites Signed-off-by: Joao Eduardo Luis <joao@suse.de>	2018-04-06 04:18:23 +01:00
Sage Weil	26f00dd67c	qa/suites: mon warn on pool no app = false for api tests Among other things, the list.cc tests set pg_num which waits for cluster healthy. Signed-off-by: Sage Weil <sage@redhat.com>	2018-04-04 08:26:58 -05:00
Kefu Chai	f5f2ced624	mgr/PGMap: drop REQUEST_{SLOW,STUCK} HEALTH_WARNs in mimic SLOW_OPS unifies both of them since mimic Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-11-23 17:41:47 +08:00
Kefu Chai	4a1f2a5c78	qa: silence SLOW_OPS,PENDING_CREATING_PGS warnings this is an intermediate step to deprecate REQUEST_SLOW warnings. Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-11-23 13:59:42 +08:00
Sage Weil	d8dead1aaf	qa/suites/rados: remove luminous tests - snapdir conversion (at-end) stuff - merge luminous-specific collections that avoided the above back into their normal locations Signed-off-by: Sage Weil <sage@redhat.com>	2017-08-28 23:10:32 -04:00
Sage Weil	41e5a85308	qa/suites/rados/verify/validater/valgrind: whitelist PG_ Peering might be slow due to valgrind. Signed-off-by: Sage Weil <sage@redhat.com>	2017-08-12 14:18:59 -04:00
Sage Weil	f683d2d374	qa/suites: change fixed-2.yaml users to get 4 openstack disks Follow-up for `4203c4f887` Signed-off-by: Sage Weil <sage@redhat.com>	2017-08-07 11:56:33 -04:00
Kefu Chai	d12c51ca91	qa/suites: escape the parenthesis of the whitelist text so we can avoid the warnings like grep: Unmatched ( or \( because we pass the whitelisted string to `egrep -v "$1"` directly. Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-08-01 21:54:44 +08:00
Sage Weil	e398fd4ee4	qa/suites: more whitelisting Signed-off-by: Sage Weil <sage@redhat.com>	2017-07-27 09:31:24 -04:00
Sage Weil	326019a466	qa/suites/rados: whitelist various tests Signed-off-by: Sage Weil <sage@redhat.com>	2017-07-25 22:29:07 -04:00
John Spray	343e1a4281	qa: update whitelist for "wrongly marked me down" Signed-off-by: John Spray <john.spray@redhat.com>	2017-07-24 14:54:46 +01:00
Sage Weil	960f00071f	qa/suites: disable mon crush smoke test with valgrind Valgrind runs itself on forked children, and does its cleanup when they complete, and this is slow... slow enough that it frequently makes the test time out. Valgrind lets you ignore child processes that you exec, but I can't find a way to skip forked children in the same address space. Work around this by skipping this validation when running under valgrind. Fixes: http://tracker.ceph.com/issues/20602 Signed-off-by: Sage Weil <sage@redhat.com>	2017-07-14 11:51:47 -04:00
Sage Weil	93de19adcf	qa: whitelist health warnings Signed-off-by: Sage Weil <sage@redhat.com>	2017-07-12 12:52:03 -04:00
Sage Weil	63f97ddcf6	qa/suites/rados: whitelist health warnings Signed-off-by: Sage Weil <sage@redhat.com>	2017-07-12 12:52:02 -04:00
Sage Weil	c7893283cd	do all valgrind runs on centos We are fighting two issues with valgrind on ubuntu (xenial, yakkety, and z): http://tracker.ceph.com/issues/18126 http://tracker.ceph.com/issues/20360 Revert this when it is fixed. Signed-off-by: Sage Weil <sage@redhat.com>	2017-06-30 09:33:18 -04:00
Greg Farnum	7d33e98bd3	qa: do not restrict valgrind runs to centos This reverts `693bd23851`, which was added in response to http://tracker.ceph.com/issues/18126. But we updated the Ubuntu packages in sepia so it should be good to go. Signed-off-by: Greg Farnum <gfarnum@redhat.com>	2017-06-23 16:25:16 -04:00
Sage Weil	aa76cf7488	Revert "qa: do not restrict valgrind runs to centos" This reverts commit `5923961465`. See http://tracker.ceph.com/issues/20360 Signed-off-by: Sage Weil <sage@redhat.com>	2017-06-20 17:14:52 -04:00
Greg Farnum	5923961465	qa: do not restrict valgrind runs to centos This reverts `693bd23851`, which was added in response to http://tracker.ceph.com/issues/18126. But we updated the Ubuntu packages in sepia so it should be good to go. Signed-off-by: Greg Farnum <gfarnum@redhat.com>	2017-05-31 08:37:19 -07:00
Sage Weil	5f5f370925	qa/suites/rados: switch require-luminous facet to use full_sequential_finally This lets us run multiple cleanup steps right before ceph teardown. Note that we drop the facet from multimon/ because it doesn't factor out cluster creation before this step properly. That's fine because the require_luminous cleanup shouldn't be related to the multimon tests. Signed-off-by: Sage Weil <sage@redhat.com>	2017-05-05 13:39:14 -04:00
Sage Weil	83dcc988db	qa/suites/rados/verify: refactor thrash and cluster create Signed-off-by: Sage Weil <sage@redhat.com>	2017-05-05 13:39:14 -04:00
Yuri Weinstein	9cb79d2fe3	qa/added overrides Signed-off-by: Yuri Weinstein <yweinste@redhat.com>	2017-05-02 15:06:49 -07:00
Sage Weil	4857f51e68	qa/suites/rados: expand other collections with no-require-luminous Signed-off-by: Sage Weil <sage@redhat.com>	2017-04-14 11:45:05 -04:00
Sage Weil	73981ad807	qa/suites: remove 'fs' facet from all tests The objectstore facet now covers bluestore, filestore(xfs), and filestore(btrfs). Signed-off-by: Sage Weil <sage@redhat.com>	2017-03-28 11:57:21 -04:00
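
The escaping problem described in commit `d12c51ca91` (whitelist text passed straight to `egrep -v "$1"`) can be reproduced in a few lines. This is only a sketch: the log lines, the `filter` helper, and the two patterns are made up for illustration and are not taken from the Ceph tree, and `grep -E` stands in for `egrep`. The exact error text depends on the grep implementation.

```shell
#!/bin/bash
# Hypothetical log lines, made up for illustration.
log='cluster [WRN] Health check failed: 1 osds down (OSD_DOWN)
cluster [ERR] unexpected failure'

# Hypothetical helper: drop every line matching the whitelist pattern,
# the way `egrep -v "$1"` would.
filter() {
    printf '%s\n' "$log" | grep -E -v "$1"
}

# Unescaped: the lone "(" opens a group that is never closed, so grep
# reports an error and exits non-zero instead of filtering.
filter 'osds down (' 2>/dev/null || echo "pattern rejected"

# Escaped: "\(" and "\)" match literal parentheses, so only the
# whitelisted warning line is dropped.
filter '\(OSD_DOWN\)'
```

With the escaped pattern, only the `[ERR]` line survives the filter, which is the behaviour a whitelist entry is meant to have.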


51 Commits