RepoMirrors/ceph

mirror of https://github.com/ceph/ceph synced 2025-02-18 08:28:02 +00:00

Author	SHA1	Message	Date
Patrick Donnelly	50c39dc007	qa: split fs begin task To allow switching to cephadm task. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2022-02-02 10:44:35 -05:00
Patrick Donnelly	83d252cc30	qa: fold frag confs into conf/mds.yaml These overrides are standard for all configurations. The config to enable fragmentation is also long removed. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2021-07-26 07:14:38 -07:00
Patrick Donnelly	ec1b82fd24	qa: skip exit-on-first-failure option for valgrind on ubuntu The valgrind version is too old. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2021-03-03 09:30:21 -08:00
Patrick Donnelly	5faf0ee0f3	mds,qa: exit instead of respawn under valgrind valgrind can't handle execve of /proc/self/exe: 2021-02-27T05:52:37.813 INFO:tasks.ceph.mds.d.smithi073.stderr:==00:01:03:20.556 41218== execve(0x18546740(/proc/self/exe), 0x18546670, 0x133ef310) failed, errno 2 2021-02-27T05:52:37.813 INFO:tasks.ceph.mds.d.smithi073.stderr:==00:01:03:20.556 41218== EXEC FAILED: I can't recover from execve() failing, so I'm dying. 2021-02-27T05:52:37.813 INFO:tasks.ceph.mds.d.smithi073.stderr:==00:01:03:20.556 41218== Add more stringent tests in PRE(sys_execve), or work out how to recover. So configure the MDS to just exit so it can be restarted by QA infra (the daemon watchdog). Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2021-03-03 09:30:21 -08:00
Patrick Donnelly	1d85c9d535	qa: ignore all slow request warnings Generalize the ignorelist for: 2021-02-27T05:54:27.644 INFO:teuthology.orchestra.run.smithi002.stdout:2021-02-27T05:20:24.513041+0000 mds.d (mds.0) 1 : cluster [WRN] 1 slow requests, 1 included below; oldest blocked for > 183.680676 secs From: /ceph/teuthology-archive/pdonnell-2021-02-26_23:40:39-fs-wip-pdonnell-testing-20210226.181017-distro-basic-smithi/5917580/teuthology.log Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2021-03-03 09:30:21 -08:00
Patrick Donnelly	dcac1dbe62	qa: add new mds beacon grace mon config Otherwise the mons don't observe it. Fixes: https://tracker.ceph.com/issues/49507 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2021-03-03 09:30:21 -08:00
Patrick Donnelly	6093b3a581	qa: run fs:verify on all distros It's believed this is no longer a problem now that we use tcmalloc. Fixes: https://tracker.ceph.com/issues/49391 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2021-02-25 13:27:24 -08:00
Sage Weil	dc64ccf063	qa/suites: do not use notcmalloc flavor teuthology now knows how to run valgrind against a tcmalloc binary Signed-off-by: Sage Weil <sage@newdream.net>	2021-02-18 10:26:28 -06:00
Patrick Donnelly	7f449dd09f	qa: merge multimds:verify with fs:verify Signed-off-by: Patrick Donnelly <pdonnell@redhat.com> Fixes: https://tracker.ceph.com/issues/48121	2021-01-07 12:55:25 -08:00
Patrick Donnelly	36d731c6f3	qa: only run valgrind on cephfs daemons OSD valgrind slows things down too much to the point where some tasks fail to complete. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2021-01-07 12:55:24 -08:00
Xiubo Li	0422673b61	qa/cephfs: add session_timeout option support When the MDS revokes the Fwbl caps, the clients need to flush the dirty data back to the OSDs, but the flush may overload and slow the OSDs, and may take more than 60 seconds to finish. The MDS daemons will then report WRN messages. For the teuthology test cases, let's just increase the timeout value to make it work. Fixes: https://tracker.ceph.com/issues/47565 Signed-off-by: Xiubo Li <xiubli@redhat.com>	2020-10-23 14:27:37 +08:00
Sage Weil	2ee9365d0b	qa: log-whitelist -> log-ignorelist Signed-off-by: Sage Weil <sage@newdream.net>	2020-08-24 19:53:08 +00:00
Patrick Donnelly	1fc33c54f8	qa: specify random distros in multimds Note: the name is important so that kclient mount can override the distro setting. Fixes: https://tracker.ceph.com/issues/43968 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2020-02-05 12:36:50 -08:00
Patrick Donnelly	2cdb2972cd	qa: define centos version for fs:verify Otherwise it uses the teuthology default of 7.6. Fixes: https://tracker.ceph.com/issues/43516 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2020-01-07 13:20:00 -08:00
Sage Weil	cf352c3ac0	osd: add osd_fast_shutdown option (default true) If we get a SIGINT or SIGTERM or are deleted from the OSDMap, do a fast shutdown by exiting immediately. This has a few important benefits: - We immediately stop responding (binding) to any sockets, which means other OSDs will immediately decide we are down (and dead!). This minimizes IO interruption. - We avoid the complex "clean" shutdown process, which is historically a source of bugs. In reality, the only purpose of the "clean" shutdown is to try to tear down everything in memory so we can do memory leak checking with valgrind. Set this option to false for valgrind QA runs so we can still do that. Note that with the new read leases in octopus, we rely on the default behavior that an ECONNREFUSED is taken to mean that the OSD is fully dead, so that we don't have to wait for any leases to time out. This works in sane environments with normal IP networks, but that behavior could conceivably be a bad idea if there are some weird network shenanigans going on. If osd_fast_fail_on_connection_refused were disabled, then this fast shutdown procedure might be worse than the clean shutdown because we would have to wait for the heartbeat timeout. Signed-off-by: Sage Weil <sage@redhat.com>	2019-11-15 09:31:50 -06:00
Patrick Donnelly	7b520755ce	qa: extend MDS heartbeat grace for valgrind Valgrind makes the MDS slowwwww. The newish mds_heartbeat_grace config allows us to keep sending beacons to the mons even if the internal heartbeat is slow. This avoids the laggy messages which are useful to grep for unrelated messaging issues. Fixes: http://tracker.ceph.com/issues/38723 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2019-03-13 09:18:32 -07:00
Sage Weil	e79dc454db	qa/suites: disable valgrind leak checks on ceph-mgr We've disabled the "clean" shutdown in ceph-mgr due to https://tracker.ceph.com/issues/38621 Until then, no valgrind leak checks! Signed-off-by: Sage Weil <sage@redhat.com>	2019-03-07 13:03:28 -06:00
Sage Weil	03908113b4	qa/suites: valgrind ceph-mgr too Signed-off-by: Sage Weil <sage@redhat.com>	2018-11-09 08:52:07 -06:00
Patrick Donnelly	73fa0efcbb	qa: create common conf for all cephfs suites This will be followed by removing common CephFS configurations in the ceph.conf.template in teuthology. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2018-07-04 13:08:10 -07:00
Patrick Donnelly	b39f9d06dc	qa: fix symlinks indirectly pointing at qa to .qa Building on the previous commit. Command used: $ find suites/ -type l -and -not -name .qa -execdir ~/fix.sh {} \; fix.sh: #!/bin/bash link="$(readlink "$1")" echo $link dirlink="$(dirname "$link")" baselink="$(basename "$link")" while true; do echo $dirlink if [ "$dirlink" -ef ~/ceph/qa ]; then ln -nsf ".qa/$baselink" "$1" exit else baselink="$(basename "$dirlink")/$baselink" dirlink="$(dirname "$dirlink")" if [ "$dirlink" -ef . ]; then break fi fi done Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2018-06-26 11:48:38 -07:00
Patrick Donnelly	716db6e2fd	qa: add .qa helper link This utilizes the recent feature in teuthology [1] to skip hidden files in suites when building the job matrix. Idea of this change is to enable referring to the top-level qa directory in a position-independent way such that copies of a suite to another location do not break any symlinks. [1] https://github.com/ceph/teuthology/pull/1185 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2018-06-26 11:33:48 -07:00
Sage Weil	d0732fc96f	qa/cephfs: test ec data pool Signed-off-by: Sage Weil <sage@redhat.com>	2017-10-23 21:11:24 -05:00
Patrick Donnelly	9d348ad8c9	qa: add health whitelist for all fs sub-suites Fixes: http://tracker.ceph.com/issues/20892 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2017-08-03 14:01:28 -07:00
Sage Weil	960f00071f	qa/suites: disable mon crush smoke test with valgrind Valgrind runs itself on forked children, and does its cleanup when they complete, and this is slow... slow enough that it frequently makes the test time out. Valgrind lets you ignore child processes that you exec, but I can't find a way to skip forked children in the same address space. Work around this by skipping this validation when running under valgrind. Fixes: http://tracker.ceph.com/issues/20602 Signed-off-by: Sage Weil <sage@redhat.com>	2017-07-14 11:51:47 -04:00
Sage Weil	c7893283cd	do all valgrind runs on centos We are fighting two issues with valgrind on ubuntu (xenial, yakkety, and z): http://tracker.ceph.com/issues/18126 http://tracker.ceph.com/issues/20360 Revert this when it is fixed. Signed-off-by: Sage Weil <sage@redhat.com>	2017-06-30 09:33:18 -04:00
Greg Farnum	7d33e98bd3	qa: do not restrict valgrind runs to centos This reverts `693bd23851`, which was added in response to http://tracker.ceph.com/issues/18126. But we updated the Ubuntu packages in sepia so it should be good to go. Signed-off-by: Greg Farnum <gfarnum@redhat.com>	2017-06-23 16:25:16 -04:00
Sage Weil	aa76cf7488	Revert "qa: do not restrict valgrind runs to centos" This reverts commit `5923961465`. See http://tracker.ceph.com/issues/20360 Signed-off-by: Sage Weil <sage@redhat.com>	2017-06-20 17:14:52 -04:00
Greg Farnum	5923961465	qa: do not restrict valgrind runs to centos This reverts `693bd23851`, which was added in response to http://tracker.ceph.com/issues/18126. But we updated the Ubuntu packages in sepia so it should be good to go. Signed-off-by: Greg Farnum <gfarnum@redhat.com>	2017-05-31 08:37:19 -07:00
John Spray	6369120d63	qa/suites: don't use btrfs for cephfs testing This change happened a while back, but it got rolled back when the generic objectstore/ dir had its filestore entry split out into xfs and btrfs in `208675af`. Signed-off-by: John Spray <john.spray@redhat.com>	2017-04-24 11:19:55 +01:00
John Spray	131d1bd570	qa: add log whitelists for MDS health messages Now that we send these to the cluster log, we must whitelist them in the tests that exercise those unhealthy states. Fixes: http://tracker.ceph.com/issues/19551 Signed-off-by: John Spray <john.spray@redhat.com>	2017-04-14 05:47:43 -04:00
Sage Weil	73981ad807	qa/suites: remove 'fs' facet from all tests The objectstore facet now covers bluestore, filestore(xfs), and filestore(btrfs). Signed-off-by: Sage Weil <sage@redhat.com>	2017-03-28 11:57:21 -04:00
John Spray	76b73befd9	qa: remove simple functional tests from multimds These were running so few ops that they weren't giving any meaningful exercise to a multimds system beyond what we're already covering in the fs suite. Signed-off-by: John Spray <john.spray@redhat.com>	2017-02-07 13:51:47 +00:00
Sage Weil	c01f2ee0e2	move ceph-qa-suite dirs into qa/	2016-12-14 11:29:55 -06:00

33 Commits