Commit Graph

257 Commits

Author SHA1 Message Date
Kefu Chai
9541a97605
Merge pull request #40316 from batrick/i49605
pybind/mgr/volumes: avoid deadlock in ceph-mgr Finisher thread

Reviewed-by: Kotresh HR <khiremat@redhat.com>
2021-04-03 22:24:22 +08:00
Patrick Donnelly
17b291e57d
qa: bump debugging for mgr
Hunting [1].

[1] https://tracker.ceph.com/issues/49605
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-03-24 11:37:24 -07:00
Patrick Donnelly
d06689e16e
qa: wait for daemons to come up via cephadm
Rather than waiting for a set amount of time.

Fixes: https://tracker.ceph.com/issues/49684
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-03-21 10:35:07 -07:00
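
The change above replaces a fixed sleep with a readiness check. A minimal sketch of that polling pattern follows. It is not the qa code from this commit: `wait_until` and its parameters are hypothetical names chosen for illustration, and the predicate stands in for whatever check reports that the daemons are up.

```python
import time


def wait_until(predicate, timeout=60.0, interval=1.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll predicate() until it returns true or the timeout elapses.

    Hypothetical helper: returns True as soon as the predicate holds,
    False if the deadline passes first. `sleep` and `clock` are injectable
    so the loop can be exercised without real delays.
    """
    deadline = clock() + timeout
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Compared with a fixed sleep, this returns as soon as the condition holds and fails explicitly when it never does.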
Sage Weil
6901edc5d1
qa/suites/fs/multiclient: use clients: not all: for pexec
This matches the setup work we are trying to tear down.

Signed-off-by: Sage Weil <sage@newdream.net>
2021-03-16 20:51:36 -04:00
Venky Shankar
3e13f48937
test: add tests for mirroring bootstrap interfaces
Signed-off-by: Venky Shankar <vshankar@redhat.com>
2021-03-11 03:41:52 -05:00
Patrick Donnelly
c33f054465
Merge PR #39502 into master
* refs/pull/39502/head:
	qa: add sleep for blocklisting to take effect

Reviewed-by: Rishabh Dave <ridave@redhat.com>
2021-03-09 13:38:57 -08:00
Patrick Donnelly
ea41874e5f
Merge PR #39841 into master
* refs/pull/39841/head:
	qa: ignorelist slow ops during scrub

Reviewed-by: Rishabh Dave <ridave@redhat.com>
2021-03-05 11:34:14 -08:00
Patrick Donnelly
061eab7713
qa: add sleep for blocklisting to take effect
Fixes: https://tracker.ceph.com/issues/49318
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-03-04 19:58:17 -08:00
Patrick Donnelly
cab87f956b
Merge PR #39787 into master
* refs/pull/39787/head:
	qa: Update featureful_client suite to use octopus instead of nautilus

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-03-04 13:33:14 -08:00
Patrick Donnelly
c0824f4201
qa: ignorelist slow ops during scrub
Fixes: https://tracker.ceph.com/issues/49607
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-03-04 13:22:08 -08:00
Patrick Donnelly
ec1b82fd24
qa: skip exit-on-first-failure option for valgrind on ubuntu
The valgrind version is too old.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-03-03 09:30:21 -08:00
Patrick Donnelly
5faf0ee0f3
mds,qa: exit instead of respawn under valgrind
valgrind can't handle execve of /proc/self/exe:

    2021-02-27T05:52:37.813 INFO:tasks.ceph.mds.d.smithi073.stderr:==00:01:03:20.556 41218== execve(0x18546740(/proc/self/exe), 0x18546670, 0x133ef310) failed, errno 2
    2021-02-27T05:52:37.813 INFO:tasks.ceph.mds.d.smithi073.stderr:==00:01:03:20.556 41218== EXEC FAILED: I can't recover from execve() failing, so I'm dying.
    2021-02-27T05:52:37.813 INFO:tasks.ceph.mds.d.smithi073.stderr:==00:01:03:20.556 41218== Add more stringent tests in PRE(sys_execve), or work out how to recover.

So configure the MDS to just exit so it can be restarted by QA infra (the
daemon watchdog).

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-03-03 09:30:21 -08:00
Patrick Donnelly
1d85c9d535
qa: ignore all slow request warnings
Generalize the ignorelist for:

    2021-02-27T05:54:27.644 INFO:teuthology.orchestra.run.smithi002.stdout:2021-02-27T05:20:24.513041+0000 mds.d (mds.0) 1 : cluster [WRN] 1 slow requests, 1 included below; oldest blocked for > 183.680676 secs

From: /ceph/teuthology-archive/pdonnell-2021-02-26_23:40:39-fs-wip-pdonnell-testing-20210226.181017-distro-basic-smithi/5917580/teuthology.log

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-03-03 09:30:21 -08:00
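
To illustrate what generalizing an ignorelist entry can mean for the warning quoted above, here is a small sketch. The pattern and the `is_ignored` helper are assumptions made for illustration; they are not the entry this commit actually added. The idea is to match any request count rather than one literal message.

```python
import re

# Hypothetical pattern: matches the cluster warning for any number of
# slow requests, not just the specific counts seen in one log line.
SLOW_REQUEST = re.compile(r"\[WRN\] \d+ slow requests?, \d+ included below")


def is_ignored(line, patterns=(SLOW_REQUEST,)):
    """Return True if the log line matches any ignorelist pattern."""
    return any(p.search(line) for p in patterns)
```

Applied to the log line quoted in the commit message, `is_ignored` returns True; an unrelated `[WRN]` line does not match.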
Patrick Donnelly
dcac1dbe62
qa: add new mds beacon grace mon config
Otherwise the mons don't observe it.

Fixes: https://tracker.ceph.com/issues/49507
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-03-03 09:30:21 -08:00
Sidharth Anupkrishnan
659288ed1d
qa: Update featureful_client suite to use octopus instead of nautilus
Signed-off-by: Sidharth Anupkrishnan <sanupkri@redhat.com>
2021-03-03 14:03:21 +05:30
Patrick Donnelly
4a6b11ac49
Merge PR #39710 into master
* refs/pull/39710/head:
	qa: run fs:verify on all distros

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Yuri Weinstein <yweins@redhat.com>
Reviewed-by: Rishabh Dave <ridave@redhat.com>
2021-03-01 12:05:47 -08:00
Patrick Donnelly
6093b3a581
qa: run fs:verify on all distros
It's believed this is no longer a problem now that we use tcmalloc.

Fixes: https://tracker.ceph.com/issues/49391
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-02-25 13:27:24 -08:00
Patrick Donnelly
4526c74569
qa: use tcmalloc with valgrind in fs:valgrind
Follow-up: dc64ccf063
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-02-25 13:18:37 -08:00
Patrick Donnelly
66409b22a9
Merge PR #38914 into master
* refs/pull/38914/head:
	qa: bump osd heartbeat grace for ffsb workload

Reviewed-by: Ramana Raja <rraja@redhat.com>
2021-02-24 19:34:23 -08:00
Patrick Donnelly
a3591378a5
Merge PR #39138 into master
* refs/pull/39138/head:
	qa: valgrind test for cephfs-mirror daemon
	cephfs-mirror: use preforker for daemonizing
	test: adjust sleep time to account for valgrind runs
	cephfs-mirror: gracefully shutdown threads, timers, etc..
	cephfs-mirror: call ceph_release() to cleanup mount alloc
	cephfs-mirror: shutdown filesystem/cluster connections on shutdown
	cephfs-mirror: set init failed flag on FSMirror::init() failure

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-02-22 10:35:32 -08:00
Sage Weil
dc64ccf063
qa/suites: do not use notcmalloc flavor
teuthology now knows how to run valgrind against a tcmalloc binary

Signed-off-by: Sage Weil <sage@newdream.net>
2021-02-18 10:26:28 -06:00
Kefu Chai
5ca820fb9e
Revert "msg,mon,common: log when DispatchQueue throttle limit is reached"
Signed-off-by: Kefu Chai <kchai@redhat.com>
2021-02-08 18:03:14 +08:00
Sage Weil
6d8c0e4722
Merge PR #39147 into master
* refs/pull/39147/head:
	qa/tasks/ceph_fuse: do not createfs
	qa/tasks/cephfs/fuse_mount: pass admin_socket path
	qa/suites/fs/cephadm/multivolume: add basic multivolume test
	mgr/mds_autoscaler: some fixes and cleanup
	mgr/volumes: deploy MDSs when creating fs

Reviewed-by: Milind Changire <mchangir@redhat.com>
2021-02-04 12:19:25 -05:00
Venky Shankar
7583126d19
qa: valgrind test for cephfs-mirror daemon
Fixes: http://tracker.ceph.com/issues/49040
Signed-off-by: Venky Shankar <vshankar@redhat.com>
2021-02-02 09:25:20 -05:00
Sage Weil
964cd3e028
qa/suites/fs/cephadm/multivolume: add basic multivolume test
Signed-off-by: Sage Weil <sage@newdream.net>
2021-02-01 10:50:33 -06:00
Jos Collin
8367448765
qa: test DispatchQueue throttling
Fixes: https://tracker.ceph.com/issues/46226
Signed-off-by: Jos Collin <jcollin@redhat.com>
2021-01-25 11:18:04 +05:30
Venky Shankar
3478b2a062
test: cephfs-mirror teuthology task and test yamls
Signed-off-by: Venky Shankar <vshankar@redhat.com>
2021-01-19 01:08:10 -05:00
Patrick Donnelly
a6891a0c8a
qa: add new client tests
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-01-15 17:30:40 -08:00
Patrick Donnelly
84528b1693
qa: bump osd heartbeat grace for ffsb workload
To avoid recovery under heavy I/O.

Fixes: https://tracker.ceph.com/issues/48877
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-01-14 14:03:12 -08:00
Patrick Donnelly
78a7df1500
Merge PR #38846 into master
* refs/pull/38846/head:
	*: remove legacy ceph_volume_client.py library

Reviewed-by: Varsha Rao <varao@redhat.com>
Reviewed-by: Rishabh Dave <ridave@redhat.com>
2021-01-14 08:10:19 -08:00
Patrick Donnelly
bc57344ded
Merge PR #38889 into master
* refs/pull/38889/head:
	qa: add delays only for osd/mds

Reviewed-by: Jeff Layton <jlayton@redhat.com>
2021-01-14 08:07:40 -08:00
Patrick Donnelly
ef19481399
qa: add delays only for osd/mds
The delays were applied everywhere and needlessly interfere with test
commands sent to mons from the ceph admin command. Furthermore, the
delays would not affect the kernel client. Now the delays are performed
by the MDS on clients.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-01-13 07:54:09 -08:00
Patrick Donnelly
a3db265ad5
*: remove legacy ceph_volume_client.py library
This library is obsolete with the mgr volumes plugin since Nautilus.

The last remaining user of this library was Manila which will be using
the volumes plugin with Pacific and onwards.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-01-12 06:54:29 -08:00
Patrick Donnelly
b66ca823fe
qa: ignore MDS_SLOW_METADATA_IO with osd thrasher
Fixes: https://tracker.ceph.com/issues/48834
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-01-11 10:10:04 -08:00
Patrick Donnelly
318d3f4d80
Merge PR #38108 into master
* refs/pull/38108/head:
	doc, man: man page for `cephfs-top` utility
	doc: document `cephfs-top` utility
	test: selftest for `cephfs-top` utility
	spec, deb: package cephfs-top utility
	cephfs-top: top(1) like utility for Ceph Filesystem
	mgr/stats: include kernel version (for kclients) in `perf stats` command output
	mgr/stats: include version with `perf stats` output

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-01-11 08:38:52 -08:00
Venky Shankar
0329d9b884
test: selftest for cephfs-top utility
Signed-off-by: Venky Shankar <vshankar@redhat.com>
2021-01-11 06:15:53 -05:00
Patrick Donnelly
14b4787de1
client: add debug messages for osdmap wait
To help debug i47294.

See-also: https://tracker.ceph.com/issues/47294
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-01-10 21:11:06 -08:00
Patrick Donnelly
f7617cf4d7
qa: bump scrub timeout
To avoid timeouts for large workloads.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-01-07 12:55:25 -08:00
Patrick Donnelly
488f10c62f
qa: move cephfs_ec_profile under cephfs
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-01-07 12:55:25 -08:00
Patrick Donnelly
4a45b9eb3e
qa: skip check-counters for light workloads
None of these are likely to generate exports.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-01-07 12:55:25 -08:00
Patrick Donnelly
7f449dd09f
qa: merge multimds:verify with fs:verify
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Fixes: https://tracker.ceph.com/issues/48121
2021-01-07 12:55:25 -08:00
Patrick Donnelly
cb45fc085c
qa: merge multimds:thrash to fs:thrash
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Fixes: https://tracker.ceph.com/issues/48121
2021-01-07 12:55:25 -08:00
Patrick Donnelly
474cb0a9ca
qa: move functional multimds tests to fs:functional
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Fixes: https://tracker.ceph.com/issues/48121
2021-01-07 12:55:24 -08:00
Patrick Donnelly
a32462fe4d
qa: migrate multimds workloads to fs:workloads
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Fixes: https://tracker.ceph.com/issues/48121
2021-01-07 12:55:24 -08:00
Patrick Donnelly
36d731c6f3
qa: only run valgrind on cephfs daemons
OSD valgrind slows things down too much to the point where some tasks
fail to complete.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-01-07 12:55:24 -08:00
Milind Changire
a4757451f2
qa: forward scrubbing background task
Add forward scrubbing thrasher task to scrub file-system while a long
running IO is taking place.

Fixes: https://tracker.ceph.com/issues/17856
Signed-off-by: Milind Changire <mchangir@redhat.com>
2020-12-14 03:32:38 +05:30
Ramana Raja
a38b836589
qa/suites/fs: enable thrashing in multifs environment
Fixes: https://tracker.ceph.com/issues/15134
Co-authored-by: Patrick Donnelly <pdonnell@redhat.com>
Signed-off-by: Ramana Raja <rraja@redhat.com>
2020-11-27 15:55:01 +05:30
Venky Shankar
a8c8b3ade2
tests: add snap schedule tests
Signed-off-by: Venky Shankar <vshankar@redhat.com>
2020-11-17 08:39:09 -05:00
Patrick Donnelly
a4941c1d5b
qa: ignore skip errors for kclient
To avoid this failure:

	2020-11-01T07:21:35.117 INFO:tasks.cephfs_test_runner:test_volume_without_namespace_isolation (tasks.cephfs.test_volume_client.TestVolumeClient) ... ok
	2020-11-01T07:21:35.118 INFO:tasks.cephfs_test_runner:
	2020-11-01T07:21:35.118 INFO:tasks.cephfs_test_runner:======================================================================
	2020-11-01T07:21:35.119 INFO:tasks.cephfs_test_runner:FAIL: test_evict_client (tasks.cephfs.test_volume_client.TestVolumeClient)
	2020-11-01T07:21:35.119 INFO:tasks.cephfs_test_runner:----------------------------------------------------------------------
	2020-11-01T07:21:35.119 INFO:tasks.cephfs_test_runner:Requires FUSE client to inject client metadata
	2020-11-01T07:21:35.119 INFO:tasks.cephfs_test_runner:----------------------------------------------------------------------
	2020-11-01T07:21:35.119 INFO:tasks.cephfs_test_runner:Ran 18 tests in 732.749s
	2020-11-01T07:21:35.120 INFO:tasks.cephfs_test_runner:
	2020-11-01T07:21:35.120 INFO:tasks.cephfs_test_runner:FAILED (failures=1)

Fixes: https://tracker.ceph.com/issues/23718
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2020-11-03 13:01:19 -08:00
Patrick Donnelly
f9ca58a3f3
qa: add more clients for test_volume_client
It requires 4.

Fixes: https://tracker.ceph.com/issues/23718
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2020-11-03 13:01:16 -08:00