1356 Commits

Author SHA1 Message Date
Venky Shankar
9545e578a4 Merge PR #52547 into main
* refs/pull/52547/head:
	qa: add test cases for vanilla ops commands
	mds: dump locks when printing mutation ops
	common/TrackedOp: support overriding the _dump method
	mds: remove op field obsoleted by more usable "reqid"
	mds: dump metareq_t instead of full op
	mds: add lock type to formatter dump of SimpleLock
	mds: mark print methods const
	mds: drop MDRequestImpl::msg_lock
	mds: lock TrackedOp when dumping
	mds: avoid recursive locks dumping state
	common/TrackedOp: fix race updating description with proper lock
	common/Formatter: add support for dumping null
	common/Formatter: refactor generating xml name

Reviewed-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
2023-08-14 17:52:47 +05:30
Venky Shankar
53f89ea09b Merge PR #52765 into main
* refs/pull/52765/head:
	mgr/volumes: Fix pending_subvolume_deletions in volume info
	qa: Add testcase for pending_subvolume_deletions count

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Neeraj Pratap Singh <neesingh@redhat.com>
2023-08-11 11:39:52 +05:30
Patrick Donnelly
ca4d0dc42b
qa: add test cases for vanilla ops commands
To test they work, not that the output is useful.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2023-08-08 08:58:42 -04:00
Leonid S. Usov
8262586cd0
Merge pull request #52792 from leonid-s-usov/bulk-data-pool
mgr/volumes: create bulk data pool for new volumes
2023-08-08 11:24:59 +03:00
Kotresh HR
8b1303f4b1 qa: Add testcase for pending_subvolume_deletions count
Fixes: https://tracker.ceph.com/issues/62278
Signed-off-by: Kotresh HR <khiremat@redhat.com>
2023-08-04 12:47:24 +05:30
Leonid Usov
9a8219cc2b mgr/volumes: set the 'bulk' flag for data pools created automatically for a new volume
Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
Fixes: https://tracker.ceph.com/issues/61595
2023-08-03 19:41:12 +03:00
Venky Shankar
cd18c51548 Merge PR #48732 into main
* refs/pull/48732/head:
	doc: add MDS treatise on segments
	pybind/mgr/dashboard: bump teuthology version
	qa/tasks/vstart_runner: stop overriding _run_python
	qa: stop overriding ceph_w prefix in vstart_runner
	qa/tasks/vstart_runner: update teuthology helper tool paths
	qa/tasks/vstart_runner: allow writing to command's stdin
	vstart.sh: always add CEPH_CONF to vstart_environment.sh
	qa: use stdin-killer for python3 command
	qa: add killpoint testing for mds shutdown
	qa: fix background exit condition
	qa: add filesystem helper for setting transient config
	qa: add helper for waiting for a rank to fail
	mds: add incompat feature for minor log segments
	mds: introduce ELid event to create/close log
	mds: change EResetJournal to major segment boundary
	mds: add killpoints for MDS shutdown
	qa: add numerous subtree test
	mds: track larger log events in perf dump
	mds: add minor LogSegment boundaries
	mds: obviate MDLog::start_entry
	mds: retype to properly sized unsigned ints
	mds: use unsigned type for event count
	mds: use base Context class for generalization
	mds: optimize segment lookup
	mds: add stream dump for LogSegment
	mds: handle conf changes in mdlog
	mds: remove redundant comment
	mds: remove unused method
	mds: set a reasonable minimum number of segments
	mds: sort configs

Reviewed-by: Venky Shankar <vshankar@redhat.com>
2023-08-02 22:07:58 +05:30
Rishabh Dave
0584cd34b6
Merge pull request #52709 from rishabh-d-dave/cephfs-test_snapshots
qa/cephfs: fix test_disallow_monitor_managed_snaps_for_fs_pools

Reviewed-by: Dhairya Parmar <dparmar@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
2023-08-02 19:18:07 +05:30
Casey Bodley
568a21d83c qa/cephfs: redefinition of unused 'random' from line 7
seeing this run-tox-qa failure about tasks/cephfs/test_client_recovery.py:

246/285 Test #264: run-tox-qa ................................***Failed   58.54 sec
Requirement already satisfied: tox in /home/jenkins-build/build/workspace/ceph-pull-requests/build/qa-virtualenv/lib/python3.10/site-packages (4.6.4)
Requirement already satisfied: cachetools>=5.3.1 in /home/jenkins-build/build/workspace/ceph-pull-requests/build/qa-virtualenv/lib/python3.10/site-packages (from tox) (5.3.1)
Requirement already satisfied: chardet>=5.1 in /home/jenkins-build/build/workspace/ceph-pull-requests/build/qa-virtualenv/lib/python3.10/site-packages (from tox) (5.1.0)
Requirement already satisfied: colorama>=0.4.6 in /home/jenkins-build/build/workspace/ceph-pull-requests/build/qa-virtualenv/lib/python3.10/site-packages (from tox) (0.4.6)
Requirement already satisfied: filelock>=3.12.2 in /home/jenkins-build/build/workspace/ceph-pull-requests/build/qa-virtualenv/lib/python3.10/site-packages (from tox) (3.12.2)
Requirement already satisfied: packaging>=23.1 in /home/jenkins-build/build/workspace/ceph-pull-requests/build/qa-virtualenv/lib/python3.10/site-packages (from tox) (23.1)
Requirement already satisfied: platformdirs>=3.8 in /home/jenkins-build/build/workspace/ceph-pull-requests/build/qa-virtualenv/lib/python3.10/site-packages (from tox) (3.10.0)
Requirement already satisfied: pluggy>=1.2 in /home/jenkins-build/build/workspace/ceph-pull-requests/build/qa-virtualenv/lib/python3.10/site-packages (from tox) (1.2.0)
Requirement already satisfied: pyproject-api>=1.5.2 in /home/jenkins-build/build/workspace/ceph-pull-requests/build/qa-virtualenv/lib/python3.10/site-packages (from tox) (1.5.3)
Requirement already satisfied: tomli>=2.0.1 in /home/jenkins-build/build/workspace/ceph-pull-requests/build/qa-virtualenv/lib/python3.10/site-packages (from tox) (2.0.1)
Requirement already satisfied: virtualenv>=20.23.1 in /home/jenkins-build/build/workspace/ceph-pull-requests/build/qa-virtualenv/lib/python3.10/site-packages (from tox) (20.24.2)
Requirement already satisfied: distlib<1,>=0.3.7 in /home/jenkins-build/build/workspace/ceph-pull-requests/build/qa-virtualenv/lib/python3.10/site-packages (from virtualenv>=20.23.1->tox) (0.3.7)
flake8: install_deps /home/jenkins-build/build/workspace/ceph-pull-requests/qa> python -I -m pip install flake8
flake8: freeze /home/jenkins-build/build/workspace/ceph-pull-requests/qa> python -m pip freeze --all
flake8: flake8==6.1.0,mccabe==0.7.0,pip==22.3.1,pycodestyle==2.11.0,pyflakes==3.1.0,setuptools==65.6.3,wheel==0.38.4
flake8: commands[0] /home/jenkins-build/build/workspace/ceph-pull-requests/qa> flake8 --select=F,E9 --exclude=venv,.tox
./tasks/cephfs/test_client_recovery.py:12:1: F811 redefinition of unused 'random' from line 7
flake8: exit 1 (3.72 seconds) /home/jenkins-build/build/workspace/ceph-pull-requests/qa> flake8 --select=F,E9 --exclude=venv,.tox pid=706315
flake8: FAIL ✖ in 15.42 seconds

Signed-off-by: Casey Bodley <cbodley@redhat.com>
2023-08-01 12:31:44 -04:00
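
The F811 error in the log above means a name was imported and then bound again before anything used it. The following is a minimal sketch of that kind of check, not pyflakes' actual implementation: the function name `find_redefined_imports` is hypothetical, and it only looks at module-level statements.

```python
import ast


def find_redefined_imports(source):
    """Return (name, first_line, redefinition_line) for module-level imports
    that are imported again before being used (a rough F811 approximation)."""
    pending = {}   # imported name -> line of the import that has not been used yet
    findings = []
    for stmt in ast.parse(source).body:
        if isinstance(stmt, (ast.Import, ast.ImportFrom)):
            for alias in stmt.names:
                # "import a.b" binds "a"; "import a as x" binds "x"
                bound = alias.asname or alias.name.split(".")[0]
                if bound in pending:
                    findings.append((bound, pending[bound], stmt.lineno))
                pending[bound] = stmt.lineno
        else:
            # Any reference to the name in this statement counts as a use.
            for node in ast.walk(stmt):
                if isinstance(node, ast.Name):
                    pending.pop(node.id, None)
    return findings
```

Run against a file that imports `random` on two separate lines with no use in between, this reports the second import, which mirrors the failure quoted in the commit message.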
Patrick Donnelly
20184c23d3
qa: use stdin-killer for python3 command
This relies on the new stdin-killer [1] teuthology helper that allows
interacting with the command's stdin.

[1] https://github.com/ceph/teuthology/pull/1846

Fixes: 8bb77ed9e1
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2023-08-01 11:16:26 -04:00
Patrick Donnelly
936da39a15
qa: add killpoint testing for mds shutdown
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2023-08-01 11:16:02 -04:00
Patrick Donnelly
1fa0039a98
qa: fix background exit condition
This change causes the program to exit gracefully when stdin is closed
rather than with a Python exception.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2023-08-01 11:16:02 -04:00
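
As an illustration of the behavior described in the commit above, here is a minimal sketch of a background helper that treats a closed stdin as a normal exit condition. It is not the code from the commit; the function name `wait_for_stdin_close` and the choice of exceptions to swallow are assumptions.

```python
import sys


def wait_for_stdin_close(stream=None):
    """Block until the stream reaches EOF or is closed, then return exit code 0."""
    stream = sys.stdin if stream is None else stream
    try:
        for _ in stream:
            # Input is ignored; the loop ends when the writer closes the pipe.
            pass
    except (OSError, ValueError):
        # The stream was already closed or became unreadable: treat it as EOF
        # instead of letting the exception terminate the program.
        pass
    return 0
```

A script would typically finish with `sys.exit(wait_for_stdin_close())`, so the parent process sees a clean exit status once it closes the pipe.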
Patrick Donnelly
1962322f59
qa: add filesystem helper for setting transient config
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2023-08-01 11:16:02 -04:00
Patrick Donnelly
a6b8bbd2cb
qa: add helper for waiting for a rank to fail
For killpoint testing.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2023-08-01 11:16:01 -04:00
Patrick Donnelly
2142114a2d
qa: add numerous subtree test
When the ESubtreeMap is very large (~5k+ subtrees), the MDS will
end up logging only a few events (as bad as 1) per segment as the
subtree map dominates the segment size.

This test simply creates an artificially large subtree and confirms that
other file system activity completes in a timely manner. This is now
taking advantage of the minor segments which allows for a normal set of
events per log segment (and fewer subtree maps). The test fails on the
current main HEAD.

Historical note: when I first observed this aberrant behavior, the
vstart cluster was actually using mds_debug_subtrees = True (the default
for every vstart cluster). This caused the MDS to write out the subtree
map (for debugging reasons) with every event. When testing the MDS with
large subtrees (distributed ephemeral pinning), this caused the MDS to
slow to a trickle of operations per second. Despite this unintentional
misconfiguration, the problem still exists but the number of auth
subtrees must be large for a particular rank to replicate the behavior.

On main HEAD, the creation of 10k files (workload stage) takes ~110
seconds. On this branch, it takes ~30 seconds.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2023-08-01 11:16:01 -04:00
Venky Shankar
86c9a7a08d Merge PR #51995 into main
* refs/pull/51995/head:
	qa: wait for file to have correct size

Reviewed-by: Milind Changire <mchangir@redhat.com>
2023-08-01 20:36:59 +05:30
Rishabh Dave
27f43c9a89 qa/cephfs: fix test_disallow_monitor_managed_snaps_for_fs_pools
The run_cluster_cmd() method is no longer available because it was
deleted in this PR:
https://github.com/ceph/ceph/pull/50569/files#diff-1c6c246ba42f343603d7174198dd1fb9c2654b6c883594d1a0891096b7a35875L408

Fixes: https://tracker.ceph.com/issues/62243
Signed-off-by: Rishabh Dave <ridave@redhat.com>
2023-07-31 18:18:51 +05:30
Venky Shankar
4e5d800406 Merge PR #51539 into main
* refs/pull/51539/head:
	doc: users now need to provide scrub_mdsdir and recursive flags
	qa: add recursive flag to test_flag_scrub_mdsdir
	mds: remove code to bypass dumping empty header scrub info
	mds: dump_values no more needed
	mds: enqueue ~mdsdir at the time of enqueing root

Reviewed-by: Venky Shankar <vshankar@redhat.com>
2023-07-18 20:15:24 +05:30
Rishabh Dave
f0588bd3b3
Merge pull request #50569 from rishabh-d-dave/CephManager-in-CephFSTestCase
qa/cephfs: add helper methods in CephFSTestCase

Reviewed-by: Venky Shankar <vshankar@redhat.com>
2023-07-13 12:44:22 +05:30
Venky Shankar
a9391e5b29 Merge PR #51959 into main
* refs/pull/51959/head:
	qa: test for session ls with filters
	mds: session ls command appears twice in command listing

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Dhairya Parmar <dparmar@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
2023-07-13 11:18:34 +05:30
Venky Shankar
94f3e167a0 Merge PR #51278 into main
* refs/pull/51278/head:
	mgr/snap_schedule: rephrase log message when pruning
	doc: add note about snap-schedule snapshot retention
	qa: test user defined number of snaps retention spec
	mgr/snap_schedule: adapt test to new argument list
	doc/cephfs: Add note how mds_max_snaps_per_dir affects snapshot retention
	mgr/snap_schedule: Use mds_max_snaps_per_dir as snapshot count limit

Reviewed-by: Dhairya Parmar <dparmar@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
2023-07-13 11:17:16 +05:30
Yuri Weinstein
6e02660f10
Merge pull request #51275 from mchangir/mon-block-osd-pool-mksnap-for-fs-pools
mon: block osd pool mksnap for fs pools


Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2023-07-07 17:09:07 -04:00
Dhairya Parmar
e40ca408a1 qa: add recursive flag to test_flag_scrub_mdsdir
The code has changed: to scrub ~mdsdir at root, the recursive flag
must now be provided along with scrub_mdsdir.

Fixes: https://tracker.ceph.com/issues/59350
Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
2023-06-28 18:12:27 +05:30
Rishabh Dave
0a781ef080 qa/cephfs: use run_ceph_cmd() when cmd output is not needed
In filesystem.py and wherever instances of class Filesystem are used, use
run_ceph_cmd() instead of get_ceph_cluster_stdout() when the output of the
Ceph command is not required.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2023-06-28 17:38:19 +05:30
Rishabh Dave
1bf87e6eec qa/cephfs: add helper methods to filesystem.py
Add run_ceph_cmd(), get_ceph_cmd_stdout() and get_ceph_cmd_result() to
class Filesystem so that running a Ceph command is easier. This affects
not only methods inside class Filesystem but also methods elsewhere that
use an instance of class Filesystem to run Ceph commands.

Instead of "self.fs.mon_manager.raw_cluster_cmd()" writing
"self.fs.run_ceph_cmd()" will suffice.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2023-06-28 17:38:19 +05:30
Rishabh Dave
c7c38ba558 qa/cephfs: when cmd output is not needed call run_ceph_cmd()
instead of get_ceph_cmd_stdout().

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2023-06-28 17:38:19 +05:30
Rishabh Dave
13168834e3 qa/cephfs: add and use get_ceph_cmd_stdout()
Add method get_ceph_cmd_stdout() to class CephFSTestCase so that one
doesn't have to type something as long as
"self.mds_cluster.mon_manager.raw_cluster_cmd()" to execute a
command and get its output. Also delete CephFSTestCase.run_cluster_cmd()
and replace its uses.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2023-06-28 17:38:19 +05:30
Rishabh Dave
f8f2154e54 qa/cephfs: add and use run_ceph_cmd()
Instead of writing something as long as
"self.mds_cluster.mon_manager.run_cluster_cmd()" to execute a command,
let's add a helper method to class CephFSTestCase and use it instead.

With this, running a command becomes simple - "self.run_ceph_cmd()".

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2023-06-28 17:38:19 +05:30
Rishabh Dave
82814ac49d qa/cephfs: add and use get_ceph_cmd_result()
To run a command and get its return value, instead of typing something
as long as "self.mds_cluster.mon_manager.raw_cluster_cmd_result" add a
helper method in CephFSTestCase and use it. This makes this task very
simple - "self.get_ceph_cmd_result()".

Also, remove method CephFSTestCase.run_cluster_cmd_result() in favour of
this new method.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2023-06-28 17:38:14 +05:30
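
The three helpers introduced in this patch series can be pictured as thin wrappers that delegate to the manager object. The sketch below is illustrative only: `FakeMonManager` and `CephFSTestCaseSketch` are hypothetical stand-ins, and the argument conventions shown are assumptions rather than the signatures used in the Ceph qa tree.

```python
class FakeMonManager:
    """Hypothetical stand-in for CephManager: records calls instead of running them."""

    def __init__(self):
        self.calls = []

    def run_cluster_cmd(self, **kwargs):
        self.calls.append(("run", kwargs.get("args")))

    def raw_cluster_cmd(self, *args):
        self.calls.append(("stdout", args))
        return "fake output"

    def raw_cluster_cmd_result(self, *args):
        self.calls.append(("result", args))
        return 0


class CephFSTestCaseSketch:
    """Hypothetical test case exposing short helper names for the three operations."""

    def __init__(self):
        self.mon_manager = FakeMonManager()

    def run_ceph_cmd(self, *args):
        # Run a command when neither its output nor its return value is needed.
        self.mon_manager.run_cluster_cmd(args=args)

    def get_ceph_cmd_stdout(self, *args):
        # Run a command and return what it printed.
        return self.mon_manager.raw_cluster_cmd(*args)

    def get_ceph_cmd_result(self, *args):
        # Run a command and return its exit status.
        return self.mon_manager.raw_cluster_cmd_result(*args)
```

With wrappers like these, a test writes `self.run_ceph_cmd("fs", "ls")` instead of spelling out the full attribute chain, which is the convenience the commit messages describe.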
Venky Shankar
809d475814 Merge PR #49971 into main
* refs/pull/49971/head:
	doc/cephfs: document MDS_CLIENTS_LAGGY health warning
	qa: ignore warnings
	qa: add test cases to check client eviction if an OSD is laggy
	mds,messages: enable beacon to report clients lagginess
	mds: do not evict client on laggy osds
	common: add new config option to defer client eviction
	osd: add method to check for laggy osds

Reviewed-by: Venky Shankar <vshankar@redhat.com>
2023-06-28 10:23:54 +05:30
Venky Shankar
f370b581f6 qa: assign file system affinity for replaced MDS
Otherwise, the MDS that just got replaced can transition to a rank
for another file system and the test cannot deterministically infer
which MDS needs to be checked.

Fixes: http://tracker.ceph.com/issues/61764
Signed-off-by: Venky Shankar <vshankar@redhat.com>
2023-06-27 09:23:14 +05:30
neeraj pratap singh
36bf907f9e qa: test for session ls with filters
Fixes: https://tracker.ceph.com/issues/61444
Signed-off-by: Neeraj Pratap Singh <neesingh@redhat.com>
2023-06-26 17:55:15 +05:30
Rishabh Dave
2e12e5086d qa/cephfs: create admin_remote instance in CephFSTestCase
admin_remote contains lots of methods that can be useful during testing,
so let's have easy access to it too.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2023-06-24 19:33:44 +05:30
Rishabh Dave
0c0041005e qa/cephfs: create CephManager instance in CephFSTestCase
To run a Ceph command conveniently, run_cluster_cmd(), raw_cluster_cmd()
or raw_cluster_cmd_result() must be called. These methods are available
in class CephManager which in turn is available only if an instance of
Filesystem, MDSCluster, CephCluster or MgrCluster is initialized. Having
an instance of CephManager in CephFSTestCase will provide easy access to
these methods.

For example, in CephFS tests writing "self.mon_manager.raw_cluster_cmd()"
instead of writing "self.mds_cluster.mon_manager.raw_cluster_cmd()" will
suffice.

This commit provides a basis for upcoming commits in this patch series.
With the next patches, running a Ceph command will be further simplified.
Just writing self.run_ceph_cmd() will suffice for running a CephFS command.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2023-06-24 19:33:36 +05:30
Rishabh Dave
437f2c75f5 qa/cephfs: don't import entire module needlessly
Importing entire module ceph_manager.py is pointless since only
ceph_manager.CephManager is required in qa/tasks/cephfs/filesystem.py.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2023-06-24 19:18:05 +05:30
Milind Changire
a63ffd4079
qa: test user defined number of snaps retention spec
Signed-off-by: Milind Changire <mchangir@redhat.com>
2023-06-20 20:22:33 +05:30
Rishabh Dave
67b1935a18
Merge pull request #51132 from lxbsz/wip-59349
qa: wait for 100 seconds to make sure the quota to be enforced

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Rishabh Dave <ridave@redhat.com>
2023-06-16 17:52:00 +05:30
Xiubo Li
6183c992d7
Merge pull request #51703 from lxbsz/wip-59683
xfstests_dev: install extra packages from powertools repo for xfsprogs
2023-06-13 09:44:24 +08:00
Xiubo Li
4a60f6749a
Merge pull request #50728 from lxbsz/wip-59195
qa: switch to use the merge fragment for fscrypt
2023-06-13 07:39:00 +08:00
Xiubo Li
dedf3aae65 xfstests_dev: install extra packages from powertools repo for xfsprogs
CentOS Stream 8 has removed the 'device-mapper-devel', 'libedit-devel'
and 'userspace-rcu-devel' packages from the mirrors and we need to
install them from the powertools repo.

Fixes: https://tracker.ceph.com/issues/59683
Signed-off-by: Xiubo Li <xiubli@redhat.com>
2023-06-12 15:08:28 +08:00
Patrick Donnelly
3486dd872f
qa: wait for file to have correct size
Otherwise suspending the netns of the other mount will prevent it from
completing a flush on the file handle or even telling the MDS that the
file size has changed!

Fixes: https://tracker.ceph.com/issues/61409
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2023-06-10 14:15:06 -04:00
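
Waiting for a file to reach its expected size, as the commit above describes, is usually done with a bounded polling loop. This is a generic sketch under stated assumptions, not the test's actual code: the helper name `wait_for_size`, its parameters, and the use of a local `os.stat` call are all hypothetical.

```python
import os
import time


def wait_for_size(path, expected, timeout=30.0, interval=0.5):
    """Poll until the file at path has the expected size; return False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            if os.stat(path).st_size == expected:
                return True
        except FileNotFoundError:
            # The file may not be visible yet; keep polling until the deadline.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Only after such a check succeeds would a test go on to suspend the other mount's network namespace, so that the size change has already been flushed and reported.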
Yuri Weinstein
40b9a8b2cc
Merge pull request #50876 from rishabh-d-dave/qa-ceph-man-get-keyring
qa/ceph_manager: preserve newline char at EOF in keyring

Reviewed-by: Dhairya Parmar <dparmar@redhat.com>
2023-05-25 10:49:53 -04:00
Patrick Donnelly
62d1cc0568
Merge PR #50875 into main
* refs/pull/50875/head:
	mon/MDSMonitor: ignore extraneous up:boot messages
	qa: add test case for mds sending multiple boot messages
	qa: support checking for a log message that should not exist

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Dhairya Parmar <dparmar@redhat.com>
2023-05-25 08:25:34 -04:00
Venky Shankar
7b2968570a Merge PR #49691 into main
* refs/pull/49691/head:
	qa: add test for opening a file via a hard link that is not in the same mds as the inode
	mds: rdlock_path_xlock_dentry supports returning auth target inode

Reviewed-by: Venky Shankar <vshankar@redhat.com>
2023-05-18 12:49:33 +05:30
Dhairya Parmar
51cca9b9dc qa: add test cases to check client eviction if an OSD is laggy
Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
2023-05-17 14:38:31 +05:30
Venky Shankar
8391374c08 Merge PR #51251 into main
* refs/pull/51251/head:
	PendingReleaseNotes: add a note about deleting files from lost+found directory
	qa: add checks that validate removal of entries from lost+found dir
	mds: allow unlink operation under lost+found directory

Reviewed-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2023-05-11 11:21:14 +05:30
Venky Shankar
cc2f423ce1 Merge PR #51201 into main
* refs/pull/51201/head:
	qa: run scrub post file system recovery

Reviewed-by: Dhairya Parmar <dparmar@redhat.com>
Reviewed-by: Kotresh Hiremath Ravishankar <khiremat@redhat.com>
2023-05-11 11:19:13 +05:30
Venky Shankar
4680336650 qa: run scrub post file system recovery
Running a file system scrub is recommended after file system data and
metadata recovery, but running scrub isn't covered in tests.

Fixes: http://tracker.ceph.com/issues/59527
Signed-off-by: Venky Shankar <vshankar@redhat.com>
2023-05-09 22:54:30 -04:00
Milind Changire
ab64bfaaf9
qa: add test to verify blocking of osd pool mksnap for fs pools
Signed-off-by: Milind Changire <mchangir@redhat.com>
2023-05-08 13:23:15 +05:30
Venky Shankar
0252313c87 qa: add checks that validate removal of entries from lost+found dir
Signed-off-by: Venky Shankar <vshankar@redhat.com>
2023-05-06 11:03:09 -04:00