10918 Commits

Author SHA1 Message Date
Ilya Dryomov
1579f3649e
Merge pull request #52560 from petrutlucian94/rbd_service_restart_test
qa: add ceph-rbd windows service restart test 

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
2024-05-17 09:09:38 +02:00
Patrick Donnelly
bfe574c6ce
Merge PR #57302 into main
* refs/pull/57302/head:
	qa/tasks/quiescer: dump ops in parallel

Reviewed-by: Leonid Usov <leonid.usov@ibm.com>
2024-05-16 21:12:51 -04:00
Patrick Donnelly
15f734ec62
qa/tasks/quiescer: dump ops in parallel
Since --flags=locks takes the mds_lock and dumps thousands of ops, the dump
may take a long time to complete for each individual MDS. The entire quiesce
set may time out (and all q ops be killed) before we finish dumping ops.

Fixes: https://tracker.ceph.com/issues/65823
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2024-05-16 12:11:49 -04:00
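
The idea behind the "dump ops in parallel" commit above can be illustrated with a minimal sketch. This is not the quiescer task's actual code: `dump_ops` is a hypothetical stand-in for one slow per-MDS dump, and the thread pool is only one plausible way to start all dumps at once so that the total wall time approaches the slowest single dump rather than the sum of all of them.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def dump_ops(mds_name, delay=0.05):
    """Hypothetical stand-in for one slow per-MDS ops dump."""
    time.sleep(delay)  # simulate a dump that holds a lock for a while
    return f"{mds_name}: ops dumped"


def dump_all_parallel(mds_names, delay=0.05):
    """Start every dump at once instead of one MDS after another."""
    if not mds_names:
        return []
    with ThreadPoolExecutor(max_workers=len(mds_names)) as pool:
        # Executor.map() returns results in the same order as the inputs.
        return list(pool.map(lambda name: dump_ops(name, delay), mds_names))


if __name__ == "__main__":
    for line in dump_all_parallel(["mds.a", "mds.b", "mds.c"]):
        print(line)
```

With a sequential loop the same three dumps would take roughly three times `delay`, which is the kind of slowdown that lets a quiesce set expire before dumping finishes.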
NitzanMordhai
48db64c217
Merge pull request #56743 from NitzanMordhai/wip-nitzan-backword-forword-dencoder-tests
suites: adding dencoder test multi versions
2024-05-16 15:40:11 +03:00
Venky Shankar
999ca78a1a Merge PR #56944 into main
* refs/pull/56944/head:
	qa: add a YAML to ignore MGR_DOWN warning

Reviewed-by: Venky Shankar <vshankar@redhat.com>
2024-05-16 14:52:13 +05:30
Venky Shankar
deb2cddb7e Merge PR #57275 into main
* refs/pull/57275/head:
	qa/fsx: use a specified sha1 to build the xfstest-dev

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Leonid Usov <leonid.usov@ibm.com>
2024-05-16 11:26:45 +05:30
nmordech@redhat.com
3f26a965f6 suites: adding dencoder test multi versions
We are currently conducting regular ceph-dencoder tests for backward compatibility.
However, we are omitting tests for forward compatibility.
This suite will introduce tests against the ceph-objects-corpus to address forward
compatibility issues that may arise.
The script installs version N-2 and runs it against the latest corpus objects
that we have, then installs versions N-1 to N and checks them as well.

Signed-off-by: Nitzan Mordechai <nmordech@redhat.com>
2024-05-16 05:16:17 +00:00
Patrick Donnelly
70ed3825f8
Merge PR #57274 into main
* refs/pull/57274/head:
	mds: don't stall the asok thread for flush commands
	qa/quiescer: relax some timing requirements in the quiescer

Reviewed-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
2024-05-15 22:56:38 -04:00
Patrick Donnelly
3e92f50796
Merge PR #57329 into main
* refs/pull/57329/head:
	qa: unmount clients before damaging the fs

Reviewed-by: Kotresh Hiremath Ravishankar <khiremat@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
2024-05-15 22:56:11 -04:00
Lucian Petrut
8d294f948a qa: update rbd-wnbd test, using MBR instead of GPT
We're getting the following error while initializing 64MB disks
on WS 2019: "The disk is not large enough to support a GPT
partition style.".

For this reason, we'll use MBR instead.

Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>
2024-05-15 19:49:40 +03:00
Lucian Petrut
d6d36b535c qa: add ceph-rbd windows service restart test
We're adding a test that:

* maps a configurable number of images
* runs a specified test - we're reusing the ones from stress_test,
  making just a few minor changes to allow running the same test
  multiple times
* restarts the ceph-rbd Windows service
* waits for the images to be reconnected and refreshes the mount
  information
* reruns the test
* repeats the above workflow for a specified number of times,
  reusing the same images

This test ensures that:

* mounted images are still available after a service restart
* drive letters are retained
* the image content is retained
* there are no race conditions when connecting or disconnecting
  a large number of images in parallel
* the driver is capable of mapping a specified number of images
  simultaneously

Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>
2024-05-15 19:49:33 +03:00
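
The workflow listed in the service restart commit above can be sketched roughly as follows. This is an illustration only, not the test added by the commit: `run_test`, `restart_service` and `wait_reconnected` are hypothetical callables supplied by the caller, and mapping the images is assumed to have happened before the loop starts.

```python
def run_restart_cycles(images, run_test, restart_service,
                       wait_reconnected, iterations):
    """Run a test, restart the service, wait, rerun; reuse the same images.

    Returns the total number of per-image test runs performed.
    """
    runs = 0
    for _ in range(iterations):
        # Run the specified test against every mapped image.
        for image in images:
            run_test(image)
            runs += 1
        # Restart the service, then wait until all images are reconnected.
        restart_service()
        wait_reconnected(images)
        # Rerun the test to confirm the images and their content survived.
        for image in images:
            run_test(image)
            runs += 1
    return runs
```

Passing the steps in as callables keeps the loop independent of how images are mapped or how the service is restarted on a given platform.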
Lucian Petrut
808d42d575 qa: reorganize Windows python test
We're splitting the rbd-wnbd python test into separate files so
that the common code may easily be reused by other tests. This
also makes the code easier to read and maintain.

Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>
2024-05-15 08:54:55 +03:00
Ilya Dryomov
6e11983255
Merge pull request #57444 from idryomov/wip-51845
qa/suites/krbd: drop pre-single-major and move "layering only" coverage

Reviewed-by: Ramana Raja <rraja@redhat.com>
2024-05-14 10:06:01 +02:00
Yuval Lifshitz
5327afb7f7
Merge pull request #56979 from yuvalif/wip-yuval-65337
rgw/notification: start/stop endpoint managers in notification manager

Reviewed-By: cbodley@ibm.com, kchheda3@bloomberg.net
2024-05-13 17:25:27 +03:00
Ilya Dryomov
ad6a95d8af qa/suites/krbd: rename no-object-map to no-exclusive-lock
Exclusive lock has always been disabled by this facet, so it might as
well be reflected in its name.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2024-05-13 13:54:30 +02:00
Ilya Dryomov
7b9f28e743 qa/suites/krbd: move "layering only" coverage to fsx
It makes much more sense there since it's where we actually create
clones and flatten them a lot.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2024-05-13 13:54:30 +02:00
Ilya Dryomov
39a579144c qa/suites/krbd: drop pre-single-major test
Single-major mapping scheme was introduced in 2014 and became the
default in 2017.  It's getting increasingly difficult to build and,
more importantly, to boot a 10 year old kernel with recent userspace
(systemd, etc).  If someone is still running such a kernel, it's
really unlikely that they would have the most recent rbd CLI tool
installed.

Fixes: https://tracker.ceph.com/issues/51845
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2024-05-13 13:54:30 +02:00
Venky Shankar
bd42acbbe1 Merge PR #56699 into main
* refs/pull/56699/head:
	qa: ignore `Invalid tag char` warning
	qa: ignore `object missing on disk` warning

Reviewed-by: Milind Changire <mchangir@redhat.com>
Reviewed-by: Leonid Usov <leonid.usov@ibm.com>
Reviewed-by: Jos Collin <jcollin@redhat.com>
2024-05-13 17:02:41 +05:30
NitzanMordhai
6d36b0d0f8
Merge pull request #55923 from NitzanMordhai/wip-nitzan-add-deprecate-cls-gather
objclass: deprecate cls_cxx_gather
2024-05-13 09:00:51 +03:00
NitzanMordhai
0f060f6e33
Merge pull request #56983 from NitzanMordhai/wip-nitzan-thrash-erasure-code-crush-4-nodes-8-6
suites/ec-rados-plugin=jerasure-k=8-m=6-crush: roles set
2024-05-13 08:12:14 +03:00
nmordech@redhat.com
0928f7b0c3 rados/test: Remove cls_remote_reade since gather deprecated
https://tracker.ceph.com/issues/64258
Signed-off-by: Nitzan Mordechai <nmordech@redhat.com>
2024-05-12 10:25:40 +00:00
Yuval Lifshitz
fdfa6673da rgw/notifications: enable notifications debug logs in vstart
as well as in:
- multisite tests (used for notification v2 migration tests)
- the qa suites running notifications

enable lifecycle logs in notification tests: for the lc notification test cases

this is needed after: 429967917b

Signed-off-by: Yuval Lifshitz <ylifshit@ibm.com>
2024-05-12 10:16:20 +00:00
Adam King
69bd270cf5
Merge pull request #57214 from adk3798/cephadm-ignore-stray-on-upgrade
qa/cephadm: ignore CEPHADM_STRAY_DAEMON in upgrade tests

Reviewed-by: John Mulligan <jmulligan@redhat.com>
2024-05-10 16:11:36 -04:00
Adam King
b1f7205de0
Merge pull request #57089 from adk3798/test-cephadm-timeout-ignore-refresh-failed
qa/cephadm: ignore CEPHADM_REFRESH_FAILED on timeout test

Reviewed-by: John Mulligan <jmulligan@redhat.com>
2024-05-10 11:57:23 -04:00
Adam King
ad8329b168
Merge pull request #57080 from phlogistonjohn/jjm-teuth-pull-img
qa/tasks/cephadm: fix pulling containers from private registries

Reviewed-by: Adam King <adking@redhat.com>
2024-05-10 11:56:00 -04:00
Adam King
ec3f1f3370
Merge pull request #57042 from adk3798/host-drain-test-ignore-stray-host
qa/cephadm: ignore stray warnings on host drain test

Reviewed-by: John Mulligan <jmulligan@redhat.com>
2024-05-10 11:51:16 -04:00
Adam King
d67aa83e1d
Merge pull request #56970 from adk3798/cephadm-ignore-orch-paused
qa/cephadm: ignore CEPHADM_PAUSED on test_orch_cli test

Reviewed-by: Michael Fritch <mfritch@suse.com>
2024-05-10 11:48:22 -04:00
Dhairya Parmar
7d954cefb1 qa: add a YAML to ignore MGR_DOWN warning
RCA showed that it is not the NFS code that led to the warning, since the
warning occurred before the test cases started to execute. Later on, after
some discussion with Venky and Greg, it was found that some recent clog
changes lead to this warning being added to the clog.

Digging further, it was found that the warning is generated when `mgr fail`
is run while no mgr is available. The mgr is unavailable because when
`setup_mgrs()` in class `MgrTestCase` stops the mgr daemons, sometimes the mgr
just crashes - `mgr handle_mgr_signal  *** Got signal Terminated ***` - after
which `mgr fail` (again part of `setup_mgrs()`) is run and the `MGR_DOWN`
warning is generated.

This warning is only evident in nfs because it is the only fs suite that
makes use of class `MgrTestCase`. To support my analysis, I ran about eight
jobs in teuthology and could not reproduce this warning. Since this does not
harm the NFS test case execution and the logs do mention that the mgr
daemon did get restarted (`INFO:tasks.cephadm.mgr.x:Restarting mgr.x
(starting--it wasn't running)...`), it is reasonable to conclude that ignoring
this warning is the simplest solution.

Fixes: https://tracker.ceph.com/issues/65265
Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
2024-05-08 16:19:38 +05:30
Casey Bodley
9c89fb19b6
Merge pull request #57258 from cbodley/wip-qa-rgw-d4n-tests
qa/rgw/d4n: run ceph_test_d4n_ tests

Reviewed-by: Pritha Srivastava <prsrivas@redhat.com>
2024-05-07 22:10:16 +01:00
Patrick Donnelly
b54c9e8910
Merge PR #57192 into main
* refs/pull/57192/head:
	PendingReleaseNotes: add note on the client incompatibility health warning and feature bit
	doc/cephfs: add client_mds_auth_caps client feature bit
	doc/cephfs: add missing client feature bits
	doc/cephfs: document MDS_CLIENTS_BROKEN_ROOTSQUASH health error
	qa: add tests for MDS_CLIENTS_BROKEN_ROOTSQUASH
	mds: raise health warning if client lacks feature for root_squash
	mon/MDSMonitor: add note about missing metadata inclusion
	mds: check relevant caps for fs include root_squash
	mds: refactor out fs_name match in MDSAuthCaps
	qa: test for root_squash with multiple caps
	qa: pass kwargs to mount from remount
	qa: simplify update_attrs and only update relevant keys
	client: allow overriding client features

Reviewed-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Rishabh Dave <ridave@redhat.com>
2024-05-07 15:49:03 -04:00
Patrick Donnelly
2d1715fcaf
Merge PR #57166 into main
* refs/pull/57166/head:
	qa: make quiesce ops dump world readable
	qa: use specific ops/cache dump file names

Reviewed-by: Leonid Usov <leonid.usov@ibm.com>
2024-05-07 10:34:19 -04:00
Patrick Donnelly
1a058b5694
Merge PR #57165 into main
* refs/pull/57165/head:
	qa: ignore variation of PG_DEGRADED health warning

Reviewed-by: Leonid Usov <leonid.usov@ibm.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
2024-05-07 10:33:44 -04:00
Patrick Donnelly
9d0ab233d8
qa: add tests for MDS_CLIENTS_BROKEN_ROOTSQUASH
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2024-05-07 08:19:28 -04:00
Patrick Donnelly
bccc8ceb47
qa: test for root_squash with multiple caps
Where the client has root_squash for one cap but not for another. The fs
without root_squash should not necessarily reject the client.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2024-05-07 08:19:27 -04:00
Patrick Donnelly
afcbfc040b
qa: pass kwargs to mount from remount
So we can pass mntargs.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2024-05-07 08:19:27 -04:00
Patrick Donnelly
597ff3cb15
qa: simplify update_attrs and only update relevant keys
So we can just pass the caller's kwargs to update_attrs.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2024-05-07 08:19:27 -04:00
Patrick Donnelly
907722553c
qa: unmount clients before damaging the fs
So clients do not try to unmount a damaged fs.

Fixes: https://tracker.ceph.com/issues/65837
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2024-05-07 08:09:46 -04:00
Patrick Donnelly
a907c7eb57
qa: make quiesce ops dump world readable
Fixes: https://tracker.ceph.com/issues/65701
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2024-05-06 21:04:26 -04:00
Patrick Donnelly
1114e99aea
qa: use specific ops/cache dump file names
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2024-05-06 21:04:26 -04:00
Leonid Usov
ae6b388dd9 qa/quiescer: relax some timing requirements in the quiescer
Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
Fixes: https://tracker.ceph.com/issues/65803
2024-05-06 19:58:53 +03:00
Ilya Dryomov
ab12ccbc72
Merge pull request #57082 from idryomov/wip-65487
rbd-mirror: clean up stale pool replayers and callouts better

Reviewed-by: N Balachandran <nibalach@redhat.com>
2024-05-06 17:41:36 +02:00
Xiubo Li
740025da22 qa/fsx: use a specified sha1 to build the xfstest-dev
This sha1 is the latest master head and works well for our tests.

Fixes: https://tracker.ceph.com/issues/64572
Signed-off-by: Xiubo Li <xiubli@redhat.com>
2024-05-06 20:16:49 +08:00
Ilya Dryomov
d1d848276f qa/workunits/rbd: wait for replaying status in bootstrap tests
wait_for_replay_complete() doesn't wait for image status to get
updated.  This didn't matter previously because these tests are run on
two different pools and nothing else was following.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2024-05-06 11:47:52 +02:00
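
The distinction drawn in the commit above, between replay having completed and the reported image status having caught up, comes down to polling the status until it shows the expected value. The sketch below is a generic Python illustration under that assumption; it is not the shell helper used by the rbd workunits, and `get_status` is a hypothetical callable that returns the current status string.

```python
import time


def wait_for_status(get_status, expected, timeout=30.0, interval=1.0):
    """Poll get_status() until it returns `expected` or the timeout expires.

    Returns True if the expected status was observed, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        if get_status() == expected:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A test that checks the status immediately after replay completes, without such a wait, can observe a stale value and fail intermittently.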
Matan Breizman
8bb8a6978c
Merge pull request #57245 from Matan-B/wip-crimson-only-testing-report
qa/config/crimson_qa_overrides: adjust mgr_stats_period

Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
2024-05-06 10:33:35 +03:00
Ilya Dryomov
b7e79642d5 rbd-mirror: remove callout when destroying pool replayer
If a pool replayer is removed in an error state (e.g. after failing to
connect to the remote cluster), its callout should be removed as well.
Otherwise, the error would persist causing "daemon health: ERROR"
status to be reported even after a new pool replayer is created and
started successfully.

Fixes: https://tracker.ceph.com/issues/65487
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2024-05-05 21:11:54 +02:00
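
The behaviour described in the commit above can be modelled with a small sketch. It is a simplified Python model, not rbd-mirror's C++ code: the `ServiceDaemon` and `PoolReplayer` classes and all their methods here are hypothetical stand-ins, chosen only to show why a callout left behind by a destroyed replayer keeps the reported health at ERROR.

```python
class ServiceDaemon:
    """Hypothetical registry of health callouts, keyed by an integer id."""

    def __init__(self):
        self._callouts = {}
        self._next_id = 1

    def add_callout(self, level, text):
        callout_id = self._next_id
        self._next_id += 1
        self._callouts[callout_id] = (level, text)
        return callout_id

    def remove_callout(self, callout_id):
        self._callouts.pop(callout_id, None)

    def health(self):
        # Any remaining error callout keeps the overall health at ERROR.
        levels = [level for level, _ in self._callouts.values()]
        return "ERROR" if "error" in levels else "OK"


class PoolReplayer:
    """Hypothetical replayer that owns at most one callout."""

    def __init__(self, daemon):
        self._daemon = daemon
        self._callout_id = None

    def fail(self, text):
        self._callout_id = self._daemon.add_callout("error", text)

    def destroy(self):
        # Remove the callout together with the replayer, so the error
        # does not outlive the object that raised it.
        if self._callout_id is not None:
            self._daemon.remove_callout(self._callout_id)
            self._callout_id = None
```

Without the cleanup in `destroy()`, a replacement replayer that starts successfully would still be reported under an ERROR health status.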
Casey Bodley
385b97c350 qa/rgw/d4n: run ceph_test_d4n_ tests
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2024-05-03 12:30:50 -04:00
Rishabh Dave
04416f48ef
Merge pull request #56846 from rishabh-d-dave/test-fs-auth
qa/cephfs: fix and improve test_multifs_single_path_rootsquash

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
2024-05-03 19:12:58 +05:30
Connor Fawcett
b295696730
Merge pull request #57235 from connorfawcett/ec-bench-update
qa/workunits/erasure-code: add bench data tables and graph support for additional jerasure techniques
2024-05-03 10:33:38 +01:00
Rishabh Dave
35b62488f4
Merge pull request #56732 from mchangir/mgr-snap_schedule-restore-yearly-spec-from-Y-to-y
mgr/snap_schedule: restore yearly spec to lowercase y

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Jos Collin <jcollin@redhat.com>
Reviewed-by: Rishabh Dave <ridave@redhat.com>
2024-05-03 14:05:50 +05:30
Rishabh Dave
d9752a6973 qa/cephfs: fix test_multifs_single_path_rootsquash
test_multifs_single_path_rootsquash was never run with vstart_runner.py
or with teuthology and is therefore full of bugs. Fix it to make sure it
runs fine.

Introduced-by: 1fda8ed2d4
Fixes: https://tracker.ceph.com/issues/65246
Signed-off-by: Rishabh Dave <ridave@redhat.com>
2024-05-03 13:33:30 +05:30