Commit Graph

3479 Commits

Author SHA1 Message Date
Kotresh HR
96c7963404 qa: Validate file quota attrs on clone subvolume
Fixes: https://tracker.ceph.com/issues/54121
Signed-off-by: Kotresh HR <khiremat@redhat.com>
2022-02-15 13:40:17 +05:30
Adam King
884dc76683
Merge pull request #44965 from adk3798/test_cli_timeout
qa/tasks/cephadm_cases: increase timeouts in test_cli.py

Reviewed-by: Michael Fritch <mfritch@suse.com>
2022-02-14 08:24:32 -05:00
Yuri Weinstein
2624f51a72
Merge pull request #44588 from kamoltat/wip-ksirivad-disable-progress-by-default
pybind/mgr/progress: disable pg recovery event by default

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2022-02-11 14:49:17 -08:00
Adam King
46f939f057 qa/tasks/cephadm_cases: increase timeouts in test_cli.py
These tests fail intermittently, but in my testing the expected
events sometimes arrive only a few seconds after we hit the
timeout. Increase the timeouts to see if this makes the tests
more consistent. There is no need to mark a test as failed
because something is reported after 34 seconds rather than 25,
especially when cephadm works on a cyclic daemon refresh.

Signed-off-by: Adam King <adking@redhat.com>
2022-02-09 20:42:42 -05:00
Patrick Donnelly
e883dc3b82
Merge PR #42000 into master
* refs/pull/42000/head:
	qa: update rhel kclient to setup container tools
	qa: stop overriding distro for k-testing
	qa: only use RHEL for workload testing
	qa: convert fs:workload to use cephadm
	qa: split fs begin task
	qa/tasks/cephadm: setup CephManager when OSDs are provisioned
	qa/tasks/cephadm: setup file system if MDS are provisioned

Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
2022-02-09 09:34:49 -05:00
Venky Shankar
b7af2a94a4
Merge pull request #42549 from ajarr/wip-add-volume-rename
mgr/volumes: Add `fs volume rename` command

Reviewed-by: Venky Shankar <vshankar@redhat.com>
2022-02-09 11:34:32 +05:30
Kamoltat
f06da20dff pybind/mgr/progress: disable pg recovery event by default
The progress module now disables the pg recovery event by default
since the event is expensive and has interrupted other services
when OSDs are being marked in/out of the cluster.

To turn the event on manually:

ceph config set mgr mgr/progress/allow_pg_recovery_event true

Updated qa/tasks/mgr/test_progress.py to enable
the pg recovery event when testing the progress module.

Signed-off-by: Kamoltat <ksirivad@redhat.com>
2022-02-03 17:51:42 +00:00
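
The toggle described in the commit above could be scripted roughly as follows. This is a minimal sketch, assuming the `ceph` CLI is on PATH and a cluster is reachable; `build_config_set_cmd` and `set_pg_recovery_event` are hypothetical helper names, and only the option name comes from the commit message.

```python
import subprocess
from typing import List

# Option name taken verbatim from the commit message.
PG_RECOVERY_OPTION = "mgr/progress/allow_pg_recovery_event"


def build_config_set_cmd(enabled: bool) -> List[str]:
    """Return the argv for `ceph config set mgr <option> <true|false>`."""
    value = "true" if enabled else "false"
    return ["ceph", "config", "set", "mgr", PG_RECOVERY_OPTION, value]


def set_pg_recovery_event(enabled: bool) -> None:
    """Run the command (hypothetical wrapper; needs a live cluster)."""
    subprocess.run(build_config_set_cmd(enabled), check=True)
```

Keeping the argv construction separate from the call makes the command easy to inspect without touching a cluster.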
Soumya Koduri
9dfe5ac714 rgw/qa: Add test suite for lifecycle cases
Execute lifecycle s3-tests in the teuthology test-suite by configuring
required storage classes and 'rgw lc debug interval' option.

Signed-off-by: Soumya Koduri <skoduri@redhat.com>
2022-02-03 00:16:03 +05:30
Patrick Donnelly
27c1110129
qa/tasks/cephadm: setup CephManager when OSDs are provisioned
The Filesystem object may use this when configuring EC data pools at
file system creation (via a FuseMount).

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2022-02-02 10:44:34 -05:00
Patrick Donnelly
2436405c5d
qa/tasks/cephadm: setup file system if MDS are provisioned
This is the same behavior/code as what the ceph task does.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2022-02-02 10:44:34 -05:00
Ilya Dryomov
82219b3bea
Merge pull request #44282 from orozery/qa-qemu-nbd-ide-interface
qa/tasks/qemu: switch nbd devices from virtio to ide

Reviewed-by: Mykola Golub <mgolub@suse.com>
Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
2022-01-25 15:06:14 +01:00
Ali Maredia
99f0e82a95 qa: move certificates for kmip task into /etc/ceph
On rhel/centos the ceph user does not have permission
to access these certs which leads to s3-test failures
in teuthology.

Signed-off-by: Ali Maredia <amaredia@redhat.com>
2022-01-20 17:43:55 -05:00
Venky Shankar
ac28356234
Merge pull request #44557 from kotreshhr/clone-quota-failure
mgr/volumes: Fix subvolume snapshot clone failure

Reviewed-by: Venky Shankar <vshankar@redhat.com>
2022-01-20 16:42:35 +05:30
Aashish Sharma
f771cd492c mgr/dashboard: Improve notifications for osd nearfull, full
This PR adds some visual hints for osds that are near full or full

Fixes: https://tracker.ceph.com/issues/53334
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
2022-01-19 16:35:27 +05:30
Ernesto Puerta
197987a5a8
Merge pull request #42603 from cypherean/feedback_frontend
mgr/dashboard: report ceph tracker bug/feature through GUI

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
Reviewed-by: Volker Theile <vtheile@suse.com>
2022-01-18 19:47:13 +01:00
Guillaume Abrioux
f8e22fb3da qa/nvme_loop: fix an issue on ubuntu 18.04
The following command:

```
echo /dev/sda | tee /sys/kernel/config/nvmet/subsystems/sda/namespaces/1/device_path
```

makes nvme_loop fail because, fascinatingly, it adds an unexpected newline.

See:
```
/dev/sda
/dev/sda

1
tee: /sys/kernel/config/nvmet/subsystems/sda/namespaces/1/enable: No such file or directory
/dev/sda
1
```

Other distros don't have the same behavior:

```
CentOS 8
/dev/sda
/dev/sda
1

Ubuntu 20.04
/dev/sda
/dev/sda
1
```

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-01-17 17:10:08 +01:00
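
One way to sidestep a stray newline when writing such an attribute file is to write the value without any trailing newline. The sketch below only illustrates that idea and is not necessarily the fix this commit applied; `write_attr` is a hypothetical helper.

```python
from pathlib import Path


def write_attr(path, value: str) -> None:
    """Write `value` to `path` with any trailing newline removed.

    Hypothetical helper: on a real system `path` would be a configfs
    attribute such as .../namespaces/1/device_path.
    """
    Path(path).write_text(value.rstrip("\n"))
```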
Avan Thakkar
ed2b4e7a56 mgr/dashboard: report ceph tracker bug/feature through GUI
Fixes: https://tracker.ceph.com/issues/44851
Signed-off-by: Shreya Sharma <shreyasharma.ss305@gmail.com>
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
2022-01-17 19:45:31 +05:30
Kotresh HR
7c0d31e52c qa: Add tests for snapshot clone failure with quota
Fixes: https://tracker.ceph.com/issues/53848
Signed-off-by: Kotresh HR <khiremat@redhat.com>
2022-01-17 12:14:33 +05:30
Ilya Dryomov
3c2b05a252
Merge pull request #44571 from idryomov/wip-xfstests-qemu-cert
qa/run_xfstests_qemu.sh: stop reporting success without actually running any tests

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
2022-01-14 10:28:06 +01:00
Ramana Raja
70697629bf mgr/volumes: Add fs volume rename command
The `fs volume rename` command renames the volume, i.e.,
orchestrator MDS service, file system, and the data and
metadata pool of the file system.

Fixes: https://tracker.ceph.com/issues/51162
Signed-off-by: Ramana Raja <rraja@redhat.com>
2022-01-13 10:36:46 -05:00
Venky Shankar
6b59fe1bec
Merge pull request #44397 from lxbsz/wip-53726
mds: dump tree '/' when the path is empty

Reviewed-by: Venky Shankar <vshankar@redhat.com>
2022-01-13 18:15:24 +05:30
Yuval Lifshitz
b709091d81
Merge pull request #43995 from TRYTOBE8TME/wip-rgw-kafka-teuth-cleanup
qa/tasks: Checking for kafka cleanup
2022-01-13 11:57:03 +02:00
Ilya Dryomov
b47965b577 qa/tasks/qemu: get the new Let's Encrypt root certificate
Fixes: https://tracker.ceph.com/issues/53841
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2022-01-12 20:53:45 +01:00
Yuri Weinstein
a8bb49d4d9
Merge pull request #39440 from pdvian/wip-warn-filestore-osds
mon/OSDMonitor, osd: Add warning on filestore deprecation and force use of wpq scheduler for filestore OSDs

Reviewed-by: Neha Ojha <nojha@redhat.com>
2022-01-12 08:49:02 -08:00
Casey Bodley
81d3517bde
Merge pull request #42891 from ofriedma/wip-ofriedma-rgw-qos-finale
rgw: Add rgw rate limiting per user and per bucket

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
Reviewed-by: Yuval Lifshitz <ylifshit@redhat.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>
2022-01-11 11:35:05 -05:00
Kalpesh Pandya
6135747a06 qa/tasks: Checking for kafka cleanup
Add a sleep after running the ./kafka-server-stop.sh and ./zookeeper-server-stop.sh
scripts so that nothing gets logged into the kafka logs after the sleep time,
and finally kill the process.

This resolves: https://tracker.ceph.com/issues/53220

Signed-off-by: Kalpesh Pandya <kapandya@redhat.com>
2022-01-11 21:14:15 +05:30
Ernesto Puerta
f5237e8b4a
Merge pull request #44088 from ceph/feature-48388-cache
mgr: TTL cache implementation

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
Reviewed-by: sebastian-philipp <NOT@FOUND>
2022-01-11 12:52:41 +01:00
Or Friedmann
fd084fd7fc rgw: Add admin ops API for rate limiting
Add admin ops API for rate limiting and some bug fixes

Signed-off-by: Or Friedmann <ofriedma@redhat.com>
2022-01-10 16:48:56 +00:00
Prashant D
39f5a61a3a mon/OSDMonitor: Raise health warning for filestore osds
Filestore will be deprecated in Quincy, considering
that BlueStore has been the default objectstore for
quite some time.

Fixes: https://tracker.ceph.com/issues/49275

Signed-off-by: Prashant D <pdhange@redhat.com>
2022-01-05 10:08:25 +00:00
Xiubo Li
bbc4f4461f qa: add test for dumping subtrees
Fixes: https://tracker.ceph.com/issues/53726
Signed-off-by: Xiubo Li <xiubli@redhat.com>
2022-01-05 17:58:53 +08:00
Pere Diaz Bou
15dfa71cf7 mgr: TTLCache basic implementation
Signed-off-by: Pere Diaz Bou <pdiazbou@redhat.com>
Fixes: https://tracker.ceph.com/issues/48388
2022-01-05 10:11:58 +01:00
Nikhilkumar Shelke
a00893fbba qa: test cases for ceph fs perf stats command
Fixes: https://tracker.ceph.com/issues/48473
Signed-off-by: Nikhilkumar Shelke <nshelke@redhat.com>
2022-01-05 10:45:05 +05:30
Venky Shankar
a612b3cb85
Merge pull request #43618 from kotreshhr/recover-symlink
mds: Store symlink target in first data object

Reviewed-by: Venky Shankar <vshankar@redhat.com>
2022-01-04 14:39:22 +05:30
Venky Shankar
4d372e9557
Merge pull request #43236 from mchangir/mgr/snap_schedule-fix-db-connection-concurrent-usage
mgr/snap_schedule: fix db connection concurrent usage

Reviewed-by: Venky Shankar <vshankar@redhat.com>
2022-01-04 14:36:07 +05:30
Patrick Donnelly
135be96971
Merge PR #44342 into master
* refs/pull/44342/head:
	mds: trigger stray reintegration when loading dentry
	qa: test that scrub causes reintegration

Reviewed-by: Xiubo Li <xiubli@redhat.com>
2021-12-27 12:55:30 -05:00
Patrick Donnelly
91a1d81eff
Merge PR #44322 into master
* refs/pull/44322/head:
	mds: skip directory size checks for reintegration
	qa: test reintegration with directory limits

Reviewed-by: Xiubo Li <xiubli@redhat.com>
2021-12-23 08:52:10 -05:00
Mykola Golub
a9a09fffae qa/tasks: improve backfill_toofull test
1) Write more data to the pool so we operate with larger ratios.
2) Round up ratios when truncating.

Fixes: https://tracker.ceph.com/issues/53677
Signed-off-by: Mykola Golub <mgolub@suse.com>
2021-12-21 20:00:28 +02:00
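
Point 2 of the commit above, rounding up instead of truncating, can be illustrated as follows. This is a sketch under assumptions: the two-digit precision and the names `round_up` and `truncate` are hypothetical and not taken from the test itself. `decimal` is used so that binary floating-point noise does not push an exact value up a step.

```python
from decimal import Decimal, ROUND_CEILING, ROUND_DOWN


def _quantize(value: float, digits: int, mode: str) -> float:
    # Decimal(1).scaleb(-2) is Decimal('0.01'), i.e. two decimal places.
    step = Decimal(1).scaleb(-digits)
    return float(Decimal(str(value)).quantize(step, rounding=mode))


def truncate(value: float, digits: int = 2) -> float:
    """Drop extra digits; the result may fall below the input."""
    return _quantize(value, digits, ROUND_DOWN)


def round_up(value: float, digits: int = 2) -> float:
    """Shorten to `digits` places without ever going below the input."""
    return _quantize(value, digits, ROUND_CEILING)
```

With truncation, a ratio such as 0.129 becomes 0.12, which is below the real value; rounding up yields 0.13 and so keeps the shortened ratio at or above the real one.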
Joseph Sawaya
8742173e20
Merge pull request #43139 from josephsawaya/rook-orch-qa
qa/tasks/rook: test reapplication of drive groups stored in mgr
2021-12-17 12:50:18 -05:00
Joseph Sawaya
280d735847 qa/tasks/rook: test reapplication of drive groups stored in mgr
This commit adds testing for the drive_group_loop in the Rook orchestrator
that reapplies drive groups that were applied previously.

This test removes an OSD, zaps the underlying device then waits for the OSD
to be re-created by the drive_group_loop.

This commit also updates the rook test suite to test v1.7.2 instead of 1.7.0
since `orch device zap` is only supported from v1.7.2 onwards.

Fixes: https://tracker.ceph.com/issues/53501

Signed-off-by: Joseph Sawaya <jsawaya@redhat.com>
2021-12-16 18:17:29 -05:00
Patrick Donnelly
bf4168245d
qa: test that scrub causes reintegration
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-12-16 14:29:12 -05:00
Sage Weil
735869d891 Merge PR #44211 into master
* refs/pull/44211/head:
	mon: increase mon_down_mkfs_grace to 2m

Reviewed-by: Kefu Chai <kchai@redhat.com>
2021-12-16 09:52:24 -05:00
Patrick Donnelly
fe46985a63
qa: test reintegration with directory limits
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-12-15 11:05:18 -05:00
Or Ozeri
555a2896d7 qa/tasks/qemu: switch nbd devices from virtio to ide
This commit is a workaround of a bug in the virtio interface in qemu 6.1.0+.

Fixes: https://tracker.ceph.com/issues/53587
Signed-off-by: Or Ozeri <oro@il.ibm.com>
2021-12-13 10:49:48 +02:00
Matt Benjamin
669798af85
Merge pull request #42104 from linuxbox2/wip-rgwadminops-fsid
rgw: expose RADOS cluster_fsid via adminops
2021-12-09 18:35:03 -05:00
Matt Benjamin
338d024a28 rgw:adminops: remove "import json" from radosgw_admin_rest.py
This is perhaps erring a bit on the side of cosmetic fixes.

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
2021-12-09 13:24:52 -05:00
Matt Benjamin
bcbf3b1067 rgw:adminops: slightly generalize /info
Adds a get_name() method to rgw::sal::Store, by which each store
returns its unique name in lowercase.

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
2021-12-09 10:48:06 -05:00
Matt Benjamin
e8716889c7 rgw:adminops: add test case for 'info' section
Add 'info' section test case to the radosgw_admin_test.py qa
task.

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
2021-12-09 10:47:49 -05:00
Venky Shankar
a15e6d6721 qa: exclude nofallback mount option when using v1-style syntax
Otherwise, certain upgrade tests that install pacific or earlier
releases fail, since the mount helper does not understand this
mount option and passes it to the kernel, which does not
handle this config, causing the mount to fail in tests.

Note that this mount config is only used during teuthology tests
to catch v2-style syntax implementation bugs in the kernel.

Fixes: http://tracker.ceph.com/issues/53487
Signed-off-by: Venky Shankar <vshankar@redhat.com>
2021-12-09 13:30:22 +05:30
Kotresh HR
75291459f2 qa/test_backtrace: Validate symlink xattr is stored
Validate the symlink xattr is stored on first data
object of the symlink along with backtrace if
'mds_symlink_recovery' option is enabled and
vice-versa.

Also add 'string_wrapper' class to decode bufferlist
to string. This helps 'ceph-dencoder' tool to decode
the symlink target stored, which is used in tests to
validate.

Signed-off-by: Kotresh HR <khiremat@redhat.com>
Fixes: https://tracker.ceph.com/issues/46166
2021-12-07 15:39:52 +05:30
Kotresh HR
5c282616c6 qa/cephfs-data-scan: Validate symlink recovery
Validates that the 'cephfs-data-scan' tool correctly recovers
a symlink as a symlink during disaster recovery of the
metadata pool from the data pool.

Signed-off-by: Kotresh HR <khiremat@redhat.com>
Fixes: https://tracker.ceph.com/issues/46166
2021-12-07 15:39:52 +05:30
Sage Weil
99249591ca mon: increase mon_down_mkfs_grace to 2m
1m isn't quite enough for teuthology, mainly because ceph.py
creates the monmap, then does --mkfs on all mons and osds (to create
the initial keyring), and *then* starts the mons.

2m looks like it'll be enough for most cases.

sage-2021-12-02_14:45:50-rados-wip-sage2-testing-2021-12-01-2041-distro-basic-smithi/6540015

Signed-off-by: Sage Weil <sage@newdream.net>
2021-12-03 16:18:47 -06:00
Venky Shankar
4d2791a786
Merge pull request #44063 from vshankar/tr-52487
qa: wait for purge queue operations to finish

Reviewed-by: Venky Shankar <vshankar@redhat.com>
2021-12-02 10:44:55 +05:30
Xiubo Li
dc5e3b2622 qa: correct the parameters' order
The parameters' order is incorrect and missing the client_config.

Introduced-by: 242585656c
Fixes: https://tracker.ceph.com/issues/53216
Signed-off-by: Xiubo Li <xiubli@redhat.com>
2021-12-01 15:06:06 +08:00
Xiubo Li
3b44f20ac0 qa: move the optional client_config parameter to the end
Fixes: https://tracker.ceph.com/issues/53216
Signed-off-by: Xiubo Li <xiubli@redhat.com>
2021-12-01 15:05:58 +08:00
Xiubo Li
b44ee81e7e qa: rename and save the client_config for kernel mount
Fixes: https://tracker.ceph.com/issues/53216
Signed-off-by: Xiubo Li <xiubli@redhat.com>
2021-12-01 15:05:16 +08:00
Venky Shankar
d9c7998323 qa: wait for purge queue operations to finish
TestFragmentation.test_deep_split relies on `num_strays`
to reach zero expecting that the purge threads would
have deleted the directory entries. However, checking
`num_strays` cannot be relied on since PurgeQueue merely
journals the purge item (see PurgeQueue::push) followed
by the StrayManager marking the stray as removed thereby
accounting `num_strays`.

So, add an additional condition to check if the purge
threads have finished processing items.

Fixes: http://tracker.ceph.com/issues/52487
Signed-off-by: Venky Shankar <vshankar@redhat.com>
2021-12-01 09:45:41 +05:30
Venky Shankar
c44f2fcbb7
Merge pull request #43878 from jtlayton/wip-53214
qa: account for split of the kclient "metrics" debugfs file

Reviewed-by: Venky Shankar <vshankar@redhat.com>
2021-12-01 09:38:05 +05:30
Venky Shankar
66763fc588
Merge pull request #43850 from batrick/i53194
mds: defer messages to bootstrapping ranks

Reviewed-by: Venky Shankar <vshankar@redhat.com>
2021-12-01 09:37:00 +05:30
Venky Shankar
f8b939e128 test: mount kclient using new-style (v2) syntax
However, do not throw away the old-style mount syntax: we want to
continue testing it, since users (scripts) might still be
using it.

Signed-off-by: Venky Shankar <vshankar@redhat.com>
2021-11-30 01:13:34 -05:00
Sage Weil
9260265bf1 Merge PR #44107 into master
* refs/pull/44107/head:
	qa/tasks/cephadm_cases/test_cli: fix test_daemon_restart

Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
2021-11-29 13:56:28 -05:00
Sage Weil
9ae9894827 qa/tasks/cephadm_cases/test_cli: fix test_daemon_restart
We cannot schedule a daemon start if there is another daemon action
with a higher priority (including stop) scheduled.  However,
that state isn't cleared until *after* the osd goes down, the
systemctl command returns, and mgr/cephadm gets around to updating
the inventory scheduled_daemon_action state.

Semi-fix: (1) wait for the orch status to change, and then (2)
wait a few more seconds after that.

Signed-off-by: Sage Weil <sage@newdream.net>
2021-11-29 10:24:20 -06:00
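
The two-step "semi-fix" in the commit above (wait for the status to change, then wait a little longer) could look roughly like this. It is a generic polling sketch, not the code from test_cli.py; `wait_for_change`, its parameters, and the default values are assumptions.

```python
import time


def wait_for_change(get_status, old_status, timeout=60.0, interval=1.0, settle=0.0):
    """Poll `get_status()` until it differs from `old_status`.

    After the change is seen, sleep `settle` more seconds so that any
    bookkeeping that lags behind the status has time to catch up.
    Raises TimeoutError if no change is seen within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status()
        if status != old_status:
            time.sleep(settle)
            return status
        time.sleep(interval)
    raise TimeoutError(f"status still {old_status!r} after {timeout}s")
```

In a test, `get_status` would be a callable that queries the orchestrator; here it is left abstract on purpose.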
Sage Weil
9d50154a93 qa/tasks/cephadm: pull image to all hosts in parallel
This doesn't affect bootstrap, but it does mean we avoid any delay
the first time we cephadm.shell on some non-bootstrap host.

Signed-off-by: Sage Weil <sage@newdream.net>
2021-11-25 07:52:56 -06:00
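
The idea of pulling the image on all hosts in parallel rather than one after another can be sketched with a thread pool. This is an illustration only: `run_on_all` is a hypothetical helper, and `action` stands in for whatever per-host command the task actually runs.

```python
from concurrent.futures import ThreadPoolExecutor


def run_on_all(hosts, action):
    """Call `action(host)` for every host concurrently.

    Returns a dict mapping each host to its result. An exception raised
    by any call propagates to the caller.
    """
    hosts = list(hosts)
    with ThreadPoolExecutor(max_workers=max(1, len(hosts))) as pool:
        results = list(pool.map(action, hosts))
    return dict(zip(hosts, results))
```

Because the calls overlap, the total wait is close to the slowest single host instead of the sum over all hosts.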
Sage Weil
3a110f6c00 qa/tasks/cephadm: add hosts via mon remote
If we use a new remote for each shell command, we end up waiting
for the image to pull on every host in sequence.

Signed-off-by: Sage Weil <sage@newdream.net>
2021-11-25 07:52:56 -06:00
Sage Weil
0e40064d31 qa/tasks/cephadm: use shortname for remote directory
This aligns with what the ceph and syslog tasks do.

Signed-off-by: Sage Weil <sage@newdream.net>
2021-11-25 07:52:56 -06:00
Sage Weil
689d7ceabd qa/tasks/cephadm: deploy no more than 5 mons in roleless mode
Signed-off-by: Sage Weil <sage@newdream.net>
2021-11-25 07:52:55 -06:00
Sage Weil
e7bf9242c4 qa/tasks/radosbench: default clients to all clients (not client.0)
Signed-off-by: Sage Weil <sage@newdream.net>
2021-11-25 07:52:55 -06:00
Sage Weil
99cdaaba70 qa/tasks/ceph_manager: parallelize flush_pg_stats()
Signed-off-by: Sage Weil <sage@newdream.net>
2021-11-25 07:52:55 -06:00
Milind Changire
e2e4635c18 qa: add test for concurrent snap creates
Test if the number of snaps on the file-system and the stats on created
snaps in the DB match.

NOTE:
Since it is difficult to get the snapshot created on the exact second,
the timestamp comparison has been limited up to the last 'minute' as the
comparison granularity.

Signed-off-by: Milind Changire <mchangir@redhat.com>
2021-11-24 13:36:30 +05:30
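
The minute-level comparison mentioned in the NOTE above can be expressed by dropping seconds and microseconds before comparing. A minimal sketch; `same_minute` is a hypothetical name and the test may implement this differently.

```python
from datetime import datetime


def same_minute(a: datetime, b: datetime) -> bool:
    """Compare two timestamps at minute granularity."""
    return a.replace(second=0, microsecond=0) == b.replace(second=0, microsecond=0)
```

Two snapshots taken at 13:36:05 and 13:36:59 compare equal, while 13:36:59 and 13:37:00 do not, so a snapshot created right at a minute boundary can still mismatch.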
Patrick Donnelly
402919cbe6
mds: test connections to bootstrapping MDS
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-11-23 09:23:09 -05:00
Matt Benjamin
0046803534 qa/rgw: use local runner with cmdline radosgw_admin.py
Restore ability to run radosgw_admin.py unit standalone--improved
to use vstart_runner hooks.

Local rgwadmin(...) wrapper suggested as a cleanup in review by Casey.

Fixes: https://tracker.ceph.com/issues/52837

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
2021-11-17 11:22:32 -05:00
Laura Flores
92fcfbb464
Merge pull request #43411 from ljflores/wip-mgr-command-cleanup
mon: simplify 'mgr module ls' output
2021-11-10 14:09:51 -06:00
Patrick Donnelly
b4980dd1ed
Merge PR #43767 into master
* refs/pull/43767/head:
	qa: increase the timeout value to wait a little longer

Reviewed-by: Venky Shankar <vshankar@redhat.com>
2021-11-10 14:00:19 -05:00
Jeff Layton
e9f2bff8cd qa: account for split of the kclient "metrics" debugfs file
Recently, Luis posted a patch to turn the metrics debugfs file into a
directory with separate files for the different sections in the old
metrics file.

Account for this change in get_op_read_count().

Fixes: https://tracker.ceph.com/issues/53214
Signed-off-by: Jeff Layton <jlayton@redhat.com>
2021-11-10 13:14:27 -05:00
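
A reader that copes with both layouts described above (a single metrics file, or a directory of per-section files) might look like this. It is a sketch under assumptions: `read_metrics` is hypothetical, no real debugfs file names are assumed, and the actual get_op_read_count() change may differ.

```python
from pathlib import Path


def read_metrics(path) -> str:
    """Return the metrics text from a file or from a directory of files.

    For a directory, the contents of its regular files are concatenated
    in name order; for a plain file, its contents are returned as-is.
    """
    p = Path(path)
    if p.is_dir():
        return "".join(f.read_text() for f in sorted(p.iterdir()) if f.is_file())
    return p.read_text()
```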
Venky Shankar
3a4dd30a1e test: add cephfs_mirror thrasher
Signed-off-by: Venky Shankar <vshankar@redhat.com>
2021-11-09 00:08:05 -05:00
Venky Shankar
087d7aa8ca tasks/cephfs_mirror: optionally run in foreground
The cephfs mirror daemon thrasher needs to send SIGTERM to mirror
daemons. The mirror daemon needs to run in the foreground for
it to receive the signal via `daemon.signal`.

Signed-off-by: Venky Shankar <vshankar@redhat.com>
2021-11-09 00:08:05 -05:00
Patrick Donnelly
93cdc800e2
Merge PR #43666 into master
* refs/pull/43666/head:
	qa/vstart_runner: add "managers" to LocalContext instances

Reviewed-by: Jos Collin <jcollin@redhat.com>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-11-04 16:54:14 -04:00
Patrick Donnelly
373b750bfe
Merge PR #43638 into master
* refs/pull/43638/head:
	qa: pass subdir arg when executing workunit

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-11-04 16:53:20 -04:00
Patrick Donnelly
10d8c7a4a5
Merge PR #43613 into master
* refs/pull/43613/head:
	qa: lengthen health warning wait

Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
2021-11-04 16:52:23 -04:00
Sage Weil
d87d2bd146 Merge PR #43611 into master
* refs/pull/43611/head:
	doc/mgr/nfs: document rgw user and bucket exports
	PendingReleaseNotes: add note about nfs CLI change(s)
	qa/suites/orch/cephadm/smoke-roleless: add rgw user nfs export case
	mgr/nfs: take user-id and/or bucket for 'nfs export create rgw'
	mgr/nfs: reorder 'nfs export create rgw' arguments
	mgr/nfs: reorder 'nfs export create cephfs' arguments
	mgr/nfs: use keyword args for 'nfs export create rgw'
	mgr/nfs: document and use keyword args for 'nfs export create cephfs'
	qa/tasks/cephfs/test_nfs: use keyword args
	pybind/ceph_argparse: handle misordered keyword arguments

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
Reviewed-by: Michael Fritch <mfritch@suse.com>
Reviewed-by: Ramana Raja <rraja@redhat.com>
2021-11-04 14:33:45 -04:00
Sage Weil
fa4ee0f3c6 qa/tasks/cephfs/test_nfs: use keyword args
Signed-off-by: Sage Weil <sage@newdream.net>
2021-11-02 17:06:58 -04:00
Xiubo Li
8795d33185 qa: increase the timeout value to wait a little longer
Sometimes OpenFileTable::commit() arrives just after the 30-second
wait.

Fixes: https://tracker.ceph.com/issues/52887
Signed-off-by: Xiubo Li <xiubli@redhat.com>
2021-11-02 11:06:14 +08:00
Sage Weil
38b6a8e8d0 Merge PR #43101 into master
* refs/pull/43101/head:
	mgr/rook: implement apply rbd-mirror

Reviewed-by: Juan Miguel Olmo <jolmomar@redhat.com>
2021-11-01 15:27:13 -04:00
Patrick Donnelly
e0c19acbf1
Merge PR #43590 into master
* refs/pull/43590/head:
	qa: test that new mounts of same fs function after old mount is evicted
	qa: remove REQUIRE_KCLIENT_REMOTE

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
2021-11-01 12:34:14 -04:00
Casey Bodley
74565072f7
Merge pull request #43625 from alimaredia/wip-marcus-teuthvault-2
qa/rgw: Fix vault token file access.

Reviewed-by: Casey Bodley <cbodley@redhat.com>
2021-10-29 15:14:03 -04:00
Joseph Sawaya
1b808345ee mgr/rook: implement apply rbd-mirror
This commit implements `orch apply rbd-mirror` in the rook orchestrator,
it creates a CR with a default name if the service_id isn't specified in
the spec, else it sets the name of the CR to the service_id in the spec.
This commit also adds `orch apply rbd-mirror` to the rook QA. This commit
also implements `orch rm rbd-mirror`.

Signed-off-by: Joseph Sawaya <jsawaya@redhat.com>
2021-10-28 15:47:13 -04:00
Marcus Watts
454cc8a18c Fix vault token file access.
Put the vault token file in a location that ceph can read.
Make it readable only by ceph.

On rhel8 (and indeed, any vanilla rhel machine), $HOME is liable to be
mode 700.  This means the ceph user can't read things in that user's
directory.  This causes radosgw to emit the confusing message "ERROR:
Vault token file ... not found" even though the teuthology log will
plainly show it was created and made readable by ceph.

Fixes: http://tracker.ceph.com/issues/51539
Signed-off-by: Marcus Watts <mwatts@redhat.com>
2021-10-28 14:14:10 -04:00
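
The failure mode described above, a readable file inside a directory that other users cannot enter, can be checked with a short helper. This is a simplified sketch: `others_can_traverse` is hypothetical and inspects only the "other" execute bit, ignoring group membership and ACLs.

```python
import os
import stat


def others_can_traverse(directory) -> bool:
    """True if users outside the owner and group may enter `directory`.

    A mode-700 home directory returns False, so a file inside it is
    unreachable for such users even when the file itself is readable.
    """
    mode = os.stat(directory).st_mode
    return bool(mode & stat.S_IXOTH)
```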
Sage Weil
d7acc16860 qa/tasks/cephfs/test_nfs: wait for fs to come up before exporting
Signed-off-by: Sage Weil <sage@newdream.net>
2021-10-28 13:28:35 -04:00
Patrick Donnelly
c8810e46e8
qa: lengthen health warning wait
It's just a little too short!

Fixes: https://tracker.ceph.com/issues/52995
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-10-26 14:26:19 -04:00
Rishabh Dave
e650bc6e87 qa/vstart_runner: add "managers" to LocalContext instances
Without this, plenty of tests become incompatible with vstart_runner.py.
Ideally, vstart_runner.py should've been updated in commit 7812cfb674.

Fixes: https://tracker.ceph.com/issues/53043
Signed-off-by: Rishabh Dave <ridave@redhat.com>
2021-10-26 20:51:09 +05:30
Venky Shankar
01154fc41c qa: pass subdir arg when executing workunit
`_run_tests()` accepts subdir argument (to run a workunit with
the passed in sub-directory as cwd). One invocation was missing
the subdir argument causing `subdir` tag in yaml to be ineffective.

Signed-off-by: Venky Shankar <vshankar@redhat.com>
2021-10-25 12:40:53 +05:30
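
Running a command with a sub-directory as its working directory, as the `subdir` argument above is meant to do, can be sketched with `subprocess`. The helper `run_workunit` and its signature are hypothetical; this is not the teuthology `_run_tests()` implementation.

```python
import os
import subprocess


def run_workunit(argv, base_dir, subdir=None):
    """Run `argv` with cwd set to base_dir/subdir (or base_dir if unset).

    Returns the captured stdout. Forgetting to pass `subdir` silently
    runs the command in `base_dir`, which is the bug class the commit
    describes.
    """
    cwd = os.path.join(base_dir, subdir) if subdir else base_dir
    done = subprocess.run(argv, cwd=cwd, check=True,
                          capture_output=True, text=True)
    return done.stdout
```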
Patrick Donnelly
04aabf8bee
Merge PR #38752 into master
* refs/pull/38752/head:
	qa: enable dynamic debug support to kclient

Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-10-20 22:24:32 -04:00
Jeff Layton
242585656c qa: test that new mounts of same fs function after old mount is evicted
Signed-off-by: Jeff Layton <jlayton@redhat.com>
2021-10-20 10:34:01 -04:00
Neha Ojha
9fda06aa6a
Merge pull request #43572 from trociny/wip-qa-backfill-toofull-compress
qa/tasks/backfill_toofull: make test work when compression on

Reviewed-by: Neha Ojha <nojha@redhat.com>
2021-10-20 06:57:06 -07:00
Jeff Layton
5ab91d53a1 qa: remove REQUIRE_KCLIENT_REMOTE
Nothing references this variable anymore since commit 2df7caae4b (qa:
remove obsolete test).

Signed-off-by: Jeff Layton <jlayton@redhat.com>
2021-10-19 11:07:09 -04:00
Ernesto Puerta
f5fddd6121
Merge pull request #42526 from liewegas/dashboard-nfs
mgr/dashboard: consume mgr/nfs

Reviewed-by: Ramana Raja <rraja@redhat.com>
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Varsha Rao <rvarsha016@gmail.com>
2021-10-19 11:17:17 +02:00
Ilya Dryomov
cf8b6dc972
Merge pull request #42760 from ideepika/wip-iscsi-testing
qa/suites/rbd: switch iscsi tests to cephadm

Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
2021-10-18 14:25:36 +02:00
Alfonso Martínez
58a6ab2147 mgr/dashboard: NFS exports: API + UI: integration with mgr/nfs; cleanups
mgr/dashboard: move NFS_GANESHA_SUPPORTED_FSALS to mgr_module.py

Importing from nfs module throws AttributeError because as a side effect the dashboard module is impersonating the nfs module.
https://gist.github.com/varshar16/61ac26426bbe5f5f562ebb14bcd0f548

mgr/dashboard: 'Create NFS export' form: list clusters from nfs module

mgr/dashboard: frontend+backend cleanups for NFS export

Removed all code and references related to daemons. UI cleanup and adopted unit-testing for
nfs-export create form for CEPHFS backend. Cleanup for export list/get/create/set/delete endpoints.

mgr/dashboard: rm set-ganesha ref + update docs

Remove existing set-ganesha-clusters-rados-pool-namespace references as
they are no longer required. Moreover, nfs doc in dashboard doc is
updated accordingly to the current nfs status.

mgr/dashboard: add nfs-export e2e test coverage

mgr/dashboard: 'Create NFS export' form: remove RGW user id field.

- Improve bucket typeahead behavior.
- Increase version for bucket list endpoint.
- Some refactoring.

mgr/dashboard: 'Create NFS export' form: allow RGW backend only when default realm is selected.

When RGW multisite is configured, the NFS module can only handle buckets in the default realm.

mgr/dashboard: 'Create service' form: fix NFS service creation.

After https://github.com/ceph/ceph/pull/42073, NFS pool and namespace are not customizable.

mgr/dashboard: 'Create NFS export' form: add bucket validation.

- Allow only existing buckets.
- Refactoring:
  - Moved bucket validator from bucket form to cd-validators.ts
  - Split bucket validator into 2: bucket name validator and bucket existence (that checks either existence or non-existence).

mgr/dashboard: 'Create NFS export' form: path validation refactor: allow only existing paths.

Fixes: https://tracker.ceph.com/issues/46493
Fixes: https://tracker.ceph.com/issues/51479
Signed-off-by: Alfonso Martínez <almartin@redhat.com>
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
Signed-off-by: Pere Diaz Bou <pdiazbou@redhat.com>
2021-10-18 12:58:54 +02:00
Deepika Upadhyay
cbd2c71398 qa/tasks: adapt ceph_iscsi.py task to ceph_iscsi_client
* we use setup_iscsi_client.py to deploy iscsi client services,
  configuring the initiator and multipath; this is done by the qa task
  ceph_iscsi_client
* qa/cephadm: adds remotes' ip addresses to the iscsi gateway
* rename poolname: iscsi >> datapool, which we usually use for tests and
  which expresses the type of pool more clearly.

Signed-off-by: Deepika Upadhyay <dupadhya@redhat.com>
2021-10-18 13:21:50 +05:30
Mykola Golub
429ac06cbb qa/tasks/backfill_toofull: make test work when compression on
The osd backfill reservation does not take compression into account so
we need to operate with "uncompressed" bytes when calculating nearfull
ratio.

Signed-off-by: Mykola Golub <mgolub@suse.com>
2021-10-16 15:28:53 +03:00
Patrick Donnelly
5af9882f94
Merge PR #43426 into master
* refs/pull/43426/head:
	qa/cephfs: update xfstests_dev for centos stream

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-10-15 11:15:41 -04:00
Patrick Donnelly
a8e77365b1
Merge PR #43425 into master
* refs/pull/43425/head:
	qa: import CommandFailedError from exceptions not run

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Jos Collin <jcollin@redhat.com>
2021-10-15 11:13:55 -04:00
Patrick Donnelly
7aea7f48ba
Merge PR #43420 into master
* refs/pull/43420/head:
	qa: skip internal metadata directory when scanning ceph debugfs directory

Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-10-15 11:13:05 -04:00
Laura Flores
bb0a39b7d9 qa/tasks/mgr: update tests that use mgr module ls to specify the json format
Signed-off-by: Laura Flores <lflores@redhat.com>
2021-10-14 23:59:47 +00:00
Kefu Chai
70b049ffdb
Merge pull request #43239 from trociny/wip-48959
osd: handle inconsistent hash info during backfill and deep scrub gracefully

Reviewed-by: Samuel Just <sjust@redhat.com>
2021-10-14 22:43:16 +08:00
Kefu Chai
b5d2548ceb
Merge pull request #43463 from Zhiwei-Dai/wip-enhance-qa-python3-compatibility
qa/tasks: replace iterkeys() with keys() for Python 3

Reviewed-by: Kefu Chai <tchaikov@gmail.com>
2021-10-14 22:38:39 +08:00
Sebastian Wagner
d4783f5a65
Merge pull request #43214 from batrick/i52654
pybind/mgr/cephadm: set allow_standby_replay during CephFS upgrade

Reviewed-by: Sage Weil <sage@newdream.net>
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
2021-10-14 16:28:30 +02:00
Avan Thakkar
6644a00a2c mgr/dashboard: introduce gather facts in host list
Fixes: https://tracker.ceph.com/issues/52017
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
2021-10-13 16:02:51 +05:30
Avan Thakkar
b9f38cadc4 mgr/dashboard: Create Cluster Workflow welcome screen and e2e tests
A module option called CLUSTER_STATUS has two values: INSTALLED
and POST_INSTALLED. When CLUSTER_STATUS is INSTALLED, the
create-cluster wizard is shown after the first login. After the cluster
is created successfully, this option is set to POST_INSTALLED.
Also includes the e2e tests for the Review section.

Fixes: https://tracker.ceph.com/issues/50336
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
Signed-off-by: Nizamudeen A <nia@redhat.com>
2021-10-13 15:52:14 +05:30
Sebastian Wagner
1be6dc174f
Merge pull request #43455 from liewegas/qa-nvme-loop
qa: use nvme_loop devices for (some) cephadm tests

Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
2021-10-11 16:43:10 +02:00
Dai Zhiwei
00e5e5d5cd qa/tasks: replace iterkeys() with keys() in Python 3
Python 2.7 has reached the end of its lifetime. This PR fixes a teuthology
task error under Python 3.x.

Fixes: https://tracker.ceph.com/issues/52878
Signed-off-by: Dai Zhiwei <daizhiwei3@huawei.com>
2021-10-09 23:10:03 +08:00
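The iterkeys() to keys() change described in 00e5e5d5cd can be illustrated with a minimal sketch. The dictionary and its contents are made up for the example and are not taken from the teuthology tasks.

```python
# Hypothetical mapping, used only for illustration.
daemons = {"osd.0": "up", "osd.1": "down", "mon.a": "up"}

# dict.iterkeys() exists only on Python 2; on Python 3 it raises
# AttributeError. keys() works on both and returns a view on Python 3.
names = sorted(daemons.keys())

# Copy the keys into a list first when the loop mutates the dictionary.
for name in list(daemons.keys()):
    if daemons[name] == "down":
        del daemons[name]
```

Iterating over the dictionary directly (`for name in daemons`) is equivalent when no mutation happens inside the loop.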
Sage Weil
d4a1ec2d06 qa/tasks/nvme_loop: loop until 'nvme list' shows new devs
Sometimes this doesn't happen immediately.

Signed-off-by: Sage Weil <sage@newdream.net>
2021-10-08 16:06:28 -05:00
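The "loop until 'nvme list' shows new devs" approach in d4a1ec2d06 is a plain polling loop. The sketch below shows the general pattern only; `wait_for`, its parameters, and the fake `devices_visible` check are assumptions for illustration, and the real task's timeout and parsing logic are not shown in the commit message.

```python
import time


def wait_for(check, timeout=60.0, interval=1.0):
    """Call check() until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Stand-in for "parse `nvme list` and look for the new devices":
# this fake check succeeds on the third attempt.
attempts = []


def devices_visible():
    attempts.append(1)
    return len(attempts) >= 3


found = wait_for(devices_visible, timeout=5.0, interval=0.0)
```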
Sage Weil
dda15a7924 qa/tasks/cephadm: wait for osds to start explicitly
Signed-off-by: Sage Weil <sage@newdream.net>
2021-10-08 16:06:28 -05:00
Sage Weil
d3c9486ed9 qa/tasks/cephadm: if no osd roles, --all-available-devices
Signed-off-by: Sage Weil <sage@newdream.net>
2021-10-08 16:06:28 -05:00
Sage Weil
65cf69c6ff qa/tasks/nvme_loop: set up nvme_loop on scratch_devs
Using an nvme loop device makes the LVs look like "real" disks,
which means we can exercise all of the normal code paths for
provisioning, deprovisioning, and zapping.

Signed-off-by: Sage Weil <sage@newdream.net>
2021-10-08 16:06:28 -05:00
Patrick Donnelly
d33debc643
qa: fsync dir for asynchronous creat on stray tests
Use the enhanced create_n_files to dedup code. Also split the large test
into three.

Fixes: https://tracker.ceph.com/issues/52606
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-10-07 15:08:34 -04:00
Patrick Donnelly
395d20a2b7
qa: refactor and generalize create_n_files
Few things:

- Allow calling fsync on directory (to support async create kernel).
- Allow immediately unlinking the created file (for stray testing).
- Close any file descriptors created.
- Write unique content (the i variable) to each file.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-10-07 15:08:34 -04:00
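The four behaviours listed in 395d20a2b7 can be sketched roughly as follows. This is not the helper from the Ceph tree: the signature, the file naming, and the choice to fsync the directory once at the end are assumptions made for the example.

```python
import os
import tempfile


def create_n_files(dirpath, count, sync_dir=False, unlink=False):
    """Create `count` files in `dirpath`, each holding its own index."""
    for i in range(count):
        path = os.path.join(dirpath, f"file_{i}")
        fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o644)
        try:
            # Unique content: the loop variable i.
            os.write(fd, str(i).encode())
        finally:
            # Always close the descriptor, even if the write fails.
            os.close(fd)
        if unlink:
            # Remove the file right away (useful for stray testing).
            os.unlink(path)
    if sync_dir:
        # fsync the directory so the new entries are persisted.
        dfd = os.open(dirpath, os.O_RDONLY)
        try:
            os.fsync(dfd)
        finally:
            os.close(dfd)


workdir = tempfile.mkdtemp()
create_n_files(workdir, 3, sync_dir=True)
created = sorted(os.listdir(workdir))
```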
Matt Benjamin
103b6cc355
Merge pull request #43442 from linuxbox2/wip-rgwadmin-logfix
qa/rgw: fix ops log tests to handle non-bucket ops (which are now valid)
2021-10-07 11:04:16 -04:00
Patrick Donnelly
2363078751
Merge PR #43231 into master
* refs/pull/43231/head:
	qa: fix promotion test

Reviewed-by: Ramana Raja <rraja@redhat.com>
2021-10-07 09:16:34 -04:00
Venky Shankar
ff88d7de52 qa: skip internal metadata directory when scanning ceph debugfs directory
kclient patchset

        https://patchwork.kernel.org/project/ceph-devel/list/?series=556049

introduces `meta` directory to add debugging entries. This needs to be filtered
when scanning ceph debugfs directory.

Fixes: https://tracker.ceph.com/issues/52824
Signed-off-by: Venky Shankar <vshankar@redhat.com>
2021-10-07 00:40:13 -04:00
Matt Benjamin
788da98cde qa/rgw: fix ops log tests to handle non-bucket ops (which are now valid)
After 3863eb89512f1698b8e56f1f1ffc78a6ca8d5826 (rgw: permit logging of
list-bucket and any other no-bucket op), the radosgw ops-log
contains entries for ops with no associated bucket, e.g., list_buckets.
When examining such a log object in the radosgw_admin task, don't assert
that it has a bucket name.

Fixes: https://tracker.ceph.com/issues/52647

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
2021-10-06 16:08:06 -04:00
Ernesto Puerta
df89e6a174
Merge pull request #43256 from rhcs-dashboard/fix-48845-master
qa/mgr/dashboard/test_pool: don't check HEALTH_OK

Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
2021-10-06 21:49:12 +02:00
Neha Ojha
363b223844
Merge pull request #42964 from trociny/wip-52448
osd: re-cache peer_bytes on every peering state activate

Reviewed-by: Neha Ojha <nojha@redhat.com>
2021-10-06 09:26:16 -07:00
Patrick Donnelly
b56623342e
qa: fix promotion test
The test does not need to check that the new MDS becomes active, only
that a replacement occurs.

Fixes: https://tracker.ceph.com/issues/52677
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-10-05 19:57:18 -04:00
Rishabh Dave
eb25549b8c qa/cephfs: update xfstests_dev for centos stream
Fixes: https://tracker.ceph.com/issues/52821
Signed-off-by: Rishabh Dave <ridave@redhat.com>
2021-10-06 00:26:32 +05:30
Rishabh Dave
485841b255 qa: import CommandFailedError from exceptions not run
Stop importing CommandFailedError from teuthology.orchestra.run, it is
actually defined in teuthology.exception.

Fixes: https://tracker.ceph.com/issues/51226
Signed-off-by: Rishabh Dave <ridave@redhat.com>
2021-10-05 23:41:09 +05:30
Patrick Donnelly
5a7382214f
qa: add tasks to check mds upgrade state
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-10-05 13:32:15 -04:00
Patrick Donnelly
dbe5573ed4
qa: add note about where caps are generated
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-10-05 13:32:15 -04:00
Patrick Donnelly
24bb450d39
qa: use ctx's ceph_manager to run ceph commands by mount
This allows hooks for `cephadm shell` to function so that this code
works with cephadm deployments.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-10-05 13:32:15 -04:00
Patrick Donnelly
7812cfb674
qa: move CephManager cluster instantiation to subtask
This needs to be available for the cephfs_setup task so administration
mounts can run ceph commands, potentially through `cephadm shell`.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-10-05 13:32:15 -04:00
Ernesto Puerta
2283cb068b
qa/mgr/dashboard/test_pool: don't check HEALTH_OK
Fixes: https://tracker.ceph.com/issues/48845
Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
2021-09-30 14:16:46 +02:00
Sage Weil
e3bede0008 Merge PR #43287 into master
* refs/pull/43287/head:
	mgr/rook, qa/tasks/rook: change rgw daemon service name
	mgr/rook: fix placement_spec_to_node_selector
	mgr/rook: orch rm no longer uses rook api delete
	qa/tasks/rook: fix cluster deletion hanging due to CephObjectStore CR
	mgr/rook: use default replication size in orch apply rgw
	mgr/rook: add placement specs to apply rgw

Reviewed-by: Sage Weil <sage@redhat.com>
2021-09-29 14:38:47 -04:00
Ernesto Puerta
156defa48e
Merge pull request #43255 from rhcs-dashboard/fix-49344-master
qa/mgr/dashboard: add extra wait to test

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
2021-09-29 20:23:23 +02:00
Mykola Golub
d35920da5e qa/suites/rados: add inconsistent hinfo test
Signed-off-by: Mykola Golub <mgolub@suse.com>
2021-09-28 16:43:02 +01:00
Joseph Sawaya
8990280b22 mgr/rook, qa/tasks/rook: change rgw daemon service name
This commit changes the rgw daemon service name format from
rgw.<realm name>.<zone name> to rgw.<resource_name> and changes the daemon
removal in the QA accordingly. This also gets rid of the Rook API when
describing services.

Signed-off-by: Joseph Sawaya <jsawaya@redhat.com>
2021-09-27 14:52:59 -04:00
Joseph Sawaya
387c4f1310 qa/tasks/rook: fix cluster deletion hanging due to CephObjectStore CR
This commit fixes the issue where the cluster deletion hangs in the QA
while a CephObjectStore CR is still up by removing all rgw/nfs/mds/rbd-mirror
daemons before tearing down the rest of the cluster.

Signed-off-by: Joseph Sawaya <jsawaya@redhat.com>
2021-09-27 14:51:13 -04:00
Avan Thakkar
88a8732215 mgr/dashboard: make modified API endpoints backward compatible
Fixes: https://tracker.ceph.com/issues/52480
Signed-off-by: Avan Thakkar <athakkar@redhat.com>

Introduce the APIVersion class to handle versioning for API endpoints and make
them backward compatible.
2021-09-24 18:48:35 +05:30
Ernesto Puerta
9ff778cdaa
qa/mgr/dashboard: add extra wait to test
Fixes: https://tracker.ceph.com/issues/49344
Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
2021-09-22 14:11:23 +02:00
Patrick Donnelly
541cc173c6 Merge PR #43179 into master
* refs/pull/43179/head:
	qa: lengthen grace for fs map showing dead MDS

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
2021-09-20 20:48:00 -04:00
Patrick Donnelly
c8a900c6c6 Merge PR #42763 into master
* refs/pull/42763/head:
	mon/FSCommands: add 'recover' flag in `fs new` command

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-09-20 20:46:25 -04:00
Patrick Donnelly
0d9753fa3c Merge PR #43122 into master
* refs/pull/43122/head:
	qa: add test for standby-replay marking rank damaged
	MDSMonitor: handle damaged from standby-replay
	mds: add config to mark rank damaged in standby-replay
	include: unset std::hex after printing CompatSet
	mds: refactor iterator lookup
	mds: harden rank lookup

Reviewed-by: Venky Shankar <vshankar@redhat.com>
2021-09-16 21:47:40 -04:00
Sage Weil
1a19d69679 Merge PR #43172 into master
* refs/pull/43172/head:
	qa/tasks/kubeadm: modify (do not clobber) daemon.json

Reviewed-by: Joseph Sawaya <jsawaya@redhat.com>
2021-09-15 22:48:36 -04:00
Patrick Donnelly
91c6f3364d Merge PR #42719 into master
* refs/pull/42719/head:
	mgr/volumes: Fix permission during subvol creation with mode

Reviewed-by: Ramana Raja <rraja@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-09-15 22:34:23 -04:00
Patrick Donnelly
33331cf4aa Merge PR #42584 into master
* refs/pull/42584/head:
	doc: fix `daemon status` interface (exclude file system name)
	test: adjust mirroring tests for `daemon status` change
	mgr/mirroring: `daemon status` command does not require file system name

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-09-15 22:33:18 -04:00
Patrick Donnelly
ef5d7febeb
qa: lengthen grace for fs map showing dead MDS
Fixes: https://tracker.ceph.com/issues/52625
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-09-15 22:21:03 -04:00
Sage Weil
2a6ad93a76 qa/tasks/kubeadm: modify (do not clobber) daemon.json
Otherwise we blow away the mirror config.

Signed-off-by: Sage Weil <sage@newdream.net>
2021-09-15 15:16:50 -05:00
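The "modify, do not clobber" idea in 2a6ad93a76 amounts to a read, merge, write cycle on the JSON file. The sketch below works on strings so it can run anywhere; `merge_daemon_json` is a hypothetical helper, and the mirror URL and the specific keys are illustrative values rather than what the task writes.

```python
import json


def merge_daemon_json(existing_text, updates):
    """Return daemon.json text with `updates` merged into existing settings."""
    config = json.loads(existing_text) if existing_text.strip() else {}
    # Only the keys in `updates` change; everything else is preserved.
    config.update(updates)
    return json.dumps(config, indent=2, sort_keys=True)


# Illustrative existing file with a mirror setting that must survive.
existing = '{"registry-mirrors": ["https://mirror.example.invalid"]}'
merged = json.loads(
    merge_daemon_json(existing, {"exec-opts": ["native.cgroupdriver=systemd"]})
)
```

Writing a fresh file containing only the new keys would drop `registry-mirrors`, which is the failure the commit message describes.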
Mykola Golub
76743e0058 qa/suites/rados: add backfill_toofull test
Signed-off-by: Mykola Golub <mgolub@suse.com>
2021-09-15 17:21:11 +03:00
Xiubo Li
0cb06740a9 qa: enable dynamic debug support to kclient
Add a 'kmount_count' counter in ctx to make sure the dynamic debug
log won't be disabled until the last kernel mounter is unmounted.

Fixes: https://tracker.ceph.com/issues/48736
Signed-off-by: Xiubo Li <xiubli@redhat.com>
2021-09-15 09:31:04 +08:00
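The 'kmount_count' counter in 0cb06740a9 is a reference count: debug logging goes on with the first kernel mount and off only after the last unmount. A minimal sketch, in which the `Ctx` class, the `dynamic_debug` flag, and both function names are hypothetical stand-ins:

```python
class Ctx:
    """Hypothetical stand-in for the shared test context."""

    def __init__(self):
        self.kmount_count = 0
        self.dynamic_debug = False


def kernel_mount(ctx):
    # The first mounter enables dynamic debug.
    if ctx.kmount_count == 0:
        ctx.dynamic_debug = True
    ctx.kmount_count += 1


def kernel_umount(ctx):
    ctx.kmount_count -= 1
    # Only the last unmount disables dynamic debug.
    if ctx.kmount_count == 0:
        ctx.dynamic_debug = False


ctx = Ctx()
kernel_mount(ctx)
kernel_mount(ctx)
kernel_umount(ctx)
still_enabled = ctx.dynamic_debug
kernel_umount(ctx)
finally_enabled = ctx.dynamic_debug
```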
Sage Weil
13238ade13 Merge PR #43136 into master
* refs/pull/43136/head:
	qa/tasks/kubeadm: change calico encap to IPIPCrossSubnet
	qa/suites/orch/rook/smoke: add host networking to matrix
	qa/tasks/rook: fix shadowing of config arg in rook_cluster()

Reviewed-by: Joseph Sawaya <jsawaya@redhat.com>
2021-09-13 18:28:43 -04:00
Sage Weil
528880d3bb qa/tasks/kubeadm: change calico encap to IPIPCrossSubnet
Signed-off-by: Sage Weil <sage@newdream.net>
2021-09-13 15:26:54 -05:00
Ramana Raja
67bb13859a mon/FSCommands: add 'recover' flag in fs new command
Currently, to recover a file system after recovering monitor store, you
need to stop all the MDSs; create FSMap with defaults using `fs new`
command; execute `fs reset` command to get the file system's rank 0 into
existing but failed state; and then restart MDSs.

Add 'recover' flag to the `fs new` command that sets the file system's
rank 0 to existing but failed state, and sets the file system's
'joinable' setting to False. Using the `fs new` command with 'recover'
flag gets rid of the steps to stop all the MDSs and execute `fs reset`
command when recovering the file system after recoving monitor store.

Fixes: https://tracker.ceph.com/issues/51716
Signed-off-by: Ramana Raja <rraja@redhat.com>
2021-09-13 00:15:39 -04:00
Mykola Golub
e0a926a2c1 qa/tasks/ceph_manager: fix assertion
The osd may be 0.

Signed-off-by: Mykola Golub <mgolub@suse.com>
2021-09-10 15:47:41 +03:00
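The note "The osd may be 0" in e0a926a2c1 points at a common Python pitfall: 0 is falsy, so a truthiness check rejects a valid OSD id. The commit message does not show the assertion itself, so the two helpers below are hypothetical and only illustrate the difference.

```python
def is_valid_truthy(osd):
    # Buggy pattern: treats OSD id 0 as "no OSD".
    return bool(osd)


def is_valid_not_none(osd):
    # Safer pattern: only None means "no OSD".
    return osd is not None


buggy = [is_valid_truthy(osd) for osd in (0, 3, None)]
fixed = [is_valid_not_none(osd) for osd in (0, 3, None)]
```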
Patrick Donnelly
f4a11a3290
qa: add test for standby-replay marking rank damaged
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-09-09 20:16:03 -04:00
Yuri Weinstein
3b779e712f
Merge pull request #42853 from sseshasa/wip-fix-vstart-mon-permissions
mon/MonCap: Update osd profile to allow cmd to set iops capacity on mon db

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
2021-09-09 10:10:06 -07:00
Sebastian Wagner
fe734adddd
Merge pull request #43045 from sebastian-philipp/qa-tox-import-yaml
qa: tox.ini: verify yaml syntax

Reviewed-by: Sage Weil <sage@newdream.net>
2021-09-08 17:10:13 +02:00
Kotresh HR
7440ef842a mgr/volumes: Fix permission during subvol creation with mode
The subvolume creation with specific mode leads to
creation of parent directories ('/volumes/_no_group') with
the same mode if it's not already created. Fixed the same.

Similarly, the subvolumegroup creation with specific mode
leads to creation of parent directory ('/volumes') with
same mode if it's not already created. Fixed the same.

Fixes: https://tracker.ceph.com/issues/51870
Signed-off-by: Kotresh HR <khiremat@redhat.com>
2021-09-07 15:51:21 +05:30
Sebastian Wagner
7777603e8b
qa: tox.ini: verify yaml syntax
Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
2021-09-07 10:20:34 +02:00
Patrick Donnelly
ca906d0d7a Merge PR #42529 into master
* refs/pull/42529/head:
	qa: verify rank 0 does not fail during journal repair tests
	qa: avoid stopping/restarting mds in journal repair tests

Reviewed-by: Venky Shankar <vshankar@redhat.com>
2021-09-06 14:00:41 -04:00
Sage Weil
42b4108073 qa/tasks/rook: fix shadowing of config arg in rook_cluster()
Signed-off-by: Sage Weil <sage@newdream.net>
2021-09-03 10:49:54 -05:00
Sage Weil
9cb2f444fd Merge PR #42873 into master
* refs/pull/42873/head:
	qa/tasks/rook: add OSD creation to Rook QA

Reviewed-by: Sage Weil <sage@redhat.com>
2021-09-02 17:11:51 -04:00
Joseph Sawaya
4b6de11169 qa/tasks/rook: add OSD creation to Rook QA
This commit adds OSD creation to the Rook QA tasks. The Rook task will
explicitly wait for the mgr to start and the CLI to work (instead of
implicitly doing so while waiting for 'ceph osd dump' to work).
Then it will do `ceph orch apply osd --all-available-devices` to create
OSDs on the rest of the PVs.

Signed-off-by: Joseph Sawaya <jsawaya@redhat.com>
2021-09-01 11:27:40 -04:00
Kalpesh Pandya
9c1e5d5c52 qa/tasks: Addition of new code for session tags in STS
Signed-off-by: Kalpesh Pandya <kapandya@redhat.com>
2021-09-01 17:09:54 +05:30
Kalpesh Pandya
74b5ec876c qa/tasks: Addition of two new parameters for sts-tests
Addition of SUB and AZP parameter for some new sts-tests

Signed-off-by: Kalpesh Pandya <kapandya@redhat.com>
2021-09-01 17:09:54 +05:30
Sridhar Seshasayee
4b0dba28b6 qa/tasks: Set default caps for 'osd' type in generate_caps()
Assign the default caps for osds to be the same as what the AuthMonitor
sets for a new osd. See AuthMonitor::validate_osd_new() which sets the
following caps for a new osd:

 mon='allow profile osd'
 mgr='allow profile osd'
 osd='allow *'

When an actual real world cluster is deployed, the above caps are applied.
Unless the user modifies the defaults, a cluster will operate with the
above caps. Therefore, it makes sense to use the defaults when testing
Ceph so that issues if any due to the default settings may be caught and
fixed.

Therefore, the caps for the 'osd' type are reset to the defaults in
generate_caps(). The caps for 'mgr' already reflect the system defaults.
The caps for the 'mds' type are not changed in this commit and will be
investigated and changed if necessary later.

Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
2021-09-01 13:46:01 +05:30
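The default OSD caps quoted in 4b0dba28b6 can be written down as a small lookup table. The sketch below is not the generate_caps() from qa/tasks; the table name, the function name, and the flat `--cap` argument layout are assumptions for illustration, and only the three cap strings come from the commit message.

```python
# Cap strings for 'osd' as quoted in the commit message.
DEFAULT_CAPS = {
    "osd": {
        "mon": "allow profile osd",
        "mgr": "allow profile osd",
        "osd": "allow *",
    },
}


def caps_args(daemon_type):
    """Return a flat, hypothetical argument list for the given daemon type."""
    args = []
    for subsystem, capability in DEFAULT_CAPS.get(daemon_type, {}).items():
        args.extend(["--cap", subsystem, capability])
    return args


osd_args = caps_args("osd")
```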
Patrick Donnelly
ec69208deb Merge PR #38481 into master
* refs/pull/38481/head:
	qa/vstart_runner: inherit methods instead of duplicating them
	qa/ceph_manager: make it possible to reuse few methods
	qa/vstart_runner: don't use "shell=False" in run_ceph_w()
	qa/ceph_manager: minor refactor

Reviewed-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-08-27 21:26:41 -04:00
Patrick Donnelly
ea04087786 Merge PR #42371 into master
* refs/pull/42371/head:
	mgr/volumes: Fix a race during clone cancel
	mgr/volumes: Fail subvolume removal if it's in progress

Reviewed-by: Ramana Raja <rraja@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
2021-08-23 20:02:31 -04:00
Avan Thakkar
95543bb150 mgr/dashboard: stats=false not working when listing buckets
Fixes: https://tracker.ceph.com/issues/51154
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
2021-08-23 15:57:54 +05:30
Sage Weil
6f8bdfbb90 Merge PR #42252 into master
* refs/pull/42252/head:
	mgr/dashboard: set rgw credentials: fix api tests
	mgr/dashboard: run-frontend-e2e-tests.sh: remove unneeded rgw setting
	mgr/dashboard: rgw service creation form: add realm and zone to service spec.
	mgr/dashboard: connect-rgw: rename to set-rgw-credentials; refactoring
	mgr/dashboard: connect-rgw: adaptation and test coverage
	mgr/cephadm: re-check dashboard <-> rgw creds when rgw daemons created/destroyed
	mgr/dashboard: add 'dashboard connect-rgw' command
	doc/mgr/dashboard: simplify dashboard+rgw config docs

Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Juan Miguel Olmo <jolmomar@redhat.com>
2021-08-11 11:28:28 -04:00
Alfonso Martínez
a682b9d7a4 mgr/dashboard: set rgw credentials: fix api tests
Fixes: https://tracker.ceph.com/issues/44605
Signed-off-by: Alfonso Martínez <almartin@redhat.com>
2021-08-11 08:59:13 +02:00
Sage Weil
4b9a3b2171 Merge PR #42613 into master
* refs/pull/42613/head:
	qa/suites/roch/rook/smoke: test rook 1.7.0, not 1.6.2
	qa/tasks/rook: set storage_class to scratch

Reviewed-by: merge 42318
2021-08-10 16:47:22 -04:00
Sage Weil
3331a0a7ea Merge PR #42691 into master
* refs/pull/42691/head:
	mgr/nfs: add --port to 'nfs cluster create' and port to 'nfs cluster info'
	qa/suites/orch/cephadm/smoke-roleless: test taking ganeshas offline
	qa/tasks/vip: exec with bash -ex
	qa/suites/orch/cephadm: separate test_nfs from test_orch_cli

Reviewed-by: Varsha Rao <varao@redhat.com>
2021-08-10 16:37:38 -04:00
Alfonso Martínez
6e20ef1dd3 mgr/dashboard: connect-rgw: rename to set-rgw-credentials; refactoring
- Rename the dashboard command to better reflect its behavior.
- Rename '_radosgw_admin' method to 'send_rgwadmin_command' for consistency with
  'send_mon_command' and move it to the mgr_module.py .
- Cleanup: remove unneeded rgw settings.
- Better error handling and test coverage.

Fixes: https://tracker.ceph.com/issues/44605
Signed-off-by: Alfonso Martínez <almartin@redhat.com>
2021-08-10 14:06:03 +02:00
Sage Weil
84479e03a7 Merge PR #42709 into master
* refs/pull/42709/head:
	qa/tasks/kubeadm: force docker cgroup engine to systemd

Reviewed-by: Travis Nielsen <tnielsen@redhat.com>
2021-08-09 15:23:11 -04:00
Casey Bodley
95f2161ee3
Merge pull request #42688 from cbodley/wip-52069
qa/rgw: update apache-maven mirror for rgw/hadoop-s3a

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
2021-08-09 11:51:36 -04:00
Casey Bodley
e514b3a374
Merge pull request #42689 from cbodley/wip-52070
qa/rgw: barbican and pykmip tasks upgrade pip before installing pytz

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
2021-08-09 11:51:21 -04:00
Sage Weil
517b7759b3 qa/tasks/kubeadm: force docker cgroup engine to systemd
Signed-off-by: Sage Weil <sage@newdream.net>
2021-08-06 14:21:08 -05:00
Kefu Chai
62944aefa0
Merge pull request #42277 from tchaikov/wip-vstart-runner-cleanups
qa/tasks/vstart_runner: do not send SIGTERM if no matched pid

Reviewed-by: Rishabh Dave <ridave@redhat.com>
2021-08-06 10:33:19 +08:00
Sage Weil
3c1e086be0 qa/tasks/vip: exec with bash -ex
Signed-off-by: Sage Weil <sage@newdream.net>
2021-08-05 17:45:56 -04:00
Casey Bodley
e5a5b4e379 qa/rgw: barbican and pykmip tasks upgrade pip before installing pytz
Downloading 461087a514/cryptography-3.4.7.tar.gz (546kB)
  Complete output from command python setup.py egg_info:

          =============================DEBUG ASSISTANCE==========================
          If you are seeing an error here please try the following to
          successfully install cryptography:

          Upgrade to the latest pip and try again. This will fix errors for most
          users. See: https://pip.pypa.io/en/stable/installing/#upgrading-pip
          =============================DEBUG ASSISTANCE==========================

  Traceback (most recent call last):
    File "<string>", line 1, in <module>
    File "/tmp/pip-build-7fhnk5us/cryptography/setup.py", line 14, in <module>
      from setuptools_rust import RustExtension
  ModuleNotFoundError: No module named 'setuptools_rust'

Fixes: https://tracker.ceph.com/issues/52070

Signed-off-by: Casey Bodley <cbodley@redhat.com>
2021-08-05 16:45:02 -04:00
Casey Bodley
9253733d08 qa/rgw: update apache-maven mirror for rgw/hadoop-s3a
Fixes: https://tracker.ceph.com/issues/52069

Signed-off-by: Casey Bodley <cbodley@redhat.com>
2021-08-05 14:50:09 -04:00
Kefu Chai
a17ebc0406
Merge pull request #42575 from tchaikov/wip-venv
*: s/virtualenv/python -m venv/

Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
2021-08-04 18:37:45 +08:00
Sage Weil
460d7a215a qa/tasks/rook: set storage_class to scratch
Signed-off-by: Sage Weil <sage@newdream.net>
2021-08-03 16:13:13 -04:00
Venky Shankar
11b61b4fb9 test: adjust mirroring tests for daemon status change
Signed-off-by: Venky Shankar <vshankar@redhat.com>
2021-08-02 06:39:16 -04:00
Rishabh Dave
d86bfbfe2d qa/vstart_runner: inherit methods instead of duplicating them
Inherit methods run_ceph_w(), run_cluster_cmd(), raw_cluster_cmd() and
raw_cluster_cmd_result() from ceph_manager.CephManager in
vstart_runner.LocalCephManager instead of duplicating them.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2021-08-02 11:37:49 +05:30
Rishabh Dave
93677576c1 qa/ceph_manager: make it possible to reuse few methods
Make minor adjustments to ceph_manager.CephManager so that methods
run_ceph_w(), run_cluster_cmd(), raw_cluster_cmd() and
raw_cluster_cmd_result() can be reused, instead of duplicated, in
subclasses. The adjustments are:

* Having variables contain arguments that'll be prepended to every
  command received by the methods above.
* Grouping variables that need to be overridden together so that it is
  easy for users to spot and override them.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2021-08-02 11:37:49 +05:30
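The two adjustments in 93677576c1 follow a familiar pattern: keep the command prefix in overridable variables so that a subclass changes only those variables and inherits the methods. A minimal sketch, where the class names, the `CMD_PREFIX` attribute, `build_cmd`, and the prefix values are all hypothetical:

```python
class Manager:
    # Variables meant to be overridden are grouped at the top.
    CMD_PREFIX = ["sudo", "ceph"]

    def build_cmd(self, *args):
        """Prepend the class-level prefix to every command."""
        return [*self.CMD_PREFIX, *args]


class LocalManager(Manager):
    # The subclass overrides the prefix only; build_cmd() is inherited.
    CMD_PREFIX = ["./bin/ceph"]


remote_cmd = Manager().build_cmd("osd", "dump")
local_cmd = LocalManager().build_cmd("osd", "dump")
```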
Rishabh Dave
047c90f881 qa/vstart_runner: don't use "shell=False" in run_ceph_w()
Instead prepend "exec sudo" to the command arguments of
LocalCephManager.run_ceph_w(). This makes the default parameter
"shell=False" redundant in case of
ceph_manager.CephManager.run_ceph_w(), so get rid of it too and update
calls to run_ceph_w() accordingly.

The reason for using either of these workarounds is that running "ceph
-w" with "shell" set to True crashes the Ceph API CI job. See
this ticket for more details: https://tracker.ceph.com/issues/49644.

The reason behind switching the workaround is that in the following
commits to reduce duplication LocalCephManager.run_ceph_w() will be
deleted and CephManager.run_ceph_w() will be used by LocalCephManager
via inheritance. However, due to the issue described above, Ceph API
test will fail since "shell" is set to "True" for the command issued by
CephManager.run_ceph_w(). Prepending "exec sudo" to the command when it
is used in LocalCephManager makes this duplication unnecessary and also
prevents Ceph API test from failing.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2021-08-02 11:37:44 +05:30
Rishabh Dave
4101f76ed6 qa/ceph_manager: minor refactor
Save the return value of method "teuthology.get_testdir()" instead of
calling it repeatedly in the same class.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2021-08-02 10:07:23 +05:30
Kefu Chai
f0ed7a188f qa/tasks: s/virtualenv/python3 -m venv/
This removes the need for the virtualenv Python package when creating a
virtual environment; the "venv" module in Python 3 suffices.

see also https://docs.python.org/3/library/venv.html

Signed-off-by: Kefu Chai <kchai@redhat.com>
2021-07-31 22:34:05 +08:00
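The standard-library module that f0ed7a188f switches to can also be driven from Python. The sketch below creates an environment in a temporary directory; the directory name is arbitrary, and `with_pip=False` is chosen here only to keep the example fast, whereas the `python3 -m venv` command line installs pip by default.

```python
import os
import tempfile
import venv

# Create a virtual environment without the third-party virtualenv package.
env_dir = os.path.join(tempfile.mkdtemp(), "venv")
venv.create(env_dir, with_pip=False)

# Every environment created by the venv module has a pyvenv.cfg file.
has_cfg = os.path.isfile(os.path.join(env_dir, "pyvenv.cfg"))
```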
Patrick Donnelly
2cd3494771 qa: update mds_pre_upgrade to no longer stop standbys
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-07-30 16:28:54 -07:00
Patrick Donnelly
8e0b9bcad6 qa: update mds_pre_upgrade to disable standby-replay
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-07-30 16:28:54 -07:00
Patrick Donnelly
295971b9c6 qa: add tests for compat manipulation and upgrade
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-07-30 16:28:54 -07:00
Patrick Donnelly
5ae7b9202b Merge PR #42513 into master
* refs/pull/42513/head:
	qa: multifs already enabled as default

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-07-30 14:03:36 -07:00
Kotresh HR
103c7bdc70 mgr/volumes: Fail subvolume removal if it's in progress
Removing an in-progress subvolume clone with force doesn't
remove the clone index (tracker). As a result, the cloner
thread gets stuck in a loop trying to clone the deleted one.

This patch addresses the issue by not allowing the subvolume clone
to be removed if it's not complete/cancelled/failed even with force option.
It throws the error EAGAIN, asking the user to cancel the pending clone
and retry.

Fixes: https://tracker.ceph.com/issues/51707
Signed-off-by: Kotresh HR <khiremat@redhat.com>
2021-07-30 13:14:28 +05:30
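The rule described in 103c7bdc70 can be sketched as a guard in front of the removal. This is not the mgr/volumes code: `CloneBusyError`, `remove_clone`, the dictionary of clone states, and the "in-progress" label are assumptions; only the three final states and the use of EAGAIN come from the commit message.

```python
import errno

# States in which a clone may be removed, per the commit message.
FINAL_STATES = {"complete", "cancelled", "failed"}


class CloneBusyError(Exception):
    """Hypothetical error type carrying an errno value."""

    def __init__(self, code, message):
        super().__init__(message)
        self.errno = code


def remove_clone(clones, name, force=False):
    # `force` deliberately does not bypass the state check.
    if clones[name] not in FINAL_STATES:
        raise CloneBusyError(
            errno.EAGAIN, f"clone {name} is in progress; cancel it and retry"
        )
    del clones[name]


clones = {"c1": "in-progress", "c2": "complete"}
remove_clone(clones, "c2")
try:
    remove_clone(clones, "c1", force=True)
    code = None
except CloneBusyError as exc:
    code = exc.errno
```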
Patrick Donnelly
0efa23572a qa: verify rank 0 does not fail during journal repair tests
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-07-29 13:53:20 -07:00
Patrick Donnelly
14324ab5c2 qa: avoid stopping/restarting mds in journal repair tests
It is enough to just fail ranks and manipulate the "joinable" flag of
the fs.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-07-29 13:53:20 -07:00
Brad Hubbard
434b325c40
Merge pull request #42442 from badone/wip-insights-reports-non-persistent-storage
Don't persist report data

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
2021-07-29 09:19:32 +10:00
Patrick Donnelly
665b36de4e Merge PR #42349 into master
* refs/pull/42349/head:
	mon/MDSMonitor: propose if FSMap struct_v is too old
	mon/MDSMonitor: give a proper error message if FSMap struct_v is too old
	mds/FSMap: use DECODE_OLDEST to gate FSMap version
	qa: add tests for fs dump of epoch and trimming
	qa: add file system support for dumping epoch
	mon/MDSMonitor: return mon_mds_force_trim_to even if equal to current epoch
	mon: add debugging for trimming methods
	mon: fix debug spacing
	qa: add nofs upgrade suite

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Ramana Raja <rraja@redhat.com>
2021-07-28 10:45:08 -07:00
Patrick Donnelly
4f0f51e4cb Merge PR #41025 into master
* refs/pull/41025/head:
	qa: wait pgs to be clean before using the pools
	qa: ignore PG_RECOVERY_FULL and PG_DEGRADED for mds-full
	qa: wait more time since there have many more pgs than before
	qa: do not multiple the full ratio twice
	qa: do not raise for kclient for _fsync test
	qa: use the pg autoscale mode to calcuate the pg_num
	qa: set the object_size to 1M
	qa: move the is_full() to parent class

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-07-28 10:34:12 -07:00
Patrick Donnelly
5ddaa36d17 qa: add tests for fs dump of epoch and trimming
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-07-28 07:07:05 -07:00
Patrick Donnelly
ee899d9a44 qa: add file system support for dumping epoch
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-07-28 07:07:05 -07:00
Xiubo Li
361ee535dd qa: multifs already enabled as default
Pacific already marks multifs as enabled by default.

Signed-off-by: Xiubo Li <xiubli@redhat.com>
2021-07-28 13:56:10 +08:00