Commit Graph

10072 Commits

Author SHA1 Message Date
Laura Flores
66a6e7fdeb qa/suites/rados: whitelist POOL_APP_NOT_ENABLED for rados cls tests
Fixes: https://tracker.ceph.com/issues/59192
Signed-off-by: Laura Flores <lflores@redhat.com>
2023-06-05 15:35:54 -05:00
Laura Flores
c26674ef4c qa/suites/rados: remove rook coverage from the rados suite
The rook team relies on a daily CI system to validate
rook changes. It doesn't seem that the teuthology tests
are maintained, so it makes sense to remove them from the
rados suite.

By removing this symlink, rook test coverage will remain
in the orch suite, and coverage will only be removed from the
rados suite.

Workaround for: https://tracker.ceph.com/issues/58585
Signed-off-by: Laura Flores <lflores@redhat.com>
2023-06-05 15:23:42 -05:00
Yuri Weinstein
b2ec2aff80
Merge pull request #50651 from rosinL/cleanup
Cleanup the LevelDB residue


Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2023-06-05 11:32:51 -04:00
Ronen Friedman
83607c0610 qa/standalone: osd-recovery-scrub: fix slow updates and recovery concurrency
1. Setting frequent scrub status updates, to compensate for the removal
of some 'send updates' in PR#50283.

2. Switching back to using the wpq scheduler, as otherwise the number of
concurrent recovery operations is below what the test expects.

Fixes: https://tracker.ceph.com/issues/61386

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
2023-06-04 07:34:02 -05:00
J. Eric Ivancich
aeffd1b598 qa/rgw: test that multipart re-upload does not leave any orphans
Runs a boto script that reuploads one part multiple times before
completing and then we check for any orphans.

Original boto script contributed by Matt Benjamin
<mbenjami@redhat.com> on top of which modifications were made.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
2023-06-02 17:39:26 -04:00
Nizamudeen A
553a0c9ad1
Merge pull request #51844 from rhcs-dashboard/fix-qa-failure-orch
mgr/dashboard: fix test_dashboard_e2e.sh failure

Reviewed-by: Pegonzal <NOT@FOUND>
2023-06-01 11:59:23 +05:30
Nizamudeen A
7c5d92ad48 mgr/dashboard: fix test_dashboard_e2e.sh failure
The qa e2e test is failing because the script has not been adapted to Cypress 10.

Fixes: https://tracker.ceph.com/issues/61519
Signed-off-by: Nizamudeen A <nia@redhat.com>
2023-05-31 11:09:25 +05:30
John Mulligan
a1f6314fd8 qa/cephadm: teuthology test for nfs ingress-mode=haproxy-protocol
Signed-off-by: John Mulligan <jmulligan@redhat.com>
2023-05-26 10:43:11 -04:00
Yuri Weinstein
e513690ad1
Merge pull request #51570 from NitzanMordhai/wip-nitzan-test-mon-thrasher-quorum-delay-inc
test: monitor thrasher wait until quorum

Reviewed-by: Kamoltat (Junior) Sirivadhna <ksirivad@redhat.com>
2023-05-25 12:09:18 -04:00
Yuri Weinstein
925edda1cb
Merge pull request #51527 from NitzanMordhai/wip-nitzan-thrash-eio-pool-size-correct
test: correct osd pool default size


Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Laura Flores <lflores@redhat.com>
Reviewed-by: Matan Breizman <Matan.Brz@gmail.com>
2023-05-25 12:08:48 -04:00
Yuri Weinstein
40b9a8b2cc
Merge pull request #50876 from rishabh-d-dave/qa-ceph-man-get-keyring
qa/ceph_manager: preserve newline char at EOF in keyring

Reviewed-by: Dhairya Parmar <dparmar@redhat.com>
2023-05-25 10:49:53 -04:00
Patrick Donnelly
62d1cc0568
Merge PR #50875 into main
* refs/pull/50875/head:
	mon/MDSMonitor: ignore extraneous up:boot messages
	qa: add test case for mds sending multiple boot messages
	qa: support checking for a log message that should not exist

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Dhairya Parmar <dparmar@redhat.com>
2023-05-25 08:25:34 -04:00
Kamoltat
2c25b29347 qa/standalone/mon-stretch/mon-stretch-uneven-crush-weights.sh: init
Initialize standalone test for stretched clusters,
testing uneven weight warnings and != 2 buckets
warnings.

Added a `wait_for_health_gone()` function to ceph-helpers.sh.
This function lets standalone tests wait for a health condition
to disappear.

Signed-off-by: Kamoltat <ksirivad@redhat.com>
2023-05-24 18:35:27 +00:00
Yuri Weinstein
7873afce66
Merge pull request #51528 from NitzanMordhai/wip-nitzan-tests-using-override-instead-overrides
tests: change override to overrides so conf will take affect

Reviewed-by: Ronen Friedman <rfriedma@redhat.com>
2023-05-22 14:33:10 -04:00
John Mulligan
bd59b80919 qa/workunits/cephadm: align test_cephadm.sh with new cephadm version
The `cephadm version` command no longer bases the output on the
container images, rather it uses a special python file added to the
zipapp during the build to report on the version of cephadm (the
binary).

The other option was to preserve this behavior and add a new version
command or make it behave differently depending on what options were
provided. I discussed the options with AMK in person and we decided that
changing the tests was preferable.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
2023-05-22 13:26:07 -04:00
Nitzan Mordechai
fbd10badbf test: monitor thrasher wait until quorum
With a 1-second delay we may sometimes fail to get the correct quorum
length, since the monitor was not updated in time.
With this fix, we wait for quorum, checking every few
seconds (3) until a timeout (30).

Fixes: https://tracker.ceph.com/issues/52316
Signed-off-by: Nitzan Mordechai <nmordech@redhat.com>
2023-05-22 13:54:14 +00:00
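
The wait described in the commit above (check every 3 seconds, give up after 30) can be sketched as a generic polling helper. This is only an illustration, not the thrasher's actual code: `wait_until` and its `sleep`/`clock` parameters are hypothetical names, and the predicate stands in for whatever quorum check the test performs.

```python
import time


def wait_until(predicate, interval=3.0, timeout=30.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll predicate() every `interval` seconds until it returns True
    or `timeout` seconds have elapsed. Returns True on success, False
    on timeout. `sleep` and `clock` are injectable (hypothetical
    parameters) so the helper can be exercised without real waiting."""
    deadline = clock() + timeout
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Checking the predicate before comparing against the deadline means a condition that becomes true exactly at the timeout is still accepted.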
Sridhar Seshasayee
0f6404222f
Merge pull request #51480 from sseshasa/wip-fix-pr48703-followup
osd/scheduler: Reset ephemeral changes to mClock built-in profile

Reviewed-by: Samuel Just <sjust@redhat.com>
2023-05-22 16:37:53 +05:30
Yuri Weinstein
ecebe2f4b2
Merge pull request #50616 from batrick/i59120
qa: use parallel gzip for compressing logs

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
2023-05-20 10:04:13 -04:00
Casey Bodley
aaa04882d9
Merge pull request #51494 from cbodley/wip-61168
qa/rgw: add POOL_APP_NOT_ENABLED to log-ignorelist

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
2023-05-19 09:17:52 -04:00
Nizamudeen A
515aa566e5
Merge pull request #51532 from rhcs-dashboard/reorder-daemon-page
mgr/dashboard: reorder rgw daemons list items

Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: cloudbehl <NOT@FOUND>
2023-05-19 14:53:25 +05:30
Sridhar Seshasayee
aed71b56be qa/tasks: Allow override of recovery configs for tests
With mClock scheduler enabled, a small subset of config options related
to recovery limits are not allowed to be modified unless
osd_mclock_override_recovery_settings option is enabled. This override
option is disabled by default. The following options cannot be modified
without enabling the override option:

 - osd_max_backfills
 - osd_recovery_max_active[_(hdd|ssd)]

The above options are removed from the mon kv store which effectively
restores them to the default values.

This caused tests such as
test_cluster_configuration.ClusterConfigurationTest to fail, since it
modifies the recovery options and expects to verify the modified value.

Therefore, for tests, osd_mclock_override_recovery_settings option is
enabled in vstart_runner.py so that current and future tests
are not affected.

Fixes: https://tracker.ceph.com/issues/61155
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
2023-05-18 21:42:27 +05:30
Nizamudeen A
130a52ed50 mgr/dashboard: reorder rgw daemons list items
Fixes: https://tracker.ceph.com/issues/61212
Signed-off-by: Nizamudeen A <nia@redhat.com>
2023-05-18 15:07:37 +05:30
Sridhar Seshasayee
414ac7dd2c osd/scheduler: Reset ephemeral changes to mClock built-in profile
This is a follow-up to PR: https://github.com/ceph/ceph/pull/48703.
This commit also considers changes made ephemerally using either the
'daemon' or the 'tell' interfaces to override the built-in mClock
QoS parameters. In such a scenario, the ephemeral changes are removed
using the rm_val() method exposed by the config subsystem, and this
is logged.

Other changes:

1. Add a standalone test to exercise the fix.
2. Add documentation note on the outcome of the attempt to modify
   built-in profile defaults.

Fixes: https://tracker.ceph.com/issues/61155
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
2023-05-18 14:03:45 +05:30
Ilya Dryomov
95551071f2
Merge pull request #51449 from amathuria/wip-rbd-suite-change-mclock-profile
qa/tasks: Changing default mClock profile to high_recovery_ops

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
2023-05-18 09:28:00 +02:00
Venky Shankar
7b2968570a Merge PR #49691 into main
* refs/pull/49691/head:
	qa: add test for opening a file via a hard link that is not in the same mds as the inode
	mds: rdlock_path_xlock_dentry supports returning auth target inode

Reviewed-by: Venky Shankar <vshankar@redhat.com>
2023-05-18 12:49:33 +05:30
Nitzan Mordechai
c9d98ec310 test: correct osd pool default size
Using the default pool size of 2 with random eio thrashing can cause
some objects to be marked as lost.
Fix the typo 'osd default pool size: 3' to 'osd pool default size: 3'
so that the pool size is correctly set to 3.

Fixes: https://tracker.ceph.com/issues/49888
Signed-off-by: Nitzan Mordechai <nmordech@redhat.com>
2023-05-18 04:34:50 +00:00
Aishwarya Mathuria
a7c0029ecc qa/tasks: Change default mClock profile to high_recovery_ops
With the new mClock default profile, tests were failing with "Exiting scrub checking -- not all pgs scrubbed" due to slower scrubs.
Changing the default profile to high_recovery_ops for testing purposes will fix this issue.

Fixes: https://tracker.ceph.com/issues/61228
Signed-off-by: Aishwarya Mathuria <amathuri@redhat.com>
2023-05-18 09:32:20 +05:30
Nitzan Mordechai
3a91670aa5 tests: change override to overrides so conf will take affect
A few test suites use 'override' in their yaml files
while the ceph.py task looks for 'overrides'; in that case
those config params do not take effect.

Signed-off-by: Nitzan Mordechai <nmordech@redhat.com>
2023-05-17 10:39:59 +00:00
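
A misspelled top-level key like the one fixed above is silently ignored, so a small check can catch it early. The sketch below is hypothetical: `misspelled_override_keys` is not part of teuthology, and the fragment is a made-up example of an already-parsed yaml mapping.

```python
def misspelled_override_keys(fragment):
    """Return top-level keys of a parsed yaml fragment that look like a
    misspelling of 'overrides' (hypothetical helper for illustration)."""
    suspects = {"override"}
    return sorted(key for key in fragment if key in suspects)


# Made-up fragment using the wrong key: the nested conf would be ignored.
fragment = {
    "override": {
        "ceph": {"conf": {"global": {"osd pool default size": 3}}},
    },
}
print(misspelled_override_keys(fragment))
```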
Dhairya Parmar
833aa3483b qa: ignore warnings
Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
2023-05-17 14:39:42 +05:30
Dhairya Parmar
51cca9b9dc qa: add test cases to check client eviction if an OSD is laggy
Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
2023-05-17 14:38:31 +05:30
Casey Bodley
f0d53e56f8 qa/rgw: add POOL_APP_NOT_ENABLED to log-ignorelist
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2023-05-15 14:26:48 -04:00
Venky Shankar
2dec176827 Merge PR #51386 into main
* refs/pull/51386/head:
	qa: ignore cluster warning when fs flag refuse_client_session is set

Reviewed-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Kotresh Hiremath Ravishankar <khiremat@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2023-05-15 12:26:46 +05:30
Matan
653b97e472
Merge pull request #51388 from Matan-B/wip-matanb-c-enable-rbd-tests
qa/suites/crimson: Enhance rbd api testing

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Radosław Zarzyński <rzarzyns@redhat.com>
2023-05-11 16:28:55 +02:00
Venky Shankar
8391374c08 Merge PR #51251 into main
* refs/pull/51251/head:
	PendingReleaseNotes: add a note about deleting files from lost+found directory
	qa: add checks that validate removal of entries from lost+found dir
	mds: allow unlink operation under lost+found directory

Reviewed-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2023-05-11 11:21:14 +05:30
Venky Shankar
cc2f423ce1 Merge PR #51201 into main
* refs/pull/51201/head:
	qa: run scrub post file system recovery

Reviewed-by: Dhairya Parmar <dparmar@redhat.com>
Reviewed-by: Kotresh Hiremath Ravishankar <khiremat@redhat.com>
2023-05-11 11:19:13 +05:30
Casey Bodley
6c07ed5e3d
Merge pull request #51345 from cbodley/wip-59639
rgw/dbstore: allow NULL RealmIDs in sqlite schema

Reviewed-by: Soumya Koduri <skoduri@redhat.com>
Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
2023-05-10 08:56:37 -04:00
Ilya Dryomov
6544c0418c
Merge pull request #49742 from ajarr/fix-56724
mgr/rbd_support: recover from rados client blocklisting

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
2023-05-10 11:55:42 +02:00
Venky Shankar
69882f5123 Merge PR #43184 into main
* refs/pull/43184/head:
	qa: fix journal flush failure issue due to the MDS daemon crashes
	qa: add test support for the alloc ino failing
	mds: do not take the ino which has been used

Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2023-05-10 14:04:58 +05:30
Venky Shankar
4680336650 qa: run scrub post file system recovery
Running file system scrub is recommended post running filesystem
data and metadata recovery. Running scrub isn't covered in tests.

Fixes: http://tracker.ceph.com/issues/59527
Signed-off-by: Venky Shankar <vshankar@redhat.com>
2023-05-09 22:54:30 -04:00
Liu-Chunmei
94e24afbf0
Merge pull request #51167 from liu-chunmei/teuthology-multicore
crimson/qa: make crimson run multicore in teuthology test

Reviewed-by: Samuel Just <sjust@redhat.com>
2023-05-09 16:04:47 -07:00
chunmei
179a3e01c5 crimson/qa: enable multicore for crimson in teuthology test
Signed-off-by: chunmei <chunmei.liu@intel.com>
2023-05-09 09:08:44 +00:00
Kamoltat Sirivadhna
78a43309b2
Merge pull request #50857 from kamoltat/wip-ksirivad-iswriteable
mon/Monitor.cc: exit function if !osdmon()->is_writeable()
Reviewed-by: Gregory Farnum <gfarnum@redhat.com>
2023-05-08 21:04:59 -04:00
Ramana Raja
a2f15d4b2f qa/workunits/rbd: Add tests for rbd_support module recovery
... after the module's RADOS client is blocklisted.

Signed-off-by: Ramana Raja <rraja@redhat.com>
2023-05-08 16:45:41 -04:00
Dhairya Parmar
e4e8c84431 qa: ignore cluster warning when fs flag refuse_client_session is set
Fixes: https://tracker.ceph.com/issues/59667
Introduced-by: https://github.com/ceph/ceph/pull/48720
Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
2023-05-08 18:25:03 +05:30
Matan Breizman
01958e648e qa/suites/crimson: Introduce rbd_python_api_tests.yaml
Test python api with new image format.

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
2023-05-08 10:57:06 +00:00
Matan Breizman
5823c04542 qa/suites/crimson: Skip unsupported tests (Crimson)
Align with `rbd_api_tests` and skip deep_copy and breaklock tests
in Crimson.

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
2023-05-08 10:57:06 +00:00
Sridhar Seshasayee
4c22fcfbe8 qa/: Override mClock profile to 'high_recovery_ops' for qa tests
The qa tests are not client I/O centric and mostly focus on triggering
recovery/backfills and monitor them for completion within a finite amount
of time. The same holds true for scrub operations.

Therefore, an mClock profile that optimizes background operations is a
better fit for qa related tests. The osd_mclock_profile is therefore
globally overridden to the 'high_recovery_ops' profile for the Rados suite as
it fits the requirement.

Also, many standalone tests expect recovery and scrub operations to
complete within a finite time. To ensure this, the osd_mclock_profile
option is set to 'high_recovery_ops' as part of the run_osd() function
in ceph-helpers.sh.

A subset of standalone tests explicitly used 'high_recovery_ops' profile.
Since the profile is now set as part of run_osd(), the earlier overrides
are redundant and therefore removed from the tests.

Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
2023-05-08 16:22:00 +05:30
Samuel Just
5a649f3c94 common/options/osd.yaml.in: change mclock profile default to balanced
Let's use the middle profile as the default.
Modify the standalone tests accordingly.

Signed-off-by: Samuel Just <sjust@redhat.com>
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
2023-05-08 16:22:00 +05:30
Sridhar Seshasayee
b6a442c7cc osd: Retain overridden mClock recovery settings across osd restarts
Fix an issue where an overridden mClock recovery setting (set prior to
an osd restart) could be lost after an osd restart.

For example, consider that prior to an osd restart, the option
'osd_max_backfills' was successfully set to a value different from the
mClock default. If the osd was restarted for some reason, the
boot-up sequence was incorrectly resetting the backfill value to the
mClock default within the async local/remote reservers. This fix
ensures that no change is made if the current overridden value is
different from the mClock default.

Modify an existing standalone test to verify that the local and remote
async reservers are updated to the desired number of backfills under
normal conditions and also across osd restarts.

Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
2023-05-08 16:22:00 +05:30
Sridhar Seshasayee
514cb598fb osd: Modify mClock scheduler's cost model to represent cost in bytes
The mClock scheduler's cost model for HDDs/SSDs is modified and now
represents the cost of an IO in terms of bytes.

The cost parameters, namely, osd_mclock_cost_per_io_usec_[hdd|ssd]
and osd_mclock_cost_per_byte_usec_[hdd|ssd] which represent the cost
of an IO in secs are inaccurate and therefore removed.

The new model considers the following aspects of an osd to calculate
the cost of an IO:

 - osd_mclock_max_capacity_iops_[hdd|ssd] (existing option)
   The measured random write IOPS at 4 KiB block size. This is
   measured during OSD boot-up using OSD bench tool.
 - osd_mclock_max_sequential_bandwidth_[hdd|ssd] (new config option)
   The maximum sequential bandwidth of the underlying device.
   For HDDs, 150 MiB/s is considered, and for SSDs 750 MiB/s is
   considered in the cost calculation.

The following important changes are made to arrive at the overall
cost of an IO,

1. Represent QoS reservation and limit config parameter as proportion:
The reservation and limit parameters are now set in terms of a
proportion of the OSD's max IOPS capacity. The earlier representation
was in terms of IOPS per OSD shard which required the user to perform
calculations before setting the parameter. Representing the
reservation and limit in terms of proportions is much more intuitive
and simpler for a user.

2. Cost per IO Calculation:
Using the above config options, osd_bandwidth_cost_per_io for the osd is
calculated and set. It is the ratio of the max sequential bandwidth and
the max random write iops of the osd. It is a constant and represents the
base cost of an IO in terms of bytes. This is added to the actual size of
the IO (in bytes) to represent the overall cost of the IO operation. See
mClockScheduler::calc_scaled_cost().

3. Cost calculation in Bytes:
The settings for reservation and limit, expressed as a fraction of the OSD's
maximum IOPS capacity, are converted to Bytes/sec before updating the
mClock server's ClientInfo structure. This is done for each OSD op shard
using osd_bandwidth_capacity_per_shard shown below:

    (res|lim)  = (IOPS proportion) * osd_bandwidth_capacity_per_shard
    (Bytes/sec)   (unitless)             (bytes/sec)

The above result is updated within the mClock server's ClientInfo
structure for different op_scheduler_class operations. See
mClockScheduler::ClientRegistry::update_from_config().

The overall cost of an IO operation (in secs) is finally determined
during the tag calculations performed in the mClock server. See
crimson::dmclock::RequestTag::tag_calc() for more details.

4. Profile Allocations:
Optimize mClock profile allocations due to the change in the cost model
and lower recovery cost.

5. Modify standalone tests to reflect the change in the QoS config
parameter representation of reservation and limit options.

Fixes: https://tracker.ceph.com/issues/58529
Fixes: https://tracker.ceph.com/issues/59080
Signed-off-by: Samuel Just <sjust@redhat.com>
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
2023-05-08 16:21:59 +05:30
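
The arithmetic described in items 2 and 3 of the commit above can be sketched as follows. This only mirrors the commit message, not the C++ in mClockScheduler: the Python function names are hypothetical, the per-shard capacity is taken as an input because the message does not say how it is derived, and the 300 IOPS figure is made up for illustration.

```python
MIB = 1024 * 1024


def bandwidth_cost_per_io(max_sequential_bandwidth, max_capacity_iops):
    """Base cost of one IO in bytes: max sequential bandwidth (bytes/sec)
    divided by max random write IOPS, as the commit message describes."""
    if max_capacity_iops <= 0:
        raise ValueError("max_capacity_iops must be positive")
    return max_sequential_bandwidth / max_capacity_iops


def io_cost_bytes(io_size_bytes, cost_per_io):
    """Overall cost of an IO in bytes: base cost plus the IO's own size."""
    return cost_per_io + io_size_bytes


def qos_bytes_per_sec(iops_proportion, capacity_per_shard):
    """Reservation or limit in bytes/sec for one op shard:
    (unitless proportion) * (per-shard bandwidth capacity in bytes/sec)."""
    return iops_proportion * capacity_per_shard


# HDD example: 150 MiB/s from the commit message; 300 IOPS is an assumption.
hdd_cost = bandwidth_cost_per_io(150 * MIB, 300.0)
print(hdd_cost, io_cost_bytes(4096, hdd_cost))
```

Under these assumed numbers a small 4 KiB write is dominated by the base cost, which is the point of adding a per-IO constant to the byte count.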
Milind Changire
ab64bfaaf9
qa: add test to verify blocking of osd pool mksnap for fs pools
Signed-off-by: Milind Changire <mchangir@redhat.com>
2023-05-08 13:23:15 +05:30
Venky Shankar
0252313c87 qa: add checks that validate removal of entries from lost+found dir
Signed-off-by: Venky Shankar <vshankar@redhat.com>
2023-05-06 11:03:09 -04:00
Xiubo Li
d851a9475c qa: fix journal flush failure issue due to the MDS daemon crashes
After the MDS daemon crashes, the journal flush request will fail.

Signed-off-by: Xiubo Li <xiubli@redhat.com>
2023-05-05 18:46:01 +08:00
Xiubo Li
71797091a2 qa: add test support for the alloc ino failing
Fixes: https://tracker.ceph.com/issues/52280
Signed-off-by: Xiubo Li <xiubli@redhat.com>
2023-05-05 18:46:01 +08:00
luo rixin
75adc57feb qa: remove leveldb support from qa
qa/suites: remove leveldb log setting
qa/rebuild_mondb: replace leveldb to rocksdb
qa/valgrind: remove leveldb from valgrind.supp

Signed-off-by: luo rixin <luorixin@huawei.com>
2023-05-04 10:43:08 +08:00
Adam King
2f3afa76ee
Merge pull request #51226 from jsoref/spelling-orchestrator
orchestrator: Fix spelling

Reviewed-by: Adam King <adking@redhat.com>
2023-05-03 17:31:04 -04:00
Adam King
9f3d21e020
Merge pull request #47199 from adk3798/osp-nfs-ha
mgr/cephadm: support for nfs backed by VIP

Reviewed-by: John Mulligan <jmulligan@redhat.com>
Reviewed-by: Redouane Kachach <rkachach@redhat.com>
2023-05-03 17:18:27 -04:00
Casey Bodley
770153dd5f qa/rgw: dbstore suite uses db for config store
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2023-05-03 11:21:12 -04:00
Matan
010e298f9b
Merge pull request #50457 from Matan-B/wip-matanb-c-new-rbd-api
qa/suites/crimson-rados/rbd: Add new rbd image format api tests

Reviewed-by: Samuel Just <sjust@redhat.com>
2023-05-02 12:21:35 +03:00
Venky Shankar
a0ab964b00 Merge PR #51005 into main
* refs/pull/51005/head:
	qa: fix test_nfs_export_creation_at_symlink
	qa: update test cases to check for ENOTDIR instead of EINVAL
	qa: fix test_nfs_export_with_invalid_path
	mgr/nfs: handle exceptions for cephfs_path_is_dir()
	mgr/nfs/utils: changes to helper func to check cephfs path

Reviewed-by: Venky Shankar <vshankar@redhat.com>
2023-05-02 11:28:41 +05:30
Adam King
e905830833 qa/cephadm: teuth test for keepalive-only ingress over nfs
Signed-off-by: Adam King <adking@redhat.com>
2023-05-01 15:45:11 -04:00
Patrick Donnelly
f194b277ec
qa: add test case for mds sending multiple boot messages
Test case for [1].

[1] https://tracker.ceph.com/issues/59318

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2023-05-01 15:02:19 -04:00
Patrick Donnelly
c82d2a41f1
qa: support checking for a log message that should not exist
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2023-05-01 11:21:26 -04:00
Matan Breizman
27e309ef96 qa/suites/crimson-rados/rbd: Add new rbd image format api tests
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
2023-05-01 09:09:20 +00:00
Matan
dbe797bf9d
Merge pull request #50584 from Matan-B/wip-matanb-crimson-only-add-thrash
qa/suites/crimson-rados/thrash: Enable supported tests and ops

Reviewed-by: Samuel Just <sjust@redhat.com>
2023-04-27 12:43:00 +03:00
Ilya Dryomov
c9f0ecd242
Merge pull request #51228 from jsoref/spelling-rbd
rbd: fix spelling errors

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
2023-04-26 23:07:47 +02:00
Rishabh Dave
3bc21774f4
Merge pull request #51030 from batrick/i59425
qa: check each fs for health

Reviewed-by: Venky Shankar <vshankar@redhat.com>
2023-04-26 19:44:39 +05:30
Josh Soref
965ee91d3f rbd: fix spelling errors
* acquire
* are
* asynchronous
* attempt
* bootstrap
* concurrent
* consume
* couldn't
* cumulative
* disable
* disabling
* disaster
* disconnected
* endianness
* entries
* exclusive
* filesystem
* flag
* generic
* github
* image
* information
* initiating
* latency
* limitations
* metadata
* modify
* namespace
* noautoconsole
* ourselves
* prefetch
* propagate
* protection
* recorder
* recover
* release
* replicated
* reserved
* selection
* sentinel
* several
* snapshot
* source
* specifying
* suppress
* synchronize
* the
* transfer
* triggering
* unknown
* validation
* version
* visible
* write log entries

Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2023-04-26 09:30:53 -04:00
Josh Soref
94ade0cd16 orchestrator: fix spelling errors
* a new
* accommodated
* adopted
* appended
* because
* bootstrap
* bootstrapping
* brackets
* classes
* cluster
* compatible
* completely
* confusion
* daemon
* daemons
* dashboard
* enclosure
* existing
* explicit
* following
* format
* host
* implementation
* inferred
* keepalived
* kubic
* maintenance
* necessarily
* necessary
* network
* notifier
* octopus
* permanent
* presenting
* related
* see
* snapshot
* stateful
* the
* track
* version
* wasn't
* weird

Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2023-04-26 09:21:42 -04:00
Dhairya Parmar
7a6ab315bb qa: fix test_nfs_export_creation_at_symlink
Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
2023-04-26 16:59:36 +05:30
Dhairya Parmar
0c8962587f qa: update test cases to check for ENOTDIR instead of EINVAL
- test_nfs_export_creation_at_filepath:
ENOTDIR is raised instead of EINVAL which is better
aligned with the nature of the failure

- test_nfs_export_creation_at_symlink:
ENOTDIR is raised instead of ENOENT since the code
can now check if the path is symlink but won't follow
it.

Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
2023-04-26 15:50:18 +05:30
Dhairya Parmar
5cc0857f41 qa: fix test_nfs_export_with_invalid_path
The test did not actually exercise the invalid path. It still ended with
ENOENT (which is expected when the path is invalid) because the test
didn't create a fs, so it failed with "FS nfs-cephfs not found",
which also raises ENOENT; thus it always passed.

Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
2023-04-26 15:50:18 +05:30
Adam King
c7d382b0ff
Merge pull request #49103 from adk3798/mon-crush-location
mgr/cephadm: allow setting mon crush locations through mon service spec

Reviewed-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
Reviewed-by: Redouane Kachach <rkachach@redhat.com>
2023-04-25 11:25:29 -04:00
Matan Breizman
b888cfa3da qa/suites/crimson-rados/thrash: Enable supported thrashers
Balanced/Localized reads are now supported.
snap_remove and rollback are supported as well.

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
2023-04-25 09:47:38 +00:00
Matan Breizman
8839f829d6 qa/suites/crimson-rados/thrash: Add snap_remove/create weights
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
2023-04-25 09:47:38 +00:00
Rishabh Dave
ac9ca5d14b
Merge pull request #51127 from rishabh-d-dave/fs-qa-caps_helper-bug
qa/cephfs/caps_helper: fix a bug in methods that generate cap string

Reviewed-by: Venky Shankar <vshankar@redhat.com>
2023-04-20 19:27:19 +05:30
Rishabh Dave
9be6de9185
Merge pull request #50882 from rishabh-d-dave/fs-qa-CapTester
qa/cephfs: improvements in caps_helper

Reviewed-by: Venky Shankar <vshankar@redhat.com>
2023-04-20 19:27:02 +05:30
Rishabh Dave
f0ffade052 qa/cephfs/cap_tester: simplify CapTester and its instantiation
Class CapTester contains two distinct, immiscible groups of methods: one
that tests MON caps and another that tests MDS caps. When CapTester is used
for the former, instantiation needs neither the mount object nor
the path where files for testing will be created, nor does it need to run the
method that creates files for testing rw permissions. When the
class is used for the latter, the exact opposite holds.

Create two separate classes, one for each of these purposes, and a class that
inherits from both of them, so that instantiating the class becomes
as simple as it can be.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2023-04-18 20:32:15 +05:30
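
The split described in the commit above can be sketched with plain Python classes. All names here (`MonCapChecks`, `MdsCapChecks`, `CapChecks`, `read_file`) are hypothetical stand-ins, not the classes in caps_helper.py, and the checks are reduced to bare assertions.

```python
class MonCapChecks:
    """MON-cap checks: need neither a mount object nor test files."""

    def check_mon_caps(self, listed_fs_names, permitted_fs_names):
        # Every fs the client has permissions for must show up in the listing.
        for name in permitted_fs_names:
            assert name in listed_fs_names, f"{name} not listed"


class MdsCapChecks:
    """MDS-cap checks: need the test file's path and expected data."""

    def __init__(self, path=None, data=None):
        self.path = path
        self.data = data

    def check_read(self, read_file):
        # read_file is a hypothetical callable that returns file contents.
        assert read_file(self.path) == self.data


class CapChecks(MonCapChecks, MdsCapChecks):
    """Combined class for tests that exercise both kinds of caps."""
```

A test that only checks MON caps can then instantiate `MonCapChecks()` with no arguments, while one that needs both uses `CapChecks(path=..., data=...)`.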
Xiubo Li
7ebd06441a qa: wait for 100 seconds to make sure the quota to be enforced
In the worst case, the kclient holds the dirty caps for 60 seconds,
and the MDS may also defer updating the directory rstat for 5 seconds
(one tick), or longer if it needs to wait for the mdlog to flush.

Fixes: https://tracker.ceph.com/issues/59349
Signed-off-by: Xiubo Li <xiubli@redhat.com>
2023-04-18 22:30:09 +08:00
Xiubo Li
ac1de56fcd qa: create a new directory to fill the volume space
When trying to fill the volume space by repeatedly writing to multiple
files, and when the dirty caps are flushed back to the MDS, the MDS will
skip updating the parent rstat within 'mds_dirstat_min_interval' to
avoid propagating more often than this. That means the quota changes
couldn't be broadcast to the clients in time.

So after waiting for 20 seconds, if we write to the existing
files, only the first file could successfully update the parent quota
realm in the MDS, but this won't increase the total size.

Fixes: https://tracker.ceph.com/issues/59349
Signed-off-by: Xiubo Li <xiubli@redhat.com>
2023-04-18 22:30:09 +08:00
Rishabh Dave
95c6daa45b qa/cephfs: update method caps_helper.CapTester.run_cap_tests()
Signed-off-by: Rishabh Dave <ridave@redhat.com>
2023-04-18 19:42:55 +05:30
Rishabh Dave
ea9f13e553 qa/cephfs: move few methods such that they can be reused
Move get_mon_cap_from_keyring() and get_fsnmes_from_moncap() from class
CapTester to main namespace of caps_helper.py so that they can be
imported freely and reused by tests.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2023-04-18 19:42:55 +05:30
Rishabh Dave
ad68a55121 qa/cephfs: improve caps_helper.CapTester.run_mon_cap_tests()
This method checks whether the output of the command "ceph fs ls" for the
client ID it receives is the same as the output printed for client.admin.
Don't do so; limit the test to only checking whether "ceph fs ls --id client.x
-k keyring_file" prints the fs name for which client.x has permissions.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2023-04-18 19:42:55 +05:30
Rishabh Dave
008dbe91e2 qa/cephfs: improve caps_helper.CapTester
Improvement #1:

CapTester.write_test_files() not only creates the test file but also
does the following for every mount object it receives in parameters -

* carefully produces the path for the test file as per parameters
  received
* generates the unique data for each test file on a CephFS mount
* creates a data structure -- list of lists -- that holds all this
  information along with mount object itself for each mount object so
  that tests can be conducted at a later point

Untangle this mess of code by splitting this method into 3 separate
methods -

1. To produce the path for test file (as per user's need).
2. To generate the data that will be written into the test file.
3. To actually create the test file on CephFS.

Improvement #2:

Remove the internal data structure used for testing -- self.test_set --
and use separate class attributes to store all the data required for
testing instead of a tuple. This serves two purposes -

One, it makes it easy to manipulate all this data from helper methods
and during debugging session, especially while using a PDB session.

And two, it makes it impossible to have multiple mounts/multiple "test sets"
within the same CapTester instance, for the sake of simplicity. Users can
instead create two CapTester instances if needed.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2023-04-18 19:42:55 +05:30
Rishabh Dave
e8bdf94b81 qa/cephfs: don't inherit CephFSTestCase in CapTester
Inheriting CephFSTestCase in CapTester just for the methods assertEqual()
and assertIn() from class unittest.TestCase is odd and heavy-weight.
Don't inherit CephFSTestCase; use a simple assert instead.

Reference: https://github.com/ceph/ceph/pull/50882#discussion_r1160611549.

To avoid code duplication, a couple of similar methods have been added
instead.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2023-04-18 19:34:40 +05:30
Rishabh Dave
64e3dd7e62 qa/cephfs/caps_helper: fix a bug in methods that generate cap string
The tuple was not meant to be passed as a whole but its individual
members are to be passed as a list of positional arguments.

Introduced-by: 87025d1585
Signed-off-by: Rishabh Dave <ridave@redhat.com>
2023-04-18 16:28:03 +05:30
Kamoltat
431c4559c4 qa/standalone: create mon-stretch standalone test
Separate `mon-stretch` from `mon`.

Renamed `mon-stretched-cluster.sh` to
`mon-stretch-fail-recovery.sh`.

Isolating the stretch cluster test will enable
developers to get results faster for
stretch-cluster related work.

Signed-off-by: Kamoltat <ksirivad@redhat.com>
2023-04-17 16:06:22 +00:00
Venky Shankar
6684f3e55e Merge PR #50909 into main
* refs/pull/50909/head:
	qa/workunit: print the detail commands executed in the scripts

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Dhairya Parmar <dparmar@redhat.com>
2023-04-14 15:59:12 +05:30
Nizamudeen A
abff7f0bb7
Merge pull request #49531 from rhcs-dashboard/fix-rbd-snapshot-creation
mgr/dashboard: Fix rbd snapshot creation

Reviewed-by: VasishtaShastry <NOT@FOUND>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
Reviewed-by: sunilangadi2 <NOT@FOUND>
2023-04-14 12:06:05 +05:30
Ilya Dryomov
3b936801fd
Merge pull request #51051 from idryomov/wip-59431
qa/suites/rbd: install qemu-utils in addition to qemu-block-extra on Ubuntu

Reviewed-by: Ramana Raja <rraja@redhat.com>
2023-04-13 11:54:57 +02:00
Ilya Dryomov
c529fdd63a qa/suites/rbd: install qemu-utils in addition to qemu-block-extra on Ubuntu
qemu-utils is usually pre-installed but, due to what appears to be
an Ubuntu packaging bug, it's not upgraded when qemu-block-extra is
installed:

  The following NEW packages will be installed:
    qemu-block-extra
  The following packages will be upgraded:
    qemu-system-common qemu-system-data qemu-system-gui qemu-system-x86

However, the version of the block driver must match exactly the version
of the qemu-img tool, so the above leads to:

  $ qemu-img convert -f qcow2 -O raw /home/ubuntu/cephtest/qemu/base.client.0.0.qcow2 rbd:rbd/client.0.0
  Failed to initialize module: /usr/lib/x86_64-linux-gnu/qemu/block-rbd.so
  Note: only modules from the same build can be loaded.
  qemu: module block-block-rbd not found, do you want to install qemu-block-extra package?
  qemu-img: Unknown protocol 'rbd'

Fixes: https://tracker.ceph.com/issues/59431
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2023-04-12 15:37:44 +02:00
Aashish Sharma
5ea4171ae3 mgr/dashboard: fix rbd mirror snapshot creation
There are two types of snapshots that can be created on a snapshot-based mirroring image: Normal Snapshot (same as journal-based snapshot) and Mirror Image Snapshot. Until now, the Dashboard allowed only Mirror Image Snapshots; this PR intends to enable both types.

Signed-off-by: Aashish Sharma <aasharma@redhat.com>
2023-04-12 11:50:40 +05:30
Samuel Just
6a56d85f19 qa/standalone/scrub/osd-scrub-dump.sh: drop unnecessary primary lookup
1e44d86b2 swapped this to a pg tell command which doesn't actually
need the primary specified.  Drop the now unnecessary lookup.

Signed-off-by: Samuel Just <sjust@redhat.com>
2023-04-11 20:39:19 -07:00
Patrick Donnelly
06c90a6c48
qa: check each fs for health
Fixes: https://tracker.ceph.com/issues/59425
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2023-04-11 10:16:53 -04:00
Rishabh Dave
1944977522 qa/ceph_manager: preserve newline char at EOF in keyring
The lack of a newline character at the end of the keyring file makes
the CephFS mount command end with an error.
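
A minimal sketch of the idea, assuming the keyring is handled as a
Python string before it is written out. The helper name and the keyring
content are hypothetical placeholders.

```python
def ensure_trailing_newline(text):
    """Return text unchanged if it ends with a newline, else append one."""
    return text if text.endswith("\n") else text + "\n"

# Placeholder keyring content; the key value is not a real key.
keyring = "[client.testuser]\n\tkey = PLACEHOLDER"
keyring = ensure_trailing_newline(keyring)
```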

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2023-04-11 16:09:21 +05:30
Rishabh Dave
1a7ca489b4 qa/cephfs: minor improvement in caps_helper
Use Python type list instead of tuple since it becomes necessary to
modify members of this sequence.
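
The difference can be shown with a short sketch. The sequence contents
below are made up for illustration and are not the actual data layout
used in caps_helper.

```python
# A tuple rejects item assignment with TypeError.
cap_as_tuple = ("cephfs", "rw", "/dir1")
try:
    cap_as_tuple[1] = "r"
    tuple_modified = True
except TypeError:
    tuple_modified = False

# A list allows its members to be modified in place.
cap_as_list = ["cephfs", "rw", "/dir1"]
cap_as_list[1] = "r"
```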

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2023-04-11 12:03:49 +05:30
Adam King
8c52a0a77b qa/suites/orch/cephadm: teuth test for mon crush locations
Trying to add a feature where mon crush locations
can be set through the orchestrator using the mon
service spec. This is meant to be a test for that.

Signed-off-by: Adam King <adking@redhat.com>
2023-04-10 14:35:41 -04:00
yaarith
e0a6d6aa70
Merge pull request #50699 from yaarith/telemetry-leaderboard-doc
mgr/telemetry: add leaderboard description and documentation

Reviewed-by: Zac Dover <zac.dover@proton.me>
Reviewed-by: Laura Flores <lflores@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
2023-04-10 13:55:36 -04:00
Venky Shankar
bdf4d4677f Merge PR #50844 into main
* refs/pull/50844/head:
	qa: wait for MDSMonitor tick to replace daemons

Reviewed-by: Dhairya Parmar <dparmar@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
2023-04-06 23:11:38 +05:30
Yuri Weinstein
23a958d647
Merge pull request #47893 from kotreshhr/ceph-mgr-finisher-block
mgr: Add one finisher thread per module

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Brad Hubbard <bhubbard@redhat.com>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
2023-04-06 09:23:24 -07:00