163 Commits

Author SHA1 Message Date
Ilya Dryomov
c870ead3d4
Merge pull request #55595 from VallariAg/wip-nvmeof-test-v3
qa/suite/rbd/nvmeof: Deploy multiple gateways and namespaces

Reviewed-by: Barak Davidov <barakda@il.ibm.com>
Reviewed-by: Aviv Caro <Aviv.Caro@ibm.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
2024-03-20 10:49:36 +01:00
Vallari Agrawal
00651cfac2
qa/suite/rbd/nvmeof: Deploy multiple gateways and namespaces
1. Deploy 2 gateways on different nodes, then check for multi-path.
    To add another gateway, only "roles" need to be changed in job yaml.
2. Create "n" nvmeof namespaces, configured by 'namespaces_count'
3. Rename qa/suites/rbd/nvmeof/cluster/fixed-3.yaml to fixed-4.yaml
    which contains 2 gateways and 2 initiators.

Signed-off-by: Vallari Agrawal <val.agl002@gmail.com>
2024-03-19 20:48:26 +05:30
Ramana Raja
92b254138d qa/suites: add diff-continuous and compare-mirror-image tests
... to rbd and krbd suites respectively.

This allows the compare-mirror-image tests introduced in ea3a567
to be run against various kernel branches, e.g., testing branch.
And allows diff_continuous test in rbd_suite to run against distro
kernel.

Fixes: https://tracker.ceph.com/issues/64574
Signed-off-by: Ramana Raja <rraja@redhat.com>
2024-02-29 12:12:19 -05:00
Ramana Raja
af43f61624 qa/suites/rbd: rename nbd folder to device folder
Signed-off-by: Ramana Raja <rraja@redhat.com>
2024-02-29 11:55:08 -05:00
Ilya Dryomov
fa5ef874ac
Merge pull request #54802 from ajarr/wip-61617
qa: Add tests to validate synced images on rbd-mirror

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
2024-02-23 23:47:42 +01:00
Ramana Raja
b7aae5c3c5 qa: Add tests to validate syncing of images using rbd-mirror
Introduce functional tests to validate that the images under
workloads are correctly mirrored between two clusters using snapshot
based mirroring.

Run workload on a primary image using a krbd or nbd client. Take
mirror snapshots of the image under workload. Unmount the mapped image
and calculate its MD5 checksum before demoting it. After demotion,
wait for the mirror status of the image to be 'up+unknown' in both
the clusters. This is to make sure that the non-primary image in the
other cluster is ready to be promoted. Now promote the non-primary
image in the other cluster. Map the promoted image and calculate its
MD5 checksum. Verify that the checksums of the demoted and promoted
images in the two clusters are the same.

The above test is run as part of two different workunits:
 - a workunit that validates the syncing of multiple mirrored images
   with workloads running on them
 - another workunit that validates the syncing of a single mirrored
   image with a workload running on it, where the image is set as
   primary alternately between the two clusters, as happens during
   failover and failback scenarios.
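
The checksum comparison described in this commit message can be sketched
roughly as follows. This is only an illustration, not the workunit itself:
the function names and the idea of passing two device paths are assumptions,
and the real test maps the images through krbd or nbd before reading them.

```python
import hashlib


def md5_of_device(path, chunk_size=4 * 1024 * 1024):
    """Return the hex MD5 digest of a file or mapped block device.

    Reads in chunks so that large images do not have to fit in memory.
    """
    digest = hashlib.md5()
    with open(path, "rb") as dev:
        for chunk in iter(lambda: dev.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def images_match(demoted_path, promoted_path):
    """Compare the demoted image with the promoted image (hypothetical helper)."""
    return md5_of_device(demoted_path) == md5_of_device(promoted_path)
```

In the flow described above, the first checksum would be taken after
unmounting and before demotion, and the second after promoting and mapping
the image in the other cluster.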

Fixes: https://tracker.ceph.com/issues/61617
Signed-off-by: Ramana Raja <rraja@redhat.com>
Co-authored-by: Ilya Dryomov <idryomov@redhat.com>
Co-authored-by: Christopher Hoffman <choffman@redhat.com>
2024-02-22 11:44:36 -05:00
Vallari Agrawal
1713c4852c
qa: add qa/tasks/nvmeof.py and rbd/nvmeof_basic_task and fio workunits
This is v2 of the rbd/nvmeof test: It deploys 1 gateway and 1 initiator.
Then does basic verification on nvme commands and runs fio.

This commit creates:
1. qa/tasks/nvmeof.py: adds a new 'Nvmeof' task which deploys
    the gateway and shares config with the initiator hosts.
    Sharing config was previously done by 'nvmeof_gateway_cfg' task
    in qa/tasks/cephadm.py (that task is removed in this commit).
2. qa/workunits/rbd/nvmeof_basic_tests.sh:
    Runs nvme commands (discovery, connect, connect-all, disconnect-all,
    and list-subsys) and does basic verification of the output.
3. qa/workunits/rbd/nvmeof_fio_test.sh:
    Runs fio command. Also runs iostat in parallel if IOSTAT_INTERVAL
    variable is set. This variable configures the delay between each iostat
    print.
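
The optional iostat behaviour in item 3 can be sketched as below. The actual
workunit is a shell script; this Python version is an assumption-laden
illustration, and build_monitor_command is a hypothetical name. Only the
IOSTAT_INTERVAL variable comes from the commit message.

```python
def build_monitor_command(env):
    """Return the iostat command to run next to fio, or None.

    `env` is a mapping such as os.environ. When IOSTAT_INTERVAL is unset
    or empty, no monitoring command is produced.
    """
    interval = env.get("IOSTAT_INTERVAL")
    if not interval:
        return None
    # iostat takes the delay between reports (in seconds) as a positional
    # argument; any further flags used by the real script are not shown here.
    return ["iostat", str(int(interval))]
```

A caller could start the returned command in the background before
launching fio and stop it once fio exits.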

nvmeof-cli upgrade from v0.0.6 to v0.0.7 introduced major changes
to all nvmeof commands. This commit changes v0.0.6 commands to
v0.0.7 in qa/workunits/rbd/nvmeof_initiator.sh

Signed-off-by: Vallari Agrawal <val.agl002@gmail.com>
2024-02-12 13:00:09 +05:30
Ilya Dryomov
d9147a14c4
Merge pull request #54205 from VallariAg/wip-nvmeof-test
qa: add rbd/nvmeof integration test

Reviewed-by: Zack Cerza <zack@redhat.com>
Reviewed-by: Aviv Caro <Aviv.Caro@ibm.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
2023-12-04 18:14:38 +01:00
Vallari Agrawal
42e121a42a
qa: add rbd/nvmeof test
A basic test for ceph-nvmeof[1] where
nvmeof initiator is created.
It requires use of a new task "nvmeof_gateway_cfg"
under cephadm which shares config information
between two remote hosts.

[1] https://github.com/ceph/ceph-nvmeof/

Signed-off-by: Vallari Agrawal <val.agl002@gmail.com>
2023-12-04 19:27:54 +05:30
Suyashd999
9b773eec4a qa/suites/rbd: Cleanup of MIRROR_IMAGE_MODE
Fixes: https://tracker.ceph.com/issues/63431
Signed-off-by: Suyash Dongre <suyashd999@gmail.com>
2023-11-14 18:28:02 +05:30
Ilya Dryomov
c93a53aa66
Merge pull request #48508 from pkalever/rbd-tests
qa/workunits/rbd: merge journal and snapshot test scripts

Reviewed-by: Ramana Raja <rraja@redhat.com>
Reviewed-by: Mykola Golub <mgolub@suse.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
2023-11-03 12:55:02 +01:00
Prasanna Kumar Kalever
3fd8a03887 qa/workunits/rbd: merge journal and snapshot test scripts
The idea is to avoid the maintenance of duplicate code in both the journal
and snapshot test scripts.

Usage:
   RBD_MIRROR_MODE=journal rbd_mirror.sh

Use environment variable RBD_MIRROR_MODE to set the mode
Available modes: snapshot | journal
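
A minimal sketch of the mode selection described above, written in Python
for illustration even though rbd_mirror.sh is a shell script. The function
name is hypothetical, and rejecting an unset variable is an assumption: the
commit message does not say whether the script has a default mode.

```python
VALID_MODES = ("snapshot", "journal")


def mirror_mode(env):
    """Return the mirroring mode named by RBD_MIRROR_MODE.

    `env` is a mapping such as os.environ. Raises ValueError for an
    unset or unknown mode (assumed behaviour, see the note above).
    """
    mode = env.get("RBD_MIRROR_MODE")
    if mode not in VALID_MODES:
        raise ValueError(
            "RBD_MIRROR_MODE must be one of: " + " | ".join(VALID_MODES)
        )
    return mode
```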

Fixes: https://tracker.ceph.com/issues/54312
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2023-11-02 18:11:55 +05:30
Ilya Dryomov
c5eb0ce432
Merge pull request #53535 from ajarr/wip-62891
qa/suites/rbd: add test to check rbd_support module recovery

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Mykola Golub <mgolub@suse.com>
2023-11-01 10:45:59 +01:00
Ramana Raja
2f2cd3bcff qa/suites/rbd: add test to check rbd_support module recovery
... on repeated blocklisting of its client.

There were issues with rbd_support module not being able to recover
from its RADOS client being repeatedly blocklisted. This occurred for
example in clusters with OSDs slow to process RBD requests while the
module's mirror_snapshot_scheduler was taking mirror snapshots by
requesting exclusive locks on the RBD images and workloads were running
on the snapshotted images via kernel clients.

Fixes: https://tracker.ceph.com/issues/62891
Signed-off-by: Ramana Raja <rraja@redhat.com>
2023-10-10 12:58:19 -04:00
Ilya Dryomov
e40752ec25 qa/suites/rbd: drop redundant ignorelist entries
CACHE_POOL_NO_HIT_SET is retained in *api_tests*.yaml and
rbd_mirror.yaml snippets for TestLibRBD.ListChildrenTiered and
TestClusterWatcher.CachePools tests.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2023-10-10 12:50:36 +02:00
Ilya Dryomov
83880580aa qa/suites/rbd: deduplicate (data) pool facets
With cache tiering facets gone, "pool" facets are strictly about
--data-pool option now.  Rename to "data-pool" and create symlinks
to a common directory.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2023-10-10 09:42:24 +02:00
Ilya Dryomov
194dd09263 qa/suites/rbd: drop cache tiering workload tests
Cache tiering facets have been a constant source of job timeouts
accompanied by "slow request" warnings on the OSDs for at least two
years.  Same workloads pass without pool/small-cache-pool.yaml or
thrashers/cache.yaml.

See cache tiering deprecation note added in commit 535b8db33e ("doc:
deprecate the cache tiering").

Fixes: https://tracker.ceph.com/issues/63149
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2023-10-10 09:42:11 +02:00
Ilya Dryomov
9e884ddeec qa/suites/rbd: drop POOL_APP_NOT_ENABLED from ignorelists
With "mon warn on pool no app = false" in the config, it's obviously
redundant.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2023-09-21 23:36:07 +02:00
Ilya Dryomov
e64830eb8e qa/suites/rbd: disable POOL_APP_NOT_ENABLED health check
Commit 990806e635 ("mon, qa: issue pool application warning even
if pool is empty") made it impossible to create a pool without raising
a (bogus) health alert.  See [1] for details.

[1] https://lists.ceph.io/hyperkitty/list/dev@ceph.io/thread/ZTDYC5HN677RR26EB4P6PORN6L2IFH4R/

Fixes: https://tracker.ceph.com/issues/62711
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2023-09-21 23:36:07 +02:00
Casey Bodley
cbdd520995 qa/suites: install pytest for pybind tasks
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2023-07-17 16:31:08 -04:00
Ilya Dryomov
acb270a3dd qa/workunits/rbd: make continuous export-diff test actually work
The current version is pretty useless:

- "rbd bench" writes the same byte (0xff) over and over again, so
  almost all checksumming is in vain
- snapshots are taken in a steady state (i.e. not under I/O), so no
  race conditions can get exposed
- even with these caveats, it's not wired up into the suite
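
The first point can be illustrated with a small sketch: when every byte is
0xff, a checksum cannot notice blocks landing in the wrong place, whereas
varied data exposes it. The helper names, block size and data pattern are
assumptions chosen for the example and are not taken from the workunit.

```python
import hashlib


def md5(data):
    """Hex MD5 digest of a bytes object."""
    return hashlib.md5(data).hexdigest()


def swap_blocks(data, block_size, i, j):
    """Return a copy of `data` with blocks i and j exchanged."""
    blocks = [data[k:k + block_size] for k in range(0, len(data), block_size)]
    blocks[i], blocks[j] = blocks[j], blocks[i]
    return b"".join(blocks)


# Data as written by a workload that repeats a single byte.
uniform = b"\xff" * 4096
# Deterministic, non-repeating-per-block pattern standing in for real data.
varied = bytes(i % 251 for i in range(4096))

# Misplacing blocks is invisible in the uniform case...
uniform_detects = md5(swap_blocks(uniform, 512, 0, 1)) != md5(uniform)
# ...but changes the checksum when block contents differ.
varied_detects = md5(swap_blocks(varied, 512, 0, 1)) != md5(varied)
```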

Redo this workunit to be a reliable reproducer for the issue fixed
in the previous commit and wire it up for both krbd and rbd-nbd.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2023-06-20 22:14:39 +02:00
Ilya Dryomov
c529fdd63a qa/suites/rbd: install qemu-utils in addition to qemu-block-extra on Ubuntu
qemu-utils is usually pre-installed but, due to what appears to be
an Ubuntu packaging bug, it's not upgraded when qemu-block-extra is
installed:

  The following NEW packages will be installed:
    qemu-block-extra
  The following packages will be upgraded:
    qemu-system-common qemu-system-data qemu-system-gui qemu-system-x86

However, the version of the block driver must match exactly the version
of the qemu-img tool, so the above leads to:

  $ qemu-img convert -f qcow2 -O raw /home/ubuntu/cephtest/qemu/base.client.0.0.qcow2 rbd:rbd/client.0.0
  Failed to initialize module: /usr/lib/x86_64-linux-gnu/qemu/block-rbd.so
  Note: only modules from the same build can be loaded.
  qemu: module block-block-rbd not found, do you want to install qemu-block-extra package?
  qemu-img: Unknown protocol 'rbd'

Fixes: https://tracker.ceph.com/issues/59431
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2023-04-12 15:37:44 +02:00
Or Ozeri
3b2908b6fb qa/tasks/qemu: use formatted clones on encrypted disks
This commit changes the format for encrypted disks to have
the child image and the parent image encrypted with different keys.
This is to allow testing of the new formatted clones feature in librbd/crypto.

Signed-off-by: Or Ozeri <oro@il.ibm.com>
2022-08-25 18:41:48 +03:00
Ilya Dryomov
0a6a70760a qa/suites/rbd: disable workunit timeout for dynamic_features_no_cache
The I/O workload in this test is xfstests (qa/run_xfstests_qemu.sh)
which isn't subjected to any timeout other than global max_job_time
limit in any other subsuite (e.g. qemu/workloads/qemu_xfstests.yaml).
But here, there is a parallel "op" workload defined as a workunit.
The workunit task has a default timeout of 3 hours which is effectively
imposed on the entire job.  In the "rbd cache = false" configuration,
it's sometimes exceeded.

Fixes: https://tracker.ceph.com/issues/48038
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2022-07-17 19:06:18 +02:00
Ilya Dryomov
2de0574382 qa/tasks: rename persistent write log cache thrash task
It doesn't really thrash anything, just repeatedly restarts the
workload on top of a dirty cache file.  rbd_pwl_cache_recovery is
more on point and gets covered by existing CODEOWNERS.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2022-07-16 09:46:58 +02:00
Yin Congmin
0eab8de3c0 qa/tasks: add thrash test for persistent write log cache
Add a thrash test for the persistent write log cache: run rbd bench
on the persistent write log cache, thrash rbd bench, and test the
recovery function of the persistent write log cache.

Signed-off-by: Yin Congmin <congmin.yin@intel.com>
2022-07-13 13:31:02 +08:00
Ilya Dryomov
23759e0034 qa/suites/rbd: place cache file on tmpfs for xfstests
The RWL mode needs DAX and is dog slow otherwise -- qemu_xfstests.yaml
job always hits the 6 hour max_job_time limit.

As our tmpfs instance is limited and qemu_xfstests.yaml opens three
images at the same time, reduce the "big cache" size to 5G.  This facet
was added to iron out 32-bit head/tail pointer issues and 5G still does
the job there.

Going through the loop device is needed because tmpfs doesn't support
O_DIRECT.

Fixes: https://tracker.ceph.com/issues/55400
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2022-06-03 19:20:37 +02:00
Ilya Dryomov
3475f9ef07 qa/suites/rbd: refactor persistent-writeback-cache suite
Rename to pwl-cache, introduce home subdirectory and 4-cache-path.yaml.
No functional changes.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2022-06-03 00:06:35 +02:00
Ilya Dryomov
8f0fd0af3d qa/suites/rbd: make sure block-rbd.so is installed
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2022-02-16 12:20:44 +01:00
Patrick Donnelly
1f714da814
qa: fix or add missing .qa links
Using this command:

    find qa/suites/ -type d -execdir ln -sfT ../.qa/ {}/.qa \;
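
For readers less familiar with find -execdir, the effect of the command
above can be approximated in Python as follows. This is a sketch under
assumptions: link_qa_dirs is a hypothetical name, and unlike ln -sfT it
silently skips a .qa entry that is not a symlink instead of reporting an
error.

```python
import os


def link_qa_dirs(root):
    """Give every directory under `root` a `.qa` symlink to `../.qa/`."""
    created = []
    # os.walk does not follow symlinks by default, so existing .qa links
    # are never descended into.
    for dirpath, _dirnames, _filenames in os.walk(root):
        link = os.path.join(dirpath, ".qa")
        if os.path.islink(link):
            os.remove(link)  # replace an existing link, like ln -f
        elif os.path.lexists(link):
            continue  # a real file or directory: leave it alone
        os.symlink("../.qa/", link)
        created.append(link)
    return created
```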

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2022-02-03 10:08:30 -05:00
Ilya Dryomov
4ed1e74d83 qa/suites/rbd: add cram-based mon command API test
With mon (rbd_support mgr module in this case) command definitions
generated automatically by @CLI{Read,Write}Command decorator, it's
very easy to accidentally break the external facing API.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2022-01-30 14:22:35 +01:00
Deepika Upadhyay
9b306af421 qa/rbd: update the cephadm required distro
Signed-off-by: Deepika Upadhyay <dupadhya@redhat.com>
2021-12-06 23:00:16 +05:30
Yin Congmin
3da4a9401c qa/suites/rbd/persistent-writeback-cache: add test case
Add an 8GB test case so that problems that occur only in test scenarios
above 4GB can be found by this test. For example, a 32-bit variable may
hold an unexpected value when it is used in an operation with a 64-bit
value.
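
The kind of problem this test is meant to catch can be shown with a small
sketch. It assumes, for illustration only, that an offset is stored in an
unsigned 32-bit variable; the helper name is hypothetical and no specific
variable in the cache code is implied.

```python
UINT32_MASK = 0xFFFFFFFF
GIB = 1 << 30


def as_uint32(value):
    """Emulate storing `value` in an unsigned 32-bit variable."""
    return value & UINT32_MASK


# Below 4 GiB the stored value is intact.
small = as_uint32(3 * GIB)
# At 5 GiB the upper bits are lost and the value wraps around to 1 GiB.
wrapped = as_uint32(5 * GIB)
```

Because offsets below 4 GiB survive unchanged, a test that never exceeds
that size would not expose such a truncation.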

Signed-off-by: Yin Congmin <congmin.yin@intel.com>
2021-11-12 17:31:00 +08:00
Ilya Dryomov
6278a04ac2 qa/suites/rbd: whitelist POOL_FULL due to quota for test_librbd.sh
RemoveFullTry tests fill up the pool and expect EDQUOT.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2021-10-23 22:04:58 +05:30
Deepika Upadhyay
a7952949a8 qa/suites/rbd: remove baremetal based setup needed for iscsi testing
* replace ceph baremetal deployment with cephadm based deployment

Signed-off-by: Deepika Upadhyay <dupadhya@redhat.com>
2021-10-18 13:21:50 +05:30
Ilya Dryomov
366e9c51a8 qa/suites/rbd: test case for one-way snapshot-based mirroring
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2021-09-24 12:30:14 +02:00
Jason Dillaman
e2c9c5cd41 qa/suites/rbd: added SSD PWL cache mode to tests
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2021-03-10 11:30:47 -05:00
Jason Dillaman
5c991fed21
Merge pull request #38921 from lixiaoy1/pwl_teuthology
qa: add tests for persistent writeback cache

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
2021-02-24 16:31:20 -05:00
Sage Weil
dc64ccf063 qa/suites: do not use notcmalloc flavor
teuthology now knows how to run valgrind against a tcmalloc binary

Signed-off-by: Sage Weil <sage@newdream.net>
2021-02-18 10:26:28 -06:00
Mykola Golub
b4d9cc45d6
Merge pull request #39155 from dillaman/wip-49037
librbd: correct incremental deep-copy object-map inconsistencies

Reviewed-by: Mykola Golub <mgolub@suse.com>
2021-02-10 18:37:34 +02:00
lixiaoy1
86ae486cb1 qa: add tests for persistent writeback cache
Signed-off-by: Li, Xiaoyan <xiaoyan.li@intel.com>
2021-02-09 23:41:18 +08:00
Jason Dillaman
094bfeaf8e qa/suites/rbd: add snapshot-based mirroring stress test
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2021-02-08 09:23:35 -05:00
Jason Dillaman
d652cd7a5c
Merge pull request #39298 from dillaman/wip-rbd-suite-readahead
qa/suites/rbd: drop require-osd-release command

Reviewed-by: Mykola Golub <mgolub@suse.com>
2021-02-04 14:17:07 -05:00
Jason Dillaman
e14f90eea7 qa/suites/rbd: drop require-osd-release command
Teuthology already defaults to quincy now and results in a failure
when trying to set to pacific. Additionally, drop the LUKS readbalance
test since it's unnecessary to duplicate that test.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2021-02-04 11:44:15 -05:00
Jason Dillaman
28ebc6086d
Merge pull request #38715 from lxbsz/rest_api
qa: add REST API method support for ceph-iscsi

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
2021-02-04 11:38:16 -05:00
Xiubo Li
6d0d1d96c2 qa: add REST API method support for ceph-iscsi
Fixes: https://tracker.ceph.com/issues/48529
Signed-off-by: Xiubo Li <xiubli@redhat.com>
2021-02-02 11:06:07 +08:00
Xiubo Li
a18c1e6658 qa: rename gwcli_client to iscsi_client
This could be used for both gwcli and REST API methods.

Fixes: https://tracker.ceph.com/issues/48529
Signed-off-by: Xiubo Li <xiubli@redhat.com>
2021-02-02 08:10:41 +08:00
Or Ozeri
4f438f0dc3 test/librbd: add luks encryption cli test
This commit adds a cli test for rbd encryption verifying LUKS compatibility with cryptsetup

Signed-off-by: Or Ozeri <oro@il.ibm.com>
2021-01-24 09:11:50 +02:00
Or Ozeri
3754c665a1 qa/tasks/rbd: test qemu on top of rbd encryption
This commit adds new qemu xfstests workloads that run on top of librbd luks1/luks2 encryption.
This is currently done via nbd, instead of the qemu rbd driver.

Signed-off-by: Or Ozeri <oro@il.ibm.com>
2021-01-19 19:59:22 +00:00
Jason Dillaman
652963e7df librbd/migration: address code review comments
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2021-01-15 09:33:40 -05:00