1. Deploy 2 gateways on different nodes, then check for multi-path.
To add another gateway, only "roles" needs to be changed in the job yaml.
2. Create "n" nvmeof namespaces, configured by 'namespaces_count'
3. Rename qa/suites/rbd/nvmeof/cluster/fixed-3.yaml to fixed-4.yaml
which contains 2 gateways and 2 initiators.
Signed-off-by: Vallari Agrawal <val.agl002@gmail.com>
... to rbd and krbd suites respectively.
This allows the compare-mirror-image tests introduced in ea3a567
to be run against various kernel branches, e.g., testing branch.
It also allows the diff_continuous test in the rbd suite to run against
the distro kernel.
Fixes: https://tracker.ceph.com/issues/64574
Signed-off-by: Ramana Raja <rraja@redhat.com>
Introduce functional tests to validate that the images under
workloads are correctly mirrored between two clusters using
snapshot-based mirroring.
Run workload on a primary image using a krbd or nbd client. Take
mirror snapshots of the image under workload. Unmount the mapped image
and calculate its MD5 checksum before demoting it. After demotion,
wait for the mirror status of the image to be 'up+unknown' in both
the clusters. This is to make sure that the non-primary image in the
other cluster is ready to be promoted. Now promote the non-primary
image in the other cluster. Map the promoted image and calculate its
MD5 checksum. Verify that the checksums of the demoted and promoted
images in the two clusters are the same.
The above test is run as part of two different workunits:
- a workunit that validates the syncing of multiple mirrored images
with workloads running on them
- another workunit that validates the syncing of a single mirrored
image with a workload running on it, where the image is set as primary
alternately between the two clusters, as happens during failover and
failback scenarios.
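
The checksum comparison step described above can be sketched as follows.
This is a minimal illustration only, not the code of the workunits: the
helper names md5_of_device and checksums_match are hypothetical, and the
paths are assumed to be the mapped block devices (or any readable files)
for the demoted and promoted images.

```python
import hashlib


def md5_of_device(path, chunk_size=4 * 1024 * 1024):
    """Return the hex MD5 digest of everything readable from `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as dev:
        # Read in fixed-size chunks so a large image is never held in memory.
        for chunk in iter(lambda: dev.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksums_match(demoted_path, promoted_path):
    """Compare the pre-demotion checksum with the post-promotion checksum."""
    return md5_of_device(demoted_path) == md5_of_device(promoted_path)
```

In this sketch a mismatch simply returns False; how a failure is reported
is left to the caller.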
Fixes: https://tracker.ceph.com/issues/61617
Signed-off-by: Ramana Raja <rraja@redhat.com>
Co-authored-by: Ilya Dryomov <idryomov@redhat.com>
Co-authored-by: Christopher Hoffman <choffman@redhat.com>
This is v2 of the rbd/nvmeof test. It deploys 1 gateway and 1 initiator,
then does basic verification of nvme commands and runs fio.
This commit creates:
1. qa/tasks/nvmeof.py: adds a new 'Nvmeof' task which deploys
the gateway and shares config with the initiator hosts.
Sharing config was previously done by 'nvmeof_gateway_cfg' task
in qa/tasks/cephadm.py (that task is removed in this commit).
2. qa/workunits/rbd/nvmeof_basic_tests.sh:
Runs nvme commands (discovery, connect, connect-all, disconnect-all,
and list-subsys) and does basic verification of the output.
3. qa/workunits/rbd/nvmeof_fio_test.sh:
Runs fio command. Also runs iostat in parallel if IOSTAT_INTERVAL
variable is set. This variable configures the delay between each iostat
print.
nvmeof-cli upgrade from v0.0.6 to v0.0.7 introduced major changes
to all nvmeof commands. This commit changes v0.0.6 commands to
v0.0.7 in qa/workunits/rbd/nvmeof_initiator.sh
Signed-off-by: Vallari Agrawal <val.agl002@gmail.com>
A basic test for ceph-nvmeof[1] where
an nvmeof initiator is created.
It requires use of a new task "nvmeof_gateway_cfg"
under cephadm which shares config information
between two remote hosts.
[1] https://github.com/ceph/ceph-nvmeof/
Signed-off-by: Vallari Agrawal <val.agl002@gmail.com>
The idea is to avoid the maintenance of duplicate code in both the journal
and snapshot test scripts.
Usage:
RBD_MIRROR_MODE=journal rbd_mirror.sh
Use environment variable RBD_MIRROR_MODE to set the mode
Available modes: snapshot | journal
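
A minimal sketch of the mode selection described above, written in Python
for illustration. The function name mirror_mode and the "journal" default
are assumptions made for this example; only the RBD_MIRROR_MODE variable
and its two values come from the usage text.

```python
import os

VALID_MODES = ("snapshot", "journal")


def mirror_mode(env=None, default="journal"):
    """Return the mirroring mode selected through RBD_MIRROR_MODE.

    `default` is a hypothetical fallback used only in this sketch.
    """
    env = os.environ if env is None else env
    mode = env.get("RBD_MIRROR_MODE", default)
    if mode not in VALID_MODES:
        # Reject anything other than the two documented modes.
        raise ValueError(
            "RBD_MIRROR_MODE must be one of %s, got %r" % (VALID_MODES, mode)
        )
    return mode
```

Keeping the choice in one place lets a shared script branch on the mode
instead of duplicating the journal and snapshot variants.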
Fixes: https://tracker.ceph.com/issues/54312
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
... on repeated blocklisting of its client.
There were issues with the rbd_support module not being able to recover
from its RADOS client being repeatedly blocklisted. This occurred, for
example, in clusters with OSDs slow to process RBD requests while the
module's mirror_snapshot_scheduler was taking mirror snapshots by
requesting exclusive locks on the RBD images and workloads were running
on the snapshotted images via kernel clients.
Fixes: https://tracker.ceph.com/issues/62891
Signed-off-by: Ramana Raja <rraja@redhat.com>
CACHE_POOL_NO_HIT_SET is retained in *api_tests*.yaml and
rbd_mirror.yaml snippets for TestLibRBD.ListChildrenTiered and
TestClusterWatcher.CachePools tests.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
With cache tiering facets gone, "pool" facets are strictly about
--data-pool option now. Rename to "data-pool" and create symlinks
to a common directory.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Cache tiering facets have been a constant source of job timeouts
accompanied by "slow request" warnings on the OSDs for at least two
years. Same workloads pass without pool/small-cache-pool.yaml or
thrashers/cache.yaml.
See cache tiering deprecation note added in commit 535b8db33e ("doc:
deprecate the cache tiering").
Fixes: https://tracker.ceph.com/issues/63149
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
The current version is pretty useless:
- "rbd bench" writes the same byte (0xff) over and over again, so
almost all checksumming is in vain
- snapshots are taken in a steady state (i.e. not under I/O), so no
race conditions can get exposed
- even with these caveats, it's not wired up into the suite
Redo this workunit to be a reliable reproducer for the issue fixed
in the previous commit and wire it up for both krbd and rbd-nbd.
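
Why checksumming constant data is in vain can be shown with a small
sketch. It is illustrative only and unrelated to the actual workunit: the
block size and the image_digest helper are assumptions. If every block
holds the same byte, a checksum cannot tell whether blocks ended up in the
wrong place.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def image_digest(blocks):
    """MD5 over the concatenation of all blocks, in order."""
    return hashlib.md5(b"".join(blocks)).hexdigest()


# Every block is filled with 0xff, as when the same byte is written repeatedly.
constant = [b"\xff" * BLOCK_SIZE for _ in range(4)]
constant_swapped = [constant[2], constant[1], constant[0], constant[3]]

# Each block has distinct content, so its position matters.
unique = [bytes([i]) * BLOCK_SIZE for i in range(4)]
unique_swapped = [unique[2], unique[1], unique[0], unique[3]]

# Misplaced blocks go unnoticed with constant data ...
constant_detects_swap = image_digest(constant) != image_digest(constant_swapped)
# ... but are caught once the data varies from block to block.
unique_detects_swap = image_digest(unique) != image_digest(unique_swapped)
```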
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
qemu-utils is usually pre-installed but, due to what appears to be
an Ubuntu packaging bug, it's not upgraded when qemu-block-extra is
installed:
    The following NEW packages will be installed:
      qemu-block-extra
    The following packages will be upgraded:
      qemu-system-common qemu-system-data qemu-system-gui qemu-system-x86
However, the version of the block driver must match exactly the version
of the qemu-img tool, so the above leads to:
    $ qemu-img convert -f qcow2 -O raw /home/ubuntu/cephtest/qemu/base.client.0.0.qcow2 rbd:rbd/client.0.0
    Failed to initialize module: /usr/lib/x86_64-linux-gnu/qemu/block-rbd.so
    Note: only modules from the same build can be loaded.
    qemu: module block-block-rbd not found, do you want to install qemu-block-extra package?
    qemu-img: Unknown protocol 'rbd'
Fixes: https://tracker.ceph.com/issues/59431
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
This commit changes the format for encrypted disks to have
the child image and the parent image encrypted with different keys.
This is to allow testing of the new formatted clones feature in librbd/crypto.
Signed-off-by: Or Ozeri <oro@il.ibm.com>
The I/O workload in this test is xfstests (qa/run_xfstests_qemu.sh)
which isn't subjected to any timeout other than global max_job_time
limit in any other subsuite (e.g. qemu/workloads/qemu_xfstests.yaml).
But here, there is a parallel "op" workload defined as a workunit.
The workunit task has a default timeout of 3 hours which is effectively
imposed on the entire job. In the "rbd cache = false" configuration,
it's sometimes exceeded.
Fixes: https://tracker.ceph.com/issues/48038
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
It doesn't really thrash anything, just repeatedly restarts the
workload on top of a dirty cache file. rbd_pwl_cache_recovery is
more on point and gets covered by existing CODEOWNERS.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Add a thrash test for the persistent write log cache. It runs rbd bench
on a persistent write log cache, thrashes rbd bench, and tests the
recovery function of the persistent write log cache.
Signed-off-by: Yin Congmin <congmin.yin@intel.com>
The RWL mode needs DAX and is dog slow otherwise -- qemu_xfstests.yaml
job always hits the 6 hour max_job_time limit.
As our tmpfs instance is limited and qemu_xfstests.yaml opens three
images at the same time, reduce the "big cache" size to 5G. This facet
was added to iron out 32-bit head/tail pointer issues and 5G still does
the job there.
Going through the loop device is needed because tmpfs doesn't support
O_DIRECT.
Fixes: https://tracker.ceph.com/issues/55400
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
With mon (rbd_support mgr module in this case) command definitions
generated automatically by @CLI{Read,Write}Command decorator, it's
very easy to accidentally break the external facing API.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Add a test case with a size of 8GB so that problems that occur only in
scenarios above 4GB can be caught by this test. For example, a 32-bit
variable may end up with an unexpected value when it is combined with
a 64-bit value.
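
The class of problem meant here can be sketched with plain arithmetic.
This is an illustration only, not code from the cache: the helper
as_uint32 is hypothetical and models storing a 64-bit value in a 32-bit
unsigned variable.

```python
GIB = 1 << 30


def as_uint32(value):
    """Keep only the low 32 bits, as a 32-bit unsigned variable would."""
    return value & 0xFFFFFFFF


# Below 4 GiB the value survives unchanged.
small = as_uint32(3 * GIB)

# Above 4 GiB the high bits are silently dropped.
wrapped = as_uint32(5 * GIB)

# An offset of exactly 8 GiB wraps all the way around to zero.
wrapped_to_zero = as_uint32(8 * GIB)
```

Sizes below 4GB never exercise the dropped high bits, which is why such
bugs only show up in the larger test case.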
Signed-off-by: Yin Congmin <congmin.yin@intel.com>
Teuthology already defaults to quincy now and results in a failure
when trying to set to pacific. Additionally, drop the LUKS readbalance
test since it's unnecessary to duplicate that test.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
This commit adds new qemu xfstests workloads that run on top of librbd luks1/luks2 encryption.
This is currently done via nbd, instead of the qemu rbd driver.
Signed-off-by: Or Ozeri <oro@il.ibm.com>