We're getting the following error while initializing 64MB disks
on Windows Server 2019: "The disk is not large enough to support
a GPT partition style."
For this reason, we'll use MBR instead.
Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>
We're adding a test that:
* maps a configurable number of images
* runs a specified test - we're reusing the ones from stress_test,
making just a few minor changes to allow running the same test
multiple times
* restarts the ceph-rbd Windows service
* waits for the images to be reconnected and refreshes the mount
information
* reruns the test
* repeats the above workflow for a specified number of times,
reusing the same images
This test ensures that:
* mounted images are still available after a service restart
* drive letters are retained
* the image content is retained
* there are no race conditions when connecting or disconnecting
a large number of images in parallel
* the driver is capable of mapping a specified number of images
simultaneously
Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>
We're splitting the rbd-wnbd python test into separate files so
that the common code may easily be reused by other tests. This
also makes the code easier to read and maintain.
Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>
The "rbd-wnbd unmap" command is currently telling the WNBD driver
to remove the mapping without contacting the rbd-wnbd daemon
and waiting for it to perform its cleanup.
For this reason, attempting to delete the image immediately after
unmapping it can fail due to existing watchers.
As a temporary solution, we'll retry the image remove operation.
At a later time, we'll update the "rbd-wnbd unmap" command to go
through the rbd-wnbd daemon, ensuring that all the necessary
cleanup is performed before returning.
While at it, we're dropping a redundant LOG.error call so that we
won't print expected exceptions.
Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>
At the moment, the latency results are reported in nanoseconds.
In order to improve readability, we'll convert them to seconds.
While at it, we'll fix the fio duration report, which we're
mistakenly dividing by 1000 twice.
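The conversions can be sketched as below. This is only an illustration:
the helper names are hypothetical, and the assumption that the fio
duration is reported in milliseconds is ours, not something stated in
this change.

```python
NS_PER_SEC = 1_000_000_000
MS_PER_SEC = 1_000


def ns_to_seconds(value_ns):
    """Convert a latency value from nanoseconds to seconds."""
    return value_ns / NS_PER_SEC


def ms_to_seconds(value_ms):
    """Convert a duration from milliseconds to seconds.

    The division happens exactly once; dividing again would report a
    duration that is 1000 times too small.
    """
    return value_ms / MS_PER_SEC
```

For example, a mean latency of 2500000 ns becomes 0.0025 s, and a
30000 ms run is reported as 30 s.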
Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>
Added data regarding completion latency (ns) to the fio results
table, such as min, max, mean, stddev, etc.
Signed-off-by: Stefan Chivu <schivu@cloudbasesolutions.com>
We have a few Python rbd-wnbd tests that are invoked explicitly
by the ceph-build scripts [1].
There are a few issues with that:
* it's a separate repo that has to be updated whenever we add new
tests
* new tests that reside in the ceph repo will not be executed by
the PR check
* some tests may be missing in case of older branches
For this reason, we're adding a new script as part of the Ceph
repo that will take care of invoking the Windows rbd-wnbd tests.
The ceph-build script has already been updated accordingly [2].
[1] https://github.com/ceph/ceph-build/blob/main/scripts/ceph-windows/run_tests#L73-L80
[2] https://github.com/ceph/ceph-build/pull/2094
Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>
Co-Authored-By: Ionut Balutoiu <ibalutoiu@cloudbasesolutions.com>
We're adding a test for the newly introduced live resize feature.
It will simply extend/shrink the image, wait for the new size to
be picked up and then run FIO tests to validate the resized image.
While at it, we're fixing two unrelated linter warnings:
E275 missing whitespace after keyword
Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>
This test uses certain PS commands that attempt to display
a progress bar. However, this can cause issues when invoked
remotely (e.g. by the jenkins job).
For this reason, we're defining a helper (ps_execute) that runs
PS commands, disabling the progress bars and enabling the
non-interactive mode.
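A minimal sketch of such a helper, assuming it wraps powershell.exe
through subprocess. Only the ps_execute name comes from this change;
build_ps_command and the exact arguments are illustrative and may
differ from the actual implementation.

```python
import subprocess


def build_ps_command(script):
    """Build a powershell.exe argument list for unattended execution."""
    # Silence progress bars before running the caller's script.
    wrapped = "$ProgressPreference = 'SilentlyContinue'; " + script
    # -NonInteractive prevents PowerShell from prompting the user.
    return ["powershell.exe", "-NonInteractive", "-Command", wrapped]


def ps_execute(script, **kwargs):
    """Run a PowerShell script, raising on a non-zero exit code."""
    return subprocess.run(
        build_ps_command(script),
        check=True, capture_output=True, text=True, **kwargs)
```

Keeping the command construction separate from the execution makes it
possible to check the arguments on hosts without PowerShell.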
Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>
Certain FS related operations can fail, especially under load
(e.g. initializing partitions, volume formatting, etc).
For this reason, we're going to introduce some retries.
Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>
The following operations may fail right after a block device
is attached:
* retrieving the disk number (can return -1)
* opening the disk
* setting the disk online or writable
For this reason, we'll need to add some retries. For convenience,
we're moving the existing retry logic to a separate decorator.
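A retry decorator along these lines could look as follows. This is a
sketch under our own assumptions: the decorator name, its parameters
and the get_disk_number example are hypothetical and not taken from
the test itself.

```python
import functools
import time


def retry_decorator(timeout=60, retry_interval=2,
                    retried_exceptions=(Exception,)):
    """Retry the wrapped function until it succeeds or times out."""
    def wrapper(func):
        @functools.wraps(func)
        def inner(*args, **kwargs):
            deadline = time.monotonic() + timeout
            while True:
                try:
                    return func(*args, **kwargs)
                except retried_exceptions:
                    # Give up once another sleep would pass the deadline.
                    if time.monotonic() + retry_interval > deadline:
                        raise
                    time.sleep(retry_interval)
        return inner
    return wrapper


attempts = []


@retry_decorator(timeout=5, retry_interval=0.01,
                 retried_exceptions=(RuntimeError,))
def get_disk_number():
    """Simulate a disk number that only becomes available later."""
    attempts.append(1)
    if len(attempts) < 3:
        raise RuntimeError("disk number not available yet (-1)")
    return 4
```

Exceptions that are not listed in retried_exceptions propagate
immediately, so genuine failures are not hidden by the retries.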
Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>
Instead of trying to use the first partition, which may be reserved
by Windows, we'll fetch the first non-empty drive letter from
the disk that we've just mounted.
While at it, we're ensuring that the drive letter is actually a
letter and not a null character, which the PowerShell command
returns in case of empty drive letters.
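The drive letter check can be sketched as below. The function names
are hypothetical, and the input is assumed to be the list of drive
letter strings already parsed from the PowerShell output.

```python
import string


def is_drive_letter(value):
    """Return True only for a single ASCII letter."""
    return len(value) == 1 and value in string.ascii_letters


def first_drive_letter(candidates):
    """Return the first usable drive letter, or None if there is none."""
    for candidate in candidates:
        # Partitions without a drive letter show up as "\0" or "".
        candidate = candidate.strip()
        if is_drive_letter(candidate):
            return candidate.upper()
    return None
```

A reserved partition reported as "\0" is skipped, so the first letter
that is actually assigned is the one returned.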
Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>
The Windows rbd-wnbd python test performs various IO operations
against raw disks.
However, it can be useful to test overlaying filesystems as well.
For this reason, we're adding the following tests:
* RbdFsTest
* RbdFsFioTest
* RbdFsStampFioTest
To simplify the implementation, those tests reuse the existing
ones along with a mixin class (RbdFsTestMixin).
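The mixin approach can be illustrated with the simplified sketch
below. Only the RbdFsTestMixin and RbdFsTest names come from this
change; the base class, the method names and the recorded steps are
stand-ins chosen for illustration.

```python
class RbdTest:
    """Stand-in for an existing raw disk test (simplified)."""

    def __init__(self):
        self.steps = []

    def initialize(self):
        self.steps.append("map image")

    def run(self):
        self.steps.append("perform IO")


class RbdFsTestMixin:
    """Adds filesystem setup on top of any raw disk test."""

    def initialize(self):
        # Let the raw disk test map the image first.
        super().initialize()
        self.steps.append("init partition and format volume")


class RbdFsTest(RbdFsTestMixin, RbdTest):
    """Filesystem flavor of the raw disk test, with no extra code."""
```

Listing the mixin first in the bases lets its initialize() run the
original setup through super() and then append the filesystem steps.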
Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>
We'll make the following improvements to the Windows rbd-wnbd
Python test:
* expose fio write validation, defaulting to crc32c
* change the default fio operation to "rw"
* enable the disk and clear the "rw" flag only if required by the
test and if "--skip-enabling-disk" is not set (useful with custom
SAN policies). This operation can take a significant amount of
time under heavy load.
* print fio read and write results separately instead of
aggregating them, useful when running rw tests
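Reporting the read and write results separately can be sketched as
below. The function name is hypothetical, and the field names
("jobs", "read", "write", "io_bytes", "total_ios") are assumed to
match fio's JSON output; the sample data is made up for illustration.

```python
def summarize_fio(fio_output):
    """Sum fio job results per operation instead of merging them."""
    summary = {}
    for op in ("read", "write"):
        summary[op] = {
            "io_bytes": sum(job[op]["io_bytes"]
                            for job in fio_output["jobs"]),
            "total_ios": sum(job[op]["total_ios"]
                             for job in fio_output["jobs"]),
        }
    return summary


# Made-up sample resembling the output of a two-job "rw" run.
sample = {
    "jobs": [
        {"read": {"io_bytes": 4096, "total_ios": 1},
         "write": {"io_bytes": 8192, "total_ios": 2}},
        {"read": {"io_bytes": 4096, "total_ios": 1},
         "write": {"io_bytes": 0, "total_ios": 0}},
    ]
}
```

With the sample above, summarize_fio(sample) keeps the 8192 bytes read
and the 8192 bytes written as two distinct entries.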
Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>
Use `main` instead of `master` in the workunit scripts for the
Windows Teuthology job.
Signed-off-by: Ionut Balutoiu <ibalutoiu@cloudbasesolutions.com>
Due to the lack of Windows support in Teuthology, the test case adopts
the following workaround:
* Deploy a bare-metal machine with `ubuntu_latest.yaml` and
configure it with libvirt KVM.
* Create a libvirt VM and provision it with Windows Server 2019, using
the official ISO from Microsoft.
* Configure SSH in the Windows VM, and run the tests remotely via SSH.
The implementation of the test case consists of workunit scripts.
`qa/workunits/windows/test_rbd_wnbd.py` is the main Python script
that tests the basic functionality of Ceph on Windows. It is executed
in the libvirt VM configured with Windows Server 2019.
Co-authored-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>
Co-authored-by: Daniel Vincze <dvincze@cloudbasesolutions.com>
Signed-off-by: Ionut Balutoiu <ibalutoiu@cloudbasesolutions.com>