Due to the lack of Windows support in Teuthology, the test case adopts
the following workaround:
* Deploy baremetal machine with `ubuntu_latest.yaml` and
configure it with libvirt KVM.
* Create a libvirt VM and provision it with Windows Server 2019, using
the official ISO from Microsoft.
* Configure SSH in the Windows VM, and run the tests remotely via SSH.
The implementation of the test case consists of workunit scripts.
`qa/workunits/windows/test_rbd_wnbd.py` is the main Python script
that tests basic Ceph on Windows functionality. It is executed in the
libvirt VM configured with Windows Server 2019.
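For illustration, a minimal sketch of that remote invocation using
paramiko (the hostname, credentials, and script flags here are
hypothetical placeholders, not the values used by the actual task):

```python
import paramiko

# Connect to the Windows Server 2019 libvirt VM over SSH.
# Host and credentials below are placeholders.
client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect("windows-vm.example", username="administrator",
               password="secret")

# Run the workunit remotely and surface its output and exit status.
_, stdout, stderr = client.exec_command(
    "python.exe test_rbd_wnbd.py --test-name RbdTest --iterations 4")
print(stdout.read().decode())
if stdout.channel.recv_exit_status() != 0:
    raise RuntimeError(stderr.read().decode())
client.close()
```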
Co-authored-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>
Co-authored-by: Daniel Vincze <dvincze@cloudbasesolutions.com>
Signed-off-by: Ionut Balutoiu <ibalutoiu@cloudbasesolutions.com>
Otherwise the tests may run forever. This was already done for
the mds upgrade sequence; this just adds it in the other two places here.
Related to: https://tracker.ceph.com/issues/53939
Signed-off-by: Adam King <adking@redhat.com>
qa/workunits/fs/misc/subvolume.sh is getting in the way of fs:workload
testing with subvolumes. Hence, move this script to a Python test.
Signed-off-by: Milind Changire <mchangir@redhat.com>
This commit removes orchestrator commands from the
Rook task and the Rook test suite because the Rook
orchestrator is not being maintained, and the Rook
orchestrator CLI is obsolete. This should also
clarify the issue:
https://tracker.ceph.com/issues/53680
Signed-off-by: Joseph Sawaya <jsawaya@redhat.com>
The default pids-limit (docker 4096/podman 2048) prevents some
customizations from working (HTTP threads on RGW) and limits the number
of LUNs per iSCSI target.
Fixes: https://tracker.ceph.com/issues/52898
Signed-off-by: Teoman ONAY <tonay@redhat.com>
Added mds daemons so that the test can create
CephFS pools and set options using
`do_set_pool()` in FSCommand.cc. This lets us
cover corner cases like the one in
https://tracker.ceph.com/issues/54263
Signed-off-by: Kamoltat <ksirivad@redhat.com>
A new job that doesn't want ms_mode to be set underneath it is about to
be added. Rename rxbounce to ms_modeless to make this purpose obvious.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
It's more appropriate to use --subset to reduce the scheduling size. It
was previously laid out this way because we wanted to link to the common
`qa/cephfs/mount` directory so that ceph-fuse mounts are not needlessly
multiplied. We should just organize it correctly so that this is not
an issue.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
* refs/pull/42000/head:
qa: update rhel kclient to setup container tools
qa: stop overriding distro for k-testing
qa: only use RHEL for workload testing
qa: convert fs:workload to use cephadm
qa: split fs begin task
qa/tasks/cephadm: setup CephManager when OSDs are provisioned
qa/tasks/cephadm: setup file system if MDS are provisioned
Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Accessing a single dentry can be sped up by setting this option to
false when the directory is not in memory.
Signed-off-by: "Shen, Hang" <shenhang@kuaishou.com>
Use the override in ./src/qa/distros/container-hosts/ubuntu_20.04.yaml
in order to use the HWE kernel for Ubuntu 20.04.
This is because the Ubuntu 20.04 kernel (5.4) has a bug that prevents
nvme-loop from being used.
see https://lkml.org/lkml/2020/9/21/1456
Fixes: https://tracker.ceph.com/issues/54094
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
For basic, rbd and rbd-nomount subsuites, replace legacy and crc
facets with "legacy or legacy+rxbounce" and "crc or crc+rxbounce"
facets (chosen at random).
For fsx, singleton and thrash subsuites, add legacy+rxbounce and
crc+rxbounce facets and drop prefer-crc facet. The expected behaviour
of the latter depends on cluster configuration and should be tested
separately.
The total number of jobs remains the same.
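For reference, a sketch of what a crc+rxbounce facet amounts to at map
time (the image name is a placeholder):

```python
import subprocess

# Map a krbd test image with msgr2 in CRC mode plus a bounce buffer
# for receives -- the combination the crc+rxbounce facet exercises.
subprocess.run(
    ["rbd", "device", "map", "-t", "krbd",
     "-o", "ms_mode=crc,rxbounce", "rbd/testimg"],
    check=True)
```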
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
So links can be elsewhere in the qa suite (not used yet) and to simplify
a find command in a follow-up commit.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
It's not useful testing workloads with different distributions; it just
adds to the maintenance burden of this qa suite as distro upgrades often
break compilation of various tests.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Note: it's important to keep the install task which supplies packages
needed for some workloads.
Fixes: https://tracker.ceph.com/issues/51333
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
With mon (rbd_support mgr module in this case) command definitions
generated automatically by the @CLI{Read,Write}Command decorators, it's
very easy to accidentally break the external-facing API.
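For context, a schematic handler (the module and command names are made
up): the mon command description, including argument names and types,
is derived from the method signature, so a careless signature change
silently changes the external API.

```python
from typing import Tuple

from mgr_module import CLIReadCommand, MgrModule


class Example(MgrModule):
    @CLIReadCommand('example image status')
    def image_status(self, pool_name: str,
                     image_name: str) -> Tuple[int, str, str]:
        """Show the status of an image (illustrative only)."""
        # The command signature exposed via 'ceph' is generated from
        # the parameter names and annotations above.
        return 0, f'{pool_name}/{image_name}: ok', ''
```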
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Currently, every rados run of ~400 jobs is running ~150 cephadm tests,
which is unnecessary and redundant. With this change, we will run some
basic cephadm tests within the rados suite. The following seems to be
a good start.
qa/suites/rados/cephadm/osds
qa/suites/rados/cephadm/smoke
qa/suites/rados/cephadm/smoke-singlehost
qa/suites/rados/cephadm/workunits
Signed-off-by: Neha Ojha <nojha@redhat.com>
On rhel/centos the ceph user does not have permission
to access these certs, which leads to s3-test failures
in teuthology.
Signed-off-by: Ali Maredia <amaredia@redhat.com>
Disable the sending of async datalog notifications on one zone per
cluster. This helps to verify that tests don't rely on notifications to
succeed.
Signed-off-by: Casey Bodley <cbodley@redhat.com>
qa/suites/orch/cephadm: Also run the rbd/iscsi suite
Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Melissa Li <mingkli@redhat.com>
Set and unset the noautoscale flag and
evaluate whether the results are what
we expect. Also, evaluate whether the
flag is correct when we create new pools.
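A condensed sketch of the kind of check involved, assuming the
`osd pool set/unset/get noautoscale` commands exercised here:

```python
import subprocess

def ceph(*args: str) -> str:
    # Helper: run a ceph CLI command and return its stdout.
    return subprocess.run(["ceph", *args], check=True,
                          capture_output=True, text=True).stdout

ceph("osd", "pool", "set", "noautoscale")
print(ceph("osd", "pool", "get", "noautoscale"))  # expect: set
# Pools created while the flag is set should have autoscaling off.
ceph("osd", "pool", "create", "newpool")
print(ceph("osd", "pool", "autoscale-status"))
ceph("osd", "pool", "unset", "noautoscale")
print(ceph("osd", "pool", "get", "noautoscale"))  # expect: unset
```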
Signed-off-by: Kamoltat <ksirivad@redhat.com>
This commit adds testing for the drive_group_loop in the Rook orchestrator
that reapplies drive groups that were applied previously.
This test removes an OSD, zaps the underlying device, then waits for the OSD
to be re-created by the drive_group_loop.
This commit also updates the rook test suite to test v1.7.2 instead of v1.7.0,
since `orch device zap` is only supported from v1.7.2 onwards.
Fixes: https://tracker.ceph.com/issues/53501
Signed-off-by: Joseph Sawaya <jsawaya@redhat.com>
mgr/cephadm: store container registry credentials in config-key
Reviewed-by: Sage Weil <sage@newdream.net>
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
mgr/cephadm: Add client.admin keyring when upgrading from older version
Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Michael Fritch <mfritch@suse.com>
Reviewed-by: Sage Weil <sage@newdream.net>
* refs/pull/43936/head:
qa/tasks/cephadm: pull image to all hosts in parallel
qa/tasks/cephadm: add hosts via mon remote
qa/tasks/cephadm: use shortname for remote directory
qa/tasks/cephadm: deploy no more than 5 mons in roleless mode
qa/tasks/radosbench: default clients to all clients (not client.0)
qa/tasks/ceph_manager: parallelize flush_pg_stats()
qa/suites/big: remove thrasher
qa/suites/big: update for cephadm
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
* refs/pull/43974/head:
qa: disable metrics on kernel client during upgrade
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
v16.2.4 MDS triggers an assert from these messages.
Also: add latest pacific for extra coverage.
Fixes: https://tracker.ceph.com/issues/53293
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
After https://github.com/ceph/ceph/pull/42526 and https://github.com/ceph/ceph/pull/43725 merged,
the following files no longer exist, but there were still references to them:
- src/pybind/mgr/dashboard/services/ganesha.py
- qa/tasks/mgr/dashboard/test_ganesha.py
The following files were renamed, but there were still references to their old names:
- src/pybind/mgr/dashboard/controllers/nfsganesha.py: nfsganesha.py --> nfs.py
- src/pybind/mgr/dashboard/tests/test_ganesha.py: test_ganesha.py --> test_nfs.py
Other changes in qa/suites/rados/dashboard/tasks/dashboard.yaml:
- Add missing task: tasks.mgr.dashboard.test_api
- Sort dashboard tasks alphabetically.
Fixes: https://tracker.ceph.com/issues/53123
Signed-off-by: Alfonso Martínez <almartin@redhat.com>
* refs/pull/43894/head:
qa/suites/orch/cephadm: verify that 'orch ls' reports OSDs properly
mgr/cephadm: show unmanaged OSDs under 'osd' service
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
librbd/cache/pwl/ssd: make log entry 64 bit and add ssd version control
Reviewed-by: Mykola Golub <mykola.golub@clyso.com>
Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
Add a test case whose size is 8GB, so that problems that occur
only in test scenarios above 4GB may be found by this test. For example,
a 32-bit variable may hold an unexpected value when it is used in an
operation with a 64-bit value.
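A toy illustration of the failure mode (pure Python, not the test
itself):

```python
# 8 GiB does not fit in 32 bits: the low 32 bits of 2^33 are all zero,
# so a 32-bit variable would silently truncate the test size to 0.
size = 8 * 1024 ** 3
low32 = size & 0xFFFFFFFF
print(size, low32)  # 8589934592 0
```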
Signed-off-by: Yin Congmin <congmin.yin@intel.com>
* refs/pull/43046/head:
mgr/rook: get running pods, auth rm, better error checking for orch nfs
qa/tasks/rook: add apply nfs to rook qa task
mgr/rook: prevent creation of NFS clusters not in .nfs rados pool
mgr/rook, mgr/nfs: update rook orchestrator to create and use .nfs pool
Reviewed-by: Juan Miguel Olmo <jolmomar@redhat.com>
Reviewed-by: Varsha Rao <rvarsha016@gmail.com>
* refs/pull/42520/head:
test: add cephfs-mirror HA active/active workunit and test yamls
test: add cephfs_mirror thrasher
tasks/cephfs_mirror: optionally run in foreground
mgr/mirroring: throttle directory reassigment to mirror daemons
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
This commit adds apply nfs to the rook qa task to see if the
command runs with no errors; it doesn't actually check if
an NFS daemon was created.
Signed-off-by: Joseph Sawaya <jsawaya@redhat.com>
* refs/pull/43827/head:
qa/suites/orch/cephadm: add repave-all test case
mgr/cephadm/services/osd: less noisy
mgr/cephadm/services/osd: do not log ok-to-stop/safe-to-destroy failures
mgr/orchestrator: clean up 'orch osd rm status'
Reviewed-by: Adam King <adking@redhat.com>
To fix failures like:
Failure Reason:
Command failed on smithi085 with status 1: 'sudo yum -y install ceph-volume'
Signed-off-by: Neha Ojha <nojha@redhat.com>
This commit implements `orch apply rbd-mirror` in the rook orchestrator.
It creates a CR with a default name if the service_id isn't specified in
the spec; otherwise it sets the name of the CR to the service_id in the
spec. This commit also adds `orch apply rbd-mirror` to the rook QA and
implements `orch rm rbd-mirror`.
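Roughly, the two invocation styles look like this (placement and
service_id values are illustrative):

```python
import subprocess

# Without a service_id in the spec, the CR gets a default name.
subprocess.run(["ceph", "orch", "apply", "rbd-mirror"], check=True)

# With a service_id, the CR is named after it.
spec = """\
service_type: rbd-mirror
service_id: mymirror
placement:
  count: 1
"""
subprocess.run(["ceph", "orch", "apply", "-i", "-"],
               input=spec, text=True, check=True)

# Removal, as also implemented in this commit.
subprocess.run(["ceph", "orch", "rm", "rbd-mirror.mymirror"], check=True)
```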
Signed-off-by: Joseph Sawaya <jsawaya@redhat.com>
I think we have enough coverage. Always testing all
objectstores is a bit excessive in my opinion.
Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
Adds the ability to zap OSD devices after removal, implemented as a flag
on the 'orch osd rm' command.
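Usage would look something like this (the OSD id is a placeholder):

```python
import subprocess

# Remove OSD 0 and zap its underlying device once removal completes.
subprocess.run(["ceph", "orch", "osd", "rm", "0", "--zap"], check=True)

# Track the progress of the removal and zap.
subprocess.run(["ceph", "orch", "osd", "rm", "status"], check=True)
```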
Fixes: https://tracker.ceph.com/issues/43692
Signed-off-by: Cory Snyder <csnyder@iland.com>
* refs/pull/43510/head:
qa/suites/orch/cephadm/upgrade: smoke test for 'orch upgrade ls'
mgr/cephadm: make upgrade ls output structured
mgr/cephadm: add 'orch upgrade ls' to list available versions
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
pybind/mgr/cephadm: set allow_standby_replay during CephFS upgrade
Reviewed-by: Sage Weil <sage@newdream.net>
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
* refs/pull/43049/head:
mgr/rook: apply mds using placement spec and osd_pool_default_size
mgr/rook: factor out replica/failureDomain calc
Reviewed-by: Juan Miguel Olmo <jolmomar@redhat.com>
Add the host section of the cluster creation workflow.
1. Fix a bug in the modal where going forward one step on the wizard and
coming back opens up the add host modal.
2. Rename Create Cluster to Expand Cluster as per the discussions.
3. Add a skip confirmation modal to warn the user when they try to skip
the cluster creation.
4. Adapt all the tests.
5. Make some UI improvements like fixing and aligning the styles and
colors.
- Use a routed modal for the host addition form.
- Rename Create to Add in the host form.
Fixes: https://tracker.ceph.com/issues/51517
Fixes: https://tracker.ceph.com/issues/51640
Fixes: https://tracker.ceph.com/issues/50336
Fixes: https://tracker.ceph.com/issues/50565
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
Signed-off-by: Nizamudeen A <nia@redhat.com>
This commit has been causing scheduled jobs to request e.g. aarch64
smithi machines, which don't exist. The dispatcher then tries to find
them forever, requiring the dispatcher to be killed and restarted. The
queue will sit idle until someone notices the problem.
Signed-off-by: Zack Cerza <zack@redhat.com>
This commit changes the apply_mds command in the rook orchestrator
to support some placement specs and also sets the replica size according
to the osd_pool_default_size ceph option.
This commit also adds `orch apply mds` to the QA to test if the command
runs.
Signed-off-by: Joseph Sawaya <jsawaya@redhat.com>
* refs/pull/43163/head:
qa: fsync dir for asynchronous creat on stray tests
qa: refactor and generalize create_n_files
qa: only set frag confs for workloads
mds: improve debugging for fragment size check
Reviewed-by: Ramana Raja <rraja@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
modified: qa/standalone/erasure-code/test-erasure-code-plugins.sh
new file: qa/suites/rados/thrash-erasure-code-isa/arch/aarch64.yaml
Signed-off-by: Dai Zhiwei <daizhiwei3@huawei.com>
This also checks that max_mds>1 and allow_standby_replay are restored
to their previous values.
Future work can add tests for multiple file systems (or volumes).
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
This commit changes `orch apply rgw` to use the osd_pool_default_size
when setting the replication size for the data pool and metadata pool
of the rgw daemon. This commit also adds `orch apply rgw` to the Rook
QA.
Signed-off-by: Joseph Sawaya <jsawaya@redhat.com>
Currently, to recover a file system after recovering the monitor store,
you need to stop all the MDSs; create an FSMap with defaults using the
`fs new` command; execute the `fs reset` command to get the file
system's rank 0 into the existing-but-failed state; and then restart
the MDSs.
Add a 'recover' flag to the `fs new` command that sets the file
system's rank 0 to the existing-but-failed state and sets the file
system's 'joinable' setting to False. Using the `fs new` command with
the 'recover' flag gets rid of the steps to stop all the MDSs and
execute the `fs reset` command when recovering the file system after
recovering the monitor store.
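With the flag, the recovery sequence reduces to something like the
following sketch (file system and pool names are placeholders):

```python
import subprocess

def ceph(*args: str) -> None:
    subprocess.run(["ceph", *args], check=True)

# Recreate the FSMap entry with rank 0 left in the failed state and
# the file system marked not joinable; MDSs need not be stopped first.
ceph("fs", "new", "cephfs", "cephfs_metadata", "cephfs_data",
     "--recover", "--force")
# ... run any repair steps, then allow MDSs to join rank 0:
ceph("fs", "set", "cephfs", "joinable", "true")
```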
Fixes: https://tracker.ceph.com/issues/51716
Signed-off-by: Ramana Raja <rraja@redhat.com>
IMO the amount of symlinks we have to manually maintain
is tedious and error prone. Any ideas on improving things?
Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
Force a subset of tests that explicitly employ the filestore backend to
use the WPQ scheduler. This is because the mclock scheduler will not be
optimized for filestore.
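The override amounts to pinning the scheduler for those jobs, roughly
the equivalent of the following (a sketch, not the exact yaml fragment
used):

```python
import subprocess

# Pin the OSD op queue scheduler to WPQ; takes effect on OSD restart.
subprocess.run(["ceph", "config", "set", "osd", "osd_op_queue", "wpq"],
               check=True)
```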
Fixes: https://tracker.ceph.com/issues/52025
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
* refs/pull/42687/head:
qa: test the "ms_mode" options in kclient workflows
Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Not really fixing anything, but this moves the failures out of the
normal upgrade suite.
Fixes: https://tracker.ceph.com/issues/49955
Signed-off-by: Casey Bodley <cbodley@redhat.com>