The ability to filter tests by attribute is provided by the
nose.plugins.attrib plugin, which wasn't being loaded by default.
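As a rough illustration of what attribute-based filtering does, the sketch below emulates the idea in plain Python without importing nose. The `attr` decorator and `select` helper are hypothetical stand-ins written for this example, not the plugin's own code, and the test names and attribute names are invented.

```python
def attr(*names):
    """Tag a test function with boolean attributes (hypothetical stand-in)."""
    def decorate(func):
        for name in names:
            setattr(func, name, True)
        return func
    return decorate


@attr("slow")
def test_big_upload():
    pass


@attr("smoke")
def test_small_upload():
    pass


def test_untagged():
    pass


def select(tests, wanted=None, unwanted=None):
    """Keep tests that carry `wanted` (if given) and do not carry `unwanted`."""
    chosen = []
    for test in tests:
        if wanted and not getattr(test, wanted, False):
            continue
        if unwanted and getattr(test, unwanted, False):
            continue
        chosen.append(test.__name__)
    return chosen


# Invented test collection used only for this illustration.
ALL_TESTS = [test_big_upload, test_small_upload, test_untagged]
```

Without a filtering step like `select`, every collected test runs regardless of its tags, which is the situation described above when the plugin is not loaded.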
Signed-off-by: Casey Bodley <cbodley@redhat.com>
whitelist_health.yaml -> ignorelist_health.yaml
whitelist_wrongly_marked_down.yaml -> ignore_wrongly_marked_down.yaml
This was mostly addressed in
2ee9365d0b,
but the rename wasn't done there.
Signed-off-by: Zack Cerza <zack@cerza.org>
These temporary files don't matter for test execution with teuthology
but they do matter for execution with vstart_runner.py since the test
fails if these files exist already. And tests are often run repeatedly
with vstart_runner.py, unlike with teuthology.
Fixes: https://tracker.ceph.com/issues/55719
Signed-off-by: Rishabh Dave <ridave@redhat.com>
All `rados/thrash-erasure-code-big` tests that die due to the “wait_for_recovery” timeout have one thing in common: They contain either `thrashers/pggrow` or `thrashers/mapgap`.
Unlike the non-offending thrashers (default, careful, fastread, and morepggrow), pggrow and mapgap lack an override setting for `osd max backfills`, the maximum number of backfill operations allowed to or from a single OSD. The higher the number, the quicker the recovery. By default, this value is 1. Each of the non-offending thrashers overrides that default in its .yaml file with a value > 1; pggrow and mapgap do not.
The mclock op scheduler is known to override `osd max backfills` with a high value, but all of the thrash-erasure-code-big thrashers have their op queue set to “debug_random”, which chooses randomly between op queues (the debug_random op queue is set to override the default mclock_scheduler in qa/config/rados.yaml). So, coupled with the “debug_random” op queue, the low `osd max backfills` setting is causing some tests to time out in recovery.
WITHOUT `osd max backfills`, as they are now, “mapgap” and “pggrow” tests die due to timed-out recovery about 17/100 times, as seen here with a pggrow test: http://pulpito.front.sepia.ceph.com/lflores-2022-05-18_14:24:29-rados:thrash-erasure-code-big-master-distro-default-smithi/
WITH `osd max backfills` specified, as I have suggested in this PR, 99/100 tests passed, with one test failing for a different reason:
http://pulpito.front.sepia.ceph.com/lflores-2022-05-17_22:40:27-rados:thrash-erasure-code-big-master-distro-default-smithi/
I also scheduled 145 tests WITH `osd max backfills` that are a mix of pggrow and mapgap thrashers. 144/145 tests passed, with one test failing for a different reason. http://pulpito.front.sepia.ceph.com/lflores-2022-05-17_15:27:54-rados:thrash-erasure-code-big-master-distro-default-smithi/
Fixes: https://tracker.ceph.com/issues/51076
Signed-off-by: Laura Flores <lflores@redhat.com>
rgw/qa: enable s3-tests related to cloud-transition feature
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: Maredia, Ali <amaredia@redhat.com>
Run cloudtier tests with parameter 'retain_head_object'
set to true and false.
However, having multiple cloudtier storage classes in the same task
increases the transition time and results in spurious failures.
Hence, until there is a consistent way of running the tests without
having to depend on lc_debug_interval, one of the configs is disabled
for now.
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
1. Method cluster() in ceph.py creates a dictionary "ctx.ceph", attaches
a namespace to ctx.ceph[cluster_name], creates an attribute "fsid" and
stores the Ceph cluster's FSID in it.
2. The method kernel_mount.KernelMount._get_debug_dir() uses that "fsid"
attribute to get the Ceph cluster's FSID. (The exact line that does that is
"fsid = self.ctx.ceph[cluster_name].fsid").
3. Test test_readahead.TestReadahead.test_flush() crashes with
vstart_runner.py because that test eventually calls _get_debug_dir()
and "ctx" in case of vstart_runner.py doesn't hold a "ceph" dictionary
or anything similar.
Adding a dictionary, similar to the one added in ceph.py, to
vstart_runner.LocalContext's instances will fix this issue.
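The shape of that fix can be sketched as follows. This is a minimal illustration, not the actual vstart_runner.py code: `LocalContextSketch`, `lookup_fsid`, and the sample FSID are hypothetical, and `SimpleNamespace` is a stand-in for whatever namespace object ceph.py attaches. Only the `ctx.ceph[cluster_name].fsid` access pattern comes from the description above.

```python
from types import SimpleNamespace


class LocalContextSketch:
    """Hypothetical stand-in for a local test context object."""

    def __init__(self, cluster_name, fsid):
        # Mirror the layout described for ceph.py: a dict keyed by cluster
        # name whose values are namespaces carrying an "fsid" attribute.
        self.ceph = {cluster_name: SimpleNamespace(fsid=fsid)}


def lookup_fsid(ctx, cluster_name):
    """Read the FSID the same way the quoted line in _get_debug_dir() does."""
    return ctx.ceph[cluster_name].fsid


# Made-up FSID, used only to show the lookup succeeding.
example_ctx = LocalContextSketch("ceph", "00000000-0000-0000-0000-000000000000")
```

With a context object that has no `ceph` attribute at all, the same lookup raises `AttributeError`, which matches the kind of crash described in step 3.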
Fixes: https://tracker.ceph.com/issues/55694
Signed-off-by: Rishabh Dave <ridave@redhat.com>
Removing the subvol support exposed a spurious argument to the status
command which was assigned to the 'subvol' parameter but was unused in
this command implementation.
The spurious argument is now removed.
Signed-off-by: Milind Changire <mchangir@redhat.com>
Don't rely on the ceph manager task to parse a config file. Each rgw
could be using a different config. Instead, revert to an s3tests
override called 'with-sse-s3'.
This way, the only job that enables sse-s3, vault_transit.yaml, contains
both the 'rgw crypt sse s3' configurables and the flag to enable the
associated test cases.
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Rationale: get and put now require both paths.
Also, testing of get and put without target paths
has been taken care of in other tests in class TestGetAndPut().
Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
Result of os.path.join() before this change was "./bin/./ceph-mds";
after it, the result is "./bin/ceph-mds".
Before -
2022-05-05 19:36:11,100.100 DEBUG:__main__:> ./bin/./ceph-mds -i a
After -
2022-05-05 19:38:48,179.179 DEBUG:__main__:> ./bin/ceph-mds -i a
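For reference, a small sketch of how joining path components produces each of the two forms seen in the log lines. It uses `posixpath` so the result is the same on any platform; the `bin_dir` value and the component strings are assumptions chosen to match those log lines, not values taken from vstart_runner.py.

```python
import posixpath

# Assumed binary directory, chosen to match the logged paths.
bin_dir = "./bin"

# join() keeps a leading "./" in a later component verbatim.
with_dot = posixpath.join(bin_dir, "./ceph-mds")

# A bare component name yields the cleaner path.
without_dot = posixpath.join(bin_dir, "ceph-mds")
```

Both strings refer to the same file; the difference only affects how the command appears in the debug output.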
Signed-off-by: Rishabh Dave <ridave@redhat.com>
The message regarding deletion of helper tools is printed for every
command. This message should be printed only when applicable.
Besides -
* Move XXX comments to _do_run() since it increases visibility of
these messages.
* Move the argument-omission logic to a new method to clear up the clutter.
* And remove shell as a parameter from _perform_checks_and_adjustments
since it's redundant.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
NOTE: Although most of the issues are fixed, a few function
and variable names are left unchanged in order to prevent
ambiguity and preserve their meaning.
They are:
- functions: setUp(), test_ls_H_prints_human_readable_file_size()
- variables: ls_H_output, ls_H_file_size
Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
With the introduction of range blocklist, the 'blocklist ls' command outputs
two lists. It's also straightforward to get the blocklisted clients directly
from 'osd dump', which avoids a regression.
Fixes: https://tracker.ceph.com/issues/55516
Signed-off-by: Jos Collin <jcollin@redhat.com>