Notably, this recovery procedure was missing scan_links.
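For context, a hedged sketch of the recovery sequence with the missing
step included (the pool name and run() call site are illustrative):

    def run_data_scan(remote, data_pool):
        # the data-scan recovery steps, ending with scan_links
        for args in (['cephfs-data-scan', 'scan_extents', data_pool],
                     ['cephfs-data-scan', 'scan_inodes', data_pool],
                     ['cephfs-data-scan', 'scan_links']):
            remote.run(args=args)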
Also, the test was oddly trying to recover the real file system in
addition to the recovery file system. I've removed that unnecessary
recovery.
Fixes: https://tracker.ceph.com/issues/57598
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Old Ceph releases, such as nautilus, use "blacklist" instead of
"blocklist".
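A minimal sketch of how a test might cope, assuming the rename landed
around Pacific (the release tuple is illustrative):

    def osd_blocklist_ls(remote, release):
        # pre-Pacific releases only know the old spelling
        op = 'blacklist' if release in ('mimic', 'nautilus', 'octopus') \
            else 'blocklist'
        remote.run(args=['ceph', 'osd', op, 'ls'])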
Fixes: https://tracker.ceph.com/issues/56529
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Stop importing CommandFailedError from teuthology.orchestra.run; it is
actually defined in teuthology.exceptions.
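The corrected import is simply:

    from teuthology.exceptions import CommandFailedError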
Fixes: https://tracker.ceph.com/issues/51226
Signed-off-by: Rishabh Dave <ridave@redhat.com>
* refs/pull/40431/head:
qa/cephfs: remove create_keyring_file from cephfs_test_case.py
qa/cephfs: don't use sudo to write files in /tmp
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Since teuthology.orchestra.remote.mktemp() can write a temporary file
and not just create it, create_keyring_file() is now redundant.
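Hypothetical usage based on that description (the exact keyword
argument name is an assumption):

    def write_keyring(remote, keyring_contents):
        # mktemp() both creates the file and writes its contents
        return remote.mktemp(data=keyring_contents)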
Signed-off-by: Rishabh Dave <ridave@redhat.com>
Files in /tmp cannot be written by any user (including the root user)
other than the file owner, even if the permission mode on the file is
777.
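A minimal reproduction sketch, assuming the kernel's protected_regular
behavior is what bites here (the path is hypothetical; run as a user
who does not own the file):

    import os

    def try_write(path='/tmp/file_owned_by_someone_else'):
        try:
            # O_CREAT on an existing file in the sticky /tmp is denied
            # to non-owners when fs.protected_regular is enabled
            fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o777)
            os.close(fd)
            print('write allowed')
        except PermissionError as e:
            print('denied despite mode 777: %s' % e)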
Fixes: https://tracker.ceph.com/issues/49466
Signed-off-by: Rishabh Dave <ridave@redhat.com>
Instead of stopping and individually failing MDS daemons, just fail
the ranks or the entire file system, where possible.
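A hedged sketch of the two options (raw_cluster_cmd is teuthology's
generic "ceph ..." runner; the call sites are illustrative):

    def fail_fs(fs):
        # fail every rank of one file system at once
        fs.mon_manager.raw_cluster_cmd('fs', 'fail', fs.name)

    def fail_rank(fs, rank):
        # or fail a single rank without stopping its daemon first
        fs.mon_manager.raw_cluster_cmd('mds', 'fail',
                                       '{}:{}'.format(fs.name, rank))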
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
* refs/pull/38684/head:
qa: add _check_scrub_status helper to simplify the code
qa: add run_scrub helper in filesystem class
qa: add get_scrub_status helper in filesystem class
qa: wait the scrub task to complete
qa: remove passed_validation check for test_damage
qa: move wait_until_scrub_complete helper to filesystem class
mds: simplify the C_MDS_EnqueueScrub finish code
Reviewed-by: Rishabh Dave <ridave@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Modify CephManager.run_cluster_cmd() to accept command arguments as a
string as well, since typing commands as strings is much less effort
than typing them as a list. This brings the interface a step closer to
teuthology.orchestra.remote.run()'s interface, since it too can accept
command arguments as a string.
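A hedged sketch of the normalization, not the exact implementation:

    import shlex

    def run_cluster_cmd(self, args, **kwargs):
        if isinstance(args, str):
            # "ceph fs status" -> ['ceph', 'fs', 'status']
            args = shlex.split(args)
        return self.controller.run(args=args, **kwargs)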
The change in cephfs_test_case.py is just to allow testing this PR
locally and on teuthology.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
During the ceph task unwind, the MDS daemons are stopped. If any file
system still exists, we will see failover messages in the cluster log.
Fixes: https://tracker.ceph.com/issues/49510
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Also filter out client IDs starting with "mirror" when cleaning
leftover auth IDs, since teuthology is configured to create
client.mirror and client.mirror_remote clients before executing
mirroring tests.
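The filter amounts to roughly:

    def auth_ids_to_clean(auth_ids):
        # leave client.mirror and client.mirror_remote alone
        return [i for i in auth_ids if not i.startswith('mirror')]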
Signed-off-by: Venky Shankar <vshankar@redhat.com>
A trivial "find" command on a large directory hierarchy will cause the
client to acquire caps significantly faster than it releases them. The
MDS will try to have the client reduce its caps below the
mds_max_caps_per_client limit, but the recall throttles prevent it from
catching up to the pace of acquisition. The solution is to throttle
readdir requests from the client, which is what this patch does.
Readdir is throttled on the condition that the number of caps acquired
exceeds a certain percentage of mds_max_caps_per_client (default: 10%)
and cap acquisition via readdir exceeds a certain percentage of
mds_max_caps_per_client (default: 50%). When this condition is met, the
readdir request is retried after
'mds_cap_acquisition_throttle_retry_request_timeout' (default: 0.5)
seconds.
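An illustrative restatement of that condition in Python (the real
implementation is in the C++ MDS; the parameter names are made up for
clarity):

    def should_throttle_readdir(caps_acquired, readdir_acquisition,
                                mds_max_caps_per_client,
                                caps_ratio=0.10, acquisition_ratio=0.50):
        return (caps_acquired > caps_ratio * mds_max_caps_per_client and
                readdir_acquisition >
                    acquisition_ratio * mds_max_caps_per_client)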
Fixes: https://tracker.ceph.com/issues/47307
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Right now, only client IDs are stashed and restored, but with the
recent changes (specifically, the addition of more attributes to mount
objects) this is not enough. Saving and restoring these details before
and after tests, respectively, ensures that mount commands run
smoothly. Not doing this typically leads to a mount command failure for
the second test in the test suite under execution, since the client IDs
are saved and restored in CephFSTestCase.setUp and
CephFSTestCase.tearDown respectively, but the rest of the details are
not.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
Add a test suite for testing authorization on a Ceph cluster with
multiple file systems, and enable it to run under the Teuthology
framework. Also add the helper methods required to set up the test
environment for multi-FS tests.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
The 'mon_allow_pool_delete' option is set to 'True' in the 'setUp' of
'TestVolumes' and is cleared in the corresponding 'tearDown' function.
Hence, any pool deletion in parent classes such as 'CephFSTestCase'
would fail. This patch fixes that by setting the
'mon_allow_pool_delete' option in 'CephFSTestCase' itself.
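Conceptually, the fix boils down to (the config_set helper name is an
assumption):

    def setUp(self):
        super().setUp()
        # enable pool deletion for every CephFSTestCase subclass
        self.config_set('mon', 'mon_allow_pool_delete', True)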
Fixes: https://tracker.ceph.com/issues/46597
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t e %P %I %h
If the first character of this file is a pipe symbol (|), then the
remainder of the line is interpreted as the command line for a
user-space program (or script) that is to be executed. For more detail,
see: http://man7.org/linux/man-pages/man5/core.5.html
In this case, just skip cleaning the core dumps.
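The check is straightforward, for example:

    def cores_are_piped():
        # a leading '|' means a userspace helper consumes the cores,
        # so there are no core files on disk to clean up
        with open('/proc/sys/kernel/core_pattern') as f:
            return f.read().strip().startswith('|')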
Fixes: https://tracker.ceph.com/issues/45530
Signed-off-by: Xiubo Li <xiubli@redhat.com>
New, unwritten files fail backtracing during scrub. This is not
necessarily bad, so flag such failures as okay and continue with the
other entries.
Fixes: https://tracker.ceph.com/issues/43543
Signed-off-by: Milind Changire <mchangir@redhat.com>
We should wait for the mountpoint to become ready; especially for a
FUSE mountpoint, it may sometimes take a few seconds.
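A hedged sketch of such a wait (the is_mounted() check is an
assumption):

    import time

    def wait_until_mounted(mount, timeout=30):
        for _ in range(timeout):
            if mount.is_mounted():
                return
            time.sleep(1)
        raise RuntimeError('mountpoint did not become ready')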
Fixes: https://tracker.ceph.com/issues/44044
Signed-off-by: Xiubo Li <xiubli@redhat.com>
A first step toward more automatic code checks on the qa/
directory. This is useful while transitioning to python3.
Also move log_exc to the top level to avoid:
error: Argument 1 to "log_exc" has incompatible type
"Callable[[OSDThrasher], Any]"; expected "OSDThrasher"
Signed-off-by: Thomas Bechtold <tbechtold@suse.com>
This provides a generic framework for making Ceph configuration
changes in tests through the monitors rather than the asok interface or
local ceph.conf changes. Any changes are reverted during test teardown.
A future patch will convert existing tests manipulating the local
ceph.conf or admin socket.
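A hedged sketch of the idea (the method names are illustrative):

    def config_set(self, section, key, value):
        # route the change through the monitors...
        self.run_cluster_cmd('config set {} {} {}'.format(section, key,
                                                          value))
        self._config_changes.append((section, key))

    def tearDown(self):
        # ...and revert it when the test finishes
        for section, key in self._config_changes:
            self.run_cluster_cmd('config rm {} {}'.format(section, key))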
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
"fs fail" will only fail the MDS that are part of the file system which
will generally allow us to avoid spurious MDS_INSUFFICIENT_STANDBY
warnings. Further, only restart the MDS, there's no reason to leave them
offline.
Fixes: https://tracker.ceph.com/issues/43514
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Move it up into CephTestCase so that mgr tests can
use it too, and pick it up in vstart_runner.py so
that these tests will work neatly there.
Signed-off-by: John Spray <john.spray@redhat.com>
Add support for testing recovery of CephFS metadata into an alternate
RADOS pool, useful as a disaster recovery mechanism that avoids
modifying the metadata in-place.
Signed-off-by: Douglas Fuller <dfuller@redhat.com>