When stray directory inodes are corrupted, the MDS may go into the
damaged state after becoming active (MDCache::open_root/populate_mydir
is called by MDSRank::starting_done).
Fixes: #14196
Signed-off-by: Yan, Zheng <zyan@redhat.com>
Use the Mount.* wrappers for filesystem operations,
so that changes like making run_shell use sudo just work.
Signed-off-by: John Spray <john.spray@redhat.com>
This was causing permissions issues when
running inside teuthology, as run_python
was using sudo and run_shell wasn't.
It would be nice to get rid of all the rootishness,
but for the moment just make it more uniform.
This tests the forward scrub's ability to traverse
some metadata and tag it, and the corresponding
functionality in cephfs-data-scan to filter based
on tag and inject orphaned items.
Signed-off-by: John Spray <john.spray@redhat.com>
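Roughly, the flow under test is this (a hedged sketch: the asok tag
command and the data-scan wrapper/arguments are assumptions based on
the description above):

    # Forward scrub traverses the metadata and tags it
    self.fs.mds_asok(["tag", "path", "/subdir", "mytag"])

    # cephfs-data-scan then only considers objects carrying the tag
    data_pool = self.fs.get_data_pool_name()
    self.fs.data_scan(["scan_extents", "--filter-tag", "mytag", data_pool])
    self.fs.data_scan(["scan_inodes", "--filter-tag", "mytag", data_pool])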
Use named error codes instead of bare numbers, and
use the helper function for getting the inode number
instead of doing it by hand.
Signed-off-by: John Spray <john.spray@redhat.com>
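For example (a sketch; path_to_ino is assumed to be the inode-number
helper in question):

    import errno

    # Named error code instead of a bare 2
    self.assertEqual(e.errno, errno.ENOENT)

    # Helper instead of parsing `ls -i` output by hand
    ino = self.mount_a.path_to_ino("testfile")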
This was previously using a bunch of files and a small
MDCache limit to force things out of cache. It is much
simpler to just drop the journal.
Signed-off-by: John Spray <john.spray@redhat.com>
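Something like this (a sketch; journal_tool here stands for a wrapper
around cephfs-journal-tool, and the helper name is an assumption):

    # Rather than pushing entries out of cache with many files and a
    # small MDCache limit, discard the journal so the metadata has to
    # come from the backing store.
    self.fs.journal_tool(["journal", "reset"])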
...specifically that we don't have lingering
MDS sessions after running it. This is testing
that Client::shutdown is doing the right thing
and closing sessions.
Signed-off-by: John Spray <john.spray@redhat.com>
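The check amounts to something like this (a sketch using the MDS admin
socket; helper names follow the usual test-case conventions):

    # After the client has shut down, no sessions should remain
    self.mount_a.umount_wait()
    ls_data = self.fs.mds_asok(["session", "ls"])
    self.assertEqual(len(ls_data), 0)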
A quick check that clients refuse to mount
when daemons are laggy, and while we're at it,
that the basics of failover work. It's a trivial
test, but it's nice to have this kind of thing
so that we don't have to wait for weird thrasher
failures if something breaks.
Signed-off-by: John Spray <john.spray@redhat.com>
To get the health warning, first we need to make sure requests are
added to the session's completed request list. Then we need to send an
extra request to the MDS to trigger the code that generates the warning.
Fixes: #13437
Signed-off-by: Yan, Zheng <zyan@redhat.com>
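In test terms the reproduction is roughly (a sketch: create_n_files
and wait_for_health are the usual test helpers, and the warning text is
paraphrased from the bug; treat all three as assumptions):

    # Fill the session's completed request list...
    max_requests = 1000
    self.mount_a.create_n_files("testdir/file", max_requests)

    # ...then one extra request makes the MDS run the check that
    # generates the warning.
    self.mount_a.run_shell(["touch", "testdir/extra"])
    self.wait_for_health("failing to advance its oldest client/flush tid", 30)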
FuseMount only uses the prefix for finding the 'ceph'
executable, which is in ./ for either cmake or
autotools builds, not in ./src for cmake like other binaries.
Signed-off-by: John Spray <john.spray@redhat.com>
It was trying to get the output file from
a different remote than the one used to
run the journal tool.
Signed-off-by: John Spray <john.spray@redhat.com>
This is to allow running CephFSTestCase tests
against a vstart cluster, for much faster turnaround
during development than running teuthology against
built ceph packages.
Not everything will be runnable this way, but for
certain things like filesystem repair scenarios we
have everything we need within a vstart environment.
Signed-off-by: John Spray <john.spray@redhat.com>
For tests to advertise that they need the client
to be able to trim its cache (i.e. currently that
means the test must run as root)
Signed-off-by: John Spray <john.spray@redhat.com>
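The mechanism can be as small as a tagging decorator (a sketch; the
attribute name is an assumption):

    def needs_trimming(f):
        # Tag the method; the runner checks this flag and skips the
        # test when the client can't trim (e.g. not running as root).
        f.needs_trimming = True
        return f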
A means for test cases to mark particular methods
as long running, so that the vstart runner can skip
them when running for developers.
This is not a scientific thing: anything that takes
more than about two minutes due to lots of iteration
or sleeps counts as long running.
Signed-off-by: John Spray <john.spray@redhat.com>
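Same tagging pattern as needs_trimming above, plus the filter the
runner might apply (a sketch; names are assumptions):

    def long_running(f):
        f.long_running = True
        return f

    def should_run(method, include_long_running):
        # The vstart runner leaves long-running methods out of quick
        # developer runs unless explicitly asked to include them.
        return include_long_running or not getattr(method, "long_running", False)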
In teuthology this isn't needed because we join the
mds child processes after killing them. In vstart
we're killing them asynchronously, so be a bit more
careful to ensure they can't re-insert themselves
into the mdsmap between our calling fail and our
calling fs rm.
Signed-off-by: John Spray <john.spray@redhat.com>
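The safer ordering is something like (a sketch; mds_stop/mds_fail and
raw_cluster_cmd follow the usual test helpers, but treat the names as
assumptions):

    # Make sure the killed daemons are marked failed and can't
    # re-register in the mdsmap before the filesystem is removed.
    self.mds_cluster.mds_stop()
    self.mds_cluster.mds_fail()
    self.fs.mon_manager.raw_cluster_cmd(
        "fs", "rm", self.fs.name, "--yes-i-really-mean-it")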
...into the part that requires a network-isolated
client and the part that doesn't.
The former also happens to be the part that won't
work with vstart, while the latter will. The
teuthology yaml will still pick up and run both parts.
Signed-off-by: John Spray <john.spray@redhat.com>
* Instead of creating files in the background, create
them in the foreground (simpler).
* Instead of creating max_requests*2 files, just create
max_requests plus a little.
* Set max_requests to 1000 instead of 5000 to run a bit
faster.
Signed-off-by: John Spray <john.spray@redhat.com>
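With the mount helper that amounts to (a sketch; create_n_files is
assumed to create the files synchronously in the foreground):

    max_requests = 1000
    # max_requests plus a little, rather than max_requests*2
    self.mount_a.create_n_files("testdir/file", max_requests + 100)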
We weren't waiting for the 'export dir' operation to complete (the
asok command just starts the process). This wasn't noticeable when
running remotely, due to the latency between the test runner and the
MDS, but it shows up when running against a local vstart cluster.
Signed-off-by: John Spray <john.spray@redhat.com>
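One way to actually wait is to poll the subtree map (a sketch: 'export
dir' and 'get subtrees' are asok commands, but the field names and the
wait helper are assumptions):

    # Kick off the export; the asok command returns immediately...
    self.fs.mds_asok(["export", "dir", "/exported", "1"])

    # ...so poll until rank 1 has really become auth for the subtree
    def migrated():
        subtrees = self.fs.mds_asok(["get", "subtrees"])
        return any(s["dir"]["path"] == "/exported" and s["auth_first"] == 1
                   for s in subtrees)

    self.wait_until_true(migrated, timeout=30)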
I am seeing a strange thing where sometimes an ls
of /sys/fs/fuse/connections returns empty even when
connections do exist. It is pretty easy to make this
a non-issue by waiting for "more conns than we started with"
instead of "the list of conns is different", so do that.
Signed-off-by: John Spray <john.spray@redhat.com>
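The count-based wait is self-contained enough to sketch directly
(plain Python; the ceph-fuse startup itself is elided):

    import os
    import time

    def count_fuse_conns():
        # Each active fuse connection appears as a numbered directory
        try:
            return len(os.listdir("/sys/fs/fuse/connections"))
        except OSError:
            return 0

    # Wait for "more conns than we started with" instead of diffing
    # the listings, which tolerates a transiently-empty read.
    initial = count_fuse_conns()
    # ... start ceph-fuse here ...
    deadline = time.time() + 30
    while count_fuse_conns() <= initial:
        if time.time() > deadline:
            raise RuntimeError("ceph-fuse connection never appeared")
        time.sleep(0.5)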