When running under valgrind, MDS may be slow to be added to the FSMap
(especially if mons are in valgrind too). The file system creation that
follows will throw unnecessary warnings about insufficient standbys if
no MDS is available.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
This method is unused in the teuthology repo. The helper method better
belongs here where it is more easily modified.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
* refs/pull/37708/head:
qa/suites/fs: enable thrashing in multifs environment
qa/workunits/fs/snaps: allow tests to be run
qa/tasks/{kclient,ceph_fuse}: allow mounting
qa/tasks: allow per file system config setting
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Commit 9536625558 ("qa/tasks/ceph: use Cluster.sh() and Remote.sh()
when appropriate") dropped run.wait(), which waits for all given
processes to exit. This resulted in errors like
INFO:teuthology.orchestra.run.smithi107.stderr:tar: ./objectstore_tool..log: file changed as we read it
INFO:teuthology.orchestra.run.smithi107.stderr:tar: ./ceph-client.admin.175125.log: File removed before we read it
as the task moved on to archiving semi-corrupted and uncompressed logs,
filling up the lab cluster.
Revert that hunk, as Cluster.sh() is useless here -- we don't need
stdout or stderr, but very much need parallel execution and need to
wait for the compression to finish.
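As an illustration only, the "start everything, then wait for everything"
pattern can be sketched with the standard library. This is not the
teuthology run.wait() implementation; run_all_and_wait() is a hypothetical
helper and plain subprocess stands in for the remote processes.

```python
import subprocess
import sys


def run_all_and_wait(commands):
    """Start every command at once, then block until all have exited.

    Hypothetical helper: it only mirrors the idea of launching the
    compression everywhere without blocking and waiting afterwards.
    """
    # Popen returns immediately, so the commands run in parallel.
    procs = [subprocess.Popen(cmd) for cmd in commands]
    # wait() blocks until the process exits and returns its exit status.
    return [proc.wait() for proc in procs]
```

Only after run_all_and_wait() returns is it safe to move on to a step
such as archiving, since no process is still writing at that point.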
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
When running teuthology interactively, ctx.archive might not be set.
If it's not set, there is no point trying to access files there.
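A minimal sketch of such a guard, assuming ctx is an object whose
archive attribute is either missing or None in interactive runs. The
helper name archive_file() and the example paths are hypothetical.

```python
import os
from types import SimpleNamespace


def archive_file(ctx, name):
    """Return the path of `name` under ctx.archive, or None if unset."""
    archive = getattr(ctx, "archive", None)
    if archive is None:
        # Interactive run: there is no archive directory to look into.
        return None
    return os.path.join(archive, name)


# Stand-ins for a real context object, for illustration only.
interactive_ctx = SimpleNamespace(archive=None)
scheduled_ctx = SimpleNamespace(archive="/tmp/archive")
```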
Fixes: https://tracker.ceph.com/issues/48058
Signed-off-by: Marcus Watts <mwatts@redhat.com>
Signed-off-by: Kefu Chai <kchai@redhat.com>
* refs/pull/37629/head:
qa/cephfs: add session_timeout option support
qa/cephfs: move the cephfs's options setting to create()
qa/cephfs: add 'cephfs:' section support
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
When the MDS revokes the Fwbl caps, the clients need to flush their
dirty data back to the OSDs. The flush can overload and slow down the
OSDs, so it may take more than 60 seconds to finish, and the MDS
daemons then report WRN messages.
For the teuthology test cases, let's just increase the timeout
value to make it work.
Fixes: https://tracker.ceph.com/issues/47565
Signed-off-by: Xiubo Li <xiubli@redhat.com>
This is hopefully a transient issue that can be ignored.
Fixes: https://tracker.ceph.com/issues/42433
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
This avoids unnecessary MDS_ALL_DOWN messages because the MDS daemons
have not yet been spawned.
Fixes: https://tracker.ceph.com/issues/47518
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
as it is the shell that interprets ">>" and redirects stderr to the given
file, but the shell process is launched by ubuntu:ubuntu without using
sudo, so the command fails with a "Permission denied" error. to address
this issue, in this change, a file with proper privileges is created
beforehand using `install`, so the shell is able to write to it.
also, instead of creating this file in `maybe_redirect_stderr()`, the
function now returns the command that creates the log file.
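a rough sketch of the idea, not the code from this change: the helper
names, the default owner and group, and the exact `install` arguments
below are assumptions. copying /dev/null with `install` yields an empty
file with the requested ownership, which the unprivileged shell can
then append to.

```python
import shlex


def create_log_cmd(log_path, owner="ubuntu", group="ubuntu"):
    """Build a command that creates `log_path` owned by owner:group.

    Hypothetical helper; the caller is expected to run the returned
    command before starting the daemon.
    """
    args = ["sudo", "install", "-o", owner, "-g", group,
            "/dev/null", log_path]
    return " ".join(shlex.quote(arg) for arg in args)


def redirect_stderr_cmd(command, log_path):
    """Append a `2>>` redirection, which the calling shell performs."""
    return "{} 2>> {}".format(command, shlex.quote(log_path))
```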
Signed-off-by: Kefu Chai <kchai@redhat.com>
This commit adds the paths of the ceph log directories to the job's
info.yaml file. The motivation is that, if a job times out, these
paths can be used to transfer the logs to the teuthology host
before the test machines are nuked.
Signed-off-by: Shraddha Agrawal <shraddha.agrawal000@gmail.com>
we should redirect stderr for crimson instead of for the default flavor.
this change addresses a regression introduced by
da76f46461
Signed-off-by: Kefu Chai <kchai@redhat.com>
we could have the following logging message:
tasks.ceph:Waiting for all PGs to be active+clean and split+merged, waiting on ['2.6', '2.5', '1.0', '2.4'] to go clean and/or [] to split/merge
if the cluster has non-active+clean PGs when the "ceph" task is about to
end. but this message is a little confusing, in the sense that it
lists "[]" in it.
in this change, only the PGs being waited on are listed. also, added some
cleanups:
* use "else" on the loop to check whether it ended without a break
* remove "0" from the range() call
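a small sketch of both ideas, under the assumption that the two PG
lists are plain Python lists. waiting_message() and wait_until() are
hypothetical names and do not come from tasks/ceph.py.

```python
def waiting_message(unclean_pgs, unsettled_pgs):
    """Describe only the PG lists that are non-empty."""
    parts = []
    if unclean_pgs:
        parts.append("{} to go clean".format(unclean_pgs))
    if unsettled_pgs:
        parts.append("{} to split/merge".format(unsettled_pgs))
    return "waiting on " + " and/or ".join(parts)


def wait_until(is_done, tries):
    """Poll is_done() up to `tries` times; report whether it succeeded."""
    for _ in range(tries):  # range(tries) already starts at 0
        if is_done():
            break
    else:
        # Reached only when the loop ran out of tries without a break.
        return False
    return True
```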
Signed-off-by: Kefu Chai <kchai@redhat.com>
This is super basic right now and only works for monitor daemons
as it has to parse out their IPs from cluster information, then
turn that into the Host objects. We can extend it in future.
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
The str.format() method logs dicts nicely right out of the box.
Also, some of the log messages were too cryptic; they are fixed in this
commit as well.
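For illustration, with a made-up config dict (the keys and values are
not taken from any real job):

```python
# Hypothetical task configuration, used only to show the formatting.
config = {"fs": "cephfs", "max_mds": 2}

# str.format() falls back to the dict's repr, so no manual
# conversion is needed before logging.
message = "config is {}".format(config)
```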
Signed-off-by: Nathan Cutler <ncutler@suse.com>
Nowadays, get_wwn_id_map is essentially a noop - it does:
return dict((d, d) for d in devs)
This reverts another bit of 8f720454cb from 2013.
References: https://tracker.ceph.com/issues/42313
Signed-off-by: Nathan Cutler <ncutler@suse.com>
otherwise bash will interpret "kind" as a file when handling a command like
```
sudo zgrep <kind> /var/log/ceph/valgrind/* /dev/null | sort | uniq
```
and try to feed its content to zgrep, and write the output of zgrep
to /var/log/ceph/valgrind/*. this is not the intended behavior. what we
want to do is to pass "<kind>" as an argument to zgrep, along with
the globbed file names which match "/var/log/ceph/valgrind/*".
in this change, "<kind>" is quoted in the command line. this is also
what `pipes.quote()` did before the change of
35cf5131e7152ce20d916aa99c124751d6a97f5c.
this addresses the regression introduced by
35cf5131e7152ce20d916aa99c124751d6a97f5c.
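a minimal sketch of the quoting, using `shlex.quote()`, the
standard-library counterpart of `pipes.quote()`. the helper name
zgrep_command() is hypothetical, and the way the command string is
assembled here is an assumption rather than the code of this change.

```python
import shlex


def zgrep_command(kind, pattern="/var/log/ceph/valgrind/*"):
    """Build the zgrep pipeline with the search term safely quoted."""
    # Quote only the search term; the glob stays unquoted so that
    # the shell still expands it to the matching file names.
    return "sudo zgrep {} {} /dev/null | sort | uniq".format(
        shlex.quote(kind), pattern)
```

with the term quoted, "<" and ">" reach zgrep as part of its argument
instead of being treated as redirections by the shell.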
Fixes: https://tracker.ceph.com/issues/44454
Signed-off-by: Kefu Chai <kchai@redhat.com>
Use io.BytesIO instead of cStringIO.StringIO
Use six.ensure_str whenever it needs to convert binary to str.
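A stdlib-only sketch of the same idea. Plain bytes.decode() stands in
for six.ensure_str here so that the example has no third-party
dependency; read_output() and the sample payload are hypothetical.

```python
import io


def read_output(buf):
    """Decode the bytes collected in `buf` into text."""
    return buf.getvalue().decode("utf-8")


# Process output arrives as bytes, so it is collected in a BytesIO.
stdout = io.BytesIO()
stdout.write(b"HEALTH_OK\n")
```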
Signed-off-by: Kyr Shatskyy <kyrylo.shatskyy@suse.com>