Analogous to raw_cluster_command, but instead
of calling blocking CLI command we're invoking
the -w mode.
Signed-off-by: John Spray <john.spray@redhat.com>
The ceph pg scrub ... command isn't really guarranteed to
start a scrub, keep reissuing it until the scrub actually
happens.
Related: #12746
Signed-off-by: Samuel Just <sjust@redhat.com>
/var/run/ceph is 770. This is mainly necessary for any
interaction with the daemon sockets, but it is what users do
and it may avoid log noise.
Signed-off-by: Sage Weil <sage@redhat.com>
also waits to remove it from dead_osds. this fixes an issue where
do_sighup tries to send a signal to an osd that has not been revived
yet.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
This method runs in a separate greenlet than do_thrash and will pick a
random live osd to send a signal.SIGHUP to. There is a config option,
sighup_delay, which controls how long to delay between sending the
signals.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
This patch also adds some convenience facilities for making
some of the ceph_manager methods into tasks usable from a
yaml file.
Signed-off-by: Samuel Just <sjust@redhat.com>
In python, isinstance(foo, str) will fail if
a unicode string is passed in. The correct check
is basestring.
Signed-off-by: John Spray <john.spray@redhat.com>
* add a wrapper to log uncaught exception to self.logger, greenlet also
prints the backtrace and exception to stderr, but teuthology.log does
not capture stderr. so we need to catch them by ourselves to reveal
more info to root-cause this issue.
* log uncaught exception thrown by Thrasher.do_thrash() to self.log.
See: #10630
Signed-off-by: Kefu Chai <kchai@redhat.com>
Require ceph-objectstore-tool to be available on all OSD nodes
Log a message when tool is not available
Signed-off-by: David Zafman <dzafman@redhat.com>
Add the CephManager.objectstore_tool method to encapsulate a call to
ceph-objectstore-tool. The wrapper can convert an object name into the
PG id and figure out the primary OSD. The designated OSD is stopped
before running the command and restarted afterwards.
Signed-off-by: Loic Dachary <ldachary@redhat.com>
The commit is large but does not introduce any semantic change and
consists primarily in code moving around, re-indented and removed.
Replace functions generating functions by functions and sequentially
iterating over a list of functions with a sequential call to the
functions.
Replace the setup/teardown with an equivalent using a with
statement and the ceph_manager.pool method.
Replace inline code with a call to ceph_manager.wait_for_all_up
It makes it easier to modify the tests, for instance to create erasure
coded pools and tests specific to them.
Signed-off-by: Loic Dachary <ldachary@redhat.com>
To create a pool before running a code bloc and remove it after.
with manager.pool("mypool"):
mytest..
Signed-off-by: Loic Dachary <ldachary@redhat.com>
This was tripping over the recent commit 42c85e80
in Ceph master, which tightens the limits on
acceptable PG counts per OSD, and was making
teuthology runs fail due to never going clean.
Rather than put in a new hardcoded count, infer
it from config. Move some code around so that
the ceph task can get at a Filesystem object
to use in FS setup (this already has conf-getting
methods).
Signed-off-by: John Spray <john.spray@redhat.com>
The osd dump command displays pool types using numerics instead of
symbolic names. Create constants in the CephManager class to use instead
of numbers.
Signed-off-by: Loic Dachary <ldachary@redhat.com>