In cases where the mds thrasher continuously loops
waiting for an mds to be removed from the map, or
for a new mds to become active, we want to start logging
the mds state for debugging.
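A minimal sketch of the idea, not the thrasher's actual code. The
callables `condition` and `get_state`, and the poll counts, are
hypothetical names chosen for illustration:

```python
import logging
import time

log = logging.getLogger(__name__)


def wait_for(condition, get_state, log_after=3, max_polls=10, delay=0.0):
    """Poll `condition`; once `log_after` polls have failed, log the state.

    `condition` and `get_state` are hypothetical callables standing in for
    the thrasher's checks against the mds map.
    Returns the number of polls used, or raises RuntimeError on timeout.
    """
    for polls in range(1, max_polls + 1):
        if condition():
            return polls
        if polls >= log_after:
            # We appear to be stuck: start dumping state for debugging.
            log.info('still waiting after %d polls, mds state: %s',
                     polls, get_state())
        time.sleep(delay)
    raise RuntimeError('condition not met after %d polls' % max_polls)
```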
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Debug the osd op ordering by default. Most of the runs have a small number
of clients, which makes the STL maps cheap.
Signed-off-by: Sage Weil <sage@inktank.com>
Pass the desc to the lock operation.
The unlock operation now clears desc for us; no need to do it ourselves.
Signed-off-by: Sage Weil <sage@inktank.com>
Note whenever locks are acquired/released, or a machine's description is updated.
Under apache, these will go to error.log.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Verify there is no /var/lib/ceph, just like we do with the cephtest
directory. We will need to change this (or make it optional) when we
allow runs against an existing cluster, but a whole bunch of other
things will need to change then as well.
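A rough local sketch of such a check. The real check runs on the
remote test machines; the function name here is hypothetical:

```python
import os


def check_no_leftover_dir(path='/var/lib/ceph'):
    """Raise if `path` exists, i.e. a previous run left state behind."""
    if os.path.exists(path):
        raise RuntimeError('%s already exists; clean it up first' % path)
```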
Signed-off-by: Sage Weil <sage@inktank.com>
This patch corrects an issue where a workunit task is
not cleaning up generated directories
if the 'all' key is used to specify clients.
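One way to picture the fix, as a sketch only. The function name and
the shapes of `clients_config` and `all_clients` are assumptions, not
the workunit task's real interface:

```python
def clients_to_clean(clients_config, all_clients):
    """Return the client roles whose directories need cleanup.

    `clients_config` is assumed to map a client role (or 'all') to its
    workunits; `all_clients` is assumed to list every client role in
    the run.
    """
    if 'all' in clients_config:
        # 'all' stands for every client, so every client has generated
        # directories to remove, not just explicitly named roles.
        return sorted(all_clients)
    return sorted(clients_config)
```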
Signed-off-by: Joe Buck <jbbuck@gmail.com>
Reviewed-by: Sam Lang <sam.lang@inktank.com>
Don't use exit status info to track daemon state. We need to find
a better way to do this for the restart task.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Do yum install rather than yum reinstall for CentOS.
When exiting CentOS, yum erase the ceph-release rpm.
Signed-off-by: Warren Usui <warren.usui@inktank.com>
The exitstatus on the process is a gevent.AsyncResult
(not an int). Use the try/except pattern for handling
errors instead.
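A sketch of the pattern. To stay dependency-free it uses a small
stand-in class instead of gevent's AsyncResult, assuming only that
get() returns the value or re-raises the stored exception. The
CommandFailedError type and run_ok() helper are hypothetical:

```python
class FakeAsyncResult(object):
    """Stand-in for an AsyncResult-like object (illustration only)."""

    def __init__(self, value=None, exception=None):
        self._value = value
        self._exception = exception

    def get(self):
        # Return the stored value, or re-raise the stored exception.
        if self._exception is not None:
            raise self._exception
        return self._value


class CommandFailedError(Exception):
    """Hypothetical error raised when a command exits non-zero."""


def run_ok(exitstatus):
    """Return True if the command succeeded, False if it failed."""
    try:
        # Do not compare exitstatus to an int; ask it for the outcome.
        exitstatus.get()
    except CommandFailedError:
        return False
    return True
```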
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Test for the existence of /sys/fs/fuse/connections/*/abort
before clobbering it. The problem showed up when all
the machines were virtual CentOS machines.
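A local sketch of the guard. The real code runs on the remote
machines; the function names and the value written are assumptions
made for illustration:

```python
import glob


def fuse_abort_paths(pattern='/sys/fs/fuse/connections/*/abort'):
    """Return the abort files that actually exist; may be an empty list."""
    return glob.glob(pattern)


def abort_fuse_connections(pattern='/sys/fs/fuse/connections/*/abort'):
    """Write to each existing abort file; do nothing when none exist."""
    paths = fuse_abort_paths(pattern)
    for path in paths:
        with open(path, 'w') as f:
            f.write('1')
    return paths
```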
Signed-off-by: Warren Usui <warren.usui@inktank.com>
The last command a restart script outputs is 'done'
indicating the script does not require being restarted
further. Handle this case properly.
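A sketch of how the output might be consumed. Only 'done' comes from
the description above; the 'restart <daemon>' line format and the
`restart` callback are hypothetical:

```python
def handle_script_output(lines, restart):
    """Process lines read from the script's stdout.

    Calls `restart(daemon)` for each hypothetical 'restart <daemon>'
    line. Returns True if the script reported 'done', else False.
    """
    for line in lines:
        command = line.strip()
        if command == 'done':
            # Final command: no further restarts are needed.
            return True
        if command.startswith('restart '):
            restart(command.split(None, 1)[1])
    return False
```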
Signed-off-by: Sam Lang <sam.lang@inktank.com>
The ceph daemons support being killed at a specific code point
with a config option. In some cases, we want to test a kill point
only once for a given daemon run (such as replay that only occurs
during daemon startup). This task allows running a script or executable
and (when the script sends a command to the task) restarting it with
a temporary config that has the appropriate kill point set. Once
the daemon asserts and gets restarted, the original config is used.
Adds a specific restart_with_args() method to the DaemonState in the
ceph task.
Right now this task follows the workunit task closely, but uses stdout/stdin
to specify when to restart a daemon.
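A toy model of the restart behaviour, not the real DaemonState. It
only records command lines, and the kill-point option name used in
the usage note is made up:

```python
class DaemonState(object):
    """Toy stand-in for the ceph task's DaemonState; records commands only."""

    def __init__(self, command):
        self.command = list(command)
        self.started = []  # command line used for each (re)start

    def restart(self):
        """Start again with the original command line."""
        self.started.append(list(self.command))

    def restart_with_args(self, extra_args):
        """Start once with extra arguments, e.g. a kill point.

        self.command is left untouched, so the next plain restart()
        goes back to the original configuration.
        """
        self.started.append(self.command + list(extra_args))
```

For example, calling restart_with_args(['--hypothetical-kill-at=1'])
followed by restart() records one start with the extra option and
one without it.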
Signed-off-by: Sam Lang <sam.lang@inktank.com>
install_packages, remove_packages and remove_sources are now the
installation and removal functions used by teuthology. Debian
references have been removed outside of tasks/install.py. CentOS
functionality parallel to Debian's has been added to tasks/install.py,
and el6 references have been added to nuke.py, task/ceph-fuse.py and
task/install.py.
Some files created by CentOS are removed with rm -fr. This should
be changed once the installation/removal rpm procedure is implemented.
Signed-off-by: Warren Usui <warren.usui@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
We don't need to set up the ipmi console on runs that
don't use powercycling, so delay setup of the RemoteConsole
with ipmi to the thrashosd task, and only when the powercycle
config is set. This avoids spurious test failures from flaky
ipmi.
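A sketch of the deferred setup. The Remote class, `console_factory`
and start_thrasher() are hypothetical stand-ins, not teuthology's
real classes:

```python
class Remote(object):
    """Toy remote; `console_factory` is a hypothetical console builder."""

    def __init__(self, console_factory):
        self._console_factory = console_factory
        self.console = None  # not created until someone needs it

    def ensure_console(self):
        if self.console is None:
            self.console = self._console_factory()
        return self.console


def start_thrasher(remote, config):
    """Set up the console only when the config asks for powercycling."""
    if config.get('powercycle'):
        remote.ensure_console()
    return remote.console
```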
Signed-off-by: Sam Lang <sam.lang@inktank.com>
If powercycling was requested for the osd thrasher
we should ensure that we are able to reach the
ipmi console. This helps us avoid weird errors.
Signed-off-by: Sam Lang <sam.lang@inktank.com>