Not all plana nodes have symlinks set up when
we check /dev/disk/by-id/wwn-*. Instead of failing
here, just use the /dev/disk/sd* devices.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
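
The fallback described above can be sketched roughly as follows. This is
an illustration only, not the actual teuthology code: pick_devices and
scan are hypothetical names, and the glob patterns are left to the caller.

```python
import glob


def pick_devices(wwn_paths, fallback_paths):
    """Prefer by-id wwn symlinks; fall back to plain device nodes if none exist."""
    if wwn_paths:
        return sorted(wwn_paths)
    return sorted(fallback_paths)


def scan(wwn_glob, fallback_glob):
    # Hypothetical helper: expand both patterns on the local machine.
    return pick_devices(glob.glob(wwn_glob), glob.glob(fallback_glob))


# A node without wwn symlinks falls back to the plain device names.
no_symlinks = pick_devices([], ["/dev/sdb", "/dev/sda"])
with_symlinks = pick_devices(["/dev/disk/by-id/wwn-0x2"], ["/dev/sda"])
```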
With the addition of the --name argument to the
teuthology program (run.py), jobs were failing
because --name was being treated as an option
that takes no argument, even though the workers
were supplying a name. Fix that and give it a metavar.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
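
For illustration, an option that takes a value and has a metavar looks
roughly like this with argparse. The parser below is a hypothetical
stand-in, not the real run.py parser; the positional "config" argument
and the help strings are assumptions.

```python
import argparse


def build_parser():
    # Hypothetical stand-in for the parser in run.py.
    parser = argparse.ArgumentParser(description="sketch of a --name option")
    # The default 'store' action consumes one value; metavar only affects help text.
    parser.add_argument("--name", metavar="NAME", help="name for this run")
    parser.add_argument("config", nargs="*", help="config files (assumed)")
    return parser


args = build_parser().parse_args(["--name", "nightly-run", "job.yaml"])
```

With this definition the value following --name is consumed by the option
rather than being parsed as a positional argument.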
Handling of ipmi via the console is now done through the
Console class in teuthology/orchestra/remote.py.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Linux doesn't guarantee device names (/dev/sdb, etc.)
are always mapped to the same disk. Instead of assigning
nominal devices to osds, we map devices by their wwn
(/dev/disk/by-id/wwn-*) to an osd (both data and journal).
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
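
A minimal sketch of the idea, assuming one wwn device per osd. The
function name and the one-to-one assignment are assumptions made for
illustration; the actual task may assign devices differently.

```python
def map_osds_to_devices(osd_ids, wwn_devices):
    """Assign each osd a stable /dev/disk/by-id/wwn-* path (hypothetical helper)."""
    devices = sorted(wwn_devices)
    if len(devices) < len(osd_ids):
        raise ValueError("not enough devices for the requested osds")
    # zip stops at the shorter sequence, so extra devices are left unused.
    return dict(zip(osd_ids, devices))


mapping = map_osds_to_devices(
    ["0", "1"],
    ["/dev/disk/by-id/wwn-0xb", "/dev/disk/by-id/wwn-0xa"],
)
```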
This patch defines a RemoteConsole class associated
with each Remote class instance, allowing
power cycling a target through ipmi.
Fixes/Implements #3782.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Teuthology uses /tmp/cephtest/ as the scratch test directory for
a run. This patch replaces /tmp/cephtest/ everywhere with a
per-run directory: {basedir}/{rundir} where {basedir} is a directory
configured in .teuthology.yaml (/tmp/cephtest if not specified),
and {rundir} is the name of the run, as given in --name. If no name
is specified, {user}-{timestamp} is used.
To get the old behavior (/tmp/cephtest), set test_path: /tmp/cephtest
in .teuthology.yaml.
This change was motivated by #3782, which requires a test dir that
survives across reboots, but also resolves #3767.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
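
The directory selection described above might be sketched like this.
test_path is the key named in this message; the base_test_dir key, the
function name, and the timestamp format are assumptions made for the
example.

```python
import getpass
import os
import time


def get_testdir(config, name=None, user=None, timestamp=None):
    """Return the scratch directory for a run (illustrative sketch)."""
    # An explicit test_path restores the old fixed-directory behavior.
    if "test_path" in config:
        return config["test_path"]
    # 'base_test_dir' is an assumed key name for {basedir}.
    basedir = config.get("base_test_dir", "/tmp/cephtest")
    if name is None:
        user = user or getpass.getuser()
        timestamp = timestamp or time.strftime("%Y%m%d%H%M%S")  # assumed format
        name = "{0}-{1}".format(user, timestamp)
    return os.path.join(basedir, name)


named = get_testdir({}, name="myrun")
legacy = get_testdir({"test_path": "/tmp/cephtest"}, name="myrun")
unnamed = get_testdir({"base_test_dir": "/var/tmp/t"}, user="sam", timestamp="20130101")
```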
This patch includes minor fixes to the teuthology
python code for syntax errors found by running
check-syntax.sh (which runs pyflakes on each file).
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
The monitors may report either positive or negative clock skews, and by
not using an absolute value we were constantly ignoring reported negative
clock skews.
Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
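
As a small illustration of the fix, comparing the magnitude of the skew
against the threshold catches both signs. The function and parameter
names here are hypothetical.

```python
def skew_exceeds(skew, max_skew):
    """True if a reported clock skew is out of bounds, regardless of sign."""
    return abs(skew) > max_skew


negative = skew_exceeds(-0.2, 0.05)   # a plain `skew > max_skew` would miss this
positive = skew_exceeds(0.2, 0.05)
small = skew_exceeds(-0.01, 0.05)
```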
... even if we didn't get a clean/finished result from the monitors.
This ought to significantly cut the waiting time if something else (or
someone else) is leaving the leader hanging and thus unable to finish a
given timecheck round.
Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
We were kicking off the timeout as soon as we started; it's better, however,
to kick it off only when we are told to stop (as long as 'at-least-once'
is true).
Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
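
One way to picture the change is a timer that is only armed when a stop
is requested. This is a sketch under assumptions, not the task's actual
code: the class name and the injectable clock are made up for the
example.

```python
import time


class AtLeastOnceTimer(object):
    """Timeout that starts counting only once a stop has been requested."""

    def __init__(self, timeout, clock=time.time):
        self.timeout = timeout
        self.clock = clock
        self.deadline = None  # not armed until stop_requested() is called

    def stop_requested(self):
        if self.deadline is None:
            self.deadline = self.clock() + self.timeout

    def expired(self):
        return self.deadline is not None and self.clock() >= self.deadline


# Drive the timer with a fake clock so the behavior is easy to follow.
now = [0.0]
timer = AtLeastOnceTimer(300, clock=lambda: now[0])
now[0] = 1000.0
before_stop = timer.expired()      # never armed, so time elapsed does not count
timer.stop_requested()             # deadline becomes 1300.0
now[0] = 1299.0
just_before = timer.expired()
now[0] = 1300.0
at_deadline = timer.expired()
```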
Instead of pulling the tarball from GitHub, which can be unreliable
at times, this change uses the ceph repo mirror, which serves the same
function. It now uses git archive and no longer uses wget. Because of
this, less tar-fu is needed to extract the necessary files, as that can
be done directly through git archive.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
Reviewed-by: Sam Lang <sam.lang@inktank.com>
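
A rough sketch of fetching a subtree with git archive piped into tar.
The remote URL, ref, and path below are placeholders, and the helper
names are hypothetical; the real task runs its commands on remote hosts.

```python
import subprocess


def git_archive_cmd(remote, ref, path):
    """Build a `git archive` command that emits only `path` from `ref`."""
    return ["git", "archive", "--remote={0}".format(remote), ref, path]


def fetch(remote, ref, path, dest):
    # Stream the archive straight into tar; no intermediate tarball is kept.
    archive = subprocess.Popen(git_archive_cmd(remote, ref, path),
                               stdout=subprocess.PIPE)
    subprocess.check_call(["tar", "-x", "-f", "-", "-C", dest],
                          stdin=archive.stdout)
    archive.stdout.close()
    if archive.wait() != 0:
        raise RuntimeError("git archive failed")


# Placeholder values, for illustration only.
cmd = git_archive_cmd("git://example.invalid/ceph.git", "master", "some/subdir")
```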
The workunit task assumes that a mount exists
at /tmp/cephtest/mnt.{id}.
This patch creates the path if it doesn't
exist, enabling workunits to run in the absence
of kclient or ceph-fuse tasks.
Signed-off-by: Joe Buck <jbbuck@gmail.com>
Reviewed-by: Sam Lang <sam.lang@inktank.com>
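
The create-if-missing step, sketched locally. The real task operates on
remote hosts, so treat this as an illustration only; ensure_mount_dir is
a hypothetical name and a temporary directory stands in for the test dir.

```python
import os
import tempfile


def ensure_mount_dir(testdir, client_id):
    """Create {testdir}/mnt.{id} if no mount task has already done so."""
    mnt = os.path.join(testdir, "mnt.{0}".format(client_id))
    if not os.path.isdir(mnt):
        os.makedirs(mnt)
    return mnt


base = tempfile.mkdtemp()            # stand-in for the test directory
mnt = ensure_mount_dir(base, "0")
again = ensure_mount_dir(base, "0")  # second call is a no-op
```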
    at-least-once          Runs at least once, even if we are told to stop.
                           (default: True)
    at-least-once-timeout  If we were told to stop but we are attempting to
                           run at least once, timeout after this many
                           seconds. (default: 300)
Fixes: #3854
Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
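
As an illustration of how the two options and their defaults could be
resolved from a task config, assuming the config arrives as a dict (or
None when the task has no settings). resolve_options is a hypothetical
helper.

```python
DEFAULTS = {
    "at-least-once": True,
    "at-least-once-timeout": 300,  # seconds
}


def resolve_options(task_config):
    """Overlay user-supplied settings on the documented defaults."""
    options = dict(DEFAULTS)
    options.update(task_config or {})
    return options


defaults_only = resolve_options(None)
shorter_timeout = resolve_options({"at-least-once-timeout": 60})
```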
Test 167 was failing due to running out of space on the scratch
file system. The test reserves 21MB in a file, and repeats 50
times. It required just over 1GB, so I bumped the default size
for the testing device to 1200 MB. I increased the test device
size as well.
This resolves http://tracker.newdream.net/issues/3864.
Signed-off-by: Alex Elder <elder@inktank.com>
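
The sizing above works out as follows (a quick arithmetic check, treating
1GB as 1024 MB):

```python
reserve_mb = 21        # space reserved per iteration
repeats = 50           # number of iterations in the test
new_default_mb = 1200  # new default size for the testing device

required_mb = reserve_mb * repeats          # 21 * 50 = 1050
over_one_gb = required_mb > 1024            # just over 1GB
headroom_mb = new_default_mb - required_mb  # 1200 - 1050 = 150
```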
This runs cram tests, which are an easy way to test that output
stays consistent. We already use cram for basic cli tests with no cluster,
and now we can use it for whole system tests too.
ceph.git master now separates across crush hosts without this setting.
For teuthology clusters, we don't want that (unless a test specifies
otherwise).