Added the ability to support multiple types of machines with
--machine-type added to teuthology-lock when used with --lock-many
or --machine-type with teuthology --lock (automated tests). It
defaults to 'plana' and the 'vps' type is currently unused but
should be in the future.
Also updated teuthology-lock --summary to be machine type aware
and to sort its output.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Added the ability to support multiple types of machines with
--machine-type added to teuthology-lock when used with --lock-many
or --machine-type with teuthology --lock (automated tests). It
defaults to 'plana' and the 'vps' type is currently unused but
should be in the future.
Signed-off-by: Sandon Van Ness <sandon@van-ness.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Also fix up the template to use {{field}} for stuff we don't want to parse.
There is probably a better way...
Signed-off-by: Sage Weil <sage@inktank.com>
This can cause issues when unmount hangs. Our automatic runs reboot
everything unconditionally, so this caused a bunch of unnecessary hangs
when an fs was accidentally rendered un-unmountable.
When nodes are rebooted, the connections remain open
even after calling reconnect and setting up new ssh
sessions to the rebooted nodes. This causes ECONNRESET
errors to show up in the teuthology output.
Close the existing connections before trying to reconnect.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
kill_mon is getting a config set to None, which blows
up now due to the check for powercycle. Initialize
the config to an empty dict if we don't get anything
on init. This is the error showing up in teuthology:
2013-02-04T15:04:16.595 ERROR:teuthology.run_tasks:Manager failed: <contextlib.GeneratorContextManager object at 0x1fcafd0>
Traceback (most recent call last):
  File "/var/lib/teuthworker/teuthology-master/teuthology/run_tasks.py", line 45, in run_tasks
    suppress = manager.__exit__(*exc_info)
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/var/lib/teuthworker/teuthology-master/teuthology/task/mon_thrash.py", line 142, in task
    thrash_proc.do_join()
  File "/var/lib/teuthworker/teuthology-master/teuthology/task/mon_thrash.py", line 69, in do_join
    self.thread.get()
  File "/var/lib/teuthworker/teuthology-master/virtualenv/local/lib/python2.7/site-packages/gevent/greenlet.py", line 308, in get
    raise self._exception
AttributeError: 'NoneType' object has no attribute 'get'
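
A minimal sketch of the fix described above, assuming a manager class whose
constructor takes an optional config. The class name Manager and the method
wants_powercycle are hypothetical stand-ins, not the actual teuthology code:

```python
class Manager(object):
    """Hypothetical stand-in for the manager class; not the real teuthology code."""

    def __init__(self, config=None):
        # Fall back to an empty dict so later lookups never operate on None.
        self.config = config if config is not None else {}

    def wants_powercycle(self):
        # dict.get is safe here even when no config was supplied on init.
        return bool(self.config.get('powercycle'))
```

With this default in place, Manager().wants_powercycle() returns False instead
of raising AttributeError.
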
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Nuke will clean up the base test directory by default, but can
clean up the test directory for a given run if specified.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
I think this is what is going on...
Traceback (most recent call last):
  File "/var/lib/teuthworker/teuthology-master/teuthology/contextutil.py", line 27, in nested
    yield vars
  File "/var/lib/teuthworker/teuthology-master/teuthology/task/ceph.py", line 1158, in task
    yield
  File "/var/lib/teuthworker/teuthology-master/teuthology/run_tasks.py", line 25, in run_tasks
    manager = _run_one_task(taskname, ctx=ctx, config=config)
  File "/var/lib/teuthworker/teuthology-master/teuthology/run_tasks.py", line 14, in _run_one_task
    return fn(**kwargs)
  File "/var/lib/teuthworker/teuthology-master/teuthology/task/dump_stuck.py", line 93, in task
    manager.kill_osd(id_)
  File "/var/lib/teuthworker/teuthology-master/teuthology/task/ceph_manager.py", line 665, in kill_osd
    if 'powercycle' in self.config and self.config['powercycle']:
TypeError: argument of type 'NoneType' is not iterable
Nuke used to remove /tmp/cephtest, now it tries to
remove the test dir, which it may not have the name
for. Instead of removing the test dir, we just
remove the base directory for all test directories,
which may or may not be /tmp/cephtest.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Not all plana nodes have symlinks set up when
we check /dev/disk/by-id/wwn-*. Instead of failing
here, just use the /dev/disk/sd* devices.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
With the addition of the --name argument to the
teuthology program (run.py), jobs were failing
because --name was being treated as a non-arg
option, even though the name was being supplied
by the workers. Fix that and give it a metavar.
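
For illustration, a minimal sketch of an option that takes a value and has a
metavar, assuming the standard argparse module. The help text and the sample
run name are made up, and this is not the actual run.py code:

```python
import argparse

parser = argparse.ArgumentParser(prog='teuthology')
# Without a flag-style action such as store_true, --name consumes the next
# token as its value; metavar only changes how the value is shown in --help.
parser.add_argument(
    '--name',
    metavar='NAME',
    help='name for this run (illustrative help text)',
)

# The sample value below is made up for the example.
args = parser.parse_args(['--name', 'nightly-run'])
```
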
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Handling of ipmi via the console is now done through the
Console class in teuthology/orchestra/remote.py.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Linux doesn't guarantee device names (/dev/sdb, etc.)
are always mapped to the same disk. Instead of assigning
nominal devices to osds, we map devices by their wwn
(/dev/disk/by-id/wwn-*) to an osd (both data and journal).
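
A rough sketch of how the stable wwn names can be resolved to the device nodes
they currently point at. The function name devices_by_wwn and its by_id_dir
parameter are hypothetical, and skipping per-partition links is an assumption
about what the caller wants, not something this patch states:

```python
import glob
import os


def devices_by_wwn(by_id_dir='/dev/disk/by-id'):
    """Map each whole-disk wwn-* symlink to the device node it points at."""
    mapping = {}
    for link in sorted(glob.glob(os.path.join(by_id_dir, 'wwn-*'))):
        # Partition links look like wwn-<id>-partN; skip them (assumption).
        if '-part' in os.path.basename(link):
            continue
        # realpath follows the symlink, e.g. to /dev/sdb on this boot.
        mapping[link] = os.path.realpath(link)
    return mapping
```

The keys stay the same across reboots, while the values may change when the
kernel enumerates the disks in a different order.
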
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
This patch defines a RemoteConsole class associated
with each Remote class instance, allowing
power cycling a target through ipmi.
Fixes/Implements #3782.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Teuthology uses /tmp/cephtest/ as the scratch test directory for
a run. This patch replaces /tmp/cephtest/ everywhere with a
per-run directory: {basedir}/{rundir} where {basedir} is a directory
configured in .teuthology.yaml (/tmp/cephtest if not specified),
and {rundir} is the name of the run, as given in --name. If no name
is specified, {user}-{timestamp} is used.
To get the old behavior (/tmp/cephtest), set test_path: /tmp/cephtest
in .teuthology.yaml.
This change was motivated by #3782, which requires a test dir that
survives across reboots, but also resolves #3767.
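
The directory selection described above could look roughly like the sketch
below. The function name get_testdir, the 'base_test_dir' key used for
{basedir}, and the timestamp format are assumptions made for illustration;
only test_path and the /tmp/cephtest default come from this message:

```python
import getpass
import os
import time


def get_testdir(config, name=None):
    """Return the scratch directory for a run (illustrative sketch only)."""
    # An explicit test_path restores the old fixed-directory behavior.
    if config.get('test_path'):
        return config['test_path']
    # 'base_test_dir' is an assumed key name for {basedir}.
    basedir = config.get('base_test_dir', '/tmp/cephtest')
    if name is None:
        # {user}-{timestamp}; the exact timestamp format is an assumption.
        name = '{user}-{stamp}'.format(
            user=getpass.getuser(),
            stamp=time.strftime('%Y%m%d%H%M%S'),
        )
    return os.path.join(basedir, name)
```

Here config stands for the parsed contents of .teuthology.yaml and name for
the value passed via --name.
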
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
This patch includes minor fixes to the teuthology
python code for syntax errors found by running
check-syntax.sh (which runs pyflakes on each file).
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
pyflakes runs a basic syntax checker against python code.
The added check-syntax.sh script and Makefile run pyflakes
on the python code within the teuthology directory, reporting
any syntax errors that are found.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
The monitors may report either positive or negative clock skews, and by
not using an absolute value we were constantly ignoring reported negative
clock skews.
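
To illustrate the difference, a small sketch; the function name skew_exceeds
and the threshold parameter max_skew are hypothetical and do not appear in the
patch itself:

```python
def skew_exceeds(skew, max_skew):
    """Return True when the reported skew is too large in either direction."""
    # Comparing the raw value (skew > max_skew) would never flag a negative
    # skew; taking the absolute value treats both directions the same way.
    return abs(skew) > max_skew
```

For example, a reported skew of -0.5 against a threshold of 0.1 is flagged
here, whereas the plain comparison -0.5 > 0.1 is false.
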
Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
... even if we didn't get a clean/finished result from the monitors
This ought to significantly cut the waiting time if something else (or
someone else) is leaving the leader hanging thus unable to finish a given
timecheck round.
Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>