With only a 5-second sleep via ssh and python, it looks like a
race condition was sometimes hit in which the machine was assumed
to be back up before the reboot command had completed.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
If you do 'timeout 5 sync' and sync hangs, timeout will block trying to
kill it.
Instead, just background sync, wait a few seconds, and reboot. This means
we wait a few seconds even if sync returns immediately, but who cares!
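The approach described above could be sketched roughly as follows. This is a minimal illustration, not the code from this change: the helper name sync_then_run and the grace_seconds parameter are hypothetical, and the commands are passed in by the caller.

```python
import subprocess
import time


def sync_then_run(sync_cmd, final_cmd, grace_seconds=5.0):
    """Start sync_cmd in the background, wait a fixed time, then run final_cmd.

    Hypothetical helper: the background process is never waited on,
    so a hung sync cannot block the caller.
    """
    # Popen returns immediately; we deliberately do not call wait().
    sync_proc = subprocess.Popen(sync_cmd)
    # Fixed delay, paid even when sync finishes instantly.
    time.sleep(grace_seconds)
    # Proceed whether or not the background process has finished.
    return_code = subprocess.call(final_cmd)
    return return_code, sync_proc
```

A caller would pass something like ['sync'] as sync_cmd and the reboot command as final_cmd. The trade-off is the one noted above: the delay is unconditional.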
Signed-off-by: Sage Weil <sage@inktank.com>
This matches an existing argument (with the same meaning) and
avoids an error like
2013-10-01T17:20:35.395 CRITICAL:root: File "/var/lib/teuthworker/teuthology-master/virtualenv/bin/teuthology", line 9, in <module>
    load_entry_point('teuthology==0.0.1', 'console_scripts', 'teuthology')()
  File "/home/teuthworker/teuthology-master/teuthology/run.py", line 235, in main
    nuke(ctx, log, ctx.lock)
  File "/home/teuthworker/teuthology-master/teuthology/nuke.py", line 391, in nuke
    if ctx.run_name:
2013-10-01T17:20:35.395 CRITICAL:root:AttributeError: 'Namespace' object has no attribute 'run_name'
Signed-off-by: Sage Weil <sage@inktank.com>
When teuthology-nuke is run with --archive/-a to kill and nuke
machines from an archive folder, it blindly nukes all the
targets it grabs from the config.yaml in the archive dir. This
change checks the description of each locked machine to make sure
the run name is in the description. If not, it removes the target
from the list passed to nuke().
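A minimal sketch of that filtering step, under stated assumptions: targets is taken to be a dict keyed by machine name, and lock_descriptions is a hypothetical dict mapping each machine name to its lock description (or None). Neither name nor shape is taken from the actual code.

```python
def filter_targets_by_run(targets, lock_descriptions, run_name):
    """Keep only the targets whose lock description mentions run_name.

    Hypothetical helper; 'targets' maps machine name -> value and
    'lock_descriptions' maps machine name -> description string or None.
    """
    kept = {}
    for name, value in targets.items():
        # Treat a missing or empty description as "not part of this run".
        description = lock_descriptions.get(name) or ""
        if run_name in description:
            kept[name] = value
    return kept
```

Machines locked by another run, or with no description at all, drop out of the result and so would never be passed on for nuking.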
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
We don't want to block on sync for fear of a hung kernel
mount. However, we can give it a try and wait a few seconds
to get what we can.
This fixes a problem where our recent modifications to the
sudoers file are lost, with a 0 byte file left in its place,
because the task fails and we do a reboot -f -n.
Signed-off-by: Sage Weil <sage@inktank.com>
$ teuthology-nuke -a . -r -u
Traceback (most recent call last):
  File "/home/ubuntu/bin/teuthology-nuke", line 9, in <module>
    load_entry_point('teuthology==0.0.1', 'console_scripts', 'teuthology-nuke')()
  File "/home/ubuntu/teuthology/teuthology/nuke.py", line 343, in main
    ifn = os.path.join(ctx.archive, 'info.yaml')
UnboundLocalError: local variable 'os' referenced before assignment
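The message does not say what caused this. A common cause of this exact error, shown here only as an assumption, is an import of the same module later in the same function, which makes the name local to the whole function. The function names below are illustrative.

```python
import os


def broken(archive):
    # Because of the 'import os' further down, Python treats 'os' as a
    # local name for the WHOLE function, so this line raises
    # UnboundLocalError.
    path = os.path.join(archive, 'info.yaml')
    import os  # late import makes 'os' a local name in this function
    return path


def fixed(archive):
    # No function-level import: 'os' resolves to the module-level import.
    return os.path.join(archive, 'info.yaml')
```

In this sketch the remedy is simply to rely on the module-level import and drop the function-level one.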
Signed-off-by: Sage Weil <sage@inktank.com>
We should never run with a conflicting testdir in the basedir, and the
code to do this is confusing and buggy. Go back to a single testdir and
simple checks.
Signed-off-by: Sage Weil <sage@inktank.com>
Fix for #5494, although its description is bad. Instead of adding a
wait, this fixes the code used to detect whether the guest is back up.
The previous code appeared to assume only one machine, and broke
when it was waiting for multiple machines if the guests did not
come up within 10 seconds of each other.
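One way to wait on several guests without assuming they come up together is to track each pending host separately. This is a sketch under assumptions, not the code from this change: wait_for_all, is_up, and the timeout and interval values are all hypothetical.

```python
import time


def wait_for_all(hosts, is_up, timeout=300.0, interval=10.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll until every host reports up; return the set still down.

    Hypothetical helper: 'is_up' is any callable taking a host name and
    returning True once that host is reachable.
    """
    pending = set(hosts)
    deadline = clock() + timeout
    while pending:
        # Each host is checked on its own, so a slow guest does not
        # affect the bookkeeping for the ones that are already up.
        pending = {host for host in pending if not is_up(host)}
        if not pending or clock() >= deadline:
            break
        sleep(interval)
    return pending
```

An empty return value means every guest came back. Anything left in the set timed out.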
Make nuke skip its normal steps if the machine is a VPS, as we
just destroy VPSes when they get unlocked.
Instead of getting downburst options from ~/.teuthology.yaml, get
them from the yaml given to teuthology for the test/task.
Fixed an error that kept all of the default downburst values
from taking effect if any of them were set via a yaml.
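The defaults fix amounts to merging per-key rather than replacing the whole set of defaults. A minimal sketch, assuming options are plain dicts; the function name and the option keys used in the usage note are made up for illustration and are not downburst's real option names.

```python
def merge_downburst_options(overrides, defaults):
    """Return defaults with any yaml-provided overrides applied per key.

    Hypothetical helper: keys missing from 'overrides' keep their
    default values instead of being dropped.
    """
    merged = dict(defaults)          # start from a copy of every default
    merged.update(overrides or {})   # then apply only the keys that were set
    return merged
```

For example, with defaults of {'ram': '4G', 'cpus': 1} and a yaml that sets only 'cpus' to 2, the result keeps the default 'ram' value.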
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
Reviewed-by: Warren Usui <warren.usui@inktank.com>
This included:
A) Changes so that full path names are used for some files
(scheduled tasks started in different home directories).
B) Changes to ensure tasks come up on the beanstalkc queue properly.
C) Finding and inserting the libvirt equivalent code for VMs
in order to simulate ipmi actions.
D) Fix host key code and report valgrind issue more clearly.
E) Some message and downburst call changes.
Fix #4988
Fix #5122
Signed-off-by: Warren Usui <warren.usui@inktank.com>
This is called from run.py too, which won't have ctx.noipmi.
The default of using ipmi is fine for now for run.py.
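One way to tolerate a context that lacks the attribute is to read it with a default. This is only an illustration of the idea, not necessarily how this change does it, and should_use_ipmi is a hypothetical name.

```python
from argparse import Namespace


def should_use_ipmi(ctx):
    # getattr with a default avoids AttributeError when the caller's
    # context (e.g. one built by a different entry point) never defined
    # 'noipmi'. A missing attribute means "use ipmi".
    return not getattr(ctx, 'noipmi', False)
```

With this, a bare Namespace() falls back to using ipmi, while Namespace(noipmi=True) turns it off.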
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Generalizes the install task to specify a "project" which defaults to
'ceph', but can be configured to install different project packages,
for example:
    install:
      project: samba
      extra_packages: samba
The default install task uses 'ceph' as the project, and relies on an
existing set of defined packages to install. For other projects, the
packages to be installed must be specified with the extra_packages
field. Multiple install tasks can be specified:
    install:
    install:
      project: samba
      extra_packages: samba
This installs ceph packages and then samba packages.
Also, cleanup in nuke.py so that nuke and install use the same list of
packages when doing the remove steps.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
install_packages, remove_packages and remove_sources are now the
installation and removal functions used by teuthology. Debian
references have been removed outside of tasks/install.py. CentOS
functionality parallel to Debian's has been added to tasks/install.py,
and el6 references have been added to nuke.py, task/ceph-fuse.y and
task/install.py.
Some files created by CentOS are removed with rm -fr. This should
be changed once the installation/removal rpm procedure is implemented.
Signed-off-by: Warren Usui <warren.usui@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
This required reordering the cluster setup so that we do the ceph-osd
--mkfs --mkkey prior to gathering keys and initializing the monitors.
Also, run daemons as root.
Signed-off-by: Sage Weil <sage@inktank.com>
Installing debs means we are more likely to hit a case where we interrupt
apt/dpkg. Try to mop up as best we can in nuke.
Signed-off-by: Sage Weil <sage@inktank.com>
The ceph task now installs ceph using the Debian
packages, and all invocations of binaries installed
in {tmpdir}/binary/usr/local/bin/ are replaced with
the binaries installed in standard locations
by the debs.
Author: Sander Pool <sander.pool@inktank.com>
Signed-off-by: Sam Lang <sam.lang@inktank.com>
This can cause issues when unmount hangs. Our automatic runs reboot
everything unconditionally, so this caused a bunch of unnecessary hangs
when an fs was accidentally rendered un-unmountable.