When teuthology-nuke is passed --archive/-a to kill and nuke
machines from an archive folder, it blindly nukes all the
targets it grabs from the config.yaml in the archive dir. This
change checks the description of locked machines and makes sure
the run name is in the description. If not, it removes the target
from the list passed to nuke().
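A minimal sketch of the description check, assuming the lock status is
available as a mapping from machine name to description. The helper name
filter_targets_by_run and the data shapes are hypothetical and are not
taken from nuke.py:

```python
def filter_targets_by_run(targets, descriptions, run_name):
    """Keep only targets whose lock description mentions run_name.

    targets: dict of machine name -> host key (as read from config.yaml)
    descriptions: dict of machine name -> lock description
                  (hypothetical shape; None or missing means no description)
    """
    kept = {}
    for name, key in targets.items():
        description = descriptions.get(name) or ''
        # Only nuke machines that are still locked for this run.
        if run_name in description:
            kept[name] = key
    return kept
```

Machines with no description, or with a description from another run, are
dropped from the result rather than nuked.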
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
We don't want to block on sync for fear of a hung kernel
mount. However, we can give it a try and wait a few seconds
to get what we can.
This fixes a problem where our recent modifications to the
sudoers file are lost, with a 0 byte file left in its place,
because the task fails and we do a reboot -f -n.
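The "try, but do not block" idea can be sketched as below. This is an
illustration only: the helper name try_command is hypothetical, and the
real code runs the command on a remote host rather than locally.

```python
import subprocess


def try_command(cmd, timeout):
    """Start cmd and wait at most timeout seconds for it to finish.

    Returns True if the command exited in time, False otherwise.
    """
    proc = subprocess.Popen(cmd)
    try:
        proc.wait(timeout=timeout)
        return True
    except subprocess.TimeoutExpired:
        # Deliberately do not wait again: a sync stuck on a hung mount
        # may never exit, and the caller must not block on it.
        return False
```

Calling try_command(['sync'], timeout=5) before the forced reboot gives
pending writes a few seconds to reach disk without risking an indefinite
hang.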
Signed-off-by: Sage Weil <sage@inktank.com>
$ teuthology-nuke -a . -r -u
Traceback (most recent call last):
File "/home/ubuntu/bin/teuthology-nuke", line 9, in <module>
load_entry_point('teuthology==0.0.1', 'console_scripts', 'teuthology-nuke')()
File "/home/ubuntu/teuthology/teuthology/nuke.py", line 343, in main
ifn = os.path.join(ctx.archive, 'info.yaml')
UnboundLocalError: local variable 'os' referenced before assignment
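This error means the name is assigned somewhere in the function body,
which makes it local for the whole function. One way to reproduce it,
assuming (the traceback does not confirm this) that main() contained a
function-level import of os after the first use:

```python
import os


def main_broken(archive):
    # 'os' is bound by the import statement further down, so Python treats
    # it as a local variable for the whole function body.
    ifn = os.path.join(archive, 'info.yaml')
    import os  # hypothetical stray import; it makes 'os' local
    return ifn


def main_fixed(archive):
    # Relying on the module-level import avoids the problem.
    return os.path.join(archive, 'info.yaml')
```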
Signed-off-by: Sage Weil <sage@inktank.com>
We should never run with a conflicting testdir in the basedir, and the
code to do this is confusing and buggy. Go back to a single testdir and
simple checks.
Signed-off-by: Sage Weil <sage@inktank.com>
Fix for #5494, although the description there is bad. Instead of adding
a wait, the code used to detect whether the guest is back up has been
fixed. The previous code appeared to assume only one machine and broke
when waiting for multiple machines if the guests did not
come up within 10 seconds of each other.
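A sketch of waiting on several machines at once, tracking the set that is
still down instead of assuming a single guest. The function name, the
is_up callback, and the default timings are assumptions for illustration:

```python
import time


def wait_for_all(hosts, is_up, timeout=300, interval=10, sleep=time.sleep):
    """Poll until every host reports up, or until timeout seconds pass.

    is_up is a caller-supplied callable(host) -> bool (hypothetical).
    Returns the set of hosts that never came up.
    """
    pending = set(hosts)
    deadline = time.time() + timeout
    while pending:
        # Re-check only the hosts still missing; a host that has come up
        # is not affected by how long the others take.
        pending = {h for h in pending if not is_up(h)}
        if not pending or time.time() >= deadline:
            break
        sleep(interval)
    return pending
```

An empty return value means every guest came back; otherwise the caller
can report exactly which machines are still down.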
Make nuke not do the normal stuff if the machine is a VPS as we
just destroy them when they get unlocked.
Instead of getting downburst options from ~/.teuthology.yaml, get
them from the yaml given to teuthology for the test/task.
Fixed an error that would make all the default downburst values
not take effect if any of them were set via a yaml.
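The defaults problem is the classic "replace instead of merge" mistake. A
minimal sketch of the merged behavior follows; the option names and
values are hypothetical, not the actual downburst settings:

```python
# Hypothetical defaults, for illustration only.
DEFAULTS = {'disk-size': '30G', 'ram': '4G', 'cpus': 1}


def merged_downburst_options(overrides=None):
    """Start from the defaults and apply only the keys set in the yaml."""
    options = dict(DEFAULTS)          # copy, so DEFAULTS is never mutated
    options.update(overrides or {})   # yaml values win over defaults
    return options
```

Setting a single value in the yaml now changes only that value; the
remaining defaults still apply.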
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
Reviewed-by: Warren Usui <warren.usui@inktank.com>
This included:
A.) Changes made so that full path names on some files were used
(scheduled tasks started in different home directories).
B.) Changes to ensure tasks come up on the beanstalkc queue properly.
C.) Finding and inserting the libvirt equivalent code for VMs
in order to simulate ipmi actions.
D.) Fix host key code, report valgrind issue more clearly.
E.) Some message and downburst call changes.
Fix #4988
Fix #5122
Signed-off-by: Warren Usui <warren.usui@inktank.com>
This is called from run.py too, which won't have ctx.noipmi.
The default of using ipmi is fine for now for run.py.
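A minimal sketch of tolerating a context object that lacks the attribute.
The helper name use_ipmi is hypothetical; only the attribute name noipmi
comes from this message:

```python
from types import SimpleNamespace


def use_ipmi(ctx):
    """Return True unless ctx explicitly sets noipmi to a true value."""
    # getattr with a default avoids AttributeError for callers such as
    # run.py whose ctx never defines noipmi.
    return not getattr(ctx, 'noipmi', False)


nuke_ctx = SimpleNamespace(noipmi=True)   # e.g. nuke invoked with ipmi disabled
run_ctx = SimpleNamespace()               # e.g. run.py, no noipmi attribute
```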
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Generalizes the install task to specify a "project" which defaults to
'ceph', but can be configured to install different project packages,
for example:
    install:
      project: samba
      extra_packages: samba
The default install task uses 'ceph' as the project, and relies on an
existing set of defined packages to install. For other projects, the
packages to be installed must be specified with the extra_packages
field. Multiple install tasks can be specified:
    install:
    install:
      project: samba
      extra_packages: samba
This installs ceph packages and then samba packages.
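How a single install entry might be resolved to a package list can be
sketched as follows. The function name and the contents of CEPH_PACKAGES
are hypothetical; the real predefined list lives in the install task:

```python
# Hypothetical subset of the predefined ceph package list.
CEPH_PACKAGES = ['ceph', 'ceph-common']


def packages_to_install(config):
    """Resolve one install task config to the packages it installs."""
    config = config or {}                      # a bare "install:" parses to None
    project = config.get('project', 'ceph')    # project defaults to ceph
    extra = config.get('extra_packages', [])
    if isinstance(extra, str):
        extra = [extra]                        # allow a single package name
    # Only ceph has a predefined list; other projects rely on extra_packages.
    base = CEPH_PACKAGES if project == 'ceph' else []
    return list(base) + list(extra)
```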
Also, cleanup in nuke.py so that nuke and install use the same list of
packages when doing the remove steps.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
install_packages, remove_packages and remove_sources are now the
installation and removal functions used by teuthology. Debian
references have been removed outside of tasks/install.py. CentOS
functionality parallel to Debian has been added to tasks/install.py,
and el6 references have been added to nuke.py, task/ceph-fuse.y and
task/install.py.
Some files created by CentOS are removed with rm -fr. This should
be changed once the installation/removal rpm procedure is implemented.
Signed-off-by: Warren Usui <warren.usui@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
This required reordering the cluster setup so that we do the ceph-osd
--mkfs --mkkey prior to gathering keys and initializing the monitors.
Also, run daemons as root.
Signed-off-by: Sage Weil <sage@inktank.com>
Installing debs means we are more likely to hit a case where we interrupt
apt/dpkg. Try to mop up as best we can in nuke.
Signed-off-by: Sage Weil <sage@inktank.com>
The ceph task installs ceph using the debian
packages now, and all invocations of binaries installed
in {tmpdir}/binary/usr/local/bin/ are replaced with
the use of the binaries installed in standard locations
by the debs.
Author: Sander Pool <sander.pool@inktank.com>
Signed-off-by: Sam Lang <sam.lang@inktank.com>
This can cause issues when unmount hangs. Our automatic runs reboot
everything unconditionally, so this caused a bunch of unnecessary hangs
when an fs was accidentally rendered un-unmountable.
Nuke will clean up the base test directory by default, but can
clean up the test directory for a given run if specified.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Nuke used to remove /tmp/cephtest; now it tries to
remove the test dir, which it may not have the name
for. Instead of removing the test dir, we just
remove the base directory for all test directories,
which may or may not be /tmp/cephtest.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
This patch defines a RemoteConsole class associated
with each Remote class instance, allowing
power cycling a target through ipmi.
Fixes/Implements #3782.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Teuthology uses /tmp/cephtest/ as the scratch test directory for
a run. This patch replaces /tmp/cephtest/ everywhere with a
per-run directory: {basedir}/{rundir} where {basedir} is a directory
configured in .teuthology.yaml (/tmp/cephtest if not specified),
and {rundir} is the name of the run, as given in --name. If no name
is specified, {user}-{timestamp} is used.
To get the old behavior (/tmp/cephtest), set test_path: /tmp/cephtest
in .teuthology.yaml.
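The directory selection described above can be sketched as below. Only
test_path and the /tmp/cephtest default come from this message; the
function name and the base_test_dir key are assumptions:

```python
import posixpath

DEFAULT_BASEDIR = '/tmp/cephtest'


def get_testdir(config, name=None, user=None, timestamp=None):
    """Return the scratch directory for a run.

    config is the parsed .teuthology.yaml as a dict.
    """
    # test_path restores the old single-directory behavior.
    if 'test_path' in config:
        return config['test_path']
    # 'base_test_dir' is a hypothetical key name for {basedir}.
    basedir = config.get('base_test_dir', DEFAULT_BASEDIR)
    # Fall back to {user}-{timestamp} when no run name was given.
    rundir = name if name else '{0}-{1}'.format(user, timestamp)
    return posixpath.join(basedir, rundir)
```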
This change was motivated by #3782, which requires a test dir that
survives across reboots, but also resolves #3767.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Especially on older hosts, we keep triggering errors::
    ServerNotFoundError: Unable to find the server at
    teuthology.front.sepia.ceph.com: [Errno 3] name does not exist
That comes from libevent's evdns via gevent.dns and httplib2. The rate
of these errors is low enough that they seem to be perhaps timeouts,
or more arbitrary. Busy looping on DNS resolution calls has never
triggered them, so far.
With ``monkey.patch_all(dns=False)``, the teuthology process will
block as a whole whenever doing DNS resolution. This will hopefully be
rare enough that it won't matter.
The only real "fix" seems to be upgrading libraries and hoping for the
best; this commit can be reverted after that is done.