63 Commits

Author SHA1 Message Date
Sage Weil
0985f8c386 nuke: killall ceph-disk, too
Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-18 12:31:11 -07:00
Sandon Van Ness
d54932cbc8 Fix VM issues.
Fixes #5494, although the description there is poor. Instead of adding
a wait, the code that detects whether the guest is back up has been
fixed. The previous code appeared to assume only one machine and broke
when waiting for multiple machines if the guests did not come up
within 10 seconds of each other.
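
The multi-machine wait described in this message can be sketched roughly as follows. This is an illustration only, not the teuthology code: `wait_for_all`, `is_up`, and the timeout values are hypothetical names chosen for the example. Each guest is tracked independently against one overall deadline, so guests that come up far apart do not break the wait.

```python
import time


def wait_for_all(machines, is_up, timeout=300, interval=5,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll until every machine reports up, or the overall deadline passes.

    is_up is a hypothetical callable: machine name -> bool.
    Returns the set of machines that never came up.
    """
    pending = set(machines)
    deadline = clock() + timeout
    while pending:
        # Re-check only the machines that are still down.
        pending = {m for m in pending if not is_up(m)}
        if not pending or clock() >= deadline:
            break
        sleep(interval)
    return pending
```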

Make nuke skip the normal cleanup if the machine is a VPS, since VPS
machines are simply destroyed when they are unlocked.

Get downburst options from the yaml given to teuthology for the
test/task instead of from ~/.teuthology.yaml.

Fixed an error that kept all of the default downburst values from
taking effect if any of them was set via a yaml.
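
The defaults fix can be illustrated with a small merge sketch. The option names and values below are made up for the example and are not the real downburst defaults. The idea is only that values from the yaml override individual defaults rather than replacing the whole set.

```python
# Hypothetical defaults; the real downburst option names and values may differ.
DEFAULTS = {'disk-size': '30G', 'ram': '4G', 'cpus': 1}


def merge_downburst(overrides=None):
    """Start from every default, then apply only the keys set in the yaml."""
    merged = dict(DEFAULTS)          # copy, so DEFAULTS itself is never mutated
    merged.update(overrides or {})   # yaml values win key by key
    return merged
```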

Signed-off-by: Sandon Van Ness <sandon@inktank.com>
Reviewed-by: Warren Usui <warren.usui@inktank.com>
2013-07-03 19:07:35 -07:00
Warren Usui
a4994e3bde Support added for running scheduled tasks on virtual machines.
This included:
    A.) Changes so that full path names are used for some files
        (scheduled tasks started in different home directories),
    B.) Changes to ensure tasks come up on the beanstalkc queue properly,
    C.) Finding and inserting the libvirt equivalent code for virtual
        machines in order to simulate ipmi actions,
    D.) Fixing the host key code and reporting the valgrind issue more clearly,
    E.) Some message and downburst call changes.

    Fix #4988
    Fix #5122
    Signed-off-by: Warren Usui <warren.usui@inktank.com>
2013-06-07 19:32:15 -07:00
Josh Durgin
d7fe5c0a34 nuke: don't require noipmi in ctx
This is called from run.py too, which won't have ctx.noipmi.
The default of using ipmi is fine for now for run.py.
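
A minimal sketch of tolerating a context that lacks the attribute, assuming `ctx` is a plain namespace-like object. `should_use_ipmi` is a hypothetical helper name, not a function taken from nuke.py.

```python
from types import SimpleNamespace


def should_use_ipmi(ctx):
    """Use ipmi unless the context explicitly sets noipmi to a true value."""
    return not getattr(ctx, 'noipmi', False)


# A context without the attribute (as run.py would pass) falls back to ipmi.
run_ctx = SimpleNamespace()
nuke_ctx = SimpleNamespace(noipmi=True)
```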

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-09 18:21:05 -07:00
Sage Weil
f7c8c27c7b Merge branch 'next'
2013-05-06 21:31:36 -07:00
Sam Lang
980973dc55 task/install.py: Allow installation of non-ceph
Generalizes the install task to specify a "project" which defaults to
'ceph', but can be configured to install different project packages,
for example:

install:
  project: samba
  extra_packages: samba

The default install task uses 'ceph' as the project, and relies on an
existing set of defined packages to install.  For other projects, the
packages to be installed must be specified with the extra_packages
field.  Multiple install tasks can be specified:

install:
install:
  project: samba
  extra_packages: samba

This installs the ceph packages and then the samba packages.
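
How a task configuration like the ones in this message might map to a package list can be sketched as below. This is an assumption-laden illustration: `CEPH_PACKAGES` is a stand-in for the predefined ceph package set, and `packages_for` is a hypothetical helper, not the function in task/install.py.

```python
# Hypothetical stand-in for the predefined ceph package list.
CEPH_PACKAGES = ['ceph', 'ceph-common']


def packages_for(task_config=None):
    """Pick the packages for one install task entry."""
    config = task_config or {}               # a bare "install:" parses to None
    project = config.get('project', 'ceph')  # project defaults to 'ceph'
    extra = config.get('extra_packages', [])
    if isinstance(extra, str):
        extra = [extra]                      # the yaml example gives a single name
    base = CEPH_PACKAGES if project == 'ceph' else []
    return list(base) + list(extra)
```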

Also, clean up nuke.py so that nuke and install use the same list of
packages when doing the remove steps.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-05-06 17:37:25 -07:00
Sam Lang
9e6f7b126b nuke.py: Allow ipmi power cycling to be skipped
Some nodes don't have ipmi set up.  Allow nuke to
skip the ipmi checking if -i (--noipmi) is specified.
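
A flag of this shape could be declared as in the sketch below. It uses `argparse` purely for illustration. The option parsing actually used by the nuke script is not shown here, and `parse_args` is a hypothetical name.

```python
import argparse


def parse_args(argv=None):
    """Parse a nuke-style command line with an optional -i/--noipmi switch."""
    parser = argparse.ArgumentParser(description='sketch of a nuke-style CLI')
    parser.add_argument('-i', '--noipmi', action='store_true', default=False,
                        help='skip ipmi checking')
    return parser.parse_args(argv)
```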

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-05-01 17:18:16 -07:00
Sage Weil
4a6e3b97e3 install, nuke: explicitly purge /var/lib/ceph
The packages won't do this anymore.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-03-22 15:22:38 -07:00
Warren Usui
0c75c6b1f7 Added el6 install functionality for CentOS systems.
install_packages, remove_packages and remove_sources are now the
installation and removal functions used by teuthology.  Debian
references have been removed outside of tasks/install.py.  CentOS
functionality parallel to Debian has been added to tasks/install.py,
and el6 references have been added to nuke.py, task/ceph-fuse.y and
task/install.py.

Some files created by CentOS are removed with rm -fr.  This should
be changed once the installation/removal rpm procedure is implemented.

Signed-off-by: Warren Usui <warren.usui@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-03-14 16:25:18 -07:00
Warren Usui
01a40cfbf1 Use service instead of initctl to restart rsyslog.
This change is needed to make sure teuthology works on CentOS when the
-a option is specified.

Signed-off-by: Warren Usui <warren.usui@inktank.com>
2013-03-13 18:37:25 -07:00
Sage Weil
bee8dffc34 nuke: blow away /home/ubuntu/cephtest too
(along with /tmp/cephtest)

Signed-off-by: Sage Weil <sage@inktank.com>
2013-02-25 17:54:49 -08:00
Sage Weil
d8021a1aa0 nuke: sudo for killall
Signed-off-by: Sage Weil <sage@inktank.com>
2013-02-22 10:51:51 -08:00
Josh Durgin
a862d8bf77 Fix unused vars, unused imports, and aliasing
Found by pyflakes

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-21 14:47:00 -08:00
Sage Weil
3f7c9bcaa4 move the install to a separate task.
Signed-off-by: Sage Weil <sage@inktank.com>
2013-02-18 15:06:52 -08:00
Sage Weil
d1d36241b7 ceph: use default data, keyring locations
This required reordering the cluster setup so that we do the ceph-osd
--mkfs --mkkey prior to gathering keys and initializing the monitors.

Also, run daemons as root.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-02-18 13:39:05 -08:00
Sage Weil
a54200d444 nuke: tolerate failed dpkg --configure -a/apt-get -f install
Signed-off-by: Sage Weil <sage@inktank.com>
2013-02-18 13:39:05 -08:00
Sage Weil
149be93639 nuke: dpkg --configure -a and apt-get -f install
Installing debs means we are more likely to hit a case where we interrupt
apt/dpkg.  Try to mop up as best we can in nuke.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-02-18 13:39:04 -08:00
Sage Weil
3400ea39ba nuke: whitespace
Signed-off-by: Sage Weil <sage@inktank.com>
2013-02-18 13:39:04 -08:00
Sage Weil
28116db6a0 nuke: remove librados, librbd
Signed-off-by: Sage Weil <sage@inktank.com>
2013-02-18 13:39:04 -08:00
Sander Pool
c525e1061b Install ceph debs and use installed debs
The ceph task installs ceph using the debian
packages now, and all invocations of binaries installed
in {tmpdir}/binary/usr/local/bin/ are replaced with
the use of the binaries installed in standard locations
by the debs.

Author:    Sander Pool <sander.pool@inktank.com>
Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-18 13:39:03 -08:00
Sage Weil
d790eeb451 nuke: testrados -> ceph_test_rados
Signed-off-by: Sage Weil <sage@inktank.com>
2013-02-18 13:38:54 -08:00
Josh Durgin
ed3c3615c3 nuke: don't try unmount if we're rebooting everything anyway
This can cause issues when unmount hangs. Our automatic runs reboot
everything unconditionally, so this caused a bunch of unnecessary hangs
when an fs was accidentally rendered un-unmountable.
2013-02-05 23:31:39 -08:00
Josh Durgin
c6504bab9a nuke: make tmpfs check only umount tmpfs
The old check would also catch things like /tmp/cephtest/mnt.client.0, which are
used by cfuse, rbd, and kclient.
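
A stricter check of this kind can be sketched by matching on the filesystem type field rather than on the path. The function below is a hypothetical illustration that parses text in /proc/mounts format. It is not the check used in nuke.py.

```python
def tmpfs_mount_points(proc_mounts_text):
    """Return mount points whose filesystem type is exactly tmpfs.

    Expects text in /proc/mounts format: device mountpoint fstype options ...
    """
    points = []
    for line in proc_mounts_text.splitlines():
        fields = line.split()
        # Field 2 is the filesystem type; fuse or rbd mounts are skipped.
        if len(fields) >= 3 and fields[2] == 'tmpfs':
            points.append(fields[1])
    return points
```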
2013-02-05 23:28:12 -08:00
Sam Lang
887e93e7e5 nuke.py: Allow name of job/run to be specified
Nuke will clean up the base test directory by default, but can
clean up the test directory for a given run if specified.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-03 11:09:49 -06:00
Sam Lang
9de9ebcf05 nuke: get_testdir_base needs to be imported
Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-01 13:01:25 -06:00
Sam Lang
edfe5eeda1 nuke: Fix cleanup of test dir
Nuke used to remove /tmp/cephtest, now it tries to
remove the test dir, which it may not have the name
for.  Instead of removing the test dir, we just
remove the base directory for all test directories,
which may or may not be /tmp/cephtest.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-01 11:45:04 -06:00
Sam Lang
dcf99e43b9 nuke: Optionally check console status
Only check the ipmi console status if the ipmi
parameters have been defined in .teuthology.yaml.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-01 08:24:41 -06:00
Sam Lang
58111595d4 Support power cycling osds/nodes through ipmi
This patch defines a RemoteConsole class associated
with each Remote class instance, allowing
power cycling a target through ipmi.

Fixes/Implements #3782.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-01-31 08:23:37 -06:00
Sam Lang
ace4cb07b2 Replace /tmp/cephtest/ with configurable path
Teuthology uses /tmp/cephtest/ as the scratch test directory for
a run.  This patch replaces /tmp/cephtest/ everywhere with a
per-run directory: {basedir}/{rundir} where {basedir} is a directory
configured in .teuthology.yaml (/tmp/cephtest if not specified),
and {rundir} is the name of the run, as given in --name.  If no name
is specified, {user}-{timestamp} is used.

To get the old behavior (/tmp/cephtest), set test_path: /tmp/cephtest
in .teuthology.yaml.
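
The path selection described in this message can be sketched as follows. Only `test_path` and the `/tmp/cephtest` default come from the message. The `base_test_dir` key name, the timestamp format, and the function name `get_testdir` are assumptions made for the example.

```python
import getpass
import posixpath
import time


def get_testdir(config, name=None, user=None, timestamp=None):
    """Return the scratch directory for a run."""
    # test_path pins the directory exactly, which gives the old behavior.
    if 'test_path' in config:
        return config['test_path']
    # 'base_test_dir' is an assumed key name for {basedir}.
    basedir = config.get('base_test_dir', '/tmp/cephtest')
    if name is None:
        user = user or getpass.getuser()
        # Assumed timestamp format; the message only says {user}-{timestamp}.
        timestamp = timestamp or time.strftime('%Y-%m-%d_%H:%M:%S')
        name = '{0}-{1}'.format(user, timestamp)
    return posixpath.join(basedir, name)
```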

This change was motivated by #3782, which requires a test dir that
survives across reboots, but also resolves #3767.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-01-31 08:23:31 -06:00
Josh Durgin
57bb434def Fix errors found by pyflakes
A bunch of unused imports and variables.
2012-09-21 16:46:24 -07:00
tamil
78b7b02c07 imported subprocess module in nuke script
Signed-off-by: tamil <tamil.muthamizhan@inktank.com>
2012-09-14 15:04:40 -07:00
Josh Durgin
d27806a293 nuke: add missing import
2012-09-13 14:31:46 -07:00
Mike Ryan
7f6591b556 ceph: support tmpfs_journal option to put journal on tmpfs
Signed-off-by: Mike Ryan <mike.ryan@inktank.com>
2012-08-16 15:50:10 -07:00
Tommi Virtanen
99ac6b0b3e Disable asynchronous DNS lookups.
Especially on older hosts, we keep triggering errors::

  ServerNotFoundError: Unable to find the server at
  teuthology.front.sepia.ceph.com: [Errno 3] name does not exist

That comes from libevent's evdns via gevent.dns and httplib2. The rate
of these errors is low enough that they seem to be perhaps timeouts,
or more arbitrary. Busy looping on DNS resolution calls has never
triggered them, so far.

With ``monkey.patch_all(dns=False)``, the teuthology process will
block as a whole whenever doing DNS resolution. This will hopefully be
rare enough that it won't matter.

The only real "fix" seems to be upgrading libraries and hoping for the
best; this commit can be reverted after that is done.
2012-08-13 16:18:33 -07:00
Sage Weil
55847fc298 nuke: log what pid we are killing when we kill it
2012-07-18 11:04:30 -07:00
Sage Weil
9b28948635 nuke: honor 'check-locks: ...' field in targets file
If you are nuking a yaml file with check-locks: false, don't check locks.
2012-07-11 14:23:51 -07:00
Sage Weil
9ea22133b7 use sudo to kill teuthology proc
2012-07-06 20:15:55 -07:00
Sage Weil
132dc0066d nuke: be more careful about kill; simplify
If the archive dir is specified, make sure we are killing the right
process.

Also drop the kill_process helper; it's simple enough to open-code.
2012-07-04 14:47:33 -07:00
Sage Weil
6dbf53e298 nuke: nuke based on archive path
Use path/config.yaml for targets, path/pid for pid to kill, and
path/owner for job owner.
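
Reading those three files could look roughly like the sketch below. The file names come from the message. The file contents (a bare integer in `pid`, a single line in `owner`) and the helper name `read_archive_info` are assumptions.

```python
import os


def read_archive_info(archive):
    """Collect the nuke inputs stored under one archive directory."""
    with open(os.path.join(archive, 'pid')) as f:
        pid = int(f.read().strip())    # assumed: file holds a bare integer
    with open(os.path.join(archive, 'owner')) as f:
        owner = f.read().strip()       # assumed: file holds a single line
    return {
        'config': os.path.join(archive, 'config.yaml'),  # targets live here
        'pid': pid,
        'owner': owner,
    }
```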
2012-07-04 14:47:33 -07:00
tamil
f3c2451797 nuke - optionally kill the process hung
Added a kill_process function to kill a process that hangs in the nightly runs.
It takes the pid as an optional argument.

Signed-off-by: tamil <tamil.muthamizhan@inktank.com>
2012-07-03 12:23:36 -07:00
Josh Durgin
25114bf9a4 nuke: refactor to run in parallel and add unlock option
nuke-on-error already did this, but now teuthology-nuke does it
too. Also outputs targets that couldn't be nuked at the end.
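
The parallel pattern with a final failure report can be sketched as below. A standard-library thread pool is used here as a stand-in for whatever parallel helper teuthology itself uses, and `nuke_all` and `nuke_one` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def nuke_all(targets, nuke_one, max_workers=8):
    """Run nuke_one on every target in parallel; return the targets that failed."""
    failed = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {t: pool.submit(nuke_one, t) for t in targets}
        for target, future in futures.items():
            # exception() waits for completion and returns None on success.
            if future.exception() is not None:
                failed.append(target)
    return failed
```

A caller could then print the returned list at the end of the run, so that a single unreachable target does not hide the result for the others.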
2012-04-24 17:52:01 -07:00
Sage Weil
a11b69fd4c nuke: ignore ntpdate errors
We keep seeing a race between ntpd startup and our stop + ntpdate + start
sequence.  Ignore errors here.
2012-04-23 09:21:02 -07:00
Sage Weil
952940272b nuke: don't run umount when no xargs args
Gets rid of this noise:

INFO:teuthology.nuke:Unmount any osd data directories...
INFO:teuthology.orchestra.run.err:Usage: umount -h | -V
INFO:teuthology.orchestra.run.err:       umount -a [-d] [-f] [-r] [-n] [-v] [-t vfstypes] [-O opts]
INFO:teuthology.orchestra.run.err:       umount [-d] [-f] [-r] [-n] [-v] special | node...
INFO:teuthology.orchestra.run.err:Usage: umount -h | -V
INFO:teuthology.orchestra.run.err:       umount -a [-d] [-f] [-r] [-n] [-v] [-t vfstypes] [-O opts]
INFO:teuthology.orchestra.run.err:       umount [-d] [-f] [-r] [-n] [-v] special | node...
...
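
The guard amounts to not invoking umount with an empty argument list. GNU xargs offers `-r` (`--no-run-if-empty`) for this case. The same idea in Python is sketched below, with `umount_command` as a hypothetical helper that is not taken from nuke.py.

```python
def umount_command(mount_points):
    """Return an argv list for umount, or None when there is nothing to unmount."""
    mount_points = list(mount_points)
    if not mount_points:
        return None  # avoids the usage message umount prints when given no arguments
    return ['sudo', 'umount'] + mount_points
```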
2012-04-03 15:56:36 -07:00
Sage Weil
2a18c3e1d0 nuke: unmount osd data directories
This helps us avoid reboot to clean up osd data directories that are left
mounted.
2012-03-06 09:34:38 -08:00
Sage Weil
50cc60f02d nuke: nuke testrados too
Slightly fewer nuke -r's
2012-02-14 15:23:19 -08:00
Sage Weil
975d73a2bb nuke: nuke testrados and rados processes, too
So that -r is needed slightly less often.
2012-02-13 15:28:24 -08:00
Josh Durgin
0da44591a9 nuke: take config files from -t argument
teuthology-lock and teuthology-updatekeys both use -t for this already
2012-01-12 14:48:36 -08:00
Sage Weil
721c0e9720 nuke: don't specify full path
/tmp/cephtest/binary may have been removed; kill stray daemons by name
only.  We really don't care about false positives here!
2011-11-19 20:56:49 -08:00
Josh Durgin
afa56f16d1 nuke: increase reboot timeout
Some sepia nodes are very slow to reboot.
2011-11-09 10:49:37 -08:00
Josh Durgin
5d32bcae50 Add nuke-on-error option.
This lets automated jobs nuke and unlock machines after failed
tests. Each machine is nuked individually, so one down machine won't
keep others from being nuked and unlocked.
2011-11-08 16:09:21 -08:00