Commit Graph

75 Commits

Author SHA1 Message Date
Zack Cerza
efd5ccbc4b Merge pull request #118 from ceph/wip-nukeskip
Check description of machines before nuking when -a is passed
2013-10-01 16:40:44 -07:00
Sandon Van Ness
6b248e80a6 Check description of machines before nuking when -a is passed
When teuthology-nuke is run with --archive/-a to kill and nuke
machines from an archive folder, it blindly nukes all the
targets it grabs from the config.yaml in the archive dir. This
change checks the description of locked machines and makes sure
the run name is in the description. If not, it removes the target
from the list passed to nuke().

Signed-off-by: Sandon Van Ness <sandon@inktank.com>
2013-10-01 16:12:31 -07:00
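
A minimal sketch of the filtering step described in the commit above, not the code from the commit itself. It assumes targets are a dict of host to ssh key and that lock descriptions are available as a dict of host to string; the function name and both data shapes are hypothetical.

```python
def filter_targets_by_run(targets, descriptions, run_name):
    """Keep only targets whose lock description mentions run_name.

    targets:      dict mapping host -> ssh host key (assumed shape)
    descriptions: dict mapping host -> lock description, possibly None
    """
    kept = {}
    for host, key in targets.items():
        description = descriptions.get(host) or ""
        if run_name in description:
            kept[host] = key
    return kept
```

Only the surviving entries would then be handed to nuke(), so a machine that has since been locked for another run is left alone.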
Sage Weil
3e31c49344 nuke: make half-hearted attempt to sync before reboot
We don't want to block on sync for fear of a hung kernel
mount.  However, we can give it a try and wait a few seconds
to get what we can.

This fixes a problem where our recent modifications to the
sudoers file are lost, with a 0 byte file left in its place,
because the task fails and we do a reboot -f -n.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-10-01 09:57:34 -07:00
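
The "try it, but don't block" idea from the commit above can be sketched locally as follows. This is an illustration only: the helper name is hypothetical, and the real change runs on remote machines and may be implemented quite differently (for example in shell).

```python
import subprocess


def try_briefly(cmd, timeout_s):
    """Start cmd and wait at most timeout_s seconds for it to finish.

    Returns True if cmd finished in time, False otherwise. On timeout
    the child is deliberately left running rather than waited on, so a
    command stuck on a hung mount cannot block the caller.
    """
    proc = subprocess.Popen(cmd)
    try:
        proc.wait(timeout=timeout_s)
        return True
    except subprocess.TimeoutExpired:
        return False
```

A caller could run `try_briefly(["sync"], 5)` and then proceed to the reboot regardless of the result.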
Zack Cerza
21765ce4f8 Move 'import os' to inside main()
This is necessary because of the monkey-patching.
2013-09-26 14:03:44 -05:00
Zack Cerza
a2c9bdc7ba Fix undefined name errors
(cherry picked from commit f59497ef2214f29d5995435d83766c7994e8f2cd)
2013-09-26 14:01:17 -05:00
Sage Weil
25bc62dec1 nuke: add missing import os
$ teuthology-nuke  -a . -r -u
Traceback (most recent call last):
  File "/home/ubuntu/bin/teuthology-nuke", line 9, in <module>
    load_entry_point('teuthology==0.0.1', 'console_scripts', 'teuthology-nuke')()
  File "/home/ubuntu/teuthology/teuthology/nuke.py", line 343, in main
    ifn = os.path.join(ctx.archive, 'info.yaml')
UnboundLocalError: local variable 'os' referenced before assignment

Signed-off-by: Sage Weil <sage@inktank.com>
2013-09-25 13:42:03 -07:00
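
For readers unfamiliar with this error: one common way to get "local variable 'os' referenced before assignment" is an `import os` that appears later in the same function, which makes `os` a local name for the whole function body. The sketch below reproduces that situation; the function names are hypothetical and it is not claimed to be the exact code path in nuke.py.

```python
import os  # module-level import, shadowed inside broken()


def broken(archive):
    # 'os' is read here...
    path = os.path.join(archive, 'info.yaml')
    # ...but this later import makes 'os' local to the entire function,
    # so the line above raises UnboundLocalError.
    import os
    return path


def fixed(archive):
    # Importing before first use inside the function avoids the error.
    import os
    return os.path.join(archive, 'info.yaml')
```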
tamil
eb4c575f54 made help more readable
Signed-off-by: tamil <tamil.muthamizhan@inktank.com>
2013-09-12 15:03:10 -07:00
tamil
40d6c60f13 feature # 5942. Added examples to teuthology binaries help page
Signed-off-by: tamil <tamil.muthamizhan@inktank.com>
2013-09-11 17:13:22 -07:00
Sage Weil
5acc57f5ad remove basedir/testdir distinction
We should never run with a conflicting testdir in the basedir, and the
code to do this is confusing and buggy.  Go back to a single testdir and
simple checks.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-09-10 10:53:41 -07:00
Sage Weil
dcbf50b86c nuke: get pid, owner from info.yaml (if present)
Fall back to the old files if info.yaml is missing.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-08-30 10:02:27 -07:00
Zack Cerza
3981a8f1af Never use 'except:' without specifying an Exception.
2013-08-30 11:10:05 -05:00
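
A short illustration of the rule in the commit above, using a hypothetical helper. A bare `except:` also catches `KeyboardInterrupt` and `SystemExit`, which can hide real problems; naming the exception keeps the handler narrow.

```python
def parse_pid(text):
    """Return text as an int pid, or None if it is not a number."""
    try:
        return int(text)
    except ValueError:  # only the failure we expect, not everything
        return None
```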
Sage Weil
86caebbed7 nuke: clean up stray firmware.git locks
These get lost occasionally and cause all firmware.git updates to
fail when the kernel task runs.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-08-23 09:00:47 -07:00
Sage Weil
0985f8c386 nuke: killall ceph-disk, too
Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-18 12:31:11 -07:00
Sandon Van Ness
d54932cbc8 Fix VM issues.
Fix for #5494, although the description is poor. Instead of adding a
wait, the code used to detect whether the guest was back up is fixed.
The previous code appeared to assume only one machine and broke
when it was waiting for multiple machines if the guests did not
come up within 10 seconds of each other.

Make nuke not do the normal stuff if the machine is a VPS as we
just destroy them when they get unlocked.

Instead of getting downburst options from ~/.teuthology.yaml, get
them from the yaml given to teuthology for the test/task.

Fixed an error that would make all the default downburst values
not take effect if any of them were set via a yaml.

Signed-off-by: Sandon Van Ness <sandon@inktank.com>
Reviewed-by: Warren Usui <warren.usui@inktank.com>
2013-07-03 19:07:35 -07:00
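
The last fix in the commit above (defaults lost as soon as any value was set via yaml) is the classic "replace instead of merge" mistake. A minimal sketch of per-key merging follows; the option names and default values are hypothetical, not downburst's real ones.

```python
# Hypothetical defaults, for illustration only.
DEFAULTS = {'disk-size': '30G', 'ram': '4G', 'cpus': 1}


def effective_options(overrides):
    """Start from the defaults and override only the keys the yaml sets."""
    merged = dict(DEFAULTS)          # copy, so DEFAULTS is never mutated
    merged.update(overrides or {})   # tolerate a missing/empty yaml section
    return merged
```

With this shape, setting only `ram` in the yaml leaves `disk-size` and `cpus` at their defaults.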
Warren Usui
a4994e3bde Support added for running scheduled tasks on virtual machines.
This included:
    A.) Changes made so that full path names were used on some files
        (scheduled tasks started in different home directories).
    B.) Changes to ensure tasks come up on the beanstalkc queue properly.
    C.) Finding and inserting the libvirt equivalent code for vm machines
        in order to simulate ipmi actions.
    D.) Fix host key code, report valgrind issue more clearly.
    E.) Some message and downburst call changes.

    Fix #4988
    Fix #5122
    Signed-off-by: Warren Usui <warren.usui@inktank.com>
2013-06-07 19:32:15 -07:00
Josh Durgin
d7fe5c0a34 nuke: don't require noipmi in ctx
This is called from run.py too, which won't have ctx.noipmi.
The default of using ipmi is fine for now for run.py.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-09 18:21:05 -07:00
Sage Weil
f7c8c27c7b Merge branch 'next'
2013-05-06 21:31:36 -07:00
Sam Lang
980973dc55 task/install.py: Allow installation of non-ceph
Generalizes the install task to specify a "project" which defaults to
'ceph', but can be configured to install different project packages,
for example:

install:
  project: samba
  extra_packages: samba

The default install task uses 'ceph' as the project, and relies on an
existing set of defined packages to install.  For other projects, the
packages to be installed must be specified with the extra_packages
field.  Multiple install tasks can be specified:

install:
install:
  project: samba
  extra_packages: samba

Which installs ceph packages and then samba packages.

Also, cleanup in nuke.py so that nuke and install use the same list of
packages when doing the remove steps.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-05-06 17:37:25 -07:00
Sam Lang
9e6f7b126b nuke.py: Allow ipmi power cycling to be skipped
Some nodes don't have ipmi setup.  Allow nuke to
skip the ipmi checking if -i (--noipmi) is specified.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-05-01 17:18:16 -07:00
Sage Weil
4a6e3b97e3 install, nuke: explicitly purge /var/lib/ceph
The packages won't do this anymore.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-03-22 15:22:38 -07:00
Warren Usui
0c75c6b1f7 Added el6 install functionality for CentOS systems.
install_packages, remove_packages and remove_sources are now the
installation and removal functions used by teuthology.  Debian
references have been removed outside of tasks/install.py.  CentOS
functionality parallel to Debian has been added to tasks/install.py,
and el6 references have been added to nuke.py, task/ceph-fuse.y and
task/install.py.

Some files created by CentOS are removed with rm -fr.  This should
be changed once the installation/removal rpm procedure is implemented.

Signed-off-by: Warren Usui <warren.usui@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-03-14 16:25:18 -07:00
Warren Usui
01a40cfbf1 Use service instead of initctl to restart rsyslog.
This change is needed to make sure teuthology works on CentOS when the
-a option is specified.

Signed-off-by: Warren Usui <warren.usui@inktank.com>
2013-03-13 18:37:25 -07:00
Sage Weil
bee8dffc34 nuke: blow away /home/ubuntu/cephtest too
(along with /tmp/cephtest)

Signed-off-by: Sage Weil <sage@inktank.com>
2013-02-25 17:54:49 -08:00
Sage Weil
d8021a1aa0 nuke: sudo for killall
Signed-off-by: Sage Weil <sage@inktank.com>
2013-02-22 10:51:51 -08:00
Josh Durgin
a862d8bf77 Fix unused vars, unused imports, and aliasing
Found by pyflakes

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-21 14:47:00 -08:00
Sage Weil
3f7c9bcaa4 move the install to a separate task.
Signed-off-by: Sage Weil <sage@inktank.com>
2013-02-18 15:06:52 -08:00
Sage Weil
d1d36241b7 ceph: use default data, keyring locations
This required reordering the cluster setup so that we do the ceph-osd
--mkfs --mkkey prior to gathering keys and initializing the monitors.

Also, run daemons as root.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-02-18 13:39:05 -08:00
Sage Weil
a54200d444 nuke: tolerate failed dpkg --configure -a/apt-get -f install
Signed-off-by: Sage Weil <sage@inktank.com>
2013-02-18 13:39:05 -08:00
Sage Weil
149be93639 nuke: dpkg --configure -a and apt-get -f install
Installing debs means we are more likely to hit a case where we interrupt
apt/dpkg.  Try to mop up as best we can in nuke.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-02-18 13:39:04 -08:00
Sage Weil
3400ea39ba nuke: whitespace
Signed-off-by: Sage Weil <sage@inktank.com>
2013-02-18 13:39:04 -08:00
Sage Weil
28116db6a0 nuke: remove librados, librbd
Signed-off-by: Sage Weil <sage@inktank.com>
2013-02-18 13:39:04 -08:00
Sander Pool
c525e1061b Install ceph debs and use installed debs
The ceph task installs ceph using the debian
packages now, and all invocations of binaries installed
in {tmpdir}/binary/usr/local/bin/ are replaced with
the use of the binaries installed in standard locations
by the debs.

Author:    Sander Pool <sander.pool@inktank.com>
Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-18 13:39:03 -08:00
Sage Weil
d790eeb451 nuke: testrados -> ceph_test_rados
Signed-off-by: Sage Weil <sage@inktank.com>
2013-02-18 13:38:54 -08:00
Josh Durgin
ed3c3615c3 nuke: don't try unmount if we're rebooting everything anyway
This can cause issues when unmount hangs. Our automatic runs reboot
everything unconditionally, so this caused a bunch of unnecessary hangs
when an fs was accidentally rendered un-unmountable.
2013-02-05 23:31:39 -08:00
Josh Durgin
c6504bab9a nuke: make tmpfs check only umount tmpfs
This would catch things like /tmp/cephtest/mnt.client.0, which are
used by cfuse, rbd, and kclient.
2013-02-05 23:28:12 -08:00
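
A sketch of what restricting the check to tmpfs could look like, assuming the mount list is read in /proc/mounts format (device, mount point, filesystem type, ...). The function name and the prefix argument are hypothetical; the actual check in nuke.py may be a shell pipeline instead.

```python
def tmpfs_mountpoints(mounts_text, prefix):
    """Return mount points under prefix whose filesystem type is tmpfs."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mountpoint, fstype = fields[1], fields[2]
        if fstype == 'tmpfs' and mountpoint.startswith(prefix):
            points.append(mountpoint)
    return points
```

Matching on the filesystem type rather than on the path alone is what keeps mounts such as /tmp/cephtest/mnt.client.0 out of the result.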
Sam Lang
887e93e7e5 nuke.py: Allow name of job/run to be specified
Nuke will clean up the base test directory by default, but can
clean up the test directory for a given run if specified.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-03 11:09:49 -06:00
Sam Lang
9de9ebcf05 nuke: get_testdir_base needs to be imported
Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-01 13:01:25 -06:00
Sam Lang
edfe5eeda1 nuke: Fix cleanup of test dir
Nuke used to remove /tmp/cephtest, now it tries to
remove the test dir, which it may not have the name
for.  Instead of removing the test dir, we just
remove the base directory for all test directories,
which may or may not be /tmp/cephtest.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-01 11:45:04 -06:00
Sam Lang
dcf99e43b9 nuke: Optionally check console status
Only check the ipmi console status if the ipmi
parameters have been defined in .teuthology.yaml.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-01 08:24:41 -06:00
Sam Lang
58111595d4 Support power cycling osds/nodes through ipmi
This patch defines a RemoteConsole class associated
with each Remote class instance, allowing
power cycling a target through ipmi.

Fixes/Implements #3782.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-01-31 08:23:37 -06:00
Sam Lang
ace4cb07b2 Replace /tmp/cephtest/ with configurable path
Teuthology uses /tmp/cephtest/ as the scratch test directory for
a run.  This patch replaces /tmp/cephtest/ everywhere with a
per-run directory: {basedir}/{rundir} where {basedir} is a directory
configured in .teuthology.yaml (/tmp/cephtest if not specified),
and {rundir} is the name of the run, as given in --name.  If no name
is specified, {user}-{timestamp} is used.

To get the old behavior (/tmp/cephtest), set test_path: /tmp/cephtest
in .teuthology.yaml.

This change was motivated by #3782, which requires a test dir that
survives across reboots, but also resolves #3767.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-01-31 08:23:31 -06:00
Josh Durgin
57bb434def Fix errors found by pyflakes
A bunch of unused imports and variables.
2012-09-21 16:46:24 -07:00
tamil
78b7b02c07 imported subprocess module in nuke script
Signed-off-by: tamil <tamil.muthamizhan@inktank.com>
2012-09-14 15:04:40 -07:00
Josh Durgin
d27806a293 nuke: add missing import
2012-09-13 14:31:46 -07:00
Mike Ryan
7f6591b556 ceph: support tmpfs_journal option to put journal on tmpfs
Signed-off-by: Mike Ryan <mike.ryan@inktank.com>
2012-08-16 15:50:10 -07:00
Tommi Virtanen
99ac6b0b3e Disable asynchronous DNS lookups.
Especially on older hosts, we keep triggering errors::

  ServerNotFoundError: Unable to find the server at
  teuthology.front.sepia.ceph.com: [Errno 3] name does not exist

That comes from libevent's evdns via gevent.dns and httplib2. The rate
of these errors is low enough that they seem to be perhaps timeouts,
or more arbitrary. Busy looping on DNS resolution calls has never
triggered them, so far.

With ``monkey.patch_all(dns=False)``, the teuthology process will
block as a whole whenever doing DNS resolution. This will hopefully be
rare enough that it won't matter.

The only real "fix" seems to be upgrading libraries and hoping for the
best; this commit can be reverted after that is done.
2012-08-13 16:18:33 -07:00
Sage Weil
55847fc298 nuke: log what pid we are killing when we kill it
2012-07-18 11:04:30 -07:00
Sage Weil
9b28948635 nuke: honor 'check-locks: ...' field in targets file
If you are nuking a yaml file with check-locks: false, don't check locks.
2012-07-11 14:23:51 -07:00
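
A sketch of how the 'check-locks' field might be read once the targets yaml has been parsed into a dict. The function name is hypothetical, and defaulting to True when the field is absent is an assumption inferred from the commit message.

```python
def should_check_locks(targets_config):
    """Return False only when the yaml explicitly sets check-locks: false."""
    # Assumed default: check locks unless told otherwise.
    return bool(targets_config.get('check-locks', True))
```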
Sage Weil
9ea22133b7 use sudo to kill teuthology proc
2012-07-06 20:15:55 -07:00
Sage Weil
132dc0066d nuke: be more careful about kill; simplify
If the archive dir is specified, make sure we are killing the right
process.

Also drop the kill_process helper; it's simple enough to open-code.
2012-07-04 14:47:33 -07:00