Commit Graph

855 Commits

Author SHA1 Message Date
Sage Weil
28116db6a0 nuke: remove librados, librbd
Signed-off-by: Sage Weil <sage@inktank.com>
2013-02-18 13:39:04 -08:00
Sage Weil
a529bb725f ceph: install ceph-mds, ceph-common
Signed-off-by: Sage Weil <sage@inktank.com>
2013-02-18 13:39:04 -08:00
Sage Weil
5235fc18a0 ceph: fix purge
Signed-off-by: Sage Weil <sage@inktank.com>
2013-02-18 13:39:03 -08:00
Sander Pool
c525e1061b Install ceph debs and use installed debs
The ceph task installs ceph using the debian
packages now, and all invocations of binaries installed
in {tmpdir}/binary/usr/local/bin/ are replaced with
the use of the binaries installed in standard locations
by the debs.

Author:    Sander Pool <sander.pool@inktank.com>
Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-18 13:39:03 -08:00
Sage Weil
d790eeb451 nuke: testrados -> ceph_test_rados
Signed-off-by: Sage Weil <sage@inktank.com>
2013-02-18 13:38:54 -08:00
Sage Weil
7a5fd05edd misc: replace : with - in testdir name
The :'s break the list in $PATH.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-02-17 22:13:45 -08:00
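The problem described in the commit above can be reproduced in a few lines. This is a minimal sketch, not code from the repository: the directory names and the helper `split_search_path` are hypothetical, and it assumes a POSIX-style search path where `:` separates entries.

```python
def split_search_path(value):
    """Split a PATH-style string on ':' the way a POSIX shell does."""
    return value.split(":")


# Hypothetical test directory names, for illustration only.
bad_dir = "/tmp/cephtest/run_12:30:05/bin"
good_dir = bad_dir.replace(":", "-")

# The colons inside the directory name are indistinguishable from separators,
# so the single directory is broken into several bogus entries.
bad_entries = split_search_path("/usr/bin:" + bad_dir)

# With '-' in place of ':', the directory survives as one entry.
good_entries = split_search_path("/usr/bin:" + good_dir)

print(bad_entries)
print(good_entries)
```

Because there is no escape character for the separator, renaming the directory is the simplest fix.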
Sage Weil
f931cad8f1 schedule_suite.sh: fix s3branch
Signed-off-by: Sage Weil <sage@inktank.com>
2013-02-15 09:33:27 -08:00
Sage Weil
9513f2f206 rbd_fsx: binary name now has ceph_ prefix
Signed-off-by: Sage Weil <sage@inktank.com>
2013-02-15 09:12:25 -08:00
Sage Weil
5d6d6884fe rados: testrados -> ceph_test_rados
Signed-off-by: Sage Weil <sage@inktank.com>
2013-02-13 14:10:33 -08:00
Sage Weil
db41f26132 schedule_suite.sh: choose s3branch based on teuthology branch
Signed-off-by: Sage Weil <sage@inktank.com>
2013-02-13 08:50:46 -08:00
Sage Weil
7309bccca3 schedule_suite.sh: take option teuthology branch arg
Signed-off-by: Sage Weil <sage@inktank.com>
2013-02-12 21:31:44 -08:00
Sage Weil
0c663ca84e schedule_suite.sh: ensure ceph and kernel branches exist
Signed-off-by: Sage Weil <sage@inktank.com>
2013-02-12 21:24:16 -08:00
Sage Weil
6e3c2d93fb peer: add recovery delay to make test behave
Otherwise it was (very) racy!
2013-02-11 06:59:17 -08:00
Sandon Van Ness
a56eb88c16 Merge to include --machine-type and changes to --summary
Added the ability to support multiple types of machines with
--machine-type added to teuthology-lock when used with --lock-many
or --machine-type with teuthology --lock (automated tests). It
defaults to 'plana' and the 'vps' type is currently unused but
should be in the future.

Also updated teuthology-lock --summary to be machine type aware
and sort things in a nice output.

Signed-off-by: Sandon Van Ness <sandon@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-07 16:34:14 -08:00
Sandon Van Ness
75d86e47fd Made teuthology-lock --summary machine type aware.
Signed-off-by: Sandon Van Ness <sandon@van-ness.com>
2013-02-07 16:06:21 -08:00
Sandon Van Ness
030bc7c23d Added support for multiple types of machines.
Added the ability to support multiple types of machines with
--machine-type added to teuthology-lock when used with --lock-many
or --machine-type with teuthology --lock (automated tests). It
defaults to 'plana' and the 'vps' type is currently unused but
should be in the future.

Signed-off-by: Sandon Van Ness <sandon@van-ness.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-07 13:26:37 -08:00
Sage Weil
ed9103aad5 rgw: parse testdir into apache.conf
Also fix up the template to use {{field}} for stuff we don't want to parse.
There is probably a better way...

Signed-off-by: Sage Weil <sage@inktank.com>
2013-02-06 22:02:10 -08:00
Sage Weil
67bbb9c77b osd_recovery: add missing testdir arg
2013-02-06 21:44:10 -08:00
Sage Weil
561ea14c6e ceph_manager: take int or string to osd_admin_socket
This fixes a failure on dump_stuck.
2013-02-06 17:14:24 -08:00
Sage Weil
3fbb552240 radosbench: fix missing format value
tdir is substituted in at the end.  There is probably a better way to do
this.
2013-02-06 09:06:35 -08:00
Sage Weil
936f314a63 rgw: fix testdir format on f
Format the path, not filehandle
2013-02-06 09:04:37 -08:00
Josh Durgin
ed3c3615c3 nuke: don't try unmount if we're rebooting everything anyway
This can cause issues when unmount hangs. Our automatic runs reboot
everything unconditionally, so this caused a bunch of unnecessary hangs
when an fs was accidentally rendered un-unmountable.
2013-02-05 23:31:39 -08:00
Josh Durgin
c6504bab9a nuke: make tmpfs check only umount tmpfs
This would catch things like /tmp/cephtest/mnt.client.0, which are
used by cfuse, rbd, and kclient.
2013-02-05 23:28:12 -08:00
Sage Weil
82273e951b rbd: fix rbd image unmount
The testdir param was missing.  Avoid this class of errors by unmounting
exactly what we mounted.
2013-02-05 23:19:23 -08:00
Sage Weil
6099045990 rbd: set env before running sudo
Signed-off-by: Sage Weil <sage@inktank.com>
2013-02-05 23:01:25 -08:00
Sam Lang
100e9056ed misc: Close connections on reboot
When nodes are rebooted, the connections remain open
even after calling reconnect and setting up new ssh
sessions to the rebooted nodes.  This causes ECONNRESET
errors to show up in the teuthology output.

Close the existing connections before trying to reconnect.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-05 16:20:52 -06:00
Sam Lang
da10b58d65 task/ceph_manager: Fix NoneType config issue
kill_mon is getting a config set to None, which blows
up now due to the check for powercycle.  Initialize
the config to an empty dict if we don't get anything
on init.  This is the error showing up in teuthology:

2013-02-04T15:04:16.595 ERROR:teuthology.run_tasks:Manager failed: <contextlib.GeneratorContextManager object at 0x1fcafd0>
Traceback (most recent call last):
  File "/var/lib/teuthworker/teuthology-master/teuthology/run_tasks.py", line 45, in run_tasks
    suppress = manager.__exit__(*exc_info)
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/var/lib/teuthworker/teuthology-master/teuthology/task/mon_thrash.py", line 142, in task
    thrash_proc.do_join()
  File "/var/lib/teuthworker/teuthology-master/teuthology/task/mon_thrash.py", line 69, in do_join
    self.thread.get()
  File "/var/lib/teuthworker/teuthology-master/virtualenv/local/lib/python2.7/site-packages/gevent/greenlet.py", line 308, in get
    raise self._exception
AttributeError: 'NoneType' object has no attribute 'get'

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-05 10:38:48 -06:00
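The fix described in the commit above, falling back to an empty dict when no config is passed, might look roughly like the following. This is a sketch under assumptions: the class `Manager` and the method `uses_powercycle` are hypothetical stand-ins, not the actual ceph_manager code.

```python
class Manager:
    """Hypothetical stand-in for a manager that receives an optional config."""

    def __init__(self, config=None):
        # Fall back to an empty dict so later lookups never operate on None.
        self.config = config if config is not None else {}

    def uses_powercycle(self):
        # .get() returns None for a missing key, which bool() turns into False.
        return bool(self.config.get("powercycle"))


# Without the fallback, calling .get() on None fails as in the traceback above.
config = None
try:
    config.get("powercycle")
    failed_without_fallback = False
except AttributeError:
    failed_without_fallback = True

print(failed_without_fallback)
print(Manager(None).uses_powercycle())
print(Manager({"powercycle": True}).uses_powercycle())
```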
Josh Durgin
2f41f81dfa misc: don't use colon in default run name
LD_LIBRARY_PATH does not work with colons (and backslash does not escape them.)
2013-02-04 10:39:40 -08:00
Sam Lang
55c1bcf6b0 Add testdir param to get_valgrind_args() calls
Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-03 22:09:15 -06:00
Sam Lang
a5ba4f6a94 Merge branch 'wip-misc-fixes'
Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-03 11:38:29 -06:00
Sam Lang
887e93e7e5 nuke.py: Allow name of job/run to be specified
Nuke will clean up the base test directory by default, but can
clean up the test directory for a given run if specified.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-03 11:09:49 -06:00
Sam Lang
46d3ff94f5 run.py: Add target name to logging info
Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-03 11:09:04 -06:00
Sage Weil
ada803db0f rbd: fix .format() call with {1} syntax
IndexError: tuple index out of range
2013-02-03 08:18:52 -08:00
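The IndexError quoted in the commit above is what str.format() raises when a template refers to a positional index that was not supplied. A minimal illustration follows; the template and values are hypothetical and not taken from the rbd task.

```python
# Hypothetical template with two positional fields.
template = "{0}/mnt.{1}"

# Supplying only one argument leaves {1} without a value.
try:
    template.format("/tmp/cephtest")
    raised_index_error = False
except IndexError:
    raised_index_error = True

# Supplying both arguments (or switching to named fields) avoids the error.
fixed = template.format("/tmp/cephtest", "client.0")
named = "{tdir}/mnt.{role}".format(tdir="/tmp/cephtest", role="client.0")

print(raised_index_error)
print(fixed)
print(named)
```

Named fields tend to be easier to keep in sync when an extra parameter such as a test directory is added to an existing template.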
Sage Weil
fe9fb49e27 ceph_manager: use get() for self.config powercycle checks
I think this is what is going on...

Traceback (most recent call last):
  File "/var/lib/teuthworker/teuthology-master/teuthology/contextutil.py", line 27, in nested
    yield vars
  File "/var/lib/teuthworker/teuthology-master/teuthology/task/ceph.py", line 1158, in task
    yield
  File "/var/lib/teuthworker/teuthology-master/teuthology/run_tasks.py", line 25, in run_tasks
    manager = _run_one_task(taskname, ctx=ctx, config=config)
  File "/var/lib/teuthworker/teuthology-master/teuthology/run_tasks.py", line 14, in _run_one_task
    return fn(**kwargs)
  File "/var/lib/teuthworker/teuthology-master/teuthology/task/dump_stuck.py", line 93, in task
    manager.kill_osd(id_)
  File "/var/lib/teuthworker/teuthology-master/teuthology/task/ceph_manager.py", line 665, in kill_osd
    if 'powercycle' in self.config and self.config['powercycle']:
TypeError: argument of type 'NoneType' is not iterable
2013-02-02 21:01:08 -08:00
Sam Lang
7280980f34 Fixup latest commits that use /tmp/cephtest.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-02 11:00:17 -06:00
Sam Lang
d9fff40f6b task/chdir-coredump: Use readlink -e
realpath isn't available everywhere, use readlink -e instead.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-01 16:07:29 -06:00
Sam Lang
9a9fe73ec3 task/ceph: Fix typo in previous commit
Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-01 14:07:10 -06:00
Sam Lang
9de9ebcf05 nuke: get_testdir_base needs to be imported
Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-01 13:01:25 -06:00
Sam Lang
edfe5eeda1 nuke: Fix cleanup of test dir
Nuke used to remove /tmp/cephtest, now it tries to
remove the test dir, which it may not have the name
for.  Instead of removing the test dir, we just
remove the base directory for all test directories,
which may or may not be /tmp/cephtest.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-01 11:45:04 -06:00
Sam Lang
4ebd90eb81 task/ceph: Initialize disk_config maps
The mount_options and fstype maps need to be
initialized properly for later.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-01 11:37:13 -06:00
Sam Lang
150a3d7d9e misc: Don't include existing partitions in devs
We don't want to include /dev/sda1, etc. in the
list of devices to use.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-01 10:53:47 -06:00
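One way to leave partitions out of a device list is to keep only names that match a whole-disk pattern. The sketch below is an assumption about how such a filter could work, not the code from this commit; the helper `whole_disks` and the `sd[a-z]+` naming pattern are hypothetical.

```python
import re

# Assumed naming scheme: whole disks are /dev/sdX, partitions add a number.
WHOLE_DISK = re.compile(r"^/dev/sd[a-z]+$")


def whole_disks(devs):
    """Return only the entries that look like whole disks, in input order."""
    return [dev for dev in devs if WHOLE_DISK.match(dev)]


print(whole_disks(["/dev/sda", "/dev/sda1", "/dev/sdb", "/dev/sdb2"]))
```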
Sam Lang
3806dc5e72 task/ceph: Fix device list
dict.items() returns (key, value) tuples, whereas we want
the values().

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-01 10:16:44 -06:00
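The difference between the two dict methods mentioned in the commit above is easy to see on a small example. The mapping below is hypothetical; only the behavior of items() and values() is the point.

```python
# Hypothetical id-to-device mapping, for illustration only.
devs_by_id = {
    "wwn-0x5000c5001": "/dev/sdb",
    "wwn-0x5000c5002": "/dev/sdc",
}

# items() yields (key, value) pairs, which is not a usable device list.
pairs = sorted(devs_by_id.items())

# values() yields just the device paths.
devices = sorted(devs_by_id.values())

print(pairs)
print(devices)
```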
Sam Lang
64e3966779 misc: get_wwn_id_map() needs to return dict
If we can't find device ids, we need to return
a dict, not a list.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-01 09:13:48 -06:00
Sam Lang
dcf99e43b9 nuke: Optionally check console status
Only check the ipmi console status if the ipmi
parameters have been defined in .teuthology.yaml.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-01 08:24:41 -06:00
Sam Lang
ac4ba69d8d misc: Fix get_wwn_id_map() to be optional
Not all plana nodes have symlinks set up when
we check /dev/disk/by-id/wwn-*.  Instead of failing
here, just use the /dev/disk/sd* devices.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-01 08:20:43 -06:00
Sam Lang
933cc3c382 run.py: Fix argument parsing for --name
With the addition of the --name argument to the
teuthology program (run.py), jobs were failing
because --name was being treated as a non-arg
option, even though the name was being supplied
by the workers.  Fix that and give it a metavar.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-01 07:46:04 -06:00
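The failure mode described in the commit above can be sketched with Python's argparse. Whether run.py was using argparse in exactly this way is an assumption; the helper `build_parser` and the program name `example` are hypothetical.

```python
import argparse


def build_parser(takes_value):
    """Build a parser where --name is either a value option or a bare flag."""
    parser = argparse.ArgumentParser(prog="example")
    if takes_value:
        # Value-taking option; metavar controls how it appears in --help.
        parser.add_argument("--name", metavar="NAME", help="name for this run")
    else:
        # Bare flag: consumes no argument, so a following value is left over.
        parser.add_argument("--name", action="store_true")
    return parser


# Value form: the name supplied on the command line is captured.
args = build_parser(True).parse_args(["--name", "nightly"])

# Flag form: "nightly" is an unrecognized leftover and argparse exits.
try:
    build_parser(False).parse_args(["--name", "nightly"])
    flag_form_failed = False
except SystemExit:
    flag_form_failed = True

print(args.name)
print(flag_form_failed)
```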
Samuel Just
fadc22c0b9 ceph_manager: wait for admin socket on restart, use for set_config
Fixes: #3966
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-01-31 12:59:00 -08:00
Josh Durgin
8f9267cf0e thrashosds: note assumption for powercycling
2013-01-31 09:14:06 -08:00
Sam Lang
77e8d801b1 Remove console.py
Handling of ipmi via the console is now done through the
Console class in teuthology/orchestra/remote.py.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-01-31 08:23:41 -06:00
Sam Lang
8f720454cb Assign devices to osds using the device wwn
Linux doesn't guarantee device names (/dev/sdb, etc.)
are always mapped to the same disk.  Instead of assigning
nominal devices to osds, we map devices by their wwn
(/dev/disk/by-id/wwn-*) to an osd (both data and journal).

Signed-off-by: Sam Lang <sam.lang@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-01-31 08:23:39 -06:00
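Resolving the persistent wwn-* names to the device nodes they currently point at can be sketched as below. This is not the teuthology implementation: the function `wwn_to_device_map` is hypothetical, and it runs locally rather than on a remote node. It returns an empty dict when no symlinks are found.

```python
import glob
import os


def wwn_to_device_map(by_id_dir="/dev/disk/by-id"):
    """Map each wwn-* symlink in by_id_dir to the path it resolves to."""
    mapping = {}
    for link in sorted(glob.glob(os.path.join(by_id_dir, "wwn-*"))):
        # realpath follows the symlink to the current device node.
        mapping[link] = os.path.realpath(link)
    return mapping


for link, device in wwn_to_device_map().items():
    print(link, "->", device)
```

Keying on the wwn link rather than the resolved name keeps the assignment stable even if the kernel enumerates the disks in a different order after a reboot.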