Commit Graph

900 Commits

Author SHA1 Message Date
Sage Weil
5d6d6884fe rados: testrados -> ceph_test_rados
Signed-off-by: Sage Weil <sage@inktank.com>
2013-02-13 14:10:33 -08:00
Sage Weil
db41f26132 schedule_suite.sh: choose s3branch based on teuthology branch
Signed-off-by: Sage Weil <sage@inktank.com>
2013-02-13 08:50:46 -08:00
Sage Weil
7309bccca3 schedule_suite.sh: take option teuthology branch arg
Signed-off-by: Sage Weil <sage@inktank.com>
2013-02-12 21:31:44 -08:00
Sage Weil
0c663ca84e schedule_suite.sh: ensure ceph and kernel branches exist
Signed-off-by: Sage Weil <sage@inktank.com>
2013-02-12 21:24:16 -08:00
Sage Weil
6e3c2d93fb peer: add recovery delay to make test behave
Otherwise it was (very) racy!
2013-02-11 06:59:17 -08:00
Sandon Van Ness
a56eb88c16 Merge to include --machine-type and changes to --summary
Added the ability to support multiple types of machines with
--machine-type added to teuthology-lock when used with --lock-many
or --machine-type with teuthology --lock (automated tests). It
defaults to 'plana' and the 'vps' type is currently unused but
should be in the future.

Also updated teutholoy-lock --summary to be machine type aware
and sort things in a nice output.

Signed-off-by: Sandon Van Ness <sandon@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-07 16:34:14 -08:00
Sandon Van Ness
75d86e47fd Made teuthology-lock --summary machine type aware.
Signed-off-by: Sandon Van Ness <sandon@van-ness.com>
2013-02-07 16:06:21 -08:00
Sandon Van Ness
030bc7c23d Added support for multiple types of machines.
Added the ability to support multiple types of machines with
--machine-type added to teuthology-lock when used with --lock-many
or --machine-type with teuthology --lock (automated tests). It
defaults to 'plana' and the 'vps' type is currently unused but
should be in the future.

Signed-off-by: Sandon Van Ness <sandon@van-ness.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-07 13:26:37 -08:00
Sage Weil
ed9103aad5 rgw: parse testdir into apache.conf
Also fix up the template to use {{field}} for stuff we don't want to parse.
There is probably a better way...

Signed-off-by: Sage Weil <sage@inktank.com>
2013-02-06 22:02:10 -08:00
Sage Weil
67bbb9c77b osd_recovery: add missing testdir arg 2013-02-06 21:44:10 -08:00
Sage Weil
561ea14c6e ceph_manager: take int or string to osd_admin_socket
This fixes a failure on dump_stuck.
2013-02-06 17:14:24 -08:00
Sage Weil
3fbb552240 radosbench: fix missing format value
tdir is substituted in at the end.  There is probably a better way to do
this.
2013-02-06 09:06:35 -08:00
Sage Weil
936f314a63 rgw: fix testdir format on f
Format the path, not filehandle
2013-02-06 09:04:37 -08:00
Josh Durgin
ed3c3615c3 nuke: don't try unmount if we're rebooting everything anyway
This can cause issues when unmount hangs. Our automatic runs reboot
everything unconditionally, so this caused a bunch of unecessary hangs
when an fs was accidentally rendered un-unmountable.
2013-02-05 23:31:39 -08:00
Josh Durgin
c6504bab9a nuke: make tmpfs check only umount tmpfs
This would catch things like /tmp/cephtest/mnt.client.0, which are
used by cfuse, rbd, and kclient.
2013-02-05 23:28:12 -08:00
Sage Weil
82273e951b rbd: fix rbd image unmount
The testdir param was missing.  Avoid this class of errors by unmounting
exactly what we mounted.
2013-02-05 23:19:23 -08:00
Sage Weil
6099045990 rbd: set env before running sudo
Signed-off-by: Sage Weil <sage@inktank.com>
2013-02-05 23:01:25 -08:00
Sam Lang
100e9056ed misc: Close connections on reboot
When nodes are rebooted, the connections remain open
even after calling reconnect and setting up new ssh
sessions to the rebooted nodes.  This causes ECONNRESET
errors to show up in the teuthology output.

Close the existing connections before trying to reconnect.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-05 16:20:52 -06:00
Sam Lang
da10b58d65 task/ceph_manager: Fix NoneType config issue
kill_mon is getting a config set to None, which blows
up now due to the check for powercycle.  Initialize
the config to an empty dict if we don't get anything
on init.  This is the error showing up in teuthology:

2013-02-04T15:04:16.595 ERROR:teuthology.run_tasks:Manager failed: <contextlib.GeneratorContextManager object at 0x1fcafd0>
Traceback (most recent call last):
  File "/var/lib/teuthworker/teuthology-master/teuthology/run_tasks.py", line 45, in run_tasks
    suppress = manager.__exit__(*exc_info)
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/var/lib/teuthworker/teuthology-master/teuthology/task/mon_thrash.py", line 142, in task
    thrash_proc.do_join()
  File "/var/lib/teuthworker/teuthology-master/teuthology/task/mon_thrash.py", line 69, in do_join
    self.thread.get()
  File "/var/lib/teuthworker/teuthology-master/virtualenv/local/lib/python2.7/site-packages/gevent/greenlet.py", line 308, in get
    raise self._exception
AttributeError: 'NoneType' object has no attribute 'get'

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-05 10:38:48 -06:00
Josh Durgin
2f41f81dfa misc: don't use colon in default run name
LD_LIBRARY_PATH does not work with colons (and backslash does not escape them.)
2013-02-04 10:39:40 -08:00
Sam Lang
55c1bcf6b0 Add testdir param to get_valgrind_args() calls
Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-03 22:09:15 -06:00
Sam Lang
a5ba4f6a94 Merge branch 'wip-misc-fixes'
Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-03 11:38:29 -06:00
Sam Lang
887e93e7e5 nuke.py: Allow name of job/run to be specified
Nuke will cleanup the base test directory by default, but can
cleanup the test directory for a given run if specified.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-03 11:09:49 -06:00
Sam Lang
46d3ff94f5 run.py: Add target name to logging info
Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-03 11:09:04 -06:00
Sage Weil
ada803db0f rbd: fix .format() call with {1} syntax
IndexError: tuple index out of range
2013-02-03 08:18:52 -08:00
Sage Weil
fe9fb49e27 ceph_manager: use get() for self.config powercycle checks
I think this is what is going on...

Traceback (most recent call last):
  File "/var/lib/teuthworker/teuthology-master/teuthology/contextutil.py", line 27, in nested
    yield vars
  File "/var/lib/teuthworker/teuthology-master/teuthology/task/ceph.py", line 1158, in task
    yield
  File "/var/lib/teuthworker/teuthology-master/teuthology/run_tasks.py", line 25, in run_tasks
    manager = _run_one_task(taskname, ctx=ctx, config=config)
  File "/var/lib/teuthworker/teuthology-master/teuthology/run_tasks.py", line 14, in _run_one_task
    return fn(**kwargs)
  File "/var/lib/teuthworker/teuthology-master/teuthology/task/dump_stuck.py", line 93, in task
    manager.kill_osd(id_)
  File "/var/lib/teuthworker/teuthology-master/teuthology/task/ceph_manager.py", line 665, in kill_osd
    if 'powercycle' in self.config and self.config['powercycle']:
TypeError: argument of type 'NoneType' is not iterable
2013-02-02 21:01:08 -08:00
Sam Lang
7280980f34 Fixup latest commits that use /tmp/cephtest.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-02 11:00:17 -06:00
Sam Lang
d9fff40f6b task/chdir-coredump: Use readlink -e
realpath isn't available everywhere, use readlink -e instead.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-01 16:07:29 -06:00
Sam Lang
9a9fe73ec3 task/ceph: Fix typo in previous commit
Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-01 14:07:10 -06:00
Sam Lang
9de9ebcf05 nuke: get_testdir_base needs to be imported
Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-01 13:01:25 -06:00
Sam Lang
edfe5eeda1 nuke: Fix cleanup of test dir
Nuke used to remove /tmp/cephtest, now it tries to
remove the test dir, which it may not have the name
for.  Instead of removing the test dir, we just
remove the base directory for all test directories,
which may or may not be /tmp/cephtest.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-01 11:45:04 -06:00
Sam Lang
4ebd90eb81 task/ceph: Initialize disk_config maps
The mount_options and fstype maps need to be
initialized properly for later.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-01 11:37:13 -06:00
Sam Lang
150a3d7d9e misc: Don't include existing partitions in devs
We don't want to include /dev/sda1, etc. in the
list of devices to use.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-01 10:53:47 -06:00
Sam Lang
3806dc5e72 task/ceph: Fix device list
dict.items() returns a tuple, whereas we want
the values().

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-01 10:16:44 -06:00
Sam Lang
64e3966779 misc: get_wwn_id_map() needs to return dict
If we can't find device ids, we need to return
a dict, not a list.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-01 09:13:48 -06:00
Sam Lang
dcf99e43b9 nuke: Optionally check console status
Only check the ipmi console status if the ipmi
parameters have been defined in .teuthology.yaml.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-01 08:24:41 -06:00
Sam Lang
ac4ba69d8d misc: Fix get_wwn_id_map() to be optional
Not all plana nodes have symlinks setup when
we check /dev/disk/by-id/wwn-*.  Instead of failing
here, just use the /dev/disk/sd* devices.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-01 08:20:43 -06:00
Sam Lang
933cc3c382 run.py: Fix argument parsing for --name
With the addition of the --name argument to the
teuthology program (run.py), jobs were failing
because --name was being treated as a non-arg
option, even though the name was being supplied
by the workers.  Fix that and give it a metavar.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-01 07:46:04 -06:00
Samuel Just
fadc22c0b9 ceph_manager: wait for admin socket on restart, use for set_config
Fixes: #3966
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-01-31 12:59:00 -08:00
Josh Durgin
8f9267cf0e thrashosds: note assumption for powercycling 2013-01-31 09:14:06 -08:00
Sam Lang
77e8d801b1 Remove console.py
Handling of ipmi via the console is now done through the
Console class in teuthology/orchestra/remote.py.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-01-31 08:23:41 -06:00
Sam Lang
8f720454cb Assign devices to osds using the device wwn
Linux doesn't guarantee device names (/dev/sdb, etc.)
are always mapped to the same disk.  Instead of assigning
nominal devices to osds, we map devices by their wwn
(/dev/disk/by-id/wwn-*) to an osd (both data and journal).

Signed-off-by: Sam Lang <sam.lang@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-01-31 08:23:39 -06:00
Sam Lang
58111595d4 Support power cycling osds/nodes through ipmi
This patch defines a RemoteConsole class associated
with each Remote class instance, allowing
power cycling a target through ipmi.

Fixes/Implements #3782.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-01-31 08:23:37 -06:00
Sam Lang
87b9849628 add --name option to teuthology
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-01-31 08:23:34 -06:00
Sam Lang
ace4cb07b2 Replace /tmp/cephtest/ with configurable path
Teuthology uses /tmp/cephtest/ as the scratch test directory for
a run.  This patch replaces /tmp/cephtest/ everywhere with a
per-run directory: {basedir}/{rundir} where {basedir} is a directory
configured in .teuthology.yaml (/tmp/cephtest if not specified),
and {rundir} is the name of the run, as given in --name.  If no name
is specified, {user}-{timestamp} is used.

To get the old behavior (/tmp/cephtest), set test_path: /tmp/cephtest
in .teuthology.yaml.

This change was modivated by #3782, which requires a test dir that
survives across reboots, but also resolves #3767.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-01-31 08:23:31 -06:00
Sam Lang
14730276b9 Fixes for syntax errors found by pyflakes.
This patch includes minor fixes to the teuthology
python code for syntax errors found by running
check-syntax.sh (which runs pyflakes on each file).

Signed-off-by: Sam Lang <sam.lang@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-01-31 07:58:57 -06:00
Sam Lang
3390cc30a6 Scripts to use pyflakes to check python syntax.
pyflakes runs a basic syntax checker against python code.
The added check-syntax.sh script and Makefile run pyflakes
on the python code within the teuthology directory reporting
any syntax errors that are found.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-01-31 07:56:56 -06:00
Joao Eduardo Luis
a63fac32f8 task: mon_clock_skew_check: use absolute value when comparing mon_skew
The monitors may report either positive or negative clock skews, and by
not using an absolute value we were constantly ignoring reported negative
clock skews.

Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
2013-01-30 20:52:39 +00:00
Joao Eduardo Luis
89e09fa90c task: mon_clock_skew_check: mark as ran once if an expected skew was found
... even if we didn't get a clean/finished result from the monitors

This ought to significantly cut the waiting time if something else (or
someone else) is leaving the leader hanging thus unable to finish a given
timecheck round.

Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
2013-01-30 20:52:03 +00:00
Sage Weil
19f4273190 peer: fix filtering out of scrub from pg state 2013-01-29 14:04:09 -08:00