547 Commits

Author SHA1 Message Date
Josh Durgin
0da44591a9 nuke: take config files from -t argument
teuthology-lock and teuthology-updatekeys both use -t for this already
2012-01-12 14:48:36 -08:00
Josh Durgin
96e89d30ec kernel: loop reconnecting in case we race with shutdown
Previously, if we reconnected before shutdown completed we asserted
that the kernel did not boot into the new version, when we just needed
to wait for the machine to reboot.
2012-01-12 13:02:22 -08:00
Sage Weil
59369237c9 thrasher: don't mark down osds out; tell monitor same
Stopping ceph-osd doesn't make it out (immediately).  Prevent monitor
from doing this after a delay too so we can keep our notion of what is
up/down/in/out accurate.
2012-01-11 12:54:09 -08:00
Sage Weil
3c0346b4cb lost_unfound: typo 2012-01-11 12:54:09 -08:00
Sage Weil
6dae2f8ae3 thrasher: adjust min_dead default
Make this 1, not 2.  That's a bit more friendly.  It doesn't strictly
matter, tho, since we revive osds before waiting for clean.
2012-01-11 12:54:09 -08:00
Sage Weil
fb74b90152 thrasher: add max_dead
Add max_dead, and revive osds prior to waiting for clean.  Otherwise we
can leave too many OSDs down and the cluster will never go clean.
2012-01-11 12:54:08 -08:00
Sage Weil
50463ffddd verify all osds start before checking health
Just checking health isn't good enough, since it races with OSD startup:
we can have a healthy cluster with 0 (or something else < total) OSDs.
2012-01-11 12:54:08 -08:00
Josh Durgin
f4883ebf09 ceph: let the user running ceph-osd remove subvolumes
This will prevent EPERM when using the SNAP_DESTROY ioctl,
so the filestore will use btrfs snaps.
2012-01-10 16:07:04 -08:00
Josh Durgin
d2fadf9fe2 syslog: ignore lockdep non-static key warning
It looks like this warning was made default in linux 3.2.
This will keep happening until #1922 is done.
2012-01-10 15:28:42 -08:00
Sage Weil
b354ce4e91 run: put pid in archive dir
This will make it easy for teuthology-ls to show you the running process's
pid (if it's still running).  Or for other utilities to kill + clean up
a hung teuthology run.
2012-01-08 14:39:30 -08:00
Sage Weil
13445d237b ceph_manager: a booting osd is no longer automatically marked in
as of ceph.git commit 96b7b0d83e
2012-01-06 17:21:38 -08:00
Sage Weil
001701a0f7 mon_recovery: need n/2 + 1 monitors for quorum 2012-01-06 15:12:15 -08:00
Sage Weil
da9210779e ceph: don't skip monitor ports
We can use the same port multiple times if they are on different hosts.
2012-01-06 13:36:54 -08:00
Josh Durgin
561f06cf94 suite: make email-on-success the default behavior
This way you can tell when a run is complete, instead of wondering if
it's stuck in the queue.
2012-01-05 17:27:31 -08:00
Josh Durgin
ec3a3a9654 rados: fix example config 2012-01-03 14:07:45 -08:00
Josh Durgin
cdd5c456a0 nuke-on-error: only unlock if this run locked the machines 2012-01-03 13:02:31 -08:00
Josh Durgin
0176c9ab0f Remove unused mon.0 variables. 2012-01-03 13:02:31 -08:00
Josh Durgin
2e9b1c75f9 rados: use testrados instead of testsnaps and testreadwrite 2012-01-03 13:02:29 -08:00
Josh Durgin
932257fb6e rados: remove unused variable 2011-12-30 14:37:45 -08:00
Josh Durgin
0af9c0a2e7 rados: clean up argument construction
Only the client id varies, so it can be done outside the loop. Also
handle coredumps and coverage, and use LD_LIBRARY_PATH instead of
LD_PRELOAD.
2011-12-30 14:37:45 -08:00
Josh Durgin
6df4ce5075 rados: fix references to testrados 2011-12-30 14:37:45 -08:00
Josh Durgin
cdf142b597 rados: fix documentation format 2011-12-30 14:37:45 -08:00
Josh Durgin
2f71f03fdd misc: simplify reconnect logic
Ignore all errors until the timeout expires so we don't have to worry
about whitelisting them.
2011-12-30 14:37:37 -08:00
Mark Kampe
f04e29557e teuthology rgw-admin: annotated test cases for inventory
this is not a nose suite, so I simply added test case
   descriptions in csv format, and put a file to extract
   them at the top of the file.
Signed-off-by: Mark Kampe <mark.kampe@dreamhost.com>
2011-12-29 13:09:08 -08:00
Josh Durgin
d0e90d71bd syslog checking: forgot a pipe 2011-12-16 18:09:17 -08:00
Yehuda Sadeh
7eec30946d rountrip: add task 2011-12-15 13:24:53 -08:00
Yehuda Sadeh
97cc6c2990 readwrite: fix task with default conf 2011-12-15 12:39:39 -08:00
Yehuda Sadeh
659e66aa09 readwrite: fix conf, task runs 2011-12-14 17:14:30 -08:00
Yehuda Sadeh
7d085ad939 readwrite: add readwrite task
still not really running, but at least getting configured
2011-12-14 16:12:55 -08:00
Josh Durgin
31b5ccbf1b coverage: use locally stored build instead of downloading from a gitbuilder 2011-12-13 16:16:09 -08:00
Josh Durgin
c9e4504fbd Ignore lockdep being turned off for now.
Some machines are hitting this udev issue:
http://marc.info/?l=linux-kernel&m=132033587908426&w=2 and lockdep is
turned off after the first warning.
2011-12-12 16:29:41 -08:00
Josh Durgin
a768ad738a coverage: don't generate html reports for each test
These can always be generated from the lcov files later; right now they just waste space.
2011-12-08 17:47:14 -08:00
Josh Durgin
7b52dd1410 syslog: ignore 'task blocked' warnings
These will happen under heavy load (usually on the osd).
2011-12-08 17:17:47 -08:00
Josh Durgin
e69057e4a1 internal: check syslog for errors
This should catch lockdep warnings and mark tests with them as failed.
2011-12-07 15:20:33 -08:00
Josh Durgin
95e632475f workunit: set client id and secretfile env vars
These are used by the kernel rbd workunit to know how to map images.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-12-06 16:16:38 -08:00
Tommi Virtanen
e80c32c442 Rename "testrados" and "testswift" tasks to not begin with "test".
Anything "test*" looks like a unit test, and shouldn't be used for
actual code.
2011-12-05 10:07:25 -08:00
Tommi Virtanen
0dd4d69ffe Fix unit tests for SSH keep-alive setting.
Commit 6e3e0d7cdc failed to pass
unit tests.
2011-12-05 10:02:30 -08:00
Tommi Virtanen
50c4b312a2 Handle interactive-on-error also when error is from contextmanager exit.
Closes: http://tracker.newdream.net/issues/1745
2011-11-30 17:07:26 -08:00
Tommi Virtanen
c651c88eac Properly handle case where first error is inside a context manager __exit__.
Closes: http://tracker.newdream.net/issues/1743
2011-11-21 16:00:49 -08:00
Sage Weil
721c0e9720 nuke: don't specify full path
/tmp/cephtest/binary may have been removed; kill stray daemons by name
only.  we really don't care about false positives here!
2011-11-19 20:56:49 -08:00
Sage Weil
4b53288b0c ceph_manager: % 2011-11-19 20:56:49 -08:00
Josh Durgin
508f4f8359 Save summary after nuking machines.
This way you can tell when tests are entirely finished running.
2011-11-18 13:53:51 -08:00
Josh Durgin
91cfdfea72 Add an example overrides file for running regression tests. 2011-11-18 12:22:18 -08:00
Josh Durgin
42cecb5e55 suite: put common config before facets
This lets you add tasks to the beginning of a run, like the chef task.
2011-11-17 17:26:21 -08:00
Josh Durgin
044a88ce59 suite: schedule a list of collections for running instead of a single suite directory 2011-11-17 17:16:23 -08:00
Yehuda Sadeh
23aae67aff testswift: fix config 2011-11-17 16:53:57 -08:00
Tommi Virtanen
d8fc151365 Clean up C++isms. 2011-11-17 17:00:44 -08:00
Tommi Virtanen
c545094895 Add a task for easily running chef-solo on all the nodes. 2011-11-17 16:49:47 -08:00
Sage Weil
89f80412c2 ceph_manager: fix logging 2011-11-17 13:46:02 -08:00
Josh Durgin
f85f5dd7e3 ceph: deep merge overrides, so e.g. log whitelists can be overridden 2011-11-17 13:07:03 -08:00
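
The "deep merge overrides" change above can be illustrated with a minimal sketch. This is not the code from commit f85f5dd7e3. The function name deep_merge, the config keys, and the choice to replace (rather than concatenate) non-dict values such as lists are all assumptions made for illustration.

```python
import copy


def deep_merge(base, overrides):
    """Return a new dict with `overrides` merged recursively into `base`.

    Hypothetical helper: nested dicts are merged key by key; any other
    value (including a list) in `overrides` replaces the one in `base`.
    """
    result = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            # Both sides are dicts: descend instead of replacing wholesale.
            result[key] = deep_merge(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


# Illustrative config; the key names are assumptions, not taken from the repo.
config = {"ceph": {"log-whitelist": ["slow request"], "fs": "btrfs"}}
overrides = {"ceph": {"log-whitelist": ["clocks not synchronized"]}}

merged = deep_merge(config, overrides)
print(merged)
```

With a shallow merge, the "ceph" section from the overrides would replace the whole "ceph" section of the config and drop the "fs" setting. The recursive version changes only the nested key that was overridden.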