509 Commits

Author SHA1 Message Date
Sage Weil
c5bbfffa05 hammer.sh: new -nuke syntax 2012-01-16 13:18:31 -08:00
Sage Weil
8fb115fe2c include run duration in summary.yaml 2012-01-16 12:39:20 -08:00
Sage Weil
7b47e49fa8 ls: fix extraneous newline 2012-01-16 10:47:44 -08:00
Sage Weil
b58f9560ea ceph: ignore all leaks
unless/until we figure out where the DefinitelyLost records are coming
from; at first glance they look bogus.
2012-01-16 09:55:47 -08:00
Sage Weil
40fb86ff81 ceph: take single arg or list for valgrind args 2012-01-16 09:22:45 -08:00
Sage Weil
c88ec5719e combined mon, osd, mds starter functions 2012-01-15 22:54:09 -08:00
Sage Weil
f8ec23e79d rbd: default to all: 2012-01-15 22:53:39 -08:00
Sage Weil
72057a9cd8 use local mirrors for (most) github urls
A cronjob on ceph.newdream.net updates these every 15 minutes.  Sigh.
2012-01-15 22:52:58 -08:00
Sage Weil
fbfa94bb09 teuthology-ls: show pid, last line of output for running jobs 2012-01-15 22:52:58 -08:00
Sage Weil
f70b158cd1 show host -> roles mapping on startup
Less guessing when manually inspecting an in-progress or hung run.
2012-01-15 22:52:58 -08:00
Sage Weil
f795261454 lost_unfound: make test work with backfill
If we backfill, we fail to peer instead of having every object show up as
'unfound'.  Avoid that by preventing log trimming, so that we always do
log recovery for this test.
2012-01-15 22:52:58 -08:00
Tommi Virtanen
3bfa41cf6a Use yaml.safe_dump so unicode doesn't mess up the yaml files.
In general, yaml.dump is comparable to pickle, and my personal
coding standard says *never* use it. yaml.safe_dump is much nicer.
yaml.dump should have been named yaml.unsafe_dump, yaml.safe_dump
should have been named yaml.dump :(
2012-01-13 11:26:36 -08:00
Josh Durgin
0da44591a9 nuke: take config files from -t argument
teuthology-lock and teuthology-updatekeys both use -t for this already
2012-01-12 14:48:36 -08:00
Josh Durgin
96e89d30ec kernel: loop reconnecting in case we race with shutdown
Previously, if we reconnected before shutdown completed we asserted
that the kernel did not boot into the new version, when we just needed
to wait for the machine to reboot.
2012-01-12 13:02:22 -08:00
Sage Weil
59369237c9 thrasher: don't mark down osds out; tell monitor same
Stopping ceph-osd doesn't make it out (immediately).  Prevent monitor
from doing this after a delay too so we can keep our notion of what is
up/down/in/out accurate.
2012-01-11 12:54:09 -08:00
Sage Weil
3c0346b4cb lost_unfound: typo 2012-01-11 12:54:09 -08:00
Sage Weil
6dae2f8ae3 thrasher: adjust min_dead default
Make this 1, not 2.  That's a bit more friendly.  It doesn't strictly
matter, though, since we revive osds before waiting for clean.
2012-01-11 12:54:09 -08:00
Sage Weil
fb74b90152 thrasher: add max_dead
Add max_dead, and revive osds prior to waiting for clean.  Otherwise we
can leave too many OSDs down and the cluster will never go clean.
2012-01-11 12:54:08 -08:00
Sage Weil
50463ffddd verify all osds start before checking health
Just checking health isn't good enough, since it races with OSD startup:
we can have a healthy cluster with 0 (or something else < total) OSDs.
2012-01-11 12:54:08 -08:00
Josh Durgin
f4883ebf09 ceph: let the user running ceph-osd remove subvolumes
This will prevent EPERM when using the SNAP_DESTROY ioctl,
so the filestore will use btrfs snaps.
2012-01-10 16:07:04 -08:00
Josh Durgin
d2fadf9fe2 syslog: ignore lockdep non-static key warning
It looks like this warning was made default in linux 3.2.
This will keep happening until #1922 is done.
2012-01-10 15:28:42 -08:00
Sage Weil
b354ce4e91 run: put pid in archive dir
This will make it easy for teuthology-ls to show you the running process's
pid (if it's still running).  Or for other utilities to kill + clean up
a hung teuthology run.
2012-01-08 14:39:30 -08:00
Sage Weil
13445d237b ceph_manager: a booting osd is no longer automatically marked in
as of ceph.git commit 96b7b0d83e
2012-01-06 17:21:38 -08:00
Sage Weil
001701a0f7 mon_recovery: need n/2 + 1 monitors for quorum 2012-01-06 15:12:15 -08:00
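Note on the quorum rule in the commit above: a strict majority of n monitors is floor(n/2) + 1. The sketch below only illustrates that arithmetic. The function names are hypothetical and are not taken from the mon_recovery task.

```python
def quorum_size(num_monitors: int) -> int:
    """Smallest strict majority of num_monitors, i.e. floor(n/2) + 1."""
    if num_monitors < 1:
        raise ValueError("need at least one monitor")
    return num_monitors // 2 + 1


def max_tolerable_failures(num_monitors: int) -> int:
    """How many monitors can be down while a majority is still up."""
    return num_monitors - quorum_size(num_monitors)


if __name__ == "__main__":
    for n in (1, 3, 4, 5):
        print(n, quorum_size(n), max_tolerable_failures(n))
```

An even monitor count needs a larger quorum without tolerating any more failures: 3 and 4 monitors both survive only one failure.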
Sage Weil
da9210779e ceph: don't skip monitor ports
We can use the same port multiple times if they are on different hosts.
2012-01-06 13:36:54 -08:00
Josh Durgin
561f06cf94 suite: make email-on-success the default behavior
This way you can tell when a run is complete, instead of wondering if
it's stuck in the queue.
2012-01-05 17:27:31 -08:00
Josh Durgin
ec3a3a9654 rados: fix example config 2012-01-03 14:07:45 -08:00
Josh Durgin
cdd5c456a0 nuke-on-error: only unlock if this run locked the machines 2012-01-03 13:02:31 -08:00
Josh Durgin
0176c9ab0f Remove unused mon.0 variables. 2012-01-03 13:02:31 -08:00
Josh Durgin
2e9b1c75f9 rados: use testrados instead of testsnaps and testreadwrite 2012-01-03 13:02:29 -08:00
Josh Durgin
932257fb6e rados: remove unused variable 2011-12-30 14:37:45 -08:00
Josh Durgin
0af9c0a2e7 rados: clean up argument construction
Only the client id varies, so it can be done outside the loop. Also
handle coredumps and coverage, and use LD_LIBRARY_PATH instead of
LD_PRELOAD.
2011-12-30 14:37:45 -08:00
Josh Durgin
6df4ce5075 rados: fix references to testrados 2011-12-30 14:37:45 -08:00
Josh Durgin
cdf142b597 rados: fix documentation format 2011-12-30 14:37:45 -08:00
Josh Durgin
2f71f03fdd misc: simplify reconnect logic
Ignore all errors until the timeout expires so we don't have to worry
about whitelisting them.
2011-12-30 14:37:37 -08:00
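Note on the reconnect approach in the commit above: one way to "ignore all errors until the timeout expires" is a retry loop that swallows every exception until a deadline passes, then re-raises the last one. This is a minimal sketch under that assumption, not the code from the commit. `retry_until_timeout` and its parameters are hypothetical names.

```python
import time


def retry_until_timeout(action, timeout=300.0, interval=1.0,
                        clock=time.monotonic, sleep=time.sleep):
    """Call action() until it succeeds or `timeout` seconds have elapsed.

    Every exception raised by action() is ignored while the deadline has
    not passed, so no whitelist of expected errors is needed. Once the
    deadline is reached, the most recent exception is re-raised.
    """
    deadline = clock() + timeout
    while True:
        try:
            return action()
        except Exception:
            if clock() >= deadline:
                raise
            sleep(interval)
```

The clock and sleep functions are parameters only so the loop can be exercised without real waiting. A caller would normally pass just the connect function and a timeout.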
Mark Kampe
f04e29557e teuthology rgw-admin: annotated test cases for inventory
this is not a nose suite, so I simply added test case
   descriptions in csv format, and put a file to extract
   them at the top of the file.
Signed-off-by: Mark Kampe <mark.kampe@dreamhost.com>
2011-12-29 13:09:08 -08:00
Josh Durgin
d0e90d71bd syslog checking: forgot a pipe 2011-12-16 18:09:17 -08:00
Yehuda Sadeh
7eec30946d rountrip: add task 2011-12-15 13:24:53 -08:00
Yehuda Sadeh
97cc6c2990 readwrite: fix task with default conf 2011-12-15 12:39:39 -08:00
Yehuda Sadeh
659e66aa09 readwrite: fix conf, task runs 2011-12-14 17:14:30 -08:00
Yehuda Sadeh
7d085ad939 readwrite: add readwrite task
still not really running, but at least getting configured
2011-12-14 16:12:55 -08:00
Josh Durgin
31b5ccbf1b coverage: use locally stored build instead of downloading from a gitbuilder 2011-12-13 16:16:09 -08:00
Josh Durgin
c9e4504fbd Ignore lockdep being turned off for now.
Some machines are hitting this udev issue:
http://marc.info/?l=linux-kernel&m=132033587908426&w=2 and lockdep is
turned off after the first warning.
2011-12-12 16:29:41 -08:00
Josh Durgin
a768ad738a coverage: don't generate html reports for each test
These can always be generated from the lcov files later; right now they just waste space.
2011-12-08 17:47:14 -08:00
Josh Durgin
7b52dd1410 syslog: ignore 'task blocked' warnings
These will happen under heavy load (usually on the osd).
2011-12-08 17:17:47 -08:00
Josh Durgin
e69057e4a1 internal: check syslog for errors
This should catch lockdep warnings and mark tests with them as failed.
2011-12-07 15:20:33 -08:00
Josh Durgin
95e632475f workunit: set client id and secretfile env vars
These are used by the kernel rbd workunit to know how to map images.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-12-06 16:16:38 -08:00
Tommi Virtanen
e80c32c442 Rename "testrados" and "testswift" tasks to not begin with "test".
Anything "test*" looks like a unit test, and shouldn't be used for
actual code.
2011-12-05 10:07:25 -08:00
Tommi Virtanen
0dd4d69ffe Fix unit tests for SSH keep-alive setting.
Commit 6e3e0d7cdc failed to pass
unit tests.
2011-12-05 10:02:30 -08:00
Tommi Virtanen
50c4b312a2 Handle interactive-on-error also when error is from contextmanager exit.
Closes: http://tracker.newdream.net/issues/1745
2011-11-30 17:07:26 -08:00