Commit Graph

371 Commits

Author SHA1 Message Date
Sage Weil
fb74b90152 thrasher: add max_dead
Add max_dead, and revive osds prior to waiting for clean.  Otherwise we
can leave too many OSDs down and the cluster will never go clean.
2012-01-11 12:54:08 -08:00
Sage Weil
50463ffddd verify all osds start before checking health
Just checking health isn't good enough, since it races with OSD startup:
we can have a healthy cluster with 0 (or something else < total) OSDs.
2012-01-11 12:54:08 -08:00
Josh Durgin
f4883ebf09 ceph: let the user running ceph-osd remove subvolumes
This will prevent EPERM when using the SNAP_DESTROY ioctl,
so the filestore will use btrfs snaps.
2012-01-10 16:07:04 -08:00
Josh Durgin
d2fadf9fe2 syslog: ignore lockdep non-static key warning
It looks like this warning was made default in linux 3.2.
This will keep happening until #1922 is done.
2012-01-10 15:28:42 -08:00
Sage Weil
13445d237b ceph_manager: a booting osd is no longer automatically marked in
as of ceph.git commit 96b7b0d83e
2012-01-06 17:21:38 -08:00
Sage Weil
001701a0f7 mon_recovery: need n/2 + 1 monitors for quorum 2012-01-06 15:12:15 -08:00
Josh Durgin
ec3a3a9654 rados: fix example config 2012-01-03 14:07:45 -08:00
Josh Durgin
0176c9ab0f Remove unused mon.0 variables. 2012-01-03 13:02:31 -08:00
Josh Durgin
2e9b1c75f9 rados: use testrados instead of testsnaps and testreadwrite 2012-01-03 13:02:29 -08:00
Josh Durgin
932257fb6e rados: remove unused variable 2011-12-30 14:37:45 -08:00
Josh Durgin
0af9c0a2e7 rados: clean up argument construction
Only the client id varies, so it can be done outside the loop. Also
handle coredumps and coverage, and use LD_LIBRARY_PATH instead of
LD_PRELOAD.
2011-12-30 14:37:45 -08:00
Josh Durgin
6df4ce5075 rados: fix references to testrados 2011-12-30 14:37:45 -08:00
Josh Durgin
cdf142b597 rados: fix documentation format 2011-12-30 14:37:45 -08:00
Mark Kampe
f04e29557e teuthology rgw-admin: annotated test cases for inventory
this is not a nose suite, so I simply added test case
   descriptions in csv format, and put a file to extract
   them at the top of the file.
Signed-off-by: Mark Kampe <mark.kampe@dreamhost.com>
2011-12-29 13:09:08 -08:00
Josh Durgin
d0e90d71bd syslog checking: forgot a pipe 2011-12-16 18:09:17 -08:00
Yehuda Sadeh
7eec30946d roundtrip: add task 2011-12-15 13:24:53 -08:00
Yehuda Sadeh
97cc6c2990 readwrite: fix task with default conf 2011-12-15 12:39:39 -08:00
Yehuda Sadeh
659e66aa09 readwrite: fix conf, task runs 2011-12-14 17:14:30 -08:00
Yehuda Sadeh
7d085ad939 readwrite: add readwrite task
still not really running, but at least getting configured
2011-12-14 16:12:55 -08:00
Josh Durgin
c9e4504fbd Ignore lockdep being turned off for now.
Some machines are hitting this udev issue:
http://marc.info/?l=linux-kernel&m=132033587908426&w=2 and lockdep is
turned off after the first warning.
2011-12-12 16:29:41 -08:00
Josh Durgin
7b52dd1410 syslog: ignore 'task blocked' warnings
These will happen under heavy load (usually on the osd).
2011-12-08 17:17:47 -08:00
Josh Durgin
e69057e4a1 internal: check syslog for errors
This should catch lockdep warnings and mark tests with them as failed.
2011-12-07 15:20:33 -08:00
Josh Durgin
95e632475f workunit: set client id and secretfile env vars
These are used by the kernel rbd workunit to know how to map images.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-12-06 16:16:38 -08:00
Tommi Virtanen
e80c32c442 Rename "testrados" and "testswift" tasks to not begin with "test".
Anything "test*" looks like a unit test, and shouldn't be used for
actual code.
2011-12-05 10:07:25 -08:00
Sage Weil
4b53288b0c ceph_manager: % 2011-11-19 20:56:49 -08:00
Yehuda Sadeh
23aae67aff testswift: fix config 2011-11-17 16:53:57 -08:00
Tommi Virtanen
d8fc151365 Clean up C++isms. 2011-11-17 17:00:44 -08:00
Tommi Virtanen
c545094895 Add a task for easily running chef-solo on all the nodes. 2011-11-17 16:49:47 -08:00
Sage Weil
89f80412c2 ceph_manager: fix logging 2011-11-17 13:46:02 -08:00
Josh Durgin
f85f5dd7e3 ceph: deep merge overrides, so e.g. log whitelists can be overridden 2011-11-17 13:07:03 -08:00
Josh Durgin
c6988a07f4 Save config after locking nodes, so targets are included. 2011-11-17 11:57:07 -08:00
Josh Durgin
4e6cd55c59 filestore_idempotent: remove unused import 2011-11-17 11:18:24 -08:00
Josh Durgin
7d51e3d381 mon_recovery: remove unused code and import 2011-11-17 11:16:08 -08:00
Josh Durgin
f4d527e743 thrashosds: timeout for every clean check, not just the last one 2011-11-17 11:11:33 -08:00
Josh Durgin
9d12b720e8 ceph_manager: add a default timeout of 5 minutes for mon quorum 2011-11-17 11:05:12 -08:00
Josh Durgin
cb9ac0897b ceph_manager: log mon quorum status so the logs show progress (or lack thereof) 2011-11-17 10:45:19 -08:00
Yehuda Sadeh
f3c569ee23 rgw: add swift task
still not completely working (for some reason it skips all the tests)
2011-11-16 16:00:01 -08:00
Sage Weil
c5f070b8a9 filestore_idempotent.py: simple task to test non-idempotent osd ops
Write some non-idempotent events to the osd.  Simulate a failure.  Verify
the result is correct on replay.

This must be preceded by the ceph task just so that we get the binaries
installed.  Should clean this up later if/when the installation gets
factored out of ceph.py.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-10 21:35:11 -08:00
Sage Weil
6618a0275c mon_recovery: add task to test monitor cluster failure recovery
Some simple tests to start with.  We still need some sort of mon cluster
thrashing.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-08 22:17:00 -08:00
Sage Weil
60863f70eb ceph_manager: manipulate monitors 2011-11-08 22:17:00 -08:00
Sage Weil
6d39cc1146 ceph: keep ceph.conf at ctx.ceph.conf
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-08 22:17:00 -08:00
Josh Durgin
006a0dd423 Remove unused imports and variable. 2011-11-08 16:09:21 -08:00
Tommi Virtanen
c764b2475b Fix leftover orchestra import clause.
This seems to be a leftover from
a2372fce12,
no idea how it stayed hidden this long.
2011-11-07 13:05:14 -08:00
Josh Durgin
4f3b113832 ceph_manager: log ceph -s output so progress is visible in the logs 2011-11-03 13:27:44 -07:00
Josh Durgin
0b451f9475 Keep each ssh connection alive.
With long-running jobs like thrashing, ssh connections were timing
out.
2011-11-03 13:08:49 -07:00
Samuel Just
a2f406ef49 testrados: set CEPH_CLIENT_ID without a ;
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-11-02 11:33:37 -07:00
Samuel Just
810cae1a1d testrados: specify CEPH_CONF directly
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-10-31 14:54:24 -07:00
Yehuda Sadeh
10c3508741 rgw: add user suspend/enable test 2011-10-27 12:11:28 -07:00
Yehuda Sadeh
86aa940ffb rgw: log-to-stderr is now a binary flag 2011-10-27 11:32:12 -07:00
Samuel Just
8d0a7c5977 testrados: rename testsnaps to testrados and make snap testing optional
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-10-24 14:25:22 -07:00
Josh Durgin
a1249d07ca workunit: set PYTHONPATH so we can test python bindings 2011-10-24 13:52:58 -07:00
Sage Weil
b8beff3dd5 ceph_manager: count active+clean+<something else> as active+clean
In my case, one pg was active+clean+scrubbing.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-10-21 10:54:05 -07:00
Sage Weil
4ec37b2391 add lost_unfound task
Also some misc useful bits to ceph_manager.
2011-10-17 15:32:22 -07:00
Josh Durgin
bcded7f163 ceph: add whitelist for cluster log errors
Some messages are expected when thrashing osds or creating unfound
objects.

Fixes: #1622
2011-10-17 14:42:08 -07:00
Yehuda Sadeh
493596a7fd radosgw-admin: test swift keys creation/removal 2011-10-12 15:37:33 -07:00
Josh Durgin
3d3eb0efea Remove --keep-locked-on-error, and behave as if it were specified
This will help prevent machines with cephtest dirs still present from
being used. It's easy to unlock machines - the targets yaml fragment
is output during a run.
2011-10-07 14:49:53 -07:00
Samuel Just
4722d468c6 task/watch_notify_stress: watch_notify_stress now thrashes clients
This should exercise the watch notify timeout code.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-10-06 14:34:44 -07:00
Sage Weil
4e61e4835e rgw: keep radosgw in foreground
It defaults to a daemon now.
2011-10-06 12:50:12 -07:00
Josh Durgin
107db6a913 Retry listing machines if the lock server goes down. 2011-10-04 17:21:00 -07:00
Sage Weil
39a1e76065 rgw: use normal logging mechanism
Keep capturing stdout/err, even though it should end up empty.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-10-04 16:09:51 -07:00
Josh Durgin
3d3ba1ebb1 daemon-helper: detect the signal actually sent
I thought I fixed this when I implemented coverage collection, but I
guess it got lost in a rebase or something.
2011-10-04 12:17:19 -07:00
Josh Durgin
d305d61b86 ceph_manager: remove unused raw_pg_status method 2011-10-03 17:49:53 -07:00
Josh Durgin
8e031730c1 ceph_manager: run ceph -s as a normal program
This allows failures from it to be detected better.
2011-10-03 17:49:13 -07:00
Josh Durgin
1cad309d65 Add failure_reason to summary for the first failure detected.
For now, this is the exception raised during a task, the error found
in the central log, or coredumps found. More specific errors
(i.e. s3-tests had 3 failures) can be added later as exceptions raised
by tasks.
2011-10-03 17:07:41 -07:00
Josh Durgin
817b950494 radosbench: get coverage and cores 2011-10-03 17:07:41 -07:00
Samuel Just
fe1a271d69 watch_notify_stress.py: add ceph flags option
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-10-03 14:26:08 -07:00
Samuel Just
28d60172f6 ceph.py: add btrfs option
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-10-03 14:26:04 -07:00
Sage Weil
2b601a32d0 radosgw-admin: test additional keys, log list/show/rm
Signed-off-by: Sage Weil <sage@newdream.net>
2011-10-03 09:45:11 -07:00
Sage Weil
b93a00771f tasks/radosgw-admin: test radosgw-admin tool
Not yet complete...
2011-10-03 09:45:11 -07:00
Greg Farnum
9b44469e5e s3-tests: use radosgw-admin instead of radosgw_admin
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-09-30 09:26:42 -07:00
Josh Durgin
52427589a6 ceph_manager: parse osd numbers with dots
This is necessary since wip-dot-names was merged.
2011-09-29 09:09:31 -07:00
Sage Weil
a92fef77dc rename c* -> ceph-*
Leave cfuse task name unchanged for now...
2011-09-23 08:57:18 -07:00
Samuel Just
ef56a72b73 task/watch_notify_stress.py: add simple watch_notify stress test
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-09-22 13:25:21 -07:00
Greg Farnum
e4dfe3d4bd lockfile: increase interval to prevent incorrect locking orders
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-09-20 10:04:01 -07:00
Greg Farnum
5ff88d1902 lockfile: don't fail cleanup if no lock procs exist
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-09-20 10:03:33 -07:00
Tommi Virtanen
0d5dbfa27e workunit: Fetch source from github.
Needed an elaborate dance because Github won't let us download
an archive of a subdirectory.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
2011-09-16 11:32:15 -07:00
Tommi Virtanen
5583fac383 s3tests: Clone repository from github.
Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
2011-09-16 11:09:45 -07:00
Samuel Just
4a0f8fee54 ceph.py: remove unused variables mds_daemons and mon_daemons
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-09-15 17:26:38 -07:00
Samuel Just
a3c886af19 ceph.py/cephmanager.py: add ctx.daemons for restarting daemons
ctx.daemons will now be an instance of CephState.

ctx.daemons.get_daemon(role, id).stop() to stop daemon, restart() to
restart the daemon, etc.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-09-15 17:08:34 -07:00
Samuel Just
85cb29d345 testsnaps: LD_PRELOAD needed for librados
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-09-14 16:28:06 -07:00
Tommi Virtanen
a2372fce12 Move orchestra to teuthology.orchestra so there's just one top-level package. 2011-09-13 14:53:02 -07:00
Tommi Virtanen
cc72fe6cf3 Callers of task s3tests.create_users don't need to provide dummy "fixtures" dict. 2011-09-09 13:22:03 -07:00
Josh Durgin
1970bad9d9 thrashosds: fix timeout when no options are specified 2011-09-09 10:31:08 -07:00
Josh Durgin
8dd52f9941 thrashosds: fail if cluster doesn't finally become clean in 5 minutes 2011-09-08 18:09:11 -07:00
Josh Durgin
fc1b14ddcc thrasher: get coverage and cores from calling ceph commands 2011-09-08 14:09:13 -07:00
Josh Durgin
b72c5a8363 thrashosds: wait for every pg to go active and clean before exiting 2011-09-08 14:07:23 -07:00
Josh Durgin
08747c5bfb thrasher: clean up a bit 2011-09-08 12:58:59 -07:00
Josh Durgin
091b0ae3de autotest: allow tests to be run on all clients 2011-09-07 17:50:12 -07:00
Josh Durgin
e45109b645 rbd: allow specifying all clients 2011-09-07 16:54:24 -07:00
Greg Farnum
655e4a4cfe locktest: don't fail cleanup if the dir doesn't exist
We're doing this the cheapest way possible: make the dir!

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-09-06 12:39:21 -07:00
Josh Durgin
5c99f9f264 rgw: run as an external fastcgi server to match dho 2011-09-02 17:58:19 -07:00
Sage Weil
e66dffc3d3 don't eat exceptions for breakfast
fixes 0c2bee1514
2011-09-02 11:07:10 -07:00
Greg Farnum
7c4a5ac83b locktest: make it actually run the executable test
This was missing an argument (the file to run on!) and apparently
that didn't cause the command to output a failure return code.

Additionally, the ceph wrappers were blocking a crash and falsely
reporting success back to teuthology. (Yikes!)

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-09-01 14:47:48 -07:00
Sage Weil
c502418fca thrashosds: make it work when first mon isn't mon.0 2011-09-01 12:56:29 -07:00
Sage Weil
3ce1cbb3c4 thrashosds: no camelcaps, add some whitespace 2011-09-01 12:56:29 -07:00
Josh Durgin
3d69965c42 workunits: remove unused variable 2011-08-31 16:49:05 -07:00
Josh Durgin
ec768ba3ca Fix pyflakes warnings. 2011-08-31 14:36:01 -07:00
Josh Durgin
5b42b08527 workunit: save coverage and coredumps
Anything that runs a ceph utility should be using these commands.
2011-08-30 17:13:44 -07:00
Greg Farnum
6d91915217 workunits: rework a little bit to allow "all" clients in a run
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-30 15:49:10 -07:00
Sage Weil
ec97dd8203 cfuse: support running through valgrind
Also switch up the config code so we can take per-client options.
2011-08-30 13:34:52 -07:00
Greg Farnum
0c2bee1514 valgrind: don't run valgrind_post if there's no valgrind
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-29 16:47:22 -07:00
Greg Farnum
3a3c859f5b valgrind: scan logs for bad results
It's not sophisticated but it will warn you about a node
if at least one node has issues.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-29 14:03:02 -07:00
Greg Farnum
50a648bdfc valgrind: use xml output for tools that support it
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-29 14:03:02 -07:00
Greg Farnum
fb33ef3c69 thrasher: improve documentation a little
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-25 15:27:30 -07:00
Greg Farnum
83e263425a thrasher: add option to mark OSDs down instead of out.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-25 15:19:30 -07:00
Greg Farnum
0f9b74e28c thrasher: allow a config to set values
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-25 15:18:42 -07:00
Greg Farnum
5d5de0e70c thrasher: remove redundant wait_till_clean()
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-25 14:38:34 -07:00
Greg Farnum
1130e5fe37 coverage: create dir conditionally
We don't need to create the dir if we aren't using coverage.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-24 16:48:14 -07:00
Greg Farnum
0840d05a8f lockfile: add a lockfile task
This allows pretty highly configurable testing of
fcntl locking via a teuthology task.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-24 16:39:23 -07:00
Sage Weil
42318c57cb check ceph cluster log for badness (ERR, WRN, SEC) 2011-08-23 21:00:26 -07:00
Sage Weil
21d04419b8 ceph: copy cluster log file to archive/ceph.log 2011-08-22 22:04:57 -07:00
Sage Weil
e79dda9a9d workunits: set CEPH_CONF environment
This allows any ceph util we run (including the rados-api tests) to find
the config and keyrings they need.
2011-08-21 17:26:15 -07:00
Sage Weil
aa575c1318 rbd: make default image 10G instead of 1G 2011-08-21 15:14:02 -07:00
Greg Farnum
e20bae2a7f valgrind: Document!
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-17 10:35:37 -07:00
Greg Farnum
73de620c9e Merge branch 'wip-valgrind' 2011-08-17 10:32:57 -07:00
Greg Farnum
4efc95fa57 include log in valgrind log file names
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-17 10:30:26 -07:00
Greg Farnum
d5eb2c2b77 ceph task: split up arguments a little more
This allows selective daemon kill signal changes. With valgrind
daemons we want term instead of kill, for instance.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-17 10:30:24 -07:00
Greg Farnum
5323e1796f valgrind: move valgrind logs to log dir
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-17 10:29:54 -07:00
Greg Farnum
aa74481728 ceph: split up daemon-running arguments and insert valgrind ones
This setup should let us insert other kinds of things too, if we
need them.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-15 15:35:42 -07:00
Greg Farnum
9ec19f13df ceph: Set up valgrind as a flavor, and create a dir for logging.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-15 15:32:23 -07:00
Greg Farnum
98ac89a54e ceph task: pass the full config to the daemon startup subs
So far as I can tell there is no reason to reduce them to
the coverage config, and I want the full config for my
soon-to-exist valgrind options.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-15 15:31:18 -07:00
Tommi Virtanen
747deecaf6 Add assert to catch simple typos in roles list.
Input of "roles:\n- [mds,1]" used to make teuthology crash
in a non-obvious way.
2011-08-15 09:36:06 -07:00
Greg Farnum
0139323e51 Merge branch 'wip-nuke'
Conflicts:
	teuthology/task/kernel.py
2011-08-10 16:16:25 -07:00
Greg Farnum
6938946a19 manypools: remove commented-out code
This accidentally got left in from my development.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-10 16:12:53 -07:00
Greg Farnum
b5859f877a Move reconnect function from kernel task to misc.py
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-10 14:37:24 -07:00
Tommi Virtanen
7fd798a347 Configure grub to default to the right kernel, not the greatest installed one.
This is sticky; that is, even if you install other kernels (manually/via fab/etc),
grub will keep booting up the one that was last enabled via teuthology config.
Use teuthology to switch kernels and it'll just work.

If the kernel the grub default points to is removed, grub will fall back to
booting the kernel with the greatest version number.

Closes: http://tracker.newdream.net/issues/1364
2011-08-10 13:40:00 -07:00
Tommi Virtanen
39e22e4c0a Handle socket.timeout when waiting for a reconnect.
Now it gets ignored, just like the other harmless socket errors.
2011-08-10 13:22:14 -07:00
Tommi Virtanen
742109f4d9 Wait up to 300 seconds for a reboot.
At least sepia86 was reliably slower than the previous 180 second default.
2011-08-10 13:21:39 -07:00
Sage Weil
bf7b1dd4a7 ceph: fix max_mds calculation
Signed-off-by: Sage Weil <sage@newdream.net>
2011-08-10 12:47:20 -07:00
Greg Farnum
a1f3cac0b6 kernel: comment reconnect task, clean up reporting
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-10 09:07:48 -07:00
Greg Farnum
663bbf8b2b manypools: remove commented-out code
This accidentally got left in from my development.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-09 16:53:46 -07:00
Tommi Virtanen
1ccdcb9896 Make rbd task use mnt.N not mnt.client.N as mountpoint.
Everything else expects this, so e.g. workunits wouldn't work with rbd.
2011-08-09 16:25:00 -07:00
Tommi Virtanen
780ebcdf1b Make sure workunit task does not create mnt.N by itself.
This used to hide a bug in the rbd task, where rbd
created the mountpoint with the wrong name. The workunits
ended up running against the local filesystem.
2011-08-09 16:11:32 -07:00
Stephon Striplin
eee1d9a9e4 allow s3tests.create_users defaults be overridden 2011-08-09 14:28:08 -07:00
Sage Weil
b5ba155c17 Revert "fix get_clients"
This reverts commit 83b6678e79.  The bug I was
hitting was actually fixed by 06e3e69c29.
2011-08-09 13:23:58 -07:00
Gregory Farnum
137f36d533 teuthology: add task manypools
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-08 15:13:21 -07:00
Sage Weil
3f2ad30aca cfuse, kclient: print remote host 2011-08-05 14:35:44 -07:00
Sage Weil
83b6678e79 fix get_clients
Only return the clients that are listed (not _all_ clients).  There might
be a combination of cfuse and kclient (or other) clients here!
2011-08-05 14:35:44 -07:00
Sage Weil
06e3e69c29 tasks/kclient: don't clobber remote 2011-08-05 14:35:43 -07:00
Sage Weil
ef2b80910a use coverage_dir 2011-08-05 14:35:43 -07:00
Josh Durgin
f38c3697fd kernel: install in parallel 2011-08-05 11:17:28 -07:00
Josh Durgin
f66c010ef5 kernel: debug weird socket exceptions 2011-08-05 11:08:02 -07:00
Josh Durgin
6df0d71abf kernel: reboot immediately after installing
This hides the latency of rebooting when installing on many machines.
2011-08-05 11:07:40 -07:00
Josh Durgin
3e6b17f1b8 Down machines shouldn't be considered free. 2011-08-05 10:59:16 -07:00
Josh Durgin
68e6f2b77e Make scheduled tasks leave some machines free. 2011-08-04 18:32:57 -07:00
Josh Durgin
4e399da700 Log connections to targets
This way you can tell which machines have problems in case of an
error.
2011-08-04 18:25:43 -07:00
Greg Farnum
6ac6f7ab38 teuthology: convert from bzip2 to gzip.
gzip is much, much faster on large log files. With a 7.7GB client log, gzip
took 2:45 to compress it to 624MB. bzip2 took 34:38 to compress it to
366MB. For our purposes the space savings are not worth the time loss.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-07-29 10:35:02 -07:00
Sage Weil
277c4ff7aa set max_mds based on non-standbys 2011-07-28 10:25:30 -07:00
Sage Weil
5b0924494a tolerate named (not numbered) mons 2011-07-26 22:07:02 -07:00
Sage Weil
7c0f7c23c7 specify and clean up admin socket 2011-07-26 22:00:39 -07:00