Commit Graph

253 Commits

Author SHA1 Message Date
Yehuda Sadeh
7d085ad939 readwrite: add readwrite task
still not really running, but at least getting configured
2011-12-14 16:12:55 -08:00
Josh Durgin
c9e4504fbd Ignore lockdep being turned off for now.
Some machines are hitting this udev issue:
http://marc.info/?l=linux-kernel&m=132033587908426&w=2 and lockdep is
turned off after the first warning.
2011-12-12 16:29:41 -08:00
Josh Durgin
7b52dd1410 syslog: ignore 'task blocked' warnings
These will happen under heavy load (usually on the osd).
2011-12-08 17:17:47 -08:00
Josh Durgin
e69057e4a1 internal: check syslog for errors
This should catch lockdep warnings and mark tests with them as failed.
2011-12-07 15:20:33 -08:00
Josh Durgin
95e632475f workunit: set client id and secretfile env vars
These are used by the kernel rbd workunit to know how to map images.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-12-06 16:16:38 -08:00
Tommi Virtanen
e80c32c442 Rename "testrados" and "testswift" tasks to not begin with "test".
Anything "test*" looks like a unit test, and shouldn't be used for
actual code.
2011-12-05 10:07:25 -08:00
Sage Weil
4b53288b0c ceph_manager: % 2011-11-19 20:56:49 -08:00
Yehuda Sadeh
23aae67aff testswift: fix config 2011-11-17 16:53:57 -08:00
Tommi Virtanen
d8fc151365 Clean up C++isms. 2011-11-17 17:00:44 -08:00
Tommi Virtanen
c545094895 Add a task for easily running chef-solo on all the nodes. 2011-11-17 16:49:47 -08:00
Sage Weil
89f80412c2 ceph_manager: fix logging 2011-11-17 13:46:02 -08:00
Josh Durgin
f85f5dd7e3 ceph: deep merge overrides, so e.g. log whitelists can be overridden 2011-11-17 13:07:03 -08:00
Josh Durgin
c6988a07f4 Save config after locking nodes, so targets are included. 2011-11-17 11:57:07 -08:00
Josh Durgin
4e6cd55c59 filestore_idempotent: remove unused import 2011-11-17 11:18:24 -08:00
Josh Durgin
7d51e3d381 mon_recovery: remove unused code and import 2011-11-17 11:16:08 -08:00
Josh Durgin
f4d527e743 thrashosds: timeout for every clean check, not just the last one 2011-11-17 11:11:33 -08:00
Josh Durgin
9d12b720e8 ceph_manager: add a default timeout of 5 minutes for mon quorum 2011-11-17 11:05:12 -08:00
Josh Durgin
cb9ac0897b ceph_manager: log mon quorum status so the logs show progress (or lack thereof) 2011-11-17 10:45:19 -08:00
Yehuda Sadeh
f3c569ee23 rgw: add swift task
still not completely working (for some reason it skips all the tests)
2011-11-16 16:00:01 -08:00
Sage Weil
c5f070b8a9 filestore_idempotent.py: simple task to test non-idempotent osd ops
Write some non-idempotent events to the osd.  Simulate a failure.  Verify
the result is correct on replay.

This must be preceeded by the ceph task just so that we get the binaries
installed.  Should clean this up later if/when the installation gets
factored out of ceph.py.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-10 21:35:11 -08:00
Sage Weil
6618a0275c mon_recovery: add task to test monitor cluster failure recovery
Some simple tests to start with.  We still need some sort of mon cluster
thrashing.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-08 22:17:00 -08:00
Sage Weil
60863f70eb ceph_manager: manipulate monitors 2011-11-08 22:17:00 -08:00
Sage Weil
6d39cc1146 ceph: keep ceph.conf at ctx.ceph.conf
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-08 22:17:00 -08:00
Josh Durgin
006a0dd423 Remove unused imports and variable. 2011-11-08 16:09:21 -08:00
Tommi Virtanen
c764b2475b Fix leftover orchestra import clause.
This seems to be a leftover from
a2372fce12,
no idea how it stayed hidden this long.
2011-11-07 13:05:14 -08:00
Josh Durgin
4f3b113832 ceph_manager: log ceph -s output so progress is visible in the logs 2011-11-03 13:27:44 -07:00
Josh Durgin
0b451f9475 Keep each ssh connection alive.
With long-running jobs like thrashing, ssh connections were timing
out.
2011-11-03 13:08:49 -07:00
Samuel Just
a2f406ef49 testrados: set CEPH_CLIENT_ID without a ;
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-11-02 11:33:37 -07:00
Samuel Just
810cae1a1d testrados: specify CEPH_CONF directly
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-10-31 14:54:24 -07:00
Yehuda Sadeh
10c3508741 rgw: add user suspend/enable test 2011-10-27 12:11:28 -07:00
Yehuda Sadeh
86aa940ffb rgw: log-to-stderr is now a binary flag 2011-10-27 11:32:12 -07:00
Samuel Just
8d0a7c5977 testrados: rename testsnaps to testrados and make snap testing optional
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-10-24 14:25:22 -07:00
Josh Durgin
a1249d07ca workunit: set PYTHONPATH so we can test python bindings 2011-10-24 13:52:58 -07:00
Sage Weil
b8beff3dd5 ceph_manager: count active+clean+<somjething else> as active+clean
In my case, one pg was active+clean+scrubbing.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-10-21 10:54:05 -07:00
Sage Weil
4ec37b2391 add lost_unfound task
Also some misc useful bits to ceph_manager.
2011-10-17 15:32:22 -07:00
Josh Durgin
bcded7f163 ceph: add whitelist for cluster log errors
Some messages are expected when thrashing osds or creating unfound
objects.

Fixes: #1622
2011-10-17 14:42:08 -07:00
Yehuda Sadeh
493596a7fd radosgw-admin: test swift keys creation/removal 2011-10-12 15:37:33 -07:00
Josh Durgin
3d3eb0efea Remove --keep-locked-on-error, and behave as if it were specified
This will help prevent machines with cephtest dirs still present from
being used. It's easy to unlock machines - the targets yaml fragment
is output during a run.
2011-10-07 14:49:53 -07:00
Samuel Just
4722d468c6 task/watch_notify_stress: watch_notify_stress now thrashes clients
This should exercise the watch notify timeout code.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-10-06 14:34:44 -07:00
Sage Weil
4e61e4835e rgw: keep radosgw in foreground
It defaults to a daemon now.
2011-10-06 12:50:12 -07:00
Josh Durgin
107db6a913 Retry listing machines if the lock server goes down. 2011-10-04 17:21:00 -07:00
Sage Weil
39a1e76065 rgw: use normal logging mechanism
Keep capturing stdout/err, even though it should end up empty.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-10-04 16:09:51 -07:00
Josh Durgin
3d3ba1ebb1 daemon-helper: detect the signal actually sent
I thought I fixed this when I implemented coverage collection, but I
guess it got lost in a rebase or something.
2011-10-04 12:17:19 -07:00
Josh Durgin
d305d61b86 ceph_manager: remove unused raw_pg_status method 2011-10-03 17:49:53 -07:00
Josh Durgin
8e031730c1 ceph_manager: run ceph -s as a normal program
This allows failures from it to be detected better.
2011-10-03 17:49:13 -07:00
Josh Durgin
1cad309d65 Add failure_reason to summary for the first failure detected.
For now, this is the exception raised during a task, the error found
in the central log, or coredumps found. More specific errors
(i.e. s3-tests had 3 failures) can be added later as exceptions raised
by tasks.
2011-10-03 17:07:41 -07:00
Josh Durgin
817b950494 radosbench: get coverage and cores 2011-10-03 17:07:41 -07:00
Samuel Just
fe1a271d69 watch_notify_stress.py: add ceph flags option
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-10-03 14:26:08 -07:00
Samuel Just
28d60172f6 ceph.py: add btrfs option
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-10-03 14:26:04 -07:00
Sage Weil
2b601a32d0 radosgw-admin: test additional keys, log list/show/rm
Signed-off-by: Sage Weil <sage@newdream.net>
2011-10-03 09:45:11 -07:00