Yehuda Sadeh
7d085ad939
readwrite: add readwrite task
...
still not really running, but at least getting configured
2011-12-14 16:12:55 -08:00
Josh Durgin
c9e4504fbd
Ignore lockdep being turned off for now.
...
Some machines are hitting this udev issue:
http://marc.info/?l=linux-kernel&m=132033587908426&w=2 and lockdep is
turned off after the first warning.
2011-12-12 16:29:41 -08:00
Josh Durgin
7b52dd1410
syslog: ignore 'task blocked' warnings
...
These will happen under heavy load (usually on the osd).
2011-12-08 17:17:47 -08:00
Josh Durgin
e69057e4a1
internal: check syslog for errors
...
This should catch lockdep warnings and mark tests with them as failed.
2011-12-07 15:20:33 -08:00
Josh Durgin
95e632475f
workunit: set client id and secretfile env vars
...
These are used by the kernel rbd workunit to know how to map images.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-12-06 16:16:38 -08:00
Tommi Virtanen
e80c32c442
Rename "testrados" and "testswift" tasks to not begin with "test".
...
Anything "test*" looks like a unit test, and shouldn't be used for
actual code.
2011-12-05 10:07:25 -08:00
Sage Weil
4b53288b0c
ceph_manager: %
2011-11-19 20:56:49 -08:00
Yehuda Sadeh
23aae67aff
testswift: fix config
2011-11-17 16:53:57 -08:00
Tommi Virtanen
d8fc151365
Clean up C++isms.
2011-11-17 17:00:44 -08:00
Tommi Virtanen
c545094895
Add a task for easily running chef-solo on all the nodes.
2011-11-17 16:49:47 -08:00
Sage Weil
89f80412c2
ceph_manager: fix logging
2011-11-17 13:46:02 -08:00
Josh Durgin
f85f5dd7e3
ceph: deep merge overrides, so e.g. log whitelists can be overridden
2011-11-17 13:07:03 -08:00
Josh Durgin
c6988a07f4
Save config after locking nodes, so targets are included.
2011-11-17 11:57:07 -08:00
Josh Durgin
4e6cd55c59
filestore_idempotent: remove unused import
2011-11-17 11:18:24 -08:00
Josh Durgin
7d51e3d381
mon_recovery: remove unused code and import
2011-11-17 11:16:08 -08:00
Josh Durgin
f4d527e743
thrashosds: timeout for every clean check, not just the last one
2011-11-17 11:11:33 -08:00
Josh Durgin
9d12b720e8
ceph_manager: add a default timeout of 5 minutes for mon quorum
2011-11-17 11:05:12 -08:00
Josh Durgin
cb9ac0897b
ceph_manager: log mon quorum status so the logs show progress (or lack thereof)
2011-11-17 10:45:19 -08:00
Yehuda Sadeh
f3c569ee23
rgw: add swift task
...
still not completely working (for some reason it skips all the tests)
2011-11-16 16:00:01 -08:00
Sage Weil
c5f070b8a9
filestore_idempotent.py: simple task to test non-idempotent osd ops
...
Write some non-idempotent events to the osd. Simulate a failure. Verify
the result is correct on replay.
This must be preceded by the ceph task just so that we get the binaries
installed. Should clean this up later if/when the installation gets
factored out of ceph.py.
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-10 21:35:11 -08:00
Sage Weil
6618a0275c
mon_recovery: add task to test monitor cluster failure recovery
...
Some simple tests to start with. We still need some sort of mon cluster
thrashing.
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-08 22:17:00 -08:00
Sage Weil
60863f70eb
ceph_manager: manipulate monitors
2011-11-08 22:17:00 -08:00
Sage Weil
6d39cc1146
ceph: keep ceph.conf at ctx.ceph.conf
...
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-08 22:17:00 -08:00
Josh Durgin
006a0dd423
Remove unused imports and variable.
2011-11-08 16:09:21 -08:00
Tommi Virtanen
c764b2475b
Fix leftover orchestra import clause.
...
This seems to be a leftover from a2372fce12,
no idea how it stayed hidden this long.
2011-11-07 13:05:14 -08:00
Josh Durgin
4f3b113832
ceph_manager: log ceph -s output so progress is visible in the logs
2011-11-03 13:27:44 -07:00
Josh Durgin
0b451f9475
Keep each ssh connection alive.
...
With long-running jobs like thrashing, ssh connections were timing
out.
2011-11-03 13:08:49 -07:00
Samuel Just
a2f406ef49
testrados: set CEPH_CLIENT_ID without a ;
...
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-11-02 11:33:37 -07:00
Samuel Just
810cae1a1d
testrados: specify CEPH_CONF directly
...
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-10-31 14:54:24 -07:00
Yehuda Sadeh
10c3508741
rgw: add user suspend/enable test
2011-10-27 12:11:28 -07:00
Yehuda Sadeh
86aa940ffb
rgw: log-to-stderr is now a binary flag
2011-10-27 11:32:12 -07:00
Samuel Just
8d0a7c5977
testrados: rename testsnaps to testrados and make snap testing optional
...
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-10-24 14:25:22 -07:00
Josh Durgin
a1249d07ca
workunit: set PYTHONPATH so we can test python bindings
2011-10-24 13:52:58 -07:00
Sage Weil
b8beff3dd5
ceph_manager: count active+clean+<something else> as active+clean
...
In my case, one pg was active+clean+scrubbing.
Signed-off-by: Sage Weil <sage@newdream.net>
2011-10-21 10:54:05 -07:00
Sage Weil
4ec37b2391
add lost_unfound task
...
Also some misc useful bits to ceph_manager.
2011-10-17 15:32:22 -07:00
Josh Durgin
bcded7f163
ceph: add whitelist for cluster log errors
...
Some messages are expected when thrashing osds or creating unfound
objects.
Fixes: #1622
2011-10-17 14:42:08 -07:00
Yehuda Sadeh
493596a7fd
radosgw-admin: test swift keys creation/removal
2011-10-12 15:37:33 -07:00
Josh Durgin
3d3eb0efea
Remove --keep-locked-on-error, and behave as if it were specified
...
This will help prevent machines with cephtest dirs still present from
being used. It's easy to unlock machines - the targets yaml fragment
is output during a run.
2011-10-07 14:49:53 -07:00
Samuel Just
4722d468c6
task/watch_notify_stress: watch_notify_stress now thrashes clients
...
This should exercise the watch notify timeout code.
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-10-06 14:34:44 -07:00
Sage Weil
4e61e4835e
rgw: keep radosgw in foreground
...
It defaults to a daemon now.
2011-10-06 12:50:12 -07:00
Josh Durgin
107db6a913
Retry listing machines if the lock server goes down.
2011-10-04 17:21:00 -07:00
Sage Weil
39a1e76065
rgw: use normal logging mechanism
...
Keep capturing stdout/err, even though it should end up empty.
Signed-off-by: Sage Weil <sage@newdream.net>
2011-10-04 16:09:51 -07:00
Josh Durgin
3d3ba1ebb1
daemon-helper: detect the signal actually sent
...
I thought I fixed this when I implemented coverage collection, but I
guess it got lost in a rebase or something.
2011-10-04 12:17:19 -07:00
Josh Durgin
d305d61b86
ceph_manager: remove unused raw_pg_status method
2011-10-03 17:49:53 -07:00
Josh Durgin
8e031730c1
ceph_manager: run ceph -s as a normal program
...
This allows failures from it to be detected better.
2011-10-03 17:49:13 -07:00
Josh Durgin
1cad309d65
Add failure_reason to summary for the first failure detected.
...
For now, this is the exception raised during a task, the error found
in the central log, or coredumps found. More specific errors
(e.g. s3-tests had 3 failures) can be added later as exceptions raised
by tasks.
2011-10-03 17:07:41 -07:00
Josh Durgin
817b950494
radosbench: get coverage and cores
2011-10-03 17:07:41 -07:00
Samuel Just
fe1a271d69
watch_notify_stress.py: add ceph flags option
...
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-10-03 14:26:08 -07:00
Samuel Just
28d60172f6
ceph.py: add btrfs option
...
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-10-03 14:26:04 -07:00
Sage Weil
2b601a32d0
radosgw-admin: test additional keys, log list/show/rm
...
Signed-off-by: Sage Weil <sage@newdream.net>
2011-10-03 09:45:11 -07:00