Commit Graph

284 Commits

Author SHA1 Message Date
Sage Weil
392a6596aa move some old flaky tasks into marginal suite
These were pulled out of regression a while ago.  Put them into the
marginal suite where they will be regularly run and we can evaluate the
severity of the problems they cause.
2012-07-10 19:58:23 -07:00
Sage Weil
98a21cc8f0 move qemu_iozone test to marginal suite 2012-07-06 17:04:44 -07:00
Samuel Just
ed3bd211fe increase thrashosds timeout 2012-07-06 10:02:29 -07:00
Sage Weil
12a1f62364 move other ffsb workloads to marginal suite 2012-07-04 12:47:00 -07:00
Sage Weil
fb9d39d54c move locktest to marginal suite
This fails roughly 1 in 10 times.
2012-07-03 17:39:59 -07:00
Sage Weil
9278e231e6 smoke: add msgr failures 2012-07-02 14:08:24 -07:00
Sage Weil
b9414b6cf7 fewer hosts for mon tests 2012-07-02 12:26:10 -07:00
Sage Weil
96ccb0605d add rbd_xfstests to kernel suite 2012-07-01 14:27:38 -07:00
Josh Durgin
3321700a9e qemu_iozone: use a larger image
The default is not large enough.
2012-06-29 11:02:38 -07:00
Sage Weil
74b1468fe6 kernel suite 2012-06-29 09:12:51 -07:00
Sage Weil
1db84ddd33 include ceph task in librbd collection 2012-06-25 21:21:33 -07:00
Sage Weil
aa89e6ab32 move kclient_workunit_suites_ffsb to marginal suite
until #1947 is fixed
2012-06-25 15:30:27 -07:00
Josh Durgin
94a6ab8ff3 Add some tests inside qemu for the librbd suite 2012-06-21 18:18:08 -07:00
Josh Durgin
a92306a41a Move librbd tests to rbd suite
This lets us generate jobs with different caching settings instead of
hardcoding them.
2012-06-21 18:16:32 -07:00
Sage Weil
845e6c282f move cfuse + dbench task that triggers #1737 to marginal suite 2012-06-20 11:23:20 -07:00
Sage Weil
a4589c6ab6 don't dup ceph task for new fsx jobs 2012-06-17 08:58:59 -07:00
Josh Durgin
0c40b24c15 Run fsx on rbd with thrashing 2012-06-15 11:59:43 -07:00
Josh Durgin
50e01c18c9 Increase number of ops done by fsx against rbd.
Especially in the no-cache case, this should detect more races. The
fiemap problem is detectable on plana after ~5000 fsx ops.
2012-06-15 11:55:35 -07:00
Sage Weil
9aeac5decd add radosgw-admin test to regression suite
We wrote this test ages ago, but forgot to add it!  Fixed up a few things
that have changed since then.
2012-06-14 14:06:34 -07:00
Josh Durgin
5012b73abb Add test for cls_rbd 2012-06-10 22:37:12 -07:00
Josh Durgin
68f14b400a Test old and new rbd formats 2012-06-10 21:45:59 -07:00
Josh Durgin
04ef5dcc12 Update for new workunit task syntax 2012-06-10 21:26:50 -07:00
Sage Weil
8c08482cc3 regression: fix new rados, rbd test yamls
Don't start cluster twice!
2012-06-08 14:35:56 -07:00
Sage Weil
6df344c7ec run rados, rbd api tests under thrashing 2012-06-08 11:55:30 -07:00
Sage Weil
95ecf40e44 add rados_stress_watch to regression 2012-05-31 16:44:30 -07:00
Sage Weil
43ac8e2c8c rbd_fsx in write-through mode 2012-05-08 16:07:10 -07:00
Sage Weil
c5429bf936 use fewer nodes for the simple singleton tasks 2012-04-30 20:11:44 -07:00
Sage Weil
ff0fe37294 add rbd_fsx_[no]cache jobs to regression suite 2012-04-19 13:33:32 -07:00
Sage Weil
7ae1aefab7 gather logs for cfuse dbench workload, hopefully catch #1737 2012-04-18 15:19:49 -07:00
Sage Weil
6bede298ef dump_stuck: whitelist 'wrongly marked me down'
The test marks the osds down.  They may generate this error if they get
that faster than they get the signal via the daemon-wrapper.
2012-04-15 20:39:56 -07:00
Sage Weil
4498825a48 add rbd_xfstests to regression suite 2012-04-13 22:27:24 -07:00
Sage Weil
55535d04bb move tasks:cfuse_workunit_suites_dbench.yaml to stress pending #1737 fix 2012-04-12 22:56:09 -07:00
Sage Weil
ef17c8c9eb add smoke suite
This could probably be collapsed into a bunch of singleton tasks to make
it simpler to track how many actual jobs result, but it was simpler to
make it a subset of regression.  And probably that'll be easier to maintain
moving forward.

Tried to avoid any jobs that took more than 10 minutes (though there are a few
in here).  Kept both valgrind and lockdep jobs, and dropped many of those
from the basic collection (esp. api tests).

We'll see how long this takes on plana and adjust up/down from there,
depending on how long we want to wait for it.
2012-03-24 21:47:15 -07:00
Sage Weil
24910c3b3b add osd-recovery test 2012-03-24 16:07:47 -07:00
Sage Weil
6bf9c957c9 renamed backfill -> osd_backfill 2012-03-24 16:07:38 -07:00
Sage Weil
01924a22d4 disable rbd thrash workload, #2174 2012-03-16 13:28:44 -07:00
Sage Weil
b4572351a9 Revert "disable rbd thrash workload, #2174"
This reverts commit 1bec416c7c.

Fixed with #2174
2012-03-15 10:32:39 -07:00
Sage Weil
1bec416c7c disable rbd thrash workload, #2174 2012-03-14 15:51:51 -07:00
Sage Weil
b90354dbab thrash: put client on separate machine from osds
This allows us to run kernel clients (kclient, rbd) against the thrashing
cluster.
2012-03-13 10:49:33 -07:00
Sage Weil
096427d589 remove dup ceph tasks from new thrash workloads 2012-03-12 15:22:17 -07:00
Sage Weil
2b9e7bc50c clusters/fixed-3.yaml: 2 -> 6 osds
plana nodes have 3 scratch disks... use them!
2012-03-11 21:50:03 -07:00
Sage Weil
51d817fe57 Revert "disable s3tests on valgrind/lockdep until #2103 is fixed"
This reverts commit 9f757ca951.
2012-03-11 21:32:45 -07:00
Sage Weil
af445189c2 add rbd, kclient workloads to regression thrash collection
This will get us some kernel osd_client osd restart coverage.
2012-03-11 21:28:45 -07:00
Sage Weil
71e6e62ebb fix typo, ceph-fyuse -> ceph-fuse 2012-03-11 13:03:41 -07:00
Sage Weil
b84897e56f use dbench workunit, not the autotest one
The autotest one uses an old tarball that doesn't build.  Workunit assumes
the dbench package is installed.
2012-03-10 20:01:57 -08:00
Sage Weil
9f757ca951 disable s3tests on valgrind/lockdep until #2103 is fixed 2012-03-01 22:04:19 -08:00
Josh Durgin
b2bbede826 dump-stuck: set pg stuck threshold to match test 2012-02-29 15:45:25 -08:00
Sage Weil
722af1a4dd no peer as part of lost_unfound 2012-02-27 14:52:35 -08:00
Sage Weil
9afafdf164 move peer to separate test for now 2012-02-26 17:09:41 -08:00
Sage Weil
6295578f16 lost_unfound: do peer after, until wait_for_clean propagates last_epoch_started
The peer task does wait_for_clean, and then lost_unfound immediately marks
something down.  But the PGs become clean before the replica last_epoch_started
is moved forward in time, which means they block waiting for the now-down
OSD, needlessly.

Until we fix this, just do the peer test after.
2012-02-25 21:35:31 -08:00