Josh Durgin
58126b01fd
workunit: pass branch/sha1 to test
...
Some tests download things from the ceph repo. Let them know which
version to use through the CEPH_REF environment variable.
2012-07-13 10:01:50 -07:00
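A test that downloads from the ceph repo could read the variable roughly as follows. This is a minimal sketch, not code from the commit: the helper name `ceph_ref` and the `master` fallback are assumptions made for illustration.

```python
import os


def ceph_ref(environ=None, default="master"):
    """Return the branch or sha1 exported to the test as CEPH_REF.

    `default` is a hypothetical fallback for when the variable is unset
    or empty; the commit does not say what tests should do in that case.
    """
    if environ is None:
        environ = os.environ
    return environ.get("CEPH_REF") or default
```

Passing the environment as an argument keeps the helper easy to exercise without touching the real process environment.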
tamil
1741cb6c90
Added functionality to get mkfs and mount options for file systems
...
from the config file, if present. Otherwise, default options are used.
The default value for inode size is changed to 2k when creating xfs.
Signed-off-by: tamil <tamil.muthamizhan@inktank.com>
2012-07-12 18:02:29 -07:00
tamil
353d9ccfe5
fixed typo
...
Signed-off-by: tamil <tamil.muthamizhan@inktank.com>
2012-07-12 16:36:40 -07:00
Sage Weil
51148b81e6
radosgw-admin: use --bucket instead of old --bucket-id
...
The --bucket-id support was removed.
2012-07-12 08:33:29 -07:00
Sage Weil
9b28948635
nuke: honor 'check-locks: ...' field in targets file
...
If you are nuking a yaml file with check-locks: false, don't check locks.
2012-07-11 14:23:51 -07:00
Sage Weil
3abc412812
internal: archive mon data dirs
...
These can be useful for debugging, and are usually pretty small.
Fixes : #2714
2012-07-11 14:14:46 -07:00
Sage Weil
cff2cfa217
internal: move pulling archive w/ tar to helper
2012-07-11 14:10:00 -07:00
Sage Weil
9ea22133b7
use sudo to kill teuthology proc
2012-07-06 20:15:55 -07:00
Sage Weil
e5fb49914c
run: make -a short for --archive
2012-07-05 13:43:19 -07:00
Sage Weil
132dc0066d
nuke: be more careful about kill; simplify
...
If the archive dir is specified, make sure we are killing the right
process.
Also drop the kill_process helper; it's simple enough to open-code.
2012-07-04 14:47:33 -07:00
Sage Weil
6dbf53e298
nuke: nuke based on archive path
...
Use path/config.yaml for targets, path/pid for pid to kill, and
path/owner for job owner.
2012-07-04 14:47:33 -07:00
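The archive layout described in 6dbf53e298 (config.yaml, pid and owner under one path) can be sketched as below. The function name `nuke_inputs` and the returned dictionary are hypothetical; only the three file names come from the commit message.

```python
import os


def nuke_inputs(archive_path):
    """Collect what a nuke needs from a run's archive directory.

    Missing files yield None, so a partially written archive is tolerated.
    """
    def read_text(name):
        try:
            with open(os.path.join(archive_path, name)) as f:
                return f.read().strip()
        except IOError:
            return None

    pid = read_text("pid")
    return {
        # Path to the targets file; parsing the YAML is left to the caller.
        "targets_file": os.path.join(archive_path, "config.yaml"),
        "pid": int(pid) if pid else None,
        "owner": read_text("owner"),
    }
```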
Sage Weil
45fcca1fea
valgrind: add strptime suppressions
...
Precise's strptime triggers valgrind false positives.
Use ship_utilities to push the valgrind.supp file over, which is a bit
slippy.
2012-07-04 14:29:55 -07:00
tamil
e07b711325
Added a debug message
...
The debug message is to print the string that should be JSON.
This is to track a nightly run failure.
Signed-off-by: tamil <tamil.muthamizhan@inktank.com>
2012-07-03 16:04:12 -07:00
tamil
f3c2451797
nuke - optionally kill the process hung
...
Added a function, kill_process, to kill a process that hangs in the nightly runs.
It takes in pid as an optional argument.
Signed-off-by: tamil <tamil.muthamizhan@inktank.com>
2012-07-03 12:23:36 -07:00
Sage Weil
38aa344def
ceph: fix valgrind error check
...
grep all the logs, not the dir... doh!
2012-07-02 08:44:38 -07:00
Mark Nelson
2e5853f485
Now using daemon-helper
...
Signed-off-by: Mark Nelson <nhm@clusterfaq.org>
2012-06-29 14:36:30 -05:00
Mark Nelson
8c453cce55
cleaned up commented code
...
Signed-off-by: Mark Nelson <nhm@clusterfaq.org>
2012-06-28 11:47:16 -05:00
Mark Nelson
1a43c3443b
Added blktrace task
...
Signed-off-by: Mark Nelson <nhm@clusterfaq.org>
2012-06-27 19:38:12 -05:00
Sage Weil
cc380dee40
ignore DEADLOCK line inside lockdep splat
2012-06-25 15:20:19 -07:00
Josh Durgin
38f6a78c71
Add a task to run a test against rbd inside of qemu.
...
For now this task does not setup networking for the vm,
and simply runs an executable downloaded from a specified url.
It does support adding multiple rbd devices, but making use
of that with e.g. xfstests requires a bit more work.
2012-06-21 18:44:16 -07:00
Dan Mick
03597ca6b9
Check for machine args based on local, not ctx.machines
...
Signed-off-by: Dan Mick <dan.mick@inktank.com>
2012-06-21 14:33:20 -07:00
Sage Weil
7773a93e3e
whitelist current lockdep warnings in syslog
...
These are causing too much noise in the qa runs to leave, and #2617 is
sufficiently non-trivial to do this in the interim. Putting a better
mechanism in place will include removing these coarse whitelist items and
replacing with something that specifically matches the failures we want
to ignore.
2012-06-21 13:20:18 -07:00
Sage Weil
c8e1ec6a91
record owner at start of run
...
So that we can clean up easily even when we don't finish and there is no
summary.yaml.
2012-06-20 11:35:43 -07:00
Josh Durgin
218b69246f
teuthology-ls: tolerate non-existent 'success' key in summary file
2012-06-20 10:13:48 -07:00
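Tolerating a missing 'success' key, as in 218b69246f, usually comes down to a lookup that does not raise. A minimal sketch, assuming the summary has already been loaded into a dict; the function name and the status labels are hypothetical and are not teuthology-ls output.

```python
def describe_result(summary):
    """Map a loaded summary dict to a short status string."""
    # dict.get returns None instead of raising KeyError when the key is absent.
    success = summary.get("success")
    if success is None:
        return "unknown"
    return "pass" if success else "fail"
```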
Sage Weil
286e639782
kernel: enable/disable kdb
...
This hard-codes ttyS1, which is what we use on sepia.
2012-06-19 17:24:01 -07:00
Yehuda Sadeh
ab42b8dd5b
add usage log tests to radosgw-admin tasks
...
tests 'usage show' and 'usage trim'
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-06-19 14:30:00 -07:00
Sage Weil
f7ee34b539
tolerate 250ms clock drift
2012-06-16 20:14:35 -07:00
Sage Weil
3bd387f9e8
radosgw-admin: fix for non-numeric bucket ids
2012-06-14 14:04:21 -07:00
Sage Weil
697c3b94c6
radosgw-admin: test max buckets limit
2012-06-14 14:04:21 -07:00
Sage Weil
474f8da41c
radosgw-admin: remove buckets before user
...
Otherwise user delete will fail.
2012-06-14 14:04:21 -07:00
Sage Weil
83f8f3d1e6
radosgw-admin: fix swift subuser/key tests
...
Need to do 'subuser (add|rm)', not 'key (add|rm)'.
2012-06-14 14:04:21 -07:00
Josh Durgin
5792f13725
workunit: grab 'all' config from the right variable
2012-06-11 12:31:28 -07:00
Josh Durgin
8af8d0e20c
workunit: allow setting environment variables
...
This is useful for e.g. running the same tests against rbd in new and
old formats.
2012-06-10 18:43:50 -07:00
Dan Mick
4fa665c19d
--summary: add total counts, also note free machines
...
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2012-06-07 13:50:51 -07:00
Dan Mick
44374bc4fb
new variable lock hid lock() function
2012-06-06 20:29:28 -07:00
Dan Mick
9313cdea97
teuthology-lock: add --summary and --brief options
...
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2012-06-06 18:54:33 -07:00
Sage Weil
9ec2843355
pull s3-tests.git using git, not http
2012-06-06 16:00:55 -07:00
Sage Weil
7523ff3e58
ceph: simplify 'cluster' mon log handling
...
It's not a special file in the mon_data directory anymore, but instead
something in archive that will get slurped up normally. Make sure we
grep for badness from the proper location.
2012-06-06 13:32:56 -07:00
Dan Mick
120ce3f8a7
Pass up unmodified exceptions from connection.connect()
...
This allows useful errors to be reported from things like
mismatched hostkeys, etc.
2012-06-05 18:41:45 -07:00
Dan Mick
fac88a4096
More shortnames fixes:
...
- Allow shortnames in teuthology-updatekeys as well
- Use list comprehensions instead of map()
2012-06-05 18:40:49 -07:00
Eleanor Cawthon
23c729305a
task/: Added object map benchmarking test
...
Signed-off-by: Eleanor Cawthon <eleanor.cawthon@inktank.com>
2012-06-05 15:30:51 -07:00
Dan Mick
044697d178
Allow short names to teuthology-lock (e.g. "plana14")
...
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sam Just <sam.just@inktank.com>
2012-06-04 17:50:07 -07:00
Sage Weil
d3f855ec81
fix up dist var
...
This lets you override the default (now precise) in the ceph config yaml,
e.g.
  - ceph:
      dist: oneiric
      branch: master
2012-05-31 21:39:33 -07:00
Dan Mick
af4fe154d8
Change hardcoded oneiric to precise
...
Signed-off-by: Dan Mick <dan.mick@inktank.com>
2012-05-31 17:09:20 -07:00
Sage Weil
62f8f006b3
rbd.xfstests: default to 250mb instead of 100mb
2012-05-20 20:50:19 -07:00
Sage Weil
3d1fff89c9
rbd_fsx: resize to byte boundaries (not object multiples)
2012-05-05 21:22:30 -07:00
Sage Weil
396d1feff9
ceph.newdream.net -> ceph.com
2012-05-05 09:30:41 -07:00
Sage Weil
715abdea56
ignore syslog cron noise
2012-05-01 22:26:03 -07:00
Sage Weil
dcbb8d4013
osd_recovery: test no* osdmap flags
2012-04-30 11:13:02 -07:00
Josh Durgin
25114bf9a4
nuke: refactor to run in parallel and add unlock option
...
nuke-on-error already did this, but now teuthology-nuke does it
too. Also outputs targets that couldn't be nuked at the end.
2012-04-24 17:52:01 -07:00
Josh Durgin
b32b693ab2
parallel: obey iterator protocol
...
Once it raises StopIteration, it must continue to do so on subsequent calls to next().
2012-04-24 17:48:05 -07:00
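The rule cited in b32b693ab2 can be illustrated with a small iterator that stays exhausted even if more items show up later. This is a sketch of the protocol requirement only; the class `Drain` is hypothetical and is not the code in teuthology's parallel module.

```python
class Drain(object):
    """Iterator over queued results that stays finished once exhausted."""

    def __init__(self, items):
        self._items = list(items)
        self._done = False

    def add(self, item):
        self._items.append(item)

    def __iter__(self):
        return self

    def __next__(self):
        # Once StopIteration has been raised, keep raising it, even if
        # add() was called afterwards.
        if self._done or not self._items:
            self._done = True
            raise StopIteration
        return self._items.pop(0)

    next = __next__  # Python 2 spelling of the same method
```

Without the `_done` flag, a late `add()` would make an exhausted iterator start yielding again, which is the behavior the iterator protocol rules out.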
Sage Weil
a11b69fd4c
nuke: ignore ntpdate errors
...
We keep seeing a race between ntpd startup and our stop + ntpdate + start
sequence. Ignore errors here.
2012-04-23 09:21:02 -07:00
Sage Weil
6cf876733a
filestore_idempotent: url has changed
2012-04-21 13:36:27 -07:00
Sage Weil
e3af087712
rbd_fsx: show progress
...
The updated fsx takes this arg.
Signed-off-by: Sage Weil <sage@newdream.net>
2012-04-19 13:32:01 -07:00
Sage Weil
6a58314d46
fix misc checks that wait for N osds to be up
...
These all cut&pasted broken code, blah!
2012-04-19 12:44:10 -07:00
Sage Weil
407b2e0bc7
whitelist xfs_fsr syslog noise
...
Ignore lines like
2012-04-17T13:44:11-07:00 plana59 fsr[5454]: DEBUG: fsize=450560 blsz_dio=450560 d_min=512 d_max=2147483136 pgsz=4096
2012-04-18 11:21:10 -07:00
Josh Durgin
e875b89f93
Add task for running fsx on an rbd image.
2012-04-17 08:59:51 -07:00
Sage Weil
19e673ccf9
filestore_idempotent: use new sequence-based tester
...
random seed, inject at 50-300.
2012-04-14 14:06:12 -07:00
Sage Weil
6ba4efcd3a
rbd.py: add xfstests functionality
...
Add tasks for running xfstests over a pair of rbd volumes. The main
one is called xfstests, and it sets up rbd volumes of specified size
and runs a set of likely-to-be-successful tests. The other one is
used by the first, and is called run_xfstests. This provides a
generic (device rather than rbd device oriented) interface to
xfstests, and should probably be made standalone and distinct from
rbd at some point.
Using multiple rbd devices required the rbd udev rule manipulation
to ignore errors, since it appears that each device caused a
teardown attempt, which leads to failures the second time around.
There's probably a more robust solution, but this works for now.
Signed-off-by: Alex Elder <elder@dreamhost.com>
2012-04-13 22:28:05 -07:00
Josh Durgin
ddb98f7773
ceph_manager: don't try to start greenlet twice
...
spawn already scheduled it. Trying to start it again hits an assert.
2012-04-10 16:23:58 -07:00
Sage Weil
1ac5554d75
kernel: kludge around mysterious 0-byte .git/HEAD files
...
No idea where these are coming from, but they break nodes with behavior
like
ubuntu@plana08:~$ sudo install -d -m0755 /lib/firmware/updates && cd /lib/firmware/updates && sudo git init
Reinitialized existing Git repository in /lib/firmware/updates/.git/
ubuntu@plana08:/lib/firmware/updates$ sudo git --git-dir=/lib/firmware/updates/.git config --get remote.origin.url >/dev/null || sudo git --git-dir=/lib/firmware/updates/.git remote add origin git://ceph.newdream.net/git/linux-firmware.git
ubuntu@plana08:/lib/firmware/updates$ cd /lib/firmware/updates && sudo git pull origin master
fatal: Not a git repository (or any of the parent directories): .git
where the .git directory looks like
total 32
drwxr-xr-x 7 root root 4096 2012-04-10 12:52 .
drwxr-xr-x 3 root root 4096 2012-04-06 13:54 ..
drwxr-xr-x 2 root root 4096 2012-04-06 13:54 branches
-rwxr--r-- 1 root root 236 2012-04-10 11:33 config
-rw-r--r-- 1 root root 0 2012-04-10 12:52 config.lock
-rw-r--r-- 1 root root 0 2012-04-06 13:54 description
-rw-r--r-- 1 root root 0 2012-04-06 13:54 FETCH_HEAD
-rw-r--r-- 1 root root 0 2012-04-06 13:54 HEAD
drwxr-xr-x 2 root root 4096 2012-04-06 13:54 hooks
drwxr-xr-x 2 root root 4096 2012-04-06 13:54 info
drwxr-xr-x 4 root root 4096 2012-04-06 13:54 objects
drwxr-xr-x 4 root root 4096 2012-04-06 13:54 refs
Hopefully someone can figure out what is causing this and revert this
later.
2012-04-10 13:41:16 -07:00
Sage Weil
0d5918f8e4
kernel: reset to remote firmware branch; don't pull
...
Pull might merge if upstream rebases. Just make our branch match the
remote one.
2012-04-10 09:17:24 -07:00
Sage Weil
9b755fd665
kernel: change git incantation for firmware pull
...
The 'git pull <uri>' seemed to consistently fail on some nodes. Can't be
sure this was really the problem with them all down now, but this is more
common, and works.
2012-04-10 09:12:01 -07:00
Sage Weil
22b1f17f78
ls: another newline
2012-04-10 08:59:47 -07:00
Sage Weil
7757fbb9bd
ls: remove stray newline
2012-04-10 08:57:19 -07:00
Dan Mick
9906d5ed08
Change to local mirror of linux-firmware repo to try to stop failures
2012-04-09 16:58:59 -07:00
Mark Nelson
3d7f1db731
Kernel: Pull linux-firmware from git
...
Signed-off-by: Mark Nelson <nhm@clusterfaq.org>
2012-04-05 08:49:19 -07:00
Mark Nelson
1836d4672f
Added assertion to check that targets > roles
...
Signed-off-by: Mark Nelson <mark.nelson@dreamhost.com>
2012-04-03 15:56:51 -07:00
Sage Weil
952940272b
nuke: don't run umount when no xargs args
...
Gets rid of this noise:
INFO:teuthology.nuke:Unmount any osd data directories...
INFO:teuthology.orchestra.run.err:Usage: umount -h | -V
INFO:teuthology.orchestra.run.err: umount -a [-d] [-f] [-r] [-n] [-v] [-t vfstypes] [-O opts]
INFO:teuthology.orchestra.run.err: umount [-d] [-f] [-r] [-n] [-v] special | node...
INFO:teuthology.orchestra.run.err:Usage: umount -h | -V
INFO:teuthology.orchestra.run.err: umount -a [-d] [-f] [-r] [-n] [-v] [-t vfstypes] [-O opts]
INFO:teuthology.orchestra.run.err: umount [-d] [-f] [-r] [-n] [-v] special | node...
...
2012-04-03 15:56:36 -07:00
Sage Weil
9a69c3f319
ceph.conf: enable 'osd recover clone overlap'
...
to test the recovery cloning in qa. this was redone, but forgot to enable
it in qa.
2012-03-30 16:15:34 -07:00
Samuel Just
b4aa098f47
make Thrasher not inherit from Greenlet
2012-03-29 18:08:19 -07:00
Samuel Just
394d8b1ebd
Add test for object source marked down
2012-03-29 18:08:19 -07:00
Samuel Just
749826c29b
allow use of a separate journal block device
2012-03-27 17:18:44 -07:00
Josh Durgin
e30b7710f5
rbd: fix typo in default config
...
pyflakes would have caught this if 'all' weren't a built-in function
2012-03-26 11:57:07 -07:00
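The remark in e30b7710f5 about pyflakes can be reproduced with a small example. A mistyped variable name is normally reported as undefined, but the bare name `all` resolves to the built-in function, so nothing looks undefined and the mistake only shows up at run time. The functions and the `image_size` key below are hypothetical, not the rbd task's code.

```python
def buggy_image_size(config):
    # Meant to read config["all"]; `all` here is the built-in function,
    # which is a defined name as far as static name checking is concerned.
    return all.get("image_size")


def fixed_image_size(config):
    # Look the 'all' section up in the config dict instead.
    return config.get("all", {}).get("image_size")
```

Calling `buggy_image_size` raises AttributeError, because the built-in `all` has no `get` attribute.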
Sage Weil
397e7f2f7b
add osd_recovery task to test divergent osd logs
2012-03-24 21:09:19 -07:00
Sage Weil
1c1192a9fb
backfill: use 'rbd' pool instead of 'data'
...
(data has a replay interval, which makes writes take longer to resume
after repeering)
2012-03-24 21:09:19 -07:00
Sage Weil
ca9a5a4ac4
rename backfill -> osd_backfill
2012-03-24 16:05:11 -07:00
Sage Weil
22e808746f
put filestore xattr option in [global]
...
...for test_filestore_idempotent's benefit
2012-03-24 15:36:08 -07:00
Josh Durgin
6f0f250b26
suite: add missing print statement
2012-03-21 12:00:55 -07:00
Josh Durgin
8a9a567067
suite: fix print statement when summary doesn't exist
2012-03-21 11:58:17 -07:00
Samuel Just
91c08f6eee
Add watch op to rados.py
...
Signed-off-by: Samuel Just <sam.just@dreamhost.com>
2012-03-20 19:00:12 -07:00
Josh Durgin
815fc3e2f6
suite: failed runs might not have durations
...
This was one cause of emails not being sent - stale /tmp/cephtest dirs
fail without recording a duration.
2012-03-20 07:50:08 -07:00
Josh Durgin
a65d4136e5
suite, coverage: use absolute dirs for isdir checks
...
This fixes the results to wait for all jobs to complete again.
2012-03-19 14:16:14 -07:00
Josh Durgin
bdb72c282f
filestore_idempotent: get coverage and coredumps
2012-03-19 11:57:02 -07:00
Josh Durgin
6c8db1a807
suite: more results logging
2012-03-19 11:31:33 -07:00
Sage Weil
7173a8afb6
ceph.conf: no comment
2012-03-18 11:56:18 -07:00
Sage Weil
7de798f6fa
ceph.conf: set 'filestore xattr use omap = true'
2012-03-18 11:06:05 -07:00
Sage Weil
7d2e1056fd
fix teuthology-ls isdir check
2012-03-18 10:50:17 -07:00
Sage Weil
94f0ba1efe
run valgrind with cwd set to /tmp/cephtest/archive/coredump
...
This lets us capture the vgcore.* files, which always go to valgrind's
cwd.
Fixes : #1953
2012-03-18 10:48:51 -07:00
Josh Durgin
07b97fe77f
suite: log results and coverage generation
...
Need to figure out where and when results emails are failing.
2012-03-16 11:44:13 -07:00
Josh Durgin
8fbd087d6b
results: make sure email is sent before anything else fails
2012-03-15 17:34:19 -07:00
Mark Nelson
e14d428c98
Merge branch 'master' of github.com:ceph/teuthology
2012-03-14 15:32:23 -05:00
Sage Weil
5c9acbd897
gitbuilder: put flavor last
...
in case we refine the field later
2012-03-13 10:09:18 -07:00
Sage Weil
1a01ccaafb
Pull from new gitbuilder.ceph.com locations.
...
Simplifies the flavor stuff into a tuple of
<package,type,flavor,dist,arch>
where package is ceph, kernel, etc.
type is tarball, deb
flavor is basic, gcov, notcmalloc
arch is x86_64, i686 (uname -m)
dist is oneiric, etc. (lsb_release -s -c)
2012-03-13 10:02:26 -07:00
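The five-field tuple from 1a01ccaafb maps naturally onto a named tuple. A minimal sketch, assuming the values listed in the commit message for type and flavor are the accepted ones; the names `Build`, `ALLOWED` and `check_build` are hypothetical, and no URL layout is implied.

```python
from collections import namedtuple

# Field order follows the tuple as written in the commit message.
Build = namedtuple("Build", ["package", "type", "flavor", "dist", "arch"])

# Values named in the commit message; the real lists may be longer.
ALLOWED = {
    "type": ("tarball", "deb"),
    "flavor": ("basic", "gcov", "notcmalloc"),
}


def check_build(build):
    """Raise ValueError if a field holds a value the message does not list."""
    for field, allowed in ALLOWED.items():
        value = getattr(build, field)
        if value not in allowed:
            raise ValueError("unexpected %s: %r" % (field, value))
    return build
```

For example, `check_build(Build("ceph", "tarball", "gcov", "oneiric", "x86_64"))` returns the tuple unchanged.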
Mark Nelson
3833ada8b9
Made the example better with multiple roles.
2012-03-12 15:13:36 -05:00
Mark Nelson
0a61ffad4c
Added some example yaml files and an example parallel execution task.
2012-03-12 14:33:10 -05:00
Sage Weil
008cf7fd95
autotest: pull from github.com/ceph/autotest
2012-03-10 19:15:21 -08:00
Sage Weil
2124129e70
workunit: include python2.7 path too
2012-03-10 15:34:19 -08:00
Samuel Just
ddc1ab0c03
rados.py: include setattr and rmattr
...
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2012-03-08 16:14:44 -08:00
Mark Nelson
31762c0003
lock: Improved logging when there aren't enough nodes available to lock-many.
2012-03-07 12:55:54 -08:00