Tommi Virtanen
99ac6b0b3e
Disable asynchronous DNS lookups.
...
Especially on older hosts, we keep triggering errors::
ServerNotFoundError: Unable to find the server at
teuthology.front.sepia.ceph.com: [Errno 3] name does not exist
That comes from libevent's evdns via gevent.dns and httplib2. The rate
of these errors is low enough that they seem to be perhaps timeouts,
or more arbitrary. Busy looping on DNS resolution calls has never
triggered them, so far.
With ``monkey.patch_all(dns=False)``, the teuthology process will
block as a whole whenever doing DNS resolution. This will hopefully be
rare enough that it won't matter.
The only real "fix" seems to be upgrading libraries and hoping for the
best; this commit can be reverted after that is done.
2012-08-13 16:18:33 -07:00
Sage Weil
042edcbe1e
schedule/suite: schedule job, suite N times
2012-07-14 13:51:51 -07:00
Sage Weil
e5fb49914c
run: make -a short for --archive
2012-07-05 13:43:19 -07:00
Sage Weil
c8e1ec6a91
record owner at start of run
...
So that we can clean up easily even when we don't finish and there is no
summary.yaml.
2012-06-20 11:35:43 -07:00
Josh Durgin
25114bf9a4
nuke: refactor to run in parallel and add unlock option
...
nuke-on-error already did this, but now teuthology-nuke does it
too. Also outputs targets that couldn't be nuked at the end.
2012-04-24 17:52:01 -07:00
Mark Nelson
1836d4672f
Added assertion to check that targets > roles
...
Signed-off-by: Mark Nelson <mark.nelson@dreamhost.com>
2012-04-03 15:56:51 -07:00
Josh Durgin
1493674735
Use non-zero exit status if any tests failed
...
Fixes: #1989
2012-03-05 13:34:33 -08:00
Josh Durgin
2a1c74c5f5
Move duration calculation to an internal task
...
This excludes all generic start up costs, like waiting for locks,
rebooting into a new kernel, etc.
2012-02-21 15:12:26 -08:00
Sage Weil
8fb115fe2c
include run duration in summary.yaml
2012-01-16 12:39:20 -08:00
Sage Weil
b354ce4e91
run: put pid in archive dir
...
This will make it easy for teuthology-ls to show you the running process's
pid (if it's still running). Or for other utilities to kill + clean up
a hung teuthology run.
2012-01-08 14:39:30 -08:00
Josh Durgin
561f06cf94
suite: make email-on-success the default behavior
...
This way you can tell when a run is complete, instead of wondering if
it's stuck in the queue.
2012-01-05 17:27:31 -08:00
Josh Durgin
cdd5c456a0
nuke-on-error: only unlock if this run locked the machines
2012-01-03 13:02:31 -08:00
Josh Durgin
508f4f8359
Save summary after nuking machines.
...
This way you can tell when tests are entirely finished running.
2011-11-18 13:53:51 -08:00
Josh Durgin
a763297685
misc: move deep_merge out of the MergeConfig class - it's generic
2011-11-17 13:06:36 -08:00
Josh Durgin
c6988a07f4
Save config after locking nodes, so targets are included.
2011-11-17 11:57:07 -08:00
Josh Durgin
5d32bcae50
Add nuke-on-error option.
...
This lets automated jobs nuke and unlock machines after failed
tests. Each machine is nuked individually, so one down machine won't
keep others from being nuked and unlocked.
2011-11-08 16:09:21 -08:00
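The per-machine isolation described in the nuke-on-error commit above can be illustrated with a minimal sketch. This is not teuthology's actual code: `nuke_all` and the `nuke_one` callback are hypothetical names, and standard-library threads stand in for whatever concurrency mechanism the project really uses. The only point shown is that a failure on one target is caught on its own, so the remaining targets are still processed and the failures can be reported at the end.

```python
from concurrent.futures import ThreadPoolExecutor


def nuke_all(targets, nuke_one):
    """Run nuke_one(target) for every target in parallel.

    Hypothetical helper: returns the list of targets whose cleanup raised,
    so that one unreachable machine does not stop the others.
    """
    def attempt(target):
        try:
            nuke_one(target)
            return None          # success: nothing to report
        except Exception:
            return target        # remember the failure, keep going

    # One worker per target; max(1, ...) keeps an empty list valid.
    with ThreadPoolExecutor(max_workers=max(1, len(targets))) as pool:
        results = list(pool.map(attempt, targets))

    return [t for t in results if t is not None]
```

A caller would pass its own cleanup function as `nuke_one` and print the returned list once all workers have finished.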
Josh Durgin
3d3eb0efea
Remove --keep-locked-on-error, and behave as if it were specified
...
This will help prevent machines with cephtest dirs still present from
being used. It's easy to unlock machines - the targets yaml fragment
is output during a run.
2011-10-07 14:49:53 -07:00
Josh Durgin
c3c262656d
schedule: put results timeout in the job
...
The default was always being used instead.
2011-09-21 11:05:33 -07:00
Tommi Virtanen
a2372fce12
Move orchestra to teuthology.orchestra so there's just one top-level package.
2011-09-13 14:53:02 -07:00
Sage Weil
d4a876f3e3
teuthology: do a deep merge of input yaml fragments
...
Concatenate lists, and recursively combine dicts.
If you specify inputs like
  foo:
  - a
  - b
and
  foo:
  - c
you should get
  foo:
  - a
  - b
  - c
Dicts should also be merged (last one wins), and the merging is deep. E.g.
  foo:
    a:
      b:
        c: 1
and
  foo:
    a:
      b:
        c: 2
is
  foo:
    a:
      b:
        c: 2
Fixes: #1497
2011-09-03 15:07:21 -07:00
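The merge rules stated in the deep-merge commit above (concatenate lists, recursively combine dicts, last value wins otherwise) can be sketched as follows. This is an illustrative reimplementation under those stated rules only, not the code from the commit; the function body and its handling of mismatched types (the later value simply replaces the earlier one) are assumptions.

```python
def deep_merge(a, b):
    """Merge b into a following the rules in the commit message (sketch).

    - two lists are concatenated
    - two dicts are combined key by key, recursively
    - anything else: the later value (b) wins
    """
    if isinstance(a, list) and isinstance(b, list):
        return a + b
    if isinstance(a, dict) and isinstance(b, dict):
        merged = dict(a)  # shallow copy so the input is not mutated
        for key, value in b.items():
            if key in merged:
                merged[key] = deep_merge(merged[key], value)
            else:
                merged[key] = value
        return merged
    return b


# The two examples from the commit message, written as Python literals.
lists = deep_merge({"foo": ["a", "b"]}, {"foo": ["c"]})
dicts = deep_merge({"foo": {"a": {"b": {"c": 1}}}},
                   {"foo": {"a": {"b": {"c": 2}}}})
```

Here `lists` holds the concatenated `foo` list and `dicts` keeps the value from the second fragment, matching the expected results given in the message.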
Josh Durgin
d340ebac4e
schedule: add a way to delete jobs from the queue
2011-08-31 17:43:14 -07:00
Josh Durgin
7be9eaa030
suite: add option to send an email if the entire suite passed
2011-08-29 12:42:45 -07:00
Josh Durgin
4f4227a44d
Generate coverage at the end of a suite run,
...
and optionally email failures and ongoing jobs.
2011-08-29 10:23:12 -07:00
Greg Farnum
af0d7c5e44
teuthology-nuke: move it into its own file.
...
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-10 15:38:57 -07:00
Greg Farnum
453a0f99d4
teuthology-nuke: identify and reboot machines with kernel mounts
...
This includes untested code for just force-unmounting them
when that works again, but for now it does a full reboot-and-
reconnect cycle.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-10 14:37:46 -07:00
Greg Farnum
9566008468
teuthology-nuke: use a more robust cfuse mount finder
...
This way it can remove cfuse mounts in any location on
the system.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-10 14:37:41 -07:00
Greg Farnum
257d63137f
teuthology-nuke: split out different pieces into different loops
...
This will let us behave more intelligently on things like
nuking kernel mounts.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-10 14:37:36 -07:00
Josh Durgin
5897d7b95d
teuthology-nuke: run in parallel, and print each node being nuked
2011-08-03 14:52:55 -07:00
Josh Durgin
30a8dac323
Set success at the beginning of a run.
...
This way internal tasks like locking can tell whether the run
succeeded, and unlock nodes if it did.
2011-08-03 14:03:13 -07:00
Josh Durgin
e8676ce0eb
teuthology-nuke: reset rsyslog config
2011-08-03 11:21:32 -07:00
Josh Durgin
02d0efad97
schedule: make default owner different from that of a normal run
...
This way the machines locked by scheduled jobs aren't confused
with those locked by manual runs, so they're harder to accidentally
unlock.
2011-07-19 17:25:57 -07:00
Josh Durgin
176b304c3d
fusermount runs on a single mount point.
2011-07-13 14:02:46 -07:00
Josh Durgin
5fadb1c11c
Whitespace and style cleanup.
2011-07-11 18:07:37 -07:00
Josh Durgin
28f19a4104
Add an option to keep machines locked if a test fails.
2011-07-11 16:23:05 -07:00
Sage Weil
6cf9633a6a
nuke: use default owner
2011-07-11 14:39:04 -07:00
Josh Durgin
85c24bda7f
Add teuthology-schedule and teuthology-worker.
...
schedule puts jobs in a beanstalk queue, worker takes them out and runs them.
2011-07-11 13:49:06 -07:00
Josh Durgin
fd30ed76bf
Add --block option to retry until machines are locked.
...
If there are not enough machines up, fail immediately.
2011-07-07 16:15:18 -07:00
Josh Durgin
a55d2eb53a
Read lock server from ~/teuthology.yaml.
2011-07-07 12:35:11 -07:00
Josh Durgin
9158c83167
Verify that machines are locked before nuking them.
2011-07-07 12:35:11 -07:00
Josh Durgin
9bfca87980
Check that all machines are locked, and add an option to lock machines instead of providing targets.
2011-07-07 12:35:11 -07:00
Josh Durgin
09bee43593
Move username to a utility method.
2011-07-07 12:32:58 -07:00
Sage Weil
f164dd7933
nuke: sudo for the final rm -rf
2011-07-05 16:47:00 -07:00
Sage Weil
2b168b033d
nuke: do not escape fusermount .../mnt.*
2011-07-05 09:01:01 -07:00
Josh Durgin
effee7ffc6
Make kernel a separate entity outside of tasks.
...
It is run before anything other than checking for conflicts.
This way it can't step on the connections used by other tasks,
or clobber test files in /tmp when rebooting.
2011-06-30 16:05:53 -07:00
Sage Weil
b95e61ae29
teuthology-nuke
...
Take in a full config (or just targets: portion) and do a destructive
cleanup.
Still need to clean up kernel mounts.
2011-06-29 12:23:44 -07:00
Sage Weil
2125e8dc1e
include @hostname in owner
2011-06-29 12:09:38 -07:00
Sage Weil
052f43c958
pass owner, optional description through to summary.yaml
...
Owner can be overridden explicitly, otherwise it's the running unix user.
The description is optional and passed straight through.
2011-06-29 12:09:38 -07:00
Tommi Virtanen
e481db1337
Archive syslog messages while the test was in progress.
2011-06-20 14:31:41 -07:00
Tommi Virtanen
57c542b9e8
Archive cores dumped during test, record test as failed if any seen.
2011-06-17 16:00:39 -07:00
Tommi Virtanen
78a3c23418
Move non-ceph logic out of the ceph task: base dir, archive transfer.
2011-06-16 14:36:22 -07:00