Commit Graph

74 Commits

Author SHA1 Message Date
Sandon Van Ness
5f9a1d8a0f Worker processes by machine type instead of teuthology branch.
teuthology-suite and schedule will now take --worker instead of
--branch. The branch is set by setting teuthology_branch in the
yaml used to schedule the job.

The teuthology branches are assumed to be in ~/teuthology-$branch
of whatever user is running the workers.
2013-07-18 12:04:08 -07:00
Warren Usui
8129bffb17 Implement full reinstallation of a VM system.
Downburst create is used to reinstall a VM when it is locked.
Downburst destroy is used to remove a VM when it is unlocked.
Host keys are regenerated on each vm instantiation, so the keys
need to be checked prior to use.
If needed, qa-ceph-chef is run on newly installed systems to ensure that
they are fully functional.

Signed-off-by: Warren Usui <warren.usui@inktank.com>
2013-04-03 12:29:47 -07:00
Sage Weil
9f46f47b6b run: clean up machine_type thing
Signed-off-by: Sage Weil <sage@inktank.com>
2013-03-29 12:19:05 -07:00
Sage Weil
b815268b58 run: machine-type: foo, not machine_type: foo
Signed-off-by: Sage Weil <sage@inktank.com>
2013-03-28 15:25:10 -07:00
Sage Weil
4e68c2033c verify /var/lib/ceph not present on start
Verify there is no /var/lib/ceph, just like we do with the cephtest
directory.  We will need to change this (or make it optional) when we
allow runs against an existing cluster, but then a whole bunch of other
things will need to change then as well.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-03-23 20:58:46 -07:00
Warren Usui
09979541ca Implement email task.
Email.py was added so that the emailto attribute could be passed,
and to prevent 'module object has no attribute: email' errors from
happening.  Run.py actually performs the email operation and calls
suite.email_results to do the actual send mail operation.  The
information passed right now is the summary and config information.

Signed-off-by: Warren Usui <warren.usui@inktank.com>
2013-02-27 12:28:59 -08:00
Warren Usui
c5b55f9b76 Fix pass/fail display on exit.
Signed-off-by: Warren Usui <warren.usui@inktank.com>
2013-02-27 12:28:59 -08:00
Warren Usui
3e8d11b409 Add timer.py and display summary info in run.py.
Signed-off-by: Warren Usui <warren.usui@inktank.com>
2013-02-25 16:02:08 -08:00
Sage Weil
9996bdbe6e run: print pass/FAIL as final line
Makes it easy to tell at a glance if your last test passed or not.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-02-20 15:27:23 -08:00
Sandon Van Ness
030bc7c23d Added support for multiple types of machines.
Added the ability to support multiple types of machines with
--machine-type added to teuthology-lock when used with --lock-many
or --machine-type with teuthology --lock (automated tests). It
defaults to 'plana' and the 'vps' type is currently unused but
should be in the future.

Signed-off-by: Sandon Van Ness <sandon@van-ness.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-07 13:26:37 -08:00
Sam Lang
933cc3c382 run.py: Fix argument parsing for --name
With the addition of the --name argument to the
teuthology program (run.py), jobs were failing
because --name was being treated as a non-arg
option, even though the name was being supplied
by the workers.  Fix that and give it a metavar.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-01 07:46:04 -06:00
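A minimal sketch of the kind of parsing problem this commit describes, assuming the option was declared as a value-less flag. It uses Python's standard argparse; the `config` positional, the help text, and the sample values are hypothetical and are not taken from run.py.

```python
import argparse

# Hypothetical "before": --name declared as a flag that takes no value.
broken = argparse.ArgumentParser(prog="teuthology")
broken.add_argument("--name", action="store_true")
broken.add_argument("config", nargs="*")

# Hypothetical "after": --name takes one value and has a metavar for --help.
fixed = argparse.ArgumentParser(prog="teuthology")
fixed.add_argument("--name", metavar="NAME", default=None,
                   help="name for this run")
fixed.add_argument("config", nargs="*")

argv = ["--name", "nightly-run", "job.yaml"]
broken_args = broken.parse_args(argv)
fixed_args = fixed.parse_args(argv)

# With the flag form, the supplied name is swallowed as a positional argument.
print(broken_args.name, broken_args.config)
# With the value form, the name lands where the worker intended it.
print(fixed_args.name, fixed_args.config)
```

With the flag declaration, the name supplied by a worker would be misread as a config file, which is one plausible way jobs could fail.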
Sam Lang
87b9849628 add --name option to teuthology
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-01-31 08:23:34 -06:00
Sage Weil
d07d7289a4 run: save original config, too 2012-11-25 08:37:06 -08:00
Josh Durgin
6c9d45e399 schedule: fix var name 2012-11-02 11:33:46 -07:00
Josh Durgin
5f4414e072 schedule: add option to display jobs in the queue
beanstalkd doesn't let you list jobs in the queue, but you can
inspect specific job ids.
2012-11-02 11:08:59 -07:00
Josh Durgin
a09153b688 Allow scheduled jobs to use different teuthology branches
teuthology-[schedule|suite] get a parameter to specify the branch,
to put the job in a branch-specific queue. Workers running that
branch of teuthology can pull jobs from that queue.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2012-09-21 17:16:56 -07:00
Tommi Virtanen
99ac6b0b3e Disable asynchronous DNS lookups.
Especially on older hosts, we keep triggering errors::

  ServerNotFoundError: Unable to find the server at
  teuthology.front.sepia.ceph.com: [Errno 3] name does not exist

That comes from libevent's evdns via gevent.dns and httplib2. The rate
of these errors is low enough that they seem to be perhaps timeouts,
or more arbitrary. Busy looping on DNS resolution calls has never
triggered them, so far.

With ``monkey.patch_all(dns=False)``, the teuthology process will
block as a whole whenever doing DNS resolution. This will hopefully be
rare enough that it won't matter.

The only real "fix" seems to be upgrading libraries and hoping for the
best; this commit can be reverted after that is done.
2012-08-13 16:18:33 -07:00
Sage Weil
042edcbe1e schedule/suite: schedule job, suite N times 2012-07-14 13:51:51 -07:00
Sage Weil
e5fb49914c run: make -a short for --archive 2012-07-05 13:43:19 -07:00
Sage Weil
c8e1ec6a91 record owner at start of run
So that we can clean up easily even when we don't finish and there is no
summary.yaml.
2012-06-20 11:35:43 -07:00
Josh Durgin
25114bf9a4 nuke: refactor to run in parallel and add unlock option
nuke-on-error already did this, but now teuthology-nuke does it
too. Also outputs targets that couldn't be nuked at the end.
2012-04-24 17:52:01 -07:00
Mark Nelson
1836d4672f Added assertion to check that targets > roles
Signed-off-by: Mark Nelson <mark.nelson@dreamhost.com>
2012-04-03 15:56:51 -07:00
Josh Durgin
1493674735 Use non-zero exit status if any tests failed
Fixes: #1989
2012-03-05 13:34:33 -08:00
Josh Durgin
2a1c74c5f5 Move duration calculation to an internal task
This excludes all generic start up costs, like waiting for locks,
rebooting into a new kernel, etc.
2012-02-21 15:12:26 -08:00
Sage Weil
8fb115fe2c include run duration in summary.yaml 2012-01-16 12:39:20 -08:00
Sage Weil
b354ce4e91 run: put pid in archive dir
This will make it easy for teuthology-ls to show you the running process's
pid (if it's still running).  Or for other utilities to kill + clean up
a hung teuthology run.
2012-01-08 14:39:30 -08:00
Josh Durgin
561f06cf94 suite: make email-on-success the default behavior
This way you can tell when a run is complete, instead of wondering if
it's stuck in the queue.
2012-01-05 17:27:31 -08:00
Josh Durgin
cdd5c456a0 nuke-on-error: only unlock if this run locked the machines 2012-01-03 13:02:31 -08:00
Josh Durgin
508f4f8359 Save summary after nuking machines.
This way you can tell when tests are entirely finished running.
2011-11-18 13:53:51 -08:00
Josh Durgin
a763297685 misc: move deep_merge out of the MergeConfig class - it's generic 2011-11-17 13:06:36 -08:00
Josh Durgin
c6988a07f4 Save config after locking nodes, so targets are included. 2011-11-17 11:57:07 -08:00
Josh Durgin
5d32bcae50 Add nuke-on-error option.
This lets automated jobs nuke and unlock machines after failed
tests. Each machine is nuked individually, so one down machine won't
keep others from being nuked and unlocked.
2011-11-08 16:09:21 -08:00
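The per-machine isolation described in this commit can be sketched as follows. This is an illustration only: it uses the standard library's thread pool rather than whatever concurrency mechanism teuthology itself uses, and `nuke_all`, `fake_nuke`, and the host names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def nuke_all(targets, nuke_one):
    """Run nuke_one on every target in parallel; return the targets that failed."""
    def attempt(target):
        # Catch errors per target so one unreachable machine
        # does not prevent the others from being cleaned up.
        try:
            nuke_one(target)
            return None
        except Exception:
            return target

    with ThreadPoolExecutor(max_workers=max(1, len(targets))) as pool:
        results = list(pool.map(attempt, targets))
    return [t for t in results if t is not None]


def fake_nuke(target):
    # Stand-in for the real cleanup; pretends one host is down.
    if target == "down-host":
        raise RuntimeError("unreachable")


failed = nuke_all(["host-a", "down-host", "host-b"], fake_nuke)
print(failed)
```

Collecting the failures instead of raising on the first one is what lets the remaining machines still be nuked and unlocked.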
Josh Durgin
3d3eb0efea Remove --keep-locked-on-error, and behave as if it were specified
This will help prevent machines with cephtest dirs still present from
being used. It's easy to unlock machines - the targets yaml fragment
is output during a run.
2011-10-07 14:49:53 -07:00
Josh Durgin
c3c262656d schedule: put results timeout in the job
The default was always being used instead.
2011-09-21 11:05:33 -07:00
Tommi Virtanen
a2372fce12 Move orchestra to teuthology.orchestra so there's just one top-level package. 2011-09-13 14:53:02 -07:00
Sage Weil
d4a876f3e3 teuthology: do a deep merge of input yaml fragments
Concatenate lists, and recursively combine dicts.

If you specify inputs like

 foo:
 - a
 - b

and

 foo:
 - c

you should get

 foo:
 - a
 - b
 - c

Dicts should also be merged (last one wins), and the merging is deep. E.g.

 foo:
   a:
     b:
       c: 1

and

 foo:
   a:
     b:
       c: 2

is

 foo:
   a:
     b:
       c: 2

Fixes: #1497
2011-09-03 15:07:21 -07:00
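The merge rules in this commit message (concatenate lists, recursively combine dicts, last value wins otherwise) can be sketched as below. This is a minimal illustration written from the description above, not the code from the commit; the function body and the sample inputs are assumptions.

```python
def deep_merge(a, b):
    """Return a merged with b without modifying either input.

    Lists are concatenated, dicts are merged key by key (recursively),
    and for any other combination the later value (b) wins.
    """
    if isinstance(a, list) and isinstance(b, list):
        return a + b
    if isinstance(a, dict) and isinstance(b, dict):
        merged = dict(a)
        for key, value in b.items():
            if key in merged:
                merged[key] = deep_merge(merged[key], value)
            else:
                merged[key] = value
        return merged
    return b


# The two examples from the commit message, combined into one input pair.
first = {"foo": ["a", "b"], "bar": {"a": {"b": {"c": 1}}}}
second = {"foo": ["c"], "bar": {"a": {"b": {"c": 2}}}}
result = deep_merge(first, second)
print(result)
```

Returning new containers rather than updating `a` in place keeps each input fragment reusable, though an in-place merge would be an equally valid design.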
Josh Durgin
d340ebac4e schedule: add a way to delete jobs from the queue 2011-08-31 17:43:14 -07:00
Josh Durgin
7be9eaa030 suite: add option to send an email if the entire suite passed 2011-08-29 12:42:45 -07:00
Josh Durgin
4f4227a44d Generate coverage at the end of a suite run,
and optionally email failures and ongoing jobs.
2011-08-29 10:23:12 -07:00
Greg Farnum
af0d7c5e44 teuthology-nuke: move it into its own file.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-10 15:38:57 -07:00
Greg Farnum
453a0f99d4 teuthology-nuke: identify and reboot machines with kernel mounts
This includes untested code for just force-unmounting them
when that works again, but for now it does a full reboot-and-
reconnect cycle.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-10 14:37:46 -07:00
Greg Farnum
9566008468 teuthology-nuke: use a more robust cfuse mount finder
This way it can remove cfuse mounts in any location on
the system.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-10 14:37:41 -07:00
Greg Farnum
257d63137f teuthology-nuke: split out different pieces into different loops
This will let us behave more intelligently on things like
nuking kernel mounts.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-10 14:37:36 -07:00
Josh Durgin
5897d7b95d teuthology-nuke: run in parallel, and print each node being nuked 2011-08-03 14:52:55 -07:00
Josh Durgin
30a8dac323 Set success at the beginning of a run.
This way internal tasks like locking can tell whether the run
succeeded, and unlock nodes if it did.
2011-08-03 14:03:13 -07:00
Josh Durgin
e8676ce0eb teuthology-nuke: reset rsyslog config 2011-08-03 11:21:32 -07:00
Josh Durgin
02d0efad97 schedule: make default owner different from that of a normal run
This way the machines locked by scheduled jobs aren't confused
with those locked by manual runs, so they're harder to accidentally
unlock.
2011-07-19 17:25:57 -07:00
Josh Durgin
176b304c3d fusermount runs on a single mount point. 2011-07-13 14:02:46 -07:00
Josh Durgin
5fadb1c11c Whitespace and style cleanup. 2011-07-11 18:07:37 -07:00
Josh Durgin
28f19a4104 Add an option to keep machines locked if a test fails. 2011-07-11 16:23:05 -07:00