1949 Commits

Author SHA1 Message Date
Zack Cerza
e312048ade Allow passing multiple job_ids
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-19 16:43:11 -06:00
Zack Cerza
220779c8c4 Implement single-job killing
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-19 16:32:41 -06:00
Zack Cerza
eeeb6267f9 For teuthology-kill, s/suite/run/
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-19 16:32:41 -06:00
SandonV
228358ff17 Merge pull request #165 from ceph/wip-7042-fix-wusui
Do not run local handling fix if local parameter is not found.
2013-12-19 14:27:16 -08:00
Warren Usui
37815b76d3 Do not run local handling fix if local parameter is not found.
Fixes: 7042
Signed-off-by: Warren Usui <warren.usui@inktank.com>
2013-12-19 14:20:12 -08:00
Zack Cerza
dbcef31ba0 Merge pull request #156 from ceph/teuthology-doc-hadoop-wusui
Added docstrings.  Cleaned up code (broke up long lines, removed unused
2013-12-19 09:27:14 -08:00
Zack Cerza
006c031182 Merge pull request #164 from ceph/wip-rados
rados: add in more (optional) op types
2013-12-19 09:24:21 -08:00
Zack Cerza
d70d1ad76f Merge pull request #160 from ceph/wip-fix-5149-wusui
Added handling of a 'local' option inside install.py which specifies
2013-12-19 09:23:36 -08:00
Zack Cerza
9a29c3ef71 Log calls to teuthology-report more verbosely
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-19 10:29:30 -06:00
Zack Cerza
b014c71829 Catch every exception here, for now.
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-19 10:29:30 -06:00
Sandon Van Ness
031be56813 Use saucy gitbuilder for arm package checking.
Somehow missed that it checks both the sha1 and the package version file,
and the package version was still using the quantal gitbuilder, which won't
work while the hardware is down.

This was causing scheduling failures.

Signed-off-by: Sandon Van Ness <sandon@inktank.com>
2013-12-18 12:38:50 -08:00
Sage Weil
5320db57ce rados: add in more (optional) op types
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-18 11:41:58 -08:00
Zack Cerza
a0eb1a8e8c Use shell=True to call teuthology-report
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-16 14:22:22 -06:00
Zack Cerza
c22ee528b7 Catch OSError if script isn't in $PATH
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-16 13:34:37 -06:00
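
A minimal sketch of the idea behind c22ee528b7, not the actual change: when a command is launched without a shell and the executable cannot be found in $PATH, Python's subprocess module raises OSError. The helper name `run_report` and its return convention are assumptions made for illustration.

```python
import subprocess


def run_report(args, script="teuthology-report"):
    """Run the reporting script; return its exit code, or None if it is missing.

    Hypothetical helper: the real code may log or handle the error differently.
    """
    try:
        # Without shell=True, a missing executable raises OSError
        # (FileNotFoundError in Python 3, which is a subclass of OSError).
        return subprocess.call([script] + list(args))
    except OSError:
        return None
```

Note that this only applies to calls made without a shell. With shell=True, a missing command is reported through the shell's exit status rather than through an exception.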
Zack Cerza
420fff6207 Revert "Use path when calling teuthology-report. …"
This reverts commit e4b5ab811e.
2013-12-16 11:43:06 -06:00
Sandon Van Ness
e4b5ab811e Use path when calling teuthology-report. …
The 'teuthology-report' command is probably not going to exist
in $PATH, so get the location of the running command and assume it's
in the same path.

Signed-off-by: Sandon Van Ness <sandon@inktank.com>
2013-12-14 07:14:51 -08:00
Zack Cerza
7cb815f792 Merge pull request #162 from jcsp/fsid-conf
Fix FSID not being set in ceph.conf
2013-12-13 09:25:30 -08:00
Zack Cerza
02e0a1e913 Merge pull request #161 from jcsp/ssh-config
Respect .ssh/config when opening SSH connections
2013-12-13 09:24:23 -08:00
Zack Cerza
2e2b8feba2 Skip the 'dead' report on old branches
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-13 10:48:52 -06:00
Sandon Van Ness
36c0344f98 Use saucy gitbuilder when grabbing sha1 for arm.
Old quantal gitbuilders are gone until hardware comes back. Use
the new saucy gitbuilders instead.
2013-12-12 16:04:38 -08:00
Zack Cerza
966dad544b Make sure to report all results.
If a just-finished job was using a teuthology branch not known to
contain the reporting feature, then report the job via the
teuthology-report script. Note that in some cases this will result in
double reporting but the extra load should be negligible.

Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-12 17:33:53 -06:00
Zack Cerza
7f135ec94a Enable reporting of single jobs
(also switch to docopt)

Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-12 17:00:43 -06:00
Zack Cerza
3d23b9b205 Remove the child's stderr completely
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-12 15:45:58 -06:00
John Spray
9ff4d4a4e7 Fix FSID not being set in ceph.conf
Symptom was that 'ceph --admin-daemon... config get fsid'
returned zeros, while correct fsid was present in cluster maps.
Fix it by populating FSID in ceph.conf, after extracting it from
monmap.
2013-12-12 13:34:52 -08:00
Zack Cerza
625f479b68 When starting a job, tell paddles it's running
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-12 11:47:45 -06:00
Sandon Van Ness
a7f87f3a3a Longer timeout after sync/reboot.
With only a 5-second sleep via ssh and python, it looks like a
race condition was sometimes hit where it would think
the machine was back up before the reboot command had completed.

Signed-off-by: Sandon Van Ness <sandon@inktank.com>
2013-12-11 18:07:43 -08:00
John Spray
f3ce07c65f Respect .ssh/config when opening SSH connections
This handles the case where your private key is
in a non-default location that you're pointing
to in ~/.ssh/config.
2013-12-11 13:41:36 -08:00
Warren Usui
0eb784b654 Added handling of a 'local' option inside install.py which specifies
a local directory containing deb or rpm files to be installed.

Fixes: 5149
Signed-off-by: Warren Usui <warren.usui@inktank.com>
2013-12-10 23:45:38 -08:00
Zack Cerza
b3acff1d4f Use continue, not break
Fixes a bug where not all pids were being collected

Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-10 16:48:12 -06:00
Zack Cerza
4a6e47cdce Tweak logic for pid lookup
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-10 16:48:07 -06:00
Zack Cerza
77145f1b7f Fix indentation
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-10 16:25:28 -06:00
Zack Cerza
57574fefc1 Don't show child's stderr, but show archive path
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-10 13:19:56 -06:00
Zack Cerza
339b7c474a Add debug statements
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-10 10:06:39 -06:00
Zack Cerza
971216b5c0 Merge pull request #159 from ceph/wip-cache
rados: allow existing pool(s) to be used
2013-12-10 08:02:51 -08:00
Sage Weil
6c856a2e94 rados: allow existing pool(s) to be used
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-09 16:02:13 -08:00
Sage Weil
2266eeb301 ceph.conf: put 2x command in [global]
so that osdmaptool sees it.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-09 15:37:58 -08:00
Zack Cerza
48b8ba4ad2 Create a DateTime object from the timestamp
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-09 16:57:11 -06:00
Zack Cerza
5ea5018dbe Make -a optional
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-09 16:42:15 -06:00
Zack Cerza
025ab3665f Add missing req: psutil
2013-12-09 16:42:14 -06:00
Zack Cerza
3d6feb4b60 Merge pull request #151 from ceph/wip-distro-kernel
Wip distro kernel
2013-12-09 13:16:33 -08:00
Zack Cerza
d7289f75e8 Auto-restart
If /tmp/teuthology-restart-workers is newer than the running process,
restart.

Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-09 15:01:33 -06:00
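
The check described in d7289f75e8 can be sketched roughly as follows. This is an illustration under assumptions, not the commit's code: the function name `needs_restart` is hypothetical, and the process start time is assumed to be a Unix timestamp recorded when the worker launched.

```python
import os


def needs_restart(sentinel_path, process_start_time):
    """Return True if the sentinel file is newer than the running process.

    sentinel_path: e.g. /tmp/teuthology-restart-workers
    process_start_time: Unix timestamp of when the process started (assumed input)
    """
    try:
        sentinel_mtime = os.path.getmtime(sentinel_path)
    except OSError:
        # No sentinel file (or it is unreadable): nothing asks for a restart.
        return False
    return sentinel_mtime > process_start_time
```

With this shape, touching the sentinel file after the workers have started is enough to signal a restart, and a worker started after the touch will not restart again.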
Zack Cerza
33a3600ff3 Merge pull request #158 from ceph/wip-nuke
make nuke behave
2013-12-09 13:01:03 -08:00
Sage Weil
1b80f4aa1c nuke: ignore exceptions while issuing reboot command
I'm seeing failed tasks (and nuke) leak machines.  It looks like we are
getting an exception on the '... reboot -f -n' command when we should be
ignoring it and waiting for the machine to restart.

For example:
   http://qa-proxy.ceph.com/teuthology/sage-2013-12-08_19:25:06-rados:thrash-wip-tier-foo-basic-plana/136321/teuthology.log

Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-09 11:42:12 -08:00
Sandon Van Ness
478ecc304f Remove unused variable.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
2013-12-09 11:42:06 -08:00
Sandon Van Ness
ce8ff0a3c8 Added additional comments.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
2013-12-09 11:35:23 -08:00
Sage Weil
a276606312 ceph.conf: default to 2x
A bunch of our tests rely on this; they need to be fixed
before we can run at 3x.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-07 13:20:58 -08:00
Sage Weil
c0a4327513 nuke: fix sync before reboot timeout
If you do 'timeout 5 sync' and sync hangs, timeout will block trying to
kill it.

Instead, just background sync, wait a few seconds, and reboot.  This means
we wait a few seconds even if sync returns immediately, but who cares!

Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-06 17:42:23 -08:00
Alfredo Deza
bec1ac191e Merge pull request #157 from ceph/wip-watchdog
Implement a watchdog for queued jobs
2013-12-06 06:18:14 -08:00
Zack Cerza
856f83449c Implement a watchdog for queued jobs
This continually posts the run's status to the results server, if
configured, at an interval defaulting to 600 seconds.

Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-05 17:48:10 -06:00
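
As a rough sketch of the watchdog behavior described in 856f83449c: a loop posts the run's status, sleeps for the interval, and repeats until the run is done. The names `watchdog`, `post_status`, and `is_finished` are hypothetical, and how the real code decides that a run has finished is not stated in the commit message.

```python
import time


def watchdog(post_status, is_finished, interval=600, sleep=time.sleep):
    """Call post_status() every `interval` seconds until is_finished() is true.

    post_status and is_finished are caller-supplied callables (assumptions);
    the 600-second default matches the interval named in the commit message.
    Returns the number of status posts made.
    """
    posts = 0
    while not is_finished():
        post_status()
        posts += 1
        sleep(interval)
    return posts
```

Passing `sleep` as a parameter is only a convenience for testing the loop without waiting; a real watchdog would simply use time.sleep.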
Warren Usui
421192617f A create_if_vm call was made more than once when a lock-many style lock
was performed. This caused downburst to run twice, and the second
downburst failed as a result of the first downburst running.

Fixes: 6933
2013-12-04 17:49:21 -08:00