Zack Cerza
7f135ec94a
Enable reporting of single jobs
...
(also switch to docopt)
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-12 17:00:43 -06:00
Zack Cerza
3d23b9b205
Remove the child's stderr completely
...
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-12 15:45:58 -06:00
Zack Cerza
625f479b68
When starting a job, tell paddles it's running
...
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-12 11:47:45 -06:00
Sandon Van Ness
a7f87f3a3a
Longer timeout after sync/reboot.
...
With only a 5-second sleep via ssh and Python, a race condition
was sometimes hit in which the machine appeared to be back up
before the reboot command had completed.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
2013-12-11 18:07:43 -08:00
Zack Cerza
b3acff1d4f
Use continue, not break
...
Fixes a bug where not all pids were being collected
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-10 16:48:12 -06:00
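
A minimal sketch of the kind of bug that b3acff1d4f ("Use continue, not break") describes. The actual teuthology code is not shown in this log, so the function name `collect_pids` and the `(pid, name)` entry shape are hypothetical. It only illustrates why `break` on a non-matching entry drops later matches, while `continue` keeps scanning.

```python
def collect_pids(entries, wanted_name):
    """Collect the pid of every entry whose name matches wanted_name.

    `entries` is a hypothetical list of (pid, name) tuples.
    """
    pids = []
    for pid, name in entries:
        if name != wanted_name:
            # Skip this entry and keep scanning. Using `break` here would
            # stop at the first non-match and miss any later matches.
            continue
        pids.append(pid)
    return pids
```

With entries `[(1, 'a'), (2, 'b'), (3, 'a')]` and `wanted_name='a'`, a `break` would stop at pid 2 and never reach pid 3.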
Zack Cerza
4a6e47cdce
Tweak logic for pid lookup
...
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-10 16:48:07 -06:00
Zack Cerza
77145f1b7f
Fix indentation
...
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-10 16:25:28 -06:00
Zack Cerza
57574fefc1
Don't show child's stderr, but show archive path
...
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-10 13:19:56 -06:00
Zack Cerza
339b7c474a
Add debug statements
...
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-10 10:06:39 -06:00
Sage Weil
6c856a2e94
rados: allow existing pool(s) to be used
...
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-09 16:02:13 -08:00
Sage Weil
2266eeb301
ceph.conf: put 2x command in [global]
...
so that osdmaptool sees it.
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-09 15:37:58 -08:00
Zack Cerza
48b8ba4ad2
Create a DateTime object from the timestamp
...
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-09 16:57:11 -06:00
Zack Cerza
5ea5018dbe
Make -a optional
...
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-09 16:42:15 -06:00
Zack Cerza
3d6feb4b60
Merge pull request #151 from ceph/wip-distro-kernel
...
Wip distro kernel
2013-12-09 13:16:33 -08:00
Zack Cerza
d7289f75e8
Auto-restart
...
If /tmp/teuthology-restart-workers is newer than the running process,
restart.
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-09 15:01:33 -06:00
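
The auto-restart check described in d7289f75e8 could be sketched roughly as follows. This assumes the comparison is between the file's modification time and the time the worker process started. The helper name `restart_requested` and the `START_TIME` variable are hypothetical, not taken from the commit.

```python
import os
import time


def restart_requested(sentinel_path, process_start_time):
    """Return True if the sentinel file was modified after the process started."""
    try:
        return os.path.getmtime(sentinel_path) > process_start_time
    except OSError:
        # A missing or unreadable sentinel means no restart was requested.
        return False


# Recorded once, when the worker launches (assumption for this sketch).
START_TIME = time.time()
```

A worker loop might call `restart_requested('/tmp/teuthology-restart-workers', START_TIME)` between jobs and restart itself when it returns True.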
Zack Cerza
33a3600ff3
Merge pull request #158 from ceph/wip-nuke
...
make nuke behave
2013-12-09 13:01:03 -08:00
Sage Weil
1b80f4aa1c
nuke: ignore exceptions while issuing reboot command
...
I'm seeing failed tasks (and nuke) leak machines. It looks like we are
getting an exception on the '... reboot -f -n' command when we should be
ignoring it and waiting for the machine to restart.
For example:
http://qa-proxy.ceph.com/teuthology/sage-2013-12-08_19:25:06-rados:thrash-wip-tier-foo-basic-plana/136321/teuthology.log
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-09 11:42:12 -08:00
Sandon Van Ness
478ecc304f
Remove unused variable.
...
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
2013-12-09 11:42:06 -08:00
Sandon Van Ness
ce8ff0a3c8
Added additional comments.
...
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
2013-12-09 11:35:23 -08:00
Sage Weil
a276606312
ceph.conf: default to 2x
...
A bunch of our tests rely on this; they need to be fixed
before we can run at 3x.
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-07 13:20:58 -08:00
Sage Weil
c0a4327513
nuke: fix sync before reboot timeout
...
If you do 'timeout 5 sync' and sync hangs, timeout will block trying to
kill it.
Instead, just background sync, wait a few seconds, and reboot. This means
we wait a few seconds even if sync returns immediately, but who cares!
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-06 17:42:23 -08:00
Zack Cerza
856f83449c
Implement a watchdog for queued jobs
...
This continually posts the run's status to the results server, if
configured, at an interval defaulting to 600 seconds.
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-05 17:48:10 -06:00
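
One way to picture the watchdog from 856f83449c is a loop that posts status, then sleeps for the configured interval until the run finishes. This is a sketch under assumptions: `post_status` and `is_finished` are hypothetical callables standing in for the results-server call and the completion check, neither of which is shown in this log.

```python
import threading


def run_watchdog(post_status, is_finished, interval=600, stop_event=None):
    """Call post_status() every `interval` seconds until is_finished() is true.

    Returns the number of times the status was posted.
    """
    stop_event = stop_event or threading.Event()
    posts = 0
    while not is_finished():
        post_status()
        posts += 1
        # wait() returns True only if the event was set, which lets a
        # caller stop the watchdog early instead of sleeping the full interval.
        if stop_event.wait(interval):
            break
    return posts
```

Using an event's `wait()` rather than `time.sleep()` is a design choice of this sketch, so the loop can be interrupted promptly.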
Warren Usui
421192617f
A create_if_vm call was made more than once when a lock-many style lock
...
was performed. This caused downburst to run twice, and the second
downburst failed as a result of the first downburst running.
Fixes: #6933
2013-12-04 17:49:21 -08:00
Warren Usui
207c910e85
Merge branch 'teuthology-fix-downburst-yaml-wusui'
2013-12-04 17:36:14 -08:00
Warren Usui
94f7dd1f3a
Implement --downburst-conf parameter for teuthology-lock.
...
Load the appropriate yaml information when found (this formerly
did not work). Make sure teuthology --lock works with a downburst
entry in the yaml files. Document how this works in README.rst.
Fixes: #6921
Reviewed-by: Dan Mick
2013-12-04 17:31:55 -08:00
Josh Durgin
5cc60996cf
rbd: make default size larger for xfstests
...
Test 167 runs out of space on newer kernels
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2013-12-03 17:31:45 -08:00
Warren Usui
49a48ae8cf
Merge branch 'wip-fix-teuth-tgt-wusui'
2013-11-25 20:56:24 -08:00
Warren Usui
4c7dd504ca
tgt and iscsi code need some minor fixes. Moved the settle call during
...
simple read testing. In iscsi.py, generic_mkfs and generic_mount need
to be called from the main body of the task. An extraneous iscsiadm
command was removed. The tgt size is no longer hard-coded; it is extracted
from the property and defaults to 10240.
Fixes: #6782
2013-11-25 20:44:52 -08:00
Zack Cerza
e75b2d58a2
Merge pull request #154 from ceph/wip-multi-mtype
...
Wip multi mtype
2013-11-25 15:31:34 -08:00
Sandon Van Ness
c0297b436a
Changes suggested per review.
...
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
2013-11-25 01:19:13 -08:00
Zack Cerza
deec86c703
Also catch httplib2.ServerNotFoundError
...
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-11-22 17:03:29 -06:00
Dan Mick
f6b5acc043
internal.py: nitty little spelling error
...
Signed-off-by: Dan Mick <dan.mick@inktank.com>
2013-11-21 22:04:19 -08:00
Sandon Van Ness
f7af3e723e
Schedule-suite: use the 'multi' tube when scheduling multiple machine types.
...
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
2013-11-21 15:21:19 -08:00
Sandon Van Ness
c38eeec85f
Allow multiple machine types delimited by ,- \t.
...
I was originally attempting a more complicated locking mechanism,
but I think it's almost as good to just have it attempt the other
machine type if one.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
2013-11-21 14:19:44 -08:00
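
Splitting a machine-type string on the delimiters named in c38eeec85f (comma, hyphen, space, tab) might look like the sketch below. The function name `split_machine_types` is hypothetical and the type names in the usage note are only example values. The real parsing code is not shown in this log.

```python
import re


def split_machine_types(machine_type):
    """Split a machine-type string on commas, hyphens, spaces, and tabs."""
    # Runs of delimiters are treated as one separator; empty pieces are dropped.
    return [t for t in re.split(r'[,\-\t ]+', machine_type) if t]
```

For example, `split_machine_types('plana,mira')` yields two entries, and a single type such as `'plana'` yields a one-element list.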
Zack Cerza
d04f3a6ae0
Skip cluster() if use_existing_cluster is True
...
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-11-21 13:56:41 -06:00
SandonV
bf9434dbe7
Merge pull request #153 from ceph/wip-6790
...
Reviewed by Warren.
2013-11-20 18:03:04 -08:00
Sandon Van Ness
c5a26b38de
Use shortened version in order to avoid revision/arch mishaps.
...
Sometimes a -X suffix is added to package names that does not exist in
the /version file. Simply using the version string does not work on
RHEL (it does on CentOS). Until the version and the packages match
exactly, we will instead split the version at the - and no longer
specify the dist, trading slightly lower accuracy for better
reliability.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
2013-11-20 16:37:31 -08:00
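
The shortening described in c5a26b38de amounts to keeping only the part of the version string before the first hyphen. A minimal sketch, with a hypothetical helper name and a made-up version string in the usage note:

```python
def short_version(version):
    """Return the part of a version string before the first '-'."""
    # A string with no '-' is returned unchanged.
    return version.split('-', 1)[0]
```

For an illustrative input such as `'0.72-1'`, this returns `'0.72'`, dropping the revision suffix that may not match the package name.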
Zack Cerza
f8150d44d0
Add optional 'use_existing_cluster' flag
...
If this flag is present, skip a few unnecessary steps
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-11-20 16:23:07 -06:00
Sandon Van Ness
39830c613e
Fix ceph.repo so it uses URI value.
...
In some cases ceph-releases would point to the wrong branch/build
when two branches had the same sha1.
This fixes that.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
2013-11-14 21:47:41 -08:00
Samuel Just
04322d9fbb
ceph_manager: provide unique pool names to avoid collision
...
Fixes: #6769
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-11-14 15:13:37 -08:00
Josh Durgin
07db94ef26
syslog: ignore perf nmi handler timeout
...
This seems to have started appearing in recent 3.12+ kernels
with perf enabled.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2013-11-13 15:27:30 -08:00
Zack Cerza
88792d62e1
Make report_job() always return an int
2013-11-12 17:07:15 -06:00
Sandon Van Ness
96cfb11b91
Add some debug logging.
...
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
2013-11-12 13:04:00 -08:00
Sandon Van Ness
f0e01ad0e5
Distro kernel bug-fixes.
...
Fixed some things that were being done incorrectly.
Some distro kernels have no debug, so added | true when disabling
kdb. Also changed the logic that skipped kernels on non-Ubuntu so that
it also schedules a kernel install for a distro kernel.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
2013-11-08 14:35:51 -08:00
Zack Cerza
8d9b86f5d7
Merge pull request #146 from ceph/wip-os-type
...
Wip os type
2013-11-08 12:24:42 -08:00
Sandon Van Ness
03f31c6caf
Consolidate two excepts into one.
...
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
2013-11-08 11:02:48 -08:00
Zack Cerza
b3e730e346
Also catch socket.error in try_push_job_info
2013-11-07 18:39:16 -06:00
Zack Cerza
d8f98201ac
Don't re-call logging.basicConfig()
...
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-11-06 16:04:39 -06:00
Zack Cerza
3fd3bd966d
Fix hilariously long sentry_event para
...
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-11-05 15:09:36 -06:00
Zack Cerza
ed81960242
Don't use create_run() unless necessary
...
Runs are created automatically now.
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-11-04 14:56:13 -06:00