2230 Commits

Author SHA1 Message Date
Zack Cerza
eb667673e4 Pass timeout to _spawn_on_all_clients()
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-03-08 14:16:07 -06:00
Zack Cerza
4e01884e14 Log correct action in CephManager.remove_pool()
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-03-08 08:18:53 -06:00
Zack Cerza
72c63f13fa Log timeout value
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-03-08 08:01:00 -06:00
Zack Cerza
b4205caedf Iterate more sensibly over processes
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-03-08 07:58:13 -06:00
Zack Cerza
204b3ac710 Change default workunit timeout to 1h
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-03-07 15:14:50 -06:00
Zack Cerza
ec38bd3cb3 Use safe_while's action arg
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-03-07 14:04:49 -06:00
Zack Cerza
73f5af2f6a Add optional 'action' parameter to safe_while
This is to make it easier to see what actually timed out when scanning
error logs

Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-03-07 14:02:33 -06:00
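
The retry helper described in the safe_while commits above can be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation: the names `bounded_while` and `MaxTriesExceeded` and the `_sleep` hook are hypothetical. The defaults (6 seconds, 10 tries, no increment) and the optional `action` label come from the commit messages.

```python
import time
from contextlib import contextmanager


class MaxTriesExceeded(Exception):
    """Raised when the loop runs out of attempts (hypothetical name)."""


@contextmanager
def bounded_while(sleep=6, tries=10, increment=0, action=None,
                  _sleep=time.sleep):
    """Yield a callable that lets a `while` loop run at most `tries` times.

    `action` is a short description included in the error message so that
    logs show what was being waited for. `_sleep` exists only to make the
    sketch testable without real delays.
    """
    attempts = 0

    def proceed():
        nonlocal attempts
        if attempts >= tries:
            what = " while " + action if action else ""
            raise MaxTriesExceeded(
                "gave up after {} tries{}".format(tries, what))
        if attempts > 0:
            # Wait before every attempt except the first one.
            _sleep(sleep + increment * (attempts - 1))
        attempts += 1
        return True

    yield proceed
```

A caller would write `with bounded_while(action="waiting for pool") as proceed:` followed by `while proceed():` and break out of the loop once the awaited condition holds.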
Zack Cerza
7604a1b670 Update safe_while users to reflect new defaults
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-03-07 13:36:35 -06:00
Zack Cerza
8258c8479b Change safe_while defaults to 6s 10x no increment
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-03-07 13:33:27 -06:00
Zack Cerza
081a5c4bf9 Look for ready() in the right place
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-03-07 13:03:58 -06:00
Sage Weil
ad04075b17 Merge pull request #218 from ceph/wip-radosbench-timeout
Introduce a timeout to radosbench's join phase
2014-03-07 09:39:04 -08:00
Zack Cerza
1778d35847 Use a timeout of config.get('time') * 2
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-03-07 11:36:45 -06:00
Zack Cerza
0be5f1f015 Introduce a timeout to radosbench's join phase
Set to 15min right now.

Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-03-07 11:21:31 -06:00
Zack Cerza
4b11d072bf Mark this 'while True' loop with 'finite' comment
If we're going to embark on a mission to rid ourselves of
infinitely-looping while loops, it seems smart to start marking the ones
we've fixed in order to make grepping for unfixed loops easier.

Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-03-06 23:09:44 -06:00
Zack Cerza
20bfc97844 Give up on wait_until_healthy() after 15min
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-03-06 22:32:29 -06:00
Zack Cerza
ac6ebf87b3 Merge pull request #217 from ceph/wip-fix-plus
suite: fix + handling
2014-03-06 20:00:09 -06:00
Sage Weil
94d73bd411 suite: fix build_matrix for + case
The + means we should concatenate everything in the directory.  Do that.

This was totally broken before (and unused until now).

Signed-off-by: Sage Weil <sage@inktank.com>
2014-03-06 17:56:01 -08:00
Josh Durgin
57259b54be rados: use backwards compatible args
For ops that default to 0, only add arguments for them if they are
specified in the task config. This lets us use the same task across
ceph versions, even if the older version does not support new op
types, like append on dumpling.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2014-03-06 17:32:10 -08:00
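
The approach in the rados commit above might look roughly like this. It is a sketch under assumptions: the function name, the `--op` flag spelling, and the default weights are illustrative and not taken from the task's code. Only the rule itself (arguments for zero-default ops are emitted only when the config names them) comes from the commit message.

```python
# Hypothetical defaults for ops that are always passed to the test binary.
DEFAULT_WEIGHTS = {"read": 100, "write": 100, "delete": 10}

# Ops whose default weight is 0, e.g. op types older releases do not know.
ZERO_DEFAULT_OPS = ("append",)


def build_op_args(op_weights):
    """Build the op arguments from a task config's op weights."""
    args = []
    for op, default in DEFAULT_WEIGHTS.items():
        args += ["--op", op, str(op_weights.get(op, default))]
    for op in ZERO_DEFAULT_OPS:
        # Only mention the op when the config asks for it, so an older
        # binary never sees an argument it cannot parse.
        if op in op_weights:
            args += ["--op", op, str(op_weights[op])]
    return args
```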
Sage Weil
95d445daea Merge remote-tracking branch 'gh/firefly'
2014-03-06 16:57:45 -08:00
Sage Weil
70de7d52d3 Revert "Do not spawn a parallel task if dictionary entry does not exist."
This reverts commit dadc9f7d0b.
2014-03-06 16:56:14 -08:00
Samuel Just
9a8bf66d70 ceph.conf.template: add in sensible erasure coding defaults
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-05 12:39:10 -08:00
Samuel Just
e6698af818 ceph_manager: fix erasure coding m, k values
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-05 12:38:51 -08:00
Zack Cerza
e69da0a5ea Log job PID
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-03-05 14:13:04 -06:00
Sage Weil
c5f0e7181c Merge pull request #216 from ceph/wip-workunit-timeout
Add a 6h timeout to workunits
2014-03-05 10:14:50 -08:00
Zack Cerza
c3c0b080f6 Add a 6h timeout to workunits
The timeout is configurable, but defaults to six hours. It's implemented
by using the 'timeout' command on the remote host.

Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-03-05 11:17:13 -06:00
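
Wrapping a remote command with the coreutils `timeout` command, as the workunit commit above describes, could be sketched like this. The helper name `wrap_with_timeout` is hypothetical; the six-hour default is the one stated in the commit message.

```python
def wrap_with_timeout(args, timeout="6h"):
    """Prefix a command's argument list with `timeout <duration>`.

    GNU `timeout` accepts durations with s/m/h/d suffixes and exits with
    status 124 when the wrapped command is killed for running too long.
    """
    return ["timeout", str(timeout)] + list(args)
```

For example, `wrap_with_timeout(["./workunit.sh"], timeout="1h")` yields an argument list that can be handed to whatever runs commands on the remote host.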
Zack Cerza
963fb881cb Merge pull request #215 from dachary/wip-ec-pool
ceph-manager: fix ec_pool parameters
2014-03-03 14:47:52 -06:00
Zack Cerza
3006f23143 Merge pull request #207 from ceph/wip-7356
helper for bombing out of infinite loops
2014-03-03 10:24:27 -06:00
Loic Dachary
7889acbf65 ceph-manager: fix ec_pool parameters
* the crush ruleset and the pool create parameters must be identical:
  k=2 m=1
* the --property argument is invalid
* the failure domain is ignored on pool create

Signed-off-by: Loic Dachary <loic@dachary.org>
2014-03-02 00:41:21 +01:00
Alfredo Deza
a01d3ca87d Merge pull request #213 from ceph/wip-kdb-except
Allow setting kdb to fail.
2014-02-28 18:32:25 -05:00
Sandon Van Ness
fd507ed35a Allow setting kdb to fail.
Some kernels (primarily Debian distro kernels) do not support
setting kdb. Allow that step to fail rather than failing the entire test.

Signed-off-by: Sandon Van Ness <sandon@inktank.com>
2014-02-28 14:16:32 -08:00
wusui
4f31eb01c0 Merge pull request #212 from ceph/wip-limit
Added --limit option to teuthology-suite.
2014-02-28 11:13:46 -08:00
Zack Cerza
4b5338faee Merge pull request #211 from ceph/wip-7554
mds_thrash #7554
2014-02-28 11:04:23 -06:00
Yuri Weinstein
bd9748d562 Added --limit option to teuthology-suite.
Use --limit to limit the number of jobs being scheduled during
teuthology-suite. Also can be used with schedule_suite.sh via the
10th argument.

Signed-off-by: Sandon Van Ness <sandon@inktank.com>
Signed-off-by: Yuri Weinstein <yuri.weinstein@inktank.com>
2014-02-28 00:34:58 +00:00
John Spray
8dfcfa4a7e mds_thrash: Fix a potential getitem on None
get_mds_status returns None for things it can't see,
so have to check for Noneness on all its outputs.

Signed-off-by: John Spray <john.spray@inktank.com>
2014-02-27 18:39:45 +00:00
John Spray
22825c25a5 mds_thrash: Refactor gevent usage + get traceback
This simplifies the code to make MdsTrash be a greenlet
(as it logically is) rather than encapsulating one that
gets started in __init__ (spawning threads in constructors
is evil).

With this done, do_thrash is called from _run inside an
exception handler that will give us full tracebacks
if something bad happens.

Signed-off-by: John Spray <john.spray@inktank.com>
2014-02-27 18:39:45 +00:00
John Spray
f12426c3ec mds_thrash: PEP8-ize whitespace
...so that I can edit the code in a python IDE without
it lighting up like a christmas tree!

Signed-off-by: John Spray <john.spray@inktank.com>
2014-02-27 14:25:13 +00:00
Zack Cerza
3c87b8497a Worker logging tweaks
Change some statements' log levels; don't show bootstrap output if there
is no error.

Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-02-26 17:15:37 -06:00
Zack Cerza
0dcf3f4d71 --dead implies --refresh
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-02-26 16:41:43 -06:00
Zack Cerza
d42f31e5d3 Symlink worker log after child starts
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-02-26 16:22:32 -06:00
Zack Cerza
34478127d2 In find_job_info(), also look for orig.config.yaml
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-02-26 13:13:41 -06:00
Zack Cerza
f8a2a53c59 Push complete info when reporting jobs as dead
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-02-26 12:09:04 -06:00
Zack Cerza
0db35b9571 Merge pull request #210 from ceph/wip-queue
Add teuthology-queue command for beanstalk management.
2014-02-26 11:47:02 -06:00
Gregory Farnum
700bb94ba4 Merge pull request #208 from ceph/wip-7485
task: Add mds_creation_failure

Reviewed-by: Greg Farnum <greg@inktank.com>
2014-02-26 09:46:43 -08:00
Alfredo Deza
6ba89851f1 fix docstring typo
Signed-off-by: Alfredo Deza <alfredo.deza@inktank.com>
2014-02-26 08:17:48 -05:00
John Spray
7cc93751e7 task: Add mds_creation_failure
This is test code to accompany CephFS fix #7485.

Also fix DaemonState.wait_for_exit to clear up its 'proc'
attribute even if it fails, so that subsequent calls to 'restart'
happen properly.

Signed-off-by: John Spray <john.spray@inktank.com>
2014-02-26 13:03:15 +00:00
Sandon Van Ness
26f00fc541 Make help a bit more obvious. Misc tweaks.
Put each yaml in the job_description on its own line so the output is
not so wide. Make delete default to None rather than False in the function.

Signed-off-by: Sandon Van Ness <sandon@inktank.com>
2014-02-25 19:57:41 -08:00
Sandon Van Ness
e04f8fd3cd Add teuthology-queue command for beanstalk management.
Supports listing entire queue of machine type and deleting test
suite runs from the queue without wiping the entire queue.

Signed-off-by: Sandon Van Ness <sandon@inktank.com>
2014-02-25 11:13:31 -08:00
Alfredo Deza
2591935180 use itertools for seconds sum
Signed-off-by: Alfredo Deza <alfredo.deza@inktank.com>
2014-02-24 16:22:59 -05:00
Alfredo Deza
60892ca995 tests for the new while helper
Signed-off-by: Alfredo Deza <alfredo.deza@inktank.com>
2014-02-24 16:21:58 -05:00
Zack Cerza
38cead630b Flip logic of checking whether a branch can report
Previously we checked if the branch being used was in a whitelist of
branches known to contain the reporting feature. Now, switch to checking
against a blacklist of branches known to *not* have the feature:
argonaut, bobtail, cuttlefish and dumpling.

Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-02-24 15:01:58 -06:00
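
The flipped check described in the last commit above can be sketched as follows. The constant and function names are hypothetical; the four branch names are the ones listed in the commit message.

```python
# Branches known NOT to contain the reporting feature.
NON_REPORTING_BRANCHES = ("argonaut", "bobtail", "cuttlefish", "dumpling")


def branch_can_report(branch):
    """Return True unless the branch is known to lack the feature."""
    return branch not in NON_REPORTING_BRANCHES
```

With a blacklist, a newly created branch is assumed to support reporting without the list having to be updated, which the earlier whitelist would have required.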