2136 Commits

Author SHA1 Message Date
Christophe Courtaut
4d3c1a1997 Adds radosgw-agent small file sync test
Signed-off-by: Christophe Courtaut <christophe.courtaut@gmail.com>
2014-03-10 00:29:50 -07:00
Sage Weil
a4dfbc88f3 workunit: change timeout 1h -> 3h
Signed-off-by: Sage Weil <sage@inktank.com>
2014-03-09 10:29:37 -07:00
Zack Cerza
0b9d8936c2 Add missing space in error message
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-03-08 17:26:14 -06:00
Zack Cerza
e471f404e6 Make try_push_job_info() retry using safe_while
I've noticed sometimes try_push_job_info() fails because of server load
issues. It should try more than once (and now does).

Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-03-08 15:36:22 -06:00
Zack Cerza
73849c1179 Update safe_while's suggested usage pattern
I didn't love the way safe_while was encouraged to be used and it didn't
fit right with the new no-raising behavior. Now it's encouraged to be
used like this:

with safe_while() as proceed:
    while proceed():
        do_things()

Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-03-08 15:19:31 -06:00
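A minimal sketch of the pattern this commit describes, not teuthology's actual implementation. The `action` and `_raise` parameters and the "6s 10x no increment" defaults come from other commits in this log; the names `sleep`, `tries`, `increment`, `_sleeper`, and `MaxWhileTries` are assumptions made for illustration.

```python
import logging
import time
from contextlib import contextmanager

log = logging.getLogger(__name__)


class MaxWhileTries(Exception):
    """Raised when the loop runs out of tries (hypothetical name)."""


@contextmanager
def safe_while(sleep=6, tries=10, increment=0, action=None, _raise=True,
               _sleeper=time.sleep):
    """Yield a proceed() callable that bounds an otherwise open-ended loop."""
    state = {"count": 0, "sleep": sleep}

    def proceed():
        state["count"] += 1
        if state["count"] == 1:
            # First pass: run the loop body immediately, without sleeping.
            return True
        if state["count"] > tries:
            msg = "gave up on %s after %d tries" % (action or "loop", tries)
            if _raise:
                raise MaxWhileTries(msg)
            # With _raise=False, log a warning and end the loop quietly.
            log.warning(msg)
            return False
        _sleeper(state["sleep"])
        state["sleep"] += increment
        return True

    yield proceed


# Usage in the suggested style; sleep=0 keeps the demo instant.
attempts = []
with safe_while(sleep=0, tries=3, action="demo", _raise=False) as proceed:
    while proceed():
        attempts.append("tried")
```

Here the loop body runs three times and then stops with a logged warning. With the default `_raise=True` the same loop would end by raising the exception out of the `with` block instead.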
Zack Cerza
c98098496e Add optional _raise parameter
Defaults to True. If set to False, log a warning when giving up
instead of raising an exception.

Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-03-08 14:58:39 -06:00
Zack Cerza
eb667673e4 Pass timeout to _spawn_on_all_clients()
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-03-08 14:16:07 -06:00
Zack Cerza
4e01884e14 Log correct action in CephManager.remove_pool()
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-03-08 08:18:53 -06:00
Zack Cerza
72c63f13fa Log timeout value
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-03-08 08:01:00 -06:00
Zack Cerza
b4205caedf Iterate more sensibly over processes
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-03-08 07:58:13 -06:00
Zack Cerza
204b3ac710 Change default workunit timeout to 1h
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-03-07 15:14:50 -06:00
Zack Cerza
ec38bd3cb3 Use safe_while's action arg
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-03-07 14:04:49 -06:00
Zack Cerza
73f5af2f6a Add optional 'action' parameter to safe_while
This is to make it easier to see what actually timed out when scanning
error logs

Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-03-07 14:02:33 -06:00
Zack Cerza
7604a1b670 Update safe_while users to reflect new defaults
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-03-07 13:36:35 -06:00
Zack Cerza
8258c8479b Change safe_while defaults to 6s 10x no increment
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-03-07 13:33:27 -06:00
Zack Cerza
081a5c4bf9 Look for ready() in the right place
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-03-07 13:03:58 -06:00
Sage Weil
ad04075b17 Merge pull request #218 from ceph/wip-radosbench-timeout
Introduce a timeout to radosbench's join phase
2014-03-07 09:39:04 -08:00
Zack Cerza
1778d35847 Use a timeout of config.get('time') * 2
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-03-07 11:36:45 -06:00
Zack Cerza
0be5f1f015 Introduce a timeout to radosbench's join phase
Set to 15min right now.

Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-03-07 11:21:31 -06:00
Zack Cerza
4b11d072bf Mark this 'while True' loop with 'finite' comment
If we're going to embark on a mission to rid ourselves of
infinitely-looping while loops, it seems smart to start marking the ones
we've fixed in order to make grepping for unfixed loops easier.

Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-03-06 23:09:44 -06:00
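The grepping idea in this commit can be sketched as a small scanner that lists `while True` loops not yet marked. This is only an illustration: it assumes the marker is the word `finite` in a trailing comment on the same line as the `while`, which the commit message does not spell out, and `unmarked_loops` is a hypothetical helper name.

```python
import re

# Matches a line that starts a bare "while True:" loop.
WHILE_TRUE = re.compile(r"^\s*while\s+True\s*:")


def unmarked_loops(source, marker="finite"):
    """Return 1-based line numbers of 'while True' loops lacking the marker."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if WHILE_TRUE.match(line) and marker not in line:
            hits.append(lineno)
    return hits


sample = (
    "while True:  # finite\n"
    "    step()\n"
    "while True:\n"
    "    spin()\n"
)
remaining = unmarked_loops(sample)
```

In the sample, only the second loop (line 3) is reported, since the first carries the marker.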
Zack Cerza
20bfc97844 Give up on wait_until_healthy() after 15min
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-03-06 22:32:29 -06:00
Zack Cerza
ac6ebf87b3 Merge pull request #217 from ceph/wip-fix-plus
suite: fix + handling
2014-03-06 20:00:09 -06:00
Sage Weil
94d73bd411 suite: fix build_matrix for + case
The + means we should concatenate everything in the directory.  Do that.

This was totally broken before (and unused until now).

Signed-off-by: Sage Weil <sage@inktank.com>
2014-03-06 17:56:01 -08:00
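A toy model of the `+` behaviour described above, under heavy assumptions: a directory is represented as a plain list of entry names, and `combine_directory` is a hypothetical helper, not the real `build_matrix` signature. The only point it illustrates is that a `+` entry turns the directory's contents into one concatenated combination instead of separate alternatives.

```python
def combine_directory(entries):
    """Return the list of combinations a directory contributes.

    entries: names found in the directory (in-memory stand-in for a listing).
    """
    names = sorted(e for e in entries if e != "+")
    if "+" in entries:
        # '+' present: everything in the directory is concatenated
        # into a single combination.
        return [names]
    # No '+': each entry is its own alternative.
    return [[name] for name in names]


with_plus = combine_directory(["+", "b.yaml", "a.yaml"])
without_plus = combine_directory(["b.yaml", "a.yaml"])
```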
Josh Durgin
57259b54be rados: use backwards compatible args
For ops that default to 0, only add arguments for them if they are
specified in the task config. This lets us use the same task across
ceph versions, even if the older version does not support new op
types, like append on dumpling.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2014-03-06 17:32:10 -08:00
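One way to picture the rule in this commit is an argument builder that always emits the long-standing ops but emits the newer, default-0 ops only when the task config names them. This is a sketch under assumptions: the `--op NAME WEIGHT` flag shape, the default weights, and every op name other than `append` are illustrative and not taken from the commit.

```python
# Ops assumed to be understood by every version, with illustrative defaults.
ALWAYS_SENT = {"read": 100, "write": 100, "delete": 10}
# Newer ops whose weight defaults to 0; older binaries may not accept them.
OPTIONAL_OPS = ("append", "copy_from")


def build_op_args(op_weights):
    """Build the op-weight arguments from a task config mapping."""
    args = []
    for op, default in ALWAYS_SENT.items():
        args.extend(["--op", op, str(op_weights.get(op, default))])
    for op in OPTIONAL_OPS:
        # Only mention an optional op if the config asked for it, so the
        # same task still runs against versions that lack it.
        if op in op_weights:
            args.extend(["--op", op, str(op_weights[op])])
    return args


old_style = build_op_args({"read": 50})
new_style = build_op_args({"append": 5})
```

With an empty or old-style config the optional ops never appear on the command line, which is what keeps the task usable on older releases.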
Sage Weil
95d445daea Merge remote-tracking branch 'gh/firefly'
2014-03-06 16:57:45 -08:00
Sage Weil
70de7d52d3 Revert "Do not spawn a parallel task if dictionary entry does not exist."
This reverts commit dadc9f7d0b.
2014-03-06 16:56:14 -08:00
Samuel Just
9a8bf66d70 ceph.conf.template: add in sensible erasure coding defaults
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-05 12:39:10 -08:00
Samuel Just
e6698af818 ceph_manager: fix erasure coding m, k values
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-05 12:38:51 -08:00
Zack Cerza
e69da0a5ea Log job PID
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-03-05 14:13:04 -06:00
Sage Weil
c5f0e7181c Merge pull request #216 from ceph/wip-workunit-timeout
Add a 6h timeout to workunits
2014-03-05 10:14:50 -08:00
Zack Cerza
c3c0b080f6 Add a 6h timeout to workunits
The timeout is configurable, but defaults to six hours. It's implemented
by using the 'timeout' command on the remote host.

Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-03-05 11:17:13 -06:00
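The approach described here, prefixing the remote command with the coreutils `timeout` utility, might look roughly like the following. `with_timeout` is a hypothetical helper name and the script path is a placeholder; only the configurable six-hour default and the use of `timeout` come from the commit message.

```python
def with_timeout(args, timeout="6h"):
    """Prefix a remote command's argument list with `timeout <duration>`.

    GNU timeout accepts durations with s/m/h/d suffixes and exits with
    status 124 when the command is killed for running too long.
    """
    if timeout is None:
        # Allow callers to opt out of the wrapper entirely.
        return list(args)
    return ["timeout", str(timeout)] + list(args)


default_cmd = with_timeout(["./workunit.sh"])
short_cmd = with_timeout(["./workunit.sh"], timeout="1h")
```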
Zack Cerza
963fb881cb Merge pull request #215 from dachary/wip-ec-pool
ceph-manager: fix ec_pool parameters
2014-03-03 14:47:52 -06:00
Zack Cerza
3006f23143 Merge pull request #207 from ceph/wip-7356
helper for bombing out of infinite loops
2014-03-03 10:24:27 -06:00
Loic Dachary
7889acbf65 ceph-manager: fix ec_pool parameters
* the crush ruleset and the pool create parameters must be identical:
  k=2 m=1
* the --property argument is invalid
* the failure domain is ignored on pool create

Signed-off-by: Loic Dachary <loic@dachary.org>
2014-03-02 00:41:21 +01:00
Alfredo Deza
a01d3ca87d Merge pull request #213 from ceph/wip-kdb-except
Allow setting kdb to fail.
2014-02-28 18:32:25 -05:00
Sandon Van Ness
fd507ed35a Allow setting kdb to fail.
Some kernels (primarily Debian distro kernels) do not support
setting kdb. Tolerate that failure rather than having the entire
test fail.

Signed-off-by: Sandon Van Ness <sandon@inktank.com>
2014-02-28 14:16:32 -08:00
wusui
4f31eb01c0 Merge pull request #212 from ceph/wip-limit
Added --limit option to teuthology-suite.
2014-02-28 11:13:46 -08:00
Zack Cerza
4b5338faee Merge pull request #211 from ceph/wip-7554
mds_thrash #7554
2014-02-28 11:04:23 -06:00
Yuri Weinstein
bd9748d562 Added --limit option to teuthology-suite.
Use --limit to limit the number of jobs scheduled by
teuthology-suite. It can also be used with schedule_suite.sh via the
10th argument.

Signed-off-by: Sandon Van Ness <sandon@inktank.com>
Signed-off-by: Yuri Weinstein <yuri.weinstein@inktank.com>
2014-02-28 00:34:58 +00:00
John Spray
8dfcfa4a7e mds_thrash: Fix a potential getitem on None
get_mds_status returns None for things it can't see,
so we have to check all of its outputs for None.

Signed-off-by: John Spray <john.spray@inktank.com>
2014-02-27 18:39:45 +00:00
John Spray
22825c25a5 mds_thrash: Refactor gevent usage + get traceback
This simplifies the code to make MdsTrash be a greenlet
(as it logically is) rather than encapsulating one that
gets started in __init__ (spawning threads in constructors
is evil).

With this done, do_thrash is called from _run inside an
exception handler that will give us full tracebacks
if something bad happens.

Signed-off-by: John Spray <john.spray@inktank.com>
2014-02-27 18:39:45 +00:00
John Spray
f12426c3ec mds_thrash: PEP8-ize whitespace
...so that I can edit the code in a Python IDE without
it lighting up like a Christmas tree!

Signed-off-by: John Spray <john.spray@inktank.com>
2014-02-27 14:25:13 +00:00
Zack Cerza
3c87b8497a Worker logging tweaks
Change some statements' log levels; don't show bootstrap output if there
is no error.

Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-02-26 17:15:37 -06:00
Zack Cerza
0dcf3f4d71 --dead implies --refresh
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-02-26 16:41:43 -06:00
Zack Cerza
d42f31e5d3 Symlink worker log after child starts
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-02-26 16:22:32 -06:00
Zack Cerza
34478127d2 In find_job_info(), also look for orig.config.yaml
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-02-26 13:13:41 -06:00
Zack Cerza
f8a2a53c59 Push complete info when reporting jobs as dead
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-02-26 12:09:04 -06:00
Zack Cerza
0db35b9571 Merge pull request #210 from ceph/wip-queue
Add teuthology-queue command for beanstalk management.
2014-02-26 11:47:02 -06:00
Gregory Farnum
700bb94ba4 Merge pull request #208 from ceph/wip-7485
task: Add mds_creation_failure

Reviewed-by: Greg Farnum <greg@inktank.com>
2014-02-26 09:46:43 -08:00
Alfredo Deza
6ba89851f1 fix docstring typo
Signed-off-by: Alfredo Deza <alfredo.deza@inktank.com>
2014-02-26 08:17:48 -05:00