112 Commits

Author SHA1 Message Date
Nathan Cutler
bc76b39a30 qa/tasks/ceph.py: fail test if osd devices not found
Fixes: https://tracker.ceph.com/issues/42357
Signed-off-by: Nathan Cutler <ncutler@suse.com>
2020-03-25 09:35:01 +01:00
Nathan Cutler
9abebf28a8 qa/tasks/ceph.py: use .format to log dicts
The ".format" builtin logs dicts nicely right out of the box.

Also, some of the log messages were too cryptic - fixed them in this commit as
well.

Signed-off-by: Nathan Cutler <ncutler@suse.com>
2020-03-25 09:33:44 +01:00
Nathan Cutler
ad477be286 qa/tasks/ceph.py: drop roles_to_journals and remote_to_roles_to_journals
These do not seem to get any use anymore.

Signed-off-by: Nathan Cutler <ncutler@suse.com>
2020-03-25 09:33:44 +01:00
Nathan Cutler
1393317129 qa/tasks/ceph.py: drop block_journal, tmpfs_journal
I looked, but did not find any tests that actually use these options.

Signed-off-by: Nathan Cutler <ncutler@suse.com>
2020-03-25 09:33:44 +01:00
Nathan Cutler
51c714d9b2 qa/tasks/ceph.py: cleanup: stop calling get_wwn_id_map()
Nowadays, get_wwn_id_map is essentially a noop - it does:

    return dict((d, d) for d in devs)

This reverts another bit of 8f720454cb from 2013.

References: https://tracker.ceph.com/issues/42313
Signed-off-by: Nathan Cutler <ncutler@suse.com>
2020-03-25 09:33:44 +01:00
Kefu Chai
da736c22c5 qa/tasks/ceph.py: quote "<kind>" in command line
otherwise bash will interpret "kind" as a file when handling a command like
```
sudo zgrep <kind> /var/log/ceph/valgrind/* /dev/null | sort | uniq
```
and try to feed its content to zgrep, and write the output of zgrep
to /var/log/ceph/valgrind/*. this is not the intended behavior. what we
want to do is to pass "<kind>" as an argument to zgrep, along with
the globbed file names which match "/var/log/ceph/valgrind/*".

in this change, "<kind>" is quoted in the command line. this is also
what `pipes.quote()` did before the change in
35cf5131e7.

this addresses the regression introduced by
35cf5131e7.
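
A minimal sketch of the quoting described above, not the code from the commit. It assumes Python 3, where `shlex.quote()` is the documented counterpart of `pipes.quote()`; the helper name `build_zgrep_command` is hypothetical.

```python
import shlex


def build_zgrep_command(kind, log_glob="/var/log/ceph/valgrind/*"):
    """Build the zgrep pipeline with the search pattern safely quoted.

    Only the pattern is quoted. The glob stays unquoted so that the
    shell still expands it into the matching file names.
    """
    return "sudo zgrep {kind} {glob} /dev/null | sort | uniq".format(
        kind=shlex.quote(kind), glob=log_glob)


# Unquoted, "<kind>" would be parsed as "< kind" (read from a file named
# "kind") followed by ">" (redirect output to the next word).
cmd = build_zgrep_command("<kind>")
print(cmd)
```

With the pattern wrapped in single quotes, the shell passes `<kind>` to zgrep as a plain argument and performs no redirection.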

Fixes: https://tracker.ceph.com/issues/44454
Signed-off-by: Kefu Chai <kchai@redhat.com>
2020-03-06 12:17:42 +08:00
Kyr Shatskyy
fc5662957b qa/tasks/ceph: py3 compatibility
Addresses:
  TypeError: 'dict_values' object is not subscriptable
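
The error can be reproduced with a short sketch, assuming Python 3. The `roles` dict is illustrative data, not taken from ceph.py.

```python
# Illustrative mapping; the real task works with teuthology role data.
roles = {"mon.a": "host1", "osd.0": "host2"}

# On Python 3, dict.values() returns a view, which cannot be indexed.
try:
    first = roles.values()[0]
except TypeError as exc:
    error = str(exc)

# Portable alternatives: materialize a list, or take the first item lazily.
first = list(roles.values())[0]
also_first = next(iter(roles.values()))
```

Both alternatives behave the same on Python 2 and Python 3.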

Signed-off-by: Kyr Shatskyy <kyrylo.shatskyy@suse.com>
2020-03-04 13:09:16 +08:00
Kyr Shatskyy
e46eb8348e qa/tasks: fix imports for py3 compatibility
Signed-off-by: Kyr Shatskyy <kyrylo.shatskyy@suse.com>
2020-03-04 13:09:16 +08:00
Kyr Shatskyy
35cf5131e7 qa/tasks/ceph: get rid of cStringIO for py3 compat
Use io.BytesIO instead of cStringIO.StringIO
Use six.ensure_str whenever it needs to convert binary to str.
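
A rough sketch of the pattern, using only the standard library. The helper `to_text` is a hypothetical stand-in for `six.ensure_str` on Python 3, and the buffer contents are made up for illustration.

```python
import io


def to_text(value, encoding="utf-8"):
    """Return str for bytes or str input (stand-in for six.ensure_str)."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# A binary buffer replaces cStringIO.StringIO; command output arrives as bytes.
stdout = io.BytesIO()
stdout.write(b"HEALTH_OK\n")

# Convert to str only at the point where text is needed.
status = to_text(stdout.getvalue()).strip()
```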

Signed-off-by: Kyr Shatskyy <kyrylo.shatskyy@suse.com>
2020-03-04 13:09:16 +08:00
Sage Weil
3a10b54a6a qa/tasks/ceph.py: add pre-mgr-commands option for ceph task
Signed-off-by: Sage Weil <sage@redhat.com>
2020-02-19 15:31:26 -06:00
Sage Weil
1dc2a8a09e qa/tasks/ceph: only re-request scrub on unscrubbed pgs
If we haven't scrubbed everything, we occasionally re-request scrub in case
the request was missed by the OSD (this can happen).  But we were
re-requesting scrub on ALL pgs, and if they are done in a
semi-deterministic order and are slow, then we may never get to the final
ones.
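
The idea can be sketched as a filter over PG stats. This is an assumption-laden illustration: the dict shape, the numeric timestamps, and the name `pgs_needing_scrub` are hypothetical and do not come from ceph.py.

```python
def pgs_needing_scrub(pg_stats, check_time):
    """Return the ids of PGs whose last scrub predates check_time."""
    return [pg["pgid"] for pg in pg_stats
            if pg["last_scrub_stamp"] < check_time]


# Hypothetical stats: two PGs were scrubbed after the check started (t=100).
pg_stats = [
    {"pgid": "1.0", "last_scrub_stamp": 120},
    {"pgid": "1.1", "last_scrub_stamp": 40},
    {"pgid": "1.2", "last_scrub_stamp": 130},
]

# Only the stale PG is asked to scrub again, so slow PGs are not starved
# by repeated requests for PGs that have already finished.
pending = pgs_needing_scrub(pg_stats, check_time=100)
```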

Signed-off-by: Sage Weil <sage@redhat.com>
2020-01-30 10:22:49 -06:00
Sage Weil
be05f210e7 qa/tasks/ceph: simplify mon_health_to_clog suppression during restart
This only does one thing--do that.  More simply.

Signed-off-by: Sage Weil <sage@redhat.com>
2020-01-23 21:14:31 -06:00
Sage Weil
777f239df9 qa/tasks/ceph: set mon_health_to_clog=false via mon config
This actually works better on new versions.

Signed-off-by: Sage Weil <sage@redhat.com>
2020-01-23 17:12:56 -06:00
Thomas Bechtold
bdcc94a1d1 qa: Run flake8 on python2 and python3
To be able to catch problems with python2 *and* python3, run flake8
with both versions. From the flake8 homepage:

It is very important to install Flake8 on the correct version of
Python for your needs. If you want Flake8 to properly parse new
language features in Python 3.5 (for example), you need it to be
installed on 3.5 for Flake8 to understand those features. In many
ways, Flake8 is tied to the version of Python on which it runs.

Also fix the problems with python3 on the way.
Note: This now requires the six module for teuthology. But this is
already an install_require in teuthology itself.

Signed-off-by: Thomas Bechtold <tbechtold@suse.com>
2019-12-13 09:24:20 +01:00
Thomas Bechtold
0127cd1e88 qa: Enable flake8 tox and fix failures
There were a couple of problems found by flake8 in the qa/
directory (most of them fixed now). Enabling flake8 during the usual
check runs hopefully avoids adding new issues in the future.

Signed-off-by: Thomas Bechtold <tbechtold@suse.com>
2019-12-12 10:21:01 +01:00
Sage Weil
f2a1d0afe8 qa/tasks/ceph: replace wait_for_osds_up with manager.wait_for_all_osds_up
Signed-off-by: Sage Weil <sage@redhat.com>
2019-11-21 10:46:54 -06:00
Sage Weil
7c0eacb780 qa/tasks/ceph: healthy: use manager helpers (instead of teuthology/misc ones)
Signed-off-by: Sage Weil <sage@redhat.com>
2019-11-21 10:46:54 -06:00
Kefu Chai
02a2a04ad2 qa/tasks/ceph.py: remove unused variables
Signed-off-by: Kefu Chai <kchai@redhat.com>
2019-10-21 12:37:56 +08:00
Kefu Chai
09863ef3d9 qa/tasks/ceph: tolerate 'T' or ' ' as date and time separator
str.replace() does not change the string in-place, so we need to assign
its return value to `t`.
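
A small sketch of the pitfall, with a made-up timestamp. Python strings are immutable, so `str.replace()` returns a new string and leaves the original untouched.

```python
import time

stamp = "2019-10-21T12:18:58"

# Bug pattern: the result is discarded, so stamp still contains "T".
stamp.replace("T", " ")
unchanged = stamp

# Fix: assign the return value, then parse with a single format string.
t = stamp.replace("T", " ")
parsed = time.strptime(t, "%Y-%m-%d %H:%M:%S")
```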

Signed-off-by: Kefu Chai <kchai@redhat.com>
2019-10-21 12:18:58 +08:00
Kefu Chai
548098668e qa/tasks/ceph_manager: do not panic if "pg_num_target" is missing
"pg_num_target" was not present in "osd dump" back in mimic, so we skip
the check when the field is missing during an upgrade test.
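
One way to tolerate the missing field, sketched under assumptions: `pool` is a hypothetical dict shaped like a pool entry from "osd dump", and `pool_is_settled` is not a function from ceph_manager.

```python
def pool_is_settled(pool):
    """Return True when pg_num has reached its target, if a target exists."""
    target = pool.get("pg_num_target")
    if target is None:
        # Older releases do not report the field; there is nothing to compare.
        return True
    return pool["pg_num"] == target


old_pool = {"pool_name": "rbd", "pg_num": 8}                          # no target field
new_pool = {"pool_name": "rbd", "pg_num": 8, "pg_num_target": 16}     # still splitting
```

Using `dict.get()` avoids the `KeyError` that a direct `pool["pg_num_target"]` lookup would raise against an older cluster.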

Signed-off-by: Kefu Chai <kchai@redhat.com>
2019-10-21 12:18:58 +08:00
Kefu Chai
0a8448e0c7
Merge pull request #30829 from kshtsk/wip-misc-drop-legacy-code-in-skeleton_config
tasks/ceph: drop testdir replacement in skeleton_config

Reviewed-by: Kefu Chai <kchai@redhat.com>
2019-10-18 16:06:23 +08:00
Patrick Donnelly
463d8731d7
Merge PR #30873 into master
* refs/pull/30873/head:
	qa: get rid of iterkeys for py3 compatibility

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Nathan Cutler <ncutler@suse.com>
2019-10-14 13:01:36 -07:00
Kefu Chai
e660ca0880
Merge pull request #30792 from kshtsk/wip-python3-no-iteritems-ceph-task
tasks/ceph: get rid of iteritems for python3

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2019-10-14 22:26:36 +08:00
Kyr Shatskyy
5f95b532aa qa: get rid of iterkeys for py3 compatibility
Fixes: https://tracker.ceph.com/issues/42287

Signed-off-by: Kyr Shatskyy <kyrylo.shatskyy@suse.com>
2019-10-11 18:54:29 +02:00
Kyr Shatskyy
f1a96b55dc tasks/ceph: drop testdir replacement in skeleton_config
The str.format is not used anymore for ceph.conf.template

Signed-off-by: Kyr Shatskyy <kyrylo.shatskyy@suse.com>
2019-10-10 02:51:11 +02:00
Venky Shankar
e646524567 qa: tolerate ECONNRESET errcode during logrotate
Fixes: http://tracker.ceph.com/issues/41800
Signed-off-by: Venky Shankar <vshankar@redhat.com>
2019-10-09 08:46:35 -04:00
Kyr Shatskyy
9b5bd6c715 tasks/ceph: get rid of iteritems for python3
For python3 compatibility use items() instead of iteritems().
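
A short illustration of the replacement, assuming Python 3. The `remotes` mapping is made-up data.

```python
# Made-up mapping of hosts to the roles they run.
remotes = {"host1": ["mon.a", "osd.0"], "host2": ["osd.1"]}

# dict.iteritems() no longer exists on Python 3; items() works on both.
osd_hosts = sorted(
    host for host, roles in remotes.items()
    if any(role.startswith("osd.") for role in roles)
)
```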

Signed-off-by: Kyr Shatskyy <kyrylo.shatskyy@suse.com>
2019-10-08 16:41:32 +02:00
Sage Weil
b078dd002f qa/tasks/ceph: restart: stop osd, mark down, then start
If we stop, start, and then mark down, we may (likely) end up marking
the *new* instance down, which is noisy (generates a cluster warning
message) and inefficient.
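
The ordering problem can be modelled with a toy class. `FakeOsd` and both restart functions are hypothetical and only record which daemon instance a "mark down" would hit; they are not teuthology APIs.

```python
class FakeOsd:
    """Toy stand-in for an OSD daemon that counts its incarnations."""

    def __init__(self):
        self.instance = 1       # incremented on every start
        self.marked_down = []   # instances that received a "mark down"

    def stop(self):
        pass

    def start(self):
        self.instance += 1

    def mark_down(self):
        self.marked_down.append(self.instance)


def restart(osd):
    osd.stop()
    osd.mark_down()   # targets the old instance, which is already stopped
    osd.start()


def restart_old_order(osd):
    osd.stop()
    osd.start()
    osd.mark_down()   # likely hits the freshly started instance


good, bad = FakeOsd(), FakeOsd()
restart(good)
restart_old_order(bad)
```

In the model, the fixed order marks instance 1 down, while the old order marks instance 2, the new daemon.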

Signed-off-by: Sage Weil <sage@redhat.com>
2019-09-05 21:25:27 -05:00
Kefu Chai
b0aec38341
Merge pull request #29090 from liewegas/wip-40792
mon/MonClient: ENXIO when sending command to down mon

Reviewed-by: Kefu Chai <kchai@redhat.com>
2019-08-27 17:34:13 +08:00
Patrick Donnelly
d30af45a54
Merge PR #29715 into master
* refs/pull/29715/head:
	qa: fix broken ceph.restart marking of OSDs down
	qa: add debugging failed osd-release setting

Reviewed-by: Sage Weil <sage@redhat.com>
2019-08-23 10:09:17 -07:00
Patrick Donnelly
231f79030b
qa: stop DaemonWatchdog for each cluster in daemon roles
Fixes: https://tracker.ceph.com/issues/41398
Introduced-by: 08b99eef27
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2019-08-22 09:00:49 -07:00
Patrick Donnelly
73c7d14eab
qa: fix broken ceph.restart marking of OSDs down
Sage noticed `osd down` was not being performed. Bug was that the role
format had changed so splitting no longer worked correctly.
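
The commit message does not show the old or new role format. As an illustration only, the sketch below assumes a legacy "type.id" form and a newer "cluster.type.id" form; the function is a hypothetical parser, not the helper used in the fix.

```python
def split_role(role, default_cluster="ceph"):
    """Split a role string into (cluster, type, id).

    Assumed formats: "osd.0" (legacy) and "ceph.osd.0" (cluster-prefixed).
    """
    parts = role.split(".")
    if len(parts) == 2:
        return (default_cluster, parts[0], parts[1])
    cluster, type_, id_ = role.split(".", 2)
    return (cluster, type_, id_)


# Naive splitting breaks once a cluster prefix appears:
naive_type = "ceph.osd.0".split(".")[0]   # "ceph", not "osd"
```

A check such as `naive_type == "osd"` then silently fails, which matches the symptom of `osd down` never being issued.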

Fixes: https://tracker.ceph.com/issues/40773
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2019-08-22 08:55:52 -07:00
Sage Weil
662dbf9c6c qa/tasks/ceph: retry several times to tell mons to stop logging health
If we have any sort of failure injection, one attempt is not enough.
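
A generic retry loop of the kind described, as a sketch. The names `retry` and `flaky_tell_mon`, the attempt count, and the delay are assumptions, not values from the commit.

```python
import time


def retry(action, attempts=5, delay=0.0):
    """Call action() until it succeeds or attempts are exhausted."""
    last_error = None
    for _ in range(attempts):
        try:
            return action()
        except Exception as exc:   # broad on purpose for this sketch
            last_error = exc
            time.sleep(delay)
    raise last_error


calls = {"count": 0}


def flaky_tell_mon():
    """Simulated command that fails twice, as under failure injection."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("injected failure")
    return "ok"


result = retry(flaky_tell_mon)
```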

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-21 15:12:48 -07:00
Jos Collin
5a296278f6
qa/tasks: Fix typo
Signed-off-by: Jos Collin <jcollin@redhat.com>
2019-08-06 06:36:45 +05:30
Jos Collin
08b99eef27
qa/tasks: start DaemonWatchdog when ceph starts
* Start DaemonWatchdog when ceph starts
* Drop the DaemonWatchdog starting in mds_thrash.py
* Bring the thrashers in mds_thrash.py into the context

Fixes: http://tracker.ceph.com/issues/10369
Signed-off-by: Jos Collin <jcollin@redhat.com>
2019-08-06 06:36:33 +05:30
Kefu Chai
90022b35ab
Merge pull request #17619 from liuchang0812/wip-ec-below-min-size
osd: allow EC PGs to do recovery below min_size

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Greg Farnum <gfarnum@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2019-06-22 12:58:55 +08:00
Sage Weil
b747ae1711 qa/tasks/ceph: tolerate 'T' or ' ' as date and time separator
Signed-off-by: Sage Weil <sage@redhat.com>
2019-05-29 14:12:15 -05:00
Greg Farnum
0ee63a0450 qa: extend get_pool_property() to allow non-int values
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
2019-05-10 10:45:25 +08:00
Patrick Donnelly
8cbdad9f9b
qa: update testing for standby-replay
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2019-02-27 21:39:12 -08:00
Casey Bodley
0aebb55af5 qa/mon: fix cluster support for monmap bootstrap
-filter out mons from other clusters
-fix parsing of mon name from role

Fixes: http://tracker.ceph.com/issues/38115

Signed-off-by: Casey Bodley <cbodley@redhat.com>
2019-01-30 12:24:36 -05:00
Sage Weil
236c8a4528 qa/tasks/ceph.py: bracket addrvecs in mon_host etc
Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-16 08:33:03 -06:00
Sage Weil
54c5202b74 qa/tasks/ceph: stop any split/merge activity before scrubbing
If there are leftover merges at the end of the run they can take a long
time to get through, blowing our timeout for (waiting for pgs to become
active and to stop splitting/merging) and scrubbing pgs.  Stop all of that
at the end of the run so that we don't have to wait so long.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-14 06:51:21 -06:00
Sage Weil
a19b8e5b14 qa/tasks/ceph: set initial monmap features with using addrvec addrs
The --add option will only infer a bare IP to include a v2 addr if the
NAUTILUS feature is there, and that isn't normally present on a freshly
generated monmap.  Add it if we are doing addrvecs!

Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
241d402d7c qa/tasks/ceph: only use monmaptool --addv if addr has [,:v]
Otherwise, we want the --add path, which has the logic to infer ports,
v2+v1, etc.
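
The selection rule can be sketched as follows, reading "[,:v]" as "the address contains a comma, a colon, or the letter v". The function name and the sample addresses are hypothetical; `--add` and `--addv` are the monmaptool options named in the commit.

```python
def monmap_add_args(name, addr):
    """Choose the monmaptool option for one monitor address."""
    if any(ch in addr for ch in ",:v"):
        # Explicit port, protocol prefix, or address vector: pass it verbatim.
        return ["--addv", name, addr]
    # Bare IP: let monmaptool infer ports and the v2+v1 pair.
    return ["--add", name, addr]


bare = monmap_add_args("a", "10.0.0.1")
vec = monmap_add_args("b", "[v2:10.0.0.2:3300,v1:10.0.0.2:6789]")
```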

Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
545df766be qa/tasks/ceph: keep mon addrs in ctx namespace
Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
1ab352dd31 qa/tasks/ceph.py: move methods from teuthology.git into ceph.py directly; support mon bind * options
Having these live in teuthology.git is silly, since they are only consumed
by the ceph task, and it is hard to revise the behavior.

Revise the behavior by adding mon_bind_* options.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
de9b77cd38 qa/tasks/ceph: add 'mon_bind_addrvec' and 'mon_bind_msgr2' options
- Sometimes we don't want to use v2 addrs
- Sometimes we don't want addrvecs at all (e.g., upgrades)

Signed-off-by: Sage Weil <sage@redhat.com>
2018-12-21 15:31:32 -06:00
Sage Weil
6f06e394e4 qa/tasks/ceph: wait for splits/merges before final scrub
Signed-off-by: Sage Weil <sage@redhat.com>
2018-12-19 14:37:01 -06:00
Sage Weil
c9d275a0e2 Merge PR #21444 into master
* refs/pull/21444/head:
	qa: Replace 'ceph' with cluster name in restart()

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2018-11-25 21:20:44 -06:00
Sage Weil
2ec2c5ef9e Merge PR #24795 into master
* refs/pull/24795/head:
	qa/tasks/ceph: gather crash dumps
2018-11-20 17:34:54 -06:00