153 Commits

Author SHA1 Message Date
Sridhar Seshasayee
b5570238b6 qa/tasks: Add wait_for_clean() check prior to initiating scrubbing.
Fixes: https://tracker.ceph.com/issues/49983
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
2021-03-25 22:01:19 +05:30
Patrick Donnelly
ec1b82fd24
qa: skip exit-on-first-failure option for valgrind on ubuntu
The valgrind version is too old.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-03-03 09:30:21 -08:00
Patrick Donnelly
911b9a55bb
qa: wait for MDS to join fsmap
When running under valgrind, MDS may be slow to be added to the FSMap
(especially if mons are in valgrind too). The file system creation that
follows will throw unnecessary warnings about insufficient standbys if
no MDS is available.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-03-03 09:30:21 -08:00
Patrick Donnelly
3681e3a1a8
qa: move get_valgrind_args to qa
This method is unused in the teuthology repo. The helper method better
belongs here where it is more easily modified.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-03-03 09:30:08 -08:00
Sage Weil
4fe55af819 qa/tasks/ceph: set ctx.ceph[cluster_name].fsid
Signed-off-by: Sage Weil <sage@newdream.net>
2021-02-17 12:17:04 -05:00
Sage Weil
59576d17f7
Merge pull request #38817 from ideepika/fix-interactive-error
qa/tasks/ceph: do not update info.yaml if ctx.archive is not set
2021-01-20 15:21:15 -06:00
Patrick Donnelly
abe7c86337
qa: remove ceph file systems on completion
So that we can avoid MDS replacement warnings.

Fixes: https://tracker.ceph.com/issues/48757
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-01-08 10:44:53 -08:00
Deepika Upadhyay
c5b1d0ac46 qa/tasks/ceph: do not update info.yaml if ctx.archive is not set
Signed-off-by: Deepika Upadhyay <dupadhya@redhat.com>
2021-01-08 22:57:55 +05:30
Deepika Upadhyay
4da3b23b89 qa/tasks/ceph: rename s/update_archive_setting/update_info_yaml
Signed-off-by: Deepika Upadhyay <dupadhya@redhat.com>
2020-12-03 22:08:52 +05:30
Patrick Donnelly
9b8fc10031
Merge PR #37708 into master
* refs/pull/37708/head:
	qa/suites/fs: enable thrashing in multifs environment
	qa/workunits/fs/snaps: allow tests to be run
	qa/tasks/{kclient,ceph_fuse}: allow mounting
	qa/tasks: allow per file system config setting

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2020-12-02 12:07:17 -08:00
Ilya Dryomov
0e4bc27722 qa/tasks/ceph: resurrect log compression
Commit 9536625558 ("qa/tasks/ceph: use Cluster.sh() and Remote.sh()
when appropriate") dropped run.wait(), which waits for all given
processes to exit.  This resulted in errors like

  INFO:teuthology.orchestra.run.smithi107.stderr:tar: ./objectstore_tool..log: file changed as we read it
  INFO:teuthology.orchestra.run.smithi107.stderr:tar: ./ceph-client.admin.175125.log: File removed before we read it

as the task moved on to archiving semi-corrupted and uncompressed logs,
filling up the lab cluster.

Revert that hunk, as Cluster.sh() is useless here -- we don't need
stdout or stderr, but very much need parallel execution and wait for
the compression to finish.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-11-24 00:51:54 +01:00
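
The behaviour this commit restores (start every compression job, then block until all of them have exited before archiving) can be sketched with the standard library. This is only an illustration under stated assumptions: `run_parallel_and_wait` is a hypothetical name, and `subprocess` stands in for teuthology's remote execution and `run.wait()`, which are not reproduced here.

```python
import subprocess
import sys


def run_parallel_and_wait(commands):
    """Start every command first, then wait for all of them to exit.

    Hypothetical helper: it mimics the "parallel execution, then wait"
    pattern described in the commit message, using local processes.
    """
    # Launching all processes before waiting lets them run concurrently.
    procs = [subprocess.Popen(cmd) for cmd in commands]
    # Only return once every process has finished, collecting exit codes.
    return [proc.wait() for proc in procs]


# Stand-in jobs: three trivial Python processes instead of real compressors.
codes = run_parallel_and_wait([[sys.executable, "-c", "pass"] for _ in range(3)])
```

Archiving would start only after the call returns, so no file is read while it is still being compressed.
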
Ramana Raja
7016a2001d qa/tasks: allow per file system config setting
Signed-off-by: Ramana Raja <rraja@redhat.com>
2020-11-20 13:23:21 +05:30
Kefu Chai
9536625558 qa/tasks/ceph: use Cluster.sh() and Remote.sh() when appropriate
for better readability

Signed-off-by: Kefu Chai <kchai@redhat.com>
2020-11-14 15:22:10 +08:00
Kefu Chai
de71f6b0a3 qa/tasks/ceph: update_archive_setting() only if ctx.archive is valid
When running teuthology interactively, ctx.archive might not be set.
If it's not set, there is no point trying to access files there.

Fixes: https://tracker.ceph.com/issues/48058

Signed-off-by: Marcus Watts <mwatts@redhat.com>
Signed-off-by: Kefu Chai <kchai@redhat.com>
2020-11-14 15:22:00 +08:00
Kefu Chai
43f2738a0e qa/tasks/ceph: extract update_archive_setting()
for better readability

also update the comment in `ceph_crash()` to reflect the changed
settings

Signed-off-by: Kefu Chai <kchai@redhat.com>
2020-11-03 12:01:14 +08:00
Patrick Donnelly
c569036c5d
Merge PR #37629 into master
* refs/pull/37629/head:
	qa/cephfs: add session_timeout option support
	qa/cephfs: move the cephfs's operations setting to create()
	qa/cephfs: add 'cephfs:' section support

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2020-10-25 16:26:36 -07:00
Xiubo Li
0422673b61 qa/cephfs: add session_timeout option support
When the MDS revokes the Fwbl caps, the clients need to flush
the dirty data back to the OSDs, but the flush may overload and
slow down the OSDs, so it may take more than 60 seconds to
finish. The MDS daemons will then report WRN messages.

For the teuthology test cases, let's just increase the timeout
value to make it work.

Fixes: https://tracker.ceph.com/issues/47565
Signed-off-by: Xiubo Li <xiubli@redhat.com>
2020-10-23 14:27:37 +08:00
Xiubo Li
cb8081ce7f qa/cephfs: move the cephfs's operations setting to create()
Fixes: https://tracker.ceph.com/issues/47565
Signed-off-by: Xiubo Li <xiubli@redhat.com>
2020-10-23 14:27:37 +08:00
Xiubo Li
3b5303482f qa/cephfs: add 'cephfs:' section support
Fixes: https://tracker.ceph.com/issues/47565
Signed-off-by: Xiubo Li <xiubli@redhat.com>
2020-10-23 14:27:30 +08:00
Deepika Upadhyay
7ef18559cb qa: drop hammer branch qa tests
Fixes: https://tracker.ceph.com/issues/47731
Signed-off-by: Deepika Upadhyay <dupadhya@redhat.com>
2020-10-15 17:32:06 +05:30
Patrick Donnelly
eb95dabc3f
qa: fix proc exit status check
Fixes: f30a84b6a7
Fixes: https://tracker.ceph.com/issues/47677
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2020-09-28 11:03:21 -07:00
Patrick Donnelly
f30a84b6a7
qa: ignore logrotate state rename error
This is hopefully a transient issue that can be ignored.

Fixes: https://tracker.ceph.com/issues/42433
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2020-09-20 14:05:13 -07:00
Patrick Donnelly
269667b3a2
Merge PR #37218 into master
* refs/pull/37218/head:
	qa: spawn MDS daemons before creating fs

Reviewed-by: Kefu Chai <kchai@redhat.com>
2020-09-18 16:25:59 -07:00
Patrick Donnelly
61db7a9c2e
qa: spawn MDS daemons before creating fs
This avoids unnecessary MDS_ALL_DOWN messages because the MDS daemons
have not yet been spawned.

Fixes: https://tracker.ceph.com/issues/47518
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2020-09-17 09:26:38 -07:00
Greg Farnum
9506d09e3b Merge remote-tracking branch 'origin/master' into wip-stretch-mode
Conflicts:
	src/include/ceph_features.h

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
2020-09-15 02:25:07 +00:00
Greg Farnum
d02625331c Merge remote-tracking branch 'origin/master' into wip-stretch-mode
2020-09-14 02:32:19 +00:00
Kyr Shatskyy
cea546f3b0 qa/tasks/ceph: use remote.read_file instead of misc.get_file
Signed-off-by: Kyr Shatskyy <kyrylo.shatskyy@suse.com>
2020-09-04 00:02:15 +02:00
Kyr Shatskyy
b4a0ef0ed0 qa/tasks/ceph: use remote.write_file
Signed-off-by: Kyr Shatskyy <kyrylo.shatskyy@suse.com>
2020-09-04 00:02:15 +02:00
Kefu Chai
786b87cdb2
Merge pull request #36718 from ShraddhaAg/add-dir-to-archive-in-logs
qa/tasks/ceph.py: add ceph logs directory in job's info.yaml

Reviewed-By: Josh Durgin <jdurgin@redhat.com>
2020-08-27 22:26:01 +08:00
Sage Weil
2ee9365d0b qa: log-whitelist -> log-ignorelist
Signed-off-by: Sage Weil <sage@newdream.net>
2020-08-24 19:53:08 +00:00
Kefu Chai
e6eabeeeb2 qa/tasks/ceph: create a log file before redirecting to it
as it is the shell that interprets ">>" and redirects stderr to the given
file, but the shell process is launched by ubuntu:ubuntu without using
sudo, so the command fails with "Permission denied". to address
this issue, in this change, a file with proper privileges is created
beforehand using `install`, so the shell is able to write to it.

also, instead of creating this file in `maybe_redirect_stderr()`, it
returns the command to create the log file.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2020-08-22 11:33:12 +08:00
Shraddha Agrawal
e991f04b59 qa/tasks/ceph.py: add ceph logs directory in job's info.yaml
This commit adds the file paths of the ceph log directories to the job's
info.yaml log file. The motivation is that, if a job times out, the
logs can still be transferred to the teuthology host before the test
machines are nuked, using the ceph log directory paths recorded in the
job's info.yaml log file.

Signed-off-by: Shraddha Agrawal <shraddha.agrawal000@gmail.com>
2020-08-19 17:34:26 +05:30
Kefu Chai
e0620eefbd qa/tasks/ceph: redirect stderr for crimson flavor
we should redirect stderr for crimson instead of for the default flavor.
this change addresses a regression introduced by
da76f46461

Signed-off-by: Kefu Chai <kchai@redhat.com>
2020-08-07 01:11:57 +08:00
Kefu Chai
47d2822969
Merge pull request #36337 from tchaikov/wip-qa-ceph-waiting
qa/tasks/ceph: do not print out empty list of pg

Reviewed-by: Neha Ojha <nojha@redhat.com>
2020-07-31 20:32:13 +08:00
Kefu Chai
da76f46461 qa/tasks/ceph: redirect stderr to log file
crimson writes its log to stderr, let's redirect it to a log file for a
more peaceful teuthology.log.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2020-07-31 00:51:30 +08:00
Kefu Chai
735aa04105 qa/tasks/ceph: do not print out empty list of pg
we could have the following log message:

tasks.ceph:Waiting for all PGs to be active+clean and split+merged, waiting on ['2.6', '2.5', '1.0', '2.4'] to go clean and/or [] to split/merge

if the cluster has non-active+clean pgs when the "ceph" task is about to
end. but this message is a little bit confusing in the sense that it
lists "[]" in it.

in this change, only the PGs being waited on are listed. also, added some
cleanups:

* use "else" to check if the loop is terminated by a break
* remove "0" from the range() call

Signed-off-by: Kefu Chai <kchai@redhat.com>
2020-07-29 15:07:26 +08:00
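
The two cleanups listed in this commit are ordinary Python idioms. A minimal sketch, assuming a hypothetical `wait_until_clean` helper and a hypothetical `poll` callable that returns the PGs still being waited on (neither name comes from qa/tasks/ceph.py):

```python
def wait_until_clean(poll, attempts):
    """Return True once poll() reports nothing left to wait on.

    Hypothetical helper for illustration only.
    """
    for _ in range(attempts):   # range(n) starts at 0; no explicit "0" needed
        waiting_on = poll()
        if not waiting_on:
            break               # success: the else clause below is skipped
    else:
        # Reached only when the loop ran out of attempts without a break.
        return False
    return True


# Example: the first poll still lists one PG, the second lists none.
states = iter([["2.6"], []])
became_clean = wait_until_clean(lambda: next(states), attempts=5)
never_clean = wait_until_clean(lambda: ["1.0"], attempts=3)
```

The `else` clause replaces a separate flag variable that would otherwise record whether the loop ended through `break`.
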
Greg Farnum
7270bc4e47 qa: swap BytesIO for StringIO in ceph.py create_simple_monmap
We removed StringIO for py3 compatibility going forward and this collided.

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
2020-07-20 07:08:50 +00:00
Greg Farnum
cea78c7c67 qa: add new netsplit task
This is super basic right now and only works for monitor daemons
as it has to parse out their IPs from cluster information, then
turn that into the Host objects. We can extend it in future.

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
2020-07-08 04:26:03 +00:00
Greg Farnum
3bc1ae1eeb qa: use the config when building a monmap
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
2020-07-08 04:26:03 +00:00
Kefu Chai
a7f18e46b9 qa/tasks/{ceph,ceph_manager}: drop py2 support
Signed-off-by: Kefu Chai <kchai@redhat.com>
2020-07-05 10:58:28 +08:00
Kefu Chai
2fa726b88c qa/tasks: flake8 fixes
Signed-off-by: Kefu Chai <kchai@redhat.com>
2020-06-23 23:00:56 +08:00
Nathan Cutler
bc76b39a30 qa/tasks/ceph.py: fail test if osd devices not found
Fixes: https://tracker.ceph.com/issues/42357
Signed-off-by: Nathan Cutler <ncutler@suse.com>
2020-03-25 09:35:01 +01:00
Nathan Cutler
9abebf28a8 qa/tasks/ceph.py: use .format to log dicts
The ".format" builtin logs dicts nicely right out of the box.

Also, some of the log messages were too cryptic - fixed them in this commit as
well.

Signed-off-by: Nathan Cutler <ncutler@suse.com>
2020-03-25 09:33:44 +01:00
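
As a small illustration of the point about `.format`, the snippet below formats a dict into a log message. The dict contents and the message text are made up for the example and are not taken from the commit.

```python
import logging

log = logging.getLogger(__name__)

# Hypothetical data; the real code logs its own role and device mappings.
devs_by_role = {"osd.0": "/dev/vdb"}

# str.format() renders the dict with its normal repr, keys and values included.
message = "devices by role: {}".format(devs_by_role)
log.info(message)
```
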
Nathan Cutler
ad477be286 qa/tasks/ceph.py: drop roles_to_journals and remote_to_roles_to_journals
These do not seem to get any use anymore.

Signed-off-by: Nathan Cutler <ncutler@suse.com>
2020-03-25 09:33:44 +01:00
Nathan Cutler
1393317129 qa/tasks/ceph.py: drop block_journal, tmpfs_journal
I looked, but did not find any tests that actually use these options.

Signed-off-by: Nathan Cutler <ncutler@suse.com>
2020-03-25 09:33:44 +01:00
Nathan Cutler
51c714d9b2 qa/tasks/ceph.py: cleanup: stop calling get_wwn_id_map()
Nowadays, get_wwn_id_map is essentially a noop - it does:

    return dict((d, d) for d in devs)

This reverts another bit of 8f720454cb from 2013.

References: https://tracker.ceph.com/issues/42313
Signed-off-by: Nathan Cutler <ncutler@suse.com>
2020-03-25 09:33:44 +01:00
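
To see why the call could be dropped, consider what the quoted expression produces. In this sketch `identity_map` is a hypothetical stand-in that contains only the line quoted in the commit message, and the device paths are invented.

```python
def identity_map(devs):
    # Same expression as quoted in the commit message.
    return dict((d, d) for d in devs)


devs = ["/dev/vdb", "/dev/vdc"]   # made-up device paths
mapping = identity_map(devs)

# Looking a device up in the map returns the device itself,
# so callers can use the original list directly.
resolved = [mapping[d] for d in devs]
```
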
Kefu Chai
da736c22c5 qa/tasks/ceph.py: quote "<kind>" in command line
otherwise bash will interpret "kind" as a file when handling a command like
```
sudo zgrep <kind> /var/log/ceph/valgrind/* /dev/null | sort | uniq
```
and try to feed its content to zgrep, and write the output of zgrep
to /var/log/ceph/valgrind/*. this is not the intended behavior. what we
want to do is to pass "<kind>" as an argument to zgrep, along with
the globbed file names which match "/var/log/ceph/valgrind/*".

in this change, "<kind>" is quoted as in the command line. it's also
what `pipes.quote()` does before the change of
35cf5131e7152ce20d916aa99c124751d6a97f5c.

this addresses the regression introduced by
35cf5131e7152ce20d916aa99c124751d6a97f5c.

Fixes: https://tracker.ceph.com/issues/44454
Signed-off-by: Kefu Chai <kchai@redhat.com>
2020-03-06 12:17:42 +08:00
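
The effect of quoting can be shown with `shlex.quote()`, used here as the Python 3 counterpart of the `pipes.quote()` call mentioned above. This is a sketch of the idea, not the code from the commit; the command strings are only built, never executed.

```python
import shlex

kind = "<kind>"   # placeholder pattern, as in the commit message
template = "sudo zgrep {} /var/log/ceph/valgrind/* /dev/null | sort | uniq"

# Unquoted: the shell would treat "<" and ">" as redirection operators.
unquoted_cmd = template.format(kind)

# Quoted: the pattern reaches zgrep as a single literal argument.
quoted_cmd = template.format(shlex.quote(kind))
```

Only the pattern is quoted; the glob is left unquoted on purpose so that the shell still expands it to the matching file names.
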
Kyr Shatskyy
fc5662957b qa/tasks/ceph: py3 compatibility
Addresses:
  TypeError: 'dict_values' object is not subscriptable

Signed-off-by: Kyr Shatskyy <kyrylo.shatskyy@suse.com>
2020-03-04 13:09:16 +08:00
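
The error addressed here is easy to reproduce: in Python 3, `dict.values()` returns a view rather than a list, so it cannot be indexed. A minimal sketch with made-up data (the role names and hosts are illustrative only):

```python
roles = {"mon.a": "host1", "osd.0": "host2"}   # hypothetical mapping

error = None
try:
    roles.values()[0]              # worked on Python 2, fails on Python 3
except TypeError as exc:
    error = str(exc)

# Converting the view to a list first restores indexing.
first = list(roles.values())[0]
```
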
Kyr Shatskyy
e46eb8348e qa/tasks: fix imports for py3 compatibility
Signed-off-by: Kyr Shatskyy <kyrylo.shatskyy@suse.com>
2020-03-04 13:09:16 +08:00
Kyr Shatskyy
35cf5131e7 qa/tasks/ceph: get rid of cStringIO for py3 compat
Use io.BytesIO instead of cStringIO.StringIO
Use six.ensure_str whenever it needs to convert binary to str.

Signed-off-by: Kyr Shatskyy <kyrylo.shatskyy@suse.com>
2020-03-04 13:09:16 +08:00
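
A rough sketch of the pattern this commit describes: collect output in an `io.BytesIO` buffer and convert to text only where a string is needed. The buffer contents are invented for the example, and `bytes.decode()` is used as a standard-library stand-in for `six.ensure_str`, so that the snippet has no third-party dependency.

```python
import io

# Stand-in for bytes captured from a remote command's stdout.
buf = io.BytesIO()
buf.write(b"fsid = 1234\n")

raw = buf.getvalue()          # bytes on Python 3
text = raw.decode("utf-8")    # explicit conversion where a str is required

key, _, value = text.strip().partition(" = ")
```
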