Commit Graph

170 Commits

Author SHA1 Message Date
Ronen Friedman
dda89e77ae tests/scripts: use 'tell pg deep-scrub pgid' instead of 'tell pgid deep-scrub'

as older OSD versions do not support the latter.

Fixes: https://tracker.ceph.com/issues/64972

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
2024-04-07 10:45:00 -05:00
Laura Flores
25c16d7883 qa/tasks: fix syntax for deep-scrub command
Fixes: https://tracker.ceph.com/issues/63967
Signed-off-by: Laura Flores <lflores@ibm.com>
2024-01-09 10:54:01 -06:00
Samuel Just
77fe84c095 qa/tasks/ceph: use tell <pgid> deep_scrub in osd_scrub_pgs
This is the more modern variant.  Crimson doesn't currently
support the pg <pgid> deep_scrub variant, so let's just use
this one generally.

Signed-off-by: Samuel Just <sjust@redhat.com>
2023-12-11 04:10:17 +00:00
Vallari Agrawal
397e85a5f8
qa: rewrite "valgrind_post" to use ValgrindScanner
Previously, valgrind_post() used grep to find errors
in valgrind logs.
Now it uses ValgrindScanner to log better exceptions with
traceback and exception kind, and creates a more detailed
summary in valgrind.yaml in the archive.

Signed-off-by: Vallari Agrawal <val.agl002@gmail.com>
2023-10-26 14:08:46 +05:30
Yuri Weinstein
ecebe2f4b2
Merge pull request #50616 from batrick/i59120
qa: use parallel gzip for compressing logs

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
2023-05-20 10:04:13 -04:00
Patrick Donnelly
06c90a6c48
qa: check each fs for health
Fixes: https://tracker.ceph.com/issues/59425
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2023-04-11 10:16:53 -04:00
Patrick Donnelly
6739e11563
qa: time log compression
For debugging and ad-hoc analytics.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2023-03-27 14:55:16 -04:00
Patrick Donnelly
0a03a47103
qa/tasks: give verbose gzip output
For future analysis.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2023-03-27 14:55:16 -04:00
Patrick Donnelly
3c76cc3c51
qa/tasks: use medium compression
To speed up compression.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2023-03-27 14:55:16 -04:00
Patrick Donnelly
23a29d4abe
qa/ceph: parallelize gzip
Our machines have lots of cores, use them!

Fixes: https://tracker.ceph.com/issues/59120
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2023-03-27 14:55:16 -04:00
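Illustration for the log compression series above (23a29d4abe, 3c76cc3c51, 6739e11563): a minimal sketch of compressing logs in parallel at a medium level while timing the run. This is not the code from qa/tasks/ceph.py, which may simply shell out to gzip; the function names, the worker count, and the choice of level 5 as "medium" are assumptions made for illustration.

```python
import gzip
import shutil
import time
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def compress_log(path: Path, level: int = 5) -> Path:
    """Gzip one file at a medium compression level and remove the original."""
    target = path.with_name(path.name + ".gz")
    with open(path, "rb") as src, gzip.open(target, "wb", compresslevel=level) as dst:
        shutil.copyfileobj(src, dst)
    path.unlink()
    return target


def compress_logs(log_dir: Path, workers: int = 4) -> float:
    """Compress every *.log file under log_dir in parallel; return elapsed seconds."""
    logs = sorted(log_dir.rglob("*.log"))
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() drains the iterator, so all workers finish (and any
        # worker exception is re-raised) before the timer stops.
        list(pool.map(compress_log, logs))
    return time.monotonic() - start
```

Returning the elapsed time keeps the "time log compression" idea separate from the compression itself, so a caller can log it or discard it.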
Samuel Just
acedd169e4
Merge pull request #48516 from athanatos/sjust/wip-57801
crimson,mon: add guards to avoid accidental crimson deployment and to avoid usage of unsupported features with crimson

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Reviewed-by: Greg Farnum <gfarnum@redhat.com>
2023-02-23 18:39:27 -08:00
Samuel Just
0449a4093e qa/task/ceph.py: set-allow-crimson for crimson clusters
Signed-off-by: Samuel Just <sjust@redhat.com>
2023-02-13 22:36:35 -08:00
Nitzan Mordechai
d8202bc67c rados: upgrade suite test filestore removal
When upgrading an OSD with filestore to reef, restart should not be possible:
the OSD won't boot and an error message will be shown in the OSD log.

Signed-off-by: Nitzan Mordechai <nmordec@redhat.com>
2023-02-12 06:11:29 +00:00
Milind Changire
bf83eaa4e7 qa: enhancement for subvol creation and mounting
Fixes: https://tracker.ceph.com/issues/54317
Signed-off-by: Milind Changire <mchangir@redhat.com>
2022-04-07 14:15:56 +05:30
Patrick Donnelly
7812cfb674
qa: move CephManager cluster instantiation to subtask
This needs to be available for the cephfs_setup task so administration
mounts can run ceph commands, potentially through `cephadm shell`.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-10-05 13:32:15 -04:00
Sridhar Seshasayee
4b0dba28b6 qa/tasks: Set default caps for 'osd' type in generate_caps()
Assign the default caps for osds to be the same as what the AuthMonitor
sets for a new osd. See AuthMonitor::validate_osd_new() which sets the
following caps for a new osd:

 mon='allow profile osd'
 mgr='allow profile osd'
 osd='allow *'

When a real-world cluster is deployed, the above caps are applied.
Unless the user modifies the defaults, a cluster will operate with the
above caps. Therefore, it makes sense to use the defaults when testing
Ceph so that any issues caused by the default settings can be caught and
fixed.

Therefore, the caps for the 'osd' type are reset to the defaults in
generate_caps(). The caps for 'mgr' already reflect the system defaults.
The caps for the 'mds' type are not changed in this commit and will be
investigated and changed later if necessary.

Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
2021-09-01 13:46:01 +05:30
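Illustration for 4b0dba28b6: a minimal sketch of how per-type default caps could be expanded into `--cap` arguments for a keyring tool. Only the three OSD cap values come from the commit message above; the `DEFAULT_CAPS` table and the function signature are assumptions for illustration, not the actual code in qa/tasks/ceph.py.

```python
# Hypothetical table; only the 'osd' values are taken from the commit message.
DEFAULT_CAPS = {
    "osd": {
        "mon": "allow profile osd",
        "mgr": "allow profile osd",
        "osd": "allow *",
    },
}


def generate_caps(type_):
    """Yield '--cap', subsystem, capability triples for a daemon type."""
    for subsystem, capability in DEFAULT_CAPS.get(type_, {}).items():
        yield "--cap"
        yield subsystem
        yield capability


if __name__ == "__main__":
    print(list(generate_caps("osd")))
```

An unknown daemon type yields nothing in this sketch, which leaves the caller free to decide whether that should be an error.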
Xiubo Li
361ee535dd qa: multifs already enabled as default
Since pacific, multifs is already marked as enabled by default.

Signed-off-by: Xiubo Li <xiubli@redhat.com>
2021-07-28 13:56:10 +08:00
Sridhar Seshasayee
b5570238b6 qa/tasks: Add wait_for_clean() check prior to initiating scrubbing.
Fixes: https://tracker.ceph.com/issues/49983
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
2021-03-25 22:01:19 +05:30
Patrick Donnelly
ec1b82fd24
qa: skip exit-on-first-failure option for valgrind on ubuntu
The valgrind version is too old.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-03-03 09:30:21 -08:00
Patrick Donnelly
911b9a55bb
qa: wait for MDS to join fsmap
When running under valgrind, MDS may be slow to be added to the FSMap
(especially if mons are in valgrind too). The file system creation that
follows will throw unnecessary warnings about insufficient standbys if
no MDS is available.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-03-03 09:30:21 -08:00
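Illustration for 911b9a55bb: waiting for daemons to appear before creating a file system is a poll-with-timeout pattern. The helper below is a generic sketch, not the code from the commit; `wait_until` and its parameters are hypothetical names, and the predicate stands in for whatever check reports that the MDS has joined the FSMap.

```python
import time


def wait_until(predicate, timeout=60.0, interval=1.0):
    """Poll predicate until it returns True or timeout seconds elapse.

    Returns True on success and False if the deadline passes first.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Under valgrind a generous timeout is the point: the predicate is checked at least once, and the caller decides whether a False result should fail the task or only log a warning.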
Patrick Donnelly
3681e3a1a8
qa: move get_valgrind_args to qa
This method is unused in the teuthology repo. The helper method better
belongs here where it is more easily modified.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-03-03 09:30:08 -08:00
Sage Weil
4fe55af819 qa/tasks/ceph: set ctx.ceph[cluster_name].fsid
Signed-off-by: Sage Weil <sage@newdream.net>
2021-02-17 12:17:04 -05:00
Sage Weil
59576d17f7
Merge pull request #38817 from ideepika/fix-interactive-error
qa/tasks/ceph: do not update info.yaml if ctx.archive is not set
2021-01-20 15:21:15 -06:00
Patrick Donnelly
abe7c86337
qa: remove ceph file systems on completion
So that we can avoid MDS replacement warnings.

Fixes: https://tracker.ceph.com/issues/48757
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-01-08 10:44:53 -08:00
Deepika Upadhyay
c5b1d0ac46 qa/tasks/ceph: do not update info.yaml if ctx.archive is not set
Signed-off-by: Deepika Upadhyay <dupadhya@redhat.com>
2021-01-08 22:57:55 +05:30
Deepika Upadhyay
4da3b23b89 qa/tasks/ceph: rename s/update_archive_setting/update_info_yaml
Signed-off-by: Deepika Upadhyay <dupadhya@redhat.com>
2020-12-03 22:08:52 +05:30
Patrick Donnelly
9b8fc10031
Merge PR #37708 into master
* refs/pull/37708/head:
	qa/suites/fs: enable thrashing in multifs environment
	qa/workunits/fs/snaps: allow tests to be run
	qa/tasks/{kclient,ceph_fuse}: allow mounting
	qa/tasks: allow per file system config setting

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2020-12-02 12:07:17 -08:00
Ilya Dryomov
0e4bc27722 qa/tasks/ceph: resurrect log compression
Commit 9536625558 ("qa/tasks/ceph: use Cluster.sh() and Remote.sh()
when appropriate") dropped run.wait(), which waits for all given
processes to exit.  This resulted in errors like

  INFO:teuthology.orchestra.run.smithi107.stderr:tar: ./objectstore_tool..log: file changed as we read it
  INFO:teuthology.orchestra.run.smithi107.stderr:tar: ./ceph-client.admin.175125.log: File removed before we read it

as the task moved on to archiving semi-corrupted and uncompressed logs,
filling up the lab cluster.

Revert that hunk, as Cluster.sh() is useless here -- we don't need
stdout or stderr, but very much need parallel execution and wait for
the compression to finish.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-11-24 00:51:54 +01:00
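Illustration for 0e4bc27722: the fix depends on starting all compression processes first and only then waiting for every one of them to exit. The sketch below shows that pattern with the standard library's subprocess module rather than teuthology's run.wait(); `run_all_and_wait` is a hypothetical name and the commands are stand-ins.

```python
import subprocess
import sys


def run_all_and_wait(commands):
    """Start every command without blocking, then wait for all to exit."""
    procs = [subprocess.Popen(cmd) for cmd in commands]
    # Each wait() blocks until that process has exited, so the function
    # returns only after every process is done and a later archiving
    # step never sees half-written output.
    return [proc.wait() for proc in procs]


if __name__ == "__main__":
    # Stand-in workloads; a real task would launch compression per host.
    cmds = [[sys.executable, "-c", "import time; time.sleep(0.1)"] for _ in range(3)]
    print(run_all_and_wait(cmds))
```

Dropping the final wait is the failure mode the commit describes: the processes still run in parallel, but the caller moves on while files are changing underneath it.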
Ramana Raja
7016a2001d qa/tasks: allow per file system config setting
Signed-off-by: Ramana Raja <rraja@redhat.com>
2020-11-20 13:23:21 +05:30
Kefu Chai
9536625558 qa/tasks/ceph: use Cluster.sh() and Remote.sh() when appropriate
for better readability

Signed-off-by: Kefu Chai <kchai@redhat.com>
2020-11-14 15:22:10 +08:00
Kefu Chai
de71f6b0a3 qa/tasks/ceph: update_archive_setting() only if ctx.archive is valid
When running teuthology interactively, ctx.archive might not be set.
If it's not set, there is no point trying to access files there.

Fixes: https://tracker.ceph.com/issues/48058

Signed-off-by: Marcus Watts <mwatts@redhat.com>
Signed-off-by: Kefu Chai <kchai@redhat.com>
2020-11-14 15:22:00 +08:00
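Illustration for de71f6b0a3: a minimal sketch of guarding an info.yaml update on whether an archive directory is set. The signature, the return value, and the plain "key: value" line format are assumptions for illustration; the real helper in qa/tasks/ceph.py may differ.

```python
from pathlib import Path
from types import SimpleNamespace


def update_info_yaml(ctx, key, value):
    """Append 'key: value' to info.yaml; skip when no archive dir is set."""
    archive = getattr(ctx, "archive", None)
    if not archive:
        # Interactive runs may have no archive directory at all,
        # so there is nothing to read or write.
        return False
    info = Path(archive) / "info.yaml"
    with info.open("a") as f:
        f.write(f"{key}: {value}\n")
    return True


if __name__ == "__main__":
    # An interactive-style context with no archive set is skipped.
    print(update_info_yaml(SimpleNamespace(archive=None), "k", "v"))
```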
Kefu Chai
43f2738a0e qa/tasks/ceph: extract update_archive_setting()
for better readability

also update the comment in `ceph_crash()` to reflect the changed
settings

Signed-off-by: Kefu Chai <kchai@redhat.com>
2020-11-03 12:01:14 +08:00
Patrick Donnelly
c569036c5d
Merge PR #37629 into master
* refs/pull/37629/head:
	qa/cephfs: add session_timeout option support
	qa/cephfs: move the cephfs's options setting to create()
	qa/cephfs: add 'cephfs:' section support

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2020-10-25 16:26:36 -07:00
Xiubo Li
0422673b61 qa/cephfs: add session_timeout option support
When the MDS revokes the Fwbl caps, the clients need to flush
their dirty data back to the OSDs, but the flush may overload and
slow down the OSDs, so it may take more than 60 seconds to
finish. The MDS daemons will then report WRN messages.

For the teuthology test cases, let's just increase the timeout
value to make it work.

Fixes: https://tracker.ceph.com/issues/47565
Signed-off-by: Xiubo Li <xiubli@redhat.com>
2020-10-23 14:27:37 +08:00
Xiubo Li
cb8081ce7f qa/cephfs: move the cephfs's options setting to create()
Fixes: https://tracker.ceph.com/issues/47565
Signed-off-by: Xiubo Li <xiubli@redhat.com>
2020-10-23 14:27:37 +08:00
Xiubo Li
3b5303482f qa/cephfs: add 'cephfs:' section support
Fixes: https://tracker.ceph.com/issues/47565
Signed-off-by: Xiubo Li <xiubli@redhat.com>
2020-10-23 14:27:30 +08:00
Deepika Upadhyay
7ef18559cb qa: drop hammer branch qa tests
Fixes: https://tracker.ceph.com/issues/47731
Signed-off-by: Deepika Upadhyay <dupadhya@redhat.com>
2020-10-15 17:32:06 +05:30
Patrick Donnelly
eb95dabc3f
qa: fix proc exit status check
Fixes: f30a84b6a7
Fixes: https://tracker.ceph.com/issues/47677
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2020-09-28 11:03:21 -07:00
Patrick Donnelly
f30a84b6a7
qa: ignore logrotate state rename error
This is hopefully a transient issue that can be ignored.

Fixes: https://tracker.ceph.com/issues/42433
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2020-09-20 14:05:13 -07:00
Patrick Donnelly
269667b3a2
Merge PR #37218 into master
* refs/pull/37218/head:
	qa: spawn MDS daemons before creating fs

Reviewed-by: Kefu Chai <kchai@redhat.com>
2020-09-18 16:25:59 -07:00
Patrick Donnelly
61db7a9c2e
qa: spawn MDS daemons before creating fs
This avoids unnecessary MDS_ALL_DOWN messages because the MDS daemons
have not yet been spawned.

Fixes: https://tracker.ceph.com/issues/47518
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2020-09-17 09:26:38 -07:00
Greg Farnum
9506d09e3b Merge remote-tracking branch 'origin/master' into wip-stretch-mode
Conflicts:
	src/include/ceph_features.h

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
2020-09-15 02:25:07 +00:00
Greg Farnum
d02625331c Merge remote-tracking branch 'origin/master' into wip-stretch-mode
2020-09-14 02:32:19 +00:00
Kyr Shatskyy
cea546f3b0 qa/tasks/ceph: use remote.read_file instead of misc.get_file
Signed-off-by: Kyr Shatskyy <kyrylo.shatskyy@suse.com>
2020-09-04 00:02:15 +02:00
Kyr Shatskyy
b4a0ef0ed0 qa/tasks/ceph: use remote.write_file
Signed-off-by: Kyr Shatskyy <kyrylo.shatskyy@suse.com>
2020-09-04 00:02:15 +02:00
Kefu Chai
786b87cdb2
Merge pull request #36718 from ShraddhaAg/add-dir-to-archive-in-logs
qa/tasks/ceph.py: add ceph logs directory in job's info.yaml

Reviewed-By: Josh Durgin <jdurgin@redhat.com>
2020-08-27 22:26:01 +08:00
Sage Weil
2ee9365d0b qa: log-whitelist -> log-ignorelist
Signed-off-by: Sage Weil <sage@newdream.net>
2020-08-24 19:53:08 +00:00
Kefu Chai
e6eabeeeb2 qa/tasks/ceph: create a log file before redirecting to it
The shell interprets ">>" and redirects stderr to the given
file, but the shell process is launched by ubuntu:ubuntu without
sudo, so the command fails with "Permission denied". To address
this, a file with the proper privileges is created
beforehand using `install`, so the shell is able to write to it.

Also, `maybe_redirect_stderr()` no longer creates this file itself; it
returns the command that creates the log file.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2020-08-22 11:33:12 +08:00
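Illustration for e6eabeeeb2: a minimal sketch of a helper that returns both the command that pre-creates a writable log file and the redirection to append stderr to it. The signature, the crimson-only condition, the file mode, and the exact `install` arguments are assumptions for illustration, not the code from the commit.

```python
import shlex


def maybe_redirect_stderr(flavor, log_path):
    """Return (create_cmd, redirect) for flavors whose stderr is captured.

    create_cmd pre-creates the log file with sudo so that an
    unprivileged shell can later append to it with '2>>'.
    """
    if flavor != "crimson":
        return None, ""
    quoted = shlex.quote(log_path)
    # Assumed invocation: copying /dev/null yields an empty file with
    # the requested mode.
    create_cmd = f"sudo install -m 0666 /dev/null {quoted}"
    redirect = f"2>> {quoted}"
    return create_cmd, redirect


if __name__ == "__main__":
    print(maybe_redirect_stderr("crimson", "/var/log/ceph/osd.0.log"))
```

Returning the command instead of running it lets the caller decide where and how to execute it, which matches the split described in the commit message.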
Shraddha Agrawal
e991f04b59 qa/tasks/ceph.py: add ceph logs directory in job's info.yaml
This commit adds the paths of the ceph log directories to the job's
info.yaml file. The motivation is that, if a job times out,
the logs can still be transferred to the teuthology host
before the test machines are nuked, using the ceph log directory
paths recorded in the job's info.yaml file.

Signed-off-by: Shraddha Agrawal <shraddha.agrawal000@gmail.com>
2020-08-19 17:34:26 +05:30
Kefu Chai
e0620eefbd qa/tasks/ceph: redirect stderr for crimson flavor
We should redirect stderr for the crimson flavor instead of the default
flavor. This change addresses a regression introduced by
da76f46461

Signed-off-by: Kefu Chai <kchai@redhat.com>
2020-08-07 01:11:57 +08:00