Commit Graph

3242 Commits

Author SHA1 Message Date
Patrick Donnelly
541cc173c6 Merge PR #43179 into master
* refs/pull/43179/head:
	qa: lengthen grace for fs map showing dead MDS

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
2021-09-20 20:48:00 -04:00
Patrick Donnelly
c8a900c6c6 Merge PR #42763 into master
* refs/pull/42763/head:
	mon/FSCommands: add 'recover' flag in `fs new` command

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-09-20 20:46:25 -04:00
Patrick Donnelly
0d9753fa3c Merge PR #43122 into master
* refs/pull/43122/head:
	qa: add test for standby-replay marking rank damaged
	MDSMonitor: handle damaged from standby-replay
	mds: add config to mark rank damaged in standby-replay
	include: unset std::hex after printing CompatSet
	mds: refactor iterator lookup
	mds: harden rank lookup

Reviewed-by: Venky Shankar <vshankar@redhat.com>
2021-09-16 21:47:40 -04:00
Sage Weil
1a19d69679 Merge PR #43172 into master
* refs/pull/43172/head:
	qa/tasks/kubeadm: modify (do not clobber) daemon.json

Reviewed-by: Joseph Sawaya <jsawaya@redhat.com>
2021-09-15 22:48:36 -04:00
Patrick Donnelly
91c6f3364d Merge PR #42719 into master
* refs/pull/42719/head:
	mgr/volumes: Fix permission during subvol creation with mode

Reviewed-by: Ramana Raja <rraja@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-09-15 22:34:23 -04:00
Patrick Donnelly
33331cf4aa Merge PR #42584 into master
* refs/pull/42584/head:
	doc: fix `daemon status` interface (exclude file system name)
	test: adjust mirroring tests for `daemon status` change
	mgr/mirroring: `daemon status` command does not require file system name

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-09-15 22:33:18 -04:00
Patrick Donnelly
ef5d7febeb
qa: lengthen grace for fs map showing dead MDS
Fixes: https://tracker.ceph.com/issues/52625
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-09-15 22:21:03 -04:00
Sage Weil
2a6ad93a76 qa/tasks/kubeadm: modify (do not clobber) daemon.json
Otherwise we blow away the mirror config.

Signed-off-by: Sage Weil <sage@newdream.net>
2021-09-15 15:16:50 -05:00
Mykola Golub
76743e0058 qa/suites/rados: add backfill_toofull test
Signed-off-by: Mykola Golub <mgolub@suse.com>
2021-09-15 17:21:11 +03:00
Xiubo Li
0cb06740a9 qa: enable dynamic debug support to kclient
Add a 'kmount_count' counter in ctx to make sure the dynamic debug
log won't be disabled until the last kernel mounter is unmounted.
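
The counter described above can be sketched roughly as follows. This is only an illustration: `MountContext`, `kclient_mounted` and `kclient_unmounted` are hypothetical names, and only `kmount_count` comes from the commit message.

```python
class MountContext:
    """Hypothetical stand-in for the shared test context (ctx)."""

    def __init__(self):
        self.kmount_count = 0        # number of live kernel mounts
        self.dynamic_debug = False   # whether dynamic debug logging is on


def kclient_mounted(ctx):
    # Turn logging on when the first kernel mount appears.
    if ctx.kmount_count == 0:
        ctx.dynamic_debug = True
    ctx.kmount_count += 1


def kclient_unmounted(ctx):
    # Turn logging off only when the last kernel mount goes away.
    ctx.kmount_count -= 1
    if ctx.kmount_count == 0:
        ctx.dynamic_debug = False
```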

Fixes: https://tracker.ceph.com/issues/48736
Signed-off-by: Xiubo Li <xiubli@redhat.com>
2021-09-15 09:31:04 +08:00
Sage Weil
13238ade13 Merge PR #43136 into master
* refs/pull/43136/head:
	qa/tasks/kubeadm: change calico encap to IPIPCrossSubnet
	qa/suites/orch/rook/smoke: add host networking to matrix
	qa/tasks/rook: fix shadowing of config arg in rook_cluster()

Reviewed-by: Joseph Sawaya <jsawaya@redhat.com>
2021-09-13 18:28:43 -04:00
Sage Weil
528880d3bb qa/tasks/kubeadm: change calico encap to IPIPCrossSubnet
Signed-off-by: Sage Weil <sage@newdream.net>
2021-09-13 15:26:54 -05:00
Ramana Raja
67bb13859a mon/FSCommands: add 'recover' flag in fs new command
Currently, to recover a file system after recovering monitor store, you
need to stop all the MDSs; create FSMap with defaults using `fs new`
command; execute `fs reset` command to get the file system's rank 0 into
existing but failed state; and then restart MDSs.

Add 'recover' flag to the `fs new` command that sets the file system's
rank 0 to existing but failed state, and sets the file system's
'joinable' setting to False. Using the `fs new` command with 'recover'
flag gets rid of the steps to stop all the MDSs and execute `fs reset`
command when recovering the file system after recovering the monitor store.

Fixes: https://tracker.ceph.com/issues/51716
Signed-off-by: Ramana Raja <rraja@redhat.com>
2021-09-13 00:15:39 -04:00
Mykola Golub
e0a926a2c1 qa/tasks/ceph_manager: fix assertion
The osd may be 0.

Signed-off-by: Mykola Golub <mgolub@suse.com>
2021-09-10 15:47:41 +03:00
Patrick Donnelly
f4a11a3290
qa: add test for standby-replay marking rank damaged
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-09-09 20:16:03 -04:00
Yuri Weinstein
3b779e712f
Merge pull request #42853 from sseshasa/wip-fix-vstart-mon-permissions
mon/MonCap: Update osd profile to allow cmd to set iops capacity on mon db

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
2021-09-09 10:10:06 -07:00
Sebastian Wagner
fe734adddd
Merge pull request #43045 from sebastian-philipp/qa-tox-import-yaml
qa: tox.ini: verify yaml syntax

Reviewed-by: Sage Weil <sage@newdream.net>
2021-09-08 17:10:13 +02:00
Kotresh HR
7440ef842a mgr/volumes: Fix permission during subvol creation with mode
Creating a subvolume with a specific mode also creates the
parent directories ('/volumes/_no_group') with that mode
if they do not already exist. This commit fixes that.

Similarly, creating a subvolumegroup with a specific mode
creates the parent directory ('/volumes') with that mode
if it does not already exist. This commit fixes that as well.
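
The intended behaviour can be sketched on a local file system, under stated assumptions: `create_with_mode`, `PARENT_MODE` (0o755) and the subvolume name `sv1` are hypothetical, and the real module works on CephFS paths rather than through `os`.

```python
import os
import stat
import tempfile

PARENT_MODE = 0o755  # assumed default for intermediate directories


def create_with_mode(root, relpath, mode):
    """Create relpath under root; only the final directory gets `mode`."""
    parts = relpath.strip("/").split("/")
    path = root
    for name in parts[:-1]:
        path = os.path.join(path, name)
        if not os.path.isdir(path):
            os.mkdir(path)
            os.chmod(path, PARENT_MODE)  # parents never inherit `mode`
    leaf = os.path.join(path, parts[-1])
    os.mkdir(leaf)
    os.chmod(leaf, mode)                 # chmod is not subject to the umask
    return leaf


with tempfile.TemporaryDirectory() as root:
    leaf = create_with_mode(root, "volumes/_no_group/sv1", 0o777)
    parent_mode = stat.S_IMODE(os.stat(os.path.dirname(leaf)).st_mode)
    leaf_mode = stat.S_IMODE(os.stat(leaf).st_mode)
```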

Fixes: https://tracker.ceph.com/issues/51870
Signed-off-by: Kotresh HR <khiremat@redhat.com>
2021-09-07 15:51:21 +05:30
Sebastian Wagner
7777603e8b
qa: tox.ini: verify yaml syntax
Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
2021-09-07 10:20:34 +02:00
Patrick Donnelly
ca906d0d7a Merge PR #42529 into master
* refs/pull/42529/head:
	qa: verify rank 0 does not fail during journal repair tests
	qa: avoid stopping/restarting mds in journal repair tests

Reviewed-by: Venky Shankar <vshankar@redhat.com>
2021-09-06 14:00:41 -04:00
Sage Weil
42b4108073 qa/tasks/rook: fix shadowing of config arg in rook_cluster()
Signed-off-by: Sage Weil <sage@newdream.net>
2021-09-03 10:49:54 -05:00
Sage Weil
9cb2f444fd Merge PR #42873 into master
* refs/pull/42873/head:
	qa/tasks/rook: add OSD creation to Rook QA

Reviewed-by: Sage Weil <sage@redhat.com>
2021-09-02 17:11:51 -04:00
Joseph Sawaya
4b6de11169 qa/tasks/rook: add OSD creation to Rook QA
This commit adds OSD creation to the Rook QA tasks. The Rook task will
explicitly wait for the mgr to start and the CLI to work (instead of
implicitly doing so while waiting for 'ceph osd dump' to work).
Then it will do `ceph orch apply osd --all-available-devices` to create
OSDs on the rest of the PVs.

Signed-off-by: Joseph Sawaya <jsawaya@redhat.com>
2021-09-01 11:27:40 -04:00
Kalpesh Pandya
9c1e5d5c52 qa/tasks: Addition of new code for session tags in STS
Signed-off-by: Kalpesh Pandya <kapandya@redhat.com>
2021-09-01 17:09:54 +05:30
Kalpesh Pandya
74b5ec876c qa/tasks: Addition of two new parameters for sts-tests
Addition of SUB and AZP parameter for some new sts-tests

Signed-off-by: Kalpesh Pandya <kapandya@redhat.com>
2021-09-01 17:09:54 +05:30
Sridhar Seshasayee
4b0dba28b6 qa/tasks: Set default caps for 'osd' type in generate_caps()
Assign the default caps for osds to be the same as what the AuthMonitor
sets for a new osd. See AuthMonitor::validate_osd_new() which sets the
following caps for a new osd:

 mon='allow profile osd'
 mgr='allow profile osd'
 osd='allow *'

When a real-world cluster is deployed, the above caps are applied.
Unless the user modifies the defaults, a cluster will operate with the
above caps. Therefore, it makes sense to use the defaults when testing
Ceph so that any issues caused by the default settings can be caught and
fixed.

Therefore, the caps for the 'osd' type are reset to the defaults in
generate_caps(). The caps for 'mgr' already reflect the system defaults.
The caps for the 'mds' type are not changed in this commit and will be
investigated and changed if necessary later.
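
A minimal sketch of what such a defaults table might look like. The 'osd' values are copied from the list above; the dictionary layout, the `DEFAULT_CAPS` name and the `--cap` argument shape are assumptions, not the actual qa code.

```python
# The 'osd' values come from the commit message; everything else about this
# layout is an assumption made for illustration.
DEFAULT_CAPS = {
    "osd": {
        "mon": "allow profile osd",
        "mgr": "allow profile osd",
        "osd": "allow *",
    },
}


def generate_caps(daemon_type):
    """Yield '--cap <subsystem> <capability>' arguments for a daemon type."""
    for subsystem, capability in DEFAULT_CAPS[daemon_type].items():
        yield "--cap"
        yield subsystem
        yield capability
```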

Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
2021-09-01 13:46:01 +05:30
Patrick Donnelly
ec69208deb Merge PR #38481 into master
* refs/pull/38481/head:
	qa/vstart_runner: inherit methods instead of duplicating them
	qa/ceph_manager: make it possible to reuse few methods
	qa/vstart_runner: don't use "shell=False" in run_ceph_w()
	qa/ceph_manager: minor refactor

Reviewed-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-08-27 21:26:41 -04:00
Patrick Donnelly
ea04087786 Merge PR #42371 into master
* refs/pull/42371/head:
	mgr/volumes: Fix a race during clone cancel
	mgr/volumes: Fail subvolume removal if it's in progress

Reviewed-by: Ramana Raja <rraja@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
2021-08-23 20:02:31 -04:00
Avan Thakkar
95543bb150 mgr/dashboard: stats=false not working when listing buckets
Fixes: https://tracker.ceph.com/issues/51154
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
2021-08-23 15:57:54 +05:30
Sage Weil
6f8bdfbb90 Merge PR #42252 into master
* refs/pull/42252/head:
	mgr/dashboard: set rgw credentials: fix api tests
	mgr/dashboard: run-frontend-e2e-tests.sh: remove unneeded rgw setting
	mgr/dashboard: rgw service creation form: add realm and zone to service spec.
	mgr/dashboard: connect-rgw: rename to set-rgw-credentials; refactoring
	mgr/dashboard: connect-rgw: adaptation and test coverage
	mgr/cephadm: re-check dashboard <-> rgw creds when rgw daemons created/destroyed
	mgr/dashboard: add 'dashboard connect-rgw' command
	doc/mgr/dashboard: simplify dashboard+rgw config docs

Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Juan Miguel Olmo <jolmomar@redhat.com>
2021-08-11 11:28:28 -04:00
Alfonso Martínez
a682b9d7a4 mgr/dashboard: set rgw credentials: fix api tests
Fixes: https://tracker.ceph.com/issues/44605
Signed-off-by: Alfonso Martínez <almartin@redhat.com>
2021-08-11 08:59:13 +02:00
Sage Weil
4b9a3b2171 Merge PR #42613 into master
* refs/pull/42613/head:
	qa/suites/roch/rook/smoke: test rook 1.7.0, not 1.6.2
	qa/tasks/rook: set storage_class to scratch

Reviewed-by: merge 42318
2021-08-10 16:47:22 -04:00
Sage Weil
3331a0a7ea Merge PR #42691 into master
* refs/pull/42691/head:
	mgr/nfs: add --port to 'nfs cluster create' and port to 'nfs cluster info'
	qa/suites/orch/cephadm/smoke-roleless: test taking ganeshas offline
	qa/tasks/vip: exec with bash -ex
	qa/suites/orch/cephadm: separate test_nfs from test_orch_cli

Reviewed-by: Varsha Rao <varao@redhat.com>
2021-08-10 16:37:38 -04:00
Alfonso Martínez
6e20ef1dd3 mgr/dashboard: connect-rgw: rename to set-rgw-credentials; refactoring
- Rename the dashboard command to better reflect its behavior.
- Rename '_radosgw_admin' method to 'send_rgwadmin_command' for consistency with
  'send_mon_command' and move it to the mgr_module.py .
- Cleanup: remove unneeded rgw settings.
- Better error handling and test coverage.

Fixes: https://tracker.ceph.com/issues/44605
Signed-off-by: Alfonso Martínez <almartin@redhat.com>
2021-08-10 14:06:03 +02:00
Sage Weil
84479e03a7 Merge PR #42709 into master
* refs/pull/42709/head:
	qa/tasks/kubeadm: force docker cgroup engine to systemd

Reviewed-by: Travis Nielsen <tnielsen@redhat.com>
2021-08-09 15:23:11 -04:00
Casey Bodley
95f2161ee3
Merge pull request #42688 from cbodley/wip-52069
qa/rgw: update apache-maven mirror for rgw/hadoop-s3a

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
2021-08-09 11:51:36 -04:00
Casey Bodley
e514b3a374
Merge pull request #42689 from cbodley/wip-52070
qa/rgw: barbican and pykmip tasks upgrade pip before installing pytz

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
2021-08-09 11:51:21 -04:00
Sage Weil
517b7759b3 qa/tasks/kubeadm: force docker cgroup engine to systemd
Signed-off-by: Sage Weil <sage@newdream.net>
2021-08-06 14:21:08 -05:00
Kefu Chai
62944aefa0
Merge pull request #42277 from tchaikov/wip-vstart-runner-cleanups
qa/tasks/vstart_runner: do not send SIGTERM if no matched pid

Reviewed-by: Rishabh Dave <ridave@redhat.com>
2021-08-06 10:33:19 +08:00
Sage Weil
3c1e086be0 qa/tasks/vip: exec with bash -ex
Signed-off-by: Sage Weil <sage@newdream.net>
2021-08-05 17:45:56 -04:00
Casey Bodley
e5a5b4e379 qa/rgw: barbican and pykmip tasks upgrade pip before installing pytz
Downloading 461087a514/cryptography-3.4.7.tar.gz (546kB)
  Complete output from command python setup.py egg_info:

          =============================DEBUG ASSISTANCE==========================
          If you are seeing an error here please try the following to
          successfully install cryptography:

          Upgrade to the latest pip and try again. This will fix errors for most
          users. See: https://pip.pypa.io/en/stable/installing/#upgrading-pip
          =============================DEBUG ASSISTANCE==========================

  Traceback (most recent call last):
    File "<string>", line 1, in <module>
    File "/tmp/pip-build-7fhnk5us/cryptography/setup.py", line 14, in <module>
      from setuptools_rust import RustExtension
  ModuleNotFoundError: No module named 'setuptools_rust'

Fixes: https://tracker.ceph.com/issues/52070

Signed-off-by: Casey Bodley <cbodley@redhat.com>
2021-08-05 16:45:02 -04:00
Casey Bodley
9253733d08 qa/rgw: update apache-maven mirror for rgw/hadoop-s3a
Fixes: https://tracker.ceph.com/issues/52069

Signed-off-by: Casey Bodley <cbodley@redhat.com>
2021-08-05 14:50:09 -04:00
Kefu Chai
a17ebc0406
Merge pull request #42575 from tchaikov/wip-venv
*: s/virtualenv/python -m venv/

Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
2021-08-04 18:37:45 +08:00
Sage Weil
460d7a215a qa/tasks/rook: set storage_class to scratch
Signed-off-by: Sage Weil <sage@newdream.net>
2021-08-03 16:13:13 -04:00
Venky Shankar
11b61b4fb9 test: adjust mirroring tests for daemon status change
Signed-off-by: Venky Shankar <vshankar@redhat.com>
2021-08-02 06:39:16 -04:00
Rishabh Dave
d86bfbfe2d qa/vstart_runner: inherit methods instead of duplicating them
Inherit methods run_ceph_w(), run_cluster_cmd(), raw_cluster_cmd() and
raw_cluster_cmd_result() from ceph_manager.CephManager in
vstart_runner.LocalCephManager instead of duplicating them.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2021-08-02 11:37:49 +05:30
Rishabh Dave
93677576c1 qa/ceph_manager: make it possible to reuse few methods
Make minor adjustments to ceph_manager.CephManager so that methods
run_ceph_w(), run_cluster_cmd() raw_cluster_cmd() and
raw_cluster_cmd_result() can be reused, instead of duplicating, in
subclasses. The adjustments are -

* Having variables contain arguments that'll be prepended to every
  command received by the methods above.
* Grouping the variables that need to be overridden together so that
  users can easily spot and override them.
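
The two adjustments can be illustrated with a small sketch. The class names follow the commit message, but `CEPH_CMD`, `cluster_argv` and the `./bin/ceph` path are hypothetical, and the method returns the argument list instead of running anything.

```python
class CephManager:
    # Grouped in one place so subclasses can easily spot and override them.
    CEPH_CMD = ["ceph"]  # prepended to every cluster command

    def cluster_argv(self, *args):
        """Build the full command line (returned rather than executed here)."""
        return self.CEPH_CMD + list(args)


class LocalCephManager(CephManager):
    # Only the prefix differs; the method itself is inherited.
    CEPH_CMD = ["./bin/ceph"]
```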

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2021-08-02 11:37:49 +05:30
Rishabh Dave
047c90f881 qa/vstart_runner: don't use "shell=False" in run_ceph_w()
Instead prepend "exec sudo" to the command arguments of
LocalCephManager.run_ceph_w(). This makes the default parameter
"shell=False" redundant in case of
ceph_manager.CephManager.run_ceph_w(), so get rid of it too and update
calls to run_ceph_w() accordingly.

The reason behind using any of these workarounds is that running "ceph
-w" with "shell" set to True leads to a crash in the Ceph API CI job. See
this ticket for more details: https://tracker.ceph.com/issues/49644.

The reason behind switching the workaround is that in the following
commits to reduce duplication LocalCephManager.run_ceph_w() will be
deleted and CephManager.run_ceph_w() will be used by LocalCephManager
via inheritance. However, due to the issue described above, Ceph API
test will fail since "shell" is set to "True" for the command issued by
CephManager.run_ceph_w(). Prepending "exec sudo" to the command when it
is used in LocalCephManager makes this duplication unnecessary and also
prevents Ceph API test from failing.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2021-08-02 11:37:44 +05:30
Rishabh Dave
4101f76ed6 qa/ceph_manager: minor refactor
Save the return value of method "teuthology.get_testdir()" instead of
calling it repeatedly in the same class.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2021-08-02 10:07:23 +05:30
Kefu Chai
f0ed7a188f qa/tasks: s/virtualenv/python3 -m venv/
so we don't need the virtualenv Python package to create a
virtualenv; the "venv" module in Python 3 suffices.
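
For illustration, the standard-library module can also be driven from Python. This sketch assumes a temporary directory and skips pip to stay quick; it is not the code the qa tasks run.

```python
import os
import tempfile
import venv

with tempfile.TemporaryDirectory() as tmp:
    env_dir = os.path.join(tmp, "venv")
    # Roughly what `python3 -m venv --without-pip <env_dir>` does.
    venv.create(env_dir, with_pip=False)
    # Every environment created by venv contains a pyvenv.cfg file.
    created = os.path.isfile(os.path.join(env_dir, "pyvenv.cfg"))
```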

see also https://docs.python.org/3/library/venv.html

Signed-off-by: Kefu Chai <kchai@redhat.com>
2021-07-31 22:34:05 +08:00
Patrick Donnelly
2cd3494771 qa: update mds_pre_upgrade to no longer stop standbys
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-07-30 16:28:54 -07:00
Patrick Donnelly
8e0b9bcad6 qa: update mds_pre_upgrade to disable standby-replay
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-07-30 16:28:54 -07:00
Patrick Donnelly
295971b9c6 qa: add tests for compat manipulation and upgrade
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-07-30 16:28:54 -07:00
Patrick Donnelly
5ae7b9202b Merge PR #42513 into master
* refs/pull/42513/head:
	qa: multifs already enabled as default

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-07-30 14:03:36 -07:00
Kotresh HR
103c7bdc70 mgr/volumes: Fail subvolume removal if it's in progress
Removing an in-progress subvolume clone with force doesn't
remove the clone index (tracker). This leaves the cloner
thread stuck in a loop trying to clone the deleted subvolume.

This patch addresses the issue by not allowing the subvolume clone
to be removed if it's not complete/cancelled/failed even with force option.
It throws the error EAGAIN, asking the user to cancel the pending clone
and retry.
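
The check described above might look roughly like this. `check_clone_removable` and `FINAL_STATES` are hypothetical names; only the three final states and the EAGAIN error come from the commit message.

```python
import errno

FINAL_STATES = {"complete", "cancelled", "failed"}


def check_clone_removable(state):
    """Raise EAGAIN unless the clone reached a final state.

    There is deliberately no `force` parameter: forcing does not bypass
    this check.
    """
    if state not in FINAL_STATES:
        raise OSError(errno.EAGAIN, "clone in progress; cancel it and retry")
```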

Fixes: https://tracker.ceph.com/issues/51707
Signed-off-by: Kotresh HR <khiremat@redhat.com>
2021-07-30 13:14:28 +05:30
Patrick Donnelly
0efa23572a qa: verify rank 0 does not fail during journal repair tests
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-07-29 13:53:20 -07:00
Patrick Donnelly
14324ab5c2 qa: avoid stopping/restarting mds in journal repair tests
It is enough to just fail ranks and manipulate the "joinable" flag of
the fs.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-07-29 13:53:20 -07:00
Brad Hubbard
434b325c40
Merge pull request #42442 from badone/wip-insights-reports-non-persistent-storage
Don't persist report data

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
2021-07-29 09:19:32 +10:00
Patrick Donnelly
665b36de4e Merge PR #42349 into master
* refs/pull/42349/head:
	mon/MDSMonitor: propose if FSMap struct_v is too old
	mon/MDSMonitor: give a proper error message if FSMap struct_v is too old
	mds/FSMap: use DECODE_OLDEST to gate FSMap version
	qa: add tests for fs dump of epoch and trimming
	qa: add file system support for dumping epoch
	mon/MDSMonitor: return mon_mds_force_trim_to even if equal to current epoch
	mon: add debugging for trimming methods
	mon: fix debug spacing
	qa: add nofs upgrade suite

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Ramana Raja <rraja@redhat.com>
2021-07-28 10:45:08 -07:00
Patrick Donnelly
4f0f51e4cb Merge PR #41025 into master
* refs/pull/41025/head:
	qa: wait pgs to be clean before using the pools
	qa: ignore PG_RECOVERY_FULL and PG_DEGRADED for mds-full
	qa: wait more time since there have many more pgs than before
	qa: do not multiple the full ratio twice
	qa: do not raise for kclient for _fsync test
	qa: use the pg autoscale mode to calcuate the pg_num
	qa: set the object_size to 1M
	qa: move the is_full() to parent class

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-07-28 10:34:12 -07:00
Patrick Donnelly
5ddaa36d17 qa: add tests for fs dump of epoch and trimming
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-07-28 07:07:05 -07:00
Patrick Donnelly
ee899d9a44 qa: add file system support for dumping epoch
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-07-28 07:07:05 -07:00
Xiubo Li
361ee535dd qa: multifs already enabled as default
Pacific already marks multifs as enabled by default.

Signed-off-by: Xiubo Li <xiubli@redhat.com>
2021-07-28 13:56:10 +08:00
Xiubo Li
a448d1c3ee qa: wait pgs to be clean before using the pools
Or in some use cases, like the mds-full tests, we will hit the
"PG_AVAILABILITY" warning.

Signed-off-by: Xiubo Li <xiubli@redhat.com>
2021-07-27 09:54:03 +08:00
Xiubo Li
999c787ac6 qa: wait more time since there have many more pgs than before
Signed-off-by: Xiubo Li <xiubli@redhat.com>
2021-07-27 09:48:56 +08:00
Xiubo Li
ba3833a622 qa: do not multiple the full ratio twice
The cluster has already multiplied by the full ratio before returning
the "max_avail".
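
A small arithmetic sketch of the difference, with hypothetical function names and made-up numbers:

```python
def bytes_to_fill(max_avail):
    """max_avail already accounts for the full ratio, so use it as-is."""
    return max_avail


def bytes_to_fill_buggy(max_avail, full_ratio):
    """The old behaviour: the ratio is applied a second time."""
    return int(max_avail * full_ratio)


# Illustrative values only: the buggy variant underestimates how much
# data is needed to reach the full threshold.
shortfall = bytes_to_fill(950) - bytes_to_fill_buggy(950, 0.95)
```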

Fixes: https://tracker.ceph.com/issues/50984
Signed-off-by: Xiubo Li <xiubli@redhat.com>
2021-07-27 09:48:56 +08:00
Xiubo Li
a96ee41908 qa: do not raise for kclient for _fsync test
For kclient, the write() will return -ENOSPC instead of the fsync().

Fixes: https://tracker.ceph.com/issues/45434
Signed-off-by: Xiubo Li <xiubli@redhat.com>
2021-07-27 09:48:56 +08:00
Xiubo Li
c1cea71299 qa: use the pg autoscale mode to calcuate the pg_num
A pg_num of 8 is too small: some OSDs may not be covered by the pools and
others may be overloaded. Remove the hardcoded pg_num here and let the
pg autoscale mode calculate it as needed, and at the same time set
pg_num_min to 64 to keep pg_num from getting too small.

If an EC pool is used, most of the test data will go to the EC pool and
the primary replicated pool will only store a small amount of metadata for
the files, so setting the target size ratio to 0.05 should be enough.

Fixes: https://tracker.ceph.com/issues/45434
Signed-off-by: Xiubo Li <xiubli@redhat.com>
2021-07-27 09:48:56 +08:00
Xiubo Li
c7837484d9 qa: set the object_size to 1M
Set the object_size to 1MB so that objects are distributed more evenly
among the OSDs.

Fixes: https://tracker.ceph.com/issues/45434
Signed-off-by: Xiubo Li <xiubli@redhat.com>
2021-07-27 09:48:56 +08:00
Xiubo Li
f4288f2a9b qa: move the is_full() to parent class
Signed-off-by: Xiubo Li <xiubli@redhat.com>
2021-07-27 09:48:56 +08:00
Sage Weil
cd089ee74e qa/suites/orch/cephadm: add rgw nfs export test
Signed-off-by: Sage Weil <sage@newdream.net>
2021-07-26 16:23:17 -04:00
Sage Weil
45737fe95a qa/tasks/python: simple task to run python code
Signed-off-by: Sage Weil <sage@newdream.net>
2021-07-26 16:23:06 -04:00
Neha Ojha
c9ad86e9c5
Merge pull request #42438 from tchaikov/wip-qa-test_module_selftest
qa/tasks/mgr: clean crash reports before waiting for clean

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2021-07-23 15:39:14 -07:00
Patrick Donnelly
0e71ea4a13
Merge PR #42106 into master
* refs/pull/42106/head:
	mds: create file system with specific ID

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-07-23 11:15:33 -07:00
Patrick Donnelly
4ee63174e6
Merge PR #42431 into master
* refs/pull/42431/head:
	cmake: add "mypy" back to tox envlist of "qa""
	qa/tasks/vstart_runner: add optional "sudo" param to _run_python()

Reviewed-by: Sebastian Wagner <swagner@suse.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-07-22 07:56:40 -07:00
Milind Changire
6edebf8a1f
Merge pull request #42329 from vshankar/wip-cephfs-mirror-dir-remove-registery
cephfs-mirror: record directory path cancel in DirRegistry

Reviewed-by: Milind Changire <mchangir@redhat.com>
2021-07-22 18:20:59 +05:30
Brad Hubbard
32d1cca2d9 qa/tasks/mgr/test_insights: Remove test for persistent checks
This test makes no sense if we are no longer persisting the store.

Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
2021-07-22 15:02:01 +10:00
Kefu Chai
0017df2006 qa/tasks/vstart_runner: add optional "sudo" param to _run_python()
to silence mypy warnings like:

tasks/vstart_runner.py:691: error: Definition of "_run_python" in base class "LocalCephFSMount" is incompatible with definition in base class "CephFSMount"
tasks/vstart_runner.py:705: error: Definition of "_run_python" in base class "LocalCephFSMount" is incompatible with definition in base class "CephFSMount"
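
A reduced sketch of the situation: when a class inherits from two bases that both define `_run_python`, mypy expects the definitions to be compatible. The parameter names and the `CombinedMount` class are hypothetical; only the two base class names and the optional `sudo` parameter come from the commit.

```python
import inspect


class CephFSMount:
    def _run_python(self, pyscript, py_version="python3", sudo=False):
        return ("remote", sudo)


class LocalCephFSMount:
    # Accepting the same optional `sudo` keeps the two signatures compatible.
    def _run_python(self, pyscript, py_version="python3", sudo=False):
        return ("local", sudo)


class CombinedMount(LocalCephFSMount, CephFSMount):
    """Hypothetical class inheriting _run_python from both bases."""


same_signature = (
    inspect.signature(CephFSMount._run_python)
    == inspect.signature(LocalCephFSMount._run_python)
)
```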

Signed-off-by: Kefu Chai <kchai@redhat.com>
2021-07-22 10:08:27 +08:00
Neha Ojha
c9f8846b7f
Merge pull request #41907 from kamoltat/wip-ksirivad-progress-time-interval
pybind/mgr/progress: introduce 5 second sleep interval

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2021-07-21 16:53:38 -07:00
Kefu Chai
ec8a40b08f qa/tasks/mgr: clean crash reports before waiting for clean
otherwise we have the following warning in the health report

{"status":"HEALTH_WARN","checks":{"RECENT_MGR_MODULE_CRASH":{"severity":"HEALTH_WARN","summary":{"message":"1 mgr modules have recently crashed","count":1},"muted":false}},"mutes":[]}

and it does not disappear after the test waits for 30 seconds,
so the tasks.mgr.test_module_selftest.TestModuleSelftest test
fails like:

2021-07-21T09:59:52.560 INFO:tasks.cephfs_test_runner:======================================================================
2021-07-21T09:59:52.561 INFO:tasks.cephfs_test_runner:ERROR: test_module_commands (tasks.mgr.test_module_selftest.TestModuleSelftest)
2021-07-21T09:59:52.561 INFO:tasks.cephfs_test_runner:----------------------------------------------------------------------
2021-07-21T09:59:52.561 INFO:tasks.cephfs_test_runner:Traceback (most recent call last):
2021-07-21T09:59:52.562 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/git.ceph.com_ceph-c_6a5d5abc027f706687dec92f92ff6fc6f074d2ae/qa/tasks/mgr/test_module_selftest.py", line 201, in test_module_commands
2021-07-21T09:59:52.562 INFO:tasks.cephfs_test_runner:    self.wait_for_health_clear(timeout=30)
2021-07-21T09:59:52.562 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/git.ceph.com_ceph-c_6a5d5abc027f706687dec92f92ff6fc6f074d2ae/qa/tasks/ceph_test_case.py", line 172, in wait_for_health_clear
2021-07-21T09:59:52.563 INFO:tasks.cephfs_test_runner:    self.wait_until_true(is_clear, timeout)
2021-07-21T09:59:52.563 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/git.ceph.com_ceph-c_6a5d5abc027f706687dec92f92ff6fc6f074d2ae/qa/tasks/ceph_test_case.py", line 209, in wait_until_true
2021-07-21T09:59:52.563 INFO:tasks.cephfs_test_runner:    raise TestTimeoutError("Timed out after {0}s and {1} retries".format(elapsed, retry_count))
2021-07-21T09:59:52.564 INFO:tasks.cephfs_test_runner:tasks.ceph_test_case.TestTimeoutError: Timed out after 30s and 0 retries

in this change, the crash reports are nuked right after
we see the warning, so that we can have a clean health
report.

Fixes: https://tracker.ceph.com/issues/51743
Signed-off-by: Kefu Chai <kchai@redhat.com>
2021-07-21 22:46:18 +08:00
Kefu Chai
dc1a8a8b0e
Merge pull request #41929 from sebastian-philipp/fix-qa-tox
qa: Various make check fixes

Reviewed-by: Kefu Chai <kchai@redhat.com>
2021-07-21 00:36:59 +08:00
Ernesto Puerta
64dbe17fdb
Merge pull request #42188 from votdev/issue_51408_motd
mgr/dashboard: Add configurable MOTD or wall notification

Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Paul Cuzner <pcuzner@redhat.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>
Reviewed-by: sebastian-philipp <NOT@FOUND>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Volker Theile <vtheile@suse.com>
2021-07-19 19:56:50 +02:00
Josh Durgin
adb0454599
Merge pull request #42074 from ljflores/wip-lflores-perf-channel
mgr/telemetry: add new 'perf' channel that shares aggregated perf counter metrics of a cluster

Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>
Reviewed-by: Yaarit Hatuka <yaarithatuka@gmail.com>
2021-07-19 07:59:32 -07:00
Venky Shankar
19a45c8d54 test: add test for checking readd after remove for a directory path
Fixes: http://tracker.ceph.com/issues/51666
Signed-off-by: Venky Shankar <vshankar@redhat.com>
2021-07-16 01:08:23 -04:00
Sage Weil
a1ee80fcf1 qa/tasks/mgr/test_orchestrator_cli: fix test
Signed-off-by: Sage Weil <sage@newdream.net>
2021-07-14 16:20:11 -04:00
Sage Weil
d41b60404d qa/tasks/cephfs/test_nfs: define NFS_POOL_NAME
We can't import from mgr_module.py from here, sadly.

Signed-off-by: Sage Weil <sage@newdream.net>
2021-07-14 16:20:11 -04:00
Sage Weil
1e6fd912f6 qa/tasks/cephfs/test_nfs: retry mount a few times
It may take a moment for a ganesha to (re)configure itself with a new
export.  If a mount fails, retry a couple times.
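
A generic retry loop of this kind could be sketched as follows. `mount_with_retry`, its defaults and the use of `OSError` as the failure signal are assumptions for illustration, not the test's actual code.

```python
import time


def mount_with_retry(mount, attempts=3, delay=2.0, sleep=time.sleep):
    """Call mount() until it succeeds or the attempts run out."""
    for attempt in range(1, attempts + 1):
        try:
            return mount()
        except OSError:
            if attempt == attempts:
                raise      # out of retries: surface the last failure
            sleep(delay)   # give the server time to pick up the new export
```

Passing `sleep` as a parameter keeps the helper easy to exercise without real delays.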

Signed-off-by: Sage Weil <sage@newdream.net>
2021-07-14 16:20:11 -04:00
Sage Weil
82e939d89c mgr/nfs: change nfs pool to .nfs
This is a new pool that we can migrate all past NFS configuration to,
simplifying the migration process (and also allowing us to pick a
.-prefixed name).

Signed-off-by: Sage Weil <sage@newdream.net>
2021-07-14 16:20:11 -04:00
Sage Weil
01c006c2de Merge PR #42041 into master
* refs/pull/42041/head:
	mgr/restful: ignore min/max_size
	test/crush: drop min/max_size refs
	qa/workunits/mon/pool_ops: remove test for min/max_size check
	qa: scrub a few remaining mentions of ruleset
	qa/standalone/mon/osd-*: fix tests
	PendingReleaseNotes: note min/max_size removal
	mgr/dashboard: remove max/min_size and ruleset
	mon/OSDMonitor: fix calls to CrushTester
	crush: eliminate min_size and max_size
	test/cli/crushtool: reunumber rulesets in test maps
	crushtool: require min/max or num-rep for --test
	crush: remove last traces of 'ruleset'
	test/cli/crushtool: use 'id' instead of 'ruleset' in crush inputs
	crushtool: take --min-rep and --max-rep explicitly
	crush/CrushTester: drop --ruleset
	doc: scrub 'ruleset' from docs
	src/erasure-code: rule, not ruleset
	mon/OSDMonitor: remove check_crush_rule() callers
	mon/OSDMonitor: rule, not ruleset
	crushtool: remove check for overlapped ruels
	crush/CrushWrapper: get_osd_pool_default_crush_replicated_ruleset -> rule
	crush: remove find_rule()
	mon/OSDMonitor: use pool's crush rule directly
	osd/OSDMap: drop checks for ruleset == ruleid
	osd/OSDMap: use pool's crush rule_id directly
	mon/PGMap: use pool's crush_rule directly
	mon/OSDMonitor: remove crush ruleset->rule rewrite

Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
2021-07-14 14:38:59 -04:00
Volker Theile
f7f163e75c mgr/dashboard: Add configurable MOTD or wall notification
Fixes: https://tracker.ceph.com/issues/51408

Signed-off-by: Volker Theile <vtheile@suse.com>
2021-07-14 10:48:49 +02:00
Sage Weil
dee581c7e1 Merge PR #42319 into master
* refs/pull/42319/head:
	qa/tasks/rebuild_mondb: fix rebuild vs logmonitor external_log_to

Reviewed-by: Neha Ojha <nojha@redhat.com>
2021-07-13 18:20:21 -04:00
Kamoltat
5f33f2f6e0 mgr/test_progress.py: Delay recover in test_progress
Change some of the tests in teuthology to make
them more deterministic, using:

`ceph osd set norecover` and
`ceph osd set nobackfill` when marking OSDs in
or out. This delays recovery and gives the
test cases a chance to check that events
actually pop up in the progress module.

Take out test_osd_cannot_recover from
tasks/mgr/test_progress.py, since it is no longer
a relevant test case: recovery gets triggered
regardless of whether the PG is unmoved.

Ignore `OSDMAP_FLAGS` in teuthology:
because we use norecover and nobackfill
to delay the recovery process, a health warning
is raised that would otherwise fail the
teuthology test.
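
One way to picture the flag handling is a context manager that sets both flags and always clears them afterwards. `recovery_paused` and the injected `run` callable are hypothetical; the sketch only builds the `ceph osd set` / `ceph osd unset` argument lists and leaves execution to the caller.

```python
from contextlib import contextmanager

FLAGS = ("norecover", "nobackfill")


@contextmanager
def recovery_paused(run):
    """Set the flags, yield to the caller, then unset them again."""
    for flag in FLAGS:
        run(["ceph", "osd", "set", flag])
    try:
        yield
    finally:
        # Always clear the flags, even if the test body raised.
        for flag in FLAGS:
            run(["ceph", "osd", "unset", flag])
```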

Signed-off-by: Kamoltat <ksirivad@redhat.com>
2021-07-13 19:33:20 +00:00
Sage Weil
f586ec2fa9 qa/tasks/rebuild_mondb: fix rebuild vs logmonitor external_log_to
Signed-off-by: Sage Weil <sage@newdream.net>
2021-07-13 14:44:57 -04:00
Patrick Donnelly
9f3e49389b
Merge PR #42029 into master
* refs/pull/42029/head:
	vstart_runner: use FileNotFoundError when os.stat() fails

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
2021-07-13 08:07:22 -07:00
Patrick Donnelly
6470e7cdcd
Merge PR #42030 into master
* refs/pull/42030/head:
	vstart_runner: maintain log level when --debug is passed

Reviewed-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-07-13 08:05:25 -07:00
Patrick Donnelly
a20300d4f0
Merge PR #42033 into master
* refs/pull/42033/head:
	vstart_runner: add log messages to vstart_runner.py

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-07-13 08:03:18 -07:00
Kefu Chai
58841419be qa/tasks/vstart_runner: do not send SIGTERM if no matched pid
otherwise the following error is expected in some cases:

INFO:__main__:Traceback (most recent call last):
INFO:__main__:  File "/home/jenkins-build/build/workspace/ceph-api/qa/tasks/mgr/test_dashboard.py", line 18, in setUp
INFO:__main__:    self._assign_ports("dashboard", "ssl_server_port")
INFO:__main__:  File "/home/jenkins-build/build/workspace/ceph-api/qa/tasks/mgr/mgr_test_case.py", line 197, in _assign_ports
INFO:__main__:    cls.mgr_cluster.mgr_stop(mgr_id)
INFO:__main__:  File "/home/jenkins-build/build/workspace/ceph-api/qa/tasks/mgr/mgr_test_case.py", line 30, in mgr_stop
INFO:__main__:    self.mgr_daemons[mgr_id].stop()
INFO:__main__:  File "../qa/tasks/vstart_runner.py", line 558, in stop
INFO:__main__:    os.kill(pid, signal.SIGTERM)
INFO:__main__:TypeError: an integer is required (got type NoneType)
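
The guard that avoids this TypeError can be sketched as below. `stop_daemon` and the injectable `kill` parameter are hypothetical, not the actual `LocalDaemon.stop()` implementation.

```python
import os
import signal


def stop_daemon(pid, kill=os.kill):
    """Send SIGTERM only when a matching pid was found."""
    if pid is None:
        return False  # nothing to signal; os.kill(None, ...) would raise
    kill(pid, signal.SIGTERM)
    return True
```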

Signed-off-by: Kefu Chai <kchai@redhat.com>
2021-07-12 13:25:44 +08:00
Kefu Chai
e15ceb6bae qa/tasks/vstart_runner: consolidate logging message in LocalDaemon
it's more readable if we have a single logging message when no matching
pid is found.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2021-07-12 13:25:44 +08:00
Ramana Raja
a0a8ba5087 mds: create file system with specific ID
A file system needs to be recreated when monitor databases are lost
and rebuilt. Some applications (e.g., CSI) expect the recovered
file system to have the same ID as before. Allow creating a file system
with a specific ID to help in such scenarios. This can now be done with
the `fs new` command using the 'fscid' argument and the 'force' flag.
As a corollary, newer file systems will no longer have increasing IDs.

Fixes: https://tracker.ceph.com/issues/51340
Signed-off-by: Ramana Raja <rraja@redhat.com>
2021-07-09 21:14:01 -04:00
Sage Weil
14f85370a8 qa: scrub a few remaining mentions of ruleset
Signed-off-by: Sage Weil <sage@newdream.net>
2021-07-07 10:32:11 -04:00