3046 Commits

Author SHA1 Message Date
Kefu Chai
6f58a26281
Merge pull request #27465 from tchaikov/wip-38219
ceph-monstore-tool: use a large enough paxos/{first,last}_committed

Reviewed-by: Neha Ojha <nojha@redhat.com>
2021-06-16 09:38:45 +08:00
Patrick Donnelly
174b8ad30b
Merge PR #41840 into master
* refs/pull/41840/head:
	qa: update cli syntax to conventional

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Ramana Raja <rraja@redhat.com>
2021-06-15 10:34:18 -07:00
Patrick Donnelly
8cb34b3849
Merge PR #41771 into master
* refs/pull/41771/head:
	qa: update scrub start code to use comma sep scrubopts

Reviewed-by: Ramana Raja <rraja@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2021-06-15 10:32:06 -07:00
Patrick Donnelly
a402b23c84
qa: update cli syntax to conventional
This was using an obscure syntax that worked at one time and wasn't
documented (AFAIK).

Fixes: https://tracker.ceph.com/issues/51182
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-06-14 10:21:43 -07:00
Kefu Chai
75b91d49b8
Merge pull request #39624 from sebastian-philipp/mypy-812
src,qa: Upgrade to mypy 0.901

Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2021-06-14 22:53:02 +08:00
Patrick Donnelly
7e320919f7
Merge PR #41482 into master
* refs/pull/41482/head:
	qa: remove obsolete deactivate routines

Reviewed-by: Venky Shankar <vshankar@redhat.com>
2021-06-13 19:56:34 -07:00
Patrick Donnelly
6a095654f4
Merge PR #41422 into master
* refs/pull/41422/head:
	qa/tasks/cephfs/test_sessionmap: reap connections immediately
	msg/async: configurable threshold for reaping dead connections

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-06-13 19:52:48 -07:00
Patrick Donnelly
0441b3d60f
Merge PR #41403 into master
* refs/pull/41403/head:
	mgr/volumes: Add config to insert delay at the beginning of the clone

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-06-13 19:51:52 -07:00
Patrick Donnelly
b24608daa2
qa: choose victim pg from rbd pool
Right now scrub_test picks any pg in ceph. Unfortunately, it picked the
.mgr pool's only pg in [1]:

	2021-05-16T11:36:35.035 DEBUG:teuthology.orchestra.run.smithi049:> adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage rados --cluster ceph --pool rbd setomapval main.db-journal.0000000000000000 key val

Instead, only pick a pg in the rbd pool.

[1] /ceph/teuthology-archive/kchai-2021-05-16_11:19:39-rados-wip-kefu-testing-2021-05-16-1043-distro-basic-smithi/6117396/teuthology.log

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-06-11 20:07:22 -07:00
Patrick Donnelly
0d9032771c
qa: fix api test failures
"device_health_metrics" pool is gone -- .mgr pool is in.

I don't think the pool removal code in some test cases is necessary any
longer with recent changes to remove those warnings; so that code is
gone too.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-06-11 19:35:17 -07:00
Kefu Chai
7513b24aa5
Merge pull request #40480 from kamoltat/wip-ksirivad-fix-bug-49988
pybind/mgr/progress: Disregard unreported pgs

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2021-06-12 08:37:35 +08:00
Patrick Donnelly
95f0e9c959
Merge PR #39505 into master
* refs/pull/39505/head:
	qa: test nowsync option in kernel client workflows
	qa: deep merge top level overrides for fuse/kclient

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
2021-06-11 17:10:41 -07:00
Kefu Chai
7afd38f846 tasks/ceph_manager: ignore EACCES when waiting for quorum
mon_tick_interval is 5 seconds by default. monitors update their
rotating keys every mon_tick_interval. before monitors form a
quorum, the auth requests from clients are put into the wait list.
these requests are re-enqueued once the monitors form a quorum. but
there is a small window of mon_tick_interval before they are able
to serve the auth requests, even after they claim to be able to
serve requests. if these re-enqueued requests happen to be served
in this window, and if authx is enabled, they will be greeted with
errors like

handle_auth_bad_method server allowed_methods [2] but i only support [2]

in the case of ceph cli, the error would look like:

[errno 13] RADOS permission denied (error connecting to the cluster)

so, to address this issue, the EACCES error is ignored when waiting
for a quorum.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2021-06-10 20:29:50 +08:00
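
The approach described in 7afd38f846 can be sketched roughly as below. This is an illustration only, not the code from tasks/ceph_manager: `CommandFailed`, `wait_for_quorum`, and the `probe` callable are hypothetical names, and the retry loop is hand-rolled rather than teuthology's safe_while().

```python
import errno
import time


class CommandFailed(Exception):
    """Hypothetical error carrying the exit status of a failed CLI call."""

    def __init__(self, exitstatus):
        super().__init__(f"command failed with status {exitstatus}")
        self.exitstatus = exitstatus


def wait_for_quorum(probe, tries=10, delay=1.0, sleep=time.sleep):
    """Call probe() until it reports quorum, tolerating EACCES meanwhile.

    probe is an assumed callable that returns True once the monitors are
    in quorum and raises CommandFailed when the CLI call itself fails.
    """
    for _ in range(tries):
        try:
            if probe():
                return True
        except CommandFailed as e:
            # EACCES (errno 13) may only mean the rotating keys have not
            # been refreshed yet; any other status is treated as real.
            if e.exitstatus != errno.EACCES:
                raise
        sleep(delay)
    raise TimeoutError("monitors did not form a quorum in time")
```

Only the permission-denied status is swallowed, so unrelated failures still surface immediately instead of being hidden by the retry loop.
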
Kefu Chai
3908c1f4cd tasks/ceph_manager: use safe_while() to refactor the wait for quorum
for better readability

Signed-off-by: Kefu Chai <kchai@redhat.com>
2021-06-10 20:29:50 +08:00
Kamoltat
4b00f1c2bd pybind/mg/progress: Disregard unreported pgs
The global recovery event progress calculations only
take into account pgs with `reported_epoch < start_epoch_of_event`,
but sometimes the pgs don't get moved before or after the creation
of the global recovery event. This can result in a bug
where the global event gets stuck forever unless another
event specifically makes the stuck pgs move and update
their `reported_epoch`.

Therefore, we decided to disregard pgs that are in active+clean state
but have `reported_epoch < start_epoch_of_event`.

Fixes: https://tracker.ceph.com/issues/49988

Signed-off-by: Kamoltat <ksirivad@redhat.com>
2021-06-09 15:11:32 +00:00
Patrick Donnelly
0f505dc299
qa: update scrub start code to use comma sep scrubopts
The documentation specifies this in [1] and yet we were using (I
believe) an older syntax:

    ceph tell mds.foo:0 scrub start / recursive force

instead of

    ceph tell mds.foo:0 scrub start / recursive,force

Oddly the former works at least as recently as in [2]:

    2021-06-03T07:11:42.071 DEBUG:teuthology.orchestra.run.smithi025:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph tell mds.1:0 scrub start / recursive force
    ...
    2021-06-03T07:11:42.268 INFO:teuthology.orchestra.run.smithi025.stdout:{
    2021-06-03T07:11:42.268 INFO:teuthology.orchestra.run.smithi025.stdout:    "return_code": 0,
    2021-06-03T07:11:42.268 INFO:teuthology.orchestra.run.smithi025.stdout:    "scrub_tag": "cf7a74b2-3eb2-4657-9274-ea504b1ebf8f",
    2021-06-03T07:11:42.269 INFO:teuthology.orchestra.run.smithi025.stdout:    "mode": "asynchronous"
    2021-06-03T07:11:42.269 INFO:teuthology.orchestra.run.smithi025.stdout:}

[1] https://docs.ceph.com/en/latest/cephfs/scrub/
[2] /ceph/teuthology-archive/pdonnell-2021-06-03_03:40:33-fs-wip-pdonnell-testing-20210603.020013-distro-basic-smithi/6148097/teuthology.log

Fixes: https://tracker.ceph.com/issues/51146
See-also: https://tracker.ceph.com/issues/51145
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-06-09 07:23:05 -07:00
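
The syntax change in 0f505dc299 amounts to passing the scrub options as one comma-separated argument instead of several space-separated ones. A minimal sketch of building such a command line follows; `scrub_start_args` is a hypothetical helper and not part of the qa code.

```python
def scrub_start_args(mds, path, scrubopts=()):
    """Build the argument list for 'ceph tell <mds> scrub start'.

    Options are joined with commas into a single argument, matching the
    documented form 'scrub start / recursive,force'.
    """
    args = ["ceph", "tell", mds, "scrub", "start", path]
    if scrubopts:
        args.append(",".join(scrubopts))
    return args


print(scrub_start_args("mds.foo:0", "/", ["recursive", "force"]))
# ['ceph', 'tell', 'mds.foo:0', 'scrub', 'start', '/', 'recursive,force']
```
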
Sebastian Wagner
1f6b4744b5 qa: Upgrade to mypy 0.901
mypy 0.9 now requires stub packages

Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
2021-06-09 12:53:21 +02:00
Ernesto Puerta
6465b9a254
Merge pull request #41123 from rhcs-dashboard/host-addr-and-labels
mgr/dashboard: Include Network address and labels on Host Creation form

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
2021-06-09 10:23:34 +02:00
Sage Weil
b18427da4b Merge PR #41509 into master
* refs/pull/41509/head:
	common/cmdparse: fix CephBool validation for tell commands
	mgr/nfs: fix 'nfs export create' argument order
	common/cmdparse: emit proper json
	mon/MonCommands: add -- seperator to example
	qa/tasks/cephfs/test_nfs: fix export create test
	mgr: make mgr commands compat with pre-quincy mon
	doc/_ext/ceph_commands: handle non-positional args in docs
	mgr: fix reweight-by-utilization cephbool flag
	mon/MonCommands: convert some CephChoices to CephBool
	mgr/k8sevents: fix help strings
	pybind/mgr/mgr_module: fix help desc formatting
	mgr/orchestrator: clean up 'orch {daemon add,apply} rgw' args
	mgr/orchestrator: add end_positional to a few methods
	mgr/orchestrator: reformat a few methods
	pybind/ceph_argparse: stop parsing when we run out of positional args
	pybind/ceph_argparse: remove dead code
	pybind/mgr/mgr_module: infer non-positional args
	pybind/mgr/mgr_module: add separator for non-positional args
	command/cmdparse: use -- to separate positional from non-positional args
	pybind/ceph_argparse: adjust help text for non-positional args
	pybind/ceph_argparse: track a 'positional' property on cli args

Reviewed-by: Kefu Chai <kchai@redhat.com>
2021-06-07 10:02:52 -04:00
Nizamudeen A
7c1df692f2 mgr/dashboard: Include Network address and labels on Host Creation form
Adds the ability to create a host by specifying its network address and
to create labels.

https://tracker.ceph.com/issues/50318
Signed-off-by: Nizamudeen A <nia@redhat.com>
2021-06-07 14:47:09 +05:30
Patrick Donnelly
88f74dbfa6
qa: deep merge top level overrides for fuse/kclient
This allows for array/dict configs like mntopts to accumulate changes
from multiple yaml fragments.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-06-04 19:15:12 -07:00
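
The accumulation described in 88f74dbfa6 can be illustrated with a small recursive merge. This is a sketch under the assumption that lists accumulate by concatenation; `deep_merge` here is a stand-alone hypothetical function, not the helper the qa code uses, and the fragment contents are illustrative values only.

```python
import copy


def deep_merge(base, override):
    """Return base merged with override without mutating either input.

    Dicts are merged key by key, lists are concatenated, and any other
    value in override replaces the one in base.
    """
    if isinstance(base, dict) and isinstance(override, dict):
        merged = copy.deepcopy(base)
        for key, value in override.items():
            if key in merged:
                merged[key] = deep_merge(merged[key], value)
            else:
                merged[key] = copy.deepcopy(value)
        return merged
    if isinstance(base, list) and isinstance(override, list):
        return copy.deepcopy(base) + copy.deepcopy(override)
    return copy.deepcopy(override)


# Two illustrative yaml fragments, already parsed into dicts.
fragment_a = {"kclient": {"mntopts": ["nowsync"]}}
fragment_b = {"kclient": {"mntopts": ["ms_mode=crc"], "debug": True}}
print(deep_merge(fragment_a, fragment_b))
# {'kclient': {'mntopts': ['nowsync', 'ms_mode=crc'], 'debug': True}}
```

With a shallow merge, the second fragment's `mntopts` would replace the first one outright; merging recursively is what lets both fragments contribute.
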
Sage Weil
8683cccd06 qa/tasks/cephfs/test_nfs: fix export create test
Everything after --readonly is non-positional.

Signed-off-by: Sage Weil <sage@newdream.net>
2021-06-04 16:56:17 -04:00
Sage Weil
c8c5071dcd qa/tasks/cephfs/test_sessionmap: reap connections immediately
We have to reap connections promptly for this test to work.

This test was broken indirectly by d51d80b323,
which moved the counter decrement to reap time instead of mark_down/stop
time.

The reaping is asynchronous, so allow for a delay in the count change.

Fixes: https://tracker.ceph.com/issues/50622
Signed-off-by: Sage Weil <sage@newdream.net>
2021-06-04 11:02:29 -04:00
Kefu Chai
dba26fc7a8
Merge pull request #41652 from tchaikov/wip-qa-asock-or
qa/tasks/admin_socket: support "foo || bar" as command

Reviewed-by: Samuel Just <sjust@redhat.com>
2021-06-04 13:50:38 +08:00
Patrick Donnelly
a12db7941b
Merge PR #41499 into master
* refs/pull/41499/head:
	qa/tasks/mds_thrash: fix thrash iteration never skip

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-06-03 13:33:27 -07:00
Patrick Donnelly
4e1f812461
Merge PR #39910 into master
* refs/pull/39910/head:
	test: Add test for mgr hang when osd is full
	mgr: Set client_check_pool_perm to false
	mds: Add full caps to avoid osd full check

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-06-03 13:22:23 -07:00
Neha Ojha
11252f6117
Merge pull request #41308 from sseshasa/wip-osd-benchmark-for-mclock
osd: Run osd bench test to override default max osd capacity for mclock

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2021-06-03 08:39:22 -07:00
Kefu Chai
83e4edcd80 qa/tasks/admin_socket: support "foo || bar" as command
so we can cater to the needs of different implementations of osd, i.e.,
classic osd and crimson osd. they offer different sets of asock commands.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2021-06-03 14:23:46 +08:00
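
One plausible reading of the "foo || bar" form in 83e4edcd80 is "try each alternative in order and keep the first that works". The sketch below illustrates that reading only; `run_first_available` and the `run` callable are hypothetical and do not reflect the actual qa/tasks/admin_socket implementation.

```python
def run_first_available(command, run):
    """Try each '||'-separated alternative until one succeeds.

    run is an assumed callable that executes one command string and
    raises an exception if the daemon does not support it.
    """
    alternatives = [c.strip() for c in command.split("||")]
    last_error = None
    for candidate in alternatives:
        try:
            return run(candidate)
        except Exception as e:  # deliberately broad in this sketch
            last_error = e
    # Every alternative failed: surface the most recent error.
    raise last_error
```
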
Patrick Donnelly
5871240363
Merge PR #41635 into master
* refs/pull/41635/head:
	qa: increase fragmentation to improve uniform distribution

Reviewed-by: Ramana Raja <rraja@redhat.com>
2021-06-02 08:18:22 -07:00
Sridhar Seshasayee
328271d587 qa/tasks: Enhance wait_until_true() to check & retry recovery progress
With the mclock scheduler enabled, the recovery throughput is throttled
based on factors such as the type of mclock profile enabled and the OSD
capacity, among others. As a result, recovery times may vary and the
existing timeout of 120 secs may not be sufficient.

To address the above, a new method called _is_inprogress_or_complete() is
introduced in the TestProgress Class that checks if the event with the
specified 'id' is in progress by checking the 'progress' key of the
progress command response. This method also handles the corner case where
the event completes just before it's called.

The existing wait_until_true() method in the CephTestCase Class is
modified to accept another function argument called "check_fn". This is
set to the _is_inprogress_or_complete() function described earlier in the
"test_turn_off_module" test, which has been observed to fail for the
reasons described above. A retry mechanism of a maximum of 5
attempts is introduced after the first timeout is hit. This means that
the wait can extend up to a maximum of 600 secs (120 secs * 5) as long as
there is recovery progress reported by the 'ceph progress' command result.

Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
2021-06-02 14:19:48 +05:30
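
The retry behaviour described in 328271d587 can be sketched as follows. This is a simplified stand-alone version, not the CephTestCase method: the `period`, `max_retries`, and `sleep` parameters and the return value are assumptions made for illustration.

```python
import time


def wait_until_true(condition, timeout, check_fn=None, period=5,
                    max_retries=5, sleep=time.sleep):
    """Poll condition() until it is true or the time budget runs out.

    When the timeout is hit and check_fn() reports progress, the wait is
    restarted, up to max_retries attempts in total.
    """
    attempts = 1
    elapsed = 0
    while not condition():
        if elapsed >= timeout:
            if check_fn is not None and attempts < max_retries and check_fn():
                # Progress is still being made: start another window.
                attempts += 1
                elapsed = 0
                continue
            raise TimeoutError(
                f"timed out after {attempts} attempt(s) of {timeout}s")
        sleep(period)
        elapsed += period
    return attempts
```

With `timeout=120` and `max_retries=5`, the total wait is bounded by 600 seconds, and it only extends past the first 120 seconds while `check_fn` keeps reporting progress.
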
Yuval Lifshitz
679ddf5d11
Merge pull request #41026 from TRYTOBE8TME/wip-rgw-rabbitmq
qa/tasks: Adding RabbitMQ task for bucket notification tests
2021-06-02 07:47:39 +03:00
Patrick Donnelly
1c40ee32f9
qa: increase fragmentation to improve uniform distribution
Fixes: https://tracker.ceph.com/issues/51060
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-06-01 20:25:17 -07:00
Kalpesh
2e0b8a2a1f qa/tasks: Adding RabbitMQ task for bucket notification tests
This commit mainly adds the RabbitMQ task, which is a required and supported endpoint in bucket notification tests,
along with some related changes in the AMQP tests. Major changes are:
1. Addition of RabbitMQ task
2. Documentation update for the steps to execute AMQP tests
3. Addition of attributes to the tests
4. Tox dependency removal from kafka.py

Signed-off-by: Kalpesh Pandya <kapandya@redhat.com>
2021-06-01 23:34:31 +05:30
Ilya Dryomov
dcd193c35e qa/tasks/qemu: precise repos have been archived
Fixes: https://tracker.ceph.com/issues/51033
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2021-06-01 12:54:16 +02:00
Milind Changire
26fbbefa82
Merge pull request #40831 from vshankar/wip-cephfs-mirror-incremental-sync
cephfs-mirror: incremental sync

Reviewed-by: Milind Changire <mchangir@redhat.com>
2021-05-27 13:39:23 +05:30
Sage Weil
9ab9cc26e2 Merge PR #41007 into master
* refs/pull/41007/head:
	qa/tasks/cephfs/test_nfs: fix info test
	doc/cephfs/fs-nfs-exports: document --ingress --virtual-ip
	mgr/nfs: move ingress vs virtual_ip check to cluster interface
	PendingReleaseNotes: clarify deprecated
	PendingReleaseNotes: note breaking CLI changes
	doc/cephadm/nfs: document nfs+ingress
	qa/suites/rados/cephadm/smoke-roleless: test nfs, nfs + ingress
	mgr/nfs: take --ingress argument to 'nfs cluster create'
	mgr/cephadm: adjust debug output for device refresh
	mgr/cephadm: ingress: fix log msg
	mgr/cephadm: fix logging of config/placement errors
	common/options: enable nfs module for new clusters
	cephadm: --stop-signal=SIGTERM
	mgr/orchestrator: default nfs pool, namespaces
	mgr/cephadm: nfs: create pool if it doesn't yet exist
	doc/cephadm/nfs: update
	mgr/nfs: change 'nfs cluster info'
	mgr/nfs: take optional virtual_ip for deploying ingress
	mgr/nfs: remove 'nfs cluster update'
	mgr/nfs: factor out ganesha pool creation
	mgr/nfs: delete -> rm for CLI
	mgr/nfs: add some type annotations
	python-common: fix IngressSpec yaml dump
	mgr/cephadm: ingress: remove eth0 default
	qa/tasks/cephadm: allow mounting volumes in shell
	cephadm: add -v arg to shell
	qa/tasks/vip: add 'vip.exec' task
	mgr/orchestrator: add --port arg to 'orch apply nfs'
	mgr/cephadm: nfs: add purge
	mgr/cephadm: ingress: support nfs
	mgr/cephadm: do not reconfigure daemons on deleted services
	mgr/cephadm: nfs: shell out to rados tool for conf creation
	mgr/cephadm: nfs: add rank to grace file from mgr module
	mgr/cephadm: nfs: bind ganesha to appropriate ip:port
	mgr/cephadm: enable ranked daemons for nfs
	mgr/cephadm: support creation of daemons with ranks
	mgr/cephadm: make _plan show removed daemon names
	mgr/cephadm/schedule: assign/map ranks
	mgr/cephadm: add rank[_generation] properties
	mgr/cephadm/inventory: store optional rank_map along with specs
	mgr/cephadm: include service_name is generated DaemonDescription
	mgr/orchestrator: include service_name in DaemonDescription dump
	mgr/cephadm/inventory: fix deleted check
	mgr/cephadm: simplify
	mgr/cephadm/schedule: make placement shuffle deterministic
	mgr/cephadm: document CephadmService flags

Reviewed-by: Michael Fritch <mfritch@suse.com>
Reviewed-by: Varsha Rao <varao@redhat.com>
2021-05-25 16:17:44 -04:00
Sage Weil
218eec938d qa/tasks/cephfs/test_nfs: fix info test
Signed-off-by: Sage Weil <sage@newdream.net>
2021-05-25 10:15:45 -04:00
Venky Shankar
da9788a9b5 test: add test to verify incremental snapshot updates
Fixes: http://tracker.ceph.com/issues/49939
Signed-off-by: Venky Shankar <vshankar@redhat.com>
2021-05-25 08:45:44 -04:00
Lianne
2b50cefa89 qa/tasks/mds_thrash: fix thrash iteration never skip
Signed-off-by: Lianne <liyan.wang@xtaotech.com>
2021-05-24 17:17:44 +08:00
sunilkumarn417
364fb5899b qa/tasks/cephadm: Include bootstrap registry options for downstream
- registry-url, registry-username and registry-password bootstrap options are
supported now. This is needed to access monitoring service container images.
- usage of RHEL distribution based cephadm in download_cephadm task.

Signed-off-by: sunilkumarn417 <sunnagar@redhat.com>
2021-05-24 12:38:54 +05:30
Kotresh HR
2bd6ba8026 test: Add test for mgr hang when osd is full
Add fs suite for tests requiring one node as well.

Fixes: https://tracker.ceph.com/issues/50532
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Signed-off-by: Kotresh HR <khiremat@redhat.com>
2021-05-22 19:09:07 +05:30
Patrick Donnelly
285e1547de
qa: remove obsolete deactivate routines
This is handled automatically since Mimic.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-05-21 12:52:10 -07:00
Sage Weil
8b95c4b7c5 qa/tasks/cephadm.conf: log_to_journald=false
For teuthology runs, we set log_to_stderr=false, so that we only see
derr-level events in the container log (and teuthology.log).  Now that we
log directly to journald, set log_to_journald=false too, so that we don't
see level-20 logs in teuthology.log.

Signed-off-by: Sage Weil <sage@newdream.net>
2021-05-21 09:54:46 -04:00
Kotresh HR
7588f98505 mgr/volumes: Add config to insert delay at the beginning of the clone
Added the config 'delay_snapshot_clone' to insert delay at the beginning
of the clone to avoid races in tests. The default value is set to 0.

Fixes: https://tracker.ceph.com/issues/48231
Signed-off-by: Kotresh HR <khiremat@redhat.com>
2021-05-21 12:35:07 +05:30
Sage Weil
ad0a62dc9d mgr/nfs: delete -> rm for CLI
The rest of the CLI uses 'rm' in place of 'remove' or 'delete', so let's
deprecate 'delete' and add 'rm'.

Signed-off-by: Sage Weil <sage@newdream.net>
2021-05-20 18:38:18 -04:00
Patrick Donnelly
21b69fda3c
Merge PR #41084 into master
* refs/pull/41084/head:
	test: test to verify dir path removal when no mirror daemons are running
	pybind/mirroring: advance state machine from stalled state
	pybind/mirroring: start from correct state during policy init

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-05-20 12:42:19 -07:00
Sage Weil
b711a75277 qa/tasks/cephadm: allow mounting volumes in shell
Signed-off-by: Sage Weil <sage@newdream.net>
2021-05-19 08:43:14 -04:00
Sage Weil
54542fdaab qa/tasks/vip: add 'vip.exec' task
Signed-off-by: Sage Weil <sage@newdream.net>
2021-05-19 08:43:14 -04:00
Sage Weil
b9d8dc483a Merge PR #41286 into master
* refs/pull/41286/head:
	qa/suites/orch/rook: disable centos for now
	qa/suites/orch/rook/smoke: initial smoke suite
	qa/tasks/rook: ROOK_HOSTPATH_REQUIRES_PRIVILEGED=true on centos
	qa/tasks/rook: simplify shutdown
	qa/tasks/rook: archive logs
	qa/tasks/rook: more orderly cluster teardown
	qa/tasks/rook: deploy ceph via rook on top of kubernetes
	qa/tasks/kubeadm: install kubernetes with kubeadm
	qa/suites: move rados/cephadm -> orch/cephadm; symlink
	qa/tasks/cephadm: add whitespace between functions
	qa/tasks/cephadm: clean up ctx.manager setup

Reviewed-by: Sébastien Han <seb@redhat.com>
2021-05-19 07:55:30 -04:00
Patrick Donnelly
e9f7fafe52
Merge PR #41171 into master
* refs/pull/41171/head:
	test: disable mirroring module for certain tests

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-05-18 13:41:28 -07:00