129190 Commits

Author SHA1 Message Date
Ilya Dryomov
303d3ede48 test/rbd_mirror: drop redundant MockJournaler instances
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2022-01-21 17:32:40 +01:00
Ilya Dryomov
d634a1df5b rbd-mirror: fix races in snapshot-based mirroring deletion propagation
When remote image is deleted, rbd-mirror can encounter three cases:

  1) no remote image id
  2) no remote mirror metadata
  3) MIRROR_IMAGE_STATE_DISABLING in remote mirror metadata

Commit d4c66ac5c6 ("rbd-mirror: fix issue with snapshot-based
mirroring deletion propagation") fixed case 1.  Cases 2 and 3 remained
broken because for both of them finalize_snapshot_state_builder() would
populate not only remote_mirror_peer_uuid but also remote_image_id,
thus disabling ENOLINK logic in handle_prepare_remote_image() and
handle_bootstrap().  Commit ff60aec2d9 ("rbd-mirror: fix bootstrap
sequence while the image is removed") touched on case 3, but it made
a difference only for journal-based mirroring.

Stop calling finalize_snapshot_state_builder() on errors.  Instead,
align with journal-based mirroring by filling remote_mirror_peer_uuid
together with remote_mirror_uuid.

Fixes: https://tracker.ceph.com/issues/53963
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2022-01-21 17:32:36 +01:00
Ilya Dryomov
ccfbf3e97e rbd-mirror: don't default replay_requires_remote_image() implementation
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2022-01-21 14:31:07 +01:00
Ilya Dryomov
f49fa483ec rbd-mirror: untangle StateBuilder::is_linked() overloads
Make it clear that the local image non-primariness is asserted
independent of the mode; avoid the default implementation being
overridden but still relied on by both modes.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2022-01-21 14:31:07 +01:00
Ilya Dryomov
baf57925ab rbd-mirror: drop redundant initialization of StateBuilder members
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2022-01-21 14:31:07 +01:00
Ilya Dryomov
4e5c45fb20
Merge pull request #44601 from trociny/wip-53888
cls/journal: skip disconnected clients when calculating min_commit_position

Reviewed-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@cern.ch>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
2022-01-19 16:36:22 +01:00
Ilya Dryomov
3f3bdb9314
Merge pull request #44381 from ly798/add-ns-to-snap-dump
cls/rbd: add namespace to snapshot dump results

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
2022-01-19 16:31:45 +01:00
Kamoltat Sirivadhna
ab097f88b4
Merge pull request #44553 from kamoltat/wip-ksirivad-progress-mgr-enforce-key-error-exception
pybind/mgr/progress: enforced try and except on event dictionary
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
2022-01-19 10:24:07 -05:00
Ernesto Puerta
2791a2be2b
Merge pull request #44024 from rhcs-dashboard/fix-53334-master
mgr/dashboard: Improve notifications for osd nearfull, full

Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
2022-01-19 14:03:42 +01:00
Ernesto Puerta
83dd339108
Merge pull request #44238 from rhcs-dashboard/fix-51575-fix
mgr/dashboard: Notification banners at the top of the UI have fixed height

Reviewed-by: Waad Alkhoury <walkhour@redhat.com>
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
Reviewed-by: Volker Theile <vtheile@suse.com>
2022-01-19 13:35:03 +01:00
Ernesto Puerta
3e6afc33ca
Merge pull request #44332 from rhcs-dashboard/grafana-warnings
monitoring/grafana: Replace missing legendFormat warning with error.

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
2022-01-19 13:18:31 +01:00
Sebastian Wagner
2c17af2eb5
Merge pull request #44600 from phlogistonjohn/jjm-mirror-service-module
mgr/cephadm: auto-enable mirroring module when deploying service

Reviewed-by: Michael Fritch <mfritch@suse.com>
2022-01-19 12:16:29 +01:00
Sebastian Wagner
1b0a195bef
Merge pull request #44647 from melissa-kun-li/remove-duplicate-deployment-doc
doc/cephadm: remove duplicate deployment scenario section

Reviewed-by: Adam King <adking@redhat.com>
2022-01-19 12:14:10 +01:00
Aashish Sharma
f771cd492c mgr/dashboard: Improve notifications for osd nearfull, full
This PR adds some visual hints for osds that are near full or full

Fixes: https://tracker.ceph.com/issues/53334
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
2022-01-19 16:35:27 +05:30
Yuri Weinstein
6fc395c98a
Merge pull request #44430 from 5cs/fix-check-pg-num-underflow
mon/OSDMonitor: fix integer underflow of check_pg_num

Reviewed-by: Kefu Chai <kchai@redhat.com>
2022-01-18 14:39:11 -08:00
Melissa Li
2222f26a37 doc/cephadm: remove duplicate deployment scenario section
Signed-off-by: Melissa Li <melissali@redhat.com>
2022-01-18 16:53:04 -05:00
Yuri Weinstein
8812674d13
Merge pull request #43732 from pdvian/wip-limit-slowop
osd/OSD: Log aggregated slow ops detail to cluster logs

Reviewed-by: Ronen Friedman <rfriedma@redhat.com>
2022-01-18 12:33:07 -08:00
Ernesto Puerta
86590ea131
Merge pull request #44578 from rhcs-dashboard/fix-53843-master
qa/dashboard: ensure node 16 is installed

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: David Galloway <dgallowa@redhat.com>
Reviewed-by: Laura Flores <lflores@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
Reviewed-by: sebastian-philipp <NOT@FOUND>
Reviewed-by: Yuri Weinstein <yweins@redhat.com>
2022-01-18 20:23:07 +01:00
Ernesto Puerta
104e17fb32
Merge pull request #44607 from rhcs-dashboard/fix-autopep8-qa-tasks
mgr/dashboard: include autopep8 for dashboard qa tasks

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
2022-01-18 20:11:07 +01:00
Ernesto Puerta
5a6965dc00
Merge pull request #44575 from rhcs-dashboard/dashboard-cephadm-more-stability
mgr/dashboard: Refactoring dashboard cephadm checks

Reviewed-by: Waad Alkhoury <walkhour@redhat.com>
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
2022-01-18 20:02:25 +01:00
Ernesto Puerta
197987a5a8
Merge pull request #42603 from cypherean/feedback_frontend
mgr/dashboard: report ceph tracker bug/feature through GUI

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
Reviewed-by: Volker Theile <vtheile@suse.com>
2022-01-18 19:47:13 +01:00
John Mulligan
bcb4fa70f9 mgr/cephadm: add a test for enabling cephfs mirroring module
Add a test that checks that when cephfs mirror service is enabled
the mirroring mgr module gets enabled.

Actually-written-by: Sebastian Wagner <sewagner@redhat.com>
Signed-off-by: John Mulligan <jmulligan@redhat.com>
2022-01-18 13:34:25 -05:00
John Mulligan
e030130fd1 mgr/cephadm: auto-enable mirroring module when deploying service
Automatically enable the mgr's mirroring module when creating
cephfs-mirror services. This will trigger a mgr respawn.

Fixes: https://tracker.ceph.com/issues/50593

Based roughly on 50dc1d0dec

Signed-off-by: John Mulligan <jmulligan@redhat.com>
2022-01-18 13:34:25 -05:00
Casey Bodley
e11654d130
Merge pull request #33934 from cbodley/wip-40177
rgw: delete full sync index before switching to incremental sync

Signed-off-by: Shilpa Jagannath <smanjara@redhat.com>
2022-01-18 13:22:18 -05:00
Casey Bodley
12301f63df
Merge pull request #40011 from cbodley/wip-49723
rgw: allow rgw_data_notify_interval_msec=0 to disable notifications

Reviewed-by: Shilpa Jagannath <smanjara@redhat.com>
Reviewed-by: Adam C. Emerson <aemerson@redhat.com>
2022-01-18 13:21:06 -05:00
Casey Bodley
82db5e0bdb
Merge pull request #44581 from cbodley/wip-53177
rgw/dbstore: hide dbstore_log.h from rgw_main.cc

Reviewed-by: Soumya Koduri <skoduri@redhat.com>
2022-01-18 13:20:15 -05:00
Casey Bodley
7cc7eb7273
Merge pull request #44413 from cybozu/rgw-fix-error-code-of-remove-bucket-api
rgw: remove bucket API returns NoSuchKey than NoSuchBucket

Reviewed-by: Sébastien Han <seb@redhat.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>
2022-01-18 13:19:53 -05:00
Casey Bodley
e96ae2f363
Merge pull request #44078 from cbodley/wip-rgw-multisite-metadata-retry-error
rgw/multisite: metadata sync only retries on errors

Reviewed-by: Adam C. Emerson <aemerson@redhat.com>
2022-01-18 13:18:46 -05:00
Neha Ojha
3e6b8ef30f
Merge pull request #44602 from neha-ojha/wip-qct-remove
doc/foundation.rst: qct is no longer a member

Reviewed-by: Dan van der Ster <daniel.vanderster@cern.ch>
2022-01-18 09:17:59 -08:00
Nizamudeen A
b6759b75c9 mgr/dashboard: Refactoring dashboard cephadm checks
I isolated all the test suites into their respective files
so that it is easier to add more tests in the future.

I also gave priority to the host actions.

Create OSD checks are now written so that OSDs are created
only on the intended hosts. This makes the host draining
process easier and less time-consuming.

Also tried to address the flaky force maintenance checks.

Removed some duplicated code.

Improved the service creation part to reduce the time it
takes to complete.

Fixes: https://tracker.ceph.com/issues/53905
Signed-off-by: Nizamudeen A <nia@redhat.com>
2022-01-18 21:45:26 +05:30
Sebastian Wagner
24aab16cd0
Merge pull request #44505 from guits/fix-53812
ceph-volume: fix regression introduced via #43536

Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Teoman Onay <tonay@redhat.com>
2022-01-18 16:31:49 +01:00
Sebastian Wagner
7842825a2e
Merge pull request #44489 from adk3798/agent-down-alerts
mgr/cephadm: still check agent deps if it is marked down
2022-01-18 16:23:40 +01:00
Pere Diaz Bou
57c26311de monitoring/grafana: replace filestore osd count
Signed-off-by: Pere Diaz Bou <pdiazbou@redhat.com>
2022-01-18 14:14:41 +01:00
Pere Diaz Bou
a3cf5c5e9f monitoring/grafana: use Path class instead of split
Signed-off-by: Pere Diaz Bou <pdiazbou@redhat.com>
2022-01-18 13:24:12 +01:00
Pere Diaz Bou
1e4d85d04f monitoring/grafana: remove explicit str casting
Signed-off-by: Pere Diaz Bou <pdiazbou@redhat.com>
2022-01-18 13:24:12 +01:00
Pere Diaz Bou
2b4f3561d2 monitoring/grafana: add generated json files
Signed-off-by: Pere Diaz Bou <pdiazbou@redhat.com>
2022-01-18 13:24:12 +01:00
Pere Diaz Bou
b381a83e9b monitoring/grafana: ValueError instead of RuntimeError
Signed-off-by: Pere Diaz Bou <pdiazbou@redhat.com>
2022-01-18 13:24:12 +01:00
Pere Diaz Bou
4c302234ff monitoring/grafana: Replace missing legendFormat warning with error
Signed-off-by: Pere Diaz Bou <pdiazbou@redhat.com>
2022-01-18 13:24:10 +01:00
Guillaume Abrioux
5c0f0698a5 qa/cephadm: install hwe kernel only for focal
Let's install the hwe kernel only on Ubuntu focal; otherwise we only shift the
problem to Ubuntu bionic, given that the hwe kernel for bionic is 5.4.

Fixes: https://tracker.ceph.com/issues/53863

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-01-18 10:36:24 +01:00
Samuel Just
51a347456d
Merge pull request #44591 from athanatos/sjust/wip-seastore-flush
crimson/os/seastore: avoid empty Transactions by adding explicit flush() call

Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
2022-01-17 19:36:10 -08:00
Samuel Just
7370d7ded8
Merge pull request #44556 from cyx1231st/wip-crimson-improve-log-journal
crimson/os/seastore: consolidate seastore_journal logs with cleanup and validations

Reviewed-by: Samuel Just <sjust@redhat.com>
2022-01-17 13:19:02 -08:00
Casey Bodley
3b93654d6e rgw: clean up index after full metadata sync
Fixes: https://tracker.ceph.com/issues/40177

Signed-off-by: Casey Bodley <cbodley@redhat.com>
2022-01-17 15:53:26 -05:00
Casey Bodley
dd6bf0b5a8 rgw: clean up index after full data sync
Fixes: https://tracker.ceph.com/issues/40177

Signed-off-by: Casey Bodley <cbodley@redhat.com>
2022-01-17 15:53:26 -05:00
Samuel Just
7c4b3cc7fa crimson/os/seastore: implement FuturizedStore::flush
Signed-off-by: Samuel Just <sjust@redhat.com>
2022-01-17 20:50:58 +00:00
Waad AlKhoury
ea55a0b33d mgr/dashboard: Notification banners at the top of the UI have fixed height
Fixes: https://tracker.ceph.com/issues/51575
Signed-off-by: Waad AlKhoury <walkhour@redhat.com>
2022-01-17 20:45:58 +01:00
Casey Bodley
52bfa9a866 qa/rgw: run multisite tests with some async notifications disabled
disable the sending of async datalog notifications on one zone per
cluster. this helps to verify that tests don't rely on notifications to
succeed

Signed-off-by: Casey Bodley <cbodley@redhat.com>
2022-01-17 13:54:39 -05:00
Casey Bodley
bf0a4ef1aa rgw: allow rgw_data_notify_interval_msec=0 to disable notifications
the data changes log for multisite will occasionally broadcast recent
changes to other zones, which they can use to prioritize sync of some
of the most recent changes. they'll eventually see all changes as they
replay the data changes log, though, so notifications aren't required
for successful sync. the ability to turn them off is useful for testing

Fixes: https://tracker.ceph.com/issues/49723

Signed-off-by: Casey Bodley <cbodley@redhat.com>
2022-01-17 13:54:36 -05:00
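
The behavior described in bf0a4ef1aa can be sketched as a loop that is skipped entirely when the interval is 0. This is a minimal Python illustration, not the rgw C++ code: only the option name `rgw_data_notify_interval_msec` comes from the commit, while `run_notifier`, its parameters, and the batch format are hypothetical.

```python
import time
from typing import Callable, Iterable, List


def run_notifier(
    interval_msec: int,
    batches: Iterable[List[str]],
    send: Callable[[List[str]], None],
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Broadcast each non-empty batch of recent changes, pausing between rounds.

    interval_msec stands in for rgw_data_notify_interval_msec (assumption:
    0 means "disabled", as the commit title states).
    Returns the number of batches that were broadcast.
    """
    if interval_msec == 0:
        # Notifications are only a hint; peers still see every change
        # when they replay the data changes log.
        return 0
    sent = 0
    for batch in batches:
        if batch:
            send(batch)
            sent += 1
        sleep(interval_msec / 1000.0)
    return sent
```

Passing `sleep` as a parameter keeps the sketch testable without real delays.
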
Mykola Golub
078d72e5e6 cls/journal: skip disconnected clients when finding min_commit_position
When a new journal client is registered, all already registered
clients are checked, and a client with min position is selected
as a position for the new client. Thus we may expect that
starting from the registered position all journal entries will be
available (not trimmed) for the new client.

But when looking for a min commit position, the client_register
function did not take into account that a registered client might
be in disconnected state, and in that case the journal entries
might be trimmed for this client.

Fixes: https://tracker.ceph.com/issues/53888
Signed-off-by: Mykola Golub <mgolub@suse.com>
2022-01-17 18:41:34 +00:00
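
The selection rule described in 078d72e5e6 can be sketched in a few lines. This is a minimal Python illustration, not the cls/journal C++ code: the `Client` record, the `ClientState` enum, and the plain integer positions are hypothetical stand-ins chosen for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Optional


class ClientState(Enum):
    CONNECTED = "connected"
    DISCONNECTED = "disconnected"


@dataclass
class Client:
    client_id: str
    state: ClientState
    commit_position: int  # simplified: a real commit position is not a plain int


def min_commit_position(clients: Iterable[Client]) -> Optional[int]:
    """Return the smallest commit position among connected clients only.

    Disconnected clients are skipped because entries at their position
    may already have been trimmed.
    """
    positions = [
        c.commit_position for c in clients if c.state is ClientState.CONNECTED
    ]
    return min(positions) if positions else None
```

With one connected client at position 40 and one disconnected client at position 5, the sketch selects 40, so a newly registered client would not start from a position whose entries may be gone.
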
Guillaume Abrioux
f8e22fb3da qa/nvme_loop: fix an issue on ubuntu 18.04
The following command:

```
echo /dev/sda | tee /sys/kernel/config/nvmet/subsystems/sda/namespaces/1/device_path
```

makes nvme_loop fail because fascinatingly, it adds an unexpected newline.

See:
```
/dev/sda
/dev/sda

1
tee: /sys/kernel/config/nvmet/subsystems/sda/namespaces/1/enable: No such file or directory
/dev/sda
1
```

Other distros don't have the same behavior:

```
CentOS 8
/dev/sda
/dev/sda
1

Ubuntu 20.04
/dev/sda
/dev/sda
1
```

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-01-17 17:10:08 +01:00
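
The commit text for f8e22fb3da does not say how the write was changed, so the following is only one possible way to avoid the stray newline: write the value directly and strip any trailing newline first. `write_attr` is a hypothetical helper, not a function from the qa task.

```python
def write_attr(path: str, value: str) -> None:
    """Write value to path without appending a trailing newline."""
    with open(path, "w") as f:
        # Unlike a plain `echo`, nothing is appended after the value.
        f.write(value.rstrip("\n"))
```

For the case in the commit message, `path` would be the `device_path` attribute under `/sys/kernel/config/nvmet/` and `value` would be `/dev/sda`.
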
Guillaume Abrioux
3c93ffdc92 ceph-volume: fix regression introduced via #43536
The recent changes from PR #43536 introduced a regression that prevents
ceph-volume from running in a containerized context on Ubuntu 18.04.

The path of the `lvs` binary differs between CentOS 8 and Ubuntu 18.04
(`/usr/sbin/lvs` and `/sbin/lvs` respectively). As a result, ceph-volume running
in a CentOS 8 container sees the `lvs` binary at `/usr/sbin/lvs` and tries to
run it with `nsenter` on the host, which is running Ubuntu 18.04 and has the
binary at `/sbin/lvs`.

Fixes: https://tracker.ceph.com/issues/53812

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 95e88cda3df76b59b548ae808df0ef7f19db1f63)
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-01-17 17:10:03 +01:00
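
The mismatch described in 3c93ffdc92 comes from resolving a binary inside the container and then running it on the host. The following is a minimal Python sketch of resolving the path against the host's filesystem instead. It is not the actual ceph-volume fix: `which_on_host`, the `host_root` parameter, and the directory list are assumptions made for illustration.

```python
import os
from typing import Optional, Sequence

# Hypothetical search order; a real tool would likely derive this from PATH.
DEFAULT_DIRS = (
    "/usr/local/sbin", "/usr/local/bin",
    "/usr/sbin", "/usr/bin",
    "/sbin", "/bin",
)


def which_on_host(
    binary: str,
    host_root: str = "/",
    dirs: Sequence[str] = DEFAULT_DIRS,
) -> Optional[str]:
    """Return the path of `binary` as the host sees it, or None if not found.

    host_root is where the host filesystem is visible from the caller
    (assumption: for example a bind mount inside the container).
    """
    for d in dirs:
        candidate = os.path.join(host_root, d.lstrip("/"), binary)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            # Report the host-relative path, which is what a command
            # entered via nsenter would need.
            return os.path.join(d, binary)
    return None
```

On a host laid out like the Ubuntu 18.04 case in the commit message, this lookup would return `/sbin/lvs` regardless of where the container's own copy lives.
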