Commit Graph

131702 Commits

Guillaume Abrioux
e720a658d6 cephadm: fix osd adoption with custom cluster name
When adopting Ceph OSD containers from a Ceph cluster with a custom name, the
adoption fails because the custom name isn't propagated to unit.run.
The idea here is to change the lvm metadata and enforce 'ceph.cluster_name=ceph',
given that cephadm doesn't support custom names anyway.

Fixes: https://tracker.ceph.com/issues/55654

Signed-off-by: Adam King <adking@redhat.com>
Co-authored-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-06-01 18:48:46 +02:00
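
A minimal sketch of the workaround described above, assuming the cluster name is
stored as an LVM tag (ceph.cluster_name=<name>) on the OSD logical volumes, as
ceph-volume does; the helper, the JSON layout of the lvs report, and the tag handling
are illustrative, not the actual cephadm code:

    import json
    import subprocess

    def force_default_cluster_name(custom_name: str) -> None:
        """Rewrite ceph.cluster_name=<custom_name> LVM tags to 'ceph' on OSD LVs."""
        out = subprocess.check_output(
            ["lvs", "-o", "lv_path,lv_tags", "--reportformat", "json"])
        for lv in json.loads(out)["report"][0]["lv"]:
            old_tag = "ceph.cluster_name=%s" % custom_name
            if old_tag in lv["lv_tags"].split(","):
                # drop the custom name and enforce the default the containers expect
                subprocess.check_call(
                    ["lvchange", "--deltag", old_tag,
                     "--addtag", "ceph.cluster_name=ceph", lv["lv_path"]])
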
Ilya Dryomov
04567650ca
Merge pull request #46474 from idryomov/wip-rbd-codeowners
CODEOWNERS: add RBD team

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
2022-06-01 18:29:38 +02:00
Ilya Dryomov
00a44f1c6b CODEOWNERS: add RBD team
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2022-06-01 13:37:21 +02:00
Xuehan Xu
bf2213f89d crimson/os/seastore/segment_cleaner: retrieve different live extents in parallel
Signed-off-by: Xuehan Xu <xxhdx1985126@gmail.com>
2022-06-01 16:28:58 +08:00
Redouane Kachach
0e7a4366c0
mgr/cephadm: capture exception when not able to list upgrade tags
Fixes: https://tracker.ceph.com/issues/55801

Signed-off-by: Redouane Kachach <rkachach@redhat.com>
2022-06-01 10:19:38 +02:00
Yaarit Hatuka
63f5dcdb52 mgr/telemetry: add Rook data
Add the first Rook data collection to telemetry's basic channel.

We choose to nag with this collection since we wish to know the volume
of Rook deployments in the wild.

The next Rook collections should have consecutive numbers (basic_rook_v02,
basic_rook_v03, ...).

See tracker below for more details.

Fixes: https://tracker.ceph.com/issues/55740
Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
2022-06-01 04:46:17 +00:00
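
A hedged sketch of what registering such a collection might look like; the Collection
enum and MODULE_COLLECTION list mirror the telemetry module's general structure, but
the exact field names and description are assumptions, not the module's code:

    from enum import Enum

    class Collection(str, Enum):
        basic_base = 'basic_base'
        basic_rook_v01 = 'basic_rook_v01'  # successors: basic_rook_v02, basic_rook_v03, ...

    MODULE_COLLECTION = [
        {
            "name": Collection.basic_rook_v01,
            "description": "Basic Rook deployment data",
            "channel": "basic",   # goes to telemetry's basic channel
            "nag": True,          # nag, to gauge the volume of Rook deployments in the wild
        },
    ]
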
Samuel Just
9e3d8521cf
Merge pull request #46382 from rzarzynski/wip-crimson-op-tracking-3
crimson/osd: add support for historic & slow op tracking

Reviewed-by: Samuel Just <sjust@redhat.com>
2022-05-31 16:48:52 -07:00
Samuel Just
8e2719d7d7
Merge pull request #46437 from cyx1231st/wip-seastore-tune-and-fixes
crimson/os/seastore/segment_cleaner: tune and fixes around reclaiming

Reviewed-by: Samuel Just <sjust@redhat.com>
2022-05-31 16:37:11 -07:00
Laura Flores
19f6446e98
Merge pull request #46193 from ljflores/wip-zero-detection-off-by-default
os/bluestore: turn bluestore zero block detection off by default
2022-05-31 16:55:51 -05:00
Casey Bodley
bed7051383 rgw: restore check for empty olh name on reshard
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2022-05-31 17:29:59 -04:00
Casey Bodley
47913cb7f1 test/rgw: fix test case for empty-OLH-name cleanup
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2022-05-31 17:29:18 -04:00
Soumya Koduri
52f341b00a
Merge pull request #46367 from 0xavi0/dbstore-default-dbdir-rgw-data
rgw/dbstore: change default value of dbstore_db_dir to /var/lib/ceph/radosgw

Reviewed-by: Soumya Koduri <skoduri@redhat.com>
2022-05-31 21:27:28 +05:30
Casey Bodley
266da4a919
Merge pull request #46395 from cbodley/wip-backport-create-issue-assigned-to
backport-create-issue: copy 'Assignee' of original issue to backports

Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
2022-05-31 11:17:12 -04:00
Neha Ojha
914c99da13
Merge pull request #46415 from neha-ojha/wip-cw-core
.github/CODEOWNERS: tag core devs on core PRs

Reviewed-by: Laura Flores <lflores@redhat.com>
2022-05-31 07:16:18 -07:00
Ilya Dryomov
ef83c0f347 librbd: unlink newest mirror snapshot when at capacity, bump capacity
CreatePrimaryRequest::unlink_peer() invoked via "rbd mirror image
snapshot" command or via rbd_support mgr module when creating a new
scheduled mirror snapshot at rbd_mirroring_max_mirroring_snapshots
capacity on the primary cluster can race with Replayer::unlink_peer()
invoked by rbd-mirror when finishing syncing an older snapshot on the
secondary cluster.  Consider the following:

   [ primary: primary-snap1, primary-snap2, primary-snap3
     secondary: non-primary-snap1 (complete), non-primary-snap2 (syncing) ]

0. rbd-mirror is syncing snap1..snap2 delta
1. rbd_support creates primary-snap4
2. due to rbd_mirroring_max_mirroring_snapshots == 3, rbd_support picks
   primary-snap3 for unlinking
3. rbd-mirror finishes syncing snap1..snap2 delta and marks
   non-primary-snap2 complete

   [ snap1 (the old base) is no longer needed on either cluster ]

4. rbd-mirror unlinks and removes primary-snap1
5. rbd-mirror removes non-primary-snap1
6. rbd-mirror picks snap2 as the new base
7. rbd-mirror creates non-primary-snap3 and starts syncing snap2..snap3
   delta

   [ primary: primary-snap2, primary-snap3, primary-snap4
     secondary: non-primary-snap2 (complete), non-primary-snap3 (syncing) ]

8. rbd_support unlinks and removes primary-snap3 which is in-use by
   rbd-mirror

If snap trimming on the primary cluster kicks in soon enough, the
secondary image becomes corrupted: rbd-mirror would eventually finish
"syncing" non-primary-snap3 and mark it complete in spite of bogus data
in the HEAD -- the primary cluster OSDs would start returning ENOENT
for snap trimmed objects.  Luckily, rbd-mirror's attempt to pick snap3
as the new base would wedge the replayer with "split-brain detected:
failed to find matching non-primary snapshot in remote image" error.

Before commit a888bff8d0 ("librbd/mirror: tweak which snapshot is
unlinked when at capacity") this could happen pretty much all the time,
as it was the second oldest snapshot that was unlinked.  That commit
changed it to the third oldest snapshot, turning this into a narrower
but still very much possible-to-hit race.

Unfortunately this race condition appears to be inherent to the way
snapshot-based mirroring is currently implemented:

a. when mirror snapshots are created on the producer side of the
   snapshot queue, they are already linked
b. mirror snapshots can be concurrently unlinked/removed on both
   sides of the snapshot queue by non-cooperating clients (local
   rbd_mirror_image_create_snapshot() vs remote rbd-mirror)
c. with mirror peer links off the list due to (a), there is no
   existing way for rbd-mirror to persistently mark a snapshot as
   in-use

As a workaround, bump rbd_mirroring_max_mirroring_snapshots to 5 and
always unlink the newest snapshot (i.e. slot 4) instead of the third
oldest snapshot (i.e. slot 2).  Hopefully this gives enough leeway,
as rbd-mirror would need to sync two snapshots (i.e. transition from
syncing 0-1 to 1-2 and then to 2-3) before potentially colliding with
rbd_mirror_image_create_snapshot() on slot 4.

Fixes: https://tracker.ceph.com/issues/55803
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2022-05-31 15:14:03 +02:00
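
A toy model of the policy change in Python (not the librbd C++ code; names are
illustrative): with the capacity bumped to 5, the newest existing mirror snapshot
(slot 4) is unlinked instead of the third oldest (slot 2) before a new one is created:

    from typing import List, Optional

    RBD_MIRRORING_MAX_MIRRORING_SNAPSHOTS = 5  # bumped from 3

    def pick_snapshot_to_unlink(primary_snaps: List[str]) -> Optional[str]:
        """primary_snaps is ordered oldest -> newest.

        Unlinking the newest slot (index 4) instead of the third oldest
        (index 2) leaves the older sync bases alone, so rbd-mirror has to
        fall two full syncs behind before a collision becomes possible.
        """
        if len(primary_snaps) < RBD_MIRRORING_MAX_MIRRORING_SNAPSHOTS:
            return None               # below capacity, nothing to unlink
        return primary_snaps[-1]      # newest slot, was primary_snaps[2] before

    # usage
    snaps = ["snap0", "snap1", "snap2", "snap3", "snap4"]
    print(pick_snapshot_to_unlink(snaps))  # snap4
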
Ilya Dryomov
94703c1036 test/librbd: fix set_val() call in SuccessUnlink* test cases
rbd_mirroring_max_mirroring_snapshots isn't actually set to 3 there
due to the stray conf_ prefix.  It didn't matter until now because the
default was also 3.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2022-05-31 15:14:03 +02:00
Kefu Chai
42f5465755 debian: extract python3 packages to a single place
for better maintainability

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
2022-05-31 19:42:32 +08:00
Kefu Chai
ef19547e83 debian: add .requires for specifying python3 deps
we use dh_python3 to define the ${python3:Depends} subvar as a part
of the runtime dependencies of python3 packages, like the ceph-mgr
modules named "ceph-mgr-*" and the python3 bindings named "python3-*".

but unlike the python3 bindings of the Ceph APIs, the ceph-mgr modules
are not packaged in a typical python way. in other words, they do not
ship a "dist-info" or an "egg-info" directory. instead, we just
install the python scripts into a directory which can be found by
ceph-mgr; by default it is /usr/share/ceph/mgr/dashboard/plugins.

this does not follow the conventions of python packaging or the
debian packaging policies related to python packages. but it still
makes sense to put these files in this non-conventional place, as
they are not supposed to be python packages consumed by the
outside world -- they are just plugins, and should always work
with the same version of ceph-mgr.

the problem is, despite having ${python3:Depends} in
the "Depends" field of packages like ceph-mgr-dashboard, dh_python3
is not able to figure out the dependencies by looking at the
installed files. for instance, we have the following "Depends" for
ceph-mgr-dashboard:

Depends: ceph-mgr (= 17.0.0-12481-g805d2320-1focal), python3-cherrypy3, python3-jwt, python3-bcrypt, python3-werkzeug, python3-routes

and in the debian/control file we have:

Depends: ceph-mgr (= ${binary:Version}),
         python3-cherrypy3,
         python3-jwt,
         python3-bcrypt,
         python3-werkzeug,
         python3-routes,
         ${misc:Depends},
         ${python:Depends},
         ${shlibs:Depends},

apparently, none of these subvars is materialized into
a non-empty string.

to improve the packaging, in this change:

* drop all subvars from ceph-mgr-*, as they
  are all implemented in pure python.
* add debian/ceph-mgr-*.requires; their content
  is replicated from the corresponding requirements.txt
  files.
  * add python3-distutils for distutils, as debian
    and its derivatives package the non-essential part of
    distutils into a separate package, see
    https://packages.debian.org/stable/python3-distutils
* add ${python3:Depends} so dh_python3
  can extract the deps from debian/ceph-mgr-*.pydist
* update the rule for the "override_dh_python3" target,
  so dh_python3 can pick up the dependencies specified
  in the .requires files.
* remove the python3 dependencies not used by
  ceph-mgr from ceph-mgr's "Depends"

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
2022-05-31 19:42:32 +08:00
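
As an illustration of the scheme above, a debian/ceph-mgr-dashboard.requires file
would simply mirror the module's requirements.txt; the entries below are an example
derived from the Depends line quoted in the commit, not the authoritative file:

    CherryPy
    PyJWT
    bcrypt
    Werkzeug
    Routes

dh_python3 then maps these requirement names onto the corresponding python3-* Debian
packages and substitutes them into ${python3:Depends}.
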
Redouane Kachach
6b76753c3c
mgr/cephadm: check if a service exists before trying to restart it
Fixes: https://tracker.ceph.com/issues/55800

Signed-off-by: Redouane Kachach <rkachach@redhat.com>
2022-05-31 12:11:03 +02:00
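
A hedged sketch (not the actual cephadm code) of the kind of guard this adds before
scheduling a restart; the function, container, and exception names are assumptions:

    class OrchestratorError(Exception):
        pass

    def restart_service(known_services: set, service_name: str) -> str:
        # guard added by the fix: refuse to act on a service that doesn't exist
        if service_name not in known_services:
            raise OrchestratorError("service %s does not exist" % service_name)
        return "Scheduled restart of %s" % service_name

    # usage
    print(restart_service({"mon", "osd.default"}, "mon"))
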
zdover23
b4376cfe57
Merge pull request #46430 from zdover23/wip-doc-2022-05-30-hw-recs-memory-section
doc/start: update "memory" in hardware-recs.rst

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
2022-05-31 17:07:40 +10:00
0xavi0
895ee7921c
rgw/dbstore: change default value of dbstore_db_dir to /var/lib/ceph/radosgw
Changes a few NULL to nullptr.

Uses std::filesystem for path building so that paths are platform independent.

Fixes a bug where DBStoreManager's second constructor did not create the DB.

Adds unit tests to test DB path and prefix.

Fixes: https://tracker.ceph.com/issues/55731

Signed-off-by: 0xavi0 <xavi.garcia@suse.com>
2022-05-31 09:01:56 +02:00
Zac Dover
429bbdea65 doc/start: update "memory" in hardware-recs.rst
This PR corrects some usage errors in the "Memory" section
of the hardware-recommendations.rst file. It also closes some
parentheses that were opened but never closed.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
2022-05-31 15:51:23 +10:00
Venky Shankar
2e4a777242
Merge pull request #46086 from nmshelke/feature-55401
mgr/volumes: set, get, list and remove metadata of snapshot

Reviewed-by: Venky Shankar <vshankar@redhat.com>
2022-05-31 10:30:56 +05:30
Yingxin Cheng
377ef0e2b8 crimson/os/seastore/segment_cleaner: add info logs to reveal trim activities
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2022-05-31 12:23:01 +08:00
Yingxin Cheng
bb0e35bf54 crimson/os/seastore/transaction_manager: set to test mode under debug build
* force test mode under debug builds.
* make reclaim happen and be validated as early as possible.
* do not block user transactions when the reclaim ratio (unalive/unavailable)
  is high, especially in the beginning.

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2022-05-31 12:23:01 +08:00
Yingxin Cheng
ba1227e547 crimson/os/seastore/segment_cleaner: cleanup reclaim logic
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2022-05-31 12:22:09 +08:00
Yingxin Cheng
a3afe706bc crimson/os/seastore/seastore_types: include backref as physical extents
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2022-05-31 10:51:03 +08:00
Yingxin Cheng
21306e492c crimson/os/seastore/cache: assert dirty
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2022-05-31 10:51:03 +08:00
Yingxin Cheng
bbfa540da8 crimson/os/seastore: cleanup rewrite_extent()
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2022-05-31 10:51:03 +08:00
Yingxin Cheng
b584e6a6d1 crimson/os/seastore/segment_cleaner: delay reclaim until near full
It should generally be better to delay reclaim as much as possible, so
that:
* unalive/unavailable can be higher, reducing reclaim effort;
* there are fewer conflicts between mutate and reclaim transactions.

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2022-05-31 10:51:03 +08:00
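
A simplified illustration of the policy in Python (names and the threshold are made up,
not the seastore code): reclaim is deferred until free space actually gets low, which
lets the unalive/unavailable ratio rise and keeps reclaim transactions from competing
with mutations early on:

    def should_reclaim(available_bytes: int, total_bytes: int,
                       near_full_ratio: float = 0.85) -> bool:
        """Start reclaiming segments only once the store is nearly full."""
        used_ratio = 1.0 - available_bytes / total_bytes
        return used_ratio >= near_full_ratio

    # usage: plenty of free space left, so reclaim is still deferred
    print(should_reclaim(available_bytes=600 * 2**30, total_bytes=1024 * 2**30))  # False
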
Samuel Just
03d805c7ec
Merge pull request #46296 from ceph/wip-nitzan-osd-log-to-correct-sufix
crimson/osd: logger into log_file

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Reviewed-by: Kefu Chai <tchaikov@gmail.com>
2022-05-30 16:37:41 -07:00
Samuel Just
7cfe74d266
Merge pull request #46388 from rzarzynski/wip-crimson-reindent-main-trace
crimson/osd: reindent the trace-related fragment of main()

Reviewed-by: Kefu Chai <tchaikov@gmail.com>
2022-05-30 16:33:25 -07:00
Ilya Dryomov
3ba82f2aa7 rbd-mirror: don't prune non-primary snapshot when restarting delta sync
When restarting interrupted sync (signified by the "end" non-primary
snapshot with last_copied_object_number > 0), preserve the "start"
non-primary snapshot until the sync is completed, like it would have
been done had the sync not been interrupted.  This ensures that the
same m_local_snap_id_start is passed to scan_remote_mirror_snapshots()
and ultimately ImageCopyRequest state machine on restart as on initial
start.

This ends up being yet another fixup for 281af0de86 ("rbd-mirror:
prune unnecessary non-primary mirror snapshots"), following earlier
7ba9214ea5 ("rbd-mirror: don't prune older mirror snapshots when
pruning incomplete snapshot") and ecd3778a6f ("rbd-mirror: ensure
that the last non-primary snapshot cannot be pruned").

Fixes: https://tracker.ceph.com/issues/55796
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2022-05-30 21:57:14 +02:00
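
A toy model of the pruning rule in Python (not the rbd-mirror C++ code; field names are
illustrative): the "start" non-primary snapshot must be kept while an interrupted sync,
signified by last_copied_object_number > 0 on the incomplete "end" snapshot, is still
in progress:

    from dataclasses import dataclass

    @dataclass
    class NonPrimarySnap:
        snap_id: int
        complete: bool
        last_copied_object_number: int = 0

    def can_prune_start_snap(start: NonPrimarySnap, end: NonPrimarySnap) -> bool:
        # keep the start snapshot while a resumable (interrupted) sync to 'end' exists
        sync_in_progress = not end.complete and end.last_copied_object_number > 0
        return not sync_in_progress

    # usage: interrupted sync -> the start snapshot must be preserved
    start = NonPrimarySnap(snap_id=10, complete=True)
    end = NonPrimarySnap(snap_id=12, complete=False, last_copied_object_number=7)
    print(can_prune_start_snap(start, end))  # False
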
Ilya Dryomov
8ddce107d0 cls/rbd: fix operator<< for MirrorSnapshotNamespace
Commit 50702eece0 ("cls/rbd: added clean_since_snap_id to
MirrorSnapshotNamespace") updated dump() but missed the operator<<
overload.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2022-05-30 21:57:14 +02:00
Ronen Friedman
2469ae8b31
Merge pull request #46320 from ronen-fr/wip-rf-snaps-onerr
osd/scrub: restart snap trimming after a failed scrub

Reviewed-by: Laura Flores <lflores@redhat.com>
2022-05-30 20:36:41 +03:00
Radosław Zarzyński
98064037f9 crimson/osd: add support for slowest historic op tracking
Signed-off-by: Radosław Zarzyński <rzarzyns@redhat.com>
2022-05-30 16:37:19 +02:00
Radosław Zarzyński
ff5757c7d1 crimson/osd: make OSDOperationRegistry responsible for historic ops
Signed-off-by: Radosław Zarzyński <rzarzyns@redhat.com>
2022-05-30 16:37:19 +02:00
Radosław Zarzyński
6f5737e4f5 crimson/osd: add support for historic op tracking.
Signed-off-by: Radosław Zarzyński <rzarzyns@redhat.com>
2022-05-30 16:37:19 +02:00
Ronen Friedman
290e744a9b osd/scrub: restart snap trimming after a failed scrub
A follow-up to PR#45640.
In PR#45640 snap trimming was restarted (if blocked) after all
successful scrubs, and after most scrub failures. Still, a few
failure scenarios did not handle snaptrim restart correctly.

The current PR cleans up and fixes the interaction between
scrub initiation/termination (for whatever cause) and snap
trimming.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
2022-05-30 14:35:38 +00:00
Ilya Dryomov
4cb6436d67
Merge pull request #46426 from idryomov/wip-iscsi-mutual-chap-doc
doc/rbd: add mutual CHAP authentication example

Reviewed-by: Xiubo Li <xiubli@redhat.com>
2022-05-30 14:54:06 +02:00
Ilya Dryomov
dabcac2060 doc/rbd: add mutual CHAP authentication example
Based on https://github.com/ceph/ceph-iscsi/pull/260.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2022-05-30 13:51:49 +02:00
Kefu Chai
23501918de debian: s/${python:Depends}/${python3:Depends}/
${python:Depends} is added by dh_python2, but we've migrated to
python3 and Ceph is not compatible with python2 anymore. let's
replace all references to python2 with python3.

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
2022-05-28 20:55:50 +08:00
Kefu Chai
b7b8838a56
Merge pull request #35598 from tchaikov/wip-cephfs-java
rpm,install-dep.sh: build cephfs java binding

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2022-05-28 13:29:25 +08:00
Neha Ojha
8303c6b911 .github/CODEOWNERS: tag core devs on core PRs
Start with everything that is present under core in .github/labeler.yml.

Signed-off-by: Neha Ojha <nojha@redhat.com>
2022-05-27 20:30:47 +00:00
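
For reference, a hypothetical fragment of such a CODEOWNERS entry; the team handle and
the exact paths are placeholders, not the ones added by the PR:

    # tag the core team on PRs touching core components
    src/osd/*  @ceph/core
    src/mon/*  @ceph/core
    src/osdc/* @ceph/core
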
Casey Bodley
ede6a85385 qa/rgw: fix flake8 errors in test_rgw_reshard.py
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2022-05-27 15:47:34 -04:00
Casey Bodley
7fc73716ce rgw/motr: fix build for MotrStore
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2022-05-27 15:47:34 -04:00
Adam C. Emerson
eb3606c370 rgw: RGWSyncBucketCR reads remote info on non-Incremental state
This ensures that the remote bucket index log info is available in
all cases where we're calling `InitBucketFullSyncStatusCR`.

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
2022-05-27 15:47:34 -04:00
Adam C. Emerson
5809082706 test/rgw: bucket sync run recovery case
1. Write several generations worth of objects. Ensure that everything
   has synced and that at least some generations have been trimmed.
2. Turn off the secondary `radosgw`.
3. Use `radosgw-admin object rm` to delete all objects in the bucket
   on the secondary.
4. Invoke `radosgw-admin bucket sync init` on the secondary.
5. Invoke `radosgw-admin bucket sync run` on the secondary.
6. Verify that all objects on the primary are also present on the
   secondary.

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
2022-05-27 15:47:34 -04:00
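
A rough sketch of the recovery drill in the numbered list above, expressed as
subprocess calls; the bucket name, object list, and any flags beyond --bucket/--object
are assumptions, while the radosgw-admin subcommands are the ones named in the commit:

    import subprocess
    from typing import List

    def admin(*args: str) -> None:
        # run radosgw-admin against the secondary zone's cluster
        subprocess.check_call(["radosgw-admin", *args])

    def recover_bucket(bucket: str, objects: List[str]) -> None:
        # the secondary radosgw is assumed to be stopped at this point
        for obj in objects:
            admin("object", "rm", "--bucket=%s" % bucket, "--object=%s" % obj)
        admin("bucket", "sync", "init", "--bucket=%s" % bucket)
        admin("bucket", "sync", "run", "--bucket=%s" % bucket)

    # usage (hypothetical)
    recover_bucket("test-bucket", ["obj-1", "obj-2"])
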
Adam C. Emerson
f59a3580ee test/rgw: Add incremental test of bucket sync run
This tests for iterating properly over the generations.

1. Create a bucket and write some objects to it. Wait for sync to
   complete. This ensures we are in Incremental.
2. Turn off the secondary `radosgw`.
3. Manually reshard. Then continue writing objects and resharding.
4. Choose objects so that each generation has objects in many but not
   all shards.
5. After building up several generations, run `bucket sync run` on the
   secondary.
6. Verify that all objects on the primary are on the secondary.

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
2022-05-27 15:47:34 -04:00
Adam C. Emerson
79932fcc19 rgw: add bucket object shard command to radosgw-admin
Given an object, return the bucket shard appropriate to it.

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
2022-05-27 15:47:34 -04:00
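
A hedged usage sketch for the new subcommand; only the "bucket object shard" name comes
from the commit, while the flags and values are assumptions following radosgw-admin's
usual conventions:

    import subprocess

    # hypothetical invocation: map an object to its bucket index shard
    out = subprocess.check_output(
        ["radosgw-admin", "bucket", "object", "shard",
         "--bucket=test-bucket", "--object=photos/cat.png"])
    print(out.decode())
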