Commit Graph

131702 Commits

Guillaume Abrioux
e720a658d6 cephadm: fix osd adoption with custom cluster name
When adopting Ceph OSD containers from a Ceph cluster with a custom name, the
adoption fails because the custom name isn't propagated to unit.run.
The idea here is to change the lvm metadata and enforce 'ceph.cluster_name=ceph',
given that cephadm doesn't support custom names anyway.

Fixes: https://tracker.ceph.com/issues/55654

Signed-off-by: Adam King <adking@redhat.com>
Co-authored-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-06-01 18:48:46 +02:00
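
A minimal sketch of the workaround described above, assuming the cluster name is
stored as an LVM tag (ceph.cluster_name=<name>) on the OSD logical volumes, as
ceph-volume does; the helper, the JSON layout of the lvs report, and the tag handling
are illustrative, not the actual cephadm code:

    import json
    import subprocess

    def force_default_cluster_name(custom_name: str) -> None:
        """Rewrite ceph.cluster_name=<custom_name> LVM tags to 'ceph' on OSD LVs."""
        out = subprocess.check_output(
            ["lvs", "-o", "lv_path,lv_tags", "--reportformat", "json"])
        for lv in json.loads(out)["report"][0]["lv"]:
            old_tag = "ceph.cluster_name=%s" % custom_name
            if old_tag in lv["lv_tags"].split(","):
                # drop the custom name and enforce the default the containers expect
                subprocess.check_call(
                    ["lvchange", "--deltag", old_tag,
                     "--addtag", "ceph.cluster_name=ceph", lv["lv_path"]])
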
Ilya Dryomov
04567650ca
Merge pull request #46474 from idryomov/wip-rbd-codeowners
CODEOWNERS: add RBD team

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
2022-06-01 18:29:38 +02:00
Ilya Dryomov
00a44f1c6b CODEOWNERS: add RBD team
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2022-06-01 13:37:21 +02:00
Xuehan Xu
bf2213f89d crimson/os/seastore/segment_cleaner: retrieve different live extents in parallel
Signed-off-by: Xuehan Xu <xxhdx1985126@gmail.com>
2022-06-01 16:28:58 +08:00
Redouane Kachach
0e7a4366c0
mgr/cephadm: capture exception when not able to list upgrade tags
Fixes: https://tracker.ceph.com/issues/55801

Signed-off-by: Redouane Kachach <rkachach@redhat.com>
2022-06-01 10:19:38 +02:00
Yaarit Hatuka
63f5dcdb52 mgr/telemetry: add Rook data
Add the first Rook data collection to telemetry's basic channel.

We choose to nag with this collection since we wish to know the volume
of Rook deployments in the wild.

The next Rook collections should have consecutive numbers (basic_rook_v02,
basic_rook_v03, ...).

See tracker below for more details.

Fixes: https://tracker.ceph.com/issues/55740
Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
2022-06-01 04:46:17 +00:00
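
A hedged sketch of what registering such a collection might look like; the Collection
enum and MODULE_COLLECTION list mirror the telemetry module's general structure, but
the exact field names and description are assumptions, not the module's code:

    from enum import Enum

    class Collection(str, Enum):
        basic_base = 'basic_base'
        basic_rook_v01 = 'basic_rook_v01'  # successors: basic_rook_v02, basic_rook_v03, ...

    MODULE_COLLECTION = [
        {
            "name": Collection.basic_rook_v01,
            "description": "Basic Rook deployment data",
            "channel": "basic",   # goes to telemetry's basic channel
            "nag": True,          # nag, to gauge the volume of Rook deployments in the wild
        },
    ]
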
Samuel Just
9e3d8521cf
Merge pull request #46382 from rzarzynski/wip-crimson-op-tracking-3
crimson/osd: add support for historic & slow op tracking

Reviewed-by: Samuel Just <sjust@redhat.com>
2022-05-31 16:48:52 -07:00
Samuel Just
8e2719d7d7
Merge pull request #46437 from cyx1231st/wip-seastore-tune-and-fixes
crimson/os/seastore/segment_cleaner: tune and fixes around reclaiming

Reviewed-by: Samuel Just <sjust@redhat.com>
2022-05-31 16:37:11 -07:00
Laura Flores
19f6446e98
Merge pull request #46193 from ljflores/wip-zero-detection-off-by-default
os/bluestore: turn bluestore zero block detection off by default
2022-05-31 16:55:51 -05:00
Casey Bodley
bed7051383 rgw: restore check for empty olh name on reshard
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2022-05-31 17:29:59 -04:00
Casey Bodley
47913cb7f1 test/rgw: fix test case for empty-OLH-name cleanup
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2022-05-31 17:29:18 -04:00
Soumya Koduri
52f341b00a
Merge pull request #46367 from 0xavi0/dbstore-default-dbdir-rgw-data
rgw/dbstore: change default value of dbstore_db_dir to /var/lib/ceph/radosgw

Reviewed-by: Soumya Koduri <skoduri@redhat.com>
2022-05-31 21:27:28 +05:30
Casey Bodley
266da4a919
Merge pull request #46395 from cbodley/wip-backport-create-issue-assigned-to
backport-create-issue: copy 'Assignee' of original issue to backports

Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
2022-05-31 11:17:12 -04:00
Neha Ojha
914c99da13
Merge pull request #46415 from neha-ojha/wip-cw-core
.github/CODEOWNERS: tag core devs on core PRs

Reviewed-by: Laura Flores <lflores@redhat.com>
2022-05-31 07:16:18 -07:00
Ilya Dryomov
ef83c0f347 librbd: unlink newest mirror snapshot when at capacity, bump capacity
CreatePrimaryRequest::unlink_peer() invoked via "rbd mirror image
snapshot" command or via rbd_support mgr module when creating a new
scheduled mirror snapshot at rbd_mirroring_max_mirroring_snapshots
capacity on the primary cluster can race with Replayer::unlink_peer()
invoked by rbd-mirror when finishing syncing an older snapshot on the
secondary cluster.  Consider the following:

   [ primary: primary-snap1, primary-snap2, primary-snap3
     secondary: non-primary-snap1 (complete), non-primary-snap2 (syncing) ]

0. rbd-mirror is syncing snap1..snap2 delta
1. rbd_support creates primary-snap4
2. due to rbd_mirroring_max_mirroring_snapshots == 3, rbd_support picks
   primary-snap3 for unlinking
3. rbd-mirror finishes syncing snap1..snap2 delta and marks
   non-primary-snap2 complete

   [ snap1 (the old base) is no longer needed on either cluster ]

4. rbd-mirror unlinks and removes primary-snap1
5. rbd-mirror removes non-primary-snap1
6. rbd-mirror picks snap2 as the new base
7. rbd-mirror creates non-primary-snap3 and starts syncing snap2..snap3
   delta

   [ primary: primary-snap2, primary-snap3, primary-snap4
     secondary: non-primary-snap2 (complete), non-primary-snap3 (syncing) ]

8. rbd_support unlinks and removes primary-snap3 which is in-use by
   rbd-mirror

If snap trimming on the primary cluster kicks in soon enough, the
secondary image becomes corrupted: rbd-mirror would eventually finish
"syncing" non-primary-snap3 and mark it complete in spite of bogus data
in the HEAD -- the primary cluster OSDs would start returning ENOENT
for snap trimmed objects.  Luckily, rbd-mirror's attempt to pick snap3
as the new base would wedge the replayer with "split-brain detected:
failed to find matching non-primary snapshot in remote image" error.

Before commit a888bff8d0 ("librbd/mirror: tweak which snapshot is
unlinked when at capacity") this could happen pretty much all the time,
as it was the second oldest snapshot that was unlinked.  That commit
changed it to the third oldest snapshot, turning this into a narrower
but still very much possible-to-hit race.

Unfortunately this race condition appears to be inherent to the way
snapshot-based mirroring is currently implemented:

a. when mirror snapshots are created on the producer side of the
   snapshot queue, they are already linked
b. mirror snapshots can be concurrently unlinked/removed on both
   sides of the snapshot queue by non-cooperating clients (local
   rbd_mirror_image_create_snapshot() vs remote rbd-mirror)
c. with mirror peer links off the list due to (a), there is no
   existing way for rbd-mirror to persistently mark a snapshot as
   in-use

As a workaround, bump rbd_mirroring_max_mirroring_snapshots to 5 and
always unlink the newest snapshot (i.e. slot 4) instead of the third
oldest snapshot (i.e. slot 2).  Hopefully this gives enough leeway,
as rbd-mirror would need to sync two snapshots (i.e. transition from
syncing 0-1 to 1-2 and then to 2-3) before potentially colliding with
rbd_mirror_image_create_snapshot() on slot 4.

Fixes: https://tracker.ceph.com/issues/55803
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2022-05-31 15:14:03 +02:00
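
A toy model of the policy change in Python (not the librbd C++ code; names are
illustrative): with the capacity bumped to 5, the newest existing mirror snapshot
(slot 4) is unlinked instead of the third oldest (slot 2) before a new one is created:

    from typing import List, Optional

    RBD_MIRRORING_MAX_MIRRORING_SNAPSHOTS = 5  # bumped from 3

    def pick_snapshot_to_unlink(primary_snaps: List[str]) -> Optional[str]:
        """primary_snaps is ordered oldest -> newest.

        Unlinking the newest slot (index 4) instead of the third oldest
        (index 2) leaves the older sync bases alone, so rbd-mirror has to
        fall two full syncs behind before a collision becomes possible.
        """
        if len(primary_snaps) < RBD_MIRRORING_MAX_MIRRORING_SNAPSHOTS:
            return None               # below capacity, nothing to unlink
        return primary_snaps[-1]      # newest slot, was primary_snaps[2] before

    # usage
    snaps = ["snap0", "snap1", "snap2", "snap3", "snap4"]
    print(pick_snapshot_to_unlink(snaps))  # snap4
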
Ilya Dryomov
94703c1036 test/librbd: fix set_val() call in SuccessUnlink* test cases
rbd_mirroring_max_mirroring_snapshots isn't actually set to 3 there
due to the stray conf_ prefix.  It didn't matter until now because the
default was also 3.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2022-05-31 15:14:03 +02:00
Kefu Chai
42f5465755 debian: extract python3 packages to a single place
for better maintainability

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
2022-05-31 19:42:32 +08:00
Kefu Chai
ef19547e83 debian: add .requires for specifying python3 deps
we use dh_python3 to define the ${python3:Depends} subvar as a part
of the runtime dependencies of python3 packages, like the ceph-mgr
modules named "ceph-mgr-*" and the python3 bindings named "python3-*".

but unlike the python3 bindings of the Ceph APIs, the ceph-mgr modules
are not packaged in a typical python way. in other words, they do not
ship a "dist-info" or an "egg-info" directory. instead, we just
install the python scripts into a directory which can be found by
ceph-mgr; by default it is /usr/share/ceph/mgr/dashboard/plugins.

this does not follow the conventions of python packaging or the
debian packaging policies related to python packages. but it still
makes sense to put these files in this non-conventional place, as
they are not supposed to be python packages consumed by the
outside world -- they are just plugins, and should always work
with the same version of ceph-mgr.

the problem is, despite having ${python3:Depends} in
the "Depends" field of packages like ceph-mgr-dashboard, dh_python3
is not able to figure out the dependencies by looking at the
installed files. for instance, we have the following "Depends" for
ceph-mgr-dashboard:

Depends: ceph-mgr (= 17.0.0-12481-g805d2320-1focal), python3-cherrypy3, python3-jwt, python3-bcrypt, python3-werkzeug, python3-routes

and in the debian/control file we have:

Depends: ceph-mgr (= ${binary:Version}),
         python3-cherrypy3,
         python3-jwt,
         python3-bcrypt,
         python3-werkzeug,
         python3-routes,
         ${misc:Depends},
         ${python:Depends},
         ${shlibs:Depends},

apparently, none of these subvars is materialized into
a non-empty string.

to improve the packaging, in this change:

* drop all subvars from ceph-mgr-*, as they
  are all implemented in pure python.
* add debian/ceph-mgr-*.requires; their content
  is replicated from the corresponding requirements.txt
  files.
  * add python3-distutils for distutils, as debian
    and its derivatives package the non-essential part of
    distutils into a separate package, see
    https://packages.debian.org/stable/python3-distutils
* add ${python3:Depends} so dh_python3
  can extract the deps from debian/ceph-mgr-*.pydist
* update the rule for the "override_dh_python3" target,
  so dh_python3 can pick up the dependencies specified
  in the .requires files.
* remove the python3 dependencies not used by
  ceph-mgr from ceph-mgr's "Depends"

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
2022-05-31 19:42:32 +08:00
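
As an illustration of the scheme above, a debian/ceph-mgr-dashboard.requires file
would simply mirror the module's requirements.txt; the entries below are an example
derived from the Depends line quoted in the commit, not the authoritative file:

    CherryPy
    PyJWT
    bcrypt
    Werkzeug
    Routes

dh_python3 then maps these requirement names onto the corresponding python3-* Debian
packages and substitutes them into ${python3:Depends}.
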
Redouane Kachach
6b76753c3c
mgr/cephadm: check if a service exists before trying to restart it
Fixes: https://tracker.ceph.com/issues/55800

Signed-off-by: Redouane Kachach <rkachach@redhat.com>
2022-05-31 12:11:03 +02:00
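
A hedged sketch (not the actual cephadm code) of the kind of guard this adds before
scheduling a restart; the function, container, and exception names are assumptions:

    class OrchestratorError(Exception):
        pass

    def restart_service(known_services: set, service_name: str) -> str:
        # guard added by the fix: refuse to act on a service that doesn't exist
        if service_name not in known_services:
            raise OrchestratorError("service %s does not exist" % service_name)
        return "Scheduled restart of %s" % service_name

    # usage
    print(restart_service({"mon", "osd.default"}, "mon"))
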
zdover23
b4376cfe57
Merge pull request #46430 from zdover23/wip-doc-2022-05-30-hw-recs-memory-section
doc/start: update "memory" in hardware-recs.rst

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
2022-05-31 17:07:40 +10:00
0xavi0
895ee7921c
rgw/dbstore: change default value of dbstore_db_dir to /var/lib/ceph/radosgw
Changes a few NULL to nullptr.

Uses std::filesystem for path building so that paths are platform independent.

Fixes a bug where DBStoreManager's second constructor did not create the DB.

Adds unit tests to test DB path and prefix.

Fixes: https://tracker.ceph.com/issues/55731

Signed-off-by: 0xavi0 <xavi.garcia@suse.com>
2022-05-31 09:01:56 +02:00
Zac Dover
429bbdea65 doc/start: update "memory" in hardware-recs.rst
This PR corrects some usage errors in the "Memory" section
of the hardware-recommendations.rst file. It also closes some
parentheses that were opened but never closed.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
2022-05-31 15:51:23 +10:00
Venky Shankar
2e4a777242
Merge pull request #46086 from nmshelke/feature-55401
mgr/volumes: set, get, list and remove metadata of snapshot

Reviewed-by: Venky Shankar <vshankar@redhat.com>
2022-05-31 10:30:56 +05:30
Yingxin Cheng
377ef0e2b8 crimson/os/seastore/segment_cleaner: add info logs to reveal trim activities
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2022-05-31 12:23:01 +08:00
Yingxin Cheng
bb0e35bf54 crimson/os/seastore/transaction_manager: set to test mode under debug build
* force test mode under debug builds.
* make reclaim happen and be validated as early as possible.
* do not block user transactions when the reclaim ratio (unalive/unavailable)
  is high, especially in the beginning.

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2022-05-31 12:23:01 +08:00
Yingxin Cheng
ba1227e547 crimson/os/seastore/segment_cleaner: cleanup reclaim logic
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2022-05-31 12:22:09 +08:00
Yingxin Cheng
a3afe706bc crimson/os/seastore/seastore_types: include backref as physical extents
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2022-05-31 10:51:03 +08:00
Yingxin Cheng
21306e492c crimson/os/seastore/cache: assert dirty
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2022-05-31 10:51:03 +08:00
Yingxin Cheng
bbfa540da8 crimson/os/seastore: cleanup rewrite_extent()
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2022-05-31 10:51:03 +08:00
Yingxin Cheng
b584e6a6d1 crimson/os/seastore/segment_cleaner: delay reclaim until near full
It should generally be better to delay reclaim as much as possible, so
that:
* unalive/unavailable can be higher, reducing reclaim effort;
* there are fewer conflicts between mutate and reclaim transactions.

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2022-05-31 10:51:03 +08:00
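
A simplified illustration of the policy in Python (names and the threshold are made up,
not the seastore code): reclaim is deferred until free space actually gets low, which
lets the unalive/unavailable ratio rise and keeps reclaim transactions from competing
with mutations early on:

    def should_reclaim(available_bytes: int, total_bytes: int,
                       near_full_ratio: float = 0.85) -> bool:
        """Start reclaiming segments only once the store is nearly full."""
        used_ratio = 1.0 - available_bytes / total_bytes
        return used_ratio >= near_full_ratio

    # usage: plenty of free space left, so reclaim is still deferred
    print(should_reclaim(available_bytes=600 * 2**30, total_bytes=1024 * 2**30))  # False
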
Samuel Just
03d805c7ec
Merge pull request #46296 from ceph/wip-nitzan-osd-log-to-correct-sufix
crimson/osd: logger into log_file

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Reviewed-by: Kefu Chai <tchaikov@gmail.com>
2022-05-30 16:37:41 -07:00
Samuel Just
7cfe74d266
Merge pull request #46388 from rzarzynski/wip-crimson-reindent-main-trace
crimson/osd: reindent the trace-related fragment of main()

Reviewed-by: Kefu Chai <tchaikov@gmail.com>
2022-05-30 16:33:25 -07:00
Ilya Dryomov
3ba82f2aa7 rbd-mirror: don't prune non-primary snapshot when restarting delta sync
When restarting interrupted sync (signified by the "end" non-primary
snapshot with last_copied_object_number > 0), preserve the "start"
non-primary snapshot until the sync is completed, like it would have
been done had the sync not been interrupted.  This ensures that the
same m_local_snap_id_start is passed to scan_remote_mirror_snapshots()
and ultimately ImageCopyRequest state machine on restart as on initial
start.

This ends up being yet another fixup for 281af0de86 ("rbd-mirror:
prune unnecessary non-primary mirror snapshots"), following earlier
7ba9214ea5 ("rbd-mirror: don't prune older mirror snapshots when
pruning incomplete snapshot") and ecd3778a6f ("rbd-mirror: ensure
that the last non-primary snapshot cannot be pruned").

Fixes: https://tracker.ceph.com/issues/55796
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2022-05-30 21:57:14 +02:00
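
A toy model of the pruning rule in Python (not the rbd-mirror C++ code; field names are
illustrative): the "start" non-primary snapshot must be kept while an interrupted sync,
signified by last_copied_object_number > 0 on the incomplete "end" snapshot, is still
in progress:

    from dataclasses import dataclass

    @dataclass
    class NonPrimarySnap:
        snap_id: int
        complete: bool
        last_copied_object_number: int = 0

    def can_prune_start_snap(start: NonPrimarySnap, end: NonPrimarySnap) -> bool:
        # keep the start snapshot while a resumable (interrupted) sync to 'end' exists
        sync_in_progress = not end.complete and end.last_copied_object_number > 0
        return not sync_in_progress

    # usage: interrupted sync -> the start snapshot must be preserved
    start = NonPrimarySnap(snap_id=10, complete=True)
    end = NonPrimarySnap(snap_id=12, complete=False, last_copied_object_number=7)
    print(can_prune_start_snap(start, end))  # False
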
Ilya Dryomov
8ddce107d0 cls/rbd: fix operator<< for MirrorSnapshotNamespace
Commit 50702eece0 ("cls/rbd: added clean_since_snap_id to
MirrorSnapshotNamespace") updated dump() but missed the operator<<
overload.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2022-05-30 21:57:14 +02:00
Ronen Friedman
2469ae8b31
Merge pull request #46320 from ronen-fr/wip-rf-snaps-onerr
osd/scrub: restart snap trimming after a failed scrub

Reviewed-by: Laura Flores <lflores@redhat.com>
2022-05-30 20:36:41 +03:00
Radosław Zarzyński
98064037f9 crimson/osd: add support for slowest historic op tracking
Signed-off-by: Radosław Zarzyński <rzarzyns@redhat.com>
2022-05-30 16:37:19 +02:00
Radosław Zarzyński
ff5757c7d1 crimson/osd: make OSDOperationRegistry responsible for historic ops
Signed-off-by: Radosław Zarzyński <rzarzyns@redhat.com>
2022-05-30 16:37:19 +02:00
Radosław Zarzyński
6f5737e4f5 crimson/osd: add support for historic op tracking.
Signed-off-by: Radosław Zarzyński <rzarzyns@redhat.com>
2022-05-30 16:37:19 +02:00
Ronen Friedman
290e744a9b osd/scrub: restart snap trimming after a failed scrub
A follow-up to PR#45640.
In PR#45640 snap trimming was restarted (if blocked) after all
successful scrubs, and after most scrub failures. Still, a few
failure scenarios did not handle snaptrim restart correctly.

The current PR cleans up and fixes the interaction between
scrub initiation/termination (for whatever cause) and snap
trimming.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
2022-05-30 14:35:38 +00:00
Ilya Dryomov
4cb6436d67
Merge pull request #46426 from idryomov/wip-iscsi-mutual-chap-doc
doc/rbd: add mutual CHAP authentication example

Reviewed-by: Xiubo Li <xiubli@redhat.com>
2022-05-30 14:54:06 +02:00
Ilya Dryomov
dabcac2060 doc/rbd: add mutual CHAP authentication example
Based on https://github.com/ceph/ceph-iscsi/pull/260.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2022-05-30 13:51:49 +02:00
Kefu Chai
23501918de debian: s/${python:Depends}/${python3:Depends}/
${python:Depends} is added by dh_python2, but we've migrated to
python3 and Ceph is not compatible with python2 anymore. let's
replace all references to python2 with python3.

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
2022-05-28 20:55:50 +08:00
Kefu Chai
b7b8838a56
Merge pull request #35598 from tchaikov/wip-cephfs-java
rpm,install-dep.sh: build cephfs java binding

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2022-05-28 13:29:25 +08:00
Neha Ojha
8303c6b911 .github/CODEOWNERS: tag core devs on core PRs
Start with everything that is present under core in .github/labeler.yml.

Signed-off-by: Neha Ojha <nojha@redhat.com>
2022-05-27 20:30:47 +00:00
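
For reference, a hypothetical fragment of such a CODEOWNERS entry; the team handle and
the exact paths are placeholders, not the ones added by the PR:

    # tag the core team on PRs touching core components
    src/osd/*  @ceph/core
    src/mon/*  @ceph/core
    src/osdc/* @ceph/core
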
Casey Bodley
ede6a85385 qa/rgw: fix flake8 errors in test_rgw_reshard.py
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2022-05-27 15:47:34 -04:00
Casey Bodley
7fc73716ce rgw/motr: fix build for MotrStore
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2022-05-27 15:47:34 -04:00
Adam C. Emerson
eb3606c370 rgw: RGWSyncBucketCR reads remote info on non-Incremental state
This ensures that the remote bucket index log info is available in
all cases where we're calling `InitBucketFullSyncStatusCR`.

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
2022-05-27 15:47:34 -04:00
Adam C. Emerson
5809082706 test/rgw: bucket sync run recovery case
1. Write several generations worth of objects. Ensure that everything
   has synced and that at least some generations have been trimmed.
2. Turn off the secondary `radosgw`.
3. Use `radosgw-admin object rm` to delete all objects in the bucket
   on the secondary.
4. Invoke `radosgw-admin bucket sync init` on the secondary.
5. Invoke `radosgw-admin bucket sync run` on the secondary.
6. Verify that all objects on the primary are also present on the
   secondary.

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
2022-05-27 15:47:34 -04:00
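
A rough sketch of the recovery drill in the numbered list above, expressed as
subprocess calls; the bucket name, object list, and any flags beyond --bucket/--object
are assumptions, while the radosgw-admin subcommands are the ones named in the commit:

    import subprocess
    from typing import List

    def admin(*args: str) -> None:
        # run radosgw-admin against the secondary zone's cluster
        subprocess.check_call(["radosgw-admin", *args])

    def recover_bucket(bucket: str, objects: List[str]) -> None:
        # the secondary radosgw is assumed to be stopped at this point
        for obj in objects:
            admin("object", "rm", "--bucket=%s" % bucket, "--object=%s" % obj)
        admin("bucket", "sync", "init", "--bucket=%s" % bucket)
        admin("bucket", "sync", "run", "--bucket=%s" % bucket)

    # usage (hypothetical)
    recover_bucket("test-bucket", ["obj-1", "obj-2"])
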
Adam C. Emerson
f59a3580ee test/rgw: Add incremental test of bucket sync run
This tests for iterating properly over the generations.

1. Create a bucket and write some objects to it. Wait for sync to
   complete. This ensures we are in Incremental.
2. Turn off the secondary `radosgw`.
3. Manually reshard. Then continue writing objects and resharding.
4. Choose objects so that each generation has objects in many but not
   all shards.
5. After building up several generations, run `bucket sync run` on the
   secondary.
6. Verify that all objects on the primary are on the secondary.

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
2022-05-27 15:47:34 -04:00
Adam C. Emerson
79932fcc19 rgw: add bucket object shard command to radosgw-admin
Given an object, return the bucket shard appropriate to it.

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
2022-05-27 15:47:34 -04:00
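
A hedged usage sketch for the new subcommand; only the "bucket object shard" name comes
from the commit, while the flags and values are assumptions following radosgw-admin's
usual conventions:

    import subprocess

    # hypothetical invocation: map an object to its bucket index shard
    out = subprocess.check_output(
        ["radosgw-admin", "bucket", "object", "shard",
         "--bucket=test-bucket", "--object=photos/cat.png"])
    print(out.decode())
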