RepoMirrors/ceph

mirror of https://github.com/ceph/ceph synced 2025-01-04 10:12:30 +00:00

Author	SHA1	Message	Date
zdover23	78f5e7ece0	Merge pull request #44213 from zdover23/wip-doc-2021-12-05-hardware-recommendations-removing-journal doc/start: remove journal info from hardware recs Reviewed-by: Dan van der Ster <daniel.vanderster@cern.ch>	2021-12-09 05:14:37 +10:00
Soumya Koduri	a7100972ad	rgw/dbstore: Multipart upload APIs For multipart upload processing, below is the method applied - MultipartUpload::Init - create head object entry for meta obj (src_obj_name + "." + upload_id) [ Meta object stores all the parts upload info] MultipartWriter::process - create all data/tail objects with obj_name same as meta obj (so that they can all be identified & deleted during abort) MultipartUpload::Abort - Just delete meta obj .. that will indirectly delete all the uploads associated with that upload id / meta obj so far. MultipartUpload::Complete - Create head object of the original object (if not exists). Rename all data/tail object entries' obj name to orig object name and update metadata of the orig object. Signed-off-by: Soumya Koduri <skoduri@redhat.com>	2021-12-08 23:03:12 +05:30
Daniel Gryniewicz	a09886f8b1	Merge pull request #44232 from Huber-ming/admin_fixlogs radosgw-admin: fix some error logs	2021-12-08 11:34:41 -05:00
Daniel Gryniewicz	4cdf6d4f1f	Merge pull request #43915 from qiuxinyidian/dev-rgw rgw: when radosgw-admin stating user, add user exists judging Reviewed-by: Casey Bodley <cbodley@redhat.com> Reviewed-by: Daniel Gryniewicz <dang@redhat.com>	2021-12-08 11:22:28 -05:00
Daniel Gryniewicz	0e8ec9654f	Merge pull request #43834 from Huber-ming/admin_bilog radosgw-admin: supplement help documents with 'bilog autotrim'	2021-12-08 11:19:12 -05:00
Zac Dover	276bbd8f2b	doc/start: remove journal info from hardware recs This PR removes mentions of journaling from the hardware recommendations. Journaling was a FileStore-related practice. BlueStore is the default backend for Ceph OSDs and has been since Luminous. The documentation should reflect that. Signed-off-by: Zac Dover <zac.dover@gmail.com>	2021-12-09 01:20:57 +10:00
Sebastian Wagner	16e60463de	Merge pull request #44093 from melissa-kun-li/ssh-non-root-user mgr/cephadm: support bootstrap with non-root ssh-user Reviewed-by: Michael Fritch <mfritch@suse.com> Reviewed-by: Sebastian Wagner <sewagner@redhat.com>	2021-12-08 15:51:36 +01:00
Sage Weil	f5973ccef4	mgr/progress: avoid inefficient dump of all pg stats We only use a handful of fields, and the pg dump includes a gazillion fields that we waste CPU copying to python-land. This tends to lead to long ClusterState::lock hold times, leading to long ms_dispatch delays and generally gumming up the works. Instead, create a new "pg_progress" item that dumps only the fields that mgr/progress needs. Fixes: https://tracker.ceph.com/issues/53475 Signed-off-by: Sage Weil <sage@newdream.net>	2021-12-08 07:13:27 -05:00
Sage Weil	1c741b4147	Merge PR #44162 into master * refs/pull/44162/head: mgr: only queue notify events that modules ask for pybind/mgr: annotate which events modules consume pybind/mgr: introduce NotifyType enum mgr: stop issuing events that no modules consume Reviewed-by: Sebastian Wagner <sewagner@redhat.com>	2021-12-08 07:11:53 -05:00
Sage Weil	907f38c151	Merge PR #44196 into master * refs/pull/44196/head: mon/MgrStatMonitor: do not spam subscribers (mgr) with service_map Reviewed-by: Neha Ojha <nojha@redhat.com>	2021-12-08 07:10:21 -05:00
Sage Weil	60e0ea02d7	Merge PR #44207 into master * refs/pull/44207/head: mgr/ActivePyModule: avoid with_gil where possible mgr/ActivePyModules: push without_gil_t down into blocks Reviewed-by: Kefu Chai <kchai@redhat.com>	2021-12-08 07:10:08 -05:00
Radoslaw Zarzynski	e714d6effb	Merge pull request #43542 from rzarzynski/wip-crimson-net-ms_learn_from_peer crimson/net: add support for ms_learn_addr_from_peer. Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com> Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>	2021-12-08 11:24:24 +01:00
Sebastian Wagner	9ead2e4523	Merge pull request #44104 from guits/guits-cephadm-skip-cv-restorecon cephadm: pass `CEPH_VOLUME_SKIP_RESTORECON=yes` Reviewed-by: Ken Dreyer <kdreyer@redhat.com> Reviewed-by: Sebastian Wagner <sewagner@redhat.com>	2021-12-08 10:25:49 +01:00
Sébastien Han	d89f30a1b1	Merge pull request #44239 from guits/guits-add-skip-needs-root ceph-volume: make it possible to skip needs_root()	2021-12-08 09:56:49 +01:00
Kefu Chai	d8c54e8b68	Merge pull request #44245 from tchaikov/wip-crimson-seastar seastar: pick up change to fix FTBFS with old cryptopp Reviewed-by: Samuel Just <sjust@redhat.com> Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com> Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>	2021-12-08 16:50:20 +08:00
Radoslaw Zarzynski	cdef16ecf7	crimson/net: add support for ms_learn_addr_from_peer. Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>	2021-12-08 08:46:50 +00:00
Samuel Just	4a84dac892	Merge pull request #44179 from athanatos/sjust/wip-pin-race crimson/os/seastore: initialize logical pins before exposing to cache Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com> Reviewed-by: Xuehan Xu <xxhdx1985126@gmail.com>	2021-12-07 23:43:17 -08:00
Kefu Chai	58cb9bace4	crimson/osd: s/seastar::fprint()/fmt::print()/ otherwise, we'd have warnings like: ./src/crimson/osd/main.cc:106:16: error: 'fprint<const std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > &>' is deprecated: use std::format_to() or fmt::print() [-Werror,-Wdeprecated-declarations] seastar::fprint(std::cerr, "already have key in keyring: %s\n", path); ^ Signed-off-by: Kefu Chai <tchaikov@gmail.com>	2021-12-08 15:39:49 +08:00
Kefu Chai	918eef4f61	seastar: pick up change to fix FTBFS with old cryptopp Signed-off-by: Kefu Chai <kefu@xsky.com>	2021-12-08 14:53:08 +08:00
Yuval Lifshitz	2e6e91838d	Merge pull request #44186 from qiuxinyidian/rgw-de rgw: add object null point judging when listing pubsub topics	2021-12-08 08:05:32 +02:00
Yingxin Cheng	bf9f669e06	crimson/os/seastore: measure the number of conflicting transactions by srcs Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>	2021-12-08 13:46:07 +08:00
Yingxin Cheng	4a6dd67f62	crimson/os/seastore: differentiate cleaner trim/reclaim transactions Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>	2021-12-08 13:46:07 +08:00
chunmei-liu	ce1ca97f84	crimson/osd: fix heartbeat front and back blank ip When ceph.conf does not set a public ip and cluster ip, the heartbeat gets a blank ip address. In osd::_send_boot, the classic osd checks whether the heartbeat front and back addrs are blank and, if they are, sets them to the public ip learned from the mon. Implement the same behavior in crimson osd. Signed-off-by: chunmei-liu <chunmei.liu@intel.com>	2021-12-07 16:57:54 -08:00
Kefu Chai	b03d3d9165	Merge pull request #44147 from rzarzynski/wip-crimson-new-seastar crimson: bump up Seastar to recent master and fix FTBFS Reviewed-by: Samuel Just <sjust@redhat.com> Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>	2021-12-08 08:56:11 +08:00
Neha Ojha	3304a82bfd	Merge pull request #44095 from Matan-B/wip-matanb-local-workunits doc/dev: Running workunits locally Reviewed-by: Josh Durgin <jdurgin@redhat.com>	2021-12-07 13:34:15 -08:00
Yuval Lifshitz	365bfd1437	Merge pull request #43940 from TRYTOBE8TME/wip-rgw-empty-config src/rgw: Empty configuration support	2021-12-07 21:20:33 +02:00
Yuval Lifshitz	a2d9f222bb	Merge pull request #43665 from zenomri/wip-omri-multipart-trace rgw/tracer: Multipart upload trace	2021-12-07 21:19:27 +02:00
Samuel Just	0ded1b2b6a	Merge pull request #44156 from rzarzynski/wip-crimson-fix-process_op-sequencing crimson/osd: fix sequencing issues in ClientRequest::process_op. Reviewed-by: Samuel Just <sjust@redhat.com>	2021-12-07 07:53:21 -08:00
Samuel Just	d029e2e989	Merge pull request #44223 from rzarzynski/wip-crimson-fix-pullinfo-on-push crimson/osd: don't assume a pull must happen if there is no push. Reviewed-by: Samuel Just <sjust@redhat.com> Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>	2021-12-07 07:52:41 -08:00
Samuel Just	d4ad98c15f	Merge pull request #44224 from rzarzynski/wip-crimson-clean-msghs crimson/osd: clean the recovery message-related header inclusion. Reviewed-by: Chunmei Liu <chunmei.liu@intel.com> Reviewed-by: Samuel Just <sjust@redhat.com>	2021-12-07 07:52:02 -08:00
Guillaume Abrioux	691660c42e	ceph-volume: fix error 'KeyError' with inventory The tag ceph.cluster_name is always set at the end. The only way it could be absent is if the osd prepare was interrupted between [1] and [2]. [1] https://github.com/ceph/ceph/blob/v14.2.11/src/ceph-volume/ceph_volume/devices/lvm/strategies/bluestore.py#L355-L387 [2] https://github.com/ceph/ceph/blob/v14.2.11/src/ceph-volume/ceph_volume/devices/lvm/prepare.py Although the code has changed substantially in the meantime and this error shouldn't show up again, we need to handle the case where this tag wouldn't have been set. Fixes: https://tracker.ceph.com/issues/44356 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-12-07 16:50:28 +01:00
Samuel Just	b7dfff6cf1	Merge pull request #44184 from rzarzynski/wip-crimson-internal_client_request-fix-hobj crimson/osd: fix assertion failure in InternalClientRequest. Reviewed-by: Samuel Just <sjust@redhat.com> Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>	2021-12-07 07:49:08 -08:00
Guillaume Abrioux	c24d3666c0	ceph-volume: fix tags dict output in `lvm list` Default value for `--crush-device-class` is `None`. When not passing this parameter, ceph-volume sets the value "None" in the lv tags. Therefore, ceph-volume will output that value with calling `ceph-volume lvm list --format json` For instance: ``` "1": [ { "devices": [ "/dev/sdc" ], "lv_name": "osd-data-5a4a34f5-5733-4c69-b439-edb48e31a45f", "lv_path": "/dev/ceph-aeb16fc3-9ac2-4126-ab66-bf920d101ea4/osd-data-5a4a34f5-5733-4c69-b439-edb48e31a45f", "lv_size": "49.00g", "lv_tags": "ceph.block_device=/dev/ceph-aeb16fc3-9ac2-4126-ab66-bf920d101ea4/osd-data-5a4a34f5-5733-4c69-b439-edb48e31a45f,ceph.block_uuid=E9hZNU-80Zz-PiER-iWN3-jSIU-krEN-khwU3x,ceph.cephx_lockbox_secret=,ceph.cluster_fsid=40fe4af5-0408-444b-843c-0926d550d1f1,ceph.cluster_name=ceph,ceph.crush_device_class=None,ceph.encrypted=0,ceph.osd_fsid=39680838-19df-4e50-9bb6-46b093d5b52b,ceph.osd_id=1,ceph.type=block,ceph.vdo=0", "lv_uuid": "E9hZNU-80Zz-PiER-iWN3-jSIU-krEN-khwU3x", "name": "osd-data-5a4a34f5-5733-4c69-b439-edb48e31a45f", "path": "/dev/ceph-aeb16fc3-9ac2-4126-ab66-bf920d101ea4/osd-data-5a4a34f5-5733-4c69-b439-edb48e31a45f", "tags": { "ceph.block_device": "/dev/ceph-aeb16fc3-9ac2-4126-ab66-bf920d101ea4/osd-data-5a4a34f5-5733-4c69-b439-edb48e31a45f", "ceph.block_uuid": "E9hZNU-80Zz-PiER-iWN3-jSIU-krEN-khwU3x", "ceph.cephx_lockbox_secret": "", "ceph.cluster_fsid": "40fe4af5-0408-444b-843c-0926d550d1f1", "ceph.cluster_name": "ceph", "ceph.crush_device_class": "None", ``` ceph-volume should print `"ceph.crush_device_class": "",` instead of `"ceph.crush_device_class": "None",` Fixes: https://tracker.ceph.com/issues/53425 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-12-07 16:48:29 +01:00
Pritha Srivastava	123f508a80	rgw: deleting objects inline in case bypass_gc is specified for bucket remove command. fixes: https://tracker.ceph.com/issues/53512 Signed-off-by: Pritha Srivastava <prsrivas@redhat.com>	2021-12-07 20:57:21 +05:30
Guillaume Abrioux	068a1d2a30	ceph-volume: make it possible to skip needs_root() Add the possibility to skip the `needs_root()` decorator. See linked tracker for details. Fixes: https://tracker.ceph.com/issues/53511 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-12-07 15:18:10 +01:00
Alfonso Martínez	6628f444b3	Merge pull request #44145 from rhcs-dashboard/fix-frontend-vulnerabilities mgr/dashboard: fix frontend deps' vulnerabilities Reviewed-by: Waad Alkhoury <walkhour@redhat.com> Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>	2021-12-07 15:02:42 +01:00
Radoslaw Zarzynski	be0ba67623	crimson/osd: fix sequencing issues in ClientRequest::process_op. The following crash has been observed in one of the runs at Sepia: ``` ceph-osd: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-8898-ge57ad63c/rpm/el8/BUILD/ceph-17.0.0-8898-ge57ad63c/src/crimson/osd/osd_operation_sequencer.h:123: void crimson::osd::OpSequencer::finish_op_in_order(crimson::osd::ClientRequest&): Assertion `op.get_id() > last_completed_id' failed. Aborting on shard 0. Backtrace: Reactor stalled for 1807 ms on shard 0. Backtrace: 0xb14ab 0x46e57428 0x46bc450d 0x46be03bd 0x46be0782 0x46be0946 0x46be0bf6 0x12b1f 0x137341 0x3fdd6a92 0x3fddccdb 0x3fdde1ee 0x3fdde8b3 0x3fdd3f2b 0x3fdd4442 0x3fdd4c3a 0x12b1f 0x3737e 0x21db4 0x21c88 0x2fa75 0x3b769527 0x3b8418af 0x3b8423cb 0x3b842ce0 0x3b84383d 0x3a116220 0x3a143f31 0x3a144bcd 0x46b96271 0x46bde51a 0x46d6891b 0x46d6a8f0 0x4681a7d2 0x4681f03b 0x39fd50f2 0x23492 0x39b7a7dd 0# gsignal in /lib64/libc.so.6 1# abort in /lib64/libc.so.6 2# 0x00007FB9FB946C89 in /lib64/libc.so.6 3# 0x00007FB9FB954A76 in /lib64/libc.so.6 4# 0x00005595E98E6528 in ceph-osd 5# 0x00005595E99BE8B0 in ceph-osd 6# 0x00005595E99BF3CC in ceph-osd 7# 0x00005595E99BFCE1 in ceph-osd 8# 0x00005595E99C083E in ceph-osd 9# 0x00005595E8293221 in ceph-osd 10# 0x00005595E82C0F32 in ceph-osd 11# 0x00005595E82C1BCE in ceph-osd 12# 0x00005595F4D13272 in ceph-osd 13# 0x00005595F4D5B51B in ceph-osd 14# 0x00005595F4EE591C in ceph-osd 15# 0x00005595F4EE78F1 in ceph-osd 16# 0x00005595F49977D3 in ceph-osd 17# 0x00005595F499C03C in ceph-osd 18# main in ceph-osd 19# __libc_start_main in /lib64/libc.so.6 20# _start in ceph-osd ``` The sequence of events provides at least two clues: - the op no. 32 finished before the op no. 29 which was waiting for `ObjectContext`, - the op no. 29 was a short-living one -- it wasn't waiting even on `obc`. 
``` rzarzynski@teuthology:/home/teuthworker/archive/rzarzynski-2021-11-22_22:01:32-rados-master-distro-basic-smithi$ less ./6520106/remote/smithi115/log/ceph-osd.3.log.gz ... DEBUG 2021-11-22 22:32:24,531 [shard 0] osd - client_request(id=29, detail=m=[osd_op(client.4371.0:36 4.d 4.f0fb5e1d (undecoded) ondisk+retry+read+rwordered+known_if_redirected+supports_pool_eio e23) v8]): start DEBUG 2021-11-22 22:32:24,531 [shard 0] osd - client_request(id=29, detail=m=[osd_op(client.4371.0:36 4.d 4.f0fb5e1d (undecoded) ondisk+retry+read+rwordered+known_if_redirected+supports_pool_eio e23) v8]): in repeat ... DEBUG 2021-11-22 22:32:24,546 [shard 0] osd - client_request(id=29, detail=m=[osd_op(client.4371.0:36 4.d 4.f0fb5e1d (undecoded) ondisk+retry+read+rwordered+known_if_redirected+supports_pool_eio e23) v8]) same_interval_since: 21 DEBUG 2021-11-22 22:32:24,546 [shard 0] osd - OpSequencer::start_op: op=29, last_started=27, last_unblocked=27, last_completed=27 ... DEBUG 2021-11-22 22:32:24,621 [shard 0] osd - client_request(id=32, detail=m=[osd_op(client.4371.0:49 4.d 4.81addbad (undecoded) ondisk+retry+write+known_if_redirected+supports_pool_eio e23) v8]): start DEBUG 2021-11-22 22:32:24,621 [shard 0] osd - client_request(id=32, detail=m=[osd_op(client.4371.0:49 4.d 4.81addbad (undecoded) ondisk+retry+write+known_if_redirected+supports_pool_eio e23) v8]): in repeat ... DEBUG 2021-11-22 22:32:24,626 [shard 0] osd - client_request(id=32, detail=m=[osd_op(client.4371.0:49 4.d 4.81addbad (undecoded) ondisk+retry+write+known_if_redirected+supports_pool_eio e23) v8]) same_interval_since: 21 DEBUG 2021-11-22 22:32:24,626 [shard 0] osd - OpSequencer::start_op: op=32, last_started=29, last_unblocked=29, last_completed=27 <note that op 32 is very short living> DEBUG 2021-11-22 22:32:24,669 [shard 0] osd - OpSequencer::finish_op_in_order: op=32, last_started=32, last_unblocked=32, last_completed=27 ... 
DEBUG 2021-11-22 22:32:24,671 [shard 0] osd - client_request(id=32, detail=m=[osd_op(client.4371.0:49 4.d 4:b5dbb581:::smithi11538976-13:head {write 601684~619341 in=619341b, stat} snapc 0={} RETRY=1 ondisk+retry+write+known_if_redirected+supports_pool_eio e23) v8]): destroying ... DEBUG 2021-11-22 22:32:24,722 [shard 0] osd - client_request(id=29, detail=m=[osd_op(client.4371.0:36 4.d 4:b87adf0f:::smithi11538976-9:head {read 0~1} snapc 0={} RETRY=1 ondisk+retry+read+rwordered+known_if_redirected+supports_pool_eio e23) v8]): got obc lock ... INFO 2021-11-22 22:32:24,723 [shard 0] osd - client_request(id=29, detail=m=[osd_op(client.4371.0:36 4.d 4:b87adf0f:::smithi11538976-9:head {read 0~1} snapc 0={} RETRY=1 ondisk+retry+read+rwordered+known_if_redirected+supports_pool_eio e23) v8]) obc.get()=0x6190000d5780 ... DEBUG 2021-11-22 22:32:24,753 [shard 0] osd - OpSequencer::finish_op_in_order: op=29, last_started=32, last_unblocked=32, last_completed=32 ceph-osd: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-8898-ge57ad63c/rpm/el8/BUILD/ceph-17.0.0-8898-ge57ad63c/src/crimson/osd/osd_operation_sequencer.h:123: void crimson::osd::OpSequencer::finish_op_in_order(crimson::osd::ClientRequest&): Assertion `op.get_id() > last_completed_id' failed. Aborting on shard 0. ``` This could be explained in a scenario where: - op no. 29 skipped stages of the execution pipeline while - it wrongly informed `OpSequencer` the execution was in-order. Static analysis shows there are multiple problems of this genre in the `ClientRequest::process_op()` and its callees with the most recently merged one being the path for `PG::already_complete()`. Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>	2021-12-07 09:36:24 +00:00
Samuel Just	4fefd80043	crimson/os/seastore/lba_manager: initialize lba node pins using get_extent Signed-off-by: Samuel Just <sjust@redhat.com>	2021-12-07 08:30:07 +00:00
Samuel Just	c32300258d	crimson/os/seastore: initialize logical pins before exposing to cache Otherwise, another task may get a reference to the extent before we've set the pin. Fixes: https://tracker.ceph.com/issues/53267 Signed-off-by: Samuel Just <sjust@redhat.com>	2021-12-07 07:13:59 +00:00
Samuel Just	347d7d0f26	Merge pull request #44231 from xxhdx1985126/wip-cpu-profile crimson/os/seastore: fix compiler error for gcc > 9 and clang13 Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com> Reviewed-by: Samuel Just <sjust@redhat.com>	2021-12-06 22:21:55 -08:00
Huber-ming	51cde60319	radosgw-admin: fix some error logs Signed-off-by: Huber-ming <zhangsm01@inspur.com>	2021-12-07 13:59:47 +08:00
Xuehan Xu	5829e03a3d	crimson/os/seastore: fix compiler error for gcc > 9 and clang13 Signed-off-by: Xuehan Xu <xxhdx1985126@gmail.com>	2021-12-07 12:27:14 +08:00
Sage Weil	30ac5e7935	osd/OSDMapMapping: fix spurious threadpool timeout errors We were passing a grace of zero seconds to our temporary work queue, which led to the HeartbeatMap issuing cpu_tp timeout errors to the log. By using a non-zero grace period we can avoid these. Use the same default grace we use for the workqueue itself when it goes to sleep. Fixes: https://tracker.ceph.com/issues/53506 Signed-off-by: Sage Weil <sage@newdream.net>	2021-12-06 13:13:06 -05:00
David Galloway	92404026b1	Merge pull request #44222 from ceph/wip-m2r doc: Use older mistune	2021-12-06 12:58:22 -05:00
Radoslaw Zarzynski	72e1ab8c2e	crimson/osd: clean the recovery message-related header inclusion. Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>	2021-12-06 17:46:02 +00:00
Radoslaw Zarzynski	3f745b9eed	crimson/osd: don't assume a pull must happen if there is no push. In the classical OSD the `ReplicatedRecoveryBackend::recover_object()` divides into two main flows: pull and push: ```cpp int ReplicatedBackend::recover_object( const hobject_t &hoid, // ... ) { dout(10) << __func__ << ": " << hoid << dendl; RPGHandle *h = static_cast<RPGHandle *>(_h); if (get_parent()->get_local_missing().is_missing(hoid)) { ceph_assert(!obc); // pull prepare_pull( v, hoid, head, h); } else { ceph_assert(obc); int started = start_pushes( hoid, obc, h); // ... } return 0; } ``` Pulls may also enter the push path (`C_ReplicatedBackend_OnPullComplete`) but push handling doesn't make any assumption about that. What's important, `recover_object()` may result in neither pulls nor pushes. This isn't the case in crimson, as its implementation of the push path asserts that, if no push is scheduled, `PullInfo` must be allocated. This patch reworks this logic to reflect the classical one and to avoid crashes like the following one: ``` DEBUG 2021-12-01 18:43:00,220 [shard 0] osd - recover_object: loaded obc: 3:4e058a2e:::smithi13839607-45:head WARN 2021-12-01 18:43:00,220 [shard 0] none - intrusive_ptr_add_ref(p=0x6190000d7f80, use_count=3) WARN 2021-12-01 18:43:00,220 [shard 0] none - intrusive_ptr_release(p=0x6190000d7f80, use_count=4) TRACE 2021-12-01 18:43:00,220 [shard 0] osd - call_with_interruption_impl clearing interrupt_cond: 0x60300012b210,N7crimson3osd20IOInterruptConditionE TRACE 2021-12-01 18:43:00,220 [shard 0] osd - call_with_interruption_impl: may_interrupt: false, local interrupt_condintion: 0x60300012b210, global interrupt_cond: 0x0,N7crimson3osd20IOInterruptConditionE TRACE 2021-12-01 18:43:00,220 [shard 0] osd - set: interrupt_cond: 0x60300012b210, ref_count: 1 ceph-osd: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-8902-g52fd47fe/rpm/el8/BUILD/ceph-17.0.0-8902-g52fd47fe/src/crimson/osd/replicated_recovery_backend.cc:84: ReplicatedRecoveryBackend::maybe_push_shards(const hobject_t&, eversion_t)::<lambda()>: Assertion `recovery.pi' failed. Aborting on shard 0. ``` Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>	2021-12-06 17:34:20 +00:00
David Galloway	ed2ad24a4b	doc: Use older mistune https://github.com/miyakogi/m2r/issues/66 Signed-off-by: David Galloway <dgallowa@redhat.com>	2021-12-06 10:32:56 -05:00
locallocal	1428544ec6	os/bluestore: don't need separate variable to mark hits when lookup oid. Signed-off-by: locallocal <locallocal@163.com>	2021-12-06 10:26:53 +08:00
benhanokh	ac826d1665	Merge pull request #43870 from benhanokh/restore_alloc_file NCB::refresh allocation-file after FSCK remove	2021-12-05 09:47:49 +02:00
Gabriel BenHanokh	cc87bef99e	BlueStore: Fix a bug when FSCK is invoked in mount()/umount()/mkfs() with DEEP option Fixes: https://tracker.ceph.com/issues/53185 NCB mishandles fsck DEEP in the mount()/umount()/mkfs() case, causing it to remove the allocation-file without destaging a new copy (which will cost us a full rebuild on startup). There are also a few conflicting calls to open_db()/close_db() passing an inconsistent read-only flag. We fix both issues by storing the open-db type (read-only/read-write) and using it for close-db (which won't pass a read-only flag anymore). We also move the allocation-file destage to close-db so it will be refreshed after being removed by fsck and such. Signed-off-by: Gabriel Benhanokh <gbenhano@redhat.com>	2021-12-04 23:59:39 +02:00
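
The `lvm list` fix described in commit c24d3666c0 above (print `"ceph.crush_device_class": ""` instead of `"None"`) can be illustrated with a minimal Python sketch. This is not the actual ceph-volume code: the function name `parse_lv_tags` and the shortened sample string are hypothetical, and the only things taken from the commit message are the comma-separated `key=value` layout of `lv_tags` and the desired output for an unset device class.

```python
import json


def parse_lv_tags(lv_tags: str) -> dict:
    """Parse a comma-separated 'key=value' LV tag string into a dict.

    Hypothetical helper for illustration only; not ceph-volume's API.
    """
    tags = {}
    for item in lv_tags.split(","):
        if not item:
            continue  # skip empty fragments such as a trailing comma
        key, _, value = item.partition("=")
        # Assumption: an unset --crush-device-class was stored as the
        # literal string "None"; report it as an empty string instead.
        if key == "ceph.crush_device_class" and value == "None":
            value = ""
        tags[key] = value
    return tags


# Shortened sample modeled on the lv_tags string quoted in the commit message.
sample = (
    "ceph.cluster_name=ceph,ceph.crush_device_class=None,"
    "ceph.encrypted=0,ceph.osd_id=1"
)
tags = parse_lv_tags(sample)
print(json.dumps(tags, indent=2, sort_keys=True))
```

Normalizing at parse time keeps the JSON output free of the stringified Python `None`, which a consumer could otherwise mistake for a real device class name.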


128634 Commits