128415 Commits

Author SHA1 Message Date
Yingxin Cheng
c5093c8048 crimson/os/seastore: mark out empty transactions
TODO: avoid write if the transaction is empty.

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2021-12-09 09:37:05 +08:00
Yingxin Cheng
310ed9ee81 crimson/os/seastore: refactor, introduce record_t and record_group_t with sizes
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2021-12-09 09:37:05 +08:00
Yingxin Cheng
28fec46261 crimson/os/seastore: scan records based on record_locator_t
Record may not have its own base if headers are merged.

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2021-12-09 09:37:05 +08:00
Yingxin Cheng
3f90994e58 crimson/os/seastore: add more checks when read record_header_t
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2021-12-09 09:37:05 +08:00
Yingxin Cheng
2241111087 crimson/os/seastore: misc cleanup and reformat
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2021-12-09 09:37:05 +08:00
Yingxin Cheng
2cd753aa7d crimson/os/seastore: add logs in ExtentReader
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2021-12-09 09:37:05 +08:00
Yuri Weinstein
58faf5712e
Merge pull request #43919 from ronen-fr/wip-rf-test-nodeep
osd/scrub (& qa/standalone): test for scrub behavior when no-scrub is set but no-deep-scrub is not


Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Matan Breizman <Matan.Brz@gmail.com>
2021-12-08 13:04:57 -08:00
Yuri Weinstein
87bb5c601f
Merge pull request #43305 from heylinn/ceph_rundir_sysvinit
init-ceph: create /var/run/ceph for sysvinit

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-12-08 13:04:14 -08:00
Yuri Weinstein
096c11d1ff
Merge pull request #44090 from sseshasa/wip-fix-require-osd-release
osd/OSDMap: Add health warning if 'require-osd-release' != current release

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Ronen Friedman <rfriedma@redhat.com>
2021-12-08 13:02:50 -08:00
Samuel Just
cc9322f17f
Merge pull request #44244 from cyx1231st/wip-seastore-refine-metrics
crimson/os/seastore: refine transaction metrics

Reviewed-by: Samuel Just <sjust@redhat.com>
2021-12-08 12:02:45 -08:00
Samuel Just
c10a0a93af
Merge pull request #44242 from liu-chunmei/crimson-fix-heartbeat-addrs
crimson/osd: fix heartbeat front and back blank ip

Reviewed-by: Samuel Just <sjust@redhat.com>
2021-12-08 12:02:04 -08:00
Igor Fedotov
a0873066e5
Merge pull request #44098 from ifed01/wip-ifed-dump-alloc-unit
os/bluestore: dump bluestore/bluefs alloc unit sizes with perf dump

Reviewed-by: Laura Flores <lflores@redhat.com>
2021-12-08 22:21:40 +03:00
Igor Fedotov
f7e76a975c
Merge pull request #43840 from ifed01/wip-ifed-verbose-open-col
osd,bluestore: gracefully handle a failure during meta collection load

Reviewed-by: jdurgin@redhat.com
Reviewed-by: nojha@redhat.com
2021-12-08 22:19:22 +03:00
Igor Fedotov
3c267248a7
Merge pull request #39691 from aclamk/wip-bs-compression-blob-64k
os/bluestore: Set new compression blob size to 64K

Reviewed-by: Igor Fedotov <ifedotov@suse.com>
2021-12-08 22:17:38 +03:00
zdover23
78f5e7ece0
Merge pull request #44213 from zdover23/wip-doc-2021-12-05-hardware-recommendations-removing-journal
doc/start: remove journal info from hardware recs

Reviewed-by: Dan van der Ster <daniel.vanderster@cern.ch>
2021-12-09 05:14:37 +10:00
Daniel Gryniewicz
a09886f8b1
Merge pull request #44232 from Huber-ming/admin_fixlogs
radosgw-admin: fix some error logs
2021-12-08 11:34:41 -05:00
Daniel Gryniewicz
4cdf6d4f1f
Merge pull request #43915 from qiuxinyidian/dev-rgw
rgw: when radosgw-admin stating user, add user exists judging

Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
2021-12-08 11:22:28 -05:00
Daniel Gryniewicz
0e8ec9654f
Merge pull request #43834 from Huber-ming/admin_bilog
radosgw-admin: supplement help documents with 'bilog autotrim'
2021-12-08 11:19:12 -05:00
Zac Dover
276bbd8f2b doc/start: remove journal info from hardware recs
This PR removes mentions of journaling from the hardware
recommendations.

Journaling was a FileStore-related practice. BlueStore is
the default backend for Ceph OSDs and has been since
Luminous. The documentation should reflect that.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
2021-12-09 01:20:57 +10:00
Sebastian Wagner
16e60463de
Merge pull request #44093 from melissa-kun-li/ssh-non-root-user
mgr/cephadm: support bootstrap with non-root ssh-user

Reviewed-by: Michael Fritch <mfritch@suse.com>
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
2021-12-08 15:51:36 +01:00
Sage Weil
1c741b4147 Merge PR #44162 into master
* refs/pull/44162/head:
	mgr: only queue notify events that modules ask for
	pybind/mgr: annotate which events modules consume
	pybind/mgr: introduce NotifyType enum
	mgr: stop issuing events that no modules consume

Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
2021-12-08 07:11:53 -05:00
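
The idea behind this merge, queuing a notification only for modules that declared interest in that event type, can be sketched as below. This is a minimal Python model and not the mgr's actual code: `NotifyType` takes its name from the commit title, but its members, the `NOTIFY_TYPES` attribute, and the `Dispatcher` class are assumptions made for illustration.

```python
from enum import Enum


class NotifyType(Enum):
    # Hypothetical members; the log does not list the real enum values.
    osd_map = "osd_map"
    mon_map = "mon_map"
    pg_summary = "pg_summary"


class Module:
    # Each module declares the events it consumes (assumed attribute name).
    NOTIFY_TYPES = []

    def __init__(self):
        self.received = []

    def notify(self, notify_type, notify_id):
        self.received.append((notify_type, notify_id))


class OsdWatcher(Module):
    NOTIFY_TYPES = [NotifyType.osd_map]


class Dispatcher:
    def __init__(self, modules):
        self.modules = modules
        self.queued = 0  # number of events actually handed to a module

    def notify_all(self, notify_type, notify_id):
        for module in self.modules:
            # Skip modules that did not ask for this event type.
            if notify_type not in module.NOTIFY_TYPES:
                continue
            self.queued += 1
            module.notify(notify_type, notify_id)


watcher = OsdWatcher()
dispatcher = Dispatcher([watcher, Module()])
dispatcher.notify_all(NotifyType.osd_map, "42")    # delivered to watcher only
dispatcher.notify_all(NotifyType.pg_summary, "")   # nobody consumes it
```

In this model the second call queues nothing, which is the saving the merge title describes: events that no module consumes are never issued.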
Sage Weil
907f38c151 Merge PR #44196 into master
* refs/pull/44196/head:
	mon/MgrStatMonitor: do not spam subscribers (mgr) with service_map

Reviewed-by: Neha Ojha <nojha@redhat.com>
2021-12-08 07:10:21 -05:00
Sage Weil
60e0ea02d7 Merge PR #44207 into master
* refs/pull/44207/head:
	mgr/ActivePyModule: avoid with_gil where possible
	mgr/ActivePyModules: push without_gil_t down into blocks

Reviewed-by: Kefu Chai <kchai@redhat.com>
2021-12-08 07:10:08 -05:00
Radoslaw Zarzynski
e714d6effb
Merge pull request #43542 from rzarzynski/wip-crimson-net-ms_learn_from_peer
crimson/net: add support for ms_learn_addr_from_peer.

Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>
2021-12-08 11:24:24 +01:00
Sebastian Wagner
9ead2e4523
Merge pull request #44104 from guits/guits-cephadm-skip-cv-restorecon
cephadm: pass `CEPH_VOLUME_SKIP_RESTORECON=yes`

Reviewed-by: Ken Dreyer <kdreyer@redhat.com>
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
2021-12-08 10:25:49 +01:00
Sébastien Han
d89f30a1b1
Merge pull request #44239 from guits/guits-add-skip-needs-root
ceph-volume: make it possible to skip needs_root()
2021-12-08 09:56:49 +01:00
Kefu Chai
d8c54e8b68
Merge pull request #44245 from tchaikov/wip-crimson-seastar
seastar: pick up change to fix FTBFS with old cryptopp

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2021-12-08 16:50:20 +08:00
Radoslaw Zarzynski
cdef16ecf7 crimson/net: add support for ms_learn_addr_from_peer.
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2021-12-08 08:46:50 +00:00
Samuel Just
4a84dac892
Merge pull request #44179 from athanatos/sjust/wip-pin-race
crimson/os/seastore: initialize logical pins before exposing to cache

Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
Reviewed-by: Xuehan Xu <xxhdx1985126@gmail.com>
2021-12-07 23:43:17 -08:00
Kefu Chai
58cb9bace4 crimson/osd: s/seastar::fprint()/fmt::print()/
otherwise, we'd have warnings like:

./src/crimson/osd/main.cc:106:16: error: 'fprint<const std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > &>' is deprecated: use std::format_to() or fmt::print() [-Werror,-Wdeprecated-declarations]
      seastar::fprint(std::cerr, "already have key in keyring: %s\n", path);
               ^

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
2021-12-08 15:39:49 +08:00
Kefu Chai
918eef4f61 seastar: pick up change to fix FTBFS with old cryptopp
Signed-off-by: Kefu Chai <kefu@xsky.com>
2021-12-08 14:53:08 +08:00
Yuval Lifshitz
2e6e91838d
Merge pull request #44186 from qiuxinyidian/rgw-de
rgw: add object null point judging when listing pubsub topics
2021-12-08 08:05:32 +02:00
Yingxin Cheng
bf9f669e06 crimson/os/seastore: measure the number of conflicting transactions by srcs
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2021-12-08 13:46:07 +08:00
Yingxin Cheng
4a6dd67f62 crimson/os/seastore: differentiate cleaner trim/reclaim transactions
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2021-12-08 13:46:07 +08:00
chunmei-liu
ce1ca97f84 crimson/osd: fix heartbeat front and back blank ip
When ceph.conf does not set a public IP and a cluster IP, the heartbeat gets a blank IP address. In osd::_send_boot, the classic OSD checks whether the heartbeat front and back addrs have blank IPs; if they do, it sets them to the public IP learned from the mon. Implement the same behavior in the crimson OSD.

Signed-off-by: chunmei-liu <chunmei.liu@intel.com>
2021-12-07 16:57:54 -08:00
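
The fallback described in this commit message can be illustrated with a small Python sketch. It is a simplified model, not the crimson or classic OSD code: the `Addr` type, the `BLANK_IP` constant, the function name, and the choice to keep each heartbeat address's own port are all assumptions.

```python
from dataclasses import dataclass

BLANK_IP = "0.0.0.0"  # assumed representation of a blank address


@dataclass
class Addr:
    ip: str
    port: int

    def is_blank_ip(self):
        return self.ip == BLANK_IP


def fill_blank_heartbeat_addrs(front, back, public):
    """Return (front, back) with blank IPs replaced by the public IP.

    Ports are preserved; that detail is an assumption of this sketch.
    """
    def fix(addr):
        if addr.is_blank_ip():
            return Addr(public.ip, addr.port)
        return addr

    return fix(front), fix(back)


# The front address is blank, so it picks up the IP learned from the mon;
# the back address is already set and is left untouched.
front, back = fill_blank_heartbeat_addrs(
    Addr(BLANK_IP, 6802), Addr("10.0.0.5", 6803), Addr("192.168.1.10", 6800)
)
```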
Kefu Chai
b03d3d9165
Merge pull request #44147 from rzarzynski/wip-crimson-new-seastar
crimson: bump up Seastar to recent master and fix FTBFS

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>
2021-12-08 08:56:11 +08:00
Neha Ojha
3304a82bfd
Merge pull request #44095 from Matan-B/wip-matanb-local-workunits
doc/dev: Running workunits locally

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2021-12-07 13:34:15 -08:00
Yuval Lifshitz
365bfd1437
Merge pull request #43940 from TRYTOBE8TME/wip-rgw-empty-config
src/rgw: Empty configuration support
2021-12-07 21:20:33 +02:00
Yuval Lifshitz
a2d9f222bb
Merge pull request #43665 from zenomri/wip-omri-multipart-trace
rgw/tracer: Multipart upload trace
2021-12-07 21:19:27 +02:00
Samuel Just
0ded1b2b6a
Merge pull request #44156 from rzarzynski/wip-crimson-fix-process_op-sequencing
crimson/osd: fix sequencing issues in ClientRequest::process_op.

Reviewed-by: Samuel Just <sjust@redhat.com>
2021-12-07 07:53:21 -08:00
Samuel Just
d029e2e989
Merge pull request #44223 from rzarzynski/wip-crimson-fix-pullinfo-on-push
crimson/osd: don't assume a pull must happen if there is no push.

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>
2021-12-07 07:52:41 -08:00
Samuel Just
d4ad98c15f
Merge pull request #44224 from rzarzynski/wip-crimson-clean-msghs
crimson/osd: clean the recovery message-related header inclusion.

Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
2021-12-07 07:52:02 -08:00
Samuel Just
b7dfff6cf1
Merge pull request #44184 from rzarzynski/wip-crimson-internal_client_request-fix-hobj
crimson/osd: fix assertion failure in InternalClientRequest.

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>
2021-12-07 07:49:08 -08:00
Guillaume Abrioux
068a1d2a30 ceph-volume: make it possible to skip needs_root()
Add the possibility to skip the `needs_root()` decorator.
See linked tracker for details.

Fixes: https://tracker.ceph.com/issues/53511

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-12-07 15:18:10 +01:00
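
A root-check decorator with an opt-out might look roughly like the sketch below. It is not ceph-volume's implementation: the environment variable name `DEMO_SKIP_NEEDS_ROOT`, the `SuperUserError` class, and the `make_needs_root` factory (used so the uid lookup can be swapped out) are hypothetical.

```python
import functools
import os

SKIP_ENV = "DEMO_SKIP_NEEDS_ROOT"  # hypothetical variable name


class SuperUserError(Exception):
    pass


def make_needs_root(get_uid=os.getuid, environ=os.environ):
    """Build a needs_root decorator; the arguments are injectable for testing."""
    def needs_root(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # The check is bypassed only when the opt-out variable is set.
            skip = environ.get(SKIP_ENV, "").lower() in ("1", "yes", "true")
            if not skip and get_uid() != 0:
                raise SuperUserError("this command needs to be executed as root")
            return func(*args, **kwargs)
        return wrapper
    return needs_root


# Simulate an unprivileged user with a private environment mapping.
env = {}
needs_root = make_needs_root(get_uid=lambda: 1000, environ=env)


@needs_root
def list_devices():
    return ["/dev/sdb"]
```

With `env` empty, calling `list_devices()` raises `SuperUserError`; after `env[SKIP_ENV] = "yes"` the call goes through without the root check.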
Alfonso Martínez
6628f444b3
Merge pull request #44145 from rhcs-dashboard/fix-frontend-vulnerabilities
mgr/dashboard: fix frontend deps' vulnerabilities

Reviewed-by: Waad Alkhoury <walkhour@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
2021-12-07 15:02:42 +01:00
Radoslaw Zarzynski
be0ba67623 crimson/osd: fix sequencing issues in ClientRequest::process_op.
The following crash has been observed in one of the runs at Sepia:

```
ceph-osd: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-8898-ge57ad63c/rpm/el8/BUILD/ceph-17.0.0-8898-ge57ad63c/src/crimson/osd/osd_operation_sequencer.h:123: void crimson::osd::OpSequencer::finish_op_in_order(crimson::osd::ClientRequest&): Assertion `op.get_id() > last_completed_id' failed.
Aborting on shard 0.
Backtrace:
Reactor stalled for 1807 ms on shard 0. Backtrace: 0xb14ab 0x46e57428 0x46bc450d 0x46be03bd 0x46be0782 0x46be0946 0x46be0bf6 0x12b1f 0x137341 0x3fdd6a92 0x3fddccdb 0x3fdde1ee 0x3fdde8b3 0x3fdd3f2b 0x3fdd4442 0x3fdd4c3a 0x12b1f 0x3737e 0x21db4 0x21c88 0x2fa75 0x3b769527 0x3b8418af 0x3b8423cb 0x3b842ce0 0x3b84383d 0x3a116220 0x3a143f31 0x3a144bcd 0x46b96271 0x46bde51a 0x46d6891b 0x46d6a8f0 0x4681a7d2 0x4681f03b 0x39fd50f2 0x23492 0x39b7a7dd
 0# gsignal in /lib64/libc.so.6
 1# abort in /lib64/libc.so.6
 2# 0x00007FB9FB946C89 in /lib64/libc.so.6
 3# 0x00007FB9FB954A76 in /lib64/libc.so.6
 4# 0x00005595E98E6528 in ceph-osd
 5# 0x00005595E99BE8B0 in ceph-osd
 6# 0x00005595E99BF3CC in ceph-osd
 7# 0x00005595E99BFCE1 in ceph-osd
 8# 0x00005595E99C083E in ceph-osd
 9# 0x00005595E8293221 in ceph-osd
10# 0x00005595E82C0F32 in ceph-osd
11# 0x00005595E82C1BCE in ceph-osd
12# 0x00005595F4D13272 in ceph-osd
13# 0x00005595F4D5B51B in ceph-osd
14# 0x00005595F4EE591C in ceph-osd
15# 0x00005595F4EE78F1 in ceph-osd
16# 0x00005595F49977D3 in ceph-osd
17# 0x00005595F499C03C in ceph-osd
18# main in ceph-osd
19# __libc_start_main in /lib64/libc.so.6
20# _start in ceph-osd
```

The sequence of events provides at least two clues:
  - the op no. 32 finished before the op no. 29 which was waiting
    for `ObjectContext`,
  - the op no. 32 was a short-living one -- it wasn't waiting even
    on `obc`.

```
rzarzynski@teuthology:/home/teuthworker/archive/rzarzynski-2021-11-22_22:01:32-rados-master-distro-basic-smithi$ less ./6520106/remote/smithi115/log/ceph-osd.3.log.gz
...
DEBUG 2021-11-22 22:32:24,531 [shard 0] osd - client_request(id=29, detail=m=[osd_op(client.4371.0:36 4.d 4.f0fb5e1d (undecoded) ondisk+retry+read+rwordered+known_if_redirected+supports_pool_eio e23) v8]): start
DEBUG 2021-11-22 22:32:24,531 [shard 0] osd - client_request(id=29, detail=m=[osd_op(client.4371.0:36 4.d 4.f0fb5e1d (undecoded) ondisk+retry+read+rwordered+known_if_redirected+supports_pool_eio e23) v8]): in repeat
...
DEBUG 2021-11-22 22:32:24,546 [shard 0] osd - client_request(id=29, detail=m=[osd_op(client.4371.0:36 4.d 4.f0fb5e1d (undecoded) ondisk+retry+read+rwordered+known_if_redirected+supports_pool_eio e23) v8]) same_interval_since: 21
DEBUG 2021-11-22 22:32:24,546 [shard 0] osd - OpSequencer::start_op: op=29, last_started=27, last_unblocked=27, last_completed=27
...
DEBUG 2021-11-22 22:32:24,621 [shard 0] osd - client_request(id=32, detail=m=[osd_op(client.4371.0:49 4.d 4.81addbad (undecoded) ondisk+retry+write+known_if_redirected+supports_pool_eio e23) v8]): start
DEBUG 2021-11-22 22:32:24,621 [shard 0] osd - client_request(id=32, detail=m=[osd_op(client.4371.0:49 4.d 4.81addbad (undecoded) ondisk+retry+write+known_if_redirected+supports_pool_eio e23) v8]): in repeat
...
DEBUG 2021-11-22 22:32:24,626 [shard 0] osd - client_request(id=32, detail=m=[osd_op(client.4371.0:49 4.d 4.81addbad (undecoded) ondisk+retry+write+known_if_redirected+supports_pool_eio e23) v8]) same_interval_since: 21
DEBUG 2021-11-22 22:32:24,626 [shard 0] osd - OpSequencer::start_op: op=32, last_started=29, last_unblocked=29, last_completed=27
<note that op 32 is very short living>
DEBUG 2021-11-22 22:32:24,669 [shard 0] osd - OpSequencer::finish_op_in_order: op=32, last_started=32, last_unblocked=32, last_completed=27
...
DEBUG 2021-11-22 22:32:24,671 [shard 0] osd - client_request(id=32, detail=m=[osd_op(client.4371.0:49 4.d 4:b5dbb581:::smithi11538976-13:head {write 601684~619341 in=619341b, stat} snapc 0={} RETRY=1 ondisk+retry+write+known_if_redirected+supports_pool_eio e23) v8]): destroying
...
DEBUG 2021-11-22 22:32:24,722 [shard 0] osd - client_request(id=29, detail=m=[osd_op(client.4371.0:36 4.d 4:b87adf0f:::smithi11538976-9:head {read 0~1} snapc 0={} RETRY=1 ondisk+retry+read+rwordered+known_if_redirected+supports_pool_eio e23) v8]): got obc lock
...
INFO  2021-11-22 22:32:24,723 [shard 0] osd - client_request(id=29, detail=m=[osd_op(client.4371.0:36 4.d 4:b87adf0f:::smithi11538976-9:head {read 0~1} snapc 0={} RETRY=1 ondisk+retry+read+rwordered+known_if_redirected+supports_pool_eio e23) v8]) obc.get()=0x6190000d5780
...
DEBUG 2021-11-22 22:32:24,753 [shard 0] osd - OpSequencer::finish_op_in_order: op=29, last_started=32, last_unblocked=32, last_completed=32
ceph-osd: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-8898-ge57ad63c/rpm/el8/BUILD/ceph-17.0.0-8898-ge57ad63c/src/crimson/osd/osd_operation_sequencer.h:123: void crimson::osd::OpSequencer::finish_op_in_order(crimson::osd::ClientRequest&): Assertion `op.get_id() > last_completed_id' failed.
Aborting on shard 0.
```

This could be explained in a scenario where:
  - op no. 32 skipped stages of the execution pipeline while
  - it wrongly informed `OpSequencer` the execution was in-order.

Static analysis shows there are multiple problems of this genre
in the `ClientRequest::process_op()` and its callees with the most
recently merged one being the path for `PG::already_complete()`.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2021-12-07 09:36:24 +00:00
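
The invariant behind the failed assertion in the commit above (`op.get_id() > last_completed_id`) can be replayed with a toy Python model. It is only a sketch of that one check, using the ids from the log excerpt; the class below is a stand-in and does not reproduce the real `OpSequencer` interface.

```python
class OpSequencer:
    """Toy model of the in-order completion check from the assertion."""

    def __init__(self, last_completed):
        self.last_completed = last_completed

    def finish_op_in_order(self, op_id):
        # Mirrors the failed check: op.get_id() > last_completed_id
        if op_id <= self.last_completed:
            raise AssertionError(
                f"op {op_id} finished after op {self.last_completed}")
        self.last_completed = op_id


seq = OpSequencer(last_completed=27)
seq.finish_op_in_order(32)      # op 32 reports in-order completion first
try:
    seq.finish_op_in_order(29)  # op 29 completes later and trips the check
    failed = False
except AssertionError:
    failed = True
```

Once op 32 has been recorded as completed in order, any later in-order completion with a smaller id violates the check, which matches the final two log lines of the excerpt.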
Samuel Just
4fefd80043 crimson/os/seastore/lba_manager: initialize lba node pins using get_extent
Signed-off-by: Samuel Just <sjust@redhat.com>
2021-12-07 08:30:07 +00:00
Samuel Just
c32300258d crimson/os/seastore: initialize logical pins before exposing to cache
Otherwise, another task may get a reference to the extent before
we've set the pin.

Fixes: https://tracker.ceph.com/issues/53267
Signed-off-by: Samuel Just <sjust@redhat.com>
2021-12-07 07:13:59 +00:00
Samuel Just
347d7d0f26
Merge pull request #44231 from xxhdx1985126/wip-cpu-profile
crimson/os/seastore: fix compiler error for gcc > 9 and clang13

Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
2021-12-06 22:21:55 -08:00
Huber-ming
51cde60319 radosgw-admin: fix some error logs
Signed-off-by: Huber-ming <zhangsm01@inspur.com>
2021-12-07 13:59:47 +08:00