128634 Commits

Author SHA1 Message Date
Igor Fedotov
96f0efe6d5 os/bluestore: avoid premature onode release.
This was observed when an onode's removal is followed by a read,
and the read causes the object to be released before the removal is finalized.
The root cause is an improper 'pinned' state assessment in Onode::get().

In more detail:
At some point Onode::get() might see nref == 2 and pinned == true,
which means a parallel, incomplete put() is running on the onode: the ref count
has been decremented but the pinned state is not yet updated (and the lock has
not even been acquired yet).
This can result in two puts racing over the same onode with nref == 2,
which in turn leads to a premature onode release:
  // nref =3, pinned = 1
  // Thread 1                   Thread 2
  //   o->put()                   o->get()
  //   --nref(n = 2, pinned=1)
  //                              nref++ (n=3, pinned = 1)
  //                              return
  //                              ...
  //                              o->put()
  //                              --nref(n = 2)
  //                              pinned = 0,
  //                              --nref(n = 1)
  //                              ocs->_unpin_and_rm(o) -> o->put()
  //                                ...
  //                                --nref(n = 0)
  //                                release o
  //  o->c->get_onode_cache()
  //  FAULT!
  //
The suggested fix is to introduce an additional atomic counter that tracks
running put() calls, and to permit onode release only when the regular
nref and put_nref are both zero.

Fixes: https://tracker.ceph.com/issues/53002
Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
2021-12-14 17:54:23 +03:00
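The two-counter release condition described in 96f0efe6d5 above can be sketched roughly as follows. This is a minimal illustration and not the BlueStore code: `Node`, its `destroyed` flag, and the constructor are hypothetical, and the pinning and cache-shard logic is reduced to a comment. Only the counter names `nref` and `put_nref` come from the commit message.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for an onode; not the BlueStore class.
struct Node {
  std::atomic<int> nref{0};      // regular reference count
  std::atomic<int> put_nref{0};  // number of put() calls currently running
  bool* destroyed;               // set by the destructor so callers can observe release

  explicit Node(bool* flag) : destroyed(flag) {}
  ~Node() { *destroyed = true; }

  void get() { ++nref; }

  void put() {
    ++put_nref;  // announce that a put() is in flight before dropping the reference
    --nref;
    // ... work that still touches *this would go here (e.g. cache unpinning) ...
    int pn = --put_nref;
    assert(pn >= 0);
    // Release only when no references remain and no other put() is still running.
    if (nref == 0 && pn == 0) {
      delete this;
    }
  }
};
```

With two references taken, the first put() leaves the object alive and the second one releases it. The sketch does not try to show that the scheme is safe under every interleaving; it only shows where the extra counter sits relative to the reference decrement.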
Casey Bodley
8ef7837979
Merge pull request #44029 from cbodley/wip-rgw-beast-header-limit
rgw/beast: add max_header_size option with 16k default, up from 4k

Reviewed-by: Mark Kogan <mkogan@redhat.com>
2021-12-14 08:21:05 -05:00
Sebastian Wagner
d732a51df3
cephadm: make extract_uid_gid errors more readable
Avoid dumping a traceback

Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
2021-12-14 14:13:15 +01:00
Foad Lind
5077eef378 doc/cephadm/upgrade: correct example command
Update the ceph version used in the example upgrade command to match the one mentioned in the text above it.

Signed-off-by: Foad Lind <foad.lind@citynetwork.eu>
2021-12-14 14:01:58 +01:00
myoungwon oh
60b7690262 tools/ceph_dedup_tool: add explanations for the two added commands
Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>
2021-12-14 18:52:24 +09:00
Pere Diaz Bou
2286ddc1c2 monitoring/grafana: rename tox promql test
Signed-off-by: Pere Diaz Bou <pdiazbou@redhat.com>
2021-12-14 09:36:23 +01:00
Pere Diaz Bou
5ebdb746e8 monitoring/grafana: improve grafana unit tests variable substitution
Signed-off-by: Pere Diaz Bou <pdiazbou@redhat.com>
2021-12-14 09:36:23 +01:00
Samuel Just
686398f742
Merge pull request #44235 from xxhdx1985126/wip-onode-omap-hint-optimization
crimson/os/seastore: avoid onode/omap laddr hint conflicts as much as possible

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>
2021-12-14 00:10:31 -08:00
Xuehan Xu
d2235ba3b9 crimson/os/seastore: make onode data/metadata laddr space reservation configurable
Signed-off-by: Xuehan Xu <xxhdx1985126@gmail.com>
2021-12-14 14:55:26 +08:00
Samuel Just
53d8f0855c crimson/os/seastore: randomize metadata laddr hints
This should prevent omap and xattr extent allocations from clumping near
the onode's hint.  Additionally, only generate them past the default
16MB object_data_handler reservation.

Signed-off-by: Samuel Just <sjust@redhat.com>
2021-12-14 14:55:26 +08:00
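The idea in 53d8f0855c above, placing metadata hints at a random offset beyond the object data reservation, might look something like the sketch below. The 16 MiB figure is taken from the commit message; the function name `metadata_hint`, the window size `kMetadataWindow`, and the use of `std::mt19937_64` are assumptions made for illustration and are not taken from seastore.

```cpp
#include <cstdint>
#include <random>

// Default object data reservation mentioned in the commit message (16 MiB).
constexpr std::uint64_t kDataReservation = 16ull << 20;
// Hypothetical size of the range in which metadata hints are scattered.
constexpr std::uint64_t kMetadataWindow = 1ull << 30;

// Returns a hint that lies past the data reservation of the given onode hint,
// at a uniformly random offset so that omap and xattr extents do not clump.
inline std::uint64_t metadata_hint(std::uint64_t onode_hint,
                                   std::mt19937_64& rng) {
  std::uniform_int_distribution<std::uint64_t> offset(0, kMetadataWindow - 1);
  return onode_hint + kDataReservation + offset(rng);
}
```

Every hint produced this way falls in the half-open range starting at `onode_hint + kDataReservation` and spanning `kMetadataWindow` bytes, so it never lands inside the space reserved for object data.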
Xuehan Xu
4b27d0a6e6 crimson/common: do not call crimson::get_logger() if NDEBUG is defined
Avoid debug-related performance degradation.

Signed-off-by: Xuehan Xu <xxhdx1985126@gmail.com>
2021-12-14 14:55:21 +08:00
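One common way to get the effect described in 4b27d0a6e6 above is to hide the logger lookup behind a macro that expands to nothing when NDEBUG is defined. The sketch below assumes that approach; `Logger`, `get_logger()`, `logger_lookups()`, `SKETCH_DEBUG`, and `hot_path()` are all hypothetical names, and the commit itself may be structured differently.

```cpp
#include <string>
#include <vector>

// Hypothetical logger; stands in for whatever crimson::get_logger() returns.
struct Logger {
  std::vector<std::string> lines;
  void trace(const std::string& msg) { lines.push_back(msg); }
};

// Counts logger lookups so the cost being avoided is observable.
inline int& logger_lookups() {
  static int n = 0;
  return n;
}

inline Logger& get_logger() {
  ++logger_lookups();
  static Logger instance;
  return instance;
}

#ifdef NDEBUG
// Release builds: the macro expands to a no-op, so get_logger() is never called.
#define SKETCH_DEBUG(msg) ((void)0)
#else
// Debug builds: look up the logger and emit the message.
#define SKETCH_DEBUG(msg) get_logger().trace(msg)
#endif

inline int hot_path(int x) {
  SKETCH_DEBUG("hot_path called");
  return x * 2;
}
```

Because the macro argument is dropped entirely in release builds, any expression passed to it is not evaluated there either, which is usually the point of this pattern on hot paths.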
Samuel Just
ec7aac4e55
Merge pull request #44141 from xxhdx1985126/wip-53409
crimson/os/seastore/segment_cleaner: correct available space calculation

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>
2021-12-13 22:15:36 -08:00
Samuel Just
b275acde3e
Merge pull request #44290 from liu-chunmei/crimson-fix-no-pg
crimson/osd: fix interruptor assert when no pg in peering_event

Reviewed-by: Samuel Just <sjust@redhat.com>
2021-12-13 22:08:10 -08:00
myoungwon oh
16e7d5578c qa: add object-dedup test
Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>
2021-12-14 13:49:45 +09:00
myoungwon oh
daa27d6526 tool/ceph-dedup-tool: add object-dedup command
Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>
2021-12-14 12:27:43 +09:00
myoungwon oh
e7e875c547 qa: add chunk-dedup test
Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>
2021-12-14 12:27:43 +09:00
myoungwon oh
ee45f46b3f tool/ceph-dedup-tool: add chunk-dedup command
From the perspective of a user who wants to use deduplication,
it is hard to know how to use the dedup feature.
Providing a chunk-dedup command should make deduplication
easier to use.

Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>
2021-12-14 12:27:43 +09:00
Xuehan Xu
6d142533ae crimson/os/seastore/segment_cleaner: correct available space calculation
The current available space calculation is wrong: it counts only the space
occupied by extents; deltas and other data are not taken into account.

Fixes: https://tracker.ceph.com/issues/53409
Signed-off-by: Xuehan Xu <xxhdx1985126@gmail.com>
2021-12-14 10:30:59 +08:00
Xuehan Xu
7ff2ecf84e crimson/common: redirect interruptible future debug output to default subsys
Signed-off-by: Xuehan Xu <xxhdx1985126@gmail.com>
2021-12-14 10:30:59 +08:00
Xuehan Xu
d7f1394f61 crimson/os/seastore/segment_cleaner: add perf metrics for better monitoring
Signed-off-by: Xuehan Xu <xxhdx1985126@gmail.com>
2021-12-14 10:30:59 +08:00
Yuri Weinstein
c3ee11ec12
Merge pull request #44015 from liewegas/fix-44012
osd/PeeringState: separate history's pruub from pg's

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
2021-12-13 15:20:51 -08:00
Yuri Weinstein
6adb612eff
Merge pull request #43864 from yaarith/fix-config-notify
mgr/telemetry: fix waiting for mgr to warm up

Reviewed-by: Sage Weil <sage@redhat.com>
2021-12-13 15:20:11 -08:00
Yuri Weinstein
6d5c4e0292
Merge pull request #43857 from aclamk/wip-aclamk-omap-clone-assert
os/bluestore: Protect _clone against sudden omap format changes

Reviewed-by: Igor Fedotov <ifedotov@suse.com>
2021-12-13 15:19:14 -08:00
Neha Ojha
3ff9478953 doc/releases/pacific.rst: add core updates for 16.2.7
16.2.7 fixes https://tracker.ceph.com/issues/53062, so remove the
"big scary warning" from the top of the pacific release page. We continue
to warn about this bug under the 16.2.6 section and in
https://docs.ceph.com/en/latest/releases/pacific/#upgrading-from-octopus-or-nautilus.

Signed-off-by: Neha Ojha <nojha@redhat.com>
2021-12-13 17:45:58 -05:00
Neha Ojha
ad52d9392e doc/releases/index.rst: change ref to 16.2.7
Signed-off-by: Neha Ojha <nojha@redhat.com>
2021-12-13 17:45:58 -05:00
Yuri Weinstein
5831d3c874 doc: 16.2.7 change log => 3 PRs added
Signed-off-by: Yuri Weinstein <yweinste@redhat.com>
2021-12-13 17:45:58 -05:00
Ernesto Puerta
4c33cb0d6a doc: 16.2.7 Release Notes (dashboard)
Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
2021-12-13 17:45:58 -05:00
Yuri Weinstein
9644d8ebea doc: 16.2.7 Release Notes
Signed-off-by: Yuri Weinstein <yweinste@redhat.com>
2021-12-13 17:45:58 -05:00
Christopher Hoffman
6c51b1fc9b doc/rados/operations: Updated rados docs to include
changes to the MANY_OBJECTS_PER_PG health-check
warning when the autoscaler is on.

Signed-off-by: Christopher Hoffman <choffman@redhat.com>
2021-12-13 22:02:10 +00:00
Neha Ojha
460c29a736
Merge pull request #44298 from adamemerson/wip-leveldb-release-note
doc: Add PendingReleaseNote for LevelDB removal

Reviewed-by: Neha Ojha <nojha@redhat.com>
2021-12-13 13:15:59 -08:00
Christopher Hoffman
8d846c3003 mon: Omit MANY_OBJECTS_PER_PG warning when autoscaler is on
Add a conditional statement that, when the autoscaler is
set to ON, omits the message about a pool having
many more objects per PG than the cluster average.

Fixes: https://tracker.ceph.com/issues/53516

Signed-off-by: Christopher Hoffman <choffman@redhat.com>
2021-12-13 21:07:18 +00:00
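The condition described in 8d846c3003 above can be illustrated with a small predicate. This is a sketch under stated assumptions: the function name, its parameters, and the skew-factor comparison are hypothetical and are not copied from the monitor code; only the rule "no warning when the autoscaler is on" comes from the commit message.

```cpp
// Hypothetical helper: decides whether a MANY_OBJECTS_PER_PG-style warning
// should be raised for one pool.
inline bool warn_many_objects_per_pg(double pool_objects_per_pg,
                                     double cluster_avg_objects_per_pg,
                                     double max_skew,
                                     bool autoscaler_on) {
  // With the autoscaler on, the warning is omitted regardless of the skew.
  if (autoscaler_on) {
    return false;
  }
  // Guard against meaningless inputs (assumption: treat them as "no warning").
  if (cluster_avg_objects_per_pg <= 0.0 || max_skew <= 0.0) {
    return false;
  }
  // Warn when the pool holds many more objects per PG than the cluster average.
  return pool_objects_per_pg > cluster_avg_objects_per_pg * max_skew;
}
```

For a pool with 1000 objects per PG against a cluster average of 10 and a skew factor of 10, the predicate is true with the autoscaler off and false with it on.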
Adam King
79d596a07e mgr/cephadm: agent: log response from mgr
Signed-off-by: Adam King <adking@redhat.com>
2021-12-13 15:38:49 -05:00
Kaleb S. KEITHLEY
2414c7584e rgw: cleanup/refactor JSON and XML encoders and decoders
Move the encoder and decoder methods into their associated class
files to eliminate undefined references to the class vtable.

https://tracker.ceph.com/issues/53596

Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
2021-12-13 14:33:52 -05:00
Adam C. Emerson
e2406c1777 doc: Add PendingReleaseNote for LevelDB removal
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
2021-12-13 14:30:41 -05:00
Ernesto Puerta
d10b0b7e72
mgr/dashboard: disable Promql test in ARM
Temporarily disable this test while debugging the issue (since https://github.com/ceph/ceph/pull/43669
originally passed the ARM check).

Fixes: https://tracker.ceph.com/issues/53451
Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
2021-12-13 20:20:44 +01:00
Casey Bodley
44d706c353
Merge pull request #44009 from cbodley/wip-qa-cls-rgw-gc
qa/rgw: run ceph_test_cls_rgw_gc in rgw/verify suite

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
2021-12-13 11:18:49 -05:00
Christopher Hoffman
d025df0149 mailmap: Add Christopher Hoffman
Signed-off-by: Christopher Hoffman <choffman@redhat.com>
2021-12-13 15:00:27 +00:00
Sebastian Wagner
0c021eb69b
Merge pull request #42905 from sebastian-philipp/service_spec_no_redundant_placement
python-common: improve OSD spec error messages

Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Michael Fritch <mfritch@suse.com>
2021-12-13 12:11:09 +01:00
Guillaume Abrioux
b6f0e1969f
Merge pull request #44218 from guits/guits-issue-44356
ceph-volume: fix error 'KeyError' with inventory
2021-12-13 09:47:01 +01:00
Guillaume Abrioux
0a67a3347f
Merge pull request #44219 from guits/guits-issue-53425
ceph-volume: fix tags dict output in `lvm list`
2021-12-13 09:46:44 +01:00
Samuel Just
0200b02155
Merge pull request #44281 from athanatos/sjust/wip-53555
crimson/os/seastore: index lba pins atomically with addition to cache

Reviewed-by: Xuehan Xu <xuxuehan@360.cn>
Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
2021-12-13 00:43:34 -08:00
chunmei-liu
a8279ed7c8 crimson/osd: fix interruptor assert when no pg in peering_event
When no PG has been created, the interruptor can't be used.

Signed-off-by: chunmei-liu <chunmei.liu@intel.com>
2021-12-12 21:09:09 -08:00
Tim Serong
eca838dfab ceph.spec.in: fix mgr-cephadm CherryPy requirement for SUSE builds
Commit 78983ad0d0 added cherrypy to ceph-mgr-cephadm's Requires,
but this needs to be split out into distro-specific sections due
to subtle/irritating naming differences.

Fixes: 78983ad0d0
Signed-off-by: Tim Serong <tserong@suse.com>
2021-12-13 15:50:45 +11:00
Yuri Weinstein
8c4e9d665b
Merge pull request #44225 from liewegas/fix-53506
osd/OSDMapMapping: fix spurious threadpool timeout errors

Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
2021-12-10 15:35:42 -08:00
Yuri Weinstein
b2e20eb068
Merge pull request #44025 from ljflores/wip-remove-aggregated-perf-data
mgr/telemetry: remove aggregated perf metrics from the perf channel

Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Yaarit Hatuka <yaarit@redhat.com>
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
2021-12-10 15:35:09 -08:00
Yuri Weinstein
ebb64e4a3d
Merge pull request #43612 from adamemerson/wip-unleveling
build: Remove LevelDB support

Reviewed-by: Kefu Chai <kchai@redhat.com>
2021-12-10 15:33:05 -08:00
Samuel Just
62f3cf1a3d crimson/os/seastore/cache: init extents prior to read
This should ensure that any captured members of extent_init_func are
still valid at the cost of not being able to access the contents of the
extent at invocation time.  With this, we should be able to rely on any
logical extents/lba extents in the cache having validly initialized lba
pins.

Fixes: https://tracker.ceph.com/issues/53555
Signed-off-by: Samuel Just <sjust@redhat.com>
2021-12-10 14:57:32 -08:00
Samuel Just
96390d5f9e crimson/os/seastore/.../lba_btree: update get_*_node to add_pin without reading node contents
This will allow us to do add_pin before we perform the actual extent read.

Signed-off-by: Samuel Just <sjust@redhat.com>
2021-12-10 14:56:15 -08:00
Samuel Just
43f347ecec crimson/os/seastore: pass depth/begin/end to get_*_node
We'll need this to populate the pin fields prior to read.

Signed-off-by: Samuel Just <sjust@redhat.com>
2021-12-10 14:56:15 -08:00
Samuel Just
c2d2dd7e70 crimson/os/seastore/transaction_manager: clarify that init lambda only runs on new extents
Signed-off-by: Samuel Just <sjust@redhat.com>
2021-12-10 14:56:15 -08:00