For multipart upload processing, the following method is applied:
MultipartUpload::Init - create the head object entry for the meta obj (src_obj_name + "." + upload_id)
[The meta object stores the upload info for all parts]
MultipartWriter::process - create all data/tail objects with the same obj_name as the
meta obj (so that they can all be identified and deleted during abort)
MultipartUpload::Abort - just delete the meta obj; that indirectly deletes all the uploads
associated with that upload id / meta obj so far.
MultipartUpload::Complete - create the head object of the original object (if it does not exist).
Rename the obj name of all data/tail object entries to the original object name and update the metadata of the original object.
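The bookkeeping above can be modeled roughly as follows. This is only an in-memory sketch: the `Store` class, its dict/list layout, and the `size` metadata field are hypothetical and do not mirror the actual backend schema. What happens to the meta obj entry after Complete is not stated above, so the sketch leaves it untouched.

```python
class Store:
    """Toy model: head entries keyed by obj name, data rows tagged with an obj name."""

    def __init__(self):
        self.heads = {}  # obj_name -> metadata dict
        self.data = []   # rows of [obj_name, part_num, payload]

    @staticmethod
    def meta_name(src_obj_name, upload_id):
        # Meta obj name as described above: src_obj_name + "." + upload_id
        return src_obj_name + "." + upload_id

    def init(self, src, upload_id):
        # Init: create the head entry for the meta obj.
        self.heads[self.meta_name(src, upload_id)] = {"parts": {}}

    def process(self, src, upload_id, part_num, payload):
        # process: data rows carry the meta obj name so abort can find them.
        meta = self.meta_name(src, upload_id)
        self.data.append([meta, part_num, payload])
        self.heads[meta]["parts"][part_num] = len(payload)

    def abort(self, src, upload_id):
        # Abort: dropping the meta obj also drops every row tagged with it.
        meta = self.meta_name(src, upload_id)
        del self.heads[meta]
        self.data = [row for row in self.data if row[0] != meta]

    def complete(self, src, upload_id):
        # Complete: ensure the original head exists, retag rows, update metadata.
        meta = self.meta_name(src, upload_id)
        head = self.heads.setdefault(src, {})
        for row in self.data:
            if row[0] == meta:
                row[0] = src
        head["size"] = sum(self.heads[meta]["parts"].values())
```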
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
This PR removes mentions of journaling from the hardware
recommendations.
Journaling was a FileStore-related practice. BlueStore is
the default backend for Ceph OSDs and has been since
Luminous. The documentation should reflect that.
Signed-off-by: Zac Dover <zac.dover@gmail.com>
mgr/cephadm: support bootstrap with non-root ssh-user
Reviewed-by: Michael Fritch <mfritch@suse.com>
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
We only use a handful of fields, and the pg dump includes a gazillion
fields that we waste CPU copying to python-land. This tends to lead to
long ClusterState::lock hold times, leading to long ms_dispatch delays
and generally gumming up the works.
Instead, create a new "pg_progress" item that dumps only the fields that
mgr/progress needs.
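As a rough illustration of the idea (not the actual implementation), the slimmed-down dump amounts to projecting each PG record onto a short list of fields. The field names below are hypothetical placeholders; the real saving comes from never building the full dump in the first place, which this sketch does not model.

```python
# Hypothetical field names, chosen only for illustration.
NEEDED_FIELDS = ("pgid", "state", "reported_epoch")


def project(pg_stats, fields=NEEDED_FIELDS):
    """Return a copy of each PG record that keeps only the requested fields."""
    return [{k: pg[k] for k in fields if k in pg} for pg in pg_stats]
```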
Fixes: https://tracker.ceph.com/issues/53475
Signed-off-by: Sage Weil <sage@newdream.net>
* refs/pull/44162/head:
mgr: only queue notify events that modules ask for
pybind/mgr: annotate which events modules consume
pybind/mgr: introduce NotifyType enum
mgr: stop issuing events that no modules consume
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
* refs/pull/44207/head:
mgr/ActivePyModule: avoid with_gil where possible
mgr/ActivePyModules: push without_gil_t down into blocks
Reviewed-by: Kefu Chai <kchai@redhat.com>
crimson/net: add support for ms_learn_addr_from_peer.
Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>
seastar: pick up change to fix FTBFS with old cryptopp
Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
When ceph.conf does not set a public IP and a cluster IP, the heartbeat messengers get blank IP addresses. In OSD::_send_boot, the classic OSD checks whether the heartbeat front and back addrs are blank and, if they are, sets them to the public IP learned from the mon. Implement the same behavior in the crimson OSD.
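A minimal sketch of the fallback described above, assuming "blank" means the unspecified address (`0.0.0.0` or `::`). The function names are hypothetical and the addresses are plain strings rather than messenger address types.

```python
import ipaddress


def is_blank_ip(addr):
    """True if addr is the unspecified address (0.0.0.0 or ::)."""
    return ipaddress.ip_address(addr).is_unspecified


def fill_heartbeat_addrs(front, back, public):
    """Replace blank heartbeat addrs with the public addr learned from the mon."""
    if is_blank_ip(front):
        front = public
    if is_blank_ip(back):
        back = public
    return front, back
```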
Signed-off-by: chunmei-liu <chunmei.liu@intel.com>
crimson/osd: don't assume a pull must happen if there is no push.
Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>
crimson/osd: clean the recovery message-related header inclusion.
Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
crimson/osd: fix assertion failure in InternalClientRequest.
Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>
The default value for `--crush-device-class` is `None`.
When this parameter is not passed, ceph-volume sets the
value "None" in the lv tags.
As a result, ceph-volume outputs that value when calling
`ceph-volume lvm list --format json`
For instance:
```
"1": [
{
"devices": [
"/dev/sdc"
],
"lv_name": "osd-data-5a4a34f5-5733-4c69-b439-edb48e31a45f",
"lv_path": "/dev/ceph-aeb16fc3-9ac2-4126-ab66-bf920d101ea4/osd-data-5a4a34f5-5733-4c69-b439-edb48e31a45f",
"lv_size": "49.00g",
"lv_tags": "ceph.block_device=/dev/ceph-aeb16fc3-9ac2-4126-ab66-bf920d101ea4/osd-data-5a4a34f5-5733-4c69-b439-edb48e31a45f,ceph.block_uuid=E9hZNU-80Zz-PiER-iWN3-jSIU-krEN-khwU3x,ceph.cephx_lockbox_secret=,ceph.cluster_fsid=40fe4af5-0408-444b-843c-0926d550d1f1,ceph.cluster_name=ceph,ceph.crush_device_class=None,ceph.encrypted=0,ceph.osd_fsid=39680838-19df-4e50-9bb6-46b093d5b52b,ceph.osd_id=1,ceph.type=block,ceph.vdo=0",
"lv_uuid": "E9hZNU-80Zz-PiER-iWN3-jSIU-krEN-khwU3x",
"name": "osd-data-5a4a34f5-5733-4c69-b439-edb48e31a45f",
"path": "/dev/ceph-aeb16fc3-9ac2-4126-ab66-bf920d101ea4/osd-data-5a4a34f5-5733-4c69-b439-edb48e31a45f",
"tags": {
"ceph.block_device": "/dev/ceph-aeb16fc3-9ac2-4126-ab66-bf920d101ea4/osd-data-5a4a34f5-5733-4c69-b439-edb48e31a45f",
"ceph.block_uuid": "E9hZNU-80Zz-PiER-iWN3-jSIU-krEN-khwU3x",
"ceph.cephx_lockbox_secret": "",
"ceph.cluster_fsid": "40fe4af5-0408-444b-843c-0926d550d1f1",
"ceph.cluster_name": "ceph",
"ceph.crush_device_class": "None",
```
ceph-volume should print `"ceph.crush_device_class": "",` instead of `"ceph.crush_device_class": "None",`
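One way to get that output is to normalize an unset option before it is written into the tags. This is a sketch only: `tag_value` and `build_tags` are hypothetical helpers, not ceph-volume functions, and the assumption that the literal "None" comes from stringifying Python's `None` is an inference.

```python
def tag_value(value):
    """Render an option value for an LV tag; an unset option becomes ''."""
    return "" if value is None else str(value)


def build_tags(crush_device_class=None):
    # Only the tag discussed above is modeled here.
    return {"ceph.crush_device_class": tag_value(crush_device_class)}
```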
Fixes: https://tracker.ceph.com/issues/53425
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Add the possibility to skip the `needs_root()` decorator.
See linked tracker for details.
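How the skip is triggered is not stated here. The sketch below assumes an environment-variable switch; the variable name `SKIP_NEEDS_ROOT` and the `NotRoot` exception are hypothetical and chosen for illustration only.

```python
import functools
import os

SKIP_ENV = "SKIP_NEEDS_ROOT"  # hypothetical variable name


class NotRoot(Exception):
    """Raised when a root-only function is called by a non-root user."""


def needs_root(func):
    """Refuse to run func unless we are root or the skip switch is set."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        skip = os.environ.get(SKIP_ENV) == "1"
        if not skip and os.getuid() != 0:
            raise NotRoot(func.__name__ + " requires root")
        return func(*args, **kwargs)
    return wrapper
```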
Fixes: https://tracker.ceph.com/issues/53511
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Otherwise, another task may get a reference to the extent before
we've set the pin.
Fixes: https://tracker.ceph.com/issues/53267
Signed-off-by: Samuel Just <sjust@redhat.com>
crimson/os/seastore: fix compiler error for gcc > 9 and clang13
Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
We were passing a grace of zero seconds to our temporary work queue, which
led to the HeartbeatMap issuing cpu_tp timeout errors to the log. By using
a non-zero grace period we can avoid these. Use the same default grace
we use for the workqueue itself when it goes to sleep.
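To see why a zero grace is a problem, consider a toy deadline check. This is a simplified model under assumed semantics (deadline = last touch + grace), not how HeartbeatMap is actually implemented; the `Heartbeat` class is hypothetical.

```python
class Heartbeat:
    """Toy liveness tracker: healthy while `now` is within the deadline."""

    def __init__(self, grace):
        self.grace = grace
        self.deadline = None  # no deadline until the worker checks in

    def touch(self, now):
        # The worker checks in; it has `grace` seconds until the next check-in.
        self.deadline = now + self.grace

    def is_healthy(self, now):
        return self.deadline is None or now <= self.deadline
```

With `grace=0` the deadline equals the check-in time, so in this model any later health check reports a timeout even though the worker is making progress.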
Fixes: https://tracker.ceph.com/issues/53506
Signed-off-by: Sage Weil <sage@newdream.net>
In the classical OSD, `ReplicatedBackend::recover_object()`
divides into two main flows, pull and push:
```cpp
int ReplicatedBackend::recover_object(
  const hobject_t &hoid,
  // ...
  )
{
  dout(10) << __func__ << ": " << hoid << dendl;
  RPGHandle *h = static_cast<RPGHandle *>(_h);
  if (get_parent()->get_local_missing().is_missing(hoid)) {
    ceph_assert(!obc);
    // pull
    prepare_pull(
      v,
      hoid,
      head,
      h);
  } else {
    ceph_assert(obc);
    int started = start_pushes(
      hoid,
      obc,
      h);
    // ...
  }
  return 0;
}
```
Pulls may also enter the push path (`C_ReplicatedBackend_OnPullComplete`),
but push handling makes no assumption about that. Importantly,
`recover_object()` may result in neither pulls nor pushes.
This isn't the case in crimson, as its implementation of the push path
asserts that, if no push is scheduled, `PullInfo` must be allocated.
This patch reworks this logic to reflect the classical one and to avoid
crashes like the following one:
```
DEBUG 2021-12-01 18:43:00,220 [shard 0] osd - recover_object: loaded obc: 3:4e058a2e:::smithi13839607-45:head
WARN 2021-12-01 18:43:00,220 [shard 0] none - intrusive_ptr_add_ref(p=0x6190000d7f80, use_count=3)
WARN 2021-12-01 18:43:00,220 [shard 0] none - intrusive_ptr_release(p=0x6190000d7f80, use_count=4)
TRACE 2021-12-01 18:43:00,220 [shard 0] osd - call_with_interruption_impl clearing interrupt_cond: 0x60300012b210,N7crimson3osd20IOInterruptConditionE
TRACE 2021-12-01 18:43:00,220 [shard 0] osd - call_with_interruption_impl: may_interrupt: false, local interrupt_condintion: 0x60300012b210, global interrupt_cond: 0x0,N7crimson3osd20IOInterruptConditionE
TRACE 2021-12-01 18:43:00,220 [shard 0] osd - set: interrupt_cond: 0x60300012b210, ref_count: 1
ceph-osd: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-8902-g52fd47fe/rpm/el8/BUILD/ceph-17.0.0-8902-g52fd47fe/src/crimson/osd/replicated_recovery_backend.cc:84: ReplicatedRecoveryBackend::maybe_push_shards(const hobject_t&, eversion_t)::<lambda()>: Assertion `recovery.pi' failed.
Aborting on shard 0.
```
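The difference between the two behaviors can be modeled in a few lines. This is a deliberately simplified sketch, not the crimson code: the function names and return values are hypothetical, and `pull_info` merely stands in for `recovery.pi` from the assertion above.

```python
def after_recover_strict(pushes, pull_info):
    """Old assumption: if nothing is pushed, a pull must be in flight."""
    if not pushes:
        assert pull_info is not None, "recovery.pi"
        return ["finish_pull"]
    return ["push:" + shard for shard in pushes]


def after_recover_relaxed(pushes, pull_info):
    """Reworked logic: neither a push nor a pull is required."""
    if pushes:
        return ["push:" + shard for shard in pushes]
    if pull_info is not None:
        return ["finish_pull"]
    return []  # nothing to push and nothing pulled: a valid outcome
```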
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Fixes: https://tracker.ceph.com/issues/53185
NCB mishandles fsck DEEP in the mount()/umount()/mkfs() case, causing it to remove the allocation-file without destaging a new copy (which costs us a full rebuild on startup).
There are also a few conflicting calls to open_db()/close_db() passing an inconsistent read-only flag.
We fix both issues by storing the open-db type (read-only/read-write) and using it for close-db (which no longer takes a read-only flag).
We also move the allocation-file destage to close-db so that it is refreshed after being removed by fsck and the like.
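The shape of the fix can be sketched as follows. This is an illustrative model only: the `Db` class and its `allocation_file_fresh` flag are hypothetical, and destaging only when the DB was opened read-write is an assumption of the sketch.

```python
class Db:
    """Toy model: the open mode is recorded once and reused on close."""

    def __init__(self):
        self.read_only = None  # None means "not open"
        self.allocation_file_fresh = False

    def open_db(self, read_only):
        self.read_only = read_only

    def close_db(self):
        # No read_only argument: rely on the mode stored by open_db().
        if self.read_only is None:
            raise RuntimeError("close_db() called on a DB that is not open")
        if not self.read_only:
            # Stand-in for destaging a new copy of the allocation file.
            self.allocation_file_fresh = True
        self.read_only = None
```

Because callers can no longer pass a flag to `close_db()`, an open/close pair cannot disagree about the mode.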
Signed-off-by: Gabriel Benhanokh <gbenhano@redhat.com>