143547 Commits

Author SHA1 Message Date
Leonid Usov
09e08ac6a4 doc/cephfs/fs-volumes: Add info about the quiesce command
Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
2024-03-04 13:48:03 +02:00
Leonid Usov
88fb668938 doc: fixes for local dev builds
Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
2024-03-04 13:48:03 +02:00
Leonid Usov
d151876d5b mgr/volumes: support for fs subvolume quiesce
Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
2024-03-04 13:48:03 +02:00
Leonid Usov
78afc61361 mgr/volumes: use volume_exception_to_retval as a decorator
When used as a decorator, it saves one indented try-catch block inside the decorated method.
This can be applied to most of the methods in the file, subject to a separate refactoring commit

Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
2024-03-04 13:48:03 +02:00
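The decorator pattern described in 78afc61361 can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: only the name `volume_exception_to_retval` comes from the commit message, while the `VolumeException` class, its fields, the `remove_volume` function, and the `(retcode, stdout, stderr)` return shape are assumptions made for the example.

```python
import errno
import functools


class VolumeException(Exception):
    """Hypothetical stand-in for the error type raised by volume operations."""

    def __init__(self, error_code, error_message):
        super().__init__(error_message)
        self.errno = error_code
        self.error_str = error_message


def volume_exception_to_retval(func):
    """Convert a raised VolumeException into a (retcode, stdout, stderr) tuple."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except VolumeException as ve:
            # Assumed convention: the error code is returned as-is,
            # stdout is empty, and stderr carries the message.
            return ve.errno, "", ve.error_str
    return wrapper


@volume_exception_to_retval
def remove_volume(name, known_volumes):
    # Hypothetical operation. No try/except block is needed in the body,
    # because the decorator handles the error path.
    if name not in known_volumes:
        raise VolumeException(-errno.ENOENT, f"volume '{name}' does not exist")
    known_volumes.remove(name)
    return 0, "", ""
```

With the error handling moved into the decorator, the body of each decorated method loses one level of indentation, which is the saving the commit message refers to.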
Leonid Usov
9907efd013 pybind/mgr: add a one-shot parameter to send_command
with the parameter set, the message won't be held on to when the remote end resets
or fails to reconnect.

Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
2024-03-04 13:48:03 +02:00
Leonid Usov
3de0882ad3 mds/quiesce: QuiesceAgent implementation and unit tests
QuiesceAgent is the layer that converts updates from the QuiesceDb
into calls to the QuiesceProtocol APIs, and then sends async acks
back to the db manager following the quiesce protocol events.

Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
2024-03-04 13:48:03 +02:00
Leonid Usov
0e61c44238 mds/quiesce: QuiesceDb.h and QuiesceDbManager with tests
Quiesce DB is one of the components of the "Consistent Snapshots" epic.
The solution is discussed in a slide deck available for viewing to @redhat users:
https://docs.google.com/presentation/d/1wE3-e9AAme7Q3qmeshUSthJoQGw7-fKTrtS9PsdAIVo/edit?usp=sharing

This commit focuses on the replicated quiesce database maintained by the MDS rank cluster.
One of the major goals was to design the component in a way that can be easily tested
outside of the MDS infrastructure, which is why the communication layer
has been abstracted out by introducing just two communication callbacks
that will need to be implemented by the infrastructure.

Most of the component code is delivered in a single coherent commit, along with the unit tests.
Other commits will be dedicated to integration with the MDS infrastructure and other changes
that can't be attributed to the core quiesce db code or its tests.

The quiesce db component is composed of the following major parts/actors:

* QuiesceDbManager is the main actor, implementing both the leader and the replica roles.
  Normally, there will be an instance of the manager per MDS rank, although given the
  decoupling of the infrastructure and the manager, one can run any number of instances
  on a single node, which is how the tests work.
* The manager has two main interfaces: one to the infrastructure that provides
  communication and cluster configuration (actor 2), and one to the quiesce db
  client that is responsible for the quiescing of the roots (actor 3)
** ClusterMembership is how the manager is configured to be part of a (virtual) cluster.
   This structure will deliver information about other peers, the leader and provide
   two communication APIs: send_listing_to for db replication from the leader to replicas
   and send_ack for reporting quiesce success from the agents.
** Client Interface consists of a QuiesceMap notify callback and a dedicated manager
   method to submit asynchronous acks following the agent (rank) quiesce progress.

The API of the quiesce db is described in the slide deck mentioned above. The full scope
of capabilities is encapsulated in a single QuiesceDbRequest structure. This should
simplify the implementation of other components that will have to propagate the functionality
to the administrator user of the volumes plugin.

Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
2024-03-04 13:48:03 +02:00
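The decoupling described in 0e61c44238, where the manager reaches its peers only through two communication callbacks, can be illustrated with a small sketch. The real component is C++; this Python rendering is an assumption-laden illustration. Only the names `ClusterMembership`, `send_listing_to`, and `send_ack` come from the commit message. The field names `me`, `leader`, and `members`, the dictionary payloads, the integer return codes, and the `make_in_memory_cluster` helper are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class ClusterMembership:
    """Hypothetical Python rendering of the membership structure."""
    me: int                                      # this instance's rank (assumed field)
    leader: int                                  # rank of the current leader (assumed field)
    members: List[int]                           # all ranks in the virtual cluster (assumed field)
    send_listing_to: Callable[[int, dict], int]  # leader -> replica db replication
    send_ack: Callable[[dict], int]              # agent -> leader quiesce ack


def make_in_memory_cluster(ranks: List[int]):
    """Wire several instances together on one node, with no real network."""
    inboxes: Dict[int, List[dict]] = {rank: [] for rank in ranks}
    acks: List[Tuple[int, dict]] = []
    leader = ranks[0]

    def membership_for(rank: int) -> ClusterMembership:
        def send_listing_to(to: int, listing: dict) -> int:
            inboxes[to].append(listing)  # deliver straight into the peer's inbox
            return 0

        def send_ack(ack: dict) -> int:
            acks.append((rank, ack))     # record which rank acknowledged
            return 0

        return ClusterMembership(rank, leader, list(ranks), send_listing_to, send_ack)

    return {rank: membership_for(rank) for rank in ranks}, inboxes, acks
```

Because the callbacks are plain functions, a test can run any number of instances in one process and inspect `inboxes` and `acks` directly, which mirrors the testing approach the commit message describes.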
Leonid Usov
c1c884212f common/Timer.cc: improve debug messages from the timer_thread
Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
2024-03-04 13:48:03 +02:00
Leonid Usov
60cd6d1171 mds: MDSRank.cc: return status from send_message_mds
Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
2024-03-04 13:48:03 +02:00
Leonid Usov
641279c4d0 encoding: add emplace variants for map dencoders
Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
2024-03-04 13:48:03 +02:00
Leonid Usov
71f1280505 common/Cond: make C_SaferCond private members protected to facilitate inheritance
Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
2024-03-04 13:48:03 +02:00
Leonid Usov
eff0a1d2ae qa/tasks/cephfs: give the tests more time to run heavy fs workloads
Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
2024-03-04 13:48:03 +02:00
Kefu Chai
1a1fd808ac
Merge pull request #55897 from Matan-B/wip-matanb-crimson-seastar-sub-march24
src/seastar: update seastar submodule to fix FTBFS

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Kefu Chai <tchaikov@gmail.com>
2024-03-04 14:17:16 +08:00
Venky Shankar
6d57bb50e2
Merge pull request #55659 from batrick/i64503
client: log debug message when requesting unmount

Reviewed-by: Venky Shankar <vshankar@redhat.com>
2024-03-04 11:30:22 +05:30
Matan Breizman
945b181954 src/seastar: update seastar submodule to fix FTBFS
See: d382f24762

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
2024-03-03 13:36:16 +00:00
zdover23
7280a2412d
Merge pull request #55899 from zdover23/wip-doc-2024-03-02-rados-radosgw-pgcalc
doc/rados: remove PGcalc from docs

Reviewed-by: Ronen Friedman <rfriedman@redhat.com>
2024-03-03 20:41:23 +10:00
Zac Dover
ccb851d2a4 doc/rados: remove PGcalc from docs
Remove mention of the "PG calc" tool from the documentation. I have
removed all mention of this in one fell swoop to help posterity restore
mention of this tool if we decide we need to do so.

Signed-off-by: Zac Dover <zac.dover@proton.me>
2024-03-03 20:28:00 +10:00
Ilya Dryomov
3e302abb81
Merge pull request #52540 from petrutlucian94/single_process
rbd-wnbd: use a single daemon process per host

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
2024-03-02 19:53:06 +01:00
Kefu Chai
91e8cea0d3
Merge pull request #55787 from tchaikov/wip-cmake-liburing-2.5
cmake: bump liburing from 0.7 to 2.5

Reviewed-by: Casey Bodley <cbodley@redhat.com>
2024-03-02 17:58:08 +08:00
Kefu Chai
95e03f8809 cmake: bump liburing from 0.7 to 2.5
this allows us to use newer liburing features. Seastar is using
some of them which are not provided by liburing 0.7.

in this change, `--use-libc` is passed to configure. otherwise
it does not link against libc, and the symbols like memset()
won't be available when compiling liburing.so with -fPIC using
clang, which does not pull libc in that case.

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
2024-03-02 12:16:14 +08:00
Samuel Just
3c0710024e
Merge pull request #55878 from athanatos/sjust/wip-seastar-module
crimson: update seastar submodule to fix prometheus build error

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2024-03-01 18:43:02 -08:00
Samuel Just
2806cdb151 crimson/.../interruptible_future: remove SEASTAR_CONCEPT guard
Seastar commit 8dc3398a removed this macro, no longer necessary.

Signed-off-by: Samuel Just <sjust@redhat.com>
2024-03-02 00:26:26 +00:00
Samuel Just
7f42b1d0b7 src/seastar: update seastar submodule to fix prometheus build failure
Fixes: https://tracker.ceph.com/issues/64589
Signed-off-by: Samuel Just <sjust@redhat.com>
2024-03-02 00:26:04 +00:00
Dan Mick
8c92912b7d
Merge pull request #55856 from dmick/wip-workflow-update
.github/workflows/create-backport-trackers.yml: update actions
2024-03-01 15:10:48 -08:00
zdover23
81bdde91f1
Merge pull request #55869 from zdover23/wip-doc-2024-03-01-install-manual-radosgw
doc/install: add manual RADOSGW install procedure

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
2024-03-02 09:02:35 +10:00
Dan Mick
4a1d6122eb .github/workflows/create-backport-trackers.yml: update versions of actions
Getting warning about node16 being deprecated.  The workflow doesn't use node
directly, but through the external actions.  Moving to node20 requires
changing setup-python version; Bhacaz/checkout-files is deprecated and
recommends actions/checkout.

Signed-off-by: Dan Mick <dmick@redhat.com>
2024-03-01 13:26:36 -08:00
Zac Dover
565bc95038 doc/install: add manual RADOSGW install procedure
Add a manual RADOSGW installation procedure to
doc/install/manual-deployment.rst. This procedure was developed by Janne
Johansson and reported to the ceph-users mailing list on 29 Jan 2024
here: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/LB3YRIKAPOHXYCW7MKLVUJPYWYRQVARU/

Co-authored-by: Janne Johansson <icepic.dz@gmail.com>
Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
2024-03-02 07:25:08 +10:00
Ilya Dryomov
28fe52ab8c
Merge pull request #55797 from ajarr/wip-64574
qa: add diff-continuous and compare-mirror-image tests to rbd and krbd suites respectively

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
2024-03-01 18:56:26 +01:00
Lucian Petrut
a14003c492 rbd-wnbd: use the right AdminSocket instance
The rbd-wnbd daemon currently caches one rados context per cluster.
However, it's registering hooks against the global context
admin socket, which won't be available. For this reason,
the "rbd-wnbd stats" command no longer works.

To address this issue, we'll ensure that rbd-wnbd sets command hooks
against the right admin socket instance, leveraging the image
context.

Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>
2024-03-01 17:39:00 +00:00
Lucian Petrut
83d58ab307 rbd-wnbd: adjust admin socket hook to accept image path
For each rbd-wnbd mapping we set an admin socket hook that can
be used to retrieve IO stats.

Now that the same daemon is reused for multiple mappings, we need
to distinguish the images when receiving a "stats" request.

For this reason, we'll add the image identifier to "wnbd stats"
admin socket commands.

Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>
2024-03-01 17:39:00 +00:00
Lucian Petrut
0d73d31b6f qa: update rbd-wnbd test, retrying image rm operations
The "rbd-wnbd unmap" command is currently telling the WNBD driver
to remove the mapping without contacting the rbd-wnbd daemon
and waiting for it to perform its cleanup.

For this reason, attempting to delete the image immediately after
unmapping it can fail due to existing watchers.

As a temporary solution, we'll retry the image remove operation.
At a later time, we'll update the "rbd-wnbd unmap" command to go
through the rbd-wnbd daemon, ensuring that all the necessary
cleanup is performed before returning.

While at it, we're dropping a redundant LOG.error call so that we
won't print expected exceptions.

Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>
2024-03-01 17:39:00 +00:00
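The temporary workaround described in 0d73d31b6f, retrying the image remove operation while watchers are still present, can be sketched as a generic retry helper. This is not the test code from the commit: the `ImageBusyError` exception, the `retry_on_busy` helper, and its parameters are hypothetical names chosen for illustration.

```python
import time


class ImageBusyError(Exception):
    """Hypothetical error raised while an image still has watchers."""


def retry_on_busy(operation, attempts=5, delay=1.0, sleep=time.sleep):
    """Call operation() until it succeeds or the attempts are exhausted."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except ImageBusyError:
            if attempt == attempts:
                raise  # out of retries: surface the last failure
            sleep(delay)  # give the daemon time to finish its cleanup
```

Passing `sleep` as a parameter keeps the helper testable without real delays. As the commit message notes, retrying is a stopgap until the unmap command waits for the daemon's cleanup before returning.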
Lucian Petrut
5f9c69d53c rbd-wnbd: update registry settings handling
This commit will store the mapping config in the Windows registry
only after initializing the mapping. This ensures that we aren't
replacing the registry settings for already mapped images.

We'll also check if the registry setting was added by us before
cleaning it up.

Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>
2024-03-01 17:39:00 +00:00
Lucian Petrut
661c55002d rbd-wnbd: use one daemon process per host
We're currently using one rbd-wnbd process per image mapping.
Since OSD connections aren't shared across those processes,
we end up with an excessive amount of TCP sessions, potentially
exceeding Windows limits:
https://ask.cloudbase.it/question/3598/ceph-for-windows-tcp-session-count/

In order to improve rbd-wnbd's scalability, we're going to use
a single process per host (unless "-f" is passed when mapping the
image, in which case the daemon will run as part of the same
process). This allows OSD sessions to be shared across image
mappings.

Another advantage is that the "ceph-rbd" service starts faster,
especially when having a large number of image mappings.

Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>
2024-03-01 17:38:53 +00:00
Lucian Petrut
96e8850ff3 rbd-wnbd: introduce RbdMapping class
We're moving most of the WNBD mapping handling to a separate
class called RbdMapping. This simplifies cleanup and makes it
easier to reuse.

The WnbdHandler class covers WNBD specific operations and IO
callbacks while the RbdMapping wrapper will take care of RBD
operations.

A subsequent change will make use of it while switching from
one process per mapping to a single process per host.

While at it, we're also moving the rbd-wnbd config helpers
to separate files.

Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>
2024-03-01 17:35:28 +00:00
Casey Bodley
9b38ed6721
Merge pull request #54767 from climb-mountain123/worm_multipart
src/rgw: fix for the multipart interface in the WORM function

Reviewed-by: Casey Bodley <cbodley@redhat.com>
2024-03-01 17:19:58 +00:00
Adam King
a5b9d64c56
Merge pull request #55719 from phlogistonjohn/jjm-teuth-tasks-cephadm-jt
qa/tasks/cephadm: add generic templating where subst_vip was used

Reviewed-by: Adam King <adking@redhat.com>
2024-03-01 11:54:14 -05:00
Casey Bodley
1576c1f5b9 rgw: remove unused object lock stuff from CompleteMultipart
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2024-03-01 10:15:27 -05:00
Ronen Friedman
0389bfe3f4
Merge pull request #55817 from rkhudov/src-test-common-test_hobject-remove-constexpr
src/test/common/test_hobject: remove constexpr

Reviewed-by: Matan Breizman <mbreizma@redhat.com>
Reviewed-by: Ronen Friedman <rfriedma@redhat.com>
2024-03-01 16:54:37 +02:00
Casey Bodley
4fa6af2b92
Merge pull request #55727 from cbodley/wip-64549
rgw/auth: do_aws4_auth_completion() catches exceptions

Reviewed-by: Matt Benjamin <mbenjamin@redhat.com>
2024-03-01 13:28:41 +00:00
Yingxin
93ace0d66e
Merge pull request #55855 from xxhdx1985126/wip-seastore-interface
crimson/os/seastore: adjust SeaStore::_omap_set_kvs() params

Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
2024-03-01 13:48:10 +08:00
Xuehan Xu
9852f4d97e crimson/os/seastore: adjust SeaStore::_omap_set_kvs() params
Signed-off-by: Xuehan Xu <xuxuehan@qianxin.com>
2024-03-01 10:32:32 +08:00
zdover23
f488c01fa8
Merge pull request #55834 from zdover23/wip-doc-2024-02-29-dev-internals
doc/dev: edit internals.rst

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
2024-03-01 10:13:46 +10:00
zdover23
fa66a6672b
Merge pull request #55835 from zdover23/wip-doc-2024-02-29-glossary-mds
doc/glossary: improve "MDS" entry

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
2024-03-01 09:32:07 +10:00
Zac Dover
2c6983d8b4 doc/glossary: improve "MDS" entry
Improve the entry for "MDS" in doc/glossary.rst by linking to the
"ceph-mds" man page and mentioning the relationship between clients and
MDS (or MDSes).

Signed-off-by: Zac Dover <zac.dover@proton.me>
2024-03-01 08:08:25 +10:00
Ramana Raja
92b254138d qa/suites: add diff-continuous and compare-mirror-image tests
... to rbd and krbd suites respectively.

This allows the compare-mirror-image tests introduced in ea3a567
to be run against various kernel branches, e.g., testing branch.
And allows diff_continuous test in rbd_suite to run against distro
kernel.

Fixes: https://tracker.ceph.com/issues/64574
Signed-off-by: Ramana Raja <rraja@redhat.com>
2024-02-29 12:12:19 -05:00
Ramana Raja
af43f61624 qa/suites/rbd: rename nbd folder to device folder
Signed-off-by: Ramana Raja <rraja@redhat.com>
2024-02-29 11:55:08 -05:00
Matt Benjamin
97445bb6ea rgw: don't overwrite target attrs checking mpu info
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
2024-02-29 11:39:38 -05:00
daijufang
3a65a0ad14 src/rgw: fix for the multipart interface in the WORM function
1. Save the WORM configuration information in the initialization chunk information for use when merging chunks.
2. Support x-amz-bypass-governance-retention when merging chunks.

Fixes: https://tracker.ceph.com/issues/63724

Signed-off-by: daijufang <daijufang_yewu@cmss.chinamobile.com>
2024-02-29 11:39:35 -05:00
Igor Fedotov
2996d1320f
Merge pull request #55594 from ifed01/wip-ifed-fix-64443
test/store_test: fix DeferredWrite test when prefer_deferred_size=0

Reviewed-by: Adam Kupczyk <akupczyk@ibm.com>
2024-02-29 19:07:10 +03:00
John Mulligan
4f1f09531a qa/tasks: replace uses of subst_vip with new templating function
Signed-off-by: John Mulligan <jmulligan@redhat.com>
2024-02-29 10:00:29 -05:00