Commit Graph

149089 Commits

Author SHA1 Message Date
Samuel Just
7ac64b0b24 crimson: OpsExecuter no longer needs to be a lw shared ptr
ClientRequest and InternalClientRequest can declare them
as auto variables.

Signed-off-by: Samuel Just <sjust@redhat.com>
2024-10-14 20:37:26 -07:00
Samuel Just
8f3ac965c3 crimson: remove now unused PG::do_osd_ops* and log_reply
Signed-off-by: Samuel Just <sjust@redhat.com>
2024-10-14 20:37:26 -07:00
Samuel Just
a0efff116c crimson: clarify ops_executer.h comment
Signed-off-by: Samuel Just <sjust@redhat.com>
2024-10-14 20:37:26 -07:00
Samuel Just
c091f3b2ab crimson: convert InternalClientRequest::do_request to use *_executer rather than do_osd_ops*
Signed-off-by: Samuel Just <sjust@redhat.com>
2024-10-14 20:37:26 -07:00
Samuel Just
fc41fcb9d2 crimson: factor out InternalClientRequest::do_process
Signed-off-by: Samuel Just <sjust@redhat.com>
2024-10-14 20:37:26 -07:00
Samuel Just
304e20e9bc crimson: switch ClientRequest::do_request to use *_executer rather than do_osd_ops
Signed-off-by: Samuel Just <sjust@redhat.com>
2024-10-14 20:37:26 -07:00
Samuel Just
00057b45f0 crimson: introduce PG::run_executer,submit_executer
These are intended to replace do_osd_ops*.  The implementation
is simpler and does not involve passing success and failure
callbacks.  It also moves responsibility for dealing with
the MOSDOpReply and client related error handling over to
ClientRequest.

do_osd_ops* will be removed once users are switched over.

Signed-off-by: Samuel Just <sjust@redhat.com>
2024-10-14 20:37:26 -07:00
Samuel Just
7a826eb86c crimson: PG::submit_error_log returns eversion_t rather than optional
It seems like the motivation here was to allow do_osd_ops_execute to
communicate that it didn't submit an error log by making
maybe_submit_error_log a std::optional<eversion_t>.  However,
submit_error_log itself always returns a version.  Fix submit_error_log
and compensate in do_osd_ops_execute.

Signed-off-by: Samuel Just <sjust@redhat.com>
2024-10-14 20:37:26 -07:00
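The shape of the change described in 7a826eb86c can be sketched as below. This is a minimal illustration only, not the Crimson code: `version_sketch`, `submit_error_log_sketch`, `maybe_submit_error_log_sketch` and the `op_failed` flag are hypothetical names standing in for `eversion_t`, `submit_error_log` and the caller in `do_osd_ops_execute`. The idea is that a function which always yields a version returns it directly, and the caller that may skip the call owns the `std::optional`.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for Ceph's eversion_t (epoch plus version counter).
struct version_sketch {
  unsigned epoch = 0;
  unsigned long version = 0;
};

// The callee always produces a version, so it returns one directly
// instead of wrapping it in std::optional.
version_sketch submit_error_log_sketch(unsigned epoch, unsigned long next) {
  return version_sketch{epoch, next};
}

// Only the caller knows whether an error log was submitted at all, so the
// optional lives here. `op_failed` is an illustrative condition, not a
// parameter of the real code.
std::optional<version_sketch> maybe_submit_error_log_sketch(
    bool op_failed, unsigned epoch, unsigned long next) {
  if (!op_failed) {
    return std::nullopt;  // nothing was submitted
  }
  return submit_error_log_sketch(epoch, next);
}
```

Keeping the optional at the call site means the callee's signature states what it actually guarantees.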
Samuel Just
5e28a3bd3b crimson: introduce rollback_obc_if_modified without an error argument
Signed-off-by: Samuel Just <sjust@redhat.com>
2024-10-14 20:37:26 -07:00
Samuel Just
24b7b4f4b5 crimson: futures from flush_changes_n_do_ops_effects must not fail
The return signature previously suggested that the second future
returned could be an error.  This seemed necessary due to how
effects are handled:

template <typename MutFunc>
OpsExecuter::rep_op_fut_t
OpsExecuter::flush_changes_n_do_ops_effects(
  const std::vector<OSDOp>& ops,
  SnapMapper& snap_mapper,
  OSDriver& osdriver,
  MutFunc mut_func) &&
{
...
    all_completed =
      std::move(all_completed).then_interruptible([this, pg=this->pg] {
      // let's do the cleaning of `op_effects` in destructor
      return interruptor::do_for_each(op_effects,
        [pg=std::move(pg)](auto& op_effect) {
        return op_effect->execute(pg);
      });

However, all of the actual execute implementations (created via
OpsExecuter::with_effect_on_obc) return a bare seastar::future and
cannot fail.

In a larger sense, it's actually critical that neither future returned
from flush_changes_n_do_ops_effects may fail -- they represent applying
the transaction locally and remotely.  If either portion fails, there
would need to be an interval change to recover.

Signed-off-by: Samuel Just <sjust@redhat.com>
2024-10-14 20:37:26 -07:00
Samuel Just
a43452f47e crimson: OpsExecutor::flush_clone_metadata no longer needs to return a future
Snapmapper updates happen during log commit now.

Signed-off-by: Samuel Just <sjust@redhat.com>
2024-10-14 20:37:26 -07:00
Samuel Just
0a83d956e5 crimson: remove the eagain error from PG::do_osd_ops
The idea here is that PG::do_osd_ops propagates an eagain after starting
a repair upon encountering an eio to indicate that the op should restart
from the top of ClientRequest::process_op.

However, InternalClientRequest's handler for this error simply ignores
it.  ClientRequest's handling, while superficially reasonable, doesn't
actually work.  Re-calling process_op would mean reentering previous
stages.  This is problematic for at least a few reasons:
1. Reentering a prior stage with the same handler doesn't actually work
   since the corresponding event entries will already be populated.
2. There might be other ops on the same object waiting on the process
   stage.  They'd need to be sent back as well in order to preserve
   ordering.

Because this mechanism doesn't really seem to be fully baked, let's
remove it for now and try to reintroduce it later after
do_osd_ops[_execute] are a bit simpler.

Signed-off-by: Samuel Just <sjust@redhat.com>
2024-10-14 20:37:26 -07:00
Samuel Just
7da7c3d736 crimson/osd: move pipelines to osd_operation.h
Each of the two existing pipelines is shared across multiple
ops.  Rather than defining them in a specific op or in
osd_operations/common/pg_pipeline.h, just declare them in
osd_operation.h.

Signed-off-by: Samuel Just <sjust@redhat.com>
2024-10-14 20:37:26 -07:00
Samuel Just
96c771383a crimson: eliminate get_obc stage
f90af12d introduced check_already_complete_get_obc to replace get_obc,
but left get_obc and didn't update the other users.

Signed-off-by: Samuel Just <sjust@redhat.com>
2024-10-14 20:37:26 -07:00
Samuel Just
238f3e573d crimson/.../internal_client_request: convert with_interruption to coroutine
Signed-off-by: Samuel Just <sjust@redhat.com>
2024-10-14 20:37:26 -07:00
Samuel Just
a091414c67 crimson/.../internal_client_request: factor out with_interruption
Signed-off-by: Samuel Just <sjust@redhat.com>
2024-10-14 20:37:26 -07:00
Samuel Just
a7812e095c crimson/.../internal_client_request: remove unnecessary system_shutdown guard
Signed-off-by: Samuel Just <sjust@redhat.com>
2024-10-14 20:37:26 -07:00
Samuel Just
4bea366e5d crimson: fix typo OpsExecutor->OpsExecuter
Signed-off-by: Samuel Just <sjust@redhat.com>
2024-10-14 20:37:26 -07:00
Samuel Just
1f99108d19 crimson: add missing field to SUBLOGDPPI and LOGDPPI
SUBLOGDPPI and LOGDPPI need an extra {} for the interrupt_cond.

Signed-off-by: Samuel Just <sjust@redhat.com>
2024-10-14 20:37:26 -07:00
Samuel Just
7b78387696 crimson: remove watchers upon object deletion
Fixes: https://tracker.ceph.com/issues/68538
Signed-off-by: Samuel Just <sjust@redhat.com>
2024-10-14 20:37:26 -07:00
Patrick Donnelly
7e7aac11cd
Merge PR #60301 into main
* refs/pull/60301/head:
	doc/governance: add new CSC members

Reviewed-by: Laura Flores <lflores@redhat.com>
Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
2024-10-14 20:21:22 -04:00
Casey Bodley
c4c647480a osdc: remove unused overloads for async::Completion
ea67f3dee2 switched to
asio::any_completion_handler<> for completions, but left some converting
overloads behind for compatibility. None of those overloads appear to be
used, so remove them.

Signed-off-by: Casey Bodley <cbodley@redhat.com>
2024-10-14 16:13:19 -04:00
Patrick Donnelly
2f61b2847d
doc/governance: update my CSC email
Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2024-10-14 14:57:31 -04:00
Patrick Donnelly
e4177406f9
mailmap: add my ibm email
Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2024-10-14 14:56:46 -04:00
Patrick Donnelly
022b90a753
doc/governance: add new CSC members
Congratulations!

Election: https://vote.heliosvoting.org/helios/elections/f276a15a-84c5-11ef-a0e4-b69e035002b0/view
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2024-10-14 14:40:41 -04:00
Avan Thakkar
88e4484acf mgr/cephadm: add ok_to_stop func for smb service
Fixes: https://tracker.ceph.com/issues/68527
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
2024-10-15 00:00:09 +05:30
Laura Flores
24378a0726
Merge pull request #60158 from aclamk/wip-aclamk-bluefs-truncate-allocations
os/bluestore: Make truncate() drop unused allocations
2024-10-14 12:19:53 -05:00
Matan Breizman
096ff99558
Merge pull request #59914 from xxhdx1985126/wip-68174
crimson/osd/pg: remove snapmapper objects when eventually removing collections at the last moment of pg deleting, just as pg meta objects

Reviewed-by: Samuel Just <sjust@redhat.com>
2024-10-14 18:51:51 +03:00
Lee Sanders
7855ea8a7c
Merge pull request #60169 from lee-j-sanders/wip-ljs-rmcosbench
qa/suites/tasks/cbt.py: Deprecating cosbench from Teuthology in preparation for deletion of cosbench from CBT
2024-10-14 16:14:49 +01:00
Adam King
73e2b06c5c
Merge pull request #59888 from phlogistonjohn/jjm-mypy-more
prepare mypy checking for newer python (3.12)

Reviewed-by: Adam King <adking@redhat.com>
2024-10-14 11:11:36 -04:00
Ronen Friedman
4a5715fdcb
Merge pull request #59942 from ronen-fr/wip-rf-store2-steps
osd/scrub: separate shallow vs deep errors storage

Reviewed-by: Samuel Just <sjust@redhat.com>
2024-10-14 15:44:55 +03:00
Zac Dover
4298b7e41e
Merge pull request #60242 from zdover23/wip-doc-2024-10-10-SubmittingPatches-backports
doc: SubmittingPatches-backports - remove backports team

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
Reviewed-by: Laura Flores <lflores@redhat.com>
2024-10-14 21:24:22 +10:00
Venky Shankar
739670290e Merge PR #60219 into main
* refs/pull/60219/head:
	qa/cephfs: update earmark values to valid ones in test_volumes.py

Reviewed-by: John Mulligan <jmulligan@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
2024-10-14 15:21:15 +05:30
Naman Munet
517ab013e2 mgr/dashboard: sync policies in Object >> Multi-site >> Sync-policy do not show the zonegroup to which the policy belongs
Fixes: https://tracker.ceph.com/issues/68355

Fix includes: added the default zonegroup name to the sync policy details

Signed-off-by: Naman Munet <namanmunet@li-ff83bccc-26af-11b2-a85c-a4b04bfb1003.ibm.com>
2024-10-14 14:47:17 +05:30
leonidc
0f71333fde
Merge pull request #60247 from leonidc/101024-fix-no-listeners
mon/nvmeofgw*: fix HA usecase when gateway has no listeners: behaves …
2024-10-14 11:45:27 +03:00
afreen23
637025b959
Merge pull request #59905 from rhcs-dashboard/osd-perf-impr
mgr/dashboard: introduce server side pagination for osds

Reviewed-by: Afreen Misbah <afreen23.git@gmail.com>
2024-10-14 13:52:55 +05:30
afreen23
203f55c03d
Merge pull request #60259 from afreen23/wip-listener-del
mgr/dashboard: Fix listener deletion

Reviewed-by: Afreen Misbah <afreen23.git@gmail.com>
2024-10-14 13:51:49 +05:30
Matan Breizman
a764b915e0
Merge pull request #59572 from xxhdx1985126/wip-67874
crimson/osd/backfill_state: add the object to be pushed in the peer missing set of PeeringState

Reviewed-by: Samuel Just <sjust@redhat.com>
2024-10-14 10:03:30 +03:00
Matan Breizman
e252561f54
Merge pull request #59878 from xxhdx1985126/wip-68147
crimson/osd/backfill_state: push peer pg infos' last_backfills only when all objects before them are backfilled

Reviewed-by: Matan Breizman <mbreizma@redhat.com>
2024-10-14 10:02:21 +03:00
Ronen Friedman
7fb28191d2
Merge pull request #60198 from ronen-fr/wip-rf-rm-recovery-2
qa/standalone/scrub: remove TEST_recovery_scrub_2

Reviewed-by: Matan Breizman <mbreizma@redhat.com>
2024-10-13 18:52:33 +03:00
Matan Breizman
b473143ac0
Merge pull request #59916 from xxhdx1985126/wip-68175
crimson/osd/backfill_state: do at least one time of replica scanning if necessary in the Enqueuing state

Reviewed-by: Samuel Just <sjust@redhat.com>
2024-10-13 16:04:27 +03:00
Matan Breizman
1316950ce1
Merge pull request #59776 from xxhdx1985126/wip-68061
crimson/osd/backfill_state: always go to Enqueuing when object is pushed during Waiting

Reviewed-by: Samuel Just <sjust@redhat.com>
2024-10-13 16:00:13 +03:00
Matan Breizman
e4365c1f7d
Merge pull request #59853 from xxhdx1985126/wip-crimson-pg-purge-strays
crimson/osd: purge strays when PGs go clean

Reviewed-by: Matan Breizman <mbreizma@redhat.com>
2024-10-13 15:54:56 +03:00
Dan Mick
900fb50837
Merge pull request #60255 from dmick/wip-fix-container-arch
container/build.sh: fix arm architecture tagging
2024-10-11 12:59:17 -07:00
Adam King
d4b04b4042
Merge pull request #60028 from rkachach/fix_issue_add_internal_mtls_check
mgr/cephadm: adding config to enforce clients cert check for internal nginx (mTLS)

Reviewed-by: Adam King <adking@redhat.com>
2024-10-11 08:56:57 -04:00
Max Kellermann
5b0d849730 common/ceph_context: use std::atomic<std::shared_ptr<T>>
Fixes the compiler warning:

 src/common/ceph_context.h: In member function ‘std::shared_ptr<std::vector<entity_addrvec_t> > ceph::common::CephContext::get_mon_addrs() const’:
 src/common/ceph_context.h:288:36: warning: ‘std::shared_ptr<_Tp> std::atomic_load_explicit(const shared_ptr<_Tp>*, memory_order) [with _Tp = vector<entity_addrvec_t>]’ is deprecated: use 'std::atomic<std::shared_ptr<T>>' instead [-Wdeprecated-declarations]
   288 |     auto ptr = atomic_load_explicit(&_mon_addrs, std::memory_order_relaxed);
       |                ~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 /usr/include/c++/14/bits/shared_ptr_atomic.h:133:5: note: declared here
   133 |     atomic_load_explicit(const shared_ptr<_Tp>* __p, memory_order)
       |     ^~~~~~~~~~~~~~~~~~~~

The modernized version does not build with GCC 11, so this patch
contains both versions for now, switched by a `__GNUC__` preprocessor
check.

Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
2024-10-11 12:26:37 +02:00
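A dual-path build of the kind 5b0d849730 describes might look like the sketch below. It is not the patch itself: the commit switches on `__GNUC__`, whereas this sketch swaps in the standard feature-test macro `__cpp_lib_atomic_shared_ptr` so that it also compiles on toolchains whose standard library lacks `std::atomic<std::shared_ptr<T>>`. The class `AddrHolder`, its members, and the `std::vector<int>` payload are hypothetical stand-ins.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical holder for a shared, atomically replaceable vector.
class AddrHolder {
public:
  using addrs_t = std::vector<int>;

  std::shared_ptr<addrs_t> get_addrs() const {
#if defined(__cpp_lib_atomic_shared_ptr)
    // Modern path: member functions of std::atomic<std::shared_ptr<T>>.
    return addrs.load(std::memory_order_relaxed);
#else
    // Fallback path: the older free-function overloads for shared_ptr.
    return std::atomic_load_explicit(&addrs, std::memory_order_relaxed);
#endif
  }

  void set_addrs(std::shared_ptr<addrs_t> p) {
#if defined(__cpp_lib_atomic_shared_ptr)
    addrs.store(std::move(p), std::memory_order_relaxed);
#else
    std::atomic_store_explicit(&addrs, std::move(p),
                               std::memory_order_relaxed);
#endif
  }

private:
#if defined(__cpp_lib_atomic_shared_ptr)
  std::atomic<std::shared_ptr<addrs_t>> addrs;
#else
  std::shared_ptr<addrs_t> addrs;
#endif
};
```

Both branches expose the same interface, so callers do not need to know which one was compiled in.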
Afreen Misbah
3dc091dd12 mgr/dashboard: Fix listener deletion
Listener deletion was broken because the wrong gateway address was passed.
Include `traddr` in the listener DELETE API so that the correct gateway address is chosen for deletion.

The same fix was made for the POST API in 287ff3b360.

Fixes: https://tracker.ceph.com/issues/68506

Signed-off-by: Afreen Misbah <afreen23.git@gmail.com>
2024-10-11 14:58:54 +05:30
Adam Kupczyk
8ebcb2dd46 os/bluestore/ceph-bluestore-tool: Modify show-label for many devs
It was possible to give multiple devices to cbt:
> ceph-bluestore-tool show-label --dev /dev/sda --dev /dev/sdb

But if any of the devices could not provide a valid label, nothing was printed.

Now, always print results. Unreadable labels are output as empty dictionaries.
Exit code:
- 0 if any label properly read
- 1 if all labels failed

Fixes: https://tracker.ceph.com/issues/68505

Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>
2024-10-11 08:21:35 +00:00
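The output and exit-code rule stated in 8ebcb2dd46 can be expressed as a small sketch. It does not reproduce ceph-bluestore-tool: `label_result`, `render_label` and `show_label_exit_code` are hypothetical names, a label is modelled as an optional string, and treating an empty device list as failure is an assumption the commit message does not address.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical per-device result: std::nullopt means the label was unreadable.
using label_result = std::optional<std::string>;

// Unreadable labels are printed as an empty dictionary.
std::string render_label(const label_result& r) {
  return r ? *r : std::string("{}");
}

// Exit code 0 if any label was read properly, 1 if all labels failed.
// An empty input also yields 1 here (assumption).
int show_label_exit_code(const std::vector<label_result>& results) {
  for (const auto& r : results) {
    if (r) {
      return 0;
    }
  }
  return 1;
}
```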
Nizamudeen A
f9b50b2e88 mgr/dashboard: fix group name bugs in the nvmeof API
There are two issues:

1. In cephadm, I was always using the first daemon to populate the group
   in all the services for the dashboard config.

2. In the API, if more than one gateway is listed in the config,
   rather than choosing a random gateway from the group, raise an
   exception and warn the user to specify the gw_group parameter in the
   API request.

Fixes: https://tracker.ceph.com/issues/68463
Signed-off-by: Nizamudeen A <nia@redhat.com>
2024-10-11 13:31:57 +05:30
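The second point of f9b50b2e88 (refuse to guess when several gateways are configured) is sketched below. The dashboard code is not reproduced here, and this sketch is in C++ purely for illustration: `pick_gateway_sketch`, the group-to-address map and the error texts are all hypothetical.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical selection rule: an explicit gw_group wins; a single configured
// gateway is used implicitly; otherwise fail instead of picking one at random.
std::string pick_gateway_sketch(
    const std::map<std::string, std::string>& gateways_by_group,
    const std::optional<std::string>& gw_group) {
  if (gw_group) {
    auto it = gateways_by_group.find(*gw_group);
    if (it == gateways_by_group.end()) {
      throw std::invalid_argument("unknown gateway group: " + *gw_group);
    }
    return it->second;
  }
  if (gateways_by_group.size() == 1) {
    return gateways_by_group.begin()->second;
  }
  // Zero or several gateways and no group given: ask the caller to be explicit.
  throw std::invalid_argument("cannot choose a gateway; specify gw_group");
}
```

Failing loudly keeps a request from silently landing on a gateway in the wrong group.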
Nizamudeen A
86378344ab mgr/dashboard: introduce server side pagination for osds
Fixes: https://tracker.ceph.com/issues/56511
Signed-off-by: Nizamudeen A <nia@redhat.com>
2024-10-11 13:16:35 +05:30