Previously, MOSDMap messages sent to other OSDs were populated with the
superblock's oldest_map. We should, instead, use the superblock's
cluster_osdmap_trim_lower_bound, because oldest_map is merely a marker
for each OSD's trimming progress.
As specified in the docs:
***
We use cluster_osdmap_trim_lower_bound rather than a specific osd's oldest_map
because we don't necessarily trim all MOSDMap::oldest_map. In order to avoid
doing too much work at once we limit the amount of osdmaps trimmed using
``osd_target_transaction_size`` in OSD::trim_maps().
For this reason, a specific OSD's oldest_map can lag behind
OSDSuperblock::cluster_osdmap_trim_lower_bound
for a while.
***
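A minimal sketch of the intended change, with illustrative stand-in types
(the real fields live on OSDSuperblock and MOSDMap; only the choice of
assignment is the point here):

  #include <cstdint>

  struct Superblock {
    uint64_t oldest_map;                      // this OSD's own trim progress
    uint64_t cluster_osdmap_trim_lower_bound; // cluster-wide lower bound
    uint64_t newest_map;
  };

  struct MapMsg {
    uint64_t oldest_map;
    uint64_t newest_map;
  };

  void fill_map_msg(MapMsg& m, const Superblock& sb) {
    // Was: m.oldest_map = sb.oldest_map;
    m.oldest_map = sb.cluster_osdmap_trim_lower_bound;
    m.newest_map = sb.newest_map;
  }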
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Improve the readability and clarity of erasure_coding.rst.
Co-authored-by: Cole Mitchell <cole.mitchell@gmail.com>
Signed-off-by: Zac Dover <zac.dover@gmail.com>
Create the initial mClock QoS params at CONF_DEFAULT level using
set_val_default(). This allows switching to a custom profile on a
running OSD and making the necessary changes to the desired QoS params.
Note that switching to the 'custom' profile and then subsequently changing
the QoS params using "config set osd.n ..." applies them at a higher
level, i.e. at CONF_MON.
But when switching back to a built-in profile, the new values won't take
effect, since CONF_DEFAULT < CONF_MON. For the values to take effect, the
config keys created as part of the 'custom' profile must be removed from
the ConfigMonitor store after switching back to a built-in profile.
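A hedged sketch of the mechanism, assuming a CephContext *cct in scope and
ConfigProxy::set_val_default() as named above; the option keys follow the
osd_mclock_scheduler_* family, and the values are illustrative only:

  // Defaults created this way sit at CONF_DEFAULT, so a later
  // "ceph config set osd.n ..." (stored at CONF_MON) always wins,
  // and removing the CONF_MON keys lets these defaults apply again.
  cct->_conf.set_val_default("osd_mclock_scheduler_client_res", "0.0");
  cct->_conf.set_val_default("osd_mclock_scheduler_client_wgt", "1");
  cct->_conf.set_val_default("osd_mclock_scheduler_client_lim", "0.0");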
- Added a couple of standalone tests to exercise the scenario.
- Updated the mClock configuration document and the mClock internal
documentation to fix a couple of typos relating to the best-effort weights.
- Added new sections to the mClock configuration document outlining the
steps to switch between the built-in and custom profiles and back.
Fixes: https://tracker.ceph.com/issues/55153
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
The document details the refinements to the mclock scheduler and presents
the results of a comparison study performed between the mclock scheduler
and the WPQ scheduler.
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
doc/dev/osd_internals/manifest.rst: add information about clone snap refcounting
Reviewed-by: Myoungwon Oh <myoungwon.oh@samsung.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
I've made a few significant changes:
* TIER_PROMOTE purely ensures that data is resident in the base pool
* Add EVICT_CHUNK to permit a tiering agent to selectively evict cold
space from HEAD/snapshots independently of SET_CHUNK.
* Avoid using DIRTY entirely for dedup targets.
* Instead of modifying clone_range when updating clone manifests, simply
update ReplicatedBackend::calc_*_subsets to also consider the clone
obc manifest.
I've also added sections on how tiering agents and rbd/rgw are meant
to interact, as well as a section on testing.
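To make the op semantics concrete, here is a hedged librados C++ sketch of
the promote-then-evict flow (tier_promote() is an ObjectWriteOperation call
per librados.hpp; placing tier_evict() on ObjectReadOperation mirrors
cache_evict(), but treat that placement, and its availability by release,
as assumptions):

  #include <rados/librados.hpp>
  #include <string>

  // Assumes an open IoCtx on the base pool and an object whose chunks
  // were previously mapped with set_chunk().
  void promote_then_evict(librados::IoCtx& ioctx, const std::string& oid) {
    librados::ObjectWriteOperation promote;
    promote.tier_promote();   // TIER_PROMOTE: data becomes resident in the base pool
    ioctx.operate(oid, &promote);

    librados::ObjectReadOperation evict;
    evict.tier_evict();       // EVICT_CHUNK: release the cold space again
    ioctx.operate(oid, &evict, nullptr);
  }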
Signed-off-by: Samuel Just <sjust@redhat.com>
Make dumping of reservation info congruent between scrub and recovery
Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
This is similar to how recovery reservations are split between
local and remote.
It was the case that scrubs_pending was used for reservations at
the replicas as well as at the primary while requesting reservations
from the replicas. There was no need for scrubs_pending to turn
into scrubs_active at the primary, as nothing treated that value
as special. scrubber.active is true while scrubbing is
actually in progress.
Now scrubber.local_reserved indicates that scrubs_local has been incremented,
and scrubber.remote_reserved indicates that scrubs_remote has been incremented.
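A minimal sketch of the resulting bookkeeping (counter and flag names
follow the message; the surrounding structure is illustrative):

  struct Scrubber {
    bool local_reserved = false;   // set when scrubs_local was incremented
    bool remote_reserved = false;  // set when scrubs_remote was incremented
    bool active = false;           // true only while a scrub actually runs
  };

  int scrubs_local = 0;   // reservations held while requesting from replicas
  int scrubs_remote = 0;  // reservations granted to a remote primary
  Scrubber scrubber;

  void reserve_local() {
    if (!scrubber.local_reserved) {
      ++scrubs_local;
      scrubber.local_reserved = true;
    }
  }

  void reserve_remote() {
    if (!scrubber.remote_reserved) {
      ++scrubs_remote;
      scrubber.remote_reserved = true;
    }
  }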
Fixes: https://tracker.ceph.com/issues/41669
Signed-off-by: David Zafman <dzafman@redhat.com>
I'm going to extract this logic and reuse it in crimson. Recovery* has
always been a confusing name as it implements neither log-based recovery
nor backfill. Rather, it's mainly the business logic for agreeing on
an authoritative log and some ancillary things such as scrub/backfill
reservation.
$ for i in $(git grep -l 'RecoveryMachine'); do sed -i 's/RecoveryMachine/PeeringMachine/g' $i; done
$ for i in $(git grep -l 'RecoveryState'); do sed -i 's/RecoveryState/PeeringState/g' $i; done
$ for i in $(git grep -l 'RecoveryCtx'); do sed -i 's/RecoveryCtx/PeeringCtx/g' $i; done
Signed-off-by: Samuel Just <sjust@redhat.com>