ceph/qa/tasks/ceph.conf.template
Gabriel BenHanokh abba1a8b2c osd/SnapMapper: maintain the prefix_itr between calls to SnapMapper::get_next_objects_to_trim()
Maintain the prefix_itr between calls to SnapMapper::get_next_objects_to_trim() so that already-depleted prefixes are not searched again.
There are 8 distinct hash prefixes used when searching for objects owned by a given PG.
On each call to SnapMapper::get_next_objects_to_trim() we start from the first prefix, even after all objects mapped to it have been depleted.
This means we search 1 non-existent prefix after the first prefix is depleted, 2 after the first two prefixes are depleted, and so on, until we search 7 non-existent prefixes after the first 7 prefixes are depleted.
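For example, even if each of those stages saw only a single call, the wasted lookups would already sum to 1 + 2 + ... + 7 = 28 per snap; in practice get_next_objects_to_trim() is called many times per stage, so the waste multiplies accordingly.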

This is a performance-improvement PR only!
It maintains the existing behavior and does not try to fix or change any of the TRIM logic.
I added an extra step after the last object is trimmed: a full scan of the DB, returning ENOENT only if no object is found.
This makes the new code no worse than the existing code, which returns ENOENT after a full scan finds no object.
It should not impact performance with real-life snaps, as the full scan happens only once per snap.
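Here is a minimal, self-contained C++ sketch of the idea (illustrative only, not the actual SnapMapper code: the class, the multimap backing store, and the lookup() helper are simplified placeholders, and per-snap iterator reset is omitted):

    #include <cerrno>
    #include <map>
    #include <set>
    #include <string>
    #include <vector>

    // Simplified stand-in for SnapMapper: objects are keyed by hash prefix.
    class SnapMapperSketch {
      std::set<std::string> prefixes;              // e.g. the 8 hash prefixes of a PG
      std::set<std::string>::iterator prefix_itr;  // persists across calls
      std::multimap<std::string, std::string> db;  // prefix -> object

      // Stand-in for the omap scan: append objects stored under 'prefix'
      // until 'out' holds 'max' entries in total.
      void lookup(const std::string& prefix, std::size_t max,
                  std::vector<std::string>* out) const {
        auto range = db.equal_range(prefix);
        for (auto it = range.first; it != range.second && out->size() < max; ++it)
          out->push_back(it->second);
      }

    public:
      SnapMapperSketch(std::set<std::string> p,
                       std::multimap<std::string, std::string> objs)
        : prefixes(std::move(p)), prefix_itr(prefixes.begin()),
          db(std::move(objs)) {}

      int get_next_objects_to_trim(std::size_t max, std::vector<std::string>* out) {
        // Resume from where the previous call stopped instead of
        // re-searching prefixes that are already depleted.
        for (; prefix_itr != prefixes.end(); ++prefix_itr) {
          lookup(*prefix_itr, max, out);
          if (out->size() >= max)
            return 0;  // current prefix may still hold more objects
        }
        if (!out->empty())
          return 0;
        // Extra step: one full scan over all prefixes; return ENOENT only
        // if it also finds nothing, matching the old full-scan behavior.
        for (const auto& p : prefixes)
          lookup(p, max, out);
        return out->empty() ? -ENOENT : 0;
      }
    };

The key point is that prefix_itr is a member rather than a local: each call picks up at the prefix where the previous call stopped, and the one-time fallback scan preserves the old "return ENOENT only after a full scan finds nothing" contract.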

Added snap-mapper tests to the rados test suite.
Disabled osd_debug_trim_objects when running (SnapMapperTest, prefix_itr) to prevent asserts, as this test code does illegal inserts into DELETED snaps.
Code beautification.

Disabled the assert, as there is a corner case when we retrieve the last valid object(s) in a snap:
the prefix_itr is advanced past the last valid value (because we completed a full scan).
If the OSD calls get_next_objects_to_trim() before the retrieved object(s) have been processed and removed from the SnapMapper DB, they won't be found by the next call (since the prefix_itr is now invalid).
The objects will instead be found by the second-pass full scan, which makes it seem as if they were added after the trim started (which is illegal) and would trigger an ASSERT.

Signed-off-by: Gabriel BenHanokh <gbenhano@redhat.com>
2023-11-02 19:25:16 +00:00


[global]
chdir = ""
pid file = /var/run/ceph/$cluster-$name.pid
auth supported = cephx
filestore xattr use omap = true
mon clock drift allowed = 1.000
osd crush chooseleaf type = 0
auth debug = true
ms die on old message = true
ms die on bug = true
mon max pg per osd = 10000 # >= luminous
mon pg warn max object skew = 0
# disable pg_autoscaler by default for new pools
osd_pool_default_pg_autoscale_mode = off
osd pool default size = 2
mon osd allow primary affinity = true
mon osd allow pg remap = true
mon warn on legacy crush tunables = false
mon warn on crush straw calc version zero = false
mon warn on no sortbitwise = false
mon warn on osd down out interval zero = false
mon warn on too few osds = false
mon_warn_on_pool_pg_num_not_power_of_two = false
mon_warn_on_pool_no_redundancy = false
mon_allow_pool_size_one = true
osd pool default erasure code profile = "plugin=jerasure technique=reed_sol_van k=2 m=1 crush-failure-domain=osd"
osd default data pool replay window = 5
mon allow pool delete = true
mon cluster log file level = debug
debug asserts on shutdown = true
mon health detail to clog = false
[osd]
osd journal size = 100
osd scrub load threshold = 5.0
osd scrub max interval = 600
osd mclock profile = high_recovery_ops
osd recover clone overlap = true
osd recovery max chunk = 1048576
osd debug shutdown = true
osd debug op order = true
osd debug verify stray on activate = true
osd debug trim objects = true
osd open classes on start = true
osd debug pg log writeout = true
osd deep scrub update digest min age = 30
osd map max advance = 10
journal zero on create = true
filestore ondisk finisher threads = 3
filestore apply finisher threads = 3
bdev debug aio = true
osd debug misdirected ops = true
[mgr]
debug ms = 1
debug mgr = 20
debug mon = 20
debug auth = 20
mon reweight min pgs per osd = 4
mon reweight min bytes per osd = 10
mgr/telemetry/nag = false
[mon]
debug ms = 1
debug mon = 20
debug paxos = 20
debug auth = 20
mon data avail warn = 5
mon mgr mkfs grace = 240
mon reweight min pgs per osd = 4
mon osd reporter subtree level = osd
mon osd prime pg temp = true
mon reweight min bytes per osd = 10
# rotate auth tickets quickly to exercise renewal paths
auth mon ticket ttl = 660 # 11m
auth service ticket ttl = 240 # 4m
# don't complain about insecure global_id in the test suite
mon_warn_on_insecure_global_id_reclaim = false
mon_warn_on_insecure_global_id_reclaim_allowed = false
# 1m isn't quite enough
mon_down_mkfs_grace = 2m
mon_warn_on_filestore_osds = false
[client]
rgw cache enabled = true
rgw enable ops log = true
rgw enable usage log = true
log file = /var/log/ceph/$cluster-$name.$pid.log
admin socket = /var/run/ceph/$cluster-$name.$pid.asok