Based on tests performed at scale on an HDD-based cluster, it was found
that scheduling with mClock was not optimal with multiple OSD shards. For
example, in the scaled cluster with multiple OSD node failures, the client
throughput was inconsistent across test runs and was accompanied by
multiple reported slow requests.
However, the same test with a single OSD shard and multiple worker
threads yielded significantly more consistent client and recovery
throughput across test runs.
For more details, see https://tracker.ceph.com/issues/66289.
Therefore, as an interim measure until the issue with multiple OSD shards
(or multiple mClock queues per OSD) is investigated and fixed, the
default HDD OSD shard configuration is changed as follows:
- osd_op_num_shards_hdd = 1 (was 5)
- osd_op_num_threads_per_shard_hdd = 5 (was 1)
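For reference, the new defaults are equivalent to the following ceph.conf
snippet (an illustrative sketch only; on releases carrying this change no
manual configuration is required):

    [osd]
    # New HDD defaults: a single op shard with five worker threads
    osd_op_num_shards_hdd = 1
    osd_op_num_threads_per_shard_hdd = 5
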
The other changes in this commit include:
- Doc change to the OSD and mClock config reference describing
this change.
- OSD troubleshooting entry describing the procedure to change the shard
configuration on clusters running older releases that are affected by
this issue; a sketch of that procedure is shown after this list.
- Add release note for this change.
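As a rough sketch of that procedure (the authoritative steps are in the
troubleshooting entry added by this commit), the equivalent settings can
be applied with the ceph config CLI; these options are read at OSD
startup, so the OSDs must be restarted for them to take effect:

    # Apply the single-shard HDD configuration cluster-wide
    ceph config set osd osd_op_num_shards_hdd 1
    ceph config set osd osd_op_num_threads_per_shard_hdd 5
    # Then restart the OSDs, e.g. systemctl restart ceph-osd@<id>
    # or via the orchestrator, for the new sharding to take effect
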
Fixes: https://tracker.ceph.com/issues/66289
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
(cherry picked from commit