Mirror of https://github.com/ceph/ceph, synced 2025-01-12 14:10:27 +00:00
doc/cephfs/dirfrags: clarify the unit of threshold limits
Rationale: Many of the split and merge threshold limits in this doc are stated without a unit, for example: "A directory fragment is eligible for splitting when its size exceeds `mds_bal_split_size` (default 10000)". Clarify what 10000 actually means. The same applies to all other such entries in this doc.

Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
parent 988b17c73f
commit 0d094f8f52
@@ -46,8 +46,8 @@ Size thresholds
 ===============
 
 A directory fragment is eligible for splitting when its size exceeds
-``mds_bal_split_size`` (default 10000). Ordinarily this split is
-delayed by ``mds_bal_fragment_interval``, but if the fragment size
+``mds_bal_split_size`` (default 10000 directory entries). Ordinarily this
+split is delayed by ``mds_bal_fragment_interval``, but if the fragment size
 exceeds a factor of ``mds_bal_fragment_fast_factor`` the split size,
 the split will happen immediately (holding up any client metadata
 IO on the directory).
@@ -58,14 +58,15 @@ ENOSPC errors if they try to create files in the fragment. On
 a properly configured system, this limit should never be reached on
 ordinary directories, as they will have split long before. By default,
 this is set to 10 times the split size, giving a dirfrag size limit of
-100000. Increasing this limit may lead to oversized directory fragment
-objects in the metadata pool, which the OSDs may not be able to handle.
+100000 directory entries. Increasing this limit may lead to oversized
+directory fragment objects in the metadata pool, which the OSDs may not
+be able to handle.
 
 A directory fragment is eligible for merging when its size is less
 than ``mds_bal_merge_size``. There is no merge equivalent of the
 "fast splitting" explained above: fast splitting exists to avoid
 creating oversized directory fragments, there is no equivalent issue
-to avoid when merging. The default merge size is 50.
+to avoid when merging. The default merge size is 50 directory entries.
 
 Activity thresholds
 ===================
@@ -79,16 +80,19 @@ operations on directory fragments. The decaying load counters have an
 exponential decay based on the ``mds_decay_halflife`` setting.
 
 On writes, the write counter is
 incremented, and compared with ``mds_bal_split_wr``, triggering a
 split if the threshold is exceeded. Write operations include metadata IO
 such as renames, unlinks and creations.
 
 The ``mds_bal_split_rd`` threshold is applied based on the read operation
 load counter, which tracks readdir operations.
 
-By the default, the read threshold is 25000 and the write threshold is
-10000, i.e. 2.5x as many reads as writes would be required to trigger
-a split.
+The ``mds_bal_split_rd`` and ``mds_bal_split_wr`` configs represent the
+popularity threshold. In the MDS these are measured as "read/write temperatures"
+which is closely related to the number of respective read/write operations.
+By default, the read threshold is 25000 operations and the write
+threshold is 10000 operations, i.e. 2.5x as many reads as writes would be
+required to trigger a split.
 
 After fragments are split due to the activity thresholds, they are only
 merged based on the size threshold (``mds_bal_merge_size``), so
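
To illustrate the units this commit clarifies, here is a minimal Python sketch of how the documented thresholds could be read. It is not the MDS implementation. The class and function names are hypothetical, the value used for the fast factor is an assumption (the diff does not state its default), and the exact comparison operators are guesses based on the wording "exceeds", "less than" and "reached". The size thresholds are counted in directory entries, and the activity thresholds are read/write temperatures, which the doc describes as closely related to operation counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DirfragThresholds:
    """Hypothetical container for the defaults quoted in the doc text."""
    split_size: int = 10_000     # mds_bal_split_size, in directory entries
    merge_size: int = 50         # mds_bal_merge_size, in directory entries
    size_limit: int = 100_000    # hard limit (10 x split size), in directory entries
    split_rd: float = 25_000     # mds_bal_split_rd, read temperature
    split_wr: float = 10_000     # mds_bal_split_wr, write temperature
    # ASSUMPTION: illustrative value only; the diff does not give the
    # default of mds_bal_fragment_fast_factor.
    fast_factor: float = 1.5


def size_action(entries: int, t: DirfragThresholds = DirfragThresholds()) -> str:
    """Classify a fragment by its number of directory entries."""
    if entries >= t.size_limit:            # assumed ">=": limit "reached"
        return "full"                      # creates would fail with ENOSPC
    if entries > t.split_size * t.fast_factor:
        return "split-now"                 # fast split, not delayed
    if entries > t.split_size:
        return "split-delayed"             # waits mds_bal_fragment_interval
    if entries < t.merge_size:
        return "merge"
    return "none"


def activity_split(read_temp: float, write_temp: float,
                   t: DirfragThresholds = DirfragThresholds()) -> bool:
    """True if either decayed load counter exceeds its threshold."""
    return read_temp > t.split_rd or write_temp > t.split_wr
```

With these defaults, a fragment holding 12000 entries would be classified as a delayed split, and one holding 49 entries as a merge candidate. The ratio `split_rd / split_wr` is the 2.5x figure mentioned in the doc text.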