pybind/mgr/pg_autoscaler: change overlapping roots to warning

Change the log level of overlapping roots
from ``Error`` to ``Warning``.

Point the user to documentation that
explains the overlapping roots.

Added more information about overlapping roots
to the autoscaler documentation, including
the steps to get rid of the warning.

Fixes: https://tracker.ceph.com/issues/55611

Signed-off-by: Kamoltat <ksirivad@redhat.com>
This commit is contained in:
Kamoltat 2022-05-12 12:22:13 +00:00
parent 103fe44f3c
commit e8490dae9f
2 changed files with 14 additions and 6 deletions
doc/rados/operations
src/pybind/mgr/pg_autoscaler

@@ -143,16 +143,21 @@ example, a pool that maps
to OSDs of class `hdd` will each have optimal PG counts that depend on
the number of those respective device types.
In the case where a pool uses OSDs under two or more CRUSH roots (e.g., shadow
trees with both `ssd` and `hdd` devices), the autoscaler will
issue a warning to the user in the manager log stating the name of the pool
and the set of roots that overlap each other. The autoscaler will not
scale any pools with overlapping roots because this can cause problems
with the scaling process. We recommend constraining each pool to a single
root (one OSD class) to get rid of the warning and ensure a successful
scaling process.
The autoscaler uses the `bulk` flag to determine which pools
should start out with a full complement of PGs; such pools are
scaled down only when the usage ratio across the pool is uneven.
A pool without the `bulk` flag starts out with minimal PGs
and is scaled up only as usage in the pool grows.
The autoscaler identifies any overlapping roots and prevents the pools
with such roots from scaling because overlapping roots can cause problems
with the scaling process.
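The detection described above can be sketched in a few lines. This is a simplified, hypothetical model, not the module's actual code: `pool_to_roots` (a mapping from pool name to the set of CRUSH root IDs the pool's OSDs fall under) and the two helper functions are illustrative names; the real autoscaler derives this information by walking the CRUSH map.

```python
def find_overlapped_roots(pool_to_roots):
    """Return every CRUSH root that some single pool spans together
    with another root.  (Hypothetical helper mirroring the idea in
    the pg_autoscaler module, not its actual code.)"""
    overlapped = set()
    for pool, roots in pool_to_roots.items():
        if len(roots) > 1:
            # a pool whose OSDs sit under more than one root makes
            # all of those roots "overlapping"
            overlapped.update(roots)
    return overlapped

def pools_that_wont_scale(pool_to_roots, overlapped):
    """Pools touching any overlapped root are skipped by scaling."""
    return {pool for pool, roots in pool_to_roots.items()
            if roots & overlapped}

# Example: "cephfs" spans both the ssd and hdd roots, so both roots
# become overlapping, and "rbd" (under ssd) is dragged in too.
pool_to_roots = {"rbd": {"ssd"}, "cephfs": {"ssd", "hdd"}, "rgw": {"nvme"}}
overlapped = find_overlapped_roots(pool_to_roots)
```

Note that a pool under only one root ("rbd" here) can still be skipped if that root is shared with a multi-root pool, which is why the recommendation is to keep each root (OSD class) to its own set of pools.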
To create a pool with the `bulk` flag::

    ceph osd pool create <pool-name> --bulk

@@ -364,8 +364,11 @@ class PgAutoscaler(MgrModule):
                 if prev_root_id != root_id:
                     overlapped_roots.add(prev_root_id)
                     overlapped_roots.add(root_id)
-                    self.log.error('pool %d has overlapping roots: %s',
-                                   pool_id, overlapped_roots)
+                    self.log.warning("pool %s won't scale due to overlapping roots: %s",
+                                     pool['pool_name'], overlapped_roots)
+                    self.log.warning("Please See: https://docs.ceph.com/en/"
+                                     "latest/rados/operations/placement-groups"
+                                     "/#automated-scaling")
                     break
         if not s:
             s = CrushSubtreeResourceStatus()
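The substance of this patch is only a severity change: the same condition is reported at ``WARNING`` instead of ``ERROR``. The mgr's ``self.log`` is a standard Python logger, so the behavior can be illustrated with the stdlib ``logging`` module alone (the handler, logger name, and pool values below are made up for the demo):

```python
import logging

records = []

class Capture(logging.Handler):
    """Collect emitted records so we can inspect their level."""
    def emit(self, record):
        records.append(record)

log = logging.getLogger("pg_autoscaler_demo")
log.addHandler(Capture())
log.setLevel(logging.WARNING)

# %-style arguments are deferred exactly as in the mgr module:
# formatting happens only if the record is actually emitted.
log.warning("pool %s won't scale due to overlapping roots: %s",
            "cephfs.data", {"ssd", "hdd"})
```

A record logged this way carries ``levelname == "WARNING"``, so monitoring that alerts only on ``ERROR``-level mgr log entries will no longer fire for overlapping roots.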