.. _upmap:

Using pg-upmap
==============

In Luminous v12.2.z and later releases, there is a *pg-upmap* exception table
in the OSDMap that allows the cluster to explicitly map specific PGs to
specific OSDs. This allows the cluster to fine-tune the data distribution so
that, in most cases, PGs are distributed uniformly across OSDs.

However, there is an important caveat when it comes to this new feature: it
requires all clients to understand the new *pg-upmap* structure in the OSDMap.

Enabling
--------

In order to use ``pg-upmap``, the cluster cannot have any pre-Luminous clients.
By default, new clusters enable the *balancer module*, which makes use of
``pg-upmap``. If you want to use a different balancer or you want to make your
own custom ``pg-upmap`` entries, you might want to turn off the balancer in
order to avoid conflict:

.. prompt:: bash $

   ceph balancer off

To allow use of the new feature on an existing cluster, you must restrict the
cluster to supporting only Luminous (and newer) clients. To do so, run the
following command:

.. prompt:: bash $

   ceph osd set-require-min-compat-client luminous

This command will fail if any pre-Luminous clients or daemons are connected to
the monitors. To see which client versions are in use, run the following
command:

.. prompt:: bash $

   ceph features

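Before restricting the cluster to Luminous (and newer) clients, it can help to
scan the ``ceph features`` output for older releases. The following is a
minimal sketch and is not part of Ceph: the function names ``is_pre_luminous``
and ``pre_luminous_releases`` are hypothetical, and the sketch assumes that the
output contains fields of the form ``"release": "<name>"``, which might differ
between Ceph versions.

```shell
# Hypothetical helper (not part of Ceph): succeed if the given release
# name is older than Luminous. The pre-Luminous release names are
# listed explicitly.
is_pre_luminous() {
    case "$1" in
        argonaut|bobtail|cuttlefish|dumpling|emperor|firefly|giant|hammer|infernalis|jewel|kraken)
            return 0 ;;
        *)
            return 1 ;;
    esac
}

# Read `ceph features` output on stdin and print each pre-Luminous
# release name that appears in it.
# Assumption: the output contains fields like "release": "<name>".
pre_luminous_releases() {
    grep -o '"release": *"[a-z]*"' | sed 's/.*"\([a-z]*\)"$/\1/' | sort -u |
    while read -r release; do
        if is_pre_luminous "$release"; then
            echo "$release"
        fi
    done
}

# Usage (on a host with access to the cluster):
#   ceph features | pre_luminous_releases
```

If the function prints nothing, no pre-Luminous release name was found in the
output. Treat this only as a quick check; the ``ceph features`` output itself
remains the reference.
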
Balancer module
---------------

The ``balancer`` module for ``ceph-mgr`` automatically balances the number of
PGs per OSD. See :ref:`balancer`.

Offline optimization
--------------------

Upmap entries are updated with an offline optimizer that is built into
``osdmaptool``.

#. Grab the latest copy of your osdmap:

   .. prompt:: bash $

      ceph osd getmap -o om

#. Run the optimizer:

   .. prompt:: bash $

      osdmaptool om --upmap out.txt [--upmap-pool <pool>] \
      [--upmap-max <max-optimizations>] \
      [--upmap-deviation <max-deviation>] \
      [--upmap-active]

   It is highly recommended that optimization be done for each pool
   individually, or for sets of similarly utilized pools. You can specify the
   ``--upmap-pool`` option multiple times. "Similarly utilized pools" means
   pools that are mapped to the same devices and that store the same kind of
   data (for example, RBD image pools are considered to be similarly utilized;
   an RGW index pool and an RGW data pool are not considered to be similarly
   utilized).

   The ``max-optimizations`` value determines the maximum number of upmap
   entries to identify. The default is ``10`` (as is the case with the
   ``ceph-mgr`` balancer module), but you should use a larger number if you are
   doing offline optimization. If the optimizer cannot find any additional
   changes to make (that is, if the pool distribution is perfect), it will stop
   early.

   The ``max-deviation`` value defaults to ``5``. If an OSD's PG count varies
   from the computed target number by no more than this amount, the OSD will
   be considered perfect.

   The ``--upmap-active`` option simulates the behavior of the active balancer
   in upmap mode. It keeps cycling until the OSDs are balanced and reports how
   many rounds have occurred and how long each round takes. The elapsed time
   for rounds indicates the CPU load that ``ceph-mgr`` consumes when it computes
   the next optimization plan.

#. Apply the changes:

   .. prompt:: bash $

      source out.txt

   In the above example, the proposed changes are written to the output file
   ``out.txt``. The commands in that file are normal Ceph CLI commands that
   can be run in order to apply the changes to the cluster.

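Before applying ``out.txt``, it may be worth reviewing what it proposes. The
sketch below is a hypothetical helper and is not part of ``osdmaptool``. It
assumes that the output file contains one ``ceph osd pg-upmap-items`` or
``ceph osd rm-pg-upmap-items`` command per line, and it only counts those
lines without running them.

```shell
# Hypothetical helper: summarize the commands in an osdmaptool upmap
# output file without executing any of them.
# Assumption: each proposed change is a single line that starts with
# "ceph osd pg-upmap-items" or "ceph osd rm-pg-upmap-items".
summarize_upmap_plan() {
    plan_file="$1"
    # Lines that add or change an upmap entry.
    set_count=$(grep -c '^ceph osd pg-upmap-items ' "$plan_file" || true)
    # Lines that remove an existing upmap entry.
    rm_count=$(grep -c '^ceph osd rm-pg-upmap-items ' "$plan_file" || true)
    echo "set=$set_count remove=$rm_count"
}

# Usage:
#   summarize_upmap_plan out.txt
```

A summary of ``set=0 remove=0`` would suggest that the optimizer proposed no
changes for the selected pools.
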
The above steps can be repeated as many times as necessary to achieve a perfect
distribution of PGs for each set of pools.

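As a worked illustration of what "perfect" means here, the following sketch
applies the ``max-deviation`` rule described above to a single OSD. The
function name ``within_deviation`` is hypothetical, and the sketch works with
whole numbers only; it is not the optimizer's actual implementation.

```shell
# Hypothetical helper: succeed if an OSD's PG count differs from the
# target PG count by no more than max_deviation (integers only).
within_deviation() {
    pg_count="$1"
    target="$2"
    max_deviation="${3:-5}"   # 5 is the documented default
    diff=$((pg_count - target))
    # Take the absolute value of the difference.
    if [ "$diff" -lt 0 ]; then
        diff=$((0 - diff))
    fi
    [ "$diff" -le "$max_deviation" ]
}
```

For example, with a target of 100 PGs and the default deviation of ``5``,
``within_deviation 104 100`` succeeds and ``within_deviation 106 100`` fails.
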
To see some (gory) details about what the tool is doing, you can pass
``--debug-osd 10`` to ``osdmaptool``. To see even more details, pass
``--debug-crush 10`` to ``osdmaptool``.