Merge pull request #27250 from ivancich/wip-update-resharding-docs

rgw: updates to resharding documentation

Reviewed-by: Adam Emerson <aemerson@redhat.com>

commit 0262ed3173

===================================
RGW Dynamic Bucket Index Resharding
===================================

A large bucket index can lead to performance problems. In order
to address this problem we introduced bucket index sharding.
Until Luminous, changing the number of bucket shards (resharding)
needed to be done offline. Starting with Luminous we support
online bucket resharding.

Each bucket index shard can handle its entries efficiently up until
reaching a certain threshold number of entries. If this threshold is
exceeded, the system can encounter performance issues. The dynamic
resharding feature detects this situation and automatically increases
the number of shards used by the bucket index, resulting in a
reduction of the number of entries in each bucket index shard. This
process is transparent to the user.

The detection process runs:

1. when new objects are added to the bucket and
2. in a background process that periodically scans all the buckets.

The background process is needed in order to deal with existing
buckets in the system that are not being updated. A bucket that
requires resharding is added to the resharding queue and will be
scheduled to be resharded later. The reshard threads run in the
background and execute the scheduled resharding tasks, one at a time.
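
To gauge how close buckets are to the resharding threshold, per-shard
object counts can be inspected. As a minimal usage sketch (the exact
output fields vary by release), the ``radosgw-admin bucket limit
check`` command reports each bucket's shard count, objects per shard,
and a fill status::

   # radosgw-admin bucket limit check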

Multisite
=========

Dynamic resharding is not supported in a multisite environment.

Configuration
=============

Enable/Disable dynamic bucket index resharding:

- ``rgw_dynamic_resharding``: true/false, default: true

Configuration options that control the resharding process:

- ``rgw_reshard_num_logs``: number of shards for the resharding queue, default: 16

- ``rgw_reshard_bucket_lock_duration``: duration, in seconds, of the lock on the bucket object during resharding, default: 120 seconds

- ``rgw_max_objs_per_shard``: maximum number of objects per bucket index shard before resharding is triggered, default: 100000 objects

- ``rgw_reshard_thread_interval``: maximum time, in seconds, between rounds of resharding queue processing, default: 600 seconds
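
As an illustration, these options can be set in ``ceph.conf``; the
snippet below is a sketch that uses the ``[global]`` section and
simply repeats the default values listed above::

   [global]
   # illustrative only: these are the defaults
   rgw_dynamic_resharding = true
   rgw_reshard_num_logs = 16
   rgw_reshard_bucket_lock_duration = 120
   rgw_max_objs_per_shard = 100000
   rgw_reshard_thread_interval = 600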

Admin commands
==============

Add a bucket to the resharding queue
------------------------------------

::

   # radosgw-admin reshard add --bucket <bucket_name> --num-shards <new number of shards>

List resharding queue
---------------------

::

   # radosgw-admin reshard list

Process tasks on the resharding queue
-------------------------------------

::

   # radosgw-admin reshard process

Bucket resharding status
------------------------

::

   # radosgw-admin reshard status --bucket <bucket_name>

Cancel pending bucket resharding
--------------------------------

Note: Ongoing bucket resharding operations cannot be cancelled. ::

   # radosgw-admin reshard cancel --bucket <bucket_name>

Manual immediate bucket resharding
----------------------------------

::

   # radosgw-admin bucket reshard --bucket <bucket_name> --num-shards <new number of shards>
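
For example, to immediately reshard a hypothetical bucket named
``mybucket`` to 32 shards (both values are illustrative only)::

   # radosgw-admin bucket reshard --bucket mybucket --num-shards 32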

Troubleshooting
===============

Clusters prior to Luminous 12.2.11 and Mimic 13.2.5 left behind stale bucket
instance entries, which were not automatically cleaned up. The issue also affected
LifeCycle policies, which were not applied to resharded buckets anymore. Both of
these issues can be worked around using a couple of radosgw-admin commands.

Stale instance management
-------------------------

List the stale instances in a cluster that are ready to be cleaned up.

::

   # radosgw-admin reshard stale-instances list

Clean up the stale instances in a cluster. Note: cleanup of these
instances should only be done on a single site cluster.

::

   # radosgw-admin reshard stale-instances rm

Lifecycle fixes
---------------

For clusters that had resharded instances, it is highly likely that the old
lifecycle processes would have flagged and deleted lifecycle processing as the
bucket instance changed during a reshard. While this is fixed for newer clusters
(from Mimic 13.2.6 and Luminous 12.2.12), older buckets that had lifecycle policies and
that have undergone resharding will have to be manually fixed.

The command to do so is:

::

   # radosgw-admin lc reshard fix --bucket <bucket_name>

As a convenience wrapper, if the ``--bucket`` argument is dropped then this
command will try and fix lifecycle policies for all the buckets in the cluster.
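
For example, a sketch of that convenience form, attempting the fix
across every bucket in the cluster::

   # radosgw-admin lc reshard fix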