==================================
Orphan List and Associated Tooling
==================================

.. versionadded:: Luminous

.. contents::

Orphans are RADOS objects that are left behind after their associated
RGW objects are removed. Normally these RADOS objects are removed
automatically, either immediately or through a process known as
"garbage collection". Over the history of RGW, however, there may have
been bugs that prevented these RADOS objects from being deleted, and
these RADOS objects may be consuming space on the Ceph cluster without
being of any use. From the perspective of RGW, we call such RADOS
objects "orphans".

Orphans Find -- DEPRECATED
--------------------------

The ``radosgw-admin`` tool has (or had) three subcommands to help manage
orphans, but these subcommands are (or will soon be) deprecated. These
subcommands are:

::

    # radosgw-admin orphans find ...
    # radosgw-admin orphans finish ...
    # radosgw-admin orphans list-jobs ...

There are two key problems with these subcommands. First, they have
not been actively maintained and therefore have not kept pace with RGW
as its features have evolved. As a result, confidence that these
subcommands can accurately identify true orphans is presently low.

Second, these subcommands store intermediate results on the cluster
itself. This can be problematic when cluster administrators are
confronting insufficient storage space and want to remove orphans as a
means of addressing the issue. The intermediate results could strain
the existing cluster storage capacity even further.

For these reasons, "orphans find" has been deprecated.

Orphan List
-----------

Because "orphans find" has been deprecated, RGW now includes an
additional tool, ``rgw-orphan-list``. When run, it lists the
available pools and prompts the user to enter the name of the data
pool. The tool then produces, perhaps after an extended period of
time, a local file containing the RADOS objects from the designated
pool that appear to be orphans. The administrator is free to examine
this file and then decide on a course of action, perhaps removing
those RADOS objects from the designated pool.

All intermediate results are stored on the local file system rather
than on the Ceph cluster, so running the ``rgw-orphan-list`` tool should
have no appreciable impact on the amount of cluster storage consumed.

WARNING: Experimental Status
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The ``rgw-orphan-list`` tool is new and therefore currently considered
experimental. The list of orphans produced should be "sanity checked"
before being used for a large delete operation.

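One way to carry out such a sanity check is to inspect a random sample
of the listed objects before deleting anything. The following is a
minimal sketch and is not part of the tool itself. It assumes that the
list file contains one RADOS object name per line. The function name
``sample_orphans``, the file name ``orphan-list.out``, and the ``$pool``
variable are hypothetical and used only for illustration.

.. code-block:: shell

    # Print a random sample of object names from an orphan list file.
    #   $1: path to the list file (assumed: one object name per line)
    #   $2: sample size (defaults to 20)
    sample_orphans() {
        # Drop blank lines, then let shuf (GNU coreutils) pick the sample.
        grep -v '^[[:space:]]*$' "$1" | shuf -n "${2:-20}"
    }

    # Hypothetical usage: look at each sampled object with "rados stat"
    # before deciding whether the list can be trusted.
    #
    #   sample_orphans ./orphan-list.out 20 | while IFS= read -r obj; do
    #       rados -p "$pool" stat "$obj"
    #   done

``rados stat`` reports the modification time and size of each object,
which can help in judging whether a sampled object plausibly belongs to
data that was deleted.
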
WARNING: Specifying a Data Pool
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If a pool other than an RGW data pool is specified, the results of the
tool will be erroneous. All RADOS objects found on such a pool will
falsely be designated as orphans.

WARNING: Unindexed Buckets
~~~~~~~~~~~~~~~~~~~~~~~~~~

RGW allows for unindexed buckets, that is, buckets that do not maintain
an index of their contents. This is not a typical configuration, but
it is supported. Because the ``rgw-orphan-list`` tool uses the bucket
indices to determine what RADOS objects should exist, objects in
unindexed buckets will falsely be listed as orphans.

RADOS List
----------

One of the sub-steps in computing a list of orphans is to map each RGW
object into its corresponding set of RADOS objects. This is done using
a subcommand of ``radosgw-admin``.

::

    # radosgw-admin bucket radoslist [--bucket={bucket-name}]

The subcommand produces a list of the RADOS objects that back all of
the RGW objects. If a bucket is specified, the subcommand lists only
the RADOS objects that correspond to the RGW objects in the specified
bucket.

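To illustrate how such a list can feed into orphan detection, the
sketch below treats the orphan candidates as the set difference between
every object name in the data pool and the object names that RGW
expects to exist. This is a simplified illustration under stated
assumptions and not necessarily how ``rgw-orphan-list`` is implemented.
The function name ``list_orphan_candidates`` and the file names in the
usage comment are hypothetical.

.. code-block:: shell

    # Print the names that appear in the first file but not in the second.
    #   $1: file with every RADOS object name in the data pool
    #   $2: file with the RADOS object names RGW expects to exist
    list_orphan_candidates() {
        tmpdir=$(mktemp -d) || return 1
        # comm requires sorted input; LC_ALL=C keeps the ordering used by
        # sort and comm consistent.
        LC_ALL=C sort -u "$1" > "$tmpdir/all"
        LC_ALL=C sort -u "$2" > "$tmpdir/expected"
        # -23 suppresses lines unique to the second file and lines common
        # to both, leaving only the names unique to the first file.
        LC_ALL=C comm -23 "$tmpdir/all" "$tmpdir/expected"
        rm -rf "$tmpdir"
    }

    # Hypothetical usage, with $pool set to the RGW data pool:
    #
    #   rados -p "$pool" ls > all-objects.txt
    #   radosgw-admin bucket radoslist > expected-objects.txt
    #   list_orphan_candidates all-objects.txt expected-objects.txt

Both input files live on the local file system, so a comparison of
this kind does not consume additional cluster storage.
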
Note: Shared Bucket Markers
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Some administrators will be aware of the coding schemes used to name
the RADOS objects that correspond to RGW objects, which include a
"marker" unique to a given bucket.

RADOS objects that correspond to the contents of one RGW bucket,
however, may contain a marker that specifies a different bucket. This
behavior is a consequence of the "shallow copy" optimization used by
RGW. When larger objects are copied from bucket to bucket, only the
"head" objects are actually copied, and the "tail" objects are
shared. Those shared objects will contain the marker of the original
bucket.

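To see whether a bucket relies on shared tail objects, the RADOS object
names reported for that bucket can be grouped by marker. The sketch
below assumes that each object name begins with the bucket marker
followed by an underscore and that the marker itself contains no
underscore. That naming detail is an assumption made for illustration,
as is the function name ``count_by_marker``.

.. code-block:: shell

    # Count RADOS object names per leading marker.
    #   $1: file with one RADOS object name per line
    # Assumption: the text before the first underscore is the marker.
    count_by_marker() {
        awk -F_ '{ n[$1]++ } END { for (m in n) print m, n[m] }' "$1" |
            LC_ALL=C sort
    }

    # Hypothetical usage, with $bucket set to the bucket name:
    #
    #   radosgw-admin bucket radoslist --bucket="$bucket" > bucket-objects.txt
    #   count_by_marker bucket-objects.txt

More than one marker in the output would suggest that some of the
bucket's objects were shallow-copied from another bucket.
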
.. _Data Layout in RADOS: ../layout
.. _Pool Placement and Storage Classes: ../placement