dynamic reshard was gated behind the zonegroup resharding flag with
RGWSI_Zone::can_reshard(), but manual reshard was only calling
RGWBucketReshard::can_reshard()
Signed-off-by: Casey Bodley <cbodley@redhat.com>
RGWSI_BucketIndex_RADOS::handle_overwrite() is already writing the
datalog/bilog entries related to BUCKET_DATASYNC_DISABLED
RGWBucket::sync() calls handle_overwrite() indirectly from
bucket->put_info() when it writes the bucket instance with this new
BUCKET_DATASYNC_DISABLED flag, so RGWBucket::sync() shouldn't
duplicate those writes here
Signed-off-by: Casey Bodley <cbodley@redhat.com>
several places were getting the current index layout indirectly
with layout.logs.back() and rgw::log_to_index_layout(). use
get_current_index() instead so we don't rely on layout.logs, which may
be empty for indexless buckets
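
A minimal sketch of the difference, using hypothetical, heavily simplified stand-ins for the layout types (the struct names and fields below are assumptions for illustration, not the real rgw definitions):

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-ins for the bucket layout types.
enum class IndexType { Normal, Indexless };

struct IndexLayout {
  uint64_t gen = 0;
  IndexType type = IndexType::Normal;
  uint32_t num_shards = 0;
};

struct LogLayout {
  uint64_t gen = 0;
  uint32_t num_shards = 0;
};

struct BucketLayout {
  IndexLayout current_index;    // always present
  std::vector<LogLayout> logs;  // may be empty for indexless buckets
};

// Indirect lookup: only valid when logs is non-empty, because calling
// std::vector::back() on an empty vector is undefined behavior.
inline uint32_t shards_via_logs(const BucketLayout& layout) {
  return layout.logs.back().num_shards;
}

// Direct lookup: never touches layout.logs, so it is safe for
// indexless buckets too.
inline const IndexLayout& get_current_index(const BucketLayout& layout) {
  return layout.current_index;
}
```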
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Writing bucket instance info is surprising, as if you pass a null
pointer for the attributes, it just erases all the attributes.
To avoid disturbing users and other 'system objects', make a special
case that we can pass in explicitly.
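
One way to picture the explicit special case is sketched below. The AttrPolicy enum, the function signature and the struct are all hypothetical and only illustrate the idea of making "keep the stored attributes" an opt-in choice while leaving the default behavior unchanged:

```cpp
#include <map>
#include <string>

using Attrs = std::map<std::string, std::string>;

// Hypothetical stand-in for a stored bucket instance object.
struct StoredInstanceSketch {
  int flags = 0;
  Attrs attrs;
};

// Hypothetical policy argument; the real change may look different.
enum class AttrPolicy { Replace, KeepExisting };

void put_instance_info(StoredInstanceSketch& stored, int new_flags,
                       const Attrs* attrs,
                       AttrPolicy policy = AttrPolicy::Replace) {
  stored.flags = new_flags;
  if (attrs) {
    stored.attrs = *attrs;  // caller supplied attributes: write them
  } else if (policy == AttrPolicy::Replace) {
    stored.attrs.clear();   // default: null pointer erases everything
  }
  // KeepExisting with a null pointer: stored attributes are untouched
}
```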
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
use an API that does not check for cache inconsistency.
hence, the "WARNING: The bucket info cache is inconsistent" warning is removed from reshard
Signed-off-by: Yuval Lifshitz <ylifshit@redhat.com>
There appears to be a long-standing bug in RGW such that when
resharding is cancelled and the bucket instance is updated to reflect
the new resharding status, the xattrs are lost. The xattrs are used
to store metadata such as ACLs and lifecycle policies.
This commit makes sure that all call paths that lead to a cancelled
reshard provide the xattrs, so they can be included when the bucket
instance info is updated.
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
With the new resharding code, some bucket metadata that is stored as
xattrs (e.g., ACLs, lifecycle policies) was not sent with the
updated bucket instance data when resharding completed. As a result,
resharding has a regression where that metadata is lost after a
successful reshard.
This commit restores the variable in the RGWBucketReshard class that
maintains the bucket attributes, so they can be saved when the bucket
instance object is updated.
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
The "bucket radoslist" sub-command of radosgw-admin is supposed to
list all rados objects tied to one or all buckets and thereby
provide a way to determine orphaned rados objects.
But indexless buckets don't provide an index to employ for this
purpose. So warnings or errors should be provided depending on the
circumstances.
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
The code for bucket stats was recently updated to check for an
indexless bucket before proceeding. The interface on RGWBucketInfo was
recently expanded to support these types of checks, so it is now used.
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
Determining whether a bucket is indexless starting with an
RGWBucketInfo object requires traversing multiple data structures and
"inside knowledge" blurring the line between interface and
implementation. The same applies to retrieving the current index for
non-indexless buckets.
This commit adds to the RGWBucketInfo interface to make this
information readily accessible.
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
without this change, the following errors occur during sync:
ERROR: AWS4 completion for operation: 0, NOT IMPLEMENTED
op->ERRORHANDLER: err_no=-2201 new_err_no=-2201
Signed-off-by: Yuval Lifshitz <ylifshit@redhat.com>
The new bucket layout code didn't check whether the bucket is
indexless prior to asking for the last entry in the layout log. The
layout log appears to be empty for an indexless bucket, thereby
putting the runtime in an undefined state that later may cause a
failed assertion.
This commit adds two safety checks and returns -EINVAL along with
putting useful information on stderr when either stats are requested
on an indexless bucket or when the layout log is empty.
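
The two checks can be sketched roughly as follows. The struct, the function name and the message wording are assumptions for illustration; only the -EINVAL return and the stderr output come from the description above:

```cpp
#include <cerrno>
#include <cstdint>
#include <iostream>
#include <vector>

// Hypothetical, simplified view of a bucket's layout.
struct BucketLayoutSketch {
  bool indexless = false;
  std::vector<uint32_t> log_shards;  // one entry per layout log generation
};

// Returns 0 when stats can be computed, -EINVAL otherwise.
int check_stats_preconditions(const BucketLayoutSketch& layout) {
  if (layout.indexless) {
    std::cerr << "error: stats requested on an indexless bucket" << std::endl;
    return -EINVAL;
  }
  if (layout.log_shards.empty()) {
    // guard before anything asks for the last entry of the layout log
    std::cerr << "error: bucket layout log is empty" << std::endl;
    return -EINVAL;
  }
  return 0;
}
```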
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
RGWDataSyncSingleEntryCR is the only caller of RGWRunBucketSourcesSyncCR.
it always provides a source_bs, and never provides a target_bs. so remove
all the complexity related to target_bs, and the idea that we'd need to
sync several source bucket shards related to the target bucket.
we now just have the single loop over the target buckets that use the
given bucket as a source
Signed-off-by: Casey Bodley <cbodley@redhat.com>
when data sync queries RGWOp_BILog_Info from an un-upgraded gateway, it
doesn't include the oldest_gen/latest_gen fields. so initialize these
variables to 0 by default
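
A small sketch of why the defaults matter. The struct and the map-based decoder are hypothetical stand-ins for the real reply parsing; the point is only that fields absent from an older gateway's reply keep their zero initializers:

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for the decoded bilog info reply.
struct BilogInfoSketch {
  std::string max_marker;
  uint64_t oldest_gen = 0;  // stays 0 if the remote omits the field
  uint64_t latest_gen = 0;  // stays 0 if the remote omits the field
};

// Hypothetical decoder: copies only the fields present in the reply.
BilogInfoSketch decode_info(const std::map<std::string, std::string>& fields) {
  BilogInfoSketch info;
  if (auto it = fields.find("max_marker"); it != fields.end()) {
    info.max_marker = it->second;
  }
  if (auto it = fields.find("oldest_gen"); it != fields.end()) {
    info.oldest_gen = std::stoull(it->second);
  }
  if (auto it = fields.find("latest_gen"); it != fields.end()) {
    info.latest_gen = std::stoull(it->second);
  }
  return info;
}
```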
Signed-off-by: Casey Bodley <cbodley@redhat.com>
enable the background dynamic resharding thread based on
RGWSI_Zone::can_reshard(), which takes the zonegroup features into
account
Fixes: https://tracker.ceph.com/issues/52877
Signed-off-by: Casey Bodley <cbodley@redhat.com>
if the remote gives us more shards than we expect, just count those
shards as 'behind' and avoid out-of-bounds access of shard_status
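
A rough sketch of the bounds handling, assuming markers can be compared as plain strings (an assumption for illustration; the function name and types are hypothetical):

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Counts shards whose local position is behind the remote marker.
// Remote shards with no local status entry are counted as behind
// instead of being looked up in shard_status.
std::size_t count_shards_behind(const std::vector<std::string>& shard_status,
                                const std::vector<std::string>& remote_markers) {
  std::size_t behind = 0;
  for (std::size_t i = 0; i < remote_markers.size(); ++i) {
    if (i >= shard_status.size()) {
      ++behind;  // more shards than expected: no out-of-bounds access
      continue;
    }
    if (shard_status[i] < remote_markers[i]) {
      ++behind;
    }
  }
  return behind;
}
```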
Signed-off-by: Casey Bodley <cbodley@redhat.com>
if the full sync status object is missing, it's possible that we just
haven't started syncing it again after upgrading from just the per-shard
status objects
in this case, as long as we have a log generation 0, assume that we just
haven't initialized the full status object and try to read the gen=0
per-shard incremental status for comparison
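
The fallback can be sketched like this. The types and the function name are hypothetical; the sketch only captures the decision of which generation's per-shard incremental status to read:

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for the full sync status object.
struct FullStatusSketch {
  uint64_t incremental_gen = 0;
};

// Returns the generation whose per-shard incremental status should be
// read, or nullopt when no comparison is possible.
std::optional<uint64_t> pick_incremental_gen(
    const std::optional<FullStatusSketch>& full_status,
    const std::vector<uint64_t>& log_gens) {
  if (full_status) {
    return full_status->incremental_gen;
  }
  // full status object missing: fall back to gen=0 only if that log
  // generation still exists
  for (uint64_t gen : log_gens) {
    if (gen == 0) {
      return uint64_t{0};
    }
  }
  return std::nullopt;
}
```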
Signed-off-by: Casey Bodley <cbodley@redhat.com>
all we need to construct the per-shard bucket sync status object names
are the bucket names themselves, which we already have from
rgw_sync_bucket_pipe
Signed-off-by: Casey Bodley <cbodley@redhat.com>
rgw_read_bucket_inc_sync_status() uses the size of this vector as the
'num_shards', so we need to resize it appropriately beforehand
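
A minimal sketch of the pattern, with a hypothetical stand-in reader that, like the description above, treats the vector's size as the shard count:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical per-shard status entry.
struct ShardStatusSketch {
  std::string position;
};

// Stand-in reader: fills one entry per element already in the vector,
// so an empty vector means nothing is read.
std::size_t read_inc_status(std::vector<ShardStatusSketch>& status) {
  for (std::size_t shard = 0; shard < status.size(); ++shard) {
    status[shard].position = "marker-" + std::to_string(shard);
  }
  return status.size();
}

std::vector<ShardStatusSketch> read_all_shards(std::size_t num_shards) {
  std::vector<ShardStatusSketch> status;
  status.resize(num_shards);  // size the vector before the read
  read_inc_status(status);
  return status;
}
```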
Signed-off-by: Casey Bodley <cbodley@redhat.com>
the calls to rgw_read_bucket_inc_sync_status() depend on
sync_status.incremental_gen, which we need to read via
rgw_read_bucket_full_sync_status() regardless of whether
we're returning it to the client (version > 1)
Signed-off-by: Casey Bodley <cbodley@redhat.com>
As specified in rgw_bucket_index_marker_info, unless we're doing the
compatibility check, in which case we look at generation 0.
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>