131541 Commits

Author SHA1 Message Date
Casey Bodley
87efd66db0 test/rgw: test_rgw_reshard.py injects ECANCELED on set_target_layout/commit_target_layout
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2022-05-27 15:47:33 -04:00
Casey Bodley
c70a1fc882 radosgw-admin: add --inject-error-code to customize injected error
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2022-05-27 15:47:33 -04:00
Casey Bodley
1af609198d rgw/reshard: revert_target_layout handles ECANCELED races/retries
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2022-05-27 15:47:33 -04:00
Casey Bodley
ea04393676 rgw/reshard: init_target_layout handles ECANCELED races/retries
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2022-05-27 15:47:33 -04:00
Casey Bodley
53aa8a11db rgw/reshard: commit_reshard handles ECANCELED races/retries
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2022-05-27 15:47:33 -04:00
Casey Bodley
1be9c9f2c1 rgw: pass non-const ReshardFaultInjector
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2022-05-27 15:47:33 -04:00
Casey Bodley
f683177b11 rgw: add comparison operators for index layout types
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2022-05-27 15:47:33 -04:00
Casey Bodley
fdbccf3967 rgw/reshard: set_resharding_status() doesn't need retry
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2022-05-27 15:47:33 -04:00
Adam C. Emerson
935ac48919 rgw: Retry -ECANCELED in reshard commit and cancel
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
2022-05-27 15:47:33 -04:00
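The ECANCELED handling named in 935ac48919 and the related reshard commits can be pictured as a reload-and-retry loop. The sketch below is only an illustration under assumptions: `retry_raced_write`, `reload`, `write`, and `max_attempts` are hypothetical names, and the real code paths in rgw may be structured differently.

```cpp
#include <cerrno>
#include <functional>

// Hypothetical helper: run `write`, and if it loses a race (-ECANCELED),
// call `reload` to refresh the local state and try again, up to
// `max_attempts` writes in total. Any other result is returned as-is.
int retry_raced_write(const std::function<int()>& reload,
                      const std::function<int()>& write,
                      int max_attempts) {
  int r = -ECANCELED;
  for (int i = 0; i < max_attempts && r == -ECANCELED; ++i) {
    if (i > 0) {
      // A previous attempt raced with another writer; refresh first.
      int rr = reload();
      if (rr < 0) {
        return rr;
      }
    }
    r = write();
  }
  return r;
}
```

A caller would pass a `write` callback that attempts the update and a `reload` callback that re-reads the current state; if every attempt races, the helper gives up and returns -ECANCELED.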
Casey Bodley
85c091588f rgw: prevent 'radosgw-admin bucket reshard' if zonegroup reshard is disabled
dynamic reshard was gated behind the zonegroup resharding flag with
RGWSI_Zone::can_reshard(), but manual reshard was only calling
RGWBucketReshard::can_reshard()

Signed-off-by: Casey Bodley <cbodley@redhat.com>
2022-05-27 15:47:33 -04:00
Casey Bodley
24431e07b2 rgw: add back json for zone/zonegroup features
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2022-05-27 15:47:33 -04:00
Casey Bodley
45d378e414 rgw: RGWBucket::sync() no longer duplicates datalog/bilog entries
RGWSI_BucketIndex_RADOS::handle_overwrite() is already writing the
datalog/bilog entries related to BUCKET_DATASYNC_DISABLED

RGWBucket::sync() calls handle_overwrite() indirectly from
bucket->put_info() when it writes the bucket instance with this new
BUCKET_DATASYNC_DISABLED flag, so RGWBucket::sync() shouldn't
duplicate those writes here

Signed-off-by: Casey Bodley <cbodley@redhat.com>
2022-05-27 15:47:33 -04:00
Casey Bodley
37bfba1f0e rgw: add checks for non-empty layout.logs
always verify that logs is not empty before calling logs.back() or
logs.front()

Signed-off-by: Casey Bodley <cbodley@redhat.com>
2022-05-27 15:47:33 -04:00
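The guard described in 37bfba1f0e can be sketched roughly as follows. `LogGeneration`, `BucketLayout`, and `latest_log_gen` are hypothetical stand-ins rather than the actual rgw types; the point is only that `back()` on an empty vector is undefined behavior, so the emptiness check has to come first.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-ins for the real layout types.
struct LogGeneration {
  uint64_t gen;
};

struct BucketLayout {
  std::vector<LogGeneration> logs;  // may be empty, e.g. for indexless buckets
};

// Return the newest log generation, or nullopt when there are no logs.
// The empty() check must precede back(), which is undefined on an
// empty vector.
std::optional<uint64_t> latest_log_gen(const BucketLayout& layout) {
  if (layout.logs.empty()) {
    return std::nullopt;
  }
  return layout.logs.back().gen;
}
```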
Casey Bodley
165eae9bb6 rgw: use get_current_index() instead of log_to_index_layout()
several places were getting the current index layout indirectly
with layout.logs.back() and rgw::log_to_index_layout(). use
get_current_index() instead so we don't rely on layout.logs, which may
be empty for indexless buckets

Signed-off-by: Casey Bodley <cbodley@redhat.com>
2022-05-27 15:47:33 -04:00
Casey Bodley
b993001d38 radosgw-admin: add command to dump 'bucket layout'
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2022-05-27 15:47:33 -04:00
Adam C. Emerson
da0116dc40 rgw: Add generation to ChangeStatus
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
2022-05-27 15:47:33 -04:00
Adam C. Emerson
c2b2138e41 rgw: Compare log.gen to log.gen
And refuse to remove the only log.

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
2022-05-27 15:47:33 -04:00
Adam C. Emerson
010b3ad04a rgw: Don't erase bucket attributes on trim
Writing bucket instance info is surprising: if you pass a null
pointer for the attributes, it erases all the attributes.

To avoid disturbing users and other 'system objects', make a special
case that we can pass in explicitly.

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
2022-05-27 15:47:33 -04:00
Yuval Lifshitz
9eab2dbc97 rgw/reshard: resolve inconsistent cache warnings
use an API that does not check for cache inconsistency
hence, "WARNING: The bucket info cache is inconsistent" warnings are removed from reshard

Signed-off-by: Yuval Lifshitz <ylifshit@redhat.com>
2022-05-27 15:47:33 -04:00
Casey Bodley
c3ccb1adb4 test/rgw: test_bucket_reshard verifies that ACLs are preserved
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2022-05-27 15:47:33 -04:00
J. Eric Ivancich
f0080c24b3 rgw: save bucket instance xattrs when resharding cancelled
There appears to be a long-standing bug in RGW such that when
resharding is cancelled and the bucket instance is updated to reflect
the new resharding status, the xattrs were lost. The xattrs are used
to store metadata such as ACLs and LifeCycle policies.

This commit makes sure that all call paths that lead to a cancelled
reshard provide the xattrs, so they can be included when the bucket
instance info is updated.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
2022-05-27 15:47:33 -04:00
J. Eric Ivancich
267bd85e7d rgw: resharding causes bucket attributes to be lost
With the new resharding code, some bucket metadata that is stored as
xattrs (e.g., ACLs, life-cycle policies) was not sent with the
updated bucket instance data when resharding completed. As a result,
resharding has a regression where that metadata is lost after a
successful reshard.

This commit restores the variable in the RGWBucketReshard class that
maintains the bucket attributes, so they can be saved when the bucket
instance object is updated.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
2022-05-27 15:47:33 -04:00
J. Eric Ivancich
10b785816f rgw: add indexless bucket logic to "bucket radoslist"
The "bucket radoslist" sub-command of radosgw-admin is supposed to
list all rados objects tied to one or all directories and thereby
provide a way to determine orphaned rados objects.

But indexless buckets don't provide an index to employ for this
purpose. So warnings or errors should be provided depending on the
circumstances.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
2022-05-27 15:47:33 -04:00
J. Eric Ivancich
194b0de182 rgw: update indexless bucket check for bucket stats
The code for bucket stats was recently updated to check for an
indexless bucket before proceeding. The interface on RGWBucketInfo was
recently expanded to support these types of checks, so it is now used.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
2022-05-27 15:47:33 -04:00
J. Eric Ivancich
177bb80cd5 rgw: add streamlined ways to handle indexless buckets correctly
Determining whether a bucket is indexless starting with an
RGWBucketInfo object requires traversing multiple data structures and
"inside knowledge" blurring the line between interface and
implementation. The same applies for retrieving the current index for
non-indexless buckets.

This commit adds to the RGWBucketInfo interface to make this
information readily accessible.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
2022-05-27 15:47:33 -04:00
Yuval Lifshitz
98e72261ea rgw/multisite: add type to RGW_OP_SYNC_DATALOG_NOTIFY2
without that, the following errors occur during sync:

ERROR: AWS4 completion for operation: 0, NOT IMPLEMENTED
op->ERRORHANDLER: err_no=-2201 new_err_no=-2201

Signed-off-by: Yuval Lifshitz <ylifshit@redhat.com>
2022-05-27 15:47:33 -04:00
J. Eric Ivancich
e0eb3d8c22 rgw: radosgw-admin bucket stats on indexless bucket crashes
The new bucket layout code didn't check whether the bucket is
indexless prior to asking for the last entry in the layout log. The
layout log appears to be empty for an indexless bucket, thereby
putting the runtime in an undefined state that later may cause a
failed assertion.

This commit adds two safety checks and returns -EINVAL along with
putting useful information on stderr when either stats are requested
on an indexless bucket or when the layout log is empty.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
2022-05-27 15:47:33 -04:00
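The two safety checks described in e0eb3d8c22 might look something like the sketch below. `IndexType`, `BucketInfo`, `check_bucket_stats_allowed`, and the message wording are all assumptions made for illustration; only the -EINVAL return and the stderr reporting come from the commit message.

```cpp
#include <cerrno>
#include <cstdint>
#include <iostream>
#include <vector>

// Hypothetical, simplified view of the bucket info.
enum class IndexType { Normal, Indexless };

struct BucketInfo {
  IndexType index_type;
  std::vector<uint64_t> layout_logs;  // generations in the layout log
};

// Return 0 if stats can be gathered, or -EINVAL (after printing a
// message to stderr) when the bucket is indexless or its layout log
// is empty.
int check_bucket_stats_allowed(const BucketInfo& info) {
  if (info.index_type == IndexType::Indexless) {
    std::cerr << "error: cannot get stats for an indexless bucket\n";
    return -EINVAL;
  }
  if (info.layout_logs.empty()) {
    std::cerr << "error: bucket layout log is empty\n";
    return -EINVAL;
  }
  return 0;
}
```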
Yuval Lifshitz
030ec8e44d rgw: fix reshard cancelling race condition
this happens when resharding while objects are being uploaded
test steps are here:
https://gist.github.com/yuvalif/060f66f03511bff881e952287df3087b

Signed-off-by: Yuval Lifshitz <ylifshit@redhat.com>
2022-05-27 15:47:33 -04:00
Casey Bodley
bd72bc9903 rgw: preserve 'bucket sync disable' over reshard
if bucket sync is disabled, apply that flag to new index objects on
bucket reshard

Signed-off-by: Casey Bodley <cbodley@redhat.com>
2022-05-27 15:47:33 -04:00
Casey Bodley
aaf4bc134f rgw/multisite: handle shard_progress correctly in RunBucketSources
we run bucket sync on each of the sync pipes, so size the vector
accordingly

Signed-off-by: Casey Bodley <cbodley@redhat.com>
2022-05-27 15:47:33 -04:00
Casey Bodley
ae27b46d45 Revert "rgw: cr: add prealloc_stack()"
This reverts commit 7970f35549.

Signed-off-by: Casey Bodley <cbodley@redhat.com>
2022-05-27 15:47:33 -04:00
Casey Bodley
dee6278722 Revert "rgw: bucket sync: track progress by stack id"
This reverts commit c0baf3eb34.

Signed-off-by: Casey Bodley <cbodley@redhat.com>

Conflicts:
	src/rgw/rgw_data_sync.cc no longer loops over num_shards
2022-05-27 15:47:33 -04:00
Casey Bodley
91fe8d464a rgw/multisite: RunBucketSourcesSync no longer takes optional target
RGWDataSyncSingleEntryCR is the only caller of RGWRunBucketSourcesSyncCR

it always provides a source_bs, and never provides a target_bs. so remove
all the complexity related to target_bs, and the idea that we'd need to
sync several source bucket shards related to the target bucket

we now just have the single loop over the target buckets that use the
given bucket as a source

Signed-off-by: Casey Bodley <cbodley@redhat.com>
2022-05-27 15:47:33 -04:00
Casey Bodley
a0e9a59246 radosgw-admin: allow reshard commands in multisite on secondary
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2022-05-27 15:47:33 -04:00
Casey Bodley
a50d6ad12d rgw: fix for uninitialized oldest_gen/latest_gen
when data sync queries RGWOp_BILog_Info from an un-upgraded gateway, it
doesn't include the oldest_gen/latest_gen fields. so initialize these
variables to 0 by default

Signed-off-by: Casey Bodley <cbodley@redhat.com>
2022-05-27 15:47:33 -04:00
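The fix in a50d6ad12d amounts to giving the generation fields a defined default so that a response lacking them still decodes to known values. A minimal sketch, assuming a hypothetical `BilogInfo` struct and a hypothetical map-based `decode_bilog_info` in place of the real response decoding:

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical response struct. The default member initializers give
// the fields a defined value of 0 even when the decoder never sets them.
struct BilogInfo {
  uint64_t oldest_gen = 0;
  uint64_t latest_gen = 0;
};

// Hypothetical decoder: copies only the fields present in the response,
// as would happen when an older gateway omits the generation fields.
BilogInfo decode_bilog_info(const std::map<std::string, uint64_t>& fields) {
  BilogInfo info;
  auto it = fields.find("oldest_gen");
  if (it != fields.end()) {
    info.oldest_gen = it->second;
  }
  it = fields.find("latest_gen");
  if (it != fields.end()) {
    info.latest_gen = it->second;
  }
  return info;
}
```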
Casey Bodley
60ecef0c8a rgw: enable RGWReshard thread on any zone that supports it
enable the background dynamic resharding thread based on
RGWSI_Zone::can_reshard(), which takes the zonegroup features into
account

Fixes: https://tracker.ceph.com/issues/52877

Signed-off-by: Casey Bodley <cbodley@redhat.com>
2022-05-27 15:47:33 -04:00
Casey Bodley
9b6a1a8e0b rgw: prevent reshard from creating too many log generations
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2022-05-27 15:47:33 -04:00
Shilpa Jagannath
6457a66b7a rgw: remove per-shard sync status object after incremental sync finishes
Signed-off-by: Shilpa Jagannath <smanjara@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2022-05-27 15:47:33 -04:00
Casey Bodley
58b026965c radosgw-admin: bucket sync status guards against shard count mismatch
if the remote gives us more shards than we expect, just count those
shards as 'behind' and avoid out-of-bounds access of shard_status

Signed-off-by: Casey Bodley <cbodley@redhat.com>
2022-05-27 15:47:33 -04:00
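The bounds guard described in 58b026965c can be illustrated with a small sketch. `count_shards_behind` and its string-compared markers are assumptions for the example, not the radosgw-admin implementation; what it shows is treating any remote shard beyond the local vector as 'behind' instead of indexing out of bounds.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Count shards that are behind the remote. local_markers holds the
// position synced so far per shard, remote_markers the position the
// remote reports. Comparing markers as strings is a simplification.
std::size_t count_shards_behind(const std::vector<std::string>& local_markers,
                                const std::vector<std::string>& remote_markers) {
  std::size_t behind = 0;
  for (std::size_t i = 0; i < remote_markers.size(); ++i) {
    if (i >= local_markers.size()) {
      // The remote has more shards than expected: count the extra
      // shard as behind rather than reading past the end.
      ++behind;
      continue;
    }
    if (local_markers[i] < remote_markers[i]) {
      ++behind;
    }
  }
  return behind;
}
```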
Casey Bodley
99fe9eb7b7 radosgw-admin: bucket sync status handles missing full status
if the full sync status object is missing, it's possible that we just
haven't started syncing it again after upgrading from just the per-shard
status objects

in this case, as long as we have a log generation 0, assume that we just
haven't initialized the full status object and try to read the gen=0
per-shard incremental status for comparison

Signed-off-by: Casey Bodley <cbodley@redhat.com>
2022-05-27 15:47:33 -04:00
Casey Bodley
0ac4d7bb28 rgw: rgw_read_bucket_inc_sync_status doesn't need bucket info
all we need to construct the per-shard bucket sync status object names
are the bucket names themselves, which we already have from
rgw_sync_bucket_pipe

Signed-off-by: Casey Bodley <cbodley@redhat.com>
2022-05-27 15:47:33 -04:00
Casey Bodley
c2f707786f rgw: resize status vector before reading inc_sync_status
rgw_read_bucket_inc_sync_status() uses the size of this vector as the
'num_shards', so we need to resize it appropriately beforehand

Signed-off-by: Casey Bodley <cbodley@redhat.com>
2022-05-27 15:47:33 -04:00
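The ordering issue in c2f707786f can be shown with a toy reader that, like the function described, takes its shard count from the size of the vector it is handed. `ShardStatus`, `read_inc_sync_status`, and `read_all_shards` here are hypothetical simplifications, not the rgw signatures.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical per-shard status record.
struct ShardStatus {
  bool loaded = false;
};

// Hypothetical reader: it visits exactly status->size() shards, so an
// unsized (empty) vector means nothing is read at all.
void read_inc_sync_status(std::vector<ShardStatus>* status) {
  for (auto& s : *status) {
    s.loaded = true;
  }
}

// Size the vector to the expected shard count before reading.
std::vector<ShardStatus> read_all_shards(std::size_t num_shards) {
  std::vector<ShardStatus> status;
  status.resize(num_shards);
  read_inc_sync_status(&status);
  return status;
}
```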
Casey Bodley
6dbfe10306 rgw: RGWOp_BILog_Status reads full status unconditionally
the calls to rgw_read_bucket_inc_sync_status() depend on
sync_status.incremental_gen, which we need to read via
rgw_read_bucket_full_sync_status() regardless of whether
we're returning it to the client (version > 1)

Signed-off-by: Casey Bodley <cbodley@redhat.com>
2022-05-27 15:47:33 -04:00
Adam C. Emerson
d50acd9296 rgw: RGWCollectBucketSyncStatusCR doesn't need the shard count
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
2022-05-27 15:47:33 -04:00
Adam C. Emerson
b461c0d5b3 rgw: RunBucketSourceSync uses num_shards from remote bilog info
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
2022-05-27 15:47:33 -04:00
Adam C. Emerson
718234f994 rgw: RGWListBucketIndexesCR only needs zero shard
We only need to check one shard, and everything has shard zero.

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
2022-05-27 15:47:33 -04:00
Adam C. Emerson
3877c35068 rgw: sync checkpoint gets num_shards from remote bilog info
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
2022-05-27 15:47:33 -04:00
Adam C. Emerson
c2fd1bda79 rgw: RGWRemoteBucketManager constructor takes num_shards
The logic for getting it was moved to its caller.

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
2022-05-27 15:47:33 -04:00
Adam C. Emerson
c4dc861263 rgw: InitBucketFullSyncStatusCR gets num shards from remote
As specified in rgw_bucket_index_marker_info, unless we're doing the
compatibility check, in which case we look at generation 0.

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
2022-05-27 15:47:33 -04:00
Shilpa Jagannath
df730b5d34 rgw: read shard count using remote bilog info during bucket sync
Signed-off-by: Shilpa Jagannath <smanjara@redhat.com>
2022-05-27 15:47:33 -04:00