Apparently we weren't setting header_changed to true when handling
the CEPH_RGW_UPDATE case and cur_disk.exists was false. In practice
this meant that when an object was created but the index complete
call failed (or timed out), calling rgw_dir_suggest_changes() fixed
the entry; however, we didn't account for the new entry. This would
lead to negative stats on the bucket index.
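Roughly, the intended behavior is the following (a minimal sketch with
simplified stand-in types; dir_entry, dir_stats, and suggest_update are
illustrative, not the actual cls_rgw structures):

    #include <cstdint>

    struct dir_entry { bool exists = false; uint64_t size = 0; };
    struct dir_stats { int64_t num_entries = 0; int64_t total_size = 0; };

    // Apply a suggested update for an entry whose "complete" call never
    // reached the index. Since the on-disk entry does not exist yet, the
    // header stats must be updated and flagged for rewrite.
    void suggest_update(const dir_entry& cur_change, dir_entry& cur_disk,
                        dir_stats& stats, bool& header_changed)
    {
      if (!cur_disk.exists) {
        stats.num_entries++;
        stats.total_size += cur_change.size;
        header_changed = true;   // previously left false in this branch
      }
      cur_disk = cur_change;
      cur_disk.exists = true;
    }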
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Reorder the snapdir logic and ctx->at_version adjustments so that they
come before filling in the object_info_t, user_versions, and the rest.
Adjust at_version after appending each log entry (so that it points to
the next position/version we will write at, culminating in the actual
user event).
The user log entry contains the request id, which will be used
by replay ops to put themselves in the correct place in the
waiting_for_commit/ack maps. The repop therefore needs to be
tagged with the same version as the log entry carrying the
request id, which means that entry must be the last one in the
log entry vector.
This should fix #3072, wherein a replay that should wait on the
repop tagged as version '36 would instead wait on '35.
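As a sketch of the ordering (simplified, hypothetical types and names;
not the actual OSD code), the request-id-bearing entry is appended
last and the repop is tagged with its version before at_version is
advanced:

    #include <cassert>
    #include <string>
    #include <vector>

    struct eversion { unsigned epoch = 0; unsigned version = 0; };
    struct log_entry { eversion ver; std::string reqid; };

    // Append the user-visible log entry last, tag the repop with its
    // version, then advance at_version to the next position to write at.
    eversion append_user_entry(std::vector<log_entry>& log,
                               eversion& at_version,
                               const std::string& reqid)
    {
      log.push_back({at_version, reqid});   // request id entry goes last
      eversion repop_version = at_version;  // replays wait on this version
      at_version.version++;                 // next position/version
      assert(log.back().reqid == reqid);
      return repop_version;
    }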
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
Instead of just keeping flat usage info per bucket, we now
maintain a list of categories into which request usage is
aggregated. Ops are assigned to categories based on their
names.
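A rough sketch of the aggregation idea (the type and field names below
are simplified assumptions, not the actual rgw usage structures):

    #include <cstdint>
    #include <map>
    #include <string>

    struct usage_info {
      uint64_t ops = 0;
      uint64_t successful_ops = 0;
      uint64_t bytes_sent = 0;
      uint64_t bytes_received = 0;
    };

    // Per-bucket usage, keyed by category; the category is derived from
    // the op name (e.g. "get_obj", "put_obj", "delete_obj").
    using bucket_usage = std::map<std::string, usage_info>;

    void account_op(bucket_usage& usage, const std::string& op_name,
                    bool success, uint64_t sent, uint64_t received)
    {
      usage_info& info = usage[op_name];
      info.ops++;
      if (success)
        info.successful_ops++;
      info.bytes_sent += sent;
      info.bytes_received += received;
    }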
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Otherwise, a pg_temp from a previous peering sequence
(but not a different peering_interval) might leak through
into Active and incorrectly trip the
Active::react(AdvMap&) asserts regarding want_acting.
Those asserts assume that want_acting is either empty or is
the result of recovery completion. In the latter case, the
want_acting set must consist only of elements of up and
acting.
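The invariant those asserts express can be sketched roughly as follows
(simplified standalone code, not the actual PG state machine):

    #include <algorithm>
    #include <cassert>
    #include <vector>

    // want_acting is either empty, or (after recovery completion) every
    // member of it must appear in up or acting.
    void assert_want_acting_sane(const std::vector<int>& want_acting,
                                 const std::vector<int>& up,
                                 const std::vector<int>& acting)
    {
      for (int osd : want_acting) {
        bool in_up = std::find(up.begin(), up.end(), osd) != up.end();
        bool in_acting =
          std::find(acting.begin(), acting.end(), osd) != acting.end();
        assert(in_up || in_acting);
      }
    }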
Signed-off-by: Samuel Just <sam.just@inktank.com>
Deep scrub reads the contents of every file from the store and computes
a crc32 digest. The primary compares the digests of all replicas and
marks the PG inconsistent if any of them don't match.
OSDs that do not support deep scrub simply perform an ordinary chunky
scrub. Any subset of OSDs that do support deep scrub will have their
digests compared.
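A hedged sketch of the comparison step (simplified types; the real
scrub maps carry more than a per-object digest):

    #include <cstdint>
    #include <map>
    #include <string>
    #include <vector>

    // object name -> crc32 of its contents, as reported by one OSD
    using scrub_digests = std::map<std::string, uint32_t>;

    // The primary checks its own digests against every deep-scrub-capable
    // replica; any missing object or mismatched digest makes the PG
    // inconsistent.
    bool digests_consistent(const scrub_digests& primary,
                            const std::vector<scrub_digests>& replicas)
    {
      for (const auto& [name, digest] : primary) {
        for (const scrub_digests& rep : replicas) {
          auto it = rep.find(name);
          if (it == rep.end() || it->second != digest)
            return false;
        }
      }
      return true;
    }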
Signed-off-by: Mike Ryan <mike.ryan@inktank.com>
Chunky scrub is a more efficient form of scrub: it blocks writes only on
a subset of objects at a time and scrubs those, allowing writes through
to the rest of the PG.
The scrub takes longer to complete than a classic scrub, but improves
overall write throughput.
This feature is backward-compatible with classic scrub. If the primary
detects that any replica does not have the chunky scrub feature, it
falls back to the less efficient classic scrub.
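Conceptually the chunked pass looks something like this (an illustrative
skeleton only; the range selection, locking, and replica messaging in
the real OSD are more involved):

    #include <string>
    #include <vector>

    struct object_range { std::string begin, end; };

    struct scrubber {
      // hold writes that fall inside the given range
      void block_writes(const object_range&) {}
      // read and compare the objects in the chunk
      void scrub(const object_range&) {}
      // requeue the writes that were held
      void unblock_writes(const object_range&) {}
    };

    // Only the current chunk has writes blocked; the rest of the PG keeps
    // serving I/O, which is why overall write throughput improves even
    // though the full scrub takes longer.
    void chunky_scrub(scrubber& s, const std::vector<object_range>& chunks)
    {
      for (const object_range& chunk : chunks) {
        s.block_writes(chunk);
        s.scrub(chunk);
        s.unblock_writes(chunk);
      }
    }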
Signed-off-by: Mike Ryan <mike.ryan@inktank.com>
The 'pool=default' in the default crush maps is confusing wrt rados pools.
'root' makes more sense given that we are talking about hierarchies/trees.
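For example, a decompiled default map would now declare the top-level
bucket with the 'root' type instead of 'pool' (illustrative snippet,
not an exact copy of the shipped map):

    # before
    pool default {
            id -1
            alg straw
            hash 0  # rjenkins1
            item host0 weight 1.000
    }

    # after
    root default {
            id -1
            alg straw
            hash 0  # rjenkins1
            item host0 weight 1.000
    }

Rules that 'step take default' keep working, since they reference the
bucket by name rather than by type.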
Signed-off-by: Sage Weil <sage@inktank.com>