Check list-inconsistent-obj output
Check how many _scan_snap groupings
Use more general check for crashed osd(s)
Signed-off-by: David Zafman <dzafman@redhat.com>
When multiple objects are in flight for the same ReadOp, swap() on the
map<hobject_t, read_request_t> would remove requests for all objects.
We just want to replace the requests for the single object we're
dealing with in send_all_remaining_reads().
This prevents a crash when looking up rop.to_read[hoid] after another
object in the same ReadOp gets an EIO and tries to send more requests.
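
The difference between swapping the whole map and replacing a single
entry can be illustrated with a minimal sketch. ObjectId, ReadRequest and
ReadOp here are hypothetical stand-ins for hobject_t, read_request_t and
the real ReadOp, and neither function is the actual Ceph code:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-ins for hobject_t and read_request_t.
using ObjectId = std::string;
struct ReadRequest {
  int shards_requested;
};

// Hypothetical, reduced ReadOp: one pending request per object.
struct ReadOp {
  std::map<ObjectId, ReadRequest> to_read;
};

// Problematic pattern: swapping in a map that only holds the current
// object's request drops the requests of every other object in the ReadOp.
inline void replace_all_by_swap(ReadOp& rop,
                                std::map<ObjectId, ReadRequest> fresh) {
  rop.to_read.swap(fresh);
}

// Safer pattern: replace only the entry for the object being handled,
// leaving the other in-flight objects untouched.
inline void replace_single(ReadOp& rop, const ObjectId& hoid,
                           ReadRequest req) {
  rop.to_read[hoid] = std::move(req);
}
```

With two objects "a" and "b" in flight, replace_single(rop, "a", ...)
leaves the entry for "b" in place, while replace_all_by_swap() with a map
holding only "a" removes it.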
Test this by using osd-recovery-max-single-start to bundle multiple
reads into one ReadOp. Save and restore CEPH_ARGS so custom settings
are reset for each test.
Fixes: http://tracker.ceph.com/issues/23195 (the 2nd crash there)
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
Discount shards that already returned EIO, and use minimum_to_decode()
to request just what is necessary to recover or read the originally
requested extents of the object.
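
As a rough sketch of the idea, assuming an MDS-style code where any k
usable shards can rebuild any other shard: the helper below is
hypothetical (it is not the erasure code plugin's minimum_to_decode()),
and only shows how shards that returned EIO are excluded before choosing
what to read.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <set>

// Hypothetical helper. Returns true and fills *minimum with the shards to
// read if the wanted shards can be read directly or rebuilt; returns false
// if too few usable shards remain.
inline bool minimum_shards_to_read(std::size_t k,
                                   const std::set<int>& want,
                                   const std::set<int>& present,
                                   const std::set<int>& errored,
                                   std::set<int>* minimum) {
  // Discount shards that already returned EIO.
  std::set<int> usable;
  std::set_difference(present.begin(), present.end(),
                      errored.begin(), errored.end(),
                      std::inserter(usable, usable.end()));
  minimum->clear();
  // If every wanted shard is still usable, read exactly those.
  if (std::includes(usable.begin(), usable.end(),
                    want.begin(), want.end())) {
    *minimum = want;
    return true;
  }
  // Otherwise k usable shards are needed to reconstruct (MDS assumption).
  if (usable.size() < k) {
    return false;
  }
  std::set<int>::const_iterator it = usable.begin();
  for (std::size_t i = 0; i < k; ++i, ++it) {
    minimum->insert(*it);
  }
  return true;
}
```

For k = 2 with shards 0 to 3 present, wanting shard 0 reads only shard 0;
once shard 0 has returned EIO, the sketch falls back to two of the
remaining shards.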
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
1235810c2ad08ccb7ef5946686eb2b85798f5bca allowed recovery to use
multiple passes of reads to handle EIO, but the end condition for
checking whether we finished reading requires the full data to be
decodable (this is what get_want_to_read_shards returns).
This is just a loss of efficiency normally, since when there is only
one object the subsequent read works, and grabs all the data
necessary. The crash comes from having multiple objects in the same
ReadOp - in this case the sequence of events is:
- start recovery of two objects (osd_recovery_max_single_start > 1)
- read object a shard 3
- read object b shard 3
- fail minimum_to_decode because shard 3 can't reconstruct all of object a
- re-read all of object a, marking more reads in progress
- fail minimum_to_decode because shard 3 can't reconstruct all of object b
- skip re-reading object b because there are now reads in progress
- finish reading k shards of object a
- still fail minimum_to_decode for object b, so no extra data was read
- send_all_remaining_reads tries to look up object b in ReadOp object
- crash dereferencing to_read[object b], since this was cleared after handling the original object b read reply
This patch fixes the immediate inefficiency and crash by only checking
for the missing shards that were requested, rather than the entire
object, for recovery reads.
Fixes: http://tracker.ceph.com/issues/23195 (first crash)
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
System attributes shown as "object_info", "snapset" and "hashinfo"
Only output user attributes as "attrs"
Drop leading underscore "_" for user attribute keys
Improve logic as to when to show user attributes or specific system attributes
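
The split between system and user attributes could look roughly like the
sketch below. The raw key names ("_", "snapset", "hinfo_key") are
assumptions about how the attributes are stored, and classify_attrs() is
a hypothetical helper, not the code in this change:

```cpp
#include <cassert>
#include <map>
#include <string>

struct ClassifiedAttrs {
  std::map<std::string, std::string> system;  // keyed by display name
  std::map<std::string, std::string> user;    // keyed without the leading "_"
};

// Hypothetical helper: sorts raw xattrs into system and user attributes.
inline ClassifiedAttrs classify_attrs(
    const std::map<std::string, std::string>& raw) {
  ClassifiedAttrs out;
  for (const auto& kv : raw) {
    const std::string& key = kv.first;
    if (key == "_") {                       // assumed object_info key
      out.system["object_info"] = kv.second;
    } else if (key == "snapset") {          // assumed snapset key
      out.system["snapset"] = kv.second;
    } else if (key == "hinfo_key") {        // assumed hashinfo key
      out.system["hashinfo"] = kv.second;
    } else if (key.size() > 1 && key[0] == '_') {
      // User attribute: drop the leading underscore for display.
      out.user[key.substr(1)] = kv.second;
    }
    // Any other key is ignored in this sketch.
  }
  return out;
}
```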
Signed-off-by: David Zafman <dzafman@redhat.com>
Keep a standalone wrapper for the workunit, so we can test it locally,
leveraging the ceph-helpers to do the setup. Keep a workunit to be
exercised by teuthology.
Signed-off-by: Joao Eduardo Luis <joao@suse.de>
Consolidate check() code and common script code
TEST_recovery_multi() wasn't reliable due to delayed peer_missing
Signed-off-by: David Zafman <dzafman@redhat.com>
This prevents the fix for http://tracker.ceph.com/issues/22050 or
potential future bugs from causing too much latency by trimming too
many log entries at once.
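
A minimal sketch of bounded trimming, assuming the log can be modelled as
a queue of increasing version numbers. trim_bounded() and max_per_call
are hypothetical names, not the option or function added here:

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical helper: removes entries with version <= trim_to from the
// front of the log, but never more than max_per_call in one invocation.
// Returns the number of entries removed.
inline std::size_t trim_bounded(std::deque<unsigned>& log,
                                unsigned trim_to,
                                std::size_t max_per_call) {
  std::size_t trimmed = 0;
  while (trimmed < max_per_call && !log.empty() &&
         log.front() <= trim_to) {
    log.pop_front();
    ++trimmed;
  }
  return trimmed;
}
```

Repeated calls make progress toward trim_to in small steps, so a large
backlog is worked off over several operations instead of in one long
pause.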
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
Regular updates piggyback some osd state for this purpose with
MOSDRepOp[Reply]. Do the same thing for pure log entry updates (write
errors and lost/revert additions) via MOSDPGUpdateLogMissing[Reply].
Fixes: http://tracker.ceph.com/issues/22050
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
Add test script that verifies the command in qa/standalone/osd
Fixes: http://tracker.ceph.com/issues/23242
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
Signed-off-by: David Zafman <dzafman@redhat.com>