Also:
- Do not print **offset** until it is specified
- Count missing objects correctly (this used to report the primary's local missing count)
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
In EC recovery reads, if reading the object's attrs failed or returned errors, erase the
attrs we have read so far and retry the read from the remaining shards. This lets the
primary OSD obtain the object's attrs correctly and avoids an assert.
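A minimal sketch of the retry idea, using hypothetical stand-in types and names
(AttrReadState, retry_attr_read) rather than the actual ECBackend code:

    #include <map>
    #include <set>
    #include <string>
    #include <vector>

    // Hypothetical stand-ins for the real Ceph types.
    using shard_id_t = int;
    using attr_map_t = std::map<std::string, std::vector<char>>;

    struct AttrReadState {
      attr_map_t attrs;                 // attrs collected so far
      std::set<shard_id_t> tried;       // shards we already asked
      std::set<shard_id_t> available;   // shards we could still ask
    };

    // Called when a shard's attr read fails. Instead of keeping possibly
    // bogus attrs (which would later trip an assert on the primary), drop
    // what we collected and pick an untried shard to re-read from.
    bool retry_attr_read(AttrReadState &st,
                         shard_id_t failed_shard,
                         shard_id_t *next_shard) {
      st.attrs.clear();                 // erase the attrs we have read
      st.tried.insert(failed_shard);
      for (shard_id_t s : st.available) {
        if (!st.tried.count(s)) {
          *next_shard = s;              // re-issue the attr read against this shard
          return true;
        }
      }
      return false;                     // no shards left; surface the error
    }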
Signed-off-by: xiaofei cui <cuixiaofei@sangfor.com>
When multiple objects are in flight for the same ReadOp, swap() on the
map<hobject_t, read_request_t> would remove requests for all objects.
We just want to replace the requests for the single object we're
dealing with in send_all_remaining_reads().
This prevents a crash when looking up rop.to_read[hoid] after another
object in the same ReadOp gets an EIO and tries to send more requests.
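A self-contained sketch of the difference, with hobject_t and read_request_t
replaced by simplified stand-ins (not the real ECBackend structures):

    #include <cassert>
    #include <map>
    #include <string>

    // Simplified stand-ins: hobject_t -> std::string, read_request_t -> int.
    using hobject_t = std::string;
    using read_request_t = int;

    int main() {
      std::map<hobject_t, read_request_t> to_read = {{"a", 1}, {"b", 2}};

      // Buggy pattern: swapping in a map that only describes object "a"
      // silently drops the pending request for object "b".
      std::map<hobject_t, read_request_t> retry_a = {{"a", 3}};
      auto buggy = to_read;
      buggy.swap(retry_a);
      assert(buggy.count("b") == 0);    // "b" is gone -> later lookup crashes

      // Fixed pattern: replace only the entry for the object being retried.
      to_read["a"] = 3;
      assert(to_read.count("b") == 1);  // "b" is still pending
      return 0;
    }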
Test this by using osd-recovery-max-single-start to bundle multiple
reads into one ReadOp. Save and restore CEPH_ARGS so custom settings
are reset for each test.
Fixes: http://tracker.ceph.com/issues/23195 (the 2nd crash there)
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
Discount shards that already returned EIO, and use minimum_to_decode()
to request just what is necessary to recover or read the originally
requested extents of the object.
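A rough sketch of the shard selection, using a hypothetical helper
(minimal_shards_to_read) and assuming a simple (k, m) code where any k shards
suffice; the real backend delegates this decision to the plugin via
minimum_to_decode():

    #include <set>
    #include <stdexcept>

    using shard_id_t = int;

    std::set<shard_id_t>
    minimal_shards_to_read(const std::set<shard_id_t> &want,
                           const std::set<shard_id_t> &all_shards,
                           const std::set<shard_id_t> &error_shards,
                           unsigned k) {
      std::set<shard_id_t> usable;
      for (shard_id_t s : all_shards)
        if (!error_shards.count(s))      // discount shards that already hit EIO
          usable.insert(s);
      if (usable.size() < k)
        throw std::runtime_error("object is unrecoverable");

      std::set<shard_id_t> to_read;
      for (shard_id_t s : want)          // prefer shards holding the wanted extents
        if (usable.count(s) && to_read.size() < k)
          to_read.insert(s);
      for (shard_id_t s : usable)        // top up to k with any other usable shard
        if (to_read.size() < k)
          to_read.insert(s);
      return to_read;
    }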
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
1235810c2a allowed recovery to use
multiple passes of reads to handle EIO, but the end condition for
checking whether we finished reading requires the full data to be
decodable (this is what get_want_to_read_shards returns).
Normally this is just a loss of efficiency, since with only one object
the subsequent read works and grabs all the necessary data. The crash
comes from having multiple objects in the same ReadOp; in that case
the sequence of events is:
- start recovery of two objects (osd_recovery_max_single_start > 1)
- read object a shard 3
- read object b shard 3
- fail minimum_to_decode because shard 3 can't reconstruct all of object a
- re-read all of object a, marking more reads in progress
- fail minimum_to_decode because shard 3 can't reconstruct all of object b
- skip re-reading object because there are now reads in progress
- finish reading k shards of object a
- still fail minimum_to_decode for object b, so no extra data was read
- send_all_remaining_reads tries to lookup object b in ReadOp object
- crash dereferencing to_read[object b], since this was cleared after handling the original object b read reply
This patch fixes the immediate inefficiency and the crash: for recovery
reads, the end condition now checks only the missing shards that were
requested, rather than the entire object.
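A simplified sketch of the changed end condition; the names here (have_all,
needed, have) are illustrative stand-ins, not actual ECBackend members:

    #include <algorithm>
    #include <set>

    using shard_id_t = int;

    // Simplified end-condition check: the read is done when every shard
    // in `needed` has arrived in `have`.
    bool have_all(const std::set<shard_id_t> &needed,
                  const std::set<shard_id_t> &have) {
      return std::includes(have.begin(), have.end(),
                           needed.begin(), needed.end());
    }

    // Before the fix (sketch): `needed` was the full set required to decode
    // the whole object (what get_want_to_read_shards computes), so a recovery
    // read that only asked for the missing shards could never look complete.
    // After the fix (sketch): `needed` is just the set of shards this recovery
    // read actually requested, which minimum_to_decode() already guaranteed is
    // sufficient for the requested extents.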
Fixes: http://tracker.ceph.com/issues/23195 (first crash)
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
Some tests, like osd-backfill-stats.sh, use delete_pool() but do not
define this function themselves; it is currently defined separately in
individual standalone tests, so it is simpler to consolidate it in
ceph-helpers.sh.
Signed-off-by: Kefu Chai <kchai@redhat.com>
wait_for_clean() can miss the new pool if it races with pool create.
Fixes: http://tracker.ceph.com/issues/20465
Signed-off-by: David Zafman <dzafman@redhat.com>
/bin/bash is a Linuxism. Other operating systems install bash to
different paths. Use /usr/bin/env in shebangs to find bash.
Signed-off-by: Alan Somers <asomers@gmail.com>
- stop running via make check
- add teuthology yamls to run them
- disable ceph_objectstore_tool.py for now (too slow for make check, and
we can't use vstart in teuthology via a package install)
- drop cephtool tests since those are already covered by other teuthology
tests
- leave a handful of (fast!) ceph-helpers tests in make check as minimal
integration tests.
Signed-off-by: Sage Weil <sage@redhat.com>