Commit Graph

21 Commits

Author SHA1 Message Date
huanwen ren
f1219d716d qa/osd: fixup osd-rep-recov-eio.sh fails to parse pg dump
Fixes: http://tracker.ceph.com/issues/36418
Signed-off-by: huanwen ren <ren.huanwen@zte.com.cn>
2018-10-16 02:18:22 +08:00
xie xingguo
85ba2f0a82 osd/PrimaryLogPG: s/list_missing/list_unfound/
Also:
- Do not print **offset** until specified
- Count missing objects correctly (used to be primary's local missing)

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2018-09-06 09:52:20 +08:00
cuixf
3eb1679b1f osd: retry to read object attrs at EC recovery
In EC recovery read, if the object's attrs read failed or with errors, we erase the attrs we have read and
try to read it again from left shards. This will make the primary osd get the object's attrs correct and
avoid assert.

Signed-off-by: xiaofei cui <cuixiaofei@sangfor.com>
2018-06-01 06:26:56 -04:00
Josh Durgin
d4808256d2 osd/ECBackend: preserve requests for other objects when sending extra reads
When multiple objects are in flight for the same ReadOp, swap() on the
map<hobject_t, read_request_t> would remove requests for all objects.

We just want to replace the requests for the single object we're
dealing with in send_all_remaining_reads().

This prevents crashing trying to look up rop.to_read[hoid] when another
object in the same ReadOp gets an EIO and tries to send more requests.

Test this by using osd-recovery-max-single-start to bundle multiple
reads into one ReadOp. Save and restore CEPH_ARGS so custom settings
are reset for each test.

Fixes: http://tracker.ceph.com/issues/23195 (the 2nd crash there)
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2018-04-20 19:42:15 -04:00
Josh Durgin
b162a5478d osd/ECBackend: recover from EIO based on the minimum data necessary
Discount shards that already returned EIO, and use minimum_to_decode()
to request just what is necessary to recover or read the originally
requested extents of the object.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2018-04-20 19:42:14 -04:00
Josh Durgin
468ad4b410 osd/ECBackend: only check required shards when finishing recovery reads
1235810c2a allowed recovery to use
multiple passes of reads to handle EIO, but the end condition for
checking whether we finished reading requires the full data to be
decodable (this is what get_want_to_read_shards returns).

This is just a loss of efficiency normally, since when there is only
one object the subsequent read works, and grabs all the data
necessary. The crash comes from having multiple objects in the same
ReadOp - in this case the sequence of events is:

- start recovery of two objects (osd_recovery_max_single_start > 1)
- read object a shard 3
- read object b shard 3
- fail minimum_to_decode because shard 3 can't reconstruct all of object a
- re-read all of object a, marking more reads in progress
- fail minimum_to_decode because shard 3 can't reconstruct all of object b
- skip re-reading object because there are now reads in progress
- finish reading k shards of object a
- still fail minimum_to_decode for object b, so no extra data was read
- send_all_remaining_reads tries to lookup object b in ReadOp object
- crash dereferencing to_read[object b], since this was cleared after handling the original object b read reply

This patch fixes the immediate inefficiency and crash by only checking
for the missing shards that were requested, rather than the entire
object, for recovery reads.

Fixes: http://tracker.ceph.com/issues/23195 (first crash)
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2018-04-20 19:42:14 -04:00
Kefu Chai
fc43ae1724 qa/standalone: s/delete_erasure_pool/delete_erasure_coded_pool/
it's a regression introduced by ac56a202

Signed-off-by: Kefu Chai <kchai@redhat.com>
2018-03-01 19:09:31 +08:00
Kefu Chai
ac56a202fd qa/standalone: extract delete_pool()
some tests, like osd-backfill-stats.sh are using delete_pool(), but
they don't have this function defined. and this function is defined
in standalone tests separately, so would be simpler if we can
consolidate them in ceph-helper.sh.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2018-02-28 15:40:28 +08:00
David Zafman
69b5fc54fe test: Cleanup test-erasure-eio.sh code
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-10-18 11:12:14 -07:00
David Zafman
bb2bcb95f5 osd: Add new UnfoundBackfill and UnfoundRecovery pg transitions
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-10-18 11:01:39 -07:00
David Zafman
b9de5eec26 test: Test case that reproduces tracker 18162
recover_replicas: object added to missing set for backfill, but is not in recovering, error!

Signed-off-by: David Zafman <dzafman@redhat.com>
2017-10-18 10:58:23 -07:00
David Zafman
1235810c2a osd: Allow recovery to send additional reads
For now it doesn't include non-acting OSDs
Added test for this case

Signed-off-by: David Zafman <dzafman@redhat.com>
2017-09-28 23:31:18 -07:00
David Zafman
f92aa6c824 test: Allow modified options to existing setup functions
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-09-28 23:31:18 -07:00
David Zafman
43e3206de2 test: Use feature to get last array element
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-09-28 23:31:18 -07:00
David Zafman
50e08b0a5d test: Add a removal test for erasure code read
Test feature: http://tracker.ceph.com/issues/14513

Signed-off-by: David Zafman <dzafman@redhat.com>
2017-09-13 13:15:52 -07:00
Kefu Chai
30b5b4627c Merge pull request #16494 from asomers/bin_bash
misc: Fix bash path in shebangs

Reviewed-by: Willem Jan Withagen <wjw@digiware.nl>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-08-27 10:14:14 +08:00
David Zafman
574b3cd3d4 qa: Add common generalized inject_eio() to ceph-helpers.sh
Retry for a while to allow pool to appear

Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-10 08:30:47 -07:00
David Zafman
99ad4bbd91 qa: Add create_pool() which sleeps 1 second like python variant
wait_for_clean() can miss the new pool if it races with pool create.

Fixes: http://tracker.ceph.com/issues/20465

Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-04 06:38:09 -07:00
Alan Somers
3aae5ca6fd scripts: fix bash path in shebangs
/bin/bash is a Linuxism.  Other operating systems install bash to
different paths.  Use /usr/bin/env in shebangs to find bash.

Signed-off-by: Alan Somers <asomers@gmail.com>
2017-07-27 13:24:26 -06:00
Sage Weil
cabad62242 qa/standalone/ceph-helpers: factor rbd pool create out of run_mon
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-24 22:11:50 -04:00
Sage Weil
71ea171604 qa: move ceph-helpers and misc src/test/*.sh tests to qa/standalone
- stop running via make check
- add teuthology yamls to run them
- disable ceph_objecstore_tool.py for now (too slow for make check, and
we can't use vstart in teuthology via a package install)
- drop cephtool tests since those are already covered by other teuthology
tests
- leave a handful of (fast!) ceph-helpers tests for make check for minimal
integration tests.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-24 22:11:49 -04:00