Kefu Chai
1578875194
Merge pull request #24013 from dzafman/wip-35845
...
test: Use a grep pattern that works across releases
Reviewed-by: Kefu Chai <kchai@redhat.com>
2018-09-12 23:00:39 +08:00
Kefu Chai
510d9e1345
Merge pull request #23723 from xiexingguo/wip-list-missing
...
osd/PrimaryLogPG: rename list_missing -> list_unfound command
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2018-09-11 20:25:21 +08:00
David Zafman
dc80f8585a
test: Use a grep pattern that works across releases
...
Fixes: http://tracker.ceph.com/issues/35845
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-09-10 08:21:36 -07:00
Sage Weil
4fc02a7f48
osd/OSDMap: include age in up and in counts for ceph status
...
Signed-off-by: Sage Weil <sage@redhat.com>
2018-09-07 09:07:50 -05:00
xie xingguo
85ba2f0a82
osd/PrimaryLogPG: s/list_missing/list_unfound/
...
Also:
- Do not print **offset** until specified
- Count missing objects correctly (used to be primary's local missing)
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2018-09-06 09:52:20 +08:00
Sage Weil
2c26fb0fe1
rados: drop mkpool, rmpool commands
...
- mkpool and rmpool users should use the normal cli/mon commands
Signed-off-by: Sage Weil <sage@redhat.com>
2018-08-31 09:27:36 -05:00
David Zafman
687f63e599
test: Update tests for error message changes
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-08-23 11:09:22 -07:00
David Zafman
58c4d32203
test: Verify cluster logging of scrub error messages
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-08-23 11:09:22 -07:00
David Zafman
67d9e44de6
test: Add test for repair of bad object info data_digest on all copies
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-07-26 07:50:23 -07:00
David Zafman
ebb05b2542
test: When possible show side-by-side diff in addition to regular diff
...
Fixes: https://tracker.ceph.com/issues/21664
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-06-26 18:23:07 -07:00
David Zafman
fe09fc5e9d
test: Fail immediately if some operations fail
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-06-18 14:09:14 -07:00
David Zafman
39fc43556f
test: Put files in private test directory
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-06-18 14:08:23 -07:00
David Zafman
c1e96ae7cb
test: Use a file that should be on all OSes
...
Also, create temporary files in a test-specific dir and remove them
Caused by: 154330fd68
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-06-05 11:27:12 -07:00
Sage Weil
154330fd68
osd/PrimaryLogPG: fix on_local_recover crash on stray clone
...
If there is a stray clone (one that does not appear in the SnapSet) and
we do any sort of recovery on it, the OSD will crash. Log an error instead
but continue.
This addresses a problem where a cluster has both (1) an unexpected clone
and (2) the clone is not present on all replicas. Doing repair on that
PG will not fix the unexpected clone and will also cause the remaining
OSDs to crash trying to recover it.
Include a test.
Fixes: https://tracker.ceph.com/issues/24396
Signed-off-by: Sage Weil <sage@redhat.com>
2018-06-05 11:09:01 -05:00
David Zafman
843598b69b
Revert "qa/standalone/scrub/osd-scrub-repair.sh: drop omap_digest flag"
...
This reverts commit 886606bfd7.
Signed-off-by: David Zafman <dzafman@redhat.com>
Conflicts:
qa/standalone/scrub/osd-scrub-repair.sh (manually made equivalent changes)
2018-05-31 12:01:53 -07:00
David Zafman
1a7fa9a62a
test: Add test cases for multiple copy pool and snapshot errors
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-04-28 16:42:19 -07:00
David Zafman
2fa596dc0c
test: Prepare for second test and minor improvements
...
Check list-inconsistent-obj output
Check how many _scan_snap groupings
Use more general check for crashed osd(s)
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-04-28 16:42:19 -07:00
David Zafman
bae4940574
test: Fix comment at end of scrub test scripts
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-04-28 16:42:19 -07:00
David Zafman
458babe7ee
test: Use jq in a compatible way and for easier diff analysis
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-04-16 08:11:24 -07:00
David Zafman
22ddc6da5f
osd: Change shard digests to hex like object info digests
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-04-12 07:59:21 -07:00
David Zafman
9c5ef19f93
test: Be smarter about when jsonschema can be used
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-04-10 13:52:10 -07:00
David Zafman
60ae2b8eb3
osd rados command: Show snapset in list-inconsistent-snapset
...
Add SnapSet bufferlist to inconsistent_snapset_t
Partial fix for http://tracker.ceph.com/issues/23428
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-04-10 13:51:48 -07:00
David Zafman
1b1d45bf51
test: Add getjson variable to save output
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-04-10 13:26:08 -07:00
David Zafman
007cb45fe5
osd rados command: Change error name snapset_mismatch to snapset_error
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-04-10 13:26:08 -07:00
David Zafman
0c7ac9db3b
test: Clean-up test and use local values for number of objects and osds
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-04-10 13:26:08 -07:00
David Zafman
982509514c
osd rados command: list-inconsistent-obj attribute improvements
...
System attributes shown as "object_info", "snapset" and "hashinfo"
Only output user attributes as "attrs"
Drop leading underscore "_" for user attribute keys
Improve logic as to when to show user attributes or specific system attributes
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-04-10 13:26:08 -07:00
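
The attribute handling described in 982509514c can be illustrated with a rough sketch. It is not the OSD code. It assumes the raw xattr keys are "_" for object info and "snapset" for the SnapSet, takes "hinfo_key" from the HashInfo commit in this log, and assumes user attributes are stored with a leading "_". The function name and the returned shape are hypothetical.

```python
# Hypothetical mapping from raw xattr keys to the names shown in the output.
# The keys "_" and "snapset" are assumptions; "hinfo_key" appears in this log.
SYSTEM_ATTRS = {
    "_": "object_info",
    "snapset": "snapset",
    "hinfo_key": "hashinfo",
}


def split_attrs(raw_attrs):
    """Split raw xattrs into (system, user) dictionaries.

    System attributes are reported under their descriptive names.
    User attributes lose their leading underscore.
    Keys that fit neither rule are ignored in this sketch.
    """
    system, user = {}, {}
    for key, value in raw_attrs.items():
        if key in SYSTEM_ATTRS:
            system[SYSTEM_ATTRS[key]] = value
        elif key.startswith("_"):
            user[key[1:]] = value
    return system, user


system, user = split_attrs({"_": "oi-bytes", "snapset": "ss-bytes", "_key1": "val1"})
```

Here `system` holds the "object_info" and "snapset" entries and `user` holds only "key1", which matches the commit's intent that "attrs" lists user attributes alone.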
David Zafman
01687b052f
osd rados command: Change "oi" to "info" in scrub handling errors
...
data_digest_mismatch_oi -> data_digest_mismatch_info
omap_digest_mismatch_oi -> omap_digest_mismatch_info
size_mismatch_oi -> size_mismatch_info
obj_size_oi_mismatch -> obj_size_info_mismatch
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-04-10 13:26:08 -07:00
David Zafman
273f6213ea
osd rados command: Change "oi_attr" to "info" in scrub handling errors
...
oi_attr_missing -> info_missing
oi_attr_corrupted -> info_corrupted
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-04-10 13:26:08 -07:00
David Zafman
bec67e3d40
osd rados command: Rename ss_attr_missing/ss_attr_corrupted to snapset_missing/snapset_corrupted
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-04-10 13:26:08 -07:00
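
Scripts that parse scrub error names from before these renames may need a translation step. The sketch below collects the old and new names exactly as they appear in the commit messages in this log. The helper name `normalize_errors` is hypothetical and not part of Ceph.

```python
# Old -> new scrub error names, copied from the rename commits in this log.
RENAMED_ERRORS = {
    "data_digest_mismatch_oi": "data_digest_mismatch_info",
    "omap_digest_mismatch_oi": "omap_digest_mismatch_info",
    "size_mismatch_oi": "size_mismatch_info",
    "obj_size_oi_mismatch": "obj_size_info_mismatch",
    "oi_attr_missing": "info_missing",
    "oi_attr_corrupted": "info_corrupted",
    "ss_attr_missing": "snapset_missing",
    "ss_attr_corrupted": "snapset_corrupted",
    "snapset_mismatch": "snapset_error",
}


def normalize_errors(errors):
    """Map pre-rename error names to their current names.

    Names that were not renamed pass through unchanged.
    """
    return [RENAMED_ERRORS.get(name, name) for name in errors]


updated = normalize_errors(["oi_attr_missing", "size_mismatch_oi"])
```

A lookup table keeps the translation in one place, so expected-output fixtures written against the old names can be compared with newer output without editing each fixture by hand.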
David Zafman
d713c7dad0
osd rados command: Improve scrub handling of HashInfo (hinfo_key xattr)
...
Fixes: http://tracker.ceph.com/issues/23364
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-04-10 13:26:08 -07:00
David Zafman
be815f9b2b
test: Remove check that masks differences (let diff fail)
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-04-10 13:26:08 -07:00
David Zafman
5cfb8241f4
osd: Fix stale scrub stats when a primary takes over
...
Fixes: http://tracker.ceph.com/issues/23267
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-04-03 12:51:06 -07:00
David Zafman
293ac9895f
test: Replace bc command with printf command
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-03-22 17:19:56 -07:00
David Zafman
fa5e75d046
test: Make code clearer by moving code out of loop
...
Caused by: 33e747724a
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-03-06 11:30:08 -08:00
David Zafman
33e747724a
osd: Add new snapset_inconsistency error check
...
Includes new test case
Caused by: 5f58301a1364e948834dabe503200dda07fc2790
This changed attr consistency checking to exclude system keys,
which required snapset to be handled just like object info.
Fixes: http://tracker.ceph.com/issues/22996
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-02-15 09:03:49 -08:00
David Zafman
aeba36a660
ceph-helpers.sh: Add flush_pg_stats() to wait_for_clean() to make it reliable
...
osd-scrub-repair.sh: Fixes for omap keys landing on different OSDs due to flush
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-01-14 18:17:23 -08:00
Igor Fedotov
1653bcca3e
qa/standalone/scrub/osd-scrub-repair.sh: remove extents flag from object_info_t
...
Signed-off-by: Igor Fedotov <ifedotov@suse.com>
2018-01-08 20:10:16 +03:00
xie xingguo
f82228c4af
osd/osd_type.cc: dump extents map object_info_t
...
which is good for bug hunting and diagnosing.
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-10-24 11:46:23 +08:00
Sage Weil
886606bfd7
qa/standalone/scrub/osd-scrub-repair.sh: drop omap_digest flag
...
This is no longer set if we are backed by bluestore, which we are by
default. See be078c8b7b131764caa28bc44452b8c5c2339623
Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-10-06 19:25:40 +08:00
xie xingguo
2470ab4aba
qa/standalone/scrub/osd-scrub-repair.sh: add extents flag into object_info_t
...
Introduced-by: https://github.com/ceph/ceph/pull/15199
Fixes: http://tracker.ceph.com/issues/21618
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-10-03 21:14:53 +08:00
Alan Somers
d1cbb90daa
scripts: fix bash path in shebangs (part 2)
...
/bin/bash is a Linuxism. Other operating systems install bash to
different paths. Use /usr/bin/env in shebangs to find bash.
Signed-off-by: Alan Somers <asomers@gmail.com>
2017-09-25 17:20:40 -06:00
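
The shebang change in d1cbb90daa can be sketched as a small rewrite helper. This is not the script used for the commit. The function name is hypothetical, and the choice to leave shebangs that carry options untouched is an assumption made for safety.

```python
def portable_shebang(script_text):
    """Rewrite a '#!/bin/bash' first line to '#!/usr/bin/env bash'.

    Only an exact '#!/bin/bash' line is rewritten. A shebang with options
    (for example '#!/bin/bash -x') is left alone, because on Linux the
    kernel passes everything after the interpreter as a single argument,
    so 'env' would be asked to run a program literally named 'bash -x'.
    """
    lines = script_text.split("\n")
    if lines and lines[0].rstrip() == "#!/bin/bash":
        lines[0] = "#!/usr/bin/env bash"
    return "\n".join(lines)


converted = portable_shebang("#!/bin/bash\necho hello\n")
```

With this change the first line of `converted` is `#!/usr/bin/env bash`, so bash is found through `PATH` rather than at a fixed location.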
Sage Weil
ec2bdbc44c
qa/standalone/scrub/osd-scrub-snaps: adjust test for lack of snapdir objects
...
The head_exists stuff is totally gone; those test failures go away.
Signed-off-by: Sage Weil <sage@redhat.com>
2017-09-22 17:49:19 -04:00
xie xingguo
afcb617dc9
osd/PrimaryLogPG: do not generate data digest for BlueStore by default
...
BlueStore enables CRC by default, so the data digest is redundant and
adds no benefit.
Turn this off by default, which is good for performance.
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-09-13 12:17:16 +08:00
Kefu Chai
30b5b4627c
Merge pull request #16494 from asomers/bin_bash
...
misc: Fix bash path in shebangs
Reviewed-by: Willem Jan Withagen <wjw@digiware.nl>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-08-27 10:14:14 +08:00
Sage Weil
84465bf5a5
qa/standalone/scrub/osd-scrub-repair: fix grep pattern
...
PGMap shows
ss << pg_sum.stats.sum.num_objects_unfound
<< "/" << pg_sum.stats.sum.num_objects << " objects unfound (" << b << "%)";
but we were grepping for "1/1 unfound" instead of "1/1 objects
unfound".
Introduced by fe81b7e3a5034ce855303f93f3e413f3f2dc74a8.
Fixes: http://tracker.ceph.com/issues/21127
Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-25 11:03:44 -04:00
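
The mismatch fixed in 84465bf5a5 can be reproduced with a short sketch. It rebuilds the status text in the shape quoted from PGMap and checks both search strings as fixed substrings, standing in for the test's grep. The percentage value and the helper name are assumptions made for illustration.

```python
def unfound_summary(unfound, total, percent):
    """Format the summary in the shape quoted from PGMap:
    '<unfound>/<total> objects unfound (<percent>%)'.
    """
    return f"{unfound}/{total} objects unfound ({percent}%)"


# The percent string is an illustrative value, not taken from real output.
line = unfound_summary(1, 1, "100.000")

old_pattern = "1/1 unfound"          # what the test used to search for
new_pattern = "1/1 objects unfound"  # what the status line contains

old_matches = old_pattern in line
new_matches = new_pattern in line
```

The old pattern fails because the word "objects" sits between the counts and "unfound", so `old_matches` is false while `new_matches` is true.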
Kefu Chai
85b63670d9
Merge pull request #17039 from dzafman/wip-18206
...
osd: Fixes for osd_scrub_during_recovery handling
Reviewed-by: Greg Farnum <gfarnum@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-08-22 22:50:24 +08:00
David Zafman
367c32c69a
osd: Fixes for osd_scrub_during_recovery handling
...
Fixes: http://tracker.ceph.com/issues/18206
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-21 17:08:14 -07:00
David Zafman
9f3d970a0d
tests: osd-scrub-snaps.sh minor cleanup
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-21 17:08:14 -07:00
David Zafman
4c949b6258
osd, rados: Adding ss_attr_missing and ss_attr_corrupt errors to list-inconsistent-obj
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-11 11:37:32 -07:00
David Zafman
5f58301a13
osd, rados: Improve size scrub error handling
...
Fixes: http://tracker.ceph.com/issues/20243
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-11 11:37:32 -07:00