David Zafman
8a7e6c2349
Merge pull request #20220 from dzafman/wip-calc-stats3
...
osd: Improve recovery stat handling by using peer_missing and missing_loc info
Reviewed-by: Sage Weil <sage@redhat.com>
2018-03-14 11:07:44 -07:00
David Zafman
af85f3cc48
test: osd-backfill-stats.sh parallel osd-recovery-stats.sh check() changes
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-03-14 10:07:11 -07:00
David Zafman
acc1f80684
test: Use "(est)" in log message when an osd doesn't have peer_missing
...
Consolidate check() code and common script code
TEST_recovery_multi() wasn't reliable due to delayed peer_missing
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-03-14 10:07:11 -07:00
David Zafman
12e331b742
test: osd-recovery-stats.sh: New test with different missing objs on multiple OSDs
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-03-14 10:07:11 -07:00
David Zafman
09b5697ba2
test: Correction for better degraded/misplaced handling
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-03-14 10:07:11 -07:00
David Zafman
d7fd9174b9
osd: Fix for handling more than 1 missing target
...
Fix test case to test more than 1 target
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-03-14 10:07:03 -07:00
David Zafman
51b740ad41
test: Fail upon flush_pg_stats timeout
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-03-11 16:26:11 -07:00
Josh Durgin
1c15458a00
PrimaryLogPG: only trim up to osd_pg_log_trim_max entries at once
...
This prevents the fix for http://tracker.ceph.com/issues/22050 or
potential future bugs from causing too much latency by trimming too
many log entries at once.
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2018-03-09 19:14:28 -05:00
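A minimal sketch of the batching idea in the commit above, using made-up numbers. `trim_batch_size` and its arguments are hypothetical names chosen for illustration and are not taken from the Ceph sources; the cap only stands in for `osd_pg_log_trim_max`. Each pass trims no more than the cap, so a large backlog is worked off over several short passes rather than one long one.

```shell
#!/usr/bin/env bash
# Hypothetical helper: how many entries a single pass may trim.
# $1 = entries eligible for trimming
# $2 = per-pass cap (stand-in for osd_pg_log_trim_max)
trim_batch_size() {
    local eligible=$1 cap=$2
    echo $(( eligible < cap ? eligible : cap ))
}

# Work off an assumed backlog of 2500 entries with a cap of 1000 per pass.
backlog=2500
passes=0
while (( backlog > 0 )); do
    batch=$(trim_batch_size "$backlog" 1000)
    backlog=$(( backlog - batch ))
    passes=$(( passes + 1 ))
done
echo "passes: $passes"
```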
Josh Durgin
b50186bfe6
PG, PrimaryLogPG: trim log and rollback info for error log entries
...
Regular updates piggyback some osd state for this purpose with
MOSDRepOp[Reply]. Do the same thing for pure log entry updates (write
errors and lost/revert additions) via MOSDPGUpdateLogMissing[Reply].
Fixes: http://tracker.ceph.com/issues/22050
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2018-03-09 17:54:08 -05:00
Josh Durgin
2067f7c679
Merge pull request #20786 from dzafman/wip-zafman-log-trim
...
tools/ceph-objectstore-tool: command to trim the pg log
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2018-03-08 16:42:31 -08:00
Josh Durgin
b01e4ea5e2
tools: Add pg log trim command to ceph-objectstore-tool
...
Add test script that verifies the command in qa/standalone/osd
Fixes: http://tracker.ceph.com/issues/23242
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-03-08 15:58:55 -08:00
David Zafman
317b3d3b36
Merge pull request #20759 from dzafman/wip-cleanup
...
test: Make clearer by moving code out of loop
Reviewed-by: Greg Farnum <gfarnum@redhat.com>
2018-03-08 10:45:38 -08:00
Sage Weil
c9e974800f
qa: --no-mon-config for ceph-objectstore-tool --op mkfs ..
...
Signed-off-by: Sage Weil <sage@redhat.com>
2018-03-06 14:44:50 -06:00
Sage Weil
5ee5bbace1
qa/standalone: drop CEPH_LIB hacks
...
Signed-off-by: Sage Weil <sage@redhat.com>
2018-03-06 14:44:49 -06:00
David Zafman
fa5e75d046
test: Make code clearer by moving code out of loop
...
Caused by 33e747724a
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-03-06 11:30:08 -08:00
Kefu Chai
fc43ae1724
qa/standalone: s/delete_erasure_pool/delete_erasure_coded_pool/
...
it's a regression introduced by ac56a202
Signed-off-by: Kefu Chai <kchai@redhat.com>
2018-03-01 19:09:31 +08:00
Kefu Chai
ac56a202fd
qa/standalone: extract delete_pool()
...
Some tests, like osd-backfill-stats.sh, use delete_pool() without
defining it, while other standalone tests each define their own copy
of this function. It is simpler to consolidate the copies in
ceph-helpers.sh.
Signed-off-by: Kefu Chai <kchai@redhat.com>
2018-02-28 15:40:28 +08:00
Josh Durgin
d1ca620698
mon/OSDMonitor: fix min_size default for replicated pools
...
This was accidentally changed to 0 by using the config value
directly in 582e567c93
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2018-02-23 00:39:13 -05:00
David Zafman
33e747724a
osd: Add new snapset_inconsistency error check
...
Includes new test case
Caused by: 5f58301a13
This changed attr consistency checking to exclude system keys,
which required snapset to be handled just like object info.
Fixes: http://tracker.ceph.com/issues/22996
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-02-15 09:03:49 -08:00
Patrick Donnelly
46c25abd1c
test/encoding: refactor to avoid escaping shell magic
...
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2018-02-07 18:03:05 -08:00
Kefu Chai
4233cc02d4
Merge pull request #19651 from yanghonggang/master
...
mon/OSDMonitor.cc: fix expected_num_objects interpret error
Reviewed-by: Joao Eduardo Luis <joao@suse.de>
Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2018-01-26 14:34:11 +08:00
Yang Honggang
c24f2baec9
mon/OSDMonitor.cc: fix expected_num_objects interpret error
...
Fixes: http://tracker.ceph.com/issues/22530
Signed-off-by: Yang Honggang <joseph.yang@xtaotech.com>
2018-01-21 21:00:17 -05:00
David Zafman
7ccb7b7023
Merge pull request #19850 from dzafman/wip-calc-stats
...
osd/PG: re-write of _update_calc_stats and improve pg degraded state
Fixes: http://tracker.ceph.com/issues/20059
Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2018-01-16 11:58:49 -08:00
Kefu Chai
7aba57b9b4
Merge pull request #18191 from hjwsm1989/osd-mark-down
...
qa/standalone/osd/osd-mark-down: create pool to get updated osdmap faster
Reviewed-by: Kefu Chai <kchai@redhat.com>
2018-01-15 11:09:02 +08:00
David Zafman
88ce0c1a91
test: Verify stat calculations during backfill
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-01-14 18:17:23 -08:00
David Zafman
f5af1af6d3
test: Verify stat calculations during recovery
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-01-14 18:17:23 -08:00
David Zafman
aeba36a660
ceph-helpers.sh: Add flush_pg_stats() to wait_for_clean() to make it reliable
...
osd-scrub-repair.sh: Fixes for omap keys landing on different OSDs due to flush
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-01-14 18:17:23 -08:00
Igor Fedotov
1653bcca3e
qa/standalone/scrub/osd-scrub-repair.sh: remove extents flag from object_info_t
...
Signed-off-by: Igor Fedotov <ifedotov@suse.com>
2018-01-08 20:10:16 +03:00
Kefu Chai
e7097593a7
qa/standalone: remove osd-map-max-advance related tests
...
this setting was removed in 8967b73
Fixes: http://tracker.ceph.com/issues/22596
Signed-off-by: Kefu Chai <kchai@redhat.com>
2018-01-06 19:40:15 +08:00
Sage Weil
f33ab7e03a
Merge remote-tracking branch 'gh/mimic-dev1'
2017-12-20 15:08:30 -06:00
Sage Weil
06b7707cee
Merge pull request #19456 from liewegas/wip-22373
...
qa/standalone/ceph-helpers: pass --verbose to ceph-disk
2017-12-19 11:55:07 -06:00
Kefu Chai
2ceff9eb4e
qa/stanalone: pass options using --<option-name>=<value>
...
not `--<option-name> <value>`, otherwise `ceph-authtool` would error
out:
$ CEPH_ARGS='--osd-map-max-advance 1000' bin/ceph-authtool --gen-print-key
bin/ceph-authtool: unexpected '1000'
usage: ceph-authtool keyringfile [OPTIONS]...
....
but using the syntax of `--<option-name>=<value>`, it works:
$ CEPH_ARGS='--osd-map-max-advance=1000' bin/ceph-authtool --gen-print-key
AQBAhTNamf5+ABAASkAp/6IGq7LkUTEOMp/fgw==
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-12-15 16:19:15 +08:00
Kefu Chai
4e621762ed
qa/standalone/ceph-helpers.sh: silence ceph-disk DEPRECATION_WARNING
...
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-12-13 19:42:50 +08:00
Sage Weil
86dc162686
qa/standalone/ceph-helpers: pass --verbose to ceph-disk
...
Signed-off-by: Sage Weil <sage@redhat.com>
2017-12-12 12:56:45 -06:00
Sage Weil
4389b55435
Merge remote-tracking branch 'gh/mimic-dev1'
2017-12-11 22:27:35 -06:00
David Zafman
c4602c9ac8
test: ceph_objectstore_tool.py: Perform dump-import
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-12-08 18:50:04 -08:00
David Zafman
a8b8d541dd
ceph-objectstore-tool: Add option "dump-import" to examine an export
...
Fixes: http://tracker.ceph.com/issues/22086
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-12-06 17:30:47 -08:00
Sage Weil
c6529ad93e
qa/standalone/ceph-helpers.sh: fix full ratio ordering
...
Signed-off-by: Sage Weil <sage@redhat.com>
2017-11-29 16:07:12 -06:00
David Zafman
f94322066f
Merge pull request #18449 from dzafman/wip-zafman-misc
...
mark_unfound_lost fix and some other minor changes
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2017-10-27 10:21:25 -07:00
xie xingguo
f82228c4af
osd/osd_type.cc: dump extents map object_info_t
...
which is good for bug hunting and diagnosing.
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-10-24 11:46:23 +08:00
David Zafman
f918b1fac1
test: Remove bogus check in ceph_objectstore_tool.py
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-10-18 18:07:23 -07:00
David Zafman
69b5fc54fe
test: Cleanup test-erasure-eio.sh code
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-10-18 11:12:14 -07:00
David Zafman
c2572bee3c
test: Add replicated recovery/backfill test
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-10-18 11:12:14 -07:00
David Zafman
bb2bcb95f5
osd: Add new UnfoundBackfill and UnfoundRecovery pg transitions
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-10-18 11:01:39 -07:00
David Zafman
b9de5eec26
test: Test case that reproduces tracker 18162
...
recover_replicas: object added to missing set for backfill, but is not in recovering, error!
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-10-18 10:58:23 -07:00
huangjun
ee618a38a9
qa/standalone/osd/osd-mark-down: create pool to get updated osdmap faster
...
The mon sends the osdmap to random OSDs after we mark an OSD down, so the
down OSD may take longer than $sleep to get the updated osdmap if there is
no osd ping between OSDs. So create a pool after setting up the cluster.
Signed-off-by: huangjun <huangjun@xsky.com>
2017-10-09 22:19:29 +08:00
Sage Weil
96ddf5c3a0
Merge pull request #17708 from liewegas/wip-pg
...
osd: initial minimal efforts to clean up PG interface
2017-10-08 21:47:49 -05:00
Sage Weil
b6a5c09dba
ceph-objectstore-tool: remove rm-past-intervals op
...
The OSD doesn't rebuild this on demand anymore.
Signed-off-by: Sage Weil <sage@redhat.com>
2017-10-06 13:08:18 -05:00
Sage Weil
886606bfd7
qa/standalone/scrub/osd-scrub-repair.sh: drop omap_digest flag
...
This is no longer set if we are backed by bluestore, which we are by
default. See be078c8b7b
Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-10-06 19:25:40 +08:00
Sage Weil
eaa350be95
Merge pull request #18094 from xiexingguo/wip-tracker-21618
...
qa/standalone/scrub/osd-scrub-repair.sh: add extents flag into object_info_t
Reviewed-by: Sage Weil <sage@redhat.com>
2017-10-05 11:14:01 -05:00
Sage Weil
15b63d6795
qa/standalone/scrub/osd-scrub-repair: no -y to diff
...
With -y you can't see the entire line when it is long, which is
needed to identify the diff failure in
http://tracker.ceph.com/issues/21618
Instead, let the interactive user specify the option if they want it.
Signed-off-by: Sage Weil <sage@redhat.com>
2017-10-03 14:35:35 -05:00
xie xingguo
2470ab4aba
qa/standalone/scrub/osd-scrub-repair.sh: add extents flag into object_info_t
...
Introduced-by: https://github.com/ceph/ceph/pull/15199
Fixes: http://tracker.ceph.com/issues/21618
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-10-03 21:14:53 +08:00
Kefu Chai
3dfe209499
Merge pull request #17955 from asomers/bin_bash2
...
test: fix bash path in shebangs (part 2)
Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-09-30 12:13:35 +08:00
David Zafman
2f466f8b26
Merge pull request #17920 from dzafman/wip-21382
...
Erasure code recovery should send additional reads if necessary
Fixes: http://tracker.ceph.com/issues/21382
Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-09-29 09:04:43 -07:00
David Zafman
1235810c2a
osd: Allow recovery to send additional reads
...
For now it doesn't include non-acting OSDs
Added test for this case
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-09-28 23:31:18 -07:00
David Zafman
f92aa6c824
test: Allow modified options to existing setup functions
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-09-28 23:31:18 -07:00
David Zafman
43e3206de2
test: Use feature to get last array element
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-09-28 23:31:18 -07:00
Alan Somers
d1cbb90daa
scripts: fix bash path in shebangs (part 2)
...
/bin/bash is a Linuxism. Other operating systems install bash to
different paths. Use /usr/bin/env in shebangs to find bash.
Signed-off-by: Alan Somers <asomers@gmail.com>
2017-09-25 17:20:40 -06:00
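The shebang change in the commit above can be illustrated as follows. This is a small sketch, not a file from the repository: the script body is invented for illustration, and only the first line reflects what the commit message describes.

```shell
#!/usr/bin/env bash
# env searches $PATH for bash, so the script also starts on systems where
# bash is not installed as /bin/bash (for example /usr/local/bin/bash on
# FreeBSD).
bash_path=$(command -v bash)
echo "bash found at: ${bash_path}"
```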
Sage Weil
ec2bdbc44c
qa/standalone/scrub/osd-scrub-snaps: adjust test for lack of snapdir objects
...
The head_exists stuff is totally gone; those test failures go away.
Signed-off-by: Sage Weil <sage@redhat.com>
2017-09-22 17:49:19 -04:00
Kefu Chai
73d4afbf8c
Merge pull request #17747 from tchaikov/wip-qa
...
qa/standalone: respect $TEMPDIR
Reviewed-by: David Zafman <dzafman@redhat.com>
2017-09-20 23:08:47 +08:00
Kefu Chai
f27251432a
Merge pull request #17785 from dzafman/wip-add-repair
...
test: Fix ceph-objectstore-tool usage check
Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-09-20 12:35:16 +08:00
Sage Weil
6767f841e5
Merge pull request #17427 from liewegas/wip-pg-num-limits
...
mon/OSDMonitor: implement cluster pg limit
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2017-09-19 12:57:10 -05:00
David Zafman
0364ae104a
test: Fix ceph-objectstore-tool usage check
...
Caused by: c7b7a1f04f
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-09-18 15:29:22 -07:00
Kefu Chai
085778b80a
Merge pull request #17703 from dzafman/wip-misc
...
Erasure code read test and code cleanup
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2017-09-15 19:54:58 +08:00
Kefu Chai
279d2980fa
qa/standalone/ceph-helpers.sh: pass btrfs subvolume options the right way
...
with the latest btrfs-progs, it complains with
$ sudo btrfs subvolume list . -t
btrfs subvolume list: too many arguments
so, we need to pass `-t` right after `list` subcommand.
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-09-15 12:19:50 +08:00
Kefu Chai
0c47aa8217
qa: respect $TEMPDIR
...
ceph-disk and ceph-detect-init are built in $TEMPDIR if it's defined.
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-09-15 12:19:50 +08:00
Sage Weil
c9ffeeebeb
qa/standalong/mon/osd-pool-create: fewer pgs in test
...
This runs afoul of the new max pg per osd limit.
Signed-off-by: Sage Weil <sage@redhat.com>
2017-09-14 12:10:13 -04:00
David Zafman
50e08b0a5d
test: Add a removal test for erasure code read
...
Test feature: http://tracker.ceph.com/issues/14513
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-09-13 13:15:52 -07:00
Xie Xingguo
0e604b112e
Merge pull request #17515 from xiexingguo/wip-data-digest
...
osd/PrimaryLogPG: do not set data digest for bluestore
Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2017-09-13 18:31:10 +08:00
xie xingguo
afcb617dc9
osd/PrimaryLogPG: do not generate data digest for BlueStore by default
...
BlueStore enables CRC by default, so this is redundant and brings
no additional benefit.
Turn this off by default, which is good for performance.
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-09-13 12:17:16 +08:00
David Zafman
44f51024cc
Merge pull request #17538 from dzafman/wip-21272
...
Add export and remove ceph-objectstore-tool command option
Reviewed-by: Sage Weil <sage@redhat.com>
2017-09-11 20:12:27 -07:00
David Zafman
3bb20f6d75
ceph-objectstore-tool: Make pg removal require --force
...
Add new export-remove to combine the 2 operations
Fixes: http://tracker.ceph.com/issues/21272
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-09-08 17:56:05 -07:00
David Zafman
49ca1fff7f
ceph-objectstore-tool: Better messages for bad --journal-path
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-09-08 17:50:46 -07:00
David Zafman
3ac219df2d
test: Fix ceph-objectstore-tool test for standalone and latest code
...
vstart.sh now defaults to bluestore, so specify filestore
Set environment for run-standalone.sh and cmake build
Create td/cot_dir as test directory
Crush output format change
Change dir into test directory
Give a little time after pool creation
Check for core files as ceph-helpers.sh does
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-09-08 16:53:53 -07:00
David Zafman
495c32fd31
test: Move ceph-objectstore-tool test to standalone
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-09-08 16:53:30 -07:00
Yuri Weinstein
0c2a139ee6
Merge pull request #17513 from Liuchang0812/wip-max-avail-in-df
...
mon: incorrect MAX AVAIL in "ceph df"
Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2017-09-08 13:41:07 -07:00
Sage Weil
e2bc8883ba
qa/standalone/mon/misc.sh: fix mon feature test
...
Signed-off-by: Sage Weil <sage@redhat.com>
2017-09-06 10:18:07 -04:00
liuchang0812
365558571c
mon: incorrect MAX AVAIL in "ceph df"
...
Fixes: http://tracker.ceph.com/issues/21243
Signed-off-by: liuchang0812 <liuchang0812@gmail.com>
2017-09-06 21:09:29 +08:00
xie xingguo
2ee80aead8
mon/OSDMonitor: make 'osd crush class rename' idempotent
...
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-08-29 10:43:35 +08:00
Kefu Chai
30b5b4627c
Merge pull request #16494 from asomers/bin_bash
...
misc: Fix bash path in shebangs
Reviewed-by: Willem Jan Withagen <wjw@digiware.nl>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-08-27 10:14:14 +08:00
Sage Weil
5db94f4786
Merge pull request #17126 from xiexingguo/wip-nicenum
...
common/types: make numbers a bit nicer when displaying space usage
Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-08-25 10:11:06 -05:00
Sage Weil
84465bf5a5
qa/standalone/scrub/osd-scrub-repair: fix grep pattern
...
PGMap shows
ss << pg_sum.stats.sum.num_objects_unfound
<< "/" << pg_sum.stats.sum.num_objects << " objects unfound (" << b << "%)";
but we were grepping for "1/1 unfound" instead of "1/1 objects
unfound".
Introduced by fe81b7e3a5.
Fixes: http://tracker.ceph.com/issues/21127
Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-25 11:03:44 -04:00
Kefu Chai
85b63670d9
Merge pull request #17039 from dzafman/wip-18206
...
osd: Fixes for osd_scrub_during_recovery handling
Reviewed-by: Greg Farnum <gfarnum@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-08-22 22:50:24 +08:00
xie xingguo
1ea448ac75
common/types: make numbers a bit nicer when displaying space usage
...
Was:
----------------------------------------------------------------------------
GLOBAL:
    SIZE       AVAIL      RAW USED     %RAW USED
    30911M     27050M        3861M         12.49
POOLS:
    NAME                  ID     USED        %USED     MAX AVAIL     OBJECTS
    rbd                   0      101216k      1.10         8913M        1178
    cephfs_data_a         1            0         0         8913M           0
    cephfs_metadata_a     2          892         0         8913M          21
----------------------------------------------------------------------------
Now:
----------------------------------------------------------------------------
GLOBAL:
    SIZE      AVAIL     RAW USED     %RAW USED
    30.2G     26.4G        3.77G         12.50
POOLS:
    NAME                  ID     USED      %USED     MAX AVAIL     OBJECTS
    rbd                   0      99.2M      1.10         8.70G        1180
    cephfs_data_a         1          0         0         8.70G           0
    cephfs_metadata_a     2        892         0         8.70G          21
----------------------------------------------------------------------------
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-08-22 12:33:10 +08:00
David Zafman
367c32c69a
osd: Fixes for osd_scrub_during_recovery handling
...
Fixes: http://tracker.ceph.com/issues/18206
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-21 17:08:14 -07:00
David Zafman
9f3d970a0d
tests: osd-scrub-snaps.sh minor cleanup
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-21 17:08:14 -07:00
David Zafman
4c949b6258
osd, rados: Adding ss_attr_missing and ss_attr_corrupt errors to list-inconsistent-obj
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-11 11:37:32 -07:00
David Zafman
5f58301a13
osd, rados: Improve size scrub error handling
...
Fixes: http://tracker.ceph.com/issues/20243
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-11 11:37:32 -07:00
David Zafman
8ad4b29113
osd: Add whether shard is primary in list-inconsistent-obj
...
Add new field in the client interface
Update test case
Fixes: http://tracker.ceph.com/issues/18836
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-11 11:37:03 -07:00
Yuri Weinstein
11c57701c6
Merge pull request #16961 from xiexingguo/wip-class-rename
...
crush: "osd crush class rename" support
Reviewed-by: Sage Weil <sage@redhat.com>
2017-08-11 06:18:57 -07:00
Sage Weil
d2d9b41275
Merge pull request #16709 from dzafman/wip-standalone
...
qa/standalone: misc fixes
2017-08-10 21:33:43 -05:00
xie xingguo
d792e8d528
crush: "osd crush class rename" support
...
In 076a6abd80 I killed the 'class rename' command, thinking it was
totally useless, but I was wrong.
Consider the following use case:
(1) randomly choose some OSDs (e.g., from different hosts) and reserve them for private use,
say, by grouping them into 'pool1'
(2) ceph osd crush set-device-class pool1 'OSDs from (1)'
(3) ceph osd crush rule create-replicated rule_for_pool1 default host pool1
(4) ceph osd pool rename pool1 pool2
(5) ceph osd crush class rename pool1 pool2
In this use case, we need to change a pool name safely without risking
data migration. That is why the 'osd crush class rename' command
is still needed.
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-08-11 08:32:39 +08:00
David Zafman
e24ac51a82
qa: Fix broken test_activate_osd() due to missing space
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-10 12:37:05 -07:00
David Zafman
ae2c5331fb
qa: Fix races with waiting for scrubs
...
The trigger_scrub sets the last_scrub_stamp backwards to
force a scheduled scrub. In a small window this stamp could get propagated
to the mgr. A test failure occurred because wait_for_scrub() was confused
by seeing a backward moving date.
The most critical change is having wait_for_scrub() make sure that the
date advances past its previous value.
A test failed because the random backoff kept delaying the triggered scrub,
so set osd_scrub_backoff throughout.
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-10 12:37:05 -07:00
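The wait described in the commit above (succeed only once the stamp has moved strictly past its earlier value) can be sketched as below. This is an illustration under stated assumptions, not the helper from ceph-helpers.sh: `wait_for_stamp_advance` and `fake_stamp` are hypothetical names, and the stamps are assumed to use a "YYYY-MM-DD HH:MM:SS" layout so that string order matches time order.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed once the stamp printed by the given command
# sorts strictly after $prev, or fail after $tries attempts.
# Usage: wait_for_stamp_advance <prev> <tries> <command> [args...]
wait_for_stamp_advance() {
    local prev=$1 tries=$2
    shift 2
    local i current
    for (( i = 0; i < tries; i++ )); do
        current=$("$@")
        # Fixed-width date stamps compare correctly as plain strings.
        if [[ "$current" > "$prev" ]]; then
            return 0
        fi
        sleep 1
    done
    return 1
}

# Stand-in for a real stamp query; a test would ask the cluster instead.
fake_stamp() { echo "2017-08-10 12:00:05"; }
```

A stamp that is equal to the previous one, or that moved backward, does not count as progress, which is the property the commit message calls out.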
David Zafman
dddda523d1
qa: Testing of ceph-helpers.sh, teardown on fail to dump logs, save cores
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-10 12:37:05 -07:00
David Zafman
1fe6cb0f02
osd: Avoid confusion over legacy snaps when head_exists corrupt
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-10 12:37:05 -07:00
David Zafman
229de6b71d
qa: Add support for core dumps
...
Save core dumps when running tests locally
Dump logs to output whenever cores seen
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-10 12:37:04 -07:00
David Zafman
4db5124e1a
qa: For FreeBSD skip osd-dup.sh because there is no bluestore
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-10 08:30:47 -07:00
David Zafman
61bfd236ad
qa: Raise mon-data-avail-warn to pass tests with less space
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-10 08:30:47 -07:00
David Zafman
574b3cd3d4
qa: Add common generalized inject_eio() to ceph-helpers.sh
...
Retry for a while to allow pool to appear
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-10 08:30:47 -07:00
David Zafman
3988ebab43
qa: osd-scrub-repair.sh handle older versions of jq
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-10 08:30:47 -07:00
David Zafman
2a679a36de
qa: Add support for specifying sub-tests with run-standalone.sh
...
Fix test-ceph-helpers.sh to pass additional arguments on
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-10 08:30:47 -07:00
David Zafman
69413618a0
qa: ceph-helpers.sh fixes
...
Add missing teardown to cleanup test directory
Fix pgid due to elimination of initial default pool
Testing could never fail because run_tests return ignored
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-10 08:30:47 -07:00
xie xingguo
87952fc68d
crush: automatically kill dead classes
...
If a class is no longer referenced by any device or crush rule,
it is considered dead.
This patch makes Ceph automatically recycle those dead classes,
so the user does not need to explicitly call 'class rm', which is unsafe
and annoying.
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-08-05 18:53:39 +08:00
xie xingguo
b863883ca7
crush: remove 'class rm' command
...
The current version is broken. E.g., it should only remove a class
that is not referenced by any device.
Since we now create new classes automatically, we shall automatically
recycle dead classes too, so this command is no longer useful.
(Actually it is weird that we keep 'class rm' without keeping the
corresponding 'class create' command.)
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-08-05 18:52:30 +08:00
xie xingguo
f1d80ff750
crush: do not automatically recycle class for 'rm-device-class'
...
This will prevent the current crush rule from referencing a non-existent
shadow tree and hence avoid a coredump such as below:
0> 2017-08-05 09:54:19.943349 7f73887d6700 -1 /clove/vm/xxg/rpm/ceph/rpmbuild/BUILD/ceph-12.1.2.1/src/crush/CrushWrapper.cc: In function 'int CrushWrapper::get_rule_weight_osd_map(unsigned
int, std::map<int, float>*)' thread 7f73887d6700 time 2017-08-05 09:54:19.941291
/clove/vm/xxg/rpm/ceph/rpmbuild/BUILD/ceph-12.1.2.1/src/crush/CrushWrapper.cc: 1631: FAILED assert(b)
ceph version 12.1.2.1-11-gd0f812a (d0f812a3a757b319c26794f558b57770663ab324) luminous (rc)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x110) [0x7f7398b66ea0]
2: (CrushWrapper::get_rule_weight_osd_map(unsigned int, std::map<int, float, std::less<int>, std::allocator<std::pair<int const, float> > >*)+0x54e) [0x7f7398daac4e]
3: (PGMap::get_rule_avail(OSDMap const&, int) const+0x68) [0x7f73989a6428]
4: (PGMap::get_rules_avail(OSDMap const&, std::map<int, long, std::less<int>, std::allocator<std::pair<int const, long> > >*) const+0x35c) [0x7f73989b748c]
5: (PGMap::encode_digest(OSDMap const&, ceph::buffer::list&, unsigned long) const+0x16) [0x7f73989b7506]
6: (DaemonServer::send_report()+0x2a4) [0x7f73989f5474]
7: (DaemonServer::maybe_ready(int)+0x2f9) [0x7f73989f6129]
8: (DaemonServer::ms_dispatch(Message*)+0xce) [0x7f73989ff68e]
9: (DispatchQueue::entry()+0x792) [0x7f7398dd2a22]
10: (DispatchQueue::DispatchThread::entry()+0xd) [0x7f7398c1429d]
11: (()+0x7df3) [0x7f739640cdf3]
12: (clone()+0x6d) [0x7f73954f23ed]
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-08-05 18:44:59 +08:00
David Zafman
99ad4bbd91
qa: Add create_pool() which sleeps 1 second like python variant
...
wait_for_clean() can miss the new pool if it races with pool create.
Fixes: http://tracker.ceph.com/issues/20465
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-04 06:38:09 -07:00
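A sketch of the wrapper the commit above describes: create the pool, then pause for one second so that a following wait_for_clean() does not race with pool creation. The function body is an assumption based on the commit title, not a copy of ceph-helpers.sh, and the pool name and PG count in the example are illustrative.

```shell
#!/usr/bin/env bash
# Sketch: create a pool, then pause briefly so later checks can see it.
# All arguments are passed through to "ceph osd pool create".
create_pool() {
    ceph osd pool create "$@" || return 1
    sleep 1
}

# Example (needs a running test cluster):
#   create_pool mypool 8
```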
David Zafman
b20dfc2864
qa: Add special test_failure.sh script (not run by default)
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-04 06:38:09 -07:00
David Zafman
8c768050a5
qa: run-standalone.sh improvements
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-04 06:38:09 -07:00
David Zafman
4314cdd666
qa: Dump logs after daemons are killed to make sure everything is flushed
...
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-04 06:38:09 -07:00
xie xingguo
734b5f2c60
test/osd-fast-mark-down: enable 'osd-class-update-on-start' by default
...
116cf759c8 will now hide all shadow trees (roots), so this is no longer
applicable (actually it is misleading).
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-08-03 17:26:26 -04:00
Sage Weil
41bcf2fee5
Merge pull request #16281 from badone/wip-PG-cluster-log-audit
...
osd: Log audit
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2017-07-27 16:25:30 -05:00
Alan Somers
3aae5ca6fd
scripts: fix bash path in shebangs
...
/bin/bash is a Linuxism. Other operating systems install bash to
different paths. Use /usr/bin/env in shebangs to find bash.
Signed-off-by: Alan Somers <asomers@gmail.com>
2017-07-27 13:24:26 -06:00
Sage Weil
e469a8044c
qa/standalone/crush/crush-classes: fix test
...
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-27 12:25:25 -04:00
Sage Weil
380de3395f
qa/standalone/README
...
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-27 12:24:52 -04:00
xie xingguo
076a6abd80
crush: kill 'class rename'
...
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-26 22:40:50 +08:00
xie xingguo
a27fd9d25c
crush: kill "class create" command
...
Device classes are now managed automatically.
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-26 22:40:17 +08:00
xie xingguo
edd8930346
crush: allow "crush class rm" to automatically recycle shadow tree(s)
...
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-26 22:39:41 +08:00
xie xingguo
9d908c14f6
crush: rm-device-class support
...
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-26 22:39:08 +08:00
xie xingguo
32fb548797
crush: guard set-device-class
...
If a device has already been bound to a class,
do not allow its class to be changed silently.
Require the user to call rm-device-class first.
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-26 22:34:08 +08:00
xie xingguo
e4e83a0dd7
crush: fix class_is_in_use()
...
A class can be considered as in-use only if it is referenced by
any of the existing crush rules.
The patch also makes the output more human readable. For example:
./bin/ceph osd crush rule create-replicated myrule default host ssd
./bin/ceph osd crush class rm ssd
Error EBUSY: class 'ssd' still referenced by crush_rule 'myrule'
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-26 22:31:39 +08:00
xie xingguo
f3a3180cca
crush: rebuild shadow tree on "crush create-or-move/move"
...
This patch solves the problem below:
./bin/ceph osd crush move osd.0 root=foo rack=foo-rack host=foo-host
moved item id 0 name 'osd.0' to location {host=foo-host,rack=foo-rack,root=foo} in crush map
./bin/ceph osd crush rule create-replicated foo-rule foo host ssd
Error EINVAL: root foo has no devices with class ssd
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-26 22:30:59 +08:00
xie xingguo
10bf2a633f
crush: fix "crush create-or-move/move" would drop osd's class
...
Was:
./bin/ceph osd tree
ID CLASS WEIGHT  TYPE NAME                                        UP/DOWN REWEIGHT PRI-AFF
-1       3.00000 root default
-2       3.00000     host gitbuilder-ceph-rpm-centos7-amd64-basic
 0   ssd 1.00000         osd.0                                         up  1.00000 1.00000
 1   ssd 1.00000         osd.1                                         up  1.00000 1.00000
 2   ssd 1.00000         osd.2                                         up  1.00000 1.00000
./bin/ceph osd crush move osd.0 root=foo rack=foo-rack host=foo-host
moved item id 0 name 'osd.0' to location {host=foo-host,rack=foo-rack,root=foo} in crush map
./bin/ceph osd tree
ID CLASS WEIGHT  TYPE NAME                                        UP/DOWN REWEIGHT PRI-AFF
-7       1.00000 root foo
-6       1.00000     rack foo-rack
-5       1.00000         host foo-host
 0       1.00000             osd.0                                     up  1.00000 1.00000
-1       2.00000 root default
-2       2.00000     host gitbuilder-ceph-rpm-centos7-amd64-basic
 1   ssd 1.00000         osd.1                                         up  1.00000 1.00000
 2   ssd 1.00000         osd.2                                         up  1.00000 1.00000
Now:
./bin/ceph osd tree
ID CLASS WEIGHT  TYPE NAME                                        UP/DOWN REWEIGHT PRI-AFF
-1       3.00000 root default
-2       3.00000     host gitbuilder-ceph-rpm-centos7-amd64-basic
 0   ssd 1.00000         osd.0                                         up  1.00000 1.00000
 1   ssd 1.00000         osd.1                                         up  1.00000 1.00000
 2   ssd 1.00000         osd.2                                         up  1.00000 1.00000
./bin/ceph osd crush move osd.0 root=foo rack=foo-rack host=foo-host
moved item id 0 name 'osd.0' to location {host=foo-host,rack=foo-rack,root=foo} in crush map
./bin/ceph osd tree
ID CLASS WEIGHT  TYPE NAME                                        UP/DOWN REWEIGHT PRI-AFF
-7       1.00000 root foo
-6       1.00000     rack foo-rack
-5       1.00000         host foo-host
 0   ssd 1.00000             osd.0                                     up  1.00000 1.00000
-1       2.00000 root default
-2       2.00000     host gitbuilder-ceph-rpm-centos7-amd64-basic
 1   ssd 1.00000         osd.1                                         up  1.00000 1.00000
 2   ssd 1.00000         osd.2                                         up  1.00000 1.00000
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-26 22:30:26 +08:00
Brad Hubbard
f8acc53d82
osd: Log audit
...
Review current log messages for consistency, accuracy and necessity as
part of the usability initiative. First in a series.
Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
2017-07-26 17:34:28 +10:00
Sage Weil
766229b034
qa/standalone/scrub: separate scrub/repair tests from rest of osd/
...
They are slow. Run them separately.
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-24 22:11:50 -04:00
Sage Weil
cabad62242
qa/standalone/ceph-helpers: factor rbd pool create out of run_mon
...
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-24 22:11:50 -04:00
Sage Weil
b12bebe432
qa/standalone/mon/osd-pool-create: stop testing create pool output
...
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-24 22:11:49 -04:00
Sage Weil
71ea171604
qa: move ceph-helpers and misc src/test/*.sh tests to qa/standalone
...
- stop running via make check
- add teuthology yamls to run them
- disable ceph_objectstore_tool.py for now (too slow for make check, and
we can't use vstart in teuthology via a package install)
- drop cephtool tests since those are already covered by other teuthology
tests
- leave a handful of (fast!) ceph-helpers tests for make check for minimal
integration tests.
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-24 22:11:49 -04:00