Commit Graph

61 Commits

Author SHA1 Message Date
Kefu Chai
0c47aa8217 qa: respect $TEMPDIR
ceph-disk and ceph-detect-init are built in $TEMPDIR if it's defined.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-09-15 12:19:50 +08:00
Xie Xingguo
0e604b112e Merge pull request #17515 from xiexingguo/wip-data-digest
osd/PrimaryLogPG: do not set data digest for bluestore

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2017-09-13 18:31:10 +08:00
xie xingguo
afcb617dc9 osd/PrimaryLogPG: do not generate data digest for BlueStore by default
BlueStore enables CRC by default, so this is redundant and provides
no additional benefit.

Turn this off by default, which is good for performance.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-09-13 12:17:16 +08:00
David Zafman
44f51024cc Merge pull request #17538 from dzafman/wip-21272
Add export and remove ceph-objectstore-tool command option

Reviewed-by: Sage Weil <sage@redhat.com>
2017-09-11 20:12:27 -07:00
David Zafman
3bb20f6d75 ceph-objectstore-tool: Make pg removal require --force
Add new export-remove to combine the 2 operations

Fixes: http://tracker.ceph.com/issues/21272

Signed-off-by: David Zafman <dzafman@redhat.com>
2017-09-08 17:56:05 -07:00
David Zafman
49ca1fff7f ceph-objectstore-tool: Better messages for bad --journal-path
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-09-08 17:50:46 -07:00
David Zafman
3ac219df2d test: Fix ceph-objectstore-tool test for standalone and latest code
vstart.sh now defaults to bluestore, so specify filestore
Set environment for run-standalone.sh and cmake build
Create td/cot_dir as test directory
Crush output format change
Change dir into test directory
Give a little time after pool creation
Check for core files as ceph-helpers.sh does

Signed-off-by: David Zafman <dzafman@redhat.com>
2017-09-08 16:53:53 -07:00
David Zafman
495c32fd31 test: Move ceph-objectstore-tool test to standalone
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-09-08 16:53:30 -07:00
Yuri Weinstein
0c2a139ee6 Merge pull request #17513 from Liuchang0812/wip-max-avail-in-df
mon: incorrect MAX AVAIL in "ceph df"

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2017-09-08 13:41:07 -07:00
Sage Weil
e2bc8883ba qa/standalone/mon/misc.sh: fix mon feature test
Signed-off-by: Sage Weil <sage@redhat.com>
2017-09-06 10:18:07 -04:00
liuchang0812
365558571c mon: incorrect MAX AVAIL in "ceph df"
Fixes: http://tracker.ceph.com/issues/21243

Signed-off-by: liuchang0812 <liuchang0812@gmail.com>
2017-09-06 21:09:29 +08:00
xie xingguo
2ee80aead8 mon/OSDMonitor: make 'osd crush class rename' idempotent
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-08-29 10:43:35 +08:00
Kefu Chai
30b5b4627c Merge pull request #16494 from asomers/bin_bash
misc: Fix bash path in shebangs

Reviewed-by: Willem Jan Withagen <wjw@digiware.nl>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-08-27 10:14:14 +08:00
Sage Weil
5db94f4786 Merge pull request #17126 from xiexingguo/wip-nicenum
common/types: make numbers a bit nicer when displaying space usage

Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-08-25 10:11:06 -05:00
Sage Weil
84465bf5a5 qa/standalone/scrub/osd-scrub-repair: fix grep pattern
PGMap shows

    ss << pg_sum.stats.sum.num_objects_unfound
       << "/" << pg_sum.stats.sum.num_objects << " objects unfound (" << b << "%)";

but we were grepping for "1/1 unfound" instead of "1/1 objects
unfound".

Introduced by fe81b7e3a5.

Fixes: http://tracker.ceph.com/issues/21127
Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-25 11:03:44 -04:00
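
The grep mismatch described in 84465bf5a5 can be reproduced with a small sketch. The function below is a hypothetical Python stand-in for the quoted PGMap stream expression; the percentage formatting is an assumption, since the commit message does not show how `b` is rendered.

```python
def format_unfound(num_unfound: int, num_objects: int) -> str:
    """Hypothetical stand-in for the PGMap summary fragment quoted above."""
    # Assumption: the percentage is printed with three decimals; the real
    # formatting of `b` is not shown in the commit message.
    pct = 100.0 * num_unfound / num_objects if num_objects else 0.0
    return f"{num_unfound}/{num_objects} objects unfound ({pct:.3f}%)"


line = format_unfound(1, 1)

# The old test pattern never matches, because the word "objects"
# sits between the ratio and "unfound".
old_pattern_matches = "1/1 unfound" in line

# The corrected pattern matches the text that is actually emitted.
new_pattern_matches = "1/1 objects unfound" in line
```

With these inputs `old_pattern_matches` is false and `new_pattern_matches` is true, which is the behavior the test fix relies on.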
Kefu Chai
85b63670d9 Merge pull request #17039 from dzafman/wip-18206
osd: Fixes for osd_scrub_during_recovery handling

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-08-22 22:50:24 +08:00
xie xingguo
1ea448ac75 common/types: make numbers a bit nicer when displaying space usage
Was:
----------------------------------------------------------------------------
GLOBAL:
    SIZE       AVAIL      RAW USED     %RAW USED
    30911M     27050M        3861M         12.49
POOLS:
    NAME                  ID     USED        %USED     MAX AVAIL     OBJECTS
    rbd                   0      101216k      1.10         8913M        1178
    cephfs_data_a         1            0         0         8913M           0
    cephfs_metadata_a     2          892         0         8913M          21
----------------------------------------------------------------------------

Now:
----------------------------------------------------------------------------
GLOBAL:
    SIZE      AVAIL     RAW USED     %RAW USED
    30.2G     26.4G        3.77G         12.50
POOLS:
    NAME                  ID     USED      %USED     MAX AVAIL     OBJECTS
    rbd                   0      99.2M      1.10         8.70G        1180
    cephfs_data_a         1          0         0         8.70G           0
    cephfs_metadata_a     2        892         0         8.70G          21
----------------------------------------------------------------------------

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-08-22 12:33:10 +08:00
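
The "Was" and "Now" tables in 1ea448ac75 suggest that values are scaled to the largest binary unit and shown with about three significant digits. The following is a minimal sketch of that idea, not Ceph's actual implementation; the function name `nice_size` and the exact rounding rule are assumptions inferred from the sample output.

```python
def nice_size(n: int) -> str:
    """Format a byte count with a binary unit suffix (illustrative only)."""
    units = ["", "k", "M", "G", "T", "P", "E"]
    value = float(n)
    idx = 0
    # Scale down by 1024 until the value fits the current unit.
    while value >= 1024 and idx < len(units) - 1:
        value /= 1024
        idx += 1
    if idx == 0:
        # Plain byte counts are printed as integers, as in "892".
        return str(n)
    # Assumption: keep roughly three significant digits.
    if value >= 100:
        text = f"{value:.0f}"
    elif value >= 10:
        text = f"{value:.1f}"
    else:
        text = f"{value:.2f}"
    return text + units[idx]


MiB = 1024 ** 2
sizes = [nice_size(30911 * MiB), nice_size(3861 * MiB), nice_size(892)]
```

Under these assumptions the old values 30911M, 27050M, 3861M and 8913M come out as 30.2G, 26.4G, 3.77G and 8.70G, matching the "Now" table.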
David Zafman
367c32c69a osd: Fixes for osd_scrub_during_recovery handling
Fixes: http://tracker.ceph.com/issues/18206

Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-21 17:08:14 -07:00
David Zafman
9f3d970a0d tests: osd-scrub-snaps.sh minor cleanup
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-21 17:08:14 -07:00
David Zafman
4c949b6258 osd, rados: Adding ss_attr_missing and ss_attr_corrupt errors to list-inconsistent-obj
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-11 11:37:32 -07:00
David Zafman
5f58301a13 osd, rados: Improve size scrub error handling
Fixes: http://tracker.ceph.com/issues/20243

Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-11 11:37:32 -07:00
David Zafman
8ad4b29113 osd: Add whether shard is primary in list-inconsistent-obj
Add new field in the client interface
Update test case

Fixes: http://tracker.ceph.com/issues/18836

Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-11 11:37:03 -07:00
Yuri Weinstein
11c57701c6 Merge pull request #16961 from xiexingguo/wip-class-rename
crush: "osd crush class rename" support

Reviewed-by: Sage Weil <sage@redhat.com>
2017-08-11 06:18:57 -07:00
Sage Weil
d2d9b41275 Merge pull request #16709 from dzafman/wip-standalone
qa/standalone: misc fixes
2017-08-10 21:33:43 -05:00
xie xingguo
d792e8d528 crush: "osd crush class rename" support
In 076a6abd80 I killed the 'class rename' command
and thought it was totally useless but I was wrong.

Consider the following use case:
(1) randomly choose some OSDs (e.g., from different hosts) and reserve them for private use,
    say, by grouping them into 'pool1'
(2) ceph osd crush set-device-class pool1 'OSDs from (1)'
(3) ceph osd crush rule create-replicated rule_for_pool1 default host pool1
(4) ceph osd pool rename pool1 pool2
(5) ceph osd crush class rename pool1 pool2

In this use case, we need to safely change a pool name without worrying
about any risk of data migration. That is why the 'osd crush class rename'
command is still needed here.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-08-11 08:32:39 +08:00
David Zafman
e24ac51a82 qa: Fix broken test_activate_osd() due to missing space
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-10 12:37:05 -07:00
David Zafman
ae2c5331fb qa: Fix races with waiting for scrubs
The trigger_scrub sets the last_scrub_stamp backwards to
force a scheduled scrub.  In a small window this stamp could get propagated
to the mgr.  A test failure occurred because wait_for_scrub() was confused
by seeing a backward moving date.

The most critical change is having wait_for_scrub() make sure that the
date advances past the previous value.

A test failed because the random backoff kept delaying the triggered scrub,
so set osd_scrub_backoff throughout.

Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-10 12:37:05 -07:00
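
The key change in ae2c5331fb, waiting until the scrub stamp moves strictly past the previously observed value, can be sketched as follows. This is an illustrative Python analogue of the shell helper, not its actual code; `wait_for_stamp_advance` and its parameters are hypothetical names.

```python
import time
from typing import Callable


def wait_for_stamp_advance(get_stamp: Callable[[], float],
                           previous: float,
                           timeout: float = 30.0,
                           interval: float = 1.0) -> bool:
    """Poll until get_stamp() returns a value strictly greater than previous.

    A stamp that moved backwards (for example one set deliberately to
    force a scheduled scrub) is ignored rather than treated as progress.
    """
    deadline = time.monotonic() + timeout
    while True:
        if get_stamp() > previous:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Usage: the first reading is the backdated stamp, the second equals the
# previous value, and only the third counts as a completed scrub.
readings = iter([5.0, 10.0, 11.0])
done = wait_for_stamp_advance(lambda: next(readings), previous=10.0,
                              timeout=1.0, interval=0.0)
```

Comparing against the previous value, rather than checking for any change, is what keeps a backward-moving stamp from being mistaken for a finished scrub.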
David Zafman
dddda523d1 qa: Testing of ceph-helpers.sh, teardown on fail to dump logs, save cores
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-10 12:37:05 -07:00
David Zafman
1fe6cb0f02 osd: Avoid confusion over legacy snaps when head_exists corrupt
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-10 12:37:05 -07:00
David Zafman
229de6b71d qa: Add support for core dumps
Save core dumps when running tests locally
Dump logs to output whenever cores seen

Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-10 12:37:04 -07:00
David Zafman
4db5124e1a qa: For FreeBSD skip osd-dup.sh because there is no bluestore
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-10 08:30:47 -07:00
David Zafman
61bfd236ad qa: Raise mon-data-avail-warn to pass tests with less space
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-10 08:30:47 -07:00
David Zafman
574b3cd3d4 qa: Add common generalized inject_eio() to ceph-helpers.sh
Retry for a while to allow pool to appear

Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-10 08:30:47 -07:00
David Zafman
3988ebab43 qa: osd-scrub-repair.sh handle older versions of jq
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-10 08:30:47 -07:00
David Zafman
2a679a36de qa: Add support for specifying sub-tests with run-standalone.sh
Fix test-ceph-helpers.sh to pass additional arguments on

Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-10 08:30:47 -07:00
David Zafman
69413618a0 qa: ceph-helpers.sh fixes
Add missing teardown to cleanup test directory
Fix pgid due to elimination of initial default pool
Testing could never fail because the run_tests return value was ignored

Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-10 08:30:47 -07:00
xie xingguo
87952fc68d crush: automatically kill dead classes
If a class is no longer referenced by any devices or crush rules,
it shall be considered dead.

This patch makes Ceph automatically recycle those dead classes,
so users do not need to explicitly call 'class rm', which is unsafe
and annoying.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-08-05 18:53:39 +08:00
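
The rule stated in 87952fc68d (a class is dead once no device and no crush rule references it) amounts to a set difference. The sketch below uses hypothetical plain-Python data structures, not CrushWrapper's real ones.

```python
from typing import Dict, Iterable, Set


def find_dead_classes(all_classes: Iterable[str],
                      device_class: Dict[int, str],
                      rule_classes: Iterable[str]) -> Set[str]:
    """Return the classes referenced by neither a device nor a rule.

    device_class maps a device id to its class name, and rule_classes
    lists the class names used by crush rules (both hypothetical shapes).
    """
    referenced = set(device_class.values()) | set(rule_classes)
    return set(all_classes) - referenced


# Usage: 'pool1' is no longer attached to any device or rule.
dead = find_dead_classes(
    all_classes=["ssd", "hdd", "pool1"],
    device_class={0: "ssd", 1: "ssd"},
    rule_classes=["hdd"],
)
```

Here `dead` contains only `pool1`; a class still used by a rule (`hdd`) survives even though no device carries it.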
xie xingguo
b863883ca7 crush: remove 'class rm' command
The current version is broken. E.g., it should only remove a class
which is never referenced by any device.

Since we now create new classes automatically, we shall automatically
recycle dead classes too. So this command is no longer useful.
(Actually it is weird that we keep 'class rm' without keeping the
 corresponding 'class create' command).

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-08-05 18:52:30 +08:00
xie xingguo
f1d80ff750 crush: do not automatically recycle class for 'rm-device-class'
This will prevent the current crush rule from referencing a non-existent
shadow tree and hence avoid a coredump such as below:

 0> 2017-08-05 09:54:19.943349 7f73887d6700 -1 /clove/vm/xxg/rpm/ceph/rpmbuild/BUILD/ceph-12.1.2.1/src/crush/CrushWrapper.cc: In function 'int CrushWrapper::get_rule_weight_osd_map(unsigned
 int, std::map<int, float>*)' thread 7f73887d6700 time 2017-08-05 09:54:19.941291
/clove/vm/xxg/rpm/ceph/rpmbuild/BUILD/ceph-12.1.2.1/src/crush/CrushWrapper.cc: 1631: FAILED assert(b)

 ceph version 12.1.2.1-11-gd0f812a (d0f812a3a757b319c26794f558b57770663ab324) luminous (rc)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x110) [0x7f7398b66ea0]
 2: (CrushWrapper::get_rule_weight_osd_map(unsigned int, std::map<int, float, std::less<int>, std::allocator<std::pair<int const, float> > >*)+0x54e) [0x7f7398daac4e]
 3: (PGMap::get_rule_avail(OSDMap const&, int) const+0x68) [0x7f73989a6428]
 4: (PGMap::get_rules_avail(OSDMap const&, std::map<int, long, std::less<int>, std::allocator<std::pair<int const, long> > >*) const+0x35c) [0x7f73989b748c]
 5: (PGMap::encode_digest(OSDMap const&, ceph::buffer::list&, unsigned long) const+0x16) [0x7f73989b7506]
 6: (DaemonServer::send_report()+0x2a4) [0x7f73989f5474]
 7: (DaemonServer::maybe_ready(int)+0x2f9) [0x7f73989f6129]
 8: (DaemonServer::ms_dispatch(Message*)+0xce) [0x7f73989ff68e]
 9: (DispatchQueue::entry()+0x792) [0x7f7398dd2a22]
 10: (DispatchQueue::DispatchThread::entry()+0xd) [0x7f7398c1429d]
 11: (()+0x7df3) [0x7f739640cdf3]
 12: (clone()+0x6d) [0x7f73954f23ed]

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-08-05 18:44:59 +08:00
David Zafman
99ad4bbd91 qa: Add create_pool() which sleeps 1 second like python variant
wait_for_clean() can miss the new pool if it races with pool create.

Fixes: http://tracker.ceph.com/issues/20465

Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-04 06:38:09 -07:00
David Zafman
b20dfc2864 qa: Add special test_failure.sh script (not run by default)
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-04 06:38:09 -07:00
David Zafman
8c768050a5 qa: run-standalone.sh improvements
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-04 06:38:09 -07:00
David Zafman
4314cdd666 qa: Dump logs after daemons are killed to make sure everything is flushed
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-08-04 06:38:09 -07:00
xie xingguo
734b5f2c60 test/osd-fast-mark-down: enable 'osd-class-update-on-start' by default
116cf759c8 will now hide all shadow trees (roots), so this is no longer
applicable (actually it is misleading).

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-08-03 17:26:26 -04:00
Sage Weil
41bcf2fee5 Merge pull request #16281 from badone/wip-PG-cluster-log-audit
osd: Log audit

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2017-07-27 16:25:30 -05:00
Alan Somers
3aae5ca6fd scripts: fix bash path in shebangs
/bin/bash is a Linuxism.  Other operating systems install bash to
different paths.  Use /usr/bin/env in shebangs to find bash.

Signed-off-by: Alan Somers <asomers@gmail.com>
2017-07-27 13:24:26 -06:00
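
The shebang change in 3aae5ca6fd can be illustrated with a small rewrite helper. This is a sketch, not the script used for the actual commit; `portable_shebang` is a hypothetical name, and it deliberately leaves shebangs that carry arguments untouched, because passing extra arguments through `/usr/bin/env` is not handled the same way on every platform.

```python
def portable_shebang(script_text: str) -> str:
    """Replace a bare '#!/bin/bash' first line with '#!/usr/bin/env bash'."""
    first, sep, rest = script_text.partition("\n")
    # Only rewrite the exact, argument-free form; anything else is
    # returned unchanged so that no behavior is altered by accident.
    if first.rstrip() == "#!/bin/bash":
        return "#!/usr/bin/env bash" + sep + rest
    return script_text


converted = portable_shebang("#!/bin/bash\necho hello\n")
```

After the call, `converted` starts with `#!/usr/bin/env bash`, which looks bash up via PATH instead of assuming it lives in /bin.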
Sage Weil
e469a8044c qa/standalone/crush/crush-classes: fix test
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-27 12:25:25 -04:00
Sage Weil
380de3395f qa/standalone/README
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-27 12:24:52 -04:00
xie xingguo
076a6abd80 crush: kill 'class rename'
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-26 22:40:50 +08:00
xie xingguo
a27fd9d25c crush: kill "class create" command
The device class is now self and automatically managed.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-26 22:40:17 +08:00