Commit Graph

2494 Commits

Author SHA1 Message Date
Sage Weil
7c350180b1 qa/suites/rados/mgr/tasks/failover: whitelist
remote/smithi025/log/ceph.log.gz:2017-08-03 07:02:15.049074 mon.b mon.0 172.21.15.25:6789/0 197 : cluster [INF] Manager daemon x is unresponsive, replacing it with standby daemon y
remote/smithi025/log/ceph.log.gz:2017-08-03 07:03:10.078032 mon.b mon.0 172.21.15.25:6789/0 226 : cluster [WRN] Manager daemon x is unresponsive.  No standby daemons available.

x and y may be swapped, so whitelist the rest of the string.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-03 12:40:01 -04:00
Patrick Donnelly
d4ed085238
Merge PR #16713 into master
* refs/remotes/upstream/pull/16713/head:
	qa: ignore failed MDS message during upgrade
2017-08-02 19:41:42 -07:00
Patrick Donnelly
6cad5be68c
Merge PR #16714 into master
* refs/remotes/upstream/pull/16714/head:
	qa: test export_pin is correct in dumped subtree
	mds: print export_pin for dumped subtree

Reviewed-by: Douglas Fuller <dfuller@redhat.com>
Reviewed-by: huanwen ren <ren.huanwen@zte.com.cn>
2017-08-02 18:41:12 -07:00
Sage Weil
5085dc1164 qa/suites/powercycle: whitelist health for thrashing
Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-02 11:06:43 -04:00
Kefu Chai
1ff1f836da Merge pull request #16722 from tchaikov/wip-qa-fixes
qa/suites: escape the parenthesis of the whitelist text

Reviewed-by: Sage Weil <sage@redhat.com>
2017-08-02 13:00:01 +08:00
Kefu Chai
a70be4e00c qa/suites: more whitelisting
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-08-02 10:00:57 +08:00
Kefu Chai
69c6402bbd Merge pull request #16727 from jcsp/wip-doc-config-hel
doc/qa: cover `config help` command

Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-08-01 23:38:28 +08:00
Jason Dillaman
2589f57ecd Merge pull request #16656 from idryomov/wip-qa-newer-fio
qa/tasks/rbd_fio: bump default fio version to 2.21

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
2017-08-01 10:14:46 -04:00
Kefu Chai
d12c51ca91 qa/suites: escape the parenthesis of the whitelist text
so we can avoid the warnings like

grep: Unmatched ( or \(

because we pass the whitelisted string to `egrep -v "$1"` directly.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-08-01 21:54:44 +08:00
Kefu Chai
d67d6c57ae qa/workunits/ceph-disk: fix the path to ceph-helpers-root.sh
partially reverts 841f3bd

Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-08-01 21:54:44 +08:00
John Spray
ac2b9d63ca qa: include config help in admin socket test
Signed-off-by: John Spray <john.spray@redhat.com>
2017-08-01 13:38:40 +01:00
Patrick Donnelly
8db2c43e79
qa: test export_pin is correct in dumped subtree
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-07-31 15:33:49 -07:00
Patrick Donnelly
5e5ff5c086
qa: ignore failed MDS message during upgrade
The cluster is expected to become degraded during reboot.

Fixes: http://tracker.ceph.com/issues/20731
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-07-31 14:45:07 -07:00
Patrick Donnelly
019f20ff98
Merge PR #16640 into master
* refs/remotes/upstream/pull/16640/head:
	qa: fix wait for wrong health message

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2017-07-28 09:55:49 -07:00
Patrick Donnelly
6fc2ee383f
Merge PR #16413 into master
* refs/remotes/upstream/pull/16413/head:
	qa/cephfs: lsof if umount fails

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2017-07-28 09:55:23 -07:00
Sage Weil
c3c2b31c87 Merge pull request #16568 from liewegas/wip-application-warn
qa,doc: document and fix tests for pool application warnings
2017-07-28 09:00:46 -05:00
Kefu Chai
75e361433d qa/run-standalone.sh: fix the find option to be compatible with GNU find
also re-indent to be consistent with other part of this script

Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-07-28 14:22:02 +08:00
Kefu Chai
2a128f4829 Merge pull request #16599 from liewegas/wip-standalone-fixes
qa/workunits: adjust path to ceph-helpers.sh

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-07-28 13:18:47 +08:00
Patrick Donnelly
fb039383e9
Merge PR #16435 into master
* refs/remotes/upstream/pull/16435/head:
	qa: whitelist trim error during powercycle tests

Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-07-27 17:54:59 -07:00
Patrick Donnelly
ced01a2335
qa: fix wait for wrong health message
Fixes: http://tracker.ceph.com/issues/20805

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-07-27 14:40:05 -07:00
Sage Weil
41bcf2fee5 Merge pull request #16281 from badone/wip-PG-cluster-log-audit
osd: Log audit

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2017-07-27 16:25:30 -05:00
Sage Weil
862392fbf9 Merge pull request #16514 from liewegas/wip-20744
qa/tasks/ceph: wait for mgr to activate and pg stats to flush in health()

Reviewed-by: John Spray <john.spray@redhat.com>
2017-07-27 16:24:59 -05:00
Patrick Donnelly
d7f5af40a2
qa: whitelist trim error during powercycle tests
Fixes: http://tracker.ceph.com/issues/20566

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-07-27 13:24:21 -07:00
Sage Weil
541de391e1 Merge pull request #16572 from liewegas/wip-pidfile
test: add separate ceph-helpers-based smoke test

Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-07-27 12:32:36 -05:00
Ilya Dryomov
bd6e3e5f1f qa/tasks/rbd_fio: bump default fio version to 2.21
I'm seeing sporadic single thread deadlocks on fio stat_mutex during krbd
thrash runs:

  (gdb) info threads
    Id   Target Id         Frame
  * 1    Thread 0x7f89ee730740 (LWP 15604) 0x00007f89ed9f41bd in __lll_lock_wait () from /lib64/libpthread.so.0
  (gdb) bt
  #0  0x00007f89ed9f41bd in __lll_lock_wait () from /lib64/libpthread.so.0
  #1  0x00007f89ed9f17b2 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  #2  0x00000000004429b9 in fio_mutex_down (mutex=0x7f89ee72d000) at mutex.c:170
  #3  0x0000000000459704 in thread_main (data=<optimized out>) at backend.c:1639
  #4  0x000000000045b013 in fork_main (offset=0, shmid=<optimized out>, sk_out=0x0) at backend.c:1778
  #5  run_threads (sk_out=sk_out@entry=0x0) at backend.c:2195
  #6  0x000000000045b47f in fio_backend (sk_out=sk_out@entry=0x0) at backend.c:2400
  #7  0x000000000040cb0c in main (argc=2, argv=0x7fffad3e3888, envp=<optimized out>) at fio.c:63
  (gdb) up 2
  170                     pthread_cond_wait(&mutex->cond, &mutex->lock);
  (gdb) p mutex.lock.__data.__owner
  $1 = 15604

Upgrading to 2.21 seems to make these go away.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-27 18:57:43 +02:00
Sage Weil
c1aef68f02 Merge pull request #16569 from liewegas/wip-set-not-put
mon: 'config-key put' -> 'config-key set'

Reviewed-by: Joao Eduardo Luis <joao@suse.de>
2017-07-27 11:34:37 -05:00
Sage Weil
e469a8044c qa/standalone/crush/crush-classes: fix test
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-27 12:25:25 -04:00
Sage Weil
380de3395f qa/standalone/README
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-27 12:24:52 -04:00
Sage Weil
0b5036f072 qa/suites/rados/upgrade: fix upgrade wait for healthy
There is no mgr, so we can't call ceph.healthy.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-27 12:10:34 -04:00
Sage Weil
a40d94b163 qa/tasks/ceph: wait for pg stats to flush in healthy check
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-27 12:10:27 -04:00
Sage Weil
80978dea8a qa/tasks/ceph_manager: wait_for_all_up -> wait_for_all_osds_up
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-27 12:10:26 -04:00
Sage Weil
7648894e55 qa/tasks/ceph_manager: expose flush_all_pg_stats
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-27 12:10:26 -04:00
Sage Weil
c7430c56cd Merge pull request #16388 from xiexingguo/wip-class-misc-fixes
crush, mon: simplify device class manipulation commands

Reviewed-by: Sage Weil <sage@redhat.com>
2017-07-27 11:04:33 -05:00
Sage Weil
203c68ad55 Merge pull request #16575 from liewegas/wip-20693
qa/suites/rados: at-end: ignore PG_{AVAILABILITY,DEGRADED}

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2017-07-27 08:31:53 -05:00
Sage Weil
e398fd4ee4 qa/suites: more whitelisting
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-27 09:31:24 -04:00
Jason Dillaman
42fabc2e80 Merge pull request #16398 from dillaman/wip-20655
rbd-mirror: guard the deletion of non-primary images

Reviewed-by: Venky Shankar <vshankar@redhat.com>
2017-07-27 08:27:39 -04:00
David Zafman
e92c953d7b Merge pull request #16610 from dzafman/wip-fix-reg11184
test: reg11184 might not always find pg 2.0 prior to import

Reviewed-by: Sage Weil <sage@redhat.com>
2017-07-26 11:42:15 -07:00
Sage Weil
5534912daa qa/workunits/cephtool/test.sh: add some config-key tests
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-26 14:13:22 -04:00
Sage Weil
4eb1a518e3 mon: 'config-key put' -> 'config-key set'
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-26 14:10:08 -04:00
Sage Weil
ee06dc6996 Merge pull request #16530 from xiexingguo/wip-fix-pgtemp
mon: prime pg_temp and a few health warning fixes

Reviewed-by: Sage Weil <sage@redhat.com>
2017-07-26 13:09:33 -05:00
Sage Weil
59a3a4a40e Merge pull request #16559 from hjwsm1989/dump-stuck
qa/tasks/dump_stuck: fix dump_stuck test bug

Reviewed-by: Sage Weil <sage@redhat.com>
2017-07-26 11:59:21 -05:00
David Zafman
7c43840399 test: reg11184 might not always find pg 2.0 prior to import
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-07-26 09:46:15 -07:00
Sage Weil
56ffd7a727 Merge pull request #16571 from ceph/wip-cd-bluestore-2
qa/tasks/ceph-deploy: Fix bluestore options for ceph-deploy

Reviewed-by: Tamil Muthamizhan <tmuthami@redhat.com>
2017-07-26 11:43:50 -05:00
xie xingguo
076a6abd80 crush: kill 'class rename'
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-26 22:40:50 +08:00
xie xingguo
a27fd9d25c crush: kill "class create" command
The device class is now self and automatically managed.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-26 22:40:17 +08:00
xie xingguo
edd8930346 crush: allow "crush class rm" to automatically recycle shadow tree(s)
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-26 22:39:41 +08:00
xie xingguo
9d908c14f6 crush: rm-device-class support
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-26 22:39:08 +08:00
xie xingguo
32fb548797 crush: guard set-device-class
If a device has already been bounded to a class,
do not allow to change its class silently.
Require user call rm-device-class first.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-26 22:34:08 +08:00
xie xingguo
e4e83a0dd7 crush: fix class_is_in_use()
A class can be considered as in-use only if it is referenced by
any of the existing crush rules.

The patch also makes the output more human readable. For example:

./bin/ceph osd crush rule create-replicated myrule default host ssd
./bin/ceph osd crush class rm ssd
Error EBUSY: class 'ssd' still referenced by crush_rule 'myrule'

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-26 22:31:39 +08:00
xie xingguo
f3a3180cca crush: rebuild shadow tree on "crush create-or-move/move"
This patch solves the problem below:

./bin/ceph osd crush move osd.0 root=foo rack=foo-rack host=foo-host
moved item id 0 name 'osd.0' to location {host=foo-host,rack=foo-rack,root=foo} in crush map

 ./bin/ceph osd crush rule create-replicated foo-rule foo host ssd
Error EINVAL: root foo has no devices with class ssd

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-26 22:30:59 +08:00