2485 Commits

Author SHA1 Message Date
Josh Durgin
63693779fc qa: timeout when waiting for mgr to be available
Otherwise during upgrades we wait forever.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2017-08-02 02:18:28 -04:00
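A minimal sketch of the kind of bounded wait the commit above describes, assuming a generic polling helper. The name `wait_until` and its parameters are hypothetical and are not taken from the Ceph qa tasks:

```python
import time


def wait_until(predicate, timeout=300.0, interval=1.0,
               clock=time.monotonic, sleep=time.sleep):
    """Poll predicate() until it returns True or `timeout` seconds elapse.

    Hypothetical helper: raises TimeoutError instead of waiting forever.
    """
    deadline = clock() + timeout
    while True:
        if predicate():
            return
        if clock() >= deadline:
            raise TimeoutError(
                f"condition not met within {timeout} seconds")
        sleep(interval)
```

A caller would pass a predicate such as "is a mgr active yet" and treat the `TimeoutError` as a test failure rather than hanging the run.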
Kefu Chai
69c6402bbd Merge pull request #16727 from jcsp/wip-doc-config-hel
doc/qa: cover `config help` command

Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-08-01 23:38:28 +08:00
Jason Dillaman
2589f57ecd Merge pull request #16656 from idryomov/wip-qa-newer-fio
qa/tasks/rbd_fio: bump default fio version to 2.21

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
2017-08-01 10:14:46 -04:00
John Spray
ac2b9d63ca qa: include config help in admin socket test
Signed-off-by: John Spray <john.spray@redhat.com>
2017-08-01 13:38:40 +01:00
Patrick Donnelly
019f20ff98
Merge PR #16640 into master
* refs/remotes/upstream/pull/16640/head:
	qa: fix wait for wrong health message

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2017-07-28 09:55:49 -07:00
Patrick Donnelly
6fc2ee383f
Merge PR #16413 into master
* refs/remotes/upstream/pull/16413/head:
	qa/cephfs: lsof if umount fails

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2017-07-28 09:55:23 -07:00
Sage Weil
c3c2b31c87 Merge pull request #16568 from liewegas/wip-application-warn
qa,doc: document and fix tests for pool application warnings
2017-07-28 09:00:46 -05:00
Kefu Chai
75e361433d qa/run-standalone.sh: fix the find option to be compatible with GNU find
Also re-indent to be consistent with the rest of this script.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-07-28 14:22:02 +08:00
Kefu Chai
2a128f4829 Merge pull request #16599 from liewegas/wip-standalone-fixes
qa/workunits: adjust path to ceph-helpers.sh

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-07-28 13:18:47 +08:00
Patrick Donnelly
fb039383e9
Merge PR #16435 into master
* refs/remotes/upstream/pull/16435/head:
	qa: whitelist trim error during powercycle tests

Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-07-27 17:54:59 -07:00
Patrick Donnelly
ced01a2335
qa: fix wait for wrong health message
Fixes: http://tracker.ceph.com/issues/20805

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-07-27 14:40:05 -07:00
Sage Weil
41bcf2fee5 Merge pull request #16281 from badone/wip-PG-cluster-log-audit
osd: Log audit

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2017-07-27 16:25:30 -05:00
Sage Weil
862392fbf9 Merge pull request #16514 from liewegas/wip-20744
qa/tasks/ceph: wait for mgr to activate and pg stats to flush in health()

Reviewed-by: John Spray <john.spray@redhat.com>
2017-07-27 16:24:59 -05:00
Patrick Donnelly
d7f5af40a2
qa: whitelist trim error during powercycle tests
Fixes: http://tracker.ceph.com/issues/20566

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-07-27 13:24:21 -07:00
Sage Weil
541de391e1 Merge pull request #16572 from liewegas/wip-pidfile
test: add separate ceph-helpers-based smoke test

Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-07-27 12:32:36 -05:00
Ilya Dryomov
bd6e3e5f1f qa/tasks/rbd_fio: bump default fio version to 2.21
I'm seeing sporadic single thread deadlocks on fio stat_mutex during krbd
thrash runs:

  (gdb) info threads
    Id   Target Id         Frame
  * 1    Thread 0x7f89ee730740 (LWP 15604) 0x00007f89ed9f41bd in __lll_lock_wait () from /lib64/libpthread.so.0
  (gdb) bt
  #0  0x00007f89ed9f41bd in __lll_lock_wait () from /lib64/libpthread.so.0
  #1  0x00007f89ed9f17b2 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  #2  0x00000000004429b9 in fio_mutex_down (mutex=0x7f89ee72d000) at mutex.c:170
  #3  0x0000000000459704 in thread_main (data=<optimized out>) at backend.c:1639
  #4  0x000000000045b013 in fork_main (offset=0, shmid=<optimized out>, sk_out=0x0) at backend.c:1778
  #5  run_threads (sk_out=sk_out@entry=0x0) at backend.c:2195
  #6  0x000000000045b47f in fio_backend (sk_out=sk_out@entry=0x0) at backend.c:2400
  #7  0x000000000040cb0c in main (argc=2, argv=0x7fffad3e3888, envp=<optimized out>) at fio.c:63
  (gdb) up 2
  170                     pthread_cond_wait(&mutex->cond, &mutex->lock);
  (gdb) p mutex.lock.__data.__owner
  $1 = 15604

Upgrading to 2.21 seems to make these go away.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-27 18:57:43 +02:00
Sage Weil
c1aef68f02 Merge pull request #16569 from liewegas/wip-set-not-put
mon: 'config-key put' -> 'config-key set'

Reviewed-by: Joao Eduardo Luis <joao@suse.de>
2017-07-27 11:34:37 -05:00
Sage Weil
e469a8044c qa/standalone/crush/crush-classes: fix test
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-27 12:25:25 -04:00
Sage Weil
380de3395f qa/standalone/README
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-27 12:24:52 -04:00
Sage Weil
0b5036f072 qa/suites/rados/upgrade: fix upgrade wait for healthy
There is no mgr, so we can't call ceph.healthy.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-27 12:10:34 -04:00
Sage Weil
a40d94b163 qa/tasks/ceph: wait for pg stats to flush in healthy check
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-27 12:10:27 -04:00
Sage Weil
80978dea8a qa/tasks/ceph_manager: wait_for_all_up -> wait_for_all_osds_up
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-27 12:10:26 -04:00
Sage Weil
7648894e55 qa/tasks/ceph_manager: expose flush_all_pg_stats
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-27 12:10:26 -04:00
Sage Weil
c7430c56cd Merge pull request #16388 from xiexingguo/wip-class-misc-fixes
crush, mon: simplify device class manipulation commands

Reviewed-by: Sage Weil <sage@redhat.com>
2017-07-27 11:04:33 -05:00
Sage Weil
203c68ad55 Merge pull request #16575 from liewegas/wip-20693
qa/suites/rados: at-end: ignore PG_{AVAILABILITY,DEGRADED}

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2017-07-27 08:31:53 -05:00
Sage Weil
e398fd4ee4 qa/suites: more whitelisting
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-27 09:31:24 -04:00
Jason Dillaman
42fabc2e80 Merge pull request #16398 from dillaman/wip-20655
rbd-mirror: guard the deletion of non-primary images

Reviewed-by: Venky Shankar <vshankar@redhat.com>
2017-07-27 08:27:39 -04:00
David Zafman
e92c953d7b Merge pull request #16610 from dzafman/wip-fix-reg11184
test: reg11184 might not always find pg 2.0 prior to import

Reviewed-by: Sage Weil <sage@redhat.com>
2017-07-26 11:42:15 -07:00
Sage Weil
5534912daa qa/workunits/cephtool/test.sh: add some config-key tests
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-26 14:13:22 -04:00
Sage Weil
4eb1a518e3 mon: 'config-key put' -> 'config-key set'
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-26 14:10:08 -04:00
Sage Weil
ee06dc6996 Merge pull request #16530 from xiexingguo/wip-fix-pgtemp
mon: prime pg_temp and a few health warning fixes

Reviewed-by: Sage Weil <sage@redhat.com>
2017-07-26 13:09:33 -05:00
Sage Weil
59a3a4a40e Merge pull request #16559 from hjwsm1989/dump-stuck
qa/tasks/dump_stuck: fix dump_stuck test bug

Reviewed-by: Sage Weil <sage@redhat.com>
2017-07-26 11:59:21 -05:00
David Zafman
7c43840399 test: reg11184 might not always find pg 2.0 prior to import
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-07-26 09:46:15 -07:00
Sage Weil
56ffd7a727 Merge pull request #16571 from ceph/wip-cd-bluestore-2
qa/tasks/ceph-deploy: Fix bluestore options for ceph-deploy

Reviewed-by: Tamil Muthamizhan <tmuthami@redhat.com>
2017-07-26 11:43:50 -05:00
xie xingguo
076a6abd80 crush: kill 'class rename'
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-26 22:40:50 +08:00
xie xingguo
a27fd9d25c crush: kill "class create" command
Device classes are now managed automatically.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-26 22:40:17 +08:00
xie xingguo
edd8930346 crush: allow "crush class rm" to automatically recycle shadow tree(s)
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-26 22:39:41 +08:00
xie xingguo
9d908c14f6 crush: rm-device-class support
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-26 22:39:08 +08:00
xie xingguo
32fb548797 crush: guard set-device-class
If a device is already bound to a class,
do not allow its class to be changed silently.
Require the user to call rm-device-class first.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-26 22:34:08 +08:00
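The guard described in the commit above can be illustrated with a small model. This is a sketch only: the dictionary-based map and the names `set_device_class`, `rm_device_class`, and `DeviceClassError` are hypothetical stand-ins, not the CRUSH implementation, and allowing a repeat assignment of the same class is an assumption:

```python
class DeviceClassError(Exception):
    """Raised when a device's class would be changed silently."""


def set_device_class(device_classes, osd, new_class):
    """Bind `osd` to `new_class` unless it is already bound to another class.

    `device_classes` is a plain dict mapping device name -> class name
    (an illustrative model, not the real CRUSH data structure).
    """
    current = device_classes.get(osd)
    # Assumption: re-assigning the same class is a harmless no-op.
    if current is not None and current != new_class:
        raise DeviceClassError(
            f"{osd} is already bound to class '{current}'; "
            "remove it first")
    device_classes[osd] = new_class


def rm_device_class(device_classes, osd):
    """Unbind `osd` from its class, if it has one."""
    device_classes.pop(osd, None)
```

Changing a class then becomes an explicit two-step operation: remove the old binding, then set the new one.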
xie xingguo
e4e83a0dd7 crush: fix class_is_in_use()
A class is considered in use only if it is referenced by
an existing crush rule.

The patch also makes the output more human readable. For example:

./bin/ceph osd crush rule create-replicated myrule default host ssd
./bin/ceph osd crush class rm ssd
Error EBUSY: class 'ssd' still referenced by crush_rule 'myrule'

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-26 22:31:39 +08:00
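The in-use check from the commit above can be modelled roughly as follows. This is an illustrative sketch, not the C++ code in the patch: the dict of rules and the names `find_rule_using_class`, `class_rm`, and `ClassBusyError` are hypothetical. Only the wording of the error message follows the example output in the commit message:

```python
class ClassBusyError(Exception):
    """Raised when a class is still referenced by a rule."""


def find_rule_using_class(class_name, rules):
    """Return the name of the first rule that references `class_name`.

    `rules` is a dict mapping rule name -> device class (or None);
    returns None when no rule references the class.
    """
    for rule_name, rule_class in rules.items():
        if rule_class == class_name:
            return rule_name
    return None


def class_rm(class_name, classes, rules):
    """Remove `class_name` from the set `classes` unless a rule uses it."""
    rule = find_rule_using_class(class_name, rules)
    if rule is not None:
        raise ClassBusyError(
            f"class '{class_name}' still referenced by crush_rule '{rule}'")
    classes.discard(class_name)
```

Returning the name of the referencing rule, rather than a bare boolean, is what lets the error message tell the user which rule to remove first.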
xie xingguo
f3a3180cca crush: rebuild shadow tree on "crush create-or-move/move"
This patch solves the problem below:

./bin/ceph osd crush move osd.0 root=foo rack=foo-rack host=foo-host
moved item id 0 name 'osd.0' to location {host=foo-host,rack=foo-rack,root=foo} in crush map

 ./bin/ceph osd crush rule create-replicated foo-rule foo host ssd
Error EINVAL: root foo has no devices with class ssd

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-26 22:30:59 +08:00
xie xingguo
10bf2a633f crush: fix "crush create-or-move/move" would drop osd's class
Was:
     ./bin/ceph osd tree
    ID CLASS WEIGHT  TYPE NAME                                        UP/DOWN REWEIGHT PRI-AFF
    -1       3.00000 root default
    -2       3.00000     host gitbuilder-ceph-rpm-centos7-amd64-basic
     0   ssd 1.00000         osd.0                                         up  1.00000 1.00000
     1   ssd 1.00000         osd.1                                         up  1.00000 1.00000
     2   ssd 1.00000         osd.2                                         up  1.00000 1.00000

    ./bin/ceph osd crush move osd.0 root=foo rack=foo-rack  host=foo-host
    moved item id 0 name 'osd.0' to location {host=foo-host,rack=foo-rack,root=foo} in crush map

     ./bin/ceph osd tree
    ID CLASS WEIGHT  TYPE NAME                                        UP/DOWN REWEIGHT PRI-AFF
    -7       1.00000 root foo
    -6       1.00000     rack foo-rack
    -5       1.00000         host foo-host
     0       1.00000             osd.0                                     up  1.00000 1.00000
    -1       2.00000 root default
    -2       2.00000     host gitbuilder-ceph-rpm-centos7-amd64-basic
     1   ssd 1.00000         osd.1                                         up  1.00000 1.00000
     2   ssd 1.00000         osd.2                                         up  1.00000 1.00000

Now:
    ./bin/ceph osd tree
    ID CLASS WEIGHT  TYPE NAME                                        UP/DOWN REWEIGHT PRI-AFF
    -1       3.00000 root default
    -2       3.00000     host gitbuilder-ceph-rpm-centos7-amd64-basic
     0   ssd 1.00000         osd.0                                         up  1.00000 1.00000
     1   ssd 1.00000         osd.1                                         up  1.00000 1.00000
     2   ssd 1.00000         osd.2                                         up  1.00000 1.00000

    ./bin/ceph osd crush move osd.0 root=foo rack=foo-rack  host=foo-host
    moved item id 0 name 'osd.0' to location {host=foo-host,rack=foo-rack,root=foo} in crush map

    ./bin/ceph osd tree
    ID CLASS WEIGHT  TYPE NAME                                        UP/DOWN REWEIGHT PRI-AFF
    -7       1.00000 root foo
    -6       1.00000     rack foo-rack
    -5       1.00000         host foo-host
     0   ssd 1.00000             osd.0                                     up  1.00000 1.00000
    -1       2.00000 root default
    -2       2.00000     host gitbuilder-ceph-rpm-centos7-amd64-basic
     1   ssd 1.00000         osd.1                                         up  1.00000 1.00000
     2   ssd 1.00000         osd.2                                         up  1.00000 1.00000

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-26 22:30:26 +08:00
Sage Weil
742005bd75 Merge pull request #16579 from liewegas/wip-fix-nonregression
qa/suites/rados/singleton/all/erasure-code-nonregression: fix typo

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Amik Kumar <amitkuma@redhat.com>
2017-07-26 08:46:43 -05:00
Sage Weil
c1bdd36d8f qa/workunits/erasure-code/encode-decode-nonregression: do not require git checkout
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-26 09:35:46 -04:00
Sage Weil
841f3bdf92 qa/workunits: adjust path to ceph-helpers.sh
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-26 08:08:01 -04:00
Willem Jan Withagen
ae88edd25d qa: make run-standalone work on FreeBSD
Signed-off-by: Willem Jan Withagen <wjw@digiware.nl>
2017-07-26 12:01:37 +02:00
Kefu Chai
d85a7889fd Merge pull request #16446 from xiexingguo/wip-destroyed
mon: show destroyed status in tree view; do not auto-out destroyed osds

Reviewed-by: Sage Weil <sage@redhat.com>
2017-07-26 17:15:53 +08:00
Brad Hubbard
f8acc53d82 osd: Log audit
Review current log messages for consistency, accuracy and necessity as
part of the usability initiative. First in a series.

Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
2017-07-26 17:34:28 +10:00
xie xingguo
96eb0a9887 mon/OSDMonitor: apply new 'destroyed' status to 'osd tree' filter
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-26 15:13:32 +08:00
Sage Weil
326019a466 qa/suites/rados: whitelist various tests
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-25 22:29:07 -04:00