2585 Commits

Author SHA1 Message Date
John Spray
ac2b9d63ca qa: include config help in admin socket test
Signed-off-by: John Spray <john.spray@redhat.com>
2017-08-01 13:38:40 +01:00
Patrick Donnelly
8db2c43e79
qa: test export_pin is correct in dumped subtree
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-07-31 15:33:49 -07:00
Patrick Donnelly
5e5ff5c086
qa: ignore failed MDS message during upgrade
The cluster is expected to become degraded during reboot.

Fixes: http://tracker.ceph.com/issues/20731
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-07-31 14:45:07 -07:00
Patrick Donnelly
019f20ff98
Merge PR #16640 into master
* refs/remotes/upstream/pull/16640/head:
	qa: fix wait for wrong health message

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2017-07-28 09:55:49 -07:00
Patrick Donnelly
6fc2ee383f
Merge PR #16413 into master
* refs/remotes/upstream/pull/16413/head:
	qa/cephfs: lsof if umount fails

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2017-07-28 09:55:23 -07:00
Sage Weil
c3c2b31c87 Merge pull request #16568 from liewegas/wip-application-warn
qa,doc: document and fix tests for pool application warnings
2017-07-28 09:00:46 -05:00
Kefu Chai
75e361433d qa/run-standalone.sh: fix the find option to be compatible with GNU find
also re-indent to be consistent with the rest of this script

Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-07-28 14:22:02 +08:00
Kefu Chai
2a128f4829 Merge pull request #16599 from liewegas/wip-standalone-fixes
qa/workunits: adjust path to ceph-helpers.sh

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-07-28 13:18:47 +08:00
Patrick Donnelly
fb039383e9
Merge PR #16435 into master
* refs/remotes/upstream/pull/16435/head:
	qa: whitelist trim error during powercycle tests

Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-07-27 17:54:59 -07:00
Patrick Donnelly
ced01a2335
qa: fix wait for wrong health message
Fixes: http://tracker.ceph.com/issues/20805

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-07-27 14:40:05 -07:00
Sage Weil
41bcf2fee5 Merge pull request #16281 from badone/wip-PG-cluster-log-audit
osd: Log audit

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2017-07-27 16:25:30 -05:00
Sage Weil
862392fbf9 Merge pull request #16514 from liewegas/wip-20744
qa/tasks/ceph: wait for mgr to activate and pg stats to flush in health()

Reviewed-by: John Spray <john.spray@redhat.com>
2017-07-27 16:24:59 -05:00
Patrick Donnelly
d7f5af40a2
qa: whitelist trim error during powercycle tests
Fixes: http://tracker.ceph.com/issues/20566

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-07-27 13:24:21 -07:00
Sage Weil
541de391e1 Merge pull request #16572 from liewegas/wip-pidfile
test: add separate ceph-helpers-based smoke test

Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-07-27 12:32:36 -05:00
Ilya Dryomov
bd6e3e5f1f qa/tasks/rbd_fio: bump default fio version to 2.21
I'm seeing sporadic single thread deadlocks on fio stat_mutex during krbd
thrash runs:

  (gdb) info threads
    Id   Target Id         Frame
  * 1    Thread 0x7f89ee730740 (LWP 15604) 0x00007f89ed9f41bd in __lll_lock_wait () from /lib64/libpthread.so.0
  (gdb) bt
  #0  0x00007f89ed9f41bd in __lll_lock_wait () from /lib64/libpthread.so.0
  #1  0x00007f89ed9f17b2 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  #2  0x00000000004429b9 in fio_mutex_down (mutex=0x7f89ee72d000) at mutex.c:170
  #3  0x0000000000459704 in thread_main (data=<optimized out>) at backend.c:1639
  #4  0x000000000045b013 in fork_main (offset=0, shmid=<optimized out>, sk_out=0x0) at backend.c:1778
  #5  run_threads (sk_out=sk_out@entry=0x0) at backend.c:2195
  #6  0x000000000045b47f in fio_backend (sk_out=sk_out@entry=0x0) at backend.c:2400
  #7  0x000000000040cb0c in main (argc=2, argv=0x7fffad3e3888, envp=<optimized out>) at fio.c:63
  (gdb) up 2
  170                     pthread_cond_wait(&mutex->cond, &mutex->lock);
  (gdb) p mutex.lock.__data.__owner
  $1 = 15604

Upgrading to 2.21 seems to make these go away.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-27 18:57:43 +02:00
Sage Weil
c1aef68f02 Merge pull request #16569 from liewegas/wip-set-not-put
mon: 'config-key put' -> 'config-key set'

Reviewed-by: Joao Eduardo Luis <joao@suse.de>
2017-07-27 11:34:37 -05:00
Sage Weil
e469a8044c qa/standalone/crush/crush-classes: fix test
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-27 12:25:25 -04:00
Sage Weil
380de3395f qa/standalone/README
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-27 12:24:52 -04:00
Sage Weil
0b5036f072 qa/suites/rados/upgrade: fix upgrade wait for healthy
There is no mgr, so we can't call ceph.healthy.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-27 12:10:34 -04:00
Sage Weil
a40d94b163 qa/tasks/ceph: wait for pg stats to flush in healthy check
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-27 12:10:27 -04:00
Sage Weil
80978dea8a qa/tasks/ceph_manager: wait_for_all_up -> wait_for_all_osds_up
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-27 12:10:26 -04:00
Sage Weil
7648894e55 qa/tasks/ceph_manager: expose flush_all_pg_stats
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-27 12:10:26 -04:00
Sage Weil
c7430c56cd Merge pull request #16388 from xiexingguo/wip-class-misc-fixes
crush, mon: simplify device class manipulation commands

Reviewed-by: Sage Weil <sage@redhat.com>
2017-07-27 11:04:33 -05:00
Sage Weil
203c68ad55 Merge pull request #16575 from liewegas/wip-20693
qa/suites/rados: at-end: ignore PG_{AVAILABILITY,DEGRADED}

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2017-07-27 08:31:53 -05:00
Sage Weil
e398fd4ee4 qa/suites: more whitelisting
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-27 09:31:24 -04:00
Jason Dillaman
42fabc2e80 Merge pull request #16398 from dillaman/wip-20655
rbd-mirror: guard the deletion of non-primary images

Reviewed-by: Venky Shankar <vshankar@redhat.com>
2017-07-27 08:27:39 -04:00
David Zafman
e92c953d7b Merge pull request #16610 from dzafman/wip-fix-reg11184
test: reg11184 might not always find pg 2.0 prior to import

Reviewed-by: Sage Weil <sage@redhat.com>
2017-07-26 11:42:15 -07:00
Sage Weil
5534912daa qa/workunits/cephtool/test.sh: add some config-key tests
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-26 14:13:22 -04:00
Sage Weil
4eb1a518e3 mon: 'config-key put' -> 'config-key set'
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-26 14:10:08 -04:00
Sage Weil
ee06dc6996 Merge pull request #16530 from xiexingguo/wip-fix-pgtemp
mon: prime pg_temp and a few health warning fixes

Reviewed-by: Sage Weil <sage@redhat.com>
2017-07-26 13:09:33 -05:00
Sage Weil
59a3a4a40e Merge pull request #16559 from hjwsm1989/dump-stuck
qa/tasks/dump_stuck: fix dump_stuck test bug

Reviewed-by: Sage Weil <sage@redhat.com>
2017-07-26 11:59:21 -05:00
David Zafman
7c43840399 test: reg11184 might not always find pg 2.0 prior to import
Signed-off-by: David Zafman <dzafman@redhat.com>
2017-07-26 09:46:15 -07:00
Sage Weil
56ffd7a727 Merge pull request #16571 from ceph/wip-cd-bluestore-2
qa/tasks/ceph-deploy: Fix bluestore options for ceph-deploy

Reviewed-by: Tamil Muthamizhan <tmuthami@redhat.com>
2017-07-26 11:43:50 -05:00
xie xingguo
076a6abd80 crush: kill 'class rename'
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-26 22:40:50 +08:00
xie xingguo
a27fd9d25c crush: kill "class create" command
Device classes are now managed automatically.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-26 22:40:17 +08:00
xie xingguo
edd8930346 crush: allow "crush class rm" to automatically recycle shadow tree(s)
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-26 22:39:41 +08:00
xie xingguo
9d908c14f6 crush: rm-device-class support
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-26 22:39:08 +08:00
xie xingguo
32fb548797 crush: guard set-device-class
If a device is already bound to a class,
do not allow its class to be changed silently.
Require the user to call rm-device-class first.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-26 22:34:08 +08:00
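The guard described in 32fb548797 can be illustrated with a toy model. This is a minimal Python sketch, not Ceph's implementation: the dictionary of bindings, the function names, the `CrushError` type and the choice of EBUSY are all assumptions made for illustration.

```python
import errno


class CrushError(Exception):
    """Hypothetical error type carrying an errno-style code."""

    def __init__(self, code, message):
        super().__init__(message)
        self.code = code


def set_device_class(classes, osd, new_class):
    """Bind osd to new_class unless it is already bound to a different class."""
    current = classes.get(osd)
    if current is not None and current != new_class:
        # Refuse to change the class silently; the caller must remove it first.
        raise CrushError(
            errno.EBUSY,
            "%s is already bound to class '%s'; remove it first" % (osd, current),
        )
    classes[osd] = new_class


def rm_device_class(classes, osd):
    """Drop the class binding for osd, if any."""
    classes.pop(osd, None)
```

In this model, changing a class is always a two-step operation: `rm_device_class` followed by `set_device_class`.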
xie xingguo
e4e83a0dd7 crush: fix class_is_in_use()
A class can be considered as in-use only if it is referenced by
any of the existing crush rules.

The patch also makes the output more human readable. For example:

./bin/ceph osd crush rule create-replicated myrule default host ssd
./bin/ceph osd crush class rm ssd
Error EBUSY: class 'ssd' still referenced by crush_rule 'myrule'

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-26 22:31:39 +08:00
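The rule stated in e4e83a0dd7 (a class counts as in use only when a crush rule references it) can be sketched as follows. This is an illustrative Python model under stated assumptions: representing rules as a mapping from rule name to device class, and the helper name `rules_using_class`, are hypothetical and not taken from the Ceph source.

```python
def rules_using_class(class_name, rules):
    """Return the names of the rules that reference class_name.

    rules maps a rule name to the device class it takes devices from,
    or to None when the rule is not restricted to a class.
    """
    return sorted(name for name, cls in rules.items() if cls == class_name)


def class_is_in_use(class_name, rules):
    """A class is in use only if at least one rule references it."""
    return bool(rules_using_class(class_name, rules))
```

Returning the referencing rule names, rather than a bare boolean, is what allows an error message to name the rule that blocks removal.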
xie xingguo
f3a3180cca crush: rebuild shadow tree on "crush create-or-move/move"
This patch solves the problem below:

./bin/ceph osd crush move osd.0 root=foo rack=foo-rack host=foo-host
moved item id 0 name 'osd.0' to location {host=foo-host,rack=foo-rack,root=foo} in crush map

 ./bin/ceph osd crush rule create-replicated foo-rule foo host ssd
Error EINVAL: root foo has no devices with class ssd

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-26 22:30:59 +08:00
xie xingguo
10bf2a633f crush: fix "crush create-or-move/move" dropping the osd's class
Was:
     ./bin/ceph osd tree
    ID CLASS WEIGHT  TYPE NAME                                        UP/DOWN REWEIGHT PRI-AFF
    -1       3.00000 root default
    -2       3.00000     host gitbuilder-ceph-rpm-centos7-amd64-basic
     0   ssd 1.00000         osd.0                                         up  1.00000 1.00000
     1   ssd 1.00000         osd.1                                         up  1.00000 1.00000
     2   ssd 1.00000         osd.2                                         up  1.00000 1.00000

    ./bin/ceph osd crush move osd.0 root=foo rack=foo-rack  host=foo-host
    moved item id 0 name 'osd.0' to location {host=foo-host,rack=foo-rack,root=foo} in crush map

     ./bin/ceph osd tree
    ID CLASS WEIGHT  TYPE NAME                                        UP/DOWN REWEIGHT PRI-AFF
    -7       1.00000 root foo
    -6       1.00000     rack foo-rack
    -5       1.00000         host foo-host
     0       1.00000             osd.0                                     up  1.00000 1.00000
    -1       2.00000 root default
    -2       2.00000     host gitbuilder-ceph-rpm-centos7-amd64-basic
     1   ssd 1.00000         osd.1                                         up  1.00000 1.00000
     2   ssd 1.00000         osd.2                                         up  1.00000 1.00000

Now:
    ./bin/ceph osd tree
    ID CLASS WEIGHT  TYPE NAME                                        UP/DOWN REWEIGHT PRI-AFF
    -1       3.00000 root default
    -2       3.00000     host gitbuilder-ceph-rpm-centos7-amd64-basic
     0   ssd 1.00000         osd.0                                         up  1.00000 1.00000
     1   ssd 1.00000         osd.1                                         up  1.00000 1.00000
     2   ssd 1.00000         osd.2                                         up  1.00000 1.00000

    ./bin/ceph osd crush move osd.0 root=foo rack=foo-rack  host=foo-host
    moved item id 0 name 'osd.0' to location {host=foo-host,rack=foo-rack,root=foo} in crush map

    ./bin/ceph osd tree
    ID CLASS WEIGHT  TYPE NAME                                        UP/DOWN REWEIGHT PRI-AFF
    -7       1.00000 root foo
    -6       1.00000     rack foo-rack
    -5       1.00000         host foo-host
     0   ssd 1.00000             osd.0                                     up  1.00000 1.00000
    -1       2.00000 root default
    -2       2.00000     host gitbuilder-ceph-rpm-centos7-amd64-basic
     1   ssd 1.00000         osd.1                                         up  1.00000 1.00000
     2   ssd 1.00000         osd.2                                         up  1.00000 1.00000

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-26 22:30:26 +08:00
Sage Weil
742005bd75 Merge pull request #16579 from liewegas/wip-fix-nonregression
qa/suites/rados/singleton/all/erasure-code-nonregression: fix typo

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Amik Kumar <amitkuma@redhat.com>
2017-07-26 08:46:43 -05:00
Sage Weil
c1bdd36d8f qa/workunits/erasure-code/encode-decode-nonregression: do not require git checkout
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-26 09:35:46 -04:00
Sage Weil
841f3bdf92 qa/workunits: adjust path to ceph-helpers.sh
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-26 08:08:01 -04:00
Willem Jan Withagen
ae88edd25d qa: make run-standalone work on FreeBSD
Signed-off-by: Willem Jan Withagen <wjw@digiware.nl>
2017-07-26 12:01:37 +02:00
Kefu Chai
d85a7889fd Merge pull request #16446 from xiexingguo/wip-destroyed
mon: show destroyed status in tree view; do not auto-out destroyed osds

Reviewed-by: Sage Weil <sage@redhat.com>
2017-07-26 17:15:53 +08:00
Brad Hubbard
f8acc53d82 osd: Log audit
Review current log messages for consistency, accuracy and necessity as
part of the usability initiative. First in a series.

Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
2017-07-26 17:34:28 +10:00
xie xingguo
96eb0a9887 mon/OSDMonitor: apply new 'destroyed' status to 'osd tree' filter
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-26 15:13:32 +08:00
Sage Weil
326019a466 qa/suites/rados: whitelist various tests
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-25 22:29:07 -04:00
Sage Weil
2ef8614f67 qa/suites/rados/singleton/all/erasure-code-nonregression: fix typo
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-25 22:26:43 -04:00
Sage Weil
d2c31a8114 Merge pull request #16469 from xiexingguo/wip-fix-test
test: s/osd_objectstore_type/osd_objectstore
2017-07-25 21:04:22 -05:00
Sage Weil
3683cdf496 qa/suites/rados: at-end: ignore PG_{AVAILABILITY,DEGRADED}
With the peering deletes change, setting luminous sets the osdmap flag
which triggers a new peering interval.  That can lead to health warnings
about PG_AVAILABILITY or PG_DEGRADED.  Ignore those!

Fixes: http://tracker.ceph.com/issues/20693
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-25 18:29:07 -04:00
Vasu Kulkarni
45c6a9acc4 Add both filestore and bluestore options for tests
Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
2017-07-25 15:16:37 -07:00
Vasu Kulkarni
bdf6851fb0 Add ceph-deploy overrides options
Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
2017-07-25 15:10:38 -07:00
Vasu Kulkarni
25c89804e4 bluestore config options for tests
Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
2017-07-25 12:26:11 -07:00
Vasu Kulkarni
05cafd5011 Add bluestore overrides for ceph-deploy
ceph-deploy doesn't use the ceph overrides, so add the same overrides for ceph-deploy.

Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
2017-07-25 12:26:11 -07:00
Vasu Kulkarni
12a1ceba6e Move ceph-deploy config options into its own folder
The old structure, with a link in the top-level folder, is outdated; the test
config options need to be specific to the cluster yaml.

Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
2017-07-25 12:26:11 -07:00
Vasu Kulkarni
2fa0fae72f Add option to specify bluestore/filestore options
Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
2017-07-25 12:26:03 -07:00
Sage Weil
a264725b62 Merge pull request #16541 from liewegas/wip-20761
qa/workunits/cephtool/test.sh: disable 'fs status' until bug is fixed
2017-07-25 14:03:38 -05:00
Jason Dillaman
76fd882464 qa/workunits/rbd: rbd-mirror now treats no primary image as unknown state
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2017-07-25 07:17:15 -04:00
huangjun
daf8efee32 qa/tasks/dump_stuck: fix dump_stuck test bug
In a test cluster with 2 osds, stop osd.0. If osd.1 reports the pg stats
during pg peering, the mon records the pg state as 'peering'. When osd.1 is
then stopped, the pg state gets stuck in 'stale+peering', which is
unexpected.

Let's wait_for_active() after stopping osd.0.

Signed-off-by: huangjun <huangjun@xsky.com>
2017-07-25 11:14:07 +00:00
xie xingguo
450633b9e6 mon/OSDMonitor: ENOENT on removing non-existent app key
So we don't bother to trigger a pool update, which is potentially
a big operation.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-25 13:19:35 +08:00
xie xingguo
b4dcdecb6a mon/OSDMonitor: ENOENT on disabling non-existent app
So we don't bother to trigger a pool update, which is potentially
a big operation.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-25 13:19:29 +08:00
Sage Weil
7c157863a8 qa/run-standalone.sh: helper to run all standalone tests
Nothing fancy, but documents how these are run.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-24 22:11:50 -04:00
Sage Weil
766229b034 qa/standalone/scrub: separate scrub/repair tests from rest of osd/
They are slow.  Run them separately.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-24 22:11:50 -04:00
Sage Weil
cabad62242 qa/standalone/ceph-helpers: factor rbd pool create out of run_mon
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-24 22:11:50 -04:00
Sage Weil
b12bebe432 qa/standalone/mon/osd-pool-create: stop testing create pool output
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-24 22:11:49 -04:00
Sage Weil
71ea171604 qa: move ceph-helpers and misc src/test/*.sh tests to qa/standalone
- stop running via make check
- add teuthology yamls to run them
- disable ceph_objectstore_tool.py for now (too slow for make check, and
we can't use vstart in teuthology via a package install)
- drop cephtool tests since those are already covered by other teuthology
tests
- leave a handful of (fast!) ceph-helpers tests for make check for minimal
integration tests.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-24 22:11:49 -04:00
Alan Somers
821511bd32 openstack: Fix shebangs on openstack scripts
Many of the files in qa/qa_scripts/openstack had incorrect shebang
lines: the bang was missing.  This means that those scripts would
execute using the calling user's login shell, which is doubtless not
what the author intended.  Now they'll always use bash.

Two scripts do not need shebangs, because they contain only library
functions and don't execute anything.  I removed their shebangs.

Signed-off-by: Alan Somers <asomers@gmail.com>
2017-07-24 17:33:02 -06:00
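The problem fixed in 821511bd32 can be detected mechanically. The sketch below is a hypothetical Python check, assuming the broken lines looked like `#/bin/bash` (a `#` followed directly by a path); the commit message only says that the bang was missing, so that shape and the function name are assumptions.

```python
def shebang_status(first_line):
    """Classify the first line of a script.

    Returns "ok" for a real shebang, "missing-bang" for a line that looks
    like a shebang without the "!", and "none" otherwise.
    """
    if first_line.startswith("#!"):
        return "ok"
    if first_line.startswith("#/"):
        # An ordinary comment to the kernel, so the script is not run
        # by the interpreter named on this line.
        return "missing-bang"
    return "none"
```

A line classified as "missing-bang" is just a comment, which is why such scripts ended up running under whatever shell invoked them.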
Sage Weil
f347ef54c2 qa/workunits/cephtool/test.sh: disable 'fs status' until bug is fixed
See http://tracker.ceph.com/issues/20761
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-24 16:54:13 -04:00
Sage Weil
2e5955212d qa/tasks/workunit: allow alt basedir
Instead of 'qa/workunits' allow something like 'qa/standalone'.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-24 15:44:51 -04:00
Sage Weil
02c2e853d3 Merge pull request #16509 from liewegas/wip-rgw-wait
qa/suites/rados/basic/tasks/rgw_snaps: wait for pools to be created

Reviewed-by: Casey Bodley <cbodley@redhat.com>
2017-07-24 11:55:54 -05:00
Sage Weil
29549e6834 Merge pull request #13723 from ovh/bp-forced-recovery
osd/PG: make prioritized recovery possible

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2017-07-24 09:01:03 -05:00
John Spray
343e1a4281 qa: update whitelist for "wrongly marked me down"
Signed-off-by: John Spray <john.spray@redhat.com>
2017-07-24 14:54:46 +01:00
Sage Weil
fc8374b472 Merge pull request #16326 from liewegas/wip-weight-set
crush,mon: add weight-set introspection and manipulation commands

Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
2017-07-24 08:27:06 -05:00
Sage Weil
ecd1193ab9 qa/suites/rados/basic/tasks/rgw_snaps: wait for pools to be created
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-22 18:54:46 -04:00
Sage Weil
9b4002b6b8 qa/suites/rados/basic/tasks/rgw_snaps: fix pool list
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-22 18:54:45 -04:00
Sage Weil
08bdc2c867 Merge pull request #16500 from liewegas/wip-compact-sudo
qa/workunits/cephtool/test.sh: add sudo for daemon compact

Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-07-22 13:01:20 -05:00
xie xingguo
fa0e314cde test: s/osd_objectstore_type/osd_objectstore/
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-22 15:22:31 +08:00
Sage Weil
4e6487cad4 Merge pull request #15991 from dillaman/wip-rbd-auth-profile
mon,osd: new rbd-based cephx cap profiles

Reviewed-by: Sage Weil <sage@redhat.com>
2017-07-21 22:38:42 -05:00
Sage Weil
0429acda45 Merge pull request #16460 from liewegas/wip-mgr-metadata
mon: add mgr metadata commands, and overall 'versions' command for all daemon versions

Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-07-21 22:36:09 -05:00
Sage Weil
2f272ab451 qa/workunits/cephtool/test.sh: add sudo for daemon compact
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-21 23:18:03 -04:00
Patrick Donnelly
9506789ce1
Merge PR 16379 into master
* refs/remotes/upstream/pull/16379/head:
	qa: fix MDS_CLIENT_RECALL copy error

Reviewed-by: Zheng Yan <zyan@redhat.com>
2017-07-21 13:23:07 -07:00
Patrick Donnelly
23e3d40751
Merge PR 16226 into master
* refs/remotes/upstream/pull/16226/head:
	qa: wait for OSDMap to propagate for snap purge

Reviewed-by: Zheng Yan <zyan@redhat.com>
2017-07-21 13:22:47 -07:00
Jason Dillaman
44fa7ee788 qa/workunits/rbd: rbd-mirror tests should use 'mirror' user
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2017-07-21 14:30:18 -04:00
Jason Dillaman
56614d0ee9 qa/suites/rbd: mirroring tests should use rbd cap profiles
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2017-07-21 14:30:18 -04:00
Jason Dillaman
d32485ff37 qa/workunits/rbd: devstack test should use auth profiles
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2017-07-21 14:30:18 -04:00
Sage Weil
09b89ace82 qa/workunits/mon/crush_ops.sh: fix in-use rule removal test
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-21 13:50:57 -04:00
Sage Weil
fac1de8259 qa/workunits/mon/crush_ops: require luminous clients for test
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-21 13:50:57 -04:00
Sage Weil
70263dae67 mon: 'osd crush weight-set {ls,dump,create[-compat],rm[-compat],reweight[-compat]}' commands
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-21 13:50:52 -04:00
Kefu Chai
10b88b5d82 test: create asok files in a temp directory under $TMPDIR
to shorten the pathname of unix domain socket created for admin socket,
so it does not exceed the limit of 107 on GNU/Linux:

* ceph-helper.sh: the temp directory is named ${TMPDIR:-/tmp}/ceph-asok.$$
* vstart.sh: the temp directory is named `mktemp -u -d "${TMPDIR:-/tmp}/ceph-asok.XXXXXX"`

Fixes: http://tracker.ceph.com/issues/16895
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-07-22 01:05:29 +08:00
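The idea in 10b88b5d82 (keep admin socket paths short by placing them in a temp directory under $TMPDIR) can be sketched in Python. This is a minimal illustration, not the shell code from ceph-helpers.sh or vstart.sh: the function names are hypothetical, and the limit of 107 is taken from the commit message.

```python
import os
import tempfile

# Limit on the unix domain socket path length quoted in the commit message.
MAX_SOCKET_PATH = 107


def make_asok_dir():
    """Create a short-named temp directory, like ${TMPDIR:-/tmp}/ceph-asok.XXXXXX."""
    base = os.environ.get("TMPDIR") or "/tmp"
    return tempfile.mkdtemp(prefix="ceph-asok.", dir=base)


def fits_unix_socket(path, limit=MAX_SOCKET_PATH):
    """Check that the encoded path does not exceed the socket path limit."""
    return len(os.fsencode(path)) <= limit
```

A caller would join the directory with a socket name such as `osd.0.asok` and check the result with `fits_unix_socket` before binding.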
Sage Weil
cb084a55f6 Merge pull request #16453 from liewegas/wip-workloadgen
crush: enforce buckets-before-rules rule

Reviewed-by: Joao Eduardo Luis <joao@suse.de>
2017-07-21 11:01:22 -05:00
Sage Weil
75ac7d85da qa/workunits/cephtool/test.sh: add a few tests
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-21 11:25:05 -04:00
Joao Eduardo Luis
6f6fbe7870 qa: flush out monc's dropped msgs on msgr failure injection
We have a few open tickets regarding the mgr being down during suites
involving messenger failure injection. There are a few suspicions that
this may be related with the monclient, but we'll need more logs to
validate those suspicions and, more, to validate we're actually fixing
the issue.

Signed-off-by: Joao Eduardo Luis <joao@suse.de>
2017-07-21 15:29:21 +01:00
Jos Collin
fae6dc4786 Merge pull request #16430 from yuriw/wip_add_luminous
qa: Added luminous to the mix in schedule_subset.sh

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Jos Collin <jcollin@redhat.com>
2017-07-21 12:11:29 +00:00
Kefu Chai
4599eb7963 Merge pull request #16454 from liewegas/wip-fix-ceph-scrub
qa/tasks/ceph_manager: wait for osd to start after objectstore-tool sequence

Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-07-21 19:31:19 +08:00
Kefu Chai
0193e38b3f Merge pull request #16028 from jcsp/wip-mgr-commands
mon: load mgr commands at runtime

Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-07-21 18:16:13 +08:00
Sage Weil
6c4992aeca qa/workunits/cephtool/test.sh: fix test to watch audit channel
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-21 11:40:48 +08:00
Sage Weil
2e8413dede qa: remove workloadgen test
The CRUSH rule creation is busted (rules and buckets out of order), but
after I fix that it doesn't seem to run right anyway.  Remove it.
We get the mon thrasher coverage from rados/monthrash already; I don't
think this is adding meaningful coverage for the amount of effort it takes
to maintain.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-20 18:06:50 -04:00
Sage Weil
59e3827be7 qa/tasks/reg11184: import run
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-20 17:42:59 -04:00
Sage Weil
27e8d75f61 Merge pull request #16429 from liewegas/wip-jewel-x
qa/suites/upgrade/jewel-x: misc fixes for new health checks
2017-07-20 10:47:05 -05:00
Sage Weil
3de9f22ce0 Merge pull request #16423 from liewegas/wip-ls
mon: '* list' -> '* ls'

Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-07-20 10:43:34 -05:00
Kefu Chai
acc24bf0dc Merge pull request #16444 from tchaikov/wip-test-osd-stat
qa/workunits/cephtool/test.sh: "ceph osd stat" output changed, update accordingly

Reviewed-by: Willem Jan Withagen <wjw@digiware.nl>
2017-07-20 23:41:53 +08:00
Sage Weil
583a38bca2 qa/tasks/ceph_manager: wait for osd to start after objectstore-tool sequence
Fixes: http://tracker.ceph.com/issues/20705
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-20 11:41:36 -04:00
Kefu Chai
3dfa9daeca Merge pull request #16443 from wjwithagen/bug-wjw-qa-test-reorder
cephtool/test.sh: Only delete a test pool when no longer needed.

Reviewed-by: Willem Jan Withagen <wjw@digiware.nl>
Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-20 22:13:37 +08:00
Kefu Chai
a1d16185a2 qa/tasks/reg11184: use literal 'foo' instead of pool_name
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-07-20 21:35:41 +08:00
Kefu Chai
ba525a829c qa/workunits/cephtool/test.sh: "ceph osd stat" output changed, update test accordingly
Signed-off-by: Kefu Chai <kchai@redhat.com>
Signed-off-by: Willem Jan Withagen <wjw@digiware.nl>
2017-07-20 19:34:53 +08:00
Willem Jan Withagen
e3760fa936 cephtool/test.sh: Only delete a test pool when no longer needed.
the pool_getset pool is deleted before all tests on it are complete

4: /home/jenkins/workspace/ceph-master/qa/workunits/cephtool/test.sh:1990: test_mon_osd_pool_set:  ceph osd pool delete pool_getset pool_getset --yes-i-really-really-mean-it
4: pool 'pool_getset' removed
4: /home/jenkins/workspace/ceph-master/qa/workunits/cephtool/test.sh:1992: test_mon_osd_pool_set:  ceph osd pool get rbd crush_rule
4: /home/jenkins/workspace/ceph-master/qa/workunits/cephtool/test.sh:1992: test_mon_osd_pool_set:  grep 'crush_rule: '
4: crush_rule: replicated_rule
4: /home/jenkins/workspace/ceph-master/qa/workunits/cephtool/test.sh:1994: test_mon_osd_pool_set:  ceph -f json osd pool get pool_getset compression_mode
4: Error ENOENT: unrecognized pool 'pool_getset'

Signed-off-by: Willem Jan Withagen <wjw@digiware.nl>
2017-07-20 12:24:14 +02:00
Kefu Chai
aea471d73a Merge pull request #16403 from wjwithagen/bug-wjw-ceph-osd-stat
test: ceph osd stat output has changed, fix tests for that

Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-07-20 18:06:47 +08:00
Ilya Dryomov
67db89f6c2 Merge pull request #16428 from idryomov/wip-krbd-luminous-thrash
qa: thrash tests for backoff and upmap

Reviewed-by: Vasu Kulkarni <vasu@redhat.com>
2017-07-20 11:28:22 +02:00
Piotr Dałek
b0134cc7a8 qa: add force/cancel recovery/backfill to QA testing
This randomly issues pg force-recovery/force-backfill and
pg cancel-force-recovery/cancel-force-backfill during QA
testing. Disabled for upgrades from hammer, jewel and kraken.

Signed-off-by: Piotr Dałek <piotr.dalek@corp.ovh.com>
2017-07-20 09:35:55 +02:00
Jason Dillaman
836ab7ad95 test: skip pool application metadata tests if OSDs not at min luminous
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2017-07-19 13:13:01 -04:00
Jason Dillaman
fa90be842e test: enable pool applications for new pools
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2017-07-19 13:13:01 -04:00
Jason Dillaman
3514d6e53e mon: added new "osd pool application" commands
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2017-07-19 13:13:01 -04:00
Sage Weil
572a942f8f mon: 'auth list' -> 'auth ls'
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-19 12:33:14 -04:00
Yuri Weinstein
b865912b33 Added luminous to the mix
Signed-off-by: Yuri Weinstein <yweinste@redhat.com>
2017-07-19 09:03:23 -07:00
Willem Jan Withagen
4f49402589 qa/workunits/cephtool/test.sh: ceph osd stat output has changed, fix tests for that
The output of ceph osd stat has changed.
It used to print:

cluster b370a29d-9287-4ca3-ab57-3d824f65e339
 health HEALTH_OK
 monmap e1: 1 mons at {ceph1=10.0.0.8:6789/0}, election epoch 2, quorum 0 ceph1
 osdmap e63: 2 osds: 2 up, 2 in
  pgmap v41338: 952 pgs, 20 pools, 17130 MB data, 2199 objects
        115 GB used, 167 GB / 297 GB avail
             952 active+clean

but now the osdmap line has gone and thus this no longer works:
qa/workunits/cephtool/test.sh:1944:
old_pgs=$(ceph osd pool get $TEST_POOL_GETSET pg_num | sed -e 's/pg_num: //')
new_pgs=$(($old_pgs+$(ceph osd stat | grep osdmap | awk '{print $3}')*32))

4: qa/workunits/cephtool/test.sh: line 1945: 10+*32: syntax error: operand expected (error token is "*32")

 - Also parse the output as JSON, with jq, for better reliability

Signed-off-by: Willem Jan Withagen <wjw@digiware.nl>
2017-07-19 16:34:12 +02:00
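The arithmetic that broke in 4f49402589 (old pg count plus 32 per OSD) is sturdier when the OSD count comes from structured output instead of a grep on a text line. The following Python sketch shows the idea only; the `num_osds` field name and the sample document are assumptions for illustration, and the actual fix uses jq in the shell script.

```python
import json


def grown_pg_num(old_pgs, osd_stat_json, pgs_per_osd=32):
    """Compute old_pgs + num_osds * pgs_per_osd from JSON osd stat output.

    Raises KeyError if the expected field is absent, instead of silently
    producing an empty operand as the grep/awk pipeline did.
    """
    stat = json.loads(osd_stat_json)
    return old_pgs + int(stat["num_osds"]) * pgs_per_osd


# Hypothetical sample of JSON-formatted osd stat output.
SAMPLE = '{"epoch": 63, "num_osds": 2, "num_up_osds": 2, "num_in_osds": 2}'
```

With the sample above, `grown_pg_num(10, SAMPLE)` gives 10 + 2 * 32 = 74, and a missing field fails loudly rather than yielding `10+*32`.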
John Spray
b28c300258 qa/doc: update for "mgr tell" no longer needed
Signed-off-by: John Spray <john.spray@redhat.com>
2017-07-19 08:58:40 -04:00
Ilya Dryomov
7e7f6cfe5c qa/suites/krbd: add luminous thrash tests
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-19 12:18:16 +02:00
Ilya Dryomov
0635c25e74 qa/suites/krbd: reorganize thrash tests
- factor out install and ceph into ceph/ceph.yaml
- pg_num thrashing + 20 minute health timeout for thrashosds
- common thrashosds-health.yaml whitelist
- drop iozone workload

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-19 12:18:16 +02:00
Ilya Dryomov
dac11877e2 qa/suites/krbd: heavier rbd_fio workload
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-19 12:18:16 +02:00
Ilya Dryomov
682c5a42e1 qa/tasks/rbd_fio: dump fio options before starting
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-19 12:18:16 +02:00
Ilya Dryomov
03f69b3275 qa/tasks/rbd_fio: support libaio engine
Want to set iodepth and do direct AIO.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-19 12:18:16 +02:00
Yan, Zheng
b49d6d8ead qa/cephfs: lsof if umount fails
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2017-07-19 15:32:37 +08:00
Patrick Donnelly
f8e0571982
qa: fix MDS_CLIENT_RECALL copy error
Fixes: http://tracker.ceph.com/issues/20682

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-07-18 16:06:20 -07:00
Sage Weil
fd9582f085 Merge pull request #15432 from dachary/wip-osd-new
ceph-disk: support osd new

Reviewed-by: Alfredo Deza <adeza@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-07-18 13:12:51 -05:00
Sage Weil
7102de8761 qa/suites/upgrade/jewel-x/point-to-point: move set-require-min-compat-client
Do it after workload completes and all jewel clients go away.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-18 12:32:17 -04:00
Sage Weil
e2fdfc0b10 qa/suites/upgrade/jewel-x: link to thrashosds yaml
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-18 12:29:01 -04:00
Sage Weil
81ae434c7f Merge pull request #16359 from liewegas/wip-cli-stdinout
ceph: allow '-' with -i and -o for stdin/stdout

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Alfredo Deza <adeza@redhat.com>
2017-07-18 08:39:14 -05:00
Patrick Donnelly
5b1a229fca
Merge PR 16200 into master
* refs/remotes/upstream/pull/16200/head:
	qa: thrash max_mds and deactivate ranks

Reviewed-by: Zheng Yan <zyan@redhat.com>
2017-07-17 22:33:34 -07:00
Patrick Donnelly
39ad17a152
Merge PR 15979 into master
* refs/remotes/upstream/pull/15979/head:
	Ignore unmatched rstat errors from MDS during rebuild testing

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2017-07-17 22:33:31 -07:00
Patrick Donnelly
b94d1dc385
Merge PR 16288 into master
* refs/remotes/upstream/pull/16288/head:
	qa/cephfs: don't use int() to convert string of float point number

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2017-07-17 22:31:05 -07:00
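The pitfall named in the PR title above can be shown with a minimal sketch. The helper name `to_int` is hypothetical and not taken from the qa/cephfs code; the only facts assumed are the standard behaviours of Python's built-in `int()` and `float()`.

```python
def to_int(value):
    """Convert a numeric string such as "3" or "3.7" to an int.

    Hypothetical helper for illustration. It parses the string as a
    float first and then truncates toward zero, because int() refuses
    strings that contain a decimal point.
    """
    return int(float(value))


# int() on a string holding a floating-point number raises ValueError.
try:
    int("1.5")
    raised = False
except ValueError:
    raised = True
```

Whether truncation or rounding is wanted depends on the caller; `round(float(value))` would be the alternative if rounding is intended.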
Sage Weil
dfc9c36606 fix ceph.py
2017-07-17 16:27:13 -04:00
Sage Weil
6ffc677dc5 qa/suites/upgade/jewel-x/parallel: ignore FS_ and MDS_ errors during restart
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-17 15:25:38 -04:00
Sage Weil
c66da972df qa/tasks/ceph.py: create osds in order
We aren't passing id to legacy 'osd create', which means we have to go
in order!

Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-17 15:22:47 -04:00
Sage Weil
3a4931b0e4 ceph: allow '-' with -i and -o for stdin/stdout
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-17 09:38:52 -04:00
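The '-' convention mentioned in the commit above is a common command-line idiom. The following is a minimal sketch of that idiom in general, assuming text-mode I/O; the helper name `open_dash` is hypothetical and this is not the ceph CLI's actual implementation.

```python
import contextlib
import io
import sys


@contextlib.contextmanager
def open_dash(path, mode="r"):
    """Yield a stream for path, treating '-' as stdin or stdout.

    Hypothetical helper for illustration only. Standard streams are
    yielded without being closed; real files are closed on exit.
    """
    if path == "-":
        yield sys.stdin if "r" in mode else sys.stdout
    else:
        with open(path, mode) as f:
            yield f


# Usage: writing to '-' ends up on whatever sys.stdout currently is.
buf = io.StringIO()
with contextlib.redirect_stdout(buf):
    with open_dash("-", "w") as out:
        out.write("osdmap\n")
```

Leaving the standard streams open is deliberate in this sketch, since closing `sys.stdout` would break any later output from the same process.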
Ilya Dryomov
0f75d79c34 qa/tasks/rbd_fio: use teuthology.packaging for handling packages
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-17 15:32:51 +02:00
Kefu Chai
c142f25a60 Merge pull request #16346 from liewegas/wip-20602
mon: skip crush smoke test when running under valgrind

Reviewed-by: Joao Eduardo Luis <joao@suse.de>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-07-17 20:15:24 +08:00
Sage Weil
dd61a7f737 Merge pull request #16189 from bassam/pr-msgr-bind-addr
mon: add support public_bind_addr option

Reviewed-by: Sage Weil <sage@redhat.com>
2017-07-16 21:26:23 -05:00
Sage Weil
6e33ba0183 Merge pull request #16349 from liewegas/wip-vstart-bind
vstart.sh: bind restful, dashboard to ::, not 127.0.0.1

Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-07-16 21:24:53 -05:00
Sage Weil
f9433e488b qa/suites/rados/rest/mgr-restful: simplify
Use default port; don't bother setting bind addr.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-16 21:28:03 -04:00
Kefu Chai
c596bff584 qa/suites/ceph-disk: whitelist health warnings
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-07-15 11:27:02 +08:00
Kefu Chai
73c0740b08 tests: ceph-disk: use communicate() instead of wait() for output
to avoid a possible deadlock. Quoting the documentation of Popen.wait():

> This will deadlock when using stdout=PIPE and/or stderr=PIPE and the
> child process generates enough output to a pipe such that it blocks
> waiting for the OS pipe buffer to accept more data. Use communicate() to
> avoid that.

and print out the stdout and stderr using LOG.warn() if the command
fails.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-07-15 11:27:02 +08:00
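The change described in the commit above can be sketched as follows. The helper name `run_and_capture` and the child command are hypothetical and not taken from the ceph-disk tests; `subprocess.Popen`, `communicate()`, and `returncode` are standard-library features. The sketch uses `LOG.warning()`, the non-deprecated spelling of the `LOG.warn()` call named in the commit message.

```python
import logging
import subprocess
import sys

LOG = logging.getLogger(__name__)


def run_and_capture(args):
    """Run a command and return (returncode, stdout, stderr).

    Hypothetical helper for illustration; not the ceph-disk test code.
    """
    proc = subprocess.Popen(args, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    # communicate() reads both pipes until EOF and then reaps the child.
    # Calling proc.wait() here instead could hang once the child fills
    # the OS pipe buffer, because nothing would be draining it.
    out, err = proc.communicate()
    if proc.returncode != 0:
        LOG.warning("%s failed (rc=%d): stdout=%r stderr=%r",
                    args, proc.returncode, out, err)
    return proc.returncode, out, err


# A child that writes far more than a typical pipe buffer can hold.
child = [sys.executable, "-c", "import sys; sys.stdout.write('x' * 1000000)"]
rc, out, err = run_and_capture(child)
```

With `wait()` in place of `communicate()`, the same child would be likely to block on its write and never exit, which is the deadlock the quoted documentation warns about.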
Kefu Chai
0cc65197d6 Merge pull request #16045 from Liuchang0812/wip-compact-osd-feature
osd: compact osd feature

Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-07-15 10:57:13 +08:00
Sage Weil
d52763c2cc Merge pull request #16221 from liewegas/wip-20546
crush/CrushWrapper: make get_immediate_parent[_id] ignore per-class shadow hierarchy

Reviewed-by: Neha Ojha <nojha@redhat.com>
2017-07-14 15:09:22 -05:00
Loic Dachary
d199cedc8f tests: ceph-disk destroy needs --purge
The former semantic of ceph-disk destroy is now implemented with the
--purge flag. Use that for the ceph-disk suite.

Signed-off-by: Loic Dachary <loic@dachary.org>
2017-07-14 19:47:01 +02:00
Loic Dachary
91b9646f71 tests: count OSD ids in PGs {wait,get}_osd_id_used_by_pgs
Signed-off-by: Loic Dachary <loic@dachary.org>
2017-07-14 19:47:00 +02:00
Loic Dachary
1902a414f3 tests: ceph-helper uses ceph osd purge
Instead of removing each element related to an OSD individually.

Signed-off-by: Loic Dachary <loic@dachary.org>
2017-07-14 19:47:00 +02:00
Bassam Tabbara
a8da9fd077 test,qa/workunits: add tests for public_bind_addr
Add a set of new tests for the case when public_addr and public_bind_addr
are different for a mon. In order to test this properly I had to employ
port forwarding with socat. This helps simulate what would happen in an
environment like Kubernetes. socat is now a build dependency.

Also, moved jq_success to ceph-helpers.sh and refactored run_mon to enable
creating the mons without creating the rbd pool immediately.

Signed-off-by: Bassam Tabbara <bassam.tabbara@quantum.com>
2017-07-14 10:41:49 -07:00
Sage Weil
960f00071f qa/suites: disable mon crush smoke test with valgrind
Valgrind runs itself on forked children, and does its cleanup when they
complete, and this is slow... slow enough that it frequently makes the
test time out.

Valgrind lets you ignore child *processes* that you exec, but I can't
find a way to skip forked children in the same address space.

Work around this by skipping this validation when running under valgrind.

Fixes: http://tracker.ceph.com/issues/20602
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-14 11:51:47 -04:00