Commit Graph

543 Commits

Author SHA1 Message Date
Kefu Chai
b2d7f4f4c7 qa/suites/rados/upgrade/jewel-x-singleton: tolerate sloppy past_intervals
See-also: d5d5d7d1
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-08-28 15:19:41 +08:00
Sage Weil
893b3ac6fa Merge pull request #17227 from liewegas/wip-jewel-x
qa/suites/upgrade/jewel-x/parallel: tolerate laggy mgr
2017-08-24 09:30:31 -05:00
Sage Weil
bf296018ff qa/suites/upgrade/jewel-x/parallel: tolerate laggy mgr
Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-24 10:30:01 -04:00
Sage Weil
d3632fd2f9 Merge pull request #17226 from liewegas/wip-jewel-x
qa/suites/upgrade/jewel-x/stress-split: tolerate sloppy past_intervals
2017-08-24 09:27:44 -05:00
Sage Weil
d5d5d7d1d2 qa/suites/upgrade/jewel-x/stress-split: tolerate sloppy past_intervals
This is harmless in general, esp during upgrade.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-24 10:23:22 -04:00
Sage Weil
4f1fca0483 Merge pull request #17203 from liewegas/wip-jewel-x
qa/suites/upgarde/jewel-x/parallel: tolerate mgr warning
2017-08-23 17:21:37 -05:00
Yuri Weinstein
304b492187 Initial check in luminous-x suite
Signed-off-by: Yuri Weinstein <yweinste@redhat.com>
2017-08-23 14:53:55 -07:00
Matt Benjamin
1e1731e663 Merge pull request #16612 from cbodley/wip-20668
rgw: fixes for multisite replication of encrypted objects
2017-08-23 15:57:02 -04:00
Sage Weil
5455f599b3 qa/suites/upgrade/jewel-x/parallel: tolerate OBJECT_MISPLACED
Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-23 14:24:00 -04:00
Sage Weil
2504ab1675 qa/suites/upgarde/jewel-x/parallel: tolerate mgr warning
Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-23 14:22:34 -04:00
Patrick Donnelly
75967dbfe7
Merge PR #17111 into master
* refs/remotes/upstream/pull/17111/head:
	qa: add health whitelist for kcephfs suite

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2017-08-23 06:49:11 -07:00
Casey Bodley
5e67c681f7 Merge pull request #16344 from rzarzynski/wip-rgwqa-tempest
rgw, qa: integrate Tempest to verify RadosGW's compliance with Swift API

Reviewed-by: Casey Bodley <cbodley@redhat.com>
2017-08-22 15:02:15 -04:00
Yan, Zheng
b10989209f qa: add health whitelist for kcephfs suite
Fixes: http://tracker.ceph.com/issues/20892
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2017-08-21 17:01:22 +08:00
Vasu Kulkarni
9cc00c5c1a Rename folders to fix task order
Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
2017-08-18 11:35:54 -07:00
Vasu Kulkarni
1041c803f1 use bluestore with dmcrypt option
Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
2017-08-18 11:09:50 -07:00
Vasu Kulkarni
f6de5d9f9e Add dmcrypt option
Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
2017-08-18 11:08:00 -07:00
Vasu Kulkarni
60d00e0ead Separate the main task from options
Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
2017-08-18 11:05:01 -07:00
Vasu Kulkarni
0395b84488 Catchup with recent changes with ceph-ansible
Adds osd_scenario and ceph_stable_release variables

Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
2017-08-18 10:47:22 -07:00
Casey Bodley
f27ebabe55 test/rgw: add kms encryption key for teuthology
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2017-08-16 12:22:31 -04:00
Kefu Chai
17d6e96a6f Merge pull request #16967 from liewegas/wip-upgrade-health
mon: fix legacy health checks in 'ceph status' during upgrade; fix jewel-x upgrade combo

Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-08-16 19:44:36 +08:00
Radoslaw Zarzynski
ed8a6b89e4 qa/suites/rgw/tempest: use fixed-1 cluster instead of fixed-2.
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2017-08-14 01:11:22 +00:00
Radoslaw Zarzynski
43a7399720 qa/tasks/rgw: make the frontend_prefix per-client configurable.
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2017-08-14 01:11:18 +00:00
Radoslaw Zarzynski
09db786581 qa/suites/rgw: move the Tempest testing to its dedicated sub-suite.
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2017-08-13 23:14:40 +00:00
Radoslaw Zarzynski
99e1d443a0 qa/suites/rgw: freeze the Tempest version for RGW testing.
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2017-08-13 23:14:40 +00:00
Radoslaw Zarzynski
afe1ad3010 qa, rgw: Keystone's instances can be now accessed via non-local network interfaces.
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2017-08-13 23:14:40 +00:00
Radoslaw Zarzynski
849f46f8cf qa/suites/rgw: integrate Tempest to verify Swift API compliance.
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2017-08-13 23:14:40 +00:00
Sage Weil
d69f0e120b qa/suites/rados/objectstore/objectstore: less debug
Saw an ENOSPC.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-13 14:41:43 -04:00
Sage Weil
41e5a85308 qa/suites/rados/verify/validater/valgrind: whitelist PG_
Peering might be slow due to valgrind.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-12 14:18:59 -04:00
Sage Weil
12007044b1 qa/suites/rados/multimon/tasks/mon_lock_with_skew: whitelist PG_
Default pool pgs not up because mons too broken for OSDs to peer.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-12 14:15:15 -04:00
Sage Weil
ad23d7dc1f qa/suites/rados/multimon: whitelist mgr down vs clock skew test
Clock skew might make us fail the mgr.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-11 13:42:02 -04:00
Sage Weil
dd2fb6c40b Merge pull request #16944 from liewegas/wip-kraken-x
mon/Elector: force election epoch bump on start

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
2017-08-10 11:12:43 -05:00
Sage Weil
c46bdf5efd Revert "qa/suites/upgrade/jewel-x/parallel: thrash layout"
This reverts commit 435777dbff.

This test combination is not yet stable.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-10 09:51:29 -04:00
Sage Weil
a0b9f37dbc qa/suites/upgrade/jewel-x/parallel: no loadgenbig
When we do the thrashing this leads to ENOSPC on smithi.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-10 09:44:17 -04:00
Sage Weil
435777dbff qa/suites/upgrade/jewel-x/parallel: thrash layout
We can't kill and restart osds because that will interfere with
the upgrade process.  We can, however, thrash the layout by
tweaking osd weights and so on.  This will exercise osd recovery
paths during the upgrade that aren't normally exercised (outside
of stress-split..which doesn't upgrade individual osds while they
are non-clean).

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-09 22:07:48 -04:00
Sage Weil
b61be07d45 qa/suites/upgrade/kraken-x/stress-split: more whitelisting
Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-09 13:58:55 -04:00
Sage Weil
1043fca076 Merge pull request #16923 from liewegas/wip-20738
qa/suites/rados/objectstore: logs
2017-08-09 12:45:29 -05:00
Sage Weil
bbd5fe354c qa/suites/upgarde/jewel-x/point-to-point-x: disable app warnings
Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-09 09:18:54 -04:00
Sage Weil
c8d60396c7 qa/suites/rados/objectstore: logs
Hunting http://tracker.ceph.com/issues/20738

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-08 18:07:18 -04:00
Sage Weil
bf29142b08 qa/suites/upgrade/kraken-x/stress-split*: whitelist
Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-07 21:36:58 -04:00
Sage Weil
2234a0ed11 qa/suites/upgrade/kraken-x/parallel: whitelist
Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-07 21:36:58 -04:00
Sage Weil
3e7d157871 qa/suites/upgrade/jewel-x/parallel: fix POOL_APP_NOT_ENABLED disable
This code runs on the mgr.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-07 15:12:10 -04:00
Sage Weil
387ad56a69 qa/clusters/fixed-[23]: 4 osds per node, not 3
Smithi have 4 nvme partitions available for use.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-07 13:36:05 -04:00
Sage Weil
b5fae9a9ca Merge pull request #16873 from liewegas/wip-4-nodes
qa/suites: change fixed-2.yaml users to get 4 openstack disks

Reviewed-by: Zack Cerza <zcerza@redhat.com>
2017-08-07 11:27:40 -05:00
Sage Weil
3ffca50824 Merge pull request #16864 from smithfarm/wip-big-openstack
qa: big: add openstack.yaml
2017-08-07 11:02:59 -05:00
Sage Weil
f683d2d374 qa/suites: change fixed-2.yaml users to get 4 openstack disks
Follow-up for 4203c4f887

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-07 11:56:33 -04:00
Sage Weil
a872c44be7 Merge pull request #16842 from liewegas/wip-more-ec-map-discon
qa/suites/rados/thrash: fix thrashing with ec vs map discon
2017-08-07 10:48:56 -05:00
Nathan Cutler
8bb3d8444f qa: big: add openstack.yaml
Signed-off-by: Nathan Cutler <ncutler@suse.com>
2017-08-07 12:07:36 +02:00
Sage Weil
ed2d984ad1 qa/suites/upgarde/jewel-x/parallel: more whitelisting
Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-06 10:04:14 -04:00
Sage Weil
58f15d2b98 qa/suites/upgrade/jewel-x/parallel: more whitelisting
Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-06 09:56:55 -04:00
Sage Weil
622e950e43 qa/suites/upgrade/*-x/parallel: whitelist more stuff
Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-06 09:56:55 -04:00
Sage Weil
2d260443f0 qa/suites/upgrade/*/parallel: disable POOL_APP_NOT_ENABLED
There is some other random workload running (that creates pools)
while we upgrade and wait for healthy.  Just disable the warning
for these tests.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-06 09:56:55 -04:00
Sage Weil
f4c2863999 qa/suites/upgrade/jewel-x/parallel: whitelist OSD_DOWN
We restart OSDs during the upgrade.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-06 09:56:55 -04:00
Sage Weil
6307e03c6d qa/suites/rados/thrash/workloads/cache-agent-big: m=2
...because we do the test_map_discontinuity thing.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-05 14:33:13 -04:00
Patrick Donnelly
04d8ba4b04
Merge PR #16833 into master
* refs/remotes/upstream/pull/16833/head:
	qa: whitelist expected MDS_CLIENT_OLDEST_TID warn
	qa: ignore insufficient standby during failover
	qa: fix read-only whitelist
	mds: MDS_DAMAGED to MDS_DAMAGE
	doc: remove duplicate CephFS health check doc
2017-08-04 20:26:09 -07:00
Patrick Donnelly
29e5f0a450
qa: whitelist expected MDS_CLIENT_OLDEST_TID warn
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-08-04 20:21:43 -07:00
Patrick Donnelly
06f53e4a82
qa: ignore insufficient standby during failover
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-08-04 20:14:59 -07:00
Patrick Donnelly
42cd1c7122
qa: fix read-only whitelist
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-08-04 20:14:48 -07:00
Sage Weil
62e51661e6 Merge branch 'wip-qa-rbd-health' of git://github.com/dillaman/ceph
# Conflicts:
#	qa/tasks/ceph.py
2017-08-04 15:07:22 -04:00
Sage Weil
ffd171fd46 Merge pull request #16820 from liewegas/wip-more-whitelist
qa/suites/rados: a bit more whitelisting

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2017-08-04 13:44:08 -05:00
Sage Weil
82cf3046de qa/suites/rados/basic/tasks/rados_python: POOL_APP_NOT_ENABLED
Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-04 13:39:13 -04:00
Sage Weil
9c7a653fee Merge pull request #16769 from liewegas/wip-20295-b
os/bluestore: allow multiple DeferredBatches in flight at once

Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-08-04 11:04:38 -05:00
Sage Weil
c8af364699 Merge pull request #16739 from liewegas/wip-multi-backfill-reject
qa/suites/rados/singleton-nomsgr/all/multi-backfill-reject: sleep longer
2017-08-04 08:41:06 -05:00
Sage Weil
1ae9ff173b qa/suites/rados/upgrade: ignore FS_DEGRADED from mds restart
Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-04 09:34:31 -04:00
Sage Weil
27a685f626 qa/suites/rados/monthrash: ignore MGR_DOWN
Heavily thrashing mons + mgr reconnect backoff may make us fail
to process the beacon.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-04 09:34:15 -04:00
Nathan Cutler
d919987caa tests: rbd: reproducer for rbd-on-EC issue
This introduces a new "rbd/singleton-bluestore" suite because creating an rbd
on an EC-backed datapool will fail on filestore.

References: http://tracker.ceph.com/issues/20295
Signed-off-by: Nathan Cutler <ncutler@suse.com>
2017-08-03 22:54:17 -04:00
Patrick Donnelly
9d348ad8c9
qa: add health whitelist for all fs sub-suites
Fixes: http://tracker.ceph.com/issues/20892

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-08-03 14:01:28 -07:00
Patrick Donnelly
60fa9714d4
Merge PR #16768 into master
* refs/remotes/upstream/pull/16768/head:
	qa: fix log whitelist string

Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-08-03 13:55:42 -07:00
Patrick Donnelly
66756c4f65
Merge PR #16292 into master
* refs/remotes/upstream/pull/16292/head:
	qa: use new hex rep of inode
	qa: fix whitelist error message
	mds: refine "Scrub error" cluster log message
	mds: polish clog messages
	doc: developer logging guidance

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2017-08-03 13:55:21 -07:00
Sage Weil
342607f4d5 Merge pull request #16749 from tchaikov/wip-restful-delete-key
mgr: handle "module.set_config(.., None)" correctly 

Reviewed-by: John Spray <john.spray@redhat.com>
2017-08-03 15:53:27 -05:00
Josh Durgin
b172642124 Merge pull request #16789 from liewegas/wip-ec-m-2
qa: avoid map-gap tests for k=2 m=1

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2017-08-03 11:20:13 -07:00
Sage Weil
ef21c9d7df qa/suites/rados/thrash-erasure-code: do not test map gap with m=1
We test EC profiles with m=1 here, and mapgap can lead to incomplete pgs
because it takes an osd down and waits for healthy.

Fixes: http://tracker.ceph.com/issues/20844
Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-03 14:13:02 -04:00
Sage Weil
f74d71f708 qa/suites/rados/thrash-erasure-coe-big/clsuter: 12 osds on 3 nodes not 4
smithi have 4 nvme partitions available, not 3.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-03 14:11:43 -04:00
Sage Weil
63221e21f5 qa/suites/rados/thrash-erasure-code-big: add k=4 m=2
Get better coverage for larger codes.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-03 14:10:36 -04:00
Sage Weil
e994b03335 qa/suites/rados/monthrash/worklaods/rados_api_tests: whitelist SMALLER_PGP_NUM
The rados/test.sh fiddles with pg_num.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-03 13:31:39 -04:00
Sage Weil
7c350180b1 qa/suites/rados/mgr/tasks/failover: whitelist
remote/smithi025/log/ceph.log.gz:2017-08-03 07:02:15.049074 mon.b mon.0 172.21.15.25:6789/0 197 : cluster [INF] Manager daemon x is unresponsive, replacing it with standby daemon y
remote/smithi025/log/ceph.log.gz:2017-08-03 07:03:10.078032 mon.b mon.0 172.21.15.25:6789/0 226 : cluster [WRN] Manager daemon x is unresponsive.  No standby daemons available.

x and y may be swapped, so whitelist the rest of the string.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-03 12:40:01 -04:00
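
The whitelisting idea in the commit above (match the message without the daemon names, since x and y may be swapped) can be sketched as follows. This is only an illustration: the pattern and the `is_whitelisted` helper are assumptions, not the actual whitelist entry added by the commit.

```python
import re

# Hypothetical pattern: the daemon name is wildcarded so that either
# "x" or "y" (or any other name) matches the rest of the message.
UNRESPONSIVE = re.compile(r"Manager daemon \S+ is unresponsive")


def is_whitelisted(line: str) -> bool:
    """Return True if the cluster log line matches the whitelist pattern."""
    return UNRESPONSIVE.search(line) is not None


# Messages taken from the commit body, once as logged and once with
# the daemon names swapped.
samples = [
    "cluster [INF] Manager daemon x is unresponsive, replacing it with standby daemon y",
    "cluster [INF] Manager daemon y is unresponsive, replacing it with standby daemon x",
    "cluster [WRN] Manager daemon x is unresponsive.  No standby daemons available.",
]
results = [is_whitelisted(s) for s in samples]
```

Matching on the stable part of the message keeps the entry valid regardless of which daemon happens to be active when the failover test runs.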
Jason Dillaman
c2b451e8cb qa: fix RBD-related POOL_APP_NOT_ENABLED health warnings
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2017-08-03 09:50:41 -04:00
Patrick Donnelly
d4ed085238
Merge PR #16713 into master
* refs/remotes/upstream/pull/16713/head:
	qa: ignore failed MDS message during upgrade
2017-08-02 19:41:42 -07:00
Patrick Donnelly
7f04d88af8
qa: fix whitelist error message
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-08-02 16:52:30 -07:00
Patrick Donnelly
8e975a6347
qa: fix log whitelist string
Fixes: http://tracker.ceph.com/issues/20889

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-08-02 16:32:19 -07:00
Sage Weil
5085dc1164 qa/suites/powercycle: whitelist health for thrashing
Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-02 11:06:43 -04:00
Kefu Chai
da1a60ced1 qa: refactor suites/rados/rest/mgr-restful
- use "ceph restful restart" to restart the restful API server instead
of restarting the ceph-mgr
- test "ceph restful delete-key"
- test "ceph restful list-keys"

Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-08-02 18:20:56 +08:00
Kefu Chai
1ff1f836da Merge pull request #16722 from tchaikov/wip-qa-fixes
qa/suites: escape the parenthesis of the whitelist text

Reviewed-by: Sage Weil <sage@redhat.com>
2017-08-02 13:00:01 +08:00
Kefu Chai
a70be4e00c qa/suites: more whitelisting
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-08-02 10:00:57 +08:00
Sage Weil
c955bf528f qa/suites/rados/singleton-nomsgr/all/multi-backfill-reject: sleep longer
I saw a failure where the 30% backfill probability was enough that we
just didn't manage to backfill all of the pgs during the 5 minute recovery
timeout during ceph.py shutdown.  Build in some additional time for the
test to recover.

http://pulpito.ceph.com/sage-2017-08-01_15:32:10-rados-wip-sage-testing-distro-basic-smithi/1469184

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-01 15:50:47 -04:00
Kefu Chai
d12c51ca91 qa/suites: escape the parenthesis of the whitelist text
so we can avoid the warnings like

grep: Unmatched ( or \(

because we pass the whitelisted string to `egrep -v "$1"` directly.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-08-01 21:54:44 +08:00
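
The failure mode described in the commit above can be reproduced in a small sketch. It uses Python's `re` module as a stand-in for `egrep`, which is an assumption: the two regex dialects differ, but both reject an unmatched opening parenthesis. The `filter_log` helper and the sample whitelist entry `(PG_` are hypothetical.

```python
import re


def compiles(pattern: str) -> bool:
    """Return True if the pattern is a valid regular expression."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


def filter_log(lines, whitelist):
    """Drop lines matching any whitelist pattern, similar to `egrep -v`."""
    patterns = [re.compile(p) for p in whitelist]
    return [ln for ln in lines if not any(p.search(ln) for p in patterns)]


raw_entry = "(PG_"                     # unescaped: unmatched "("
escaped_entry = re.escape(raw_entry)   # parenthesis escaped, matches literally

log = [
    "Health check failed: Degraded data redundancy (PG_DEGRADED)",
    "Health check failed: 1 osds down (OSD_DOWN)",
]
remaining = filter_log(log, [escaped_entry])
```

Here `compiles(raw_entry)` is false while `compiles(escaped_entry)` is true, and only the OSD_DOWN line remains after filtering.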
John Spray
ac2b9d63ca qa: include config help in admin socket test
Signed-off-by: John Spray <john.spray@redhat.com>
2017-08-01 13:38:40 +01:00
Patrick Donnelly
5e5ff5c086
qa: ignore failed MDS message during upgrade
The cluster is expected to become degraded during reboot.

Fixes: http://tracker.ceph.com/issues/20731
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-07-31 14:45:07 -07:00
Sage Weil
c3c2b31c87 Merge pull request #16568 from liewegas/wip-application-warn
qa,doc: document and fix tests for pool application warnings
2017-07-28 09:00:46 -05:00
Patrick Donnelly
fb039383e9
Merge PR #16435 into master
* refs/remotes/upstream/pull/16435/head:
	qa: whitelist trim error during powercycle tests

Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-07-27 17:54:59 -07:00
Sage Weil
41bcf2fee5 Merge pull request #16281 from badone/wip-PG-cluster-log-audit
osd: Log audit

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2017-07-27 16:25:30 -05:00
Sage Weil
862392fbf9 Merge pull request #16514 from liewegas/wip-20744
qa/tasks/ceph: wait for mgr to activate and pg stats to flush in health()

Reviewed-by: John Spray <john.spray@redhat.com>
2017-07-27 16:24:59 -05:00
Patrick Donnelly
d7f5af40a2
qa: whitelist trim error during powercycle tests
Fixes: http://tracker.ceph.com/issues/20566

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-07-27 13:24:21 -07:00
Sage Weil
0b5036f072 qa/suites/rados/upgrade: fix upgrade wait for healthy
There is no mgr, so we can't call ceph.healthy.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-27 12:10:34 -04:00
Sage Weil
203c68ad55 Merge pull request #16575 from liewegas/wip-20693
qa/suites/rados: at-end: ignore PG_{AVAILABILITY,DEGRADED}

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2017-07-27 08:31:53 -05:00
Sage Weil
e398fd4ee4 qa/suites: more whitelisting
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-27 09:31:24 -04:00
Sage Weil
56ffd7a727 Merge pull request #16571 from ceph/wip-cd-bluestore-2
qa/tasks/ceph-deploy: Fix bluestore options for ceph-deploy

Reviewed-by: Tamil Muthamizhan <tmuthami@redhat.com>
2017-07-26 11:43:50 -05:00
Brad Hubbard
f8acc53d82 osd: Log audit
Review current log messages for consistency, accuracy and necessity as
part of usability initiative. First in a series.

Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
2017-07-26 17:34:28 +10:00
Sage Weil
326019a466 qa/suites/rados: whitelist various tests
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-25 22:29:07 -04:00
Sage Weil
2ef8614f67 qa/suites/rados/singleton/all/erasure-code-nonregression: fix typo
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-25 22:26:43 -04:00
Sage Weil
3683cdf496 qa/suites/rados: at-end: ignore PG_{AVAILABILITY,DEGRADED}
With the peering deletes change, setting luminous sets the osdmap flag
which triggers a new peering interval.  That can lead to health warnings
about PG_AVAILABILITY or PG_DEGRADED.  Ignore those!

Fixes: http://tracker.ceph.com/issues/20693
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-25 18:29:07 -04:00
Vasu Kulkarni
45c6a9acc4 Add both filestore and bluestore options for tests
Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
2017-07-25 15:16:37 -07:00
Vasu Kulkarni
25c89804e4 bluestore config options for tests
Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
2017-07-25 12:26:11 -07:00
Vasu Kulkarni
12a1ceba6e Move ceph-deploy config options into its own folder
The old structure of link at top folder is pretty much outdated, the test
config option needs to be specific to cluster yaml.

Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
2017-07-25 12:26:11 -07:00
Sage Weil
766229b034 qa/standalone/scrub: separate scrub/repair tests from rest of osd/
They are slow.  Run them separately.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-24 22:11:50 -04:00
Sage Weil
71ea171604 qa: move ceph-helpers and misc src/test/*.sh tests to qa/standalone
- stop running via make check
- add teuthology yamls to run them
- disable ceph_objecstore_tool.py for now (too slow for make check, and
we can't use vstart in teuthology via a package install)
- drop cephtool tests since those are already covered by other teuthology
tests
- leave a handful of (fast!) ceph-helpers tests for make check for minimal
integration tests.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-24 22:11:49 -04:00
Sage Weil
02c2e853d3 Merge pull request #16509 from liewegas/wip-rgw-wait
qa/suits/rados/basic/tasks/rgw_snaps: wait for pools to be created

Reviewed-by: Casey Bodley <cbodley@redhat.com>
2017-07-24 11:55:54 -05:00
Sage Weil
29549e6834 Merge pull request #13723 from ovh/bp-forced-recovery
osd/PG: make prioritized recovery possible

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2017-07-24 09:01:03 -05:00
John Spray
343e1a4281 qa: update whitelist for "wrongly marked me down"
Signed-off-by: John Spray <john.spray@redhat.com>
2017-07-24 14:54:46 +01:00
Sage Weil
ecd1193ab9 qa/suites/rados/basic/tasks/rgw_snaps: wait for pools to be be created
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-22 18:54:46 -04:00
Sage Weil
9b4002b6b8 qa/suites/rados/basic/tasks/rgw_snaps: fix pool list
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-22 18:54:45 -04:00
Jason Dillaman
56614d0ee9 qa/suites/rbd: mirroring tests should use rbd cap profiles
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2017-07-21 14:30:18 -04:00
Sage Weil
cb084a55f6 Merge pull request #16453 from liewegas/wip-workloadgen
crush: enforce buckets-before-rules rule

Reviewed-by: Joao Eduardo Luis <joao@suse.de>
2017-07-21 11:01:22 -05:00
Joao Eduardo Luis
6f6fbe7870 qa: flush out monc's dropped msgs on msgr failure injection
We have a few open tickets regarding the mgr being down during suites
involving messenger failure injection. There are a few suspicions that
this may be related with the monclient, but we'll need more logs to
validate those suspicions and, more, to validate we're actually fixing
the issue.

Signed-off-by: Joao Eduardo Luis <joao@suse.de>
2017-07-21 15:29:21 +01:00
Kefu Chai
0193e38b3f Merge pull request #16028 from jcsp/wip-mgr-commands
mon: load mgr commands at runtime

Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-07-21 18:16:13 +08:00
Sage Weil
2e8413dede qa: remove workloadgen test
The CRUSH rule creation is busted (rules and buckets out of order), but
after I fix that it doesn't seem to run right anyway.  Remove it.
We get the mon thrasher coverage from rados/monthrash already; I don't
think this is adding meaningful coverage for the amount of effort it takes
to maintain.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-20 18:06:50 -04:00
Sage Weil
27e8d75f61 Merge pull request #16429 from liewegas/wip-jewel-x
qa/suites/upgrade/jewel-x: misc fixes for new health checks
2017-07-20 10:47:05 -05:00
Ilya Dryomov
67db89f6c2 Merge pull request #16428 from idryomov/wip-krbd-luminous-thrash
qa: thrash tests for backoff and upmap

Reviewed-by: Vasu Kulkarni <vasu@redhat.com>
2017-07-20 11:28:22 +02:00
Piotr Dałek
b0134cc7a8 qa: add force/cancel recovery/backfill to QA testing
This randomly issues pg force-recovery/force-backfill and
pg cancel-force-recovery/cancel-force-backfill during QA
testing. Disabled for upgrades from hammer, jewel and kraken.

Signed-off-by: Piotr Dałek <piotr.dalek@corp.ovh.com>
2017-07-20 09:35:55 +02:00
Jason Dillaman
fa90be842e test: enable pool applications for new pools
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2017-07-19 13:13:01 -04:00
John Spray
b28c300258 qa/doc: update for "mgr tell" no longer needed
Signed-off-by: John Spray <john.spray@redhat.com>
2017-07-19 08:58:40 -04:00
Ilya Dryomov
7e7f6cfe5c qa/suites/krbd: add luminous thrash tests
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-19 12:18:16 +02:00
Ilya Dryomov
0635c25e74 qa/suites/krbd: reorganize thrash tests
- factor out install and ceph into ceph/ceph.yaml
- pg_num thrashing + 20 minute health timeout for thrashosds
- common thrashosds-health.yaml whitelist
- drop iozone workload

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-19 12:18:16 +02:00
Ilya Dryomov
dac11877e2 qa/suites/krbd: heavier rbd_fio workload
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-19 12:18:16 +02:00
Sage Weil
7102de8761 qa/suites/upgrade/jewel-x/point-to-point: move set-require-min-compat-client
Do it after workload completes and all jewel clients go away.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-18 12:32:17 -04:00
Sage Weil
e2fdfc0b10 qa/suites/upgrade/jewel-x: link to thrashosds yaml
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-18 12:29:01 -04:00
Patrick Donnelly
39ad17a152
Merge PR 15979 into master
* refs/remotes/upstream/pull/15979/head:
	Ignore unmatched rstat errors from MDS during rebuild testing

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2017-07-17 22:33:31 -07:00
Sage Weil
6ffc677dc5 qa/suites/upgade/jewel-x/parallel: ignore FS_ and MDS_ errors during restart
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-17 15:25:38 -04:00
Kefu Chai
c142f25a60 Merge pull request #16346 from liewegas/wip-20602
mon: skip crush smoke test when running under valgrind

Reviewed-by: Joao Eduardo Luis <joao@suse.de>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-07-17 20:15:24 +08:00
Sage Weil
6e33ba0183 Merge pull request #16349 from liewegas/wip-vstart-bind
vstart.sh: bind restful, dashboard to ::, not 127.0.0.1

Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-07-16 21:24:53 -05:00
Sage Weil
f9433e488b qa/suites/rados/rest/mgr-restful: simplify
Use default port; don't bother setting bind addr.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-16 21:28:03 -04:00
Kefu Chai
c596bff584 qa/suites/ceph-disk: whitelist health warnings
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-07-15 11:27:02 +08:00
Sage Weil
960f00071f qa/suites: disable mon crush smoke test with valgrind
Valgrind runs itself on forked children, and does its cleanup when they
complete, and this is slow... slow enough that it frequently makes the
test time out.

Valgrind lets you ignore child *processes* that you exec, but I can't
find a way to skip forked children in the same address space.

Work around this by skipping this validation when running under valgrind.

Fixes: http://tracker.ceph.com/issues/20602
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-14 11:51:47 -04:00
Sage Weil
4fcfb8ca9b qa/suites/rados/singleton/all/reg11184: whitelist health warnings
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-12 18:39:24 -04:00
Sage Weil
bf6c075b7e qa/suites/fs: whitelist health warnings
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-12 12:52:03 -04:00
Sage Weil
8d711a5659 qa/suites/rgw/thrash: whitelist
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-12 12:52:03 -04:00
Sage Weil
3d268d6e83 qa/suites/rbd: whitelist health messages
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-12 12:52:03 -04:00
Sage Weil
93de19adcf qa: whitelist health warnings
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-12 12:52:03 -04:00
Sage Weil
63f97ddcf6 qa/suites/rados: whitelist health warnings
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-12 12:52:02 -04:00
Douglas Fuller
1cb02ee1eb Ignore unmatched rstat errors from MDS during rebuild testing
Fixes: http://tracker.ceph.com/issues/20441

Signed-off-by: Douglas Fuller <dfuller@redhat.com>
2017-07-12 11:40:46 -05:00
Sage Weil
f2b837578a Merge pull request #16244 from liewegas/wip-11793
qa/suites/rados/thrash/workload/*: enable rados.py cache tiering ops

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2017-07-11 13:01:42 -05:00
Sage Weil
2afbc60be7 qa/suites/: enable rados.py cache tiering ops
These weren't being exercised!

See http://tracker.ceph.com/issues/11793

Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-11 14:01:15 -04:00
Sage Weil
8b21c6b6fd Merge pull request #16027 from liewegas/wip-crush-rule-class
mon,crush: create crush rules using device classes for replicated and ec pools via cli

Reviewed-by: Loic Dachary <ldachary@redhat.com>
2017-07-08 21:34:13 -05:00
Sage Weil
0c79c4ffac Merge pull request #16228 from smithfarm/wip-rados-upgrade-2
tests: fix rados/upgrade/jewel-x-singleton and make workunit task handle repo URLs not ending in ".git"

Reviewed-by: Sage Weil <sage@redhat.com>
2017-07-08 21:32:36 -05:00
Sage Weil
4bc9f566d0 qa/suites/rados/upgrade: upgrade client.0 node too
Fixes: http://tracker.ceph.com/issues/20368
Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Nathan Cutler <ncutler@suse.com>
2017-07-08 18:56:09 +02:00
Sage Weil
e30b32bca4 qa/suites/rados/singleton/all/mon-auth-caps: more osds so we can go clean
and scrub

Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-07 17:39:22 -04:00
Patrick Donnelly
64c6079d69
Merge remote-tracking branch 'upstream/pull/15937/head' into master
* upstream/pull/15937/head:
  qa: remove unused quota config option

Reviewed-by: John Spray <jspray@redhat.com>
2017-07-06 21:38:45 -07:00
Sage Weil
dc7a2aaf7a erasure-code: ruleset-* -> crush-*
1) ruleset is an obsolete term, and
2) crush-{rule,failure-domain,...} is more descriptive.

Note that we are changing the names of the erasure code profile keys
from ruleset-* to crush-*.  We will update this on upgrade when the
luminous flag is set, but that means that during mon upgrade you cannot
create EC pools that use these fields.

When the upgrade completes (the user sets require_osd_release = luminous)
existing ec profiles are updated automatically.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-06 15:01:03 -04:00
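
The key rename described in the commit above can be illustrated with a minimal sketch. It assumes an erasure code profile is a plain mapping of string keys to string values; the `rename_profile_keys` helper and the sample profile are hypothetical and are not the monitor's actual upgrade code.

```python
OLD_PREFIX = "ruleset-"
NEW_PREFIX = "crush-"


def rename_profile_keys(profile: dict) -> dict:
    """Return a copy of the profile with ruleset-* keys renamed to crush-*."""
    renamed = {}
    for key, value in profile.items():
        if key.startswith(OLD_PREFIX):
            # Swap the obsolete prefix for the new one; keep the suffix.
            key = NEW_PREFIX + key[len(OLD_PREFIX):]
        renamed[key] = value
    return renamed


# Hypothetical pre-luminous profile.
old_profile = {"k": "2", "m": "1", "ruleset-failure-domain": "host"}
new_profile = rename_profile_keys(old_profile)
```

Keys without the old prefix, such as `k` and `m`, pass through unchanged.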
Mykola Golub
2a9f56f818 Merge pull request #15860 from dillaman/wip-20168
librbd: fail IO request when exclusive lock cannot be obtained

Reviewed-by: Mykola Golub <mgolub@mirantis.com>
2017-07-05 14:52:55 +03:00
Kefu Chai
04e0ef541d Merge pull request #15754 from tchaikov/wip-test-auth-caps
qa/suites: add test exercising workunits/mon/auth_caps.sh

Reviewed By: Brad Hubbard <bhubbard@redhat.com>
Reviewed-by: Joao Eduardo Luis <joao@suse.de>
2017-07-05 15:05:21 +08:00
Mykola Golub
866cf72440 Merge pull request #15956 from dillaman/wip-librbd-devstack
test: fix failing rbd devstack teuthology test

Reviewed-by: Mykola Golub <mgolub@mirantis.com>
2017-07-01 15:02:29 +03:00