418 Commits

Author SHA1 Message Date
Kefu Chai
17d6e96a6f Merge pull request #16967 from liewegas/wip-upgrade-health
mon: fix legacy health checks in 'ceph status' during upgrade; fix jewel-x upgrade combo

Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-08-16 19:44:36 +08:00
Sage Weil
d69f0e120b qa/suites/rados/objectstore/objectstore: less debug
Saw an ENOSPC.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-13 14:41:43 -04:00
Sage Weil
41e5a85308 qa/suites/rados/verify/validater/valgrind: whitelist PG_
Peering might be slow due to valgrind.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-12 14:18:59 -04:00
Sage Weil
12007044b1 qa/suites/rados/multimon/tasks/mon_lock_with_skew: whitelist PG_
Default pool pgs not up because mons too broken for OSDs to peer.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-12 14:15:15 -04:00
Sage Weil
ad23d7dc1f qa/suites/rados/multimon: whitelist mgr down vs clock skew test
Clock skew might make us fail the mgr.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-11 13:42:02 -04:00
Sage Weil
dd2fb6c40b Merge pull request #16944 from liewegas/wip-kraken-x
mon/Elector: force election epoch bump on start

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
2017-08-10 11:12:43 -05:00
Sage Weil
c46bdf5efd Revert "qa/suites/upgrade/jewel-x/parallel: thrash layout"
This reverts commit 435777dbff.

This test combination is not yet stable.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-10 09:51:29 -04:00
Sage Weil
a0b9f37dbc qa/suites/upgrade/jewel-x/parallel: no loadgenbig
When we do the thrashing this leads to ENOSPC on smithi.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-10 09:44:17 -04:00
Sage Weil
435777dbff qa/suites/upgrade/jewel-x/parallel: thrash layout
We can't kill and restart osds because that will interfere with
the upgrade process.  We can, however, thrash the layout by
tweaking osd weights and so on.  This will exercise osd recovery
paths during the upgrade that aren't normally exercised (outside
of stress-split, which doesn't upgrade individual osds while they
are non-clean).

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-09 22:07:48 -04:00
Sage Weil
b61be07d45 qa/suites/upgrade/kraken-x/stress-split: more whitelisting
Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-09 13:58:55 -04:00
Sage Weil
1043fca076 Merge pull request #16923 from liewegas/wip-20738
qa/suites/rados/objectstore: logs
2017-08-09 12:45:29 -05:00
Sage Weil
bbd5fe354c qa/suites/upgrade/jewel-x/point-to-point-x: disable app warnings
Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-09 09:18:54 -04:00
Sage Weil
c8d60396c7 qa/suites/rados/objectstore: logs
Hunting http://tracker.ceph.com/issues/20738

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-08 18:07:18 -04:00
Sage Weil
bf29142b08 qa/suites/upgrade/kraken-x/stress-split*: whitelist
Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-07 21:36:58 -04:00
Sage Weil
2234a0ed11 qa/suites/upgrade/kraken-x/parallel: whitelist
Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-07 21:36:58 -04:00
Sage Weil
3e7d157871 qa/suites/upgrade/jewel-x/parallel: fix POOL_APP_NOT_ENABLED disable
This code runs on the mgr.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-07 15:12:10 -04:00
Sage Weil
387ad56a69 qa/clusters/fixed-[23]: 4 osds per node, not 3
Smithi have 4 nvme partitions available for use.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-07 13:36:05 -04:00
Sage Weil
b5fae9a9ca Merge pull request #16873 from liewegas/wip-4-nodes
qa/suites: change fixed-2.yaml users to get 4 openstack disks

Reviewed-by: Zack Cerza <zcerza@redhat.com>
2017-08-07 11:27:40 -05:00
Sage Weil
3ffca50824 Merge pull request #16864 from smithfarm/wip-big-openstack
qa: big: add openstack.yaml
2017-08-07 11:02:59 -05:00
Sage Weil
f683d2d374 qa/suites: change fixed-2.yaml users to get 4 openstack disks
Follow-up for 4203c4f887

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-07 11:56:33 -04:00
Sage Weil
a872c44be7 Merge pull request #16842 from liewegas/wip-more-ec-map-discon
qa/suites/rados/thrash: fix thrashing with ec vs map discon
2017-08-07 10:48:56 -05:00
Nathan Cutler
8bb3d8444f qa: big: add openstack.yaml
Signed-off-by: Nathan Cutler <ncutler@suse.com>
2017-08-07 12:07:36 +02:00
Sage Weil
ed2d984ad1 qa/suites/upgrade/jewel-x/parallel: more whitelisting
Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-06 10:04:14 -04:00
Sage Weil
58f15d2b98 qa/suites/upgrade/jewel-x/parallel: more whitelisting
Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-06 09:56:55 -04:00
Sage Weil
622e950e43 qa/suites/upgrade/*-x/parallel: whitelist more stuff
Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-06 09:56:55 -04:00
Sage Weil
2d260443f0 qa/suites/upgrade/*/parallel: disable POOL_APP_NOT_ENABLED
There is some other random workload running (that creates pools)
while we upgrade and wait for healthy.  Just disable the warning
for these tests.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-06 09:56:55 -04:00
Sage Weil
f4c2863999 qa/suites/upgrade/jewel-x/parallel: whitelist OSD_DOWN
We restart OSDs during the upgrade.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-06 09:56:55 -04:00
Sage Weil
6307e03c6d qa/suites/rados/thrash/workloads/cache-agent-big: m=2
...because we do the test_map_discontinuity thing.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-05 14:33:13 -04:00
Patrick Donnelly
04d8ba4b04 Merge PR #16833 into master
* refs/remotes/upstream/pull/16833/head:
	qa: whitelist expected MDS_CLIENT_OLDEST_TID warn
	qa: ignore insufficient standby during failover
	qa: fix read-only whitelist
	mds: MDS_DAMAGED to MDS_DAMAGE
	doc: remove duplicate CephFS health check doc
2017-08-04 20:26:09 -07:00
Patrick Donnelly
29e5f0a450 qa: whitelist expected MDS_CLIENT_OLDEST_TID warn
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-08-04 20:21:43 -07:00
Patrick Donnelly
06f53e4a82 qa: ignore insufficient standby during failover
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-08-04 20:14:59 -07:00
Patrick Donnelly
42cd1c7122 qa: fix read-only whitelist
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-08-04 20:14:48 -07:00
Sage Weil
62e51661e6 Merge branch 'wip-qa-rbd-health' of git://github.com/dillaman/ceph
# Conflicts:
#	qa/tasks/ceph.py
2017-08-04 15:07:22 -04:00
Sage Weil
ffd171fd46 Merge pull request #16820 from liewegas/wip-more-whitelist
qa/suites/rados: a bit more whitelisting

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2017-08-04 13:44:08 -05:00
Sage Weil
82cf3046de qa/suites/rados/basic/tasks/rados_python: POOL_APP_NOT_ENABLED
Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-04 13:39:13 -04:00
Sage Weil
9c7a653fee Merge pull request #16769 from liewegas/wip-20295-b
os/bluestore: allow multiple DeferredBatches in flight at once

Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-08-04 11:04:38 -05:00
Sage Weil
c8af364699 Merge pull request #16739 from liewegas/wip-multi-backfill-reject
qa/suites/rados/singleton-nomsgr/all/multi-backfill-reject: sleep longer
2017-08-04 08:41:06 -05:00
Sage Weil
1ae9ff173b qa/suites/rados/upgrade: ignore FS_DEGRADED from mds restart
Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-04 09:34:31 -04:00
Sage Weil
27a685f626 qa/suites/rados/monthrash: ignore MGR_DOWN
Heavily thrashing mons + mgr reconnect backoff may make us fail
to process the beacon.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-04 09:34:15 -04:00
Nathan Cutler
d919987caa tests: rbd: reproducer for rbd-on-EC issue
This introduces a new "rbd/singleton-bluestore" suite because creating an rbd
on an EC-backed datapool will fail on filestore.

References: http://tracker.ceph.com/issues/20295
Signed-off-by: Nathan Cutler <ncutler@suse.com>
2017-08-03 22:54:17 -04:00
Patrick Donnelly
9d348ad8c9 qa: add health whitelist for all fs sub-suites
Fixes: http://tracker.ceph.com/issues/20892

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-08-03 14:01:28 -07:00
Patrick Donnelly
60fa9714d4 Merge PR #16768 into master
* refs/remotes/upstream/pull/16768/head:
	qa: fix log whitelist string

Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-08-03 13:55:42 -07:00
Patrick Donnelly
66756c4f65 Merge PR #16292 into master
* refs/remotes/upstream/pull/16292/head:
	qa: use new hex rep of inode
	qa: fix whitelist error message
	mds: refine "Scrub error" cluster log message
	mds: polish clog messages
	doc: developer logging guidance

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2017-08-03 13:55:21 -07:00
Sage Weil
342607f4d5 Merge pull request #16749 from tchaikov/wip-restful-delete-key
mgr: handle "module.set_config(.., None)" correctly 

Reviewed-by: John Spray <john.spray@redhat.com>
2017-08-03 15:53:27 -05:00
Josh Durgin
b172642124 Merge pull request #16789 from liewegas/wip-ec-m-2
qa: avoid map-gap tests for k=2 m=1

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2017-08-03 11:20:13 -07:00
Sage Weil
ef21c9d7df qa/suites/rados/thrash-erasure-code: do not test map gap with m=1
We test EC profiles with m=1 here, and mapgap can lead to incomplete pgs
because it takes an osd down and waits for healthy.

Fixes: http://tracker.ceph.com/issues/20844
Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-03 14:13:02 -04:00
Sage Weil
f74d71f708 qa/suites/rados/thrash-erasure-code-big/cluster: 12 osds on 3 nodes not 4
smithi have 4 nvme partitions available, not 3.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-03 14:11:43 -04:00
Sage Weil
63221e21f5 qa/suites/rados/thrash-erasure-code-big: add k=4 m=2
Get better coverage for larger codes.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-03 14:10:36 -04:00
Sage Weil
e994b03335 qa/suites/rados/monthrash/workloads/rados_api_tests: whitelist SMALLER_PGP_NUM
The rados/test.sh fiddles with pg_num.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-03 13:31:39 -04:00
Sage Weil
7c350180b1 qa/suites/rados/mgr/tasks/failover: whitelist
remote/smithi025/log/ceph.log.gz:2017-08-03 07:02:15.049074 mon.b mon.0 172.21.15.25:6789/0 197 : cluster [INF] Manager daemon x is unresponsive, replacing it with standby daemon y
remote/smithi025/log/ceph.log.gz:2017-08-03 07:03:10.078032 mon.b mon.0 172.21.15.25:6789/0 226 : cluster [WRN] Manager daemon x is unresponsive.  No standby daemons available.

x and y may be swapped, so whitelist the rest of the string.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-03 12:40:01 -04:00