82 Commits

Author SHA1 Message Date
Kefu Chai
17d6e96a6f Merge pull request #16967 from liewegas/wip-upgrade-health
mon: fix legacy health checks in 'ceph status' during upgrade; fix jewel-x upgrade combo

Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-08-16 19:44:36 +08:00
Sage Weil
dd2fb6c40b Merge pull request #16944 from liewegas/wip-kraken-x
mon/Elector: force election epoch bump on start

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
2017-08-10 11:12:43 -05:00
Sage Weil
c46bdf5efd Revert "qa/suites/upgrade/jewel-x/parallel: thrash layout"
This reverts commit 435777dbff.

This test combination is not yet stable.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-10 09:51:29 -04:00
Sage Weil
a0b9f37dbc qa/suites/upgrade/jewel-x/parallel: no loadgenbig
When we do the thrashing this leads to ENOSPC on smithi.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-10 09:44:17 -04:00
Sage Weil
435777dbff qa/suites/upgrade/jewel-x/parallel: thrash layout
We can't kill and restart osds because that will interfere with
the upgrade process.  We can, however, thrash the layout by
tweaking osd weights and so on.  This will exercise osd recovery
paths during the upgrade that aren't normally exercised (outside
of stress-split, which doesn't upgrade individual osds while they
are non-clean).

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-09 22:07:48 -04:00
Sage Weil
b61be07d45 qa/suites/upgrade/kraken-x/stress-split: more whitelisting
Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-09 13:58:55 -04:00
Sage Weil
bbd5fe354c qa/suites/upgrade/jewel-x/point-to-point-x: disable app warnings
Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-09 09:18:54 -04:00
Sage Weil
bf29142b08 qa/suites/upgrade/kraken-x/stress-split*: whitelist
Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-07 21:36:58 -04:00
Sage Weil
2234a0ed11 qa/suites/upgrade/kraken-x/parallel: whitelist
Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-07 21:36:58 -04:00
Sage Weil
3e7d157871 qa/suites/upgrade/jewel-x/parallel: fix POOL_APP_NOT_ENABLED disable
This code runs on the mgr.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-07 15:12:10 -04:00
Sage Weil
ed2d984ad1 qa/suites/upgrade/jewel-x/parallel: more whitelisting
Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-06 10:04:14 -04:00
Sage Weil
58f15d2b98 qa/suites/upgrade/jewel-x/parallel: more whitelisting
Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-06 09:56:55 -04:00
Sage Weil
622e950e43 qa/suites/upgrade/*-x/parallel: whitelist more stuff
Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-06 09:56:55 -04:00
Sage Weil
2d260443f0 qa/suites/upgrade/*/parallel: disable POOL_APP_NOT_ENABLED
There is some other random workload running (that creates pools)
while we upgrade and wait for healthy.  Just disable the warning
for these tests.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-06 09:56:55 -04:00
Sage Weil
f4c2863999 qa/suites/upgrade/jewel-x/parallel: whitelist OSD_DOWN
We restart OSDs during the upgrade.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-06 09:56:55 -04:00
Patrick Donnelly
d4ed085238 Merge PR #16713 into master
* refs/remotes/upstream/pull/16713/head:
	qa: ignore failed MDS message during upgrade
2017-08-02 19:41:42 -07:00
Kefu Chai
d12c51ca91 qa/suites: escape the parenthesis of the whitelist text
so we can avoid the warnings like

grep: Unmatched ( or \(

because we pass the whitelisted string to `egrep -v "$1"` directly.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-08-01 21:54:44 +08:00
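
The failure mode described in d12c51ca91 can be illustrated with a minimal sketch. It uses Python's `re` module as a stand-in for `egrep -v`; this is an assumption for illustration, since the real suite passes the string to `egrep`, and the log lines and the `filter_log_lines` helper below are hypothetical.

```python
import re


def filter_log_lines(lines, whitelist_pattern):
    """Return the lines that do NOT match the pattern, similar to `egrep -v`."""
    regex = re.compile(whitelist_pattern)
    return [line for line in lines if not regex.search(line)]


# Hypothetical log lines, for illustration only.
lines = [
    "osd.3 wrongly marked me down (epoch 12)",
    "mon.a clock skew detected",
]

# An unescaped "(" opens a group that is never closed, so the pattern
# is rejected, much like egrep's "Unmatched ( or \(" complaint.
try:
    filter_log_lines(lines, "marked me down (")
    unescaped_ok = True
except re.error:
    unescaped_ok = False

# Escaping the parenthesis makes it a literal character again.
escaped = filter_log_lines(lines, r"marked me down \(")

print(unescaped_ok)
print(escaped)
```

With the escaped pattern, only the whitelisted line is dropped and the rest of the log is left for the checker to inspect.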
Patrick Donnelly
5e5ff5c086 qa: ignore failed MDS message during upgrade
The cluster is expected to become degraded during reboot.

Fixes: http://tracker.ceph.com/issues/20731
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-07-31 14:45:07 -07:00
Sage Weil
e398fd4ee4 qa/suites: more whitelisting
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-27 09:31:24 -04:00
Sage Weil
29549e6834 Merge pull request #13723 from ovh/bp-forced-recovery
osd/PG: make prioritized recovery possible

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2017-07-24 09:01:03 -05:00
John Spray
343e1a4281 qa: update whitelist for "wrongly marked me down"
Signed-off-by: John Spray <john.spray@redhat.com>
2017-07-24 14:54:46 +01:00
Sage Weil
27e8d75f61 Merge pull request #16429 from liewegas/wip-jewel-x
qa/suites/upgrade/jewel-x: misc fixes for new health checks
2017-07-20 10:47:05 -05:00
Piotr Dałek
b0134cc7a8 qa: add force/cancel recovery/backfill to QA testing
This randomly issues pg force-recovery/force-backfill and
pg cancel-force-recovery/cancel-force-backfill during QA
testing. Disabled for upgrades from hammer, jewel and kraken.

Signed-off-by: Piotr Dałek <piotr.dalek@corp.ovh.com>
2017-07-20 09:35:55 +02:00
Jason Dillaman
fa90be842e test: enable pool applications for new pools
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2017-07-19 13:13:01 -04:00
Sage Weil
7102de8761 qa/suites/upgrade/jewel-x/point-to-point: move set-require-min-compat-client
Do it after workload completes and all jewel clients go away.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-18 12:32:17 -04:00
Sage Weil
e2fdfc0b10 qa/suites/upgrade/jewel-x: link to thrashosds yaml
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-18 12:29:01 -04:00
Sage Weil
6ffc677dc5 qa/suites/upgrade/jewel-x/parallel: ignore FS_ and MDS_ errors during restart
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-17 15:25:38 -04:00
Sage Weil
f2b837578a Merge pull request #16244 from liewegas/wip-11793
qa/suites/rados/thrash/workload/*: enable rados.py cache tiering ops

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2017-07-11 13:01:42 -05:00
Sage Weil
2afbc60be7 qa/suites/: enable rados.py cache tiering ops
These weren't being exercised!

See http://tracker.ceph.com/issues/11793

Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-11 14:01:15 -04:00
Sage Weil
dc7a2aaf7a erasure-code: ruleset-* -> crush-*
1) ruleset is an obsolete term, and
2) crush-{rule,failure-domain,...} is more descriptive.

Note that we are changing the names of the erasure code profile keys
from ruleset-* to crush-*.  We will update this on upgrade when the
luminous flag is set, but that means that during mon upgrade you cannot
create EC pools that use these fields.

When the upgrade completes (the user sets require_osd_release = luminous),
existing ec profiles are updated automatically.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-06 15:01:03 -04:00
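
The key rename described in dc7a2aaf7a can be sketched as a simple prefix rewrite over a profile's key/value pairs. This is only an illustration of the idea, not the monitor's actual upgrade code; the `rename_profile_keys` helper and the sample profile contents are hypothetical.

```python
def rename_profile_keys(profile, old_prefix="ruleset-", new_prefix="crush-"):
    """Return a copy of the profile with old_prefix keys renamed to new_prefix."""
    renamed = {}
    for key, value in profile.items():
        if key.startswith(old_prefix):
            # Keep the suffix (e.g. "failure-domain") and swap the prefix.
            key = new_prefix + key[len(old_prefix):]
        renamed[key] = value
    return renamed


# Hypothetical pre-luminous erasure code profile, for illustration only.
legacy = {
    "k": "2",
    "m": "1",
    "ruleset-failure-domain": "host",
    "ruleset-root": "default",
}

updated = rename_profile_keys(legacy)
print(updated)
```

Keys without the old prefix, such as `k` and `m`, pass through unchanged, which is why such a rewrite can be applied to every existing profile.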
Sage Weil
fc7afc239f Merge pull request #15853 from liewegas/wip-simpler-ceph
qa/tasks/ceph: simplify ceph deployment slightly

Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-06-27 14:13:53 -05:00
Sage Weil
e7006d06fb qa/tasks/ceph: explicitly add osds to crush map for upgrades
Before kraken, ceph-osd didn't add itself to crush... ceph-osd-prestart.sh
did it.  And ceph.py doesn't use that.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-06-27 12:52:35 -04:00
Kefu Chai
1b3f0bbd66 qa/suites/upgrade/hammer-jewel-x: upgrade all mon to luminous before osd
luminous osd requires that monmap has REQUIRE_LUMINOUS before it boots.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-06-22 12:19:01 +08:00
Kefu Chai
f11737a643 qa/suites/upgrade/hammer-jewel-x: replace kraken.yaml with luminous.yaml
* add mgr.x to roles
* set up mgr and set the require-osd-release bit in osdmap
* do not restart an osd while waiting for healthy: the cluster is not
  healthy until require-osd-release=luminous is set in osdmap and a
  mgr is up and running.

Fixes: http://tracker.ceph.com/issues/20342
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-06-22 12:19:01 +08:00
Sage Weil
27697a443d Merge pull request #15637 from liewegas/wip-point-to-point
qa/upgrade/jewel-x/point-to-point: add a mgr during final upgrade
2017-06-19 21:59:30 -05:00
Kefu Chai
b7f59b6437 qa/suites/upgrade: remove duplicated upgrade task
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-06-13 17:00:06 +08:00
Kefu Chai
3734280522 qa/suites/upgrade: set "sortbitwise" for jewel clusters
so ceph.healthy or wait-for-healthy won't be blocked by this warning.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-06-13 17:00:00 +08:00
Sage Weil
5bc1a25bbe qa/upgrade/jewel-x/point-to-point: add a mgr during final upgrade
Signed-off-by: Sage Weil <sage@redhat.com>
2017-06-12 14:19:02 -04:00
Kefu Chai
8185bc059d qa/suites/upgrade/hammer-jewel-x: don't initially start mgr daemons
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-06-04 14:35:15 -04:00
Yuri Weinstein
02242ea48e Removed all 'default_idle_timeout' due to change in rgw task
8c74c8a639 (diff-995b04809fcabacc3e3ecfaea903a41aL539)

Signed-off-by: Yuri Weinstein <yweinste@redhat.com>
2017-06-01 14:01:30 -07:00
Sage Weil
22ddc2e64a qa/suites/upgrade/kraken-x: enable experimental for bluestore
Signed-off-by: Sage Weil <sage@redhat.com>
2017-05-30 09:28:13 -04:00
Sage Weil
73f8fb9976 qa/suites/upgrade/jewel-x: add final scrub and legacy snapset check
Signed-off-by: Sage Weil <sage@redhat.com>
2017-05-05 13:39:14 -04:00
Sage Weil
fcd64d75ab Merge pull request #14444 from liewegas/wip-past-intervals
osd: simplify past_intervals representation

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2017-05-02 10:51:37 -05:00
Sage Weil
e4874b4091 Merge pull request #14788 from liewegas/wip-jewel-x-rgw
qa/suites/jewel-x/point-to-point: don't scan for keys on second s3tests either

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2017-05-01 17:12:39 -05:00
Sage Weil
1868c56f54 qa/suites/upgrade/kraken-x: limit fs matrix
Signed-off-by: Sage Weil <sage@redhat.com>
2017-04-28 17:51:33 -04:00
Sage Weil
3ab8dff07f qa/suites/upgrade/jewel-x: add cache tiering + snaps workload
Signed-off-by: Sage Weil <sage@redhat.com>
2017-04-28 11:30:38 -04:00
Sage Weil
d063c3dc73 qa/suites/upgrade/kraken-x/stress-split-erasure-code: fix
Signed-off-by: Sage Weil <sage@redhat.com>
2017-04-28 11:14:45 -04:00
Sage Weil
79f95bc65f qa/suites/upgrade/kraken-x/parallel: fix
Signed-off-by: Sage Weil <sage@redhat.com>
2017-04-28 11:13:50 -04:00
Sage Weil
8dfc148652 qa/suites/upgrade/jewel-x/parallel: remove stray kraken.yaml
Signed-off-by: Sage Weil <sage@redhat.com>
2017-04-28 11:13:22 -04:00
Sage Weil
dd174148ef qa/suites/upgrade/kraken-x/stress-split: updates
Bring this in line with jewel-x (which now passes).

Signed-off-by: Sage Weil <sage@redhat.com>
2017-04-27 10:07:44 -04:00