RepoMirrors/ceph

mirror of https://github.com/ceph/ceph synced 2024-12-25 21:03:31 +00:00

Author	SHA1	Message	Date
Sage Weil	21027233b2	qa/tasks/ceph_manager: revive osds before doing final rerr reset We assume below that rerrosd is up, but it may not be when we exit the loop. Fixes: http://tracker.ceph.com/issues/21206 Signed-off-by: Sage Weil <sage@redhat.com>	2017-08-31 14:55:46 -04:00
Sage Weil	a40d94b163	qa/tasks/ceph: wait for pg stats to flush in healthy check Signed-off-by: Sage Weil <sage@redhat.com>	2017-07-27 12:10:27 -04:00
Sage Weil	80978dea8a	qa/tasks/ceph_manager: wait_for_all_up -> wait_for_all_osds_up Signed-off-by: Sage Weil <sage@redhat.com>	2017-07-27 12:10:26 -04:00
Sage Weil	7648894e55	qa/tasks/ceph_manager: expose flush_all_pg_stats Signed-off-by: Sage Weil <sage@redhat.com>	2017-07-27 12:10:26 -04:00
Sage Weil	02c2e853d3	Merge pull request #16509 from liewegas/wip-rgw-wait qa/suits/rados/basic/tasks/rgw_snaps: wait for pools to be created Reviewed-by: Casey Bodley <cbodley@redhat.com>	2017-07-24 11:55:54 -05:00
Sage Weil	29549e6834	Merge pull request #13723 from ovh/bp-forced-recovery osd/PG: make prioritized recovery possible Reviewed-by: Josh Durgin <jdurgin@redhat.com>	2017-07-24 09:01:03 -05:00
Sage Weil	ecd1193ab9	qa/suites/rados/basic/tasks/rgw_snaps: wait for pools to be created Signed-off-by: Sage Weil <sage@redhat.com>	2017-07-22 18:54:46 -04:00
Sage Weil	583a38bca2	qa/tasks/ceph_manager: wait for osd to start after objectstore-tool sequence Fixes: http://tracker.ceph.com/issues/20705 Signed-off-by: Sage Weil <sage@redhat.com>	2017-07-20 11:41:36 -04:00
Piotr Dałek	b0134cc7a8	qa: add force/cancel recovery/backfill to QA testing This randomly issues pg force-recovery/force-backfill and pg cancel-force-recovery/cancel-force-backfill during QA testing. Disabled for upgrades from hammer, jewel and kraken. Signed-off-by: Piotr Dałek <piotr.dalek@corp.ovh.com>	2017-07-20 09:35:55 +02:00
Jason Dillaman	836ab7ad95	test: skip pool application metadata tests if OSDs not at min luminous Signed-off-by: Jason Dillaman <dillaman@redhat.com>	2017-07-19 13:13:01 -04:00
Sage Weil	56e2965502	qa/tasks/ceph_manager: wait longer for pg stats to flush An ill-timed mgr restart could blow the current 15s wait. Signed-off-by: Sage Weil <sage@redhat.com>	2017-07-13 12:13:45 -04:00
David Zafman	33edfe3a0f	test: Add two new singleton test yamls radom-eio and thrash-eio New option "random_eio" to Thrasher, sets 1 osd random read percentage New option "objectsize" to radosbench task (-o bench option) New option "type" to radosbench specify write, seq or rand Signed-off-by: David Zafman <dzafman@redhat.com>	2017-06-23 08:09:15 -07:00
Sage Weil	6a00ba0e26	qa/tasks/ceph_manager: get osds all in after thrashing Otherwise we might end up with some PGs remapped, which means they won't get scrubbed. Signed-off-by: Sage Weil <sage@redhat.com>	2017-06-20 12:07:25 -04:00
Sage Weil	f870cc5f28	qa/tasks/thrashosds: wait before wait_for_recovery Make sure OSDs are up and they have flushed their PG stats before waiting for recovery to ensure that we do not see a stale 'clean' state. Signed-off-by: Sage Weil <sage@redhat.com>	2017-06-15 12:14:24 -04:00
Kefu Chai	e8b23d6852	qa/tasks: add a blacklist for flush_pg_stats() so we don't wait for marked out osds. Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-06-02 13:06:50 -04:00
Sage Weil	ab1b78ae00	qa/tasks: use new reliable flush_pg_stats helper The helper gets a sequence number from the osd (or osds), and then polls the mon until that seq is reflected there. This is overkill in some cases, since many tests only require that the stats be reflected on the mgr (not the mon), but waiting for it to also reach the mon is sufficient! Signed-off-by: Sage Weil <sage@redhat.com>	2017-06-02 13:02:45 -04:00
Kefu Chai	8abc6e1bea	qa/tasks/rebuild_mondb: update to address ceph-mgr changes - revive ceph-mgr after updating the keyring cap - grant "mgr:allow *" to client.admin - minor refactors Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-05-28 09:59:50 +08:00
Sage Weil	5ab996ab3c	qa/tasks/ceph_manager: 'ceph $service tell ...' is obsolete This died forever ago; no need for the fallback here. Signed-off-by: Sage Weil <sage@redhat.com>	2017-05-23 22:53:53 -04:00
Kefu Chai	da1161cbd8	qa/tasks/ceph_manager: always fix pgp_num when done with thrashosd task Fixes: http://tracker.ceph.com/issues/19771 Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-05-03 18:28:27 +08:00
Sage Weil	27dd6530a2	Merge pull request #14559 from liewegas/wip-pg-map mon: move 'pg map' to OSDMonitor Reviewed-by: Kefu Chai <kchai@redhat.com>	2017-04-21 18:53:17 -05:00
Sage Weil	069182f91f	qa/tasks/ceph_manager: use 'pg map' for get_pg_{primary,replica} Pulling this out of the 'pg dump' heap is inefficient. Also, pg dump data comes from the mgr and may be stale. Signed-off-by: Sage Weil <sage@redhat.com>	2017-04-21 10:56:28 -04:00
Kefu Chai	6fa16c4477	Merge pull request #14584 from tchaikov/wip-19631 qa/suites: Revert "qa/suites: add mon-reweight-min-pgs-per-osd = 4" Reviewed-by: Sage Weil <sage@redhat.com>	2017-04-21 22:56:21 +08:00
Kefu Chai	e6a436bb27	qa/tasks/ceph_manager: be able to store options with service type so we are able to change options for services other than mon while thrashing. Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-04-20 14:18:21 +08:00
Kefu Chai	ee653ba87c	Merge pull request #14608 from tchaikov/wip-19594 qa/tasks: assert on pg status with a timeout Reviewed-by: Sage Weil <sage@redhat.com>	2017-04-20 10:49:12 +08:00
Kefu Chai	960032e513	qa/tasks: update tests with helper to wait for pg-stats and remove unused helpers Fixes: http://tracker.ceph.com/issues/19594 Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-04-20 09:35:05 +08:00
Kefu Chai	1207caf3a2	qa/tasks/ceph_manager: add a "wait_for_pg_stats()" decorator and accompany it with two helpers to access the pg stats in a more natural way Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-04-20 09:35:04 +08:00
Josh Durgin	6fba80c1fa	osd, OSDMonitor, qa: mark ec overwrites non-experimental Keep the pool flag around so we can distinguish between a pool that should maintain hashes for each chunk, and a missing one is a bug, vs an overwrites pool where we rely on bluestore checksums for detecting corruption. Signed-off-by: Josh Durgin <jdurgin@redhat.com>	2017-04-19 17:45:43 -07:00
Sage Weil	ee1bb01a54	Merge pull request #14556 from liewegas/wip-pgupmap osd: pg-remap -> pg-upmap Reviewed-by: David Zafman <dzafman@redhat.com>	2017-04-19 17:07:01 -05:00
Sage Weil	ce188e8fdf	osd: pg-remap -> pg-upmap 'remap' is too non-specific a name. In particular, it sounds like it is related to the 'remapped' PG state but in reality it is not related. 'upmap' or 'pg-upmap' is more specific: it maps a pgid to the 'up' set value (or item) Signed-off-by: Sage Weil <sage@redhat.com>	2017-04-18 12:59:40 -04:00
Kefu Chai	1b54b5f3f1	Merge pull request #14415 from smithfarm/wip-19556 tests: Thrasher: handle "OSD has the store locked" gracefully Reviewed-by: Kefu Chai <kchai@redhat.com>	2017-04-18 23:18:35 +08:00
David Zafman	a5731076ad	osd: Handle backfillfull_ratio just like nearfull and full Add BACKFILLFULL as a local OSD cur_state Notify monitor of this new fullness state Signed-off-by: David Zafman <dzafman@redhat.com>	2017-04-17 08:00:24 -07:00
Nathan Cutler	a5b19d2d73	tests: Thrasher: handle "OSD has the store locked" gracefully On slower machines (VPS, OVH) it takes time for the OSD to go down. Fixes: http://tracker.ceph.com/issues/19556 Signed-off-by: Nathan Cutler <ncutler@suse.com>	2017-04-11 16:09:45 +02:00
Sage Weil	2a08cbbed5	qa/tasks/thrashosds,ceph_manager: thrash pg_remap[_items] Signed-off-by: Sage Weil <sage@redhat.com>	2017-03-28 10:12:10 -04:00
Sage Weil	296708091c	qa/tasks/ceph_manager: use new luminous set-full-ratio etc Signed-off-by: Sage Weil <sage@redhat.com>	2017-03-07 16:39:09 -05:00
Sage Weil	a202b68d18	qa/tasks/thrashosds: chance_thrash_cluster_full Induce a momentarily full cluster. Signed-off-by: Sage Weil <sage@redhat.com>	2017-03-07 13:33:44 -05:00
Samuel Just	44b26f6ab4	Merge pull request #13594 from athanatos/wip-snap-trim-sleep osd: add snap trim reservation and re-implement osd_snap_trim_sleep Reviewed-by: Josh Durgin <jdurgin@redhat.com>	2017-02-24 14:09:17 -08:00
Kefu Chai	c0f0cde399	test: Thrasher: do not update pools_to_fix_pgp_num if nothing happens we should not update pools_to_fix_pgp_num if the pool is not expanded or the pg_num is not increased due to pgs being created. this prevents us from fixing the pgp_num after done with thrashing if we actually did nothing when fixing the pgp_num when thrashing, but we removed the pool from pools_to_fix_pgp_num after set_pool_pgpnum() returns. Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-02-19 13:10:46 +08:00
Samuel Just	4aebf59d90	rados: check that pool is done trimming before removing it Signed-off-by: Samuel Just <sjust@redhat.com>	2017-02-13 09:47:02 -08:00
Kefu Chai	de59b5102c	test: Thrasher: restore changed options after done with thrash Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-02-13 09:25:51 +08:00
Kefu Chai	761a1dc391	tests: Thrasher: extract _set_config() method Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-02-13 09:25:50 +08:00
Kefu Chai	995e144e3e	tests: CephManager: add get_config() method Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-02-13 09:25:50 +08:00
Kefu Chai	136483a8f9	test: Thrasher: update pgp_num of all expanded pools if not yet otherwise wait_until_healthy will fail after timeout as seeing warning like: HEALTH_WARN pool cephfs_data pg_num 182 > pgp_num 172 Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-02-13 09:25:50 +08:00
Nathan Cutler	db2582e25e	tests: fix regression in qa/tasks/ceph_master.py https://github.com/ceph/ceph/pull/13194 introduced a regression: 2017-02-06T16:14:23.162 INFO:tasks.thrashosds.thrasher:Traceback (most recent call last): File "/home/teuthworker/src/github.com_ceph_ceph_master/qa/tasks/ceph_manager.py", line 722, in wrapper return func(self) File "/home/teuthworker/src/github.com_ceph_ceph_master/qa/tasks/ceph_manager.py", line 839, in do_thrash self.choose_action()() File "/home/teuthworker/src/github.com_ceph_ceph_master/qa/tasks/ceph_manager.py", line 305, in kill_osd output = proc.stderr.getvalue() AttributeError: 'NoneType' object has no attribute 'getvalue' This is because the original patch failed to pass "stderr=StringIO()" to run(). Fixes: http://tracker.ceph.com/issues/16263 Signed-off-by: Nathan Cutler <ncutler@suse.com> Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-02-06 19:37:38 +01:00
Sage Weil	5fc3dd36e2	Merge pull request #13237 from smithfarm/wip-18799 tests: Thrasher: eliminate a race between kill_osd and __init__ Reviewed-by: Sage Weil <sage@redhat.com>	2017-02-05 12:49:30 -06:00
Nathan Cutler	b519d38fb1	tests: Thrasher: eliminate a race between kill_osd and __init__ If Thrasher.__init__() spawns the do_thrash thread before initializing the ceph_objectstore_tool property, do_thrash races with the rest of Thrasher.__init__() and in some cases do_thrash can call kill_osd() before Thrasher.__init__() progresses much further. This can lead to an exception ("AttributeError: Thrasher instance has no attribute 'ceph_objectstore_tool'") being thrown in kill_osd(). This commit eliminates the race by making sure the ceph_objectstore_tool attribute is initialized before the do_thrash thread is spawned. Fixes: http://tracker.ceph.com/issues/18799 Signed-off-by: Nathan Cutler <ncutler@suse.com>	2017-02-02 23:23:54 +01:00
Nathan Cutler	046e873026	tests: ignore bogus ceph-objectstore-tool error in ceph_manager Fixes: http://tracker.ceph.com/issues/16263 Signed-off-by: Nathan Cutler <ncutler@suse.com>	2017-01-31 00:49:05 +01:00
Sage Weil	c01f2ee0e2	move ceph-qa-suite dirs into qa/	2016-12-14 11:29:55 -06:00
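
The race described in commit b519d38fb1 can be illustrated with a small sketch. This is not the code from qa/tasks/ceph_manager.py: the class name `ThrasherSketch`, the `log` list and the `join()` helper are hypothetical, and the standard `threading` module stands in for whatever concurrency mechanism the real Thrasher uses. It only shows the ordering the commit message describes, where every attribute the worker reads is set before the worker is spawned.

```python
import threading


class ThrasherSketch:
    """Hypothetical stand-in for Thrasher; only the init ordering matters."""

    def __init__(self):
        # Set every attribute the worker reads BEFORE spawning it.
        # Moving these assignments below start() would reintroduce the race.
        self.ceph_objectstore_tool = False
        self.log = []
        self.thread = threading.Thread(target=self.do_thrash)
        self.thread.start()

    def kill_osd(self):
        # Reads the attribute; raises AttributeError if it is not set yet.
        self.log.append(self.ceph_objectstore_tool)

    def do_thrash(self):
        self.kill_osd()

    def join(self):
        self.thread.join()


if __name__ == "__main__":
    t = ThrasherSketch()
    t.join()
    print(t.log)
```

Because the attribute is assigned before `start()`, the worker can never observe a partially constructed object, regardless of how the scheduler interleaves the two.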

47 Commits
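
Commits ab1b78ae00 and e8b23d6852 describe the flush_pg_stats helper: it obtains a sequence number from each OSD, then polls the mon until that sequence is reflected there, skipping OSDs that should not be waited for. A minimal sketch of that pattern follows. All names here are assumptions for illustration (`flush_pg_stats_sketch`, the `request_flush` and `last_seq_on_mon` callables, the `skip` parameter, and the timeout values); the real helper talks to the cluster rather than taking callables.

```python
import time


def flush_pg_stats_sketch(osds, request_flush, last_seq_on_mon,
                          skip=(), timeout=30.0, interval=0.5):
    """Ask each OSD to flush, then poll until the mon has caught up.

    request_flush(osd) -> int    sequence number reported by the OSD (hypothetical)
    last_seq_on_mon(osd) -> int  latest sequence the mon has seen (hypothetical)
    skip                         OSD ids not to wait for, e.g. marked-out OSDs
    """
    # Step 1: collect the sequence number each OSD says it flushed.
    wanted = {osd: request_flush(osd) for osd in osds}

    # Step 2: poll the mon until every non-skipped OSD has reached its number.
    deadline = time.monotonic() + timeout
    for osd, need in wanted.items():
        if osd in skip:
            continue
        while last_seq_on_mon(osd) < need:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"osd.{osd}: mon never reached seq {need}")
            time.sleep(interval)
    return wanted
```

Waiting on the mon is stricter than many tests need, as the commit message notes, but it gives a single condition that is sufficient for all callers.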