RepoMirrors/ceph

mirror of https://github.com/ceph/ceph synced 2025-04-04 15:36:24 +00:00

Author	SHA1	Message	Date
Sage Weil	d014b7924d	qa/tasks/ceph_manager: 5s -> 15s for 'osd out' to be visible Signed-off-by: Sage Weil <sage@redhat.com>	2019-07-12 08:56:50 -05:00
Sage Weil	0b4ce2ab4c	qa/tasks/ceph_manager: make is_{clean,recovered,active_or_down} less racy Currently these can be thrown off if the cluster is creating or removing pools at the same time. Fix by taking a single snapshot of the pg stats and basing our judgement on that. Signed-off-by: Sage Weil <sage@redhat.com>	2019-07-10 11:04:49 -05:00
Kefu Chai	fbd4836d24	qa/tasks/ceph_manager.py: ignore errors in test_pool_min_size to be specific, ignore errors when querying erasure coded pool's erasure-code-profile. the pool might be removed after "test_pool_min_size" lists all pools and before queries the pools' erasure-code-profile. in that case, we should just continue on with the next pool. normally, the pools are created by the "radosbench" tasks. and they don't delete the ec profiles after removing the ec pools using them, but i don't want to rely on this fact. so, in this change, the `try` block guards both `ceph osd pool get <pool_name> erasure_code_profile` and `ceph osd erasure-code-profile get <profile>` calls. Fixes: http://tracker.ceph.com/issues/40533 Signed-off-by: Kefu Chai <kchai@redhat.com>	2019-06-27 19:00:23 +08:00
Kefu Chai	1a2700f404	qa/tasks: extract {ERASURE_CODED,REPLICATED}_POOL out so they can be reused by `Thrasher`. Signed-off-by: Kefu Chai <kchai@redhat.com>	2019-06-27 19:00:23 +08:00
Chang Liu	b02e2f6cf2	test: update test_pool_min_size test in thrasher Signed-off-by: Chang Liu <liuchang0812@gmail.com>	2019-05-10 10:45:25 +08:00
Greg Farnum	0ee63a0450	qa: extend get_pool_property() to allow non-int values Signed-off-by: Greg Farnum <gfarnum@redhat.com>	2019-05-10 10:45:25 +08:00
Greg Farnum	7950ce2488	qa: don't create rbd pool for min-size thrashing tests Signed-off-by: Greg Farnum <gfarnum@redhat.com>	2019-05-10 10:45:25 +08:00
Greg Farnum	b701395065	qa: write a thrasher for putting PGs below min_size and watching them recover Signed-off-by: Greg Farnum <gfarnum@redhat.com>	2019-05-10 10:45:25 +08:00
Greg Farnum	78755091f9	qa: remove unused variable from ceph_manager Pyflakes warned me about this. Signed-off-by: Greg Farnum <gfarnum@redhat.com>	2019-05-10 10:45:25 +08:00
Sage Weil	54c5202b74	qa/tasks/ceph: stop any split/merge activity before scrubbing If there are leftover merges at the end of the run they can take a long time to get through, blowing our timeout for (waiting for pgs to become active and to stop splitting/merge) and scrubbing pgs. Stop all of that at the end of the run so that we don't have to wait so long. Signed-off-by: Sage Weil <sage@redhat.com>	2019-01-14 06:51:21 -06:00
Sage Weil	0d4c4db3c0	qa/tasks/ceph_manager: compare osd flush seq #'s as ints Signed-off-by: Sage Weil <sage@redhat.com>	2019-01-03 11:17:38 -06:00
Sage Weil	ac2430a43d	qa/tasks/ceph_manager: make get_mon_status use mon addr We don't have the 'mon addr' config property any more. Signed-off-by: Sage Weil <sage@redhat.com>	2019-01-03 11:17:31 -06:00
Sage Weil	28aaca58e7	qa/tasks/ceph_manager: avoid test_map_discontinuity stall with too few up osds Some tests have m=2,k=2 and this will break them. Sometimes even if we have 5 up osds, we end up with 4 and CRUSH gets picky, so build in a buffer and only do this if we have 6 up. We don't have an easy way from here to see what the min up osds for healthy is... basically this map discontinuity test just sucks. Signed-off-by: Sage Weil <sage@redhat.com>	2018-11-20 17:12:43 -06:00
Sage Weil	b678356594	qa/tasks/ceph_manager: fix get_stuck_pgs from pg dump change Fixes `95b7d2340c` Fixes: http://tracker.ceph.com/issues/36485 Signed-off-by: Sage Weil <sage@redhat.com>	2018-10-21 10:52:38 -05:00
Patrick Donnelly	d491227956	qa: fix run call args Fixes: http://tracker.ceph.com/issues/36450 Introduced-by: `95746ecce9` Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2018-10-15 14:45:18 -07:00
John Spray	67d147c00d	Merge pull request #23622 from renhwztetecs/renhw-wip-25103 mgr: fixup pgs show in unknown state Reviewed-by: Kefu Chai <kchai@redhat.com> Reviewed-by: John Spray <john.spray@redhat.com>	2018-10-10 13:28:33 +01:00
Volker Theile	95746ecce9	mgr: Add ability to trigger a cluster/audit log message from Python Fixes: https://tracker.ceph.com/issues/36194 Signed-off-by: Volker Theile <vtheile@suse.com>	2018-10-04 13:33:18 +02:00
huanwen ren	ed442447c0	qa: modify the format for add pgmap_ready. Signed-off-by: huanwen ren <ren.huanwen@zte.com.cn>	2018-09-27 23:22:50 +08:00
Kefu Chai	4b0e2c8ed4	qa: fix typos Signed-off-by: Kefu Chai <kchai@redhat.com>	2018-09-21 12:41:42 +08:00
Kefu Chai	510d9e1345	Merge pull request #23723 from xiexingguo/wip-list-missing osd/PrimaryLogPG: rename list_missing -> list_unfound command Reviewed-by: Josh Durgin <jdurgin@redhat.com> Reviewed-by: Sage Weil <sage@redhat.com>	2018-09-11 20:25:21 +08:00
Sage Weil	6bd682f53d	ceph-objectstore-tool: prevent import of pg that has since merged We currently import a portion of the PG if it has split. Merge is more complicated, though, mainly because COT is operating in a mode where it fast-forwards the PG to the latest OSDMap epoch, which means it has to implement any transformations to the PG (split/merge) independently. Avoid doing this for merge. Signed-off-by: Sage Weil <sage@redhat.com>	2018-09-07 12:09:05 -05:00
Sage Weil	0b59b7a688	qa/tasks/thrashosds: support merging pgs too Signed-off-by: Sage Weil <sage@redhat.com>	2018-09-07 12:09:05 -05:00
xie xingguo	85ba2f0a82	osd/PrimaryLogPG: s/list_missing/list_unfound/ Also: - Do not print offset until specified - Count missing objects correctly (used to be primary's local missing) Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>	2018-09-06 09:52:20 +08:00
Sage Weil	2c26fb0fe1	rados: drop mkpool, rmpool commands - mkpool and rmpool users should use the normal cli/mon commands Signed-off-by: Sage Weil <sage@redhat.com>	2018-08-31 09:27:36 -05:00
Dan Mick	7fc8714a27	qa/tasks/{ceph_manager.py,vstart_runner.py}: allow kwargs in raw_* Allow passing kwargs (like stdin=) to the local and teuthology clusters when running tests Signed-off-by: Dan Mick <dan.mick@redhat.com>	2018-06-29 14:51:34 -07:00
David Zafman	151de1797b	test: wait_for_pg_stats() should do another check after last 13 second sleep Signed-off-by: David Zafman <dzafman@redhat.com>	2018-05-23 17:27:14 -07:00
Vasu Kulkarni	7881a19d92	qa/tasks: wait_for_clean is called after ceph task as well after osd's are up, the default timeout is none in that case, there are cases where it can hang forever due to error cases, since this dumps quite a lot of info the logs grow in GB's, with default timeout of 1200 we can avoid such huge logs and fail sooner. Any tests needing higher timeout can pass the required value. Signed-off-by: Vasu Kulkarni <vasu@redhat.com>	2018-04-09 17:24:42 -07:00
Sage Weil	577737d007	osd: osd_mon_report_interval_min -> osd_mon_report_interval, kill _max The _max isn't used. Drop the _min suffix. Signed-off-by: Sage Weil <sage@redhat.com>	2018-04-06 11:00:14 -05:00
Tatjana Dehler	25a0ed93ec	mgr/dashboard: add 'osd metadata' command call Signed-off-by: Tatjana Dehler <tdehler@suse.com>	2018-03-23 11:11:17 +01:00
Neha Ojha	e3899dc901	qa/tasks/ceph_manager: use set_config on revived osd Signed-off-by: Neha Ojha <nojha@redhat.com>	2018-03-14 12:37:56 -07:00
Sage Weil	8651e15c93	qa/tasks/ceph_manager: tolerate failure to force backfill/recovery The pool may have been deleted out from underneath us. Signed-off-by: Sage Weil <sage@redhat.com>	2018-01-03 08:37:02 -06:00
Sage Weil	aafb3a565d	qa/tasks/ceph_manager: tolerate tell osd.* error It's possible for tell osd.* to race against an osd we stopped but the cluster doesn't know is down yet. In that case we'll get ENXIO on that osd and the command will fail. In this context, we don't care. Signed-off-by: Sage Weil <sage@redhat.com>	2017-12-06 17:51:20 -06:00
Kefu Chai	a406553a79	qa/tasks/ceph_manager: add inject_args() method * move Thrasher._set_config() to CephManager, and make it a public method, and rename it to inject_args(), * use this method instead of using 'tell ... injectargs ...' directly Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-11-29 18:44:16 +08:00
Kefu Chai	749bbda075	qa/tasks: prolong revive_osd() timeout to 6 min see also #17902 Fixes: http://tracker.ceph.com/issues/21474 Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-11-20 13:40:59 +08:00
Kefu Chai	7f549af459	qa: do not wait for down/out osd for pg convergence that osd is not involved in the PG state changes. Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-11-08 14:50:10 +08:00
Sage Weil	d21809b14e	qa/tasks/thrashosds: set min_in default to 4 We have EC tests with k=2,m=2, so we need a min of 4. Fixes: http://tracker.ceph.com/issues/21997 Signed-off-by: Sage Weil <sage@redhat.com>	2017-11-01 08:32:48 -05:00
Patrick Donnelly	c58161f25b	Merge PR #17266 into master * refs/pull/17266/head: qa: update test_ceph_argparse to test fs cmds qa: use fs rm_data_pool qa: fix mdsmap lookup qa: remove usage of mds dump PendingReleaseNotes: add obsoleted mds commands qa: remove use of obsolete mds commands ceph_volume_client: remove use of obsolete mds cmd doc: update on obsolete mds commands cephfs: obsolete deprecated mds commands Reviewed-by: Douglas Fuller <dfuller@redhat.com>	2017-10-24 16:37:14 -07:00
Patrick Donnelly	3a5f090a1e	qa: remove usage of mds dump Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2017-10-24 11:32:43 -07:00
Kefu Chai	4c7df944c7	osd: add max-pg-per-osd limit osd will refuse to create new pgs, until its pg number is lower than the max-pg-per-osd upper bound setting. Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-10-17 23:08:40 +08:00
Kefu Chai	e21114274f	qa: s/backfill/backfilling/ it's renamed "backfilling" in `4015343f` . Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-10-11 11:52:43 +08:00
Sage Weil	b6a5c09dba	ceph-objectstore-tool: remove rm-past-intervals op The OSD doesn't rebuild this on demand anymore. Signed-off-by: Sage Weil <sage@redhat.com>	2017-10-06 13:08:18 -05:00
Sage Weil	61799c4c8c	Merge pull request #17810 from hjwsm1989/wip-21294 qa/ceph_manager: check pg state again before timedout Reviewed-by: Josh Durgin <jdurgin@redhat.com>	2017-09-25 12:33:34 -05:00
Kefu Chai	42be200c56	qa/tasks: prolong revive_osd() timeout to 6 min bluestore_fsck_on_mount and bluestore_fsck_on_mount_deep are enabled by default. and bluestore is used as the default store backend. it takes longer to perform the deep fsck with verbose log. so prolong the revive_osd()'s timeout from 150 sec to 360 sec. Fixes: http://tracker.ceph.com/issues/21474 Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-09-22 10:58:41 +08:00
huangjun	fa40add7f0	qa/ceph_manager: check pg state again before timedout Pg state maybe all in active+clean when no recovering going on, so check it again before timedout. Fixes: http://tracker.ceph.com/issues/21294 Signed-off-by: huangjun <huangjun@xsky.com>	2017-09-20 00:04:04 +08:00
yonghengdexin735	fc5ac9ea69	common:fix error word Signed-off-by: yonghengdexin735 <zhang.zezhu@zte.com.cn>	2017-09-13 10:22:08 +08:00
David Zafman	3bb20f6d75	ceph-objectstore-tool: Make pg removal require --force Add new export-remove to combine the 2 operations Fixes: http://tracker.ceph.com/issues/21272 Signed-off-by: David Zafman <dzafman@redhat.com>	2017-09-08 17:56:05 -07:00
Sage Weil	21027233b2	qa/tasks/ceph_manager: revive osds before doing final rerr reset We assume below that rerrosd is up, but it may not be when we exit the loop. Fixes: http://tracker.ceph.com/issues/21206 Signed-off-by: Sage Weil <sage@redhat.com>	2017-08-31 14:55:46 -04:00
Sage Weil	a40d94b163	qa/tasks/ceph: wait for pg stats to flush in healthy check Signed-off-by: Sage Weil <sage@redhat.com>	2017-07-27 12:10:27 -04:00
Sage Weil	80978dea8a	qa/tasks/ceph_manager: wait_for_all_up -> wait_for_all_osds_up Signed-off-by: Sage Weil <sage@redhat.com>	2017-07-27 12:10:26 -04:00
Sage Weil	7648894e55	qa/tasks/ceph_manager: expose flush_all_pg_stats Signed-off-by: Sage Weil <sage@redhat.com>	2017-07-27 12:10:26 -04:00
Sage Weil	02c2e853d3	Merge pull request #16509 from liewegas/wip-rgw-wait qa/suits/rados/basic/tasks/rgw_snaps: wait for pools to be created Reviewed-by: Casey Bodley <cbodley@redhat.com>	2017-07-24 11:55:54 -05:00
Sage Weil	29549e6834	Merge pull request #13723 from ovh/bp-forced-recovery osd/PG: make prioritized recovery possible Reviewed-by: Josh Durgin <jdurgin@redhat.com>	2017-07-24 09:01:03 -05:00
Sage Weil	ecd1193ab9	qa/suites/rados/basic/tasks/rgw_snaps: wait for pools to be created Signed-off-by: Sage Weil <sage@redhat.com>	2017-07-22 18:54:46 -04:00
Sage Weil	583a38bca2	qa/tasks/ceph_manager: wait for osd to start after objectstore-tool sequence Fixes: http://tracker.ceph.com/issues/20705 Signed-off-by: Sage Weil <sage@redhat.com>	2017-07-20 11:41:36 -04:00
Piotr Dałek	b0134cc7a8	qa: add force/cancel recovery/backfill to QA testing This randomly issues pg force-recovery/force-backfill and pg cancel-force-recovery/cancel-force-backfill during QA testing. Disabled for upgrades from hammer, jewel and kraken. Signed-off-by: Piotr Dałek <piotr.dalek@corp.ovh.com>	2017-07-20 09:35:55 +02:00
Jason Dillaman	836ab7ad95	test: skip pool application metadata tests if OSDs not at min luminous Signed-off-by: Jason Dillaman <dillaman@redhat.com>	2017-07-19 13:13:01 -04:00
Sage Weil	56e2965502	qa/tasks/ceph_manager: wait longer for pg stats to flush An ill-timed mgr restart could blow the current 15s wait. Signed-off-by: Sage Weil <sage@redhat.com>	2017-07-13 12:13:45 -04:00
David Zafman	33edfe3a0f	test: Add two new singleton test yamls radom-eio and thrash-eio New option "random_eio" to Thrasher, sets 1 osd random read percentage New option "objectsize" to radosbench task (-o bench option) New option "type" to radosbench specify write, seq or rand Signed-off-by: David Zafman <dzafman@redhat.com>	2017-06-23 08:09:15 -07:00
Sage Weil	6a00ba0e26	qa/tasks/ceph_manager: get osds all in after thrashing Otherwise we might end up with some PGs remapped, which means they won't get scrubbed. Signed-off-by: Sage Weil <sage@redhat.com>	2017-06-20 12:07:25 -04:00
Sage Weil	f870cc5f28	qa/tasks/thrashosds: wait before wait_for_recovery Make sure OSDs are up and they have flushed their PG stats before waiting for recovery to ensure that we do not see a stale 'clean' state. Signed-off-by: Sage Weil <sage@redhat.com>	2017-06-15 12:14:24 -04:00
Kefu Chai	e8b23d6852	qa/tasks: add a blacklist for flush_pg_stats() so we don't wait for marked out osds. Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-06-02 13:06:50 -04:00
Sage Weil	ab1b78ae00	qa/tasks: use new reliable flush_pg_stats helper The helper gets a sequence number from the osd (or osds), and then polls the mon until that seq is reflected there. This is overkill in some cases, since many tests only require that the stats be reflected on the mgr (not the mon), but waiting for it to also reach the mon is sufficient! Signed-off-by: Sage Weil <sage@redhat.com>	2017-06-02 13:02:45 -04:00
Kefu Chai	8abc6e1bea	qa/tasks/rebuild_mondb: update to address ceph-mgr changes - revive ceph-mgr after updating the keyring cap - grant "mgr:allow *" to client.admin - minor refactors Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-05-28 09:59:50 +08:00
Sage Weil	5ab996ab3c	qa/tasks/ceph_manager: 'ceph $service tell ...' is obsolete This died forever ago; no need for the fallback here. Signed-off-by: Sage Weil <sage@redhat.com>	2017-05-23 22:53:53 -04:00
Kefu Chai	da1161cbd8	qa/tasks/ceph_manager: always fix pgp_num when done with thrashosd task Fixes: http://tracker.ceph.com/issues/19771 Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-05-03 18:28:27 +08:00
Sage Weil	27dd6530a2	Merge pull request #14559 from liewegas/wip-pg-map mon: move 'pg map' to OSDMonitor Reviewed-by: Kefu Chai <kchai@redhat.com>	2017-04-21 18:53:17 -05:00
Sage Weil	069182f91f	qa/tasks/ceph_manager: use 'pg map' for get_pg_{primary,replica} Pulling this out of the 'pg dump' heap is inefficient. Also, pg dump data comes from the mgr and may be stale. Signed-off-by: Sage Weil <sage@redhat.com>	2017-04-21 10:56:28 -04:00
Kefu Chai	6fa16c4477	Merge pull request #14584 from tchaikov/wip-19631 qa/suites: Revert "qa/suites: add mon-reweight-min-pgs-per-osd = 4" Reviewed-by: Sage Weil <sage@redhat.com>	2017-04-21 22:56:21 +08:00
Kefu Chai	e6a436bb27	qa/tasks/ceph_manager: be able to store options with service type so we are able to change options for services other than mon while thrashing. Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-04-20 14:18:21 +08:00
Kefu Chai	ee653ba87c	Merge pull request #14608 from tchaikov/wip-19594 qa/tasks: assert on pg status with a timeout Reviewed-by: Sage Weil <sage@redhat.com>	2017-04-20 10:49:12 +08:00
Kefu Chai	960032e513	qa/tasks: update tests with helper to wait for pg-stats and remove unused helpers Fixes: http://tracker.ceph.com/issues/19594 Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-04-20 09:35:05 +08:00
Kefu Chai	1207caf3a2	qa/tasks/ceph_manager: add a "wait_for_pg_stats()" decorator and accompany it with two helpers to access the pg stats in a more natural way Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-04-20 09:35:04 +08:00
Josh Durgin	6fba80c1fa	osd, OSDMonitor, qa: mark ec overwrites non-experimental Keep the pool flag around so we can distinguish between a pool that should maintain hashes for each chunk, and a missing one is a bug, vs an overwrites pool where we rely on bluestore checksums for detecting corruption. Signed-off-by: Josh Durgin <jdurgin@redhat.com>	2017-04-19 17:45:43 -07:00
Sage Weil	ee1bb01a54	Merge pull request #14556 from liewegas/wip-pgupmap osd: pg-remap -> pg-upmap Reviewed-by: David Zafman <dzafman@redhat.com>	2017-04-19 17:07:01 -05:00
Sage Weil	ce188e8fdf	osd: pg-remap -> pg-upmap 'remap' is too non-specific a name. In particular, it sounds like it is related to the 'remapped' PG state but in reality it is not related. 'upmap' or 'pg-upmap' is more specific: it maps a pgid to the 'up' set value (or item) Signed-off-by: Sage Weil <sage@redhat.com>	2017-04-18 12:59:40 -04:00
Kefu Chai	1b54b5f3f1	Merge pull request #14415 from smithfarm/wip-19556 tests: Thrasher: handle "OSD has the store locked" gracefully Reviewed-by: Kefu Chai <kchai@redhat.com>	2017-04-18 23:18:35 +08:00
David Zafman	a5731076ad	osd: Handle backfillfull_ratio just like nearfull and full Add BACKFILLFULL as a local OSD cur_state Notify monitor of this new fullness state Signed-off-by: David Zafman <dzafman@redhat.com>	2017-04-17 08:00:24 -07:00
Nathan Cutler	a5b19d2d73	tests: Thrasher: handle "OSD has the store locked" gracefully On slower machines (VPS, OVH) it takes time for the OSD to go down. Fixes: http://tracker.ceph.com/issues/19556 Signed-off-by: Nathan Cutler <ncutler@suse.com>	2017-04-11 16:09:45 +02:00
Sage Weil	2a08cbbed5	qa/tasks/thrashosds,ceph_manager: thrash pg_remap[_items] Signed-off-by: Sage Weil <sage@redhat.com>	2017-03-28 10:12:10 -04:00
Sage Weil	296708091c	qa/tasks/ceph_manager: use new luminous set-full-ratio etc Signed-off-by: Sage Weil <sage@redhat.com>	2017-03-07 16:39:09 -05:00
Sage Weil	a202b68d18	qa/tasks/thrashosds: chance_thrash_cluster_full Induce a momentarily full cluster. Signed-off-by: Sage Weil <sage@redhat.com>	2017-03-07 13:33:44 -05:00
Samuel Just	44b26f6ab4	Merge pull request #13594 from athanatos/wip-snap-trim-sleep osd: add snap trim reservation and re-implement osd_snap_trim_sleep Reviewed-by: Josh Durgin <jdurgin@redhat.com>	2017-02-24 14:09:17 -08:00
Kefu Chai	c0f0cde399	test: Thrasher: do not update pools_to_fix_pgp_num if nothing happens we should not update pools_to_fix_pgp_num if the pool is not expanded or the pg_num is not increased due to pgs being created. this prevent us from fixing the pgp_num after done with thrashing if we actually did nothing when fixing the pgp_num when thrashing, but we removed the pool from pools_to_fix_pgp_num after set_pool_pgpnum() returns. Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-02-19 13:10:46 +08:00
Samuel Just	4aebf59d90	rados: check that pool is done trimming before removing it Signed-off-by: Samuel Just <sjust@redhat.com>	2017-02-13 09:47:02 -08:00
Kefu Chai	de59b5102c	test: Thrasher: restore changed options after done with thrash Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-02-13 09:25:51 +08:00
Kefu Chai	761a1dc391	tests: Thrasher: extract _set_config() method Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-02-13 09:25:50 +08:00
Kefu Chai	995e144e3e	tests: CephManager: add get_config() method Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-02-13 09:25:50 +08:00
Kefu Chai	136483a8f9	test: Thrasher: update pgp_num of all expanded pools if not yet otherwise wait_until_healthy will fail after timeout as seeing warning like: HEALTH_WARN pool cephfs_data pg_num 182 > pgp_num 172 Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-02-13 09:25:50 +08:00
Nathan Cutler	db2582e25e	tests: fix regression in qa/tasks/ceph_master.py https://github.com/ceph/ceph/pull/13194 introduced a regression: 2017-02-06T16:14:23.162 INFO:tasks.thrashosds.thrasher:Traceback (most recent call last): File "/home/teuthworker/src/github.com_ceph_ceph_master/qa/tasks/ceph_manager.py", line 722, in wrapper return func(self) File "/home/teuthworker/src/github.com_ceph_ceph_master/qa/tasks/ceph_manager.py", line 839, in do_thrash self.choose_action()() File "/home/teuthworker/src/github.com_ceph_ceph_master/qa/tasks/ceph_manager.py", line 305, in kill_osd output = proc.stderr.getvalue() AttributeError: 'NoneType' object has no attribute 'getvalue' This is because the original patch failed to pass "stderr=StringIO()" to run(). Fixes: http://tracker.ceph.com/issues/16263 Signed-off-by: Nathan Cutler <ncutler@suse.com> Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-02-06 19:37:38 +01:00
Sage Weil	5fc3dd36e2	Merge pull request #13237 from smithfarm/wip-18799 tests: Thrasher: eliminate a race between kill_osd and __init__ Reviewed-by: Sage Weil <sage@redhat.com>	2017-02-05 12:49:30 -06:00
Nathan Cutler	b519d38fb1	tests: Thrasher: eliminate a race between kill_osd and __init__ If Thrasher.__init__() spawns the do_thrash thread before initializing the ceph_objectstore_tool property, do_thrash races with the rest of Thrasher.__init__() and in some cases do_thrash can call kill_osd() before Thrasher.__init__() progresses much further. This can lead to an exception ("AttributeError: Thrasher instance has no attribute 'ceph_objectstore_tool'") being thrown in kill_osd(). This commit eliminates the race by making sure the ceph_objectstore_tool attribute is initialized before the do_thrash thread is spawned. Fixes: http://tracker.ceph.com/issues/18799 Signed-off-by: Nathan Cutler <ncutler@suse.com>	2017-02-02 23:23:54 +01:00
Nathan Cutler	046e873026	tests: ignore bogus ceph-objectstore-tool error in ceph_manager Fixes: http://tracker.ceph.com/issues/16263 Signed-off-by: Nathan Cutler <ncutler@suse.com>	2017-01-31 00:49:05 +01:00
Sage Weil	c01f2ee0e2	move ceph-qa-suite dirs into qa/	2016-12-14 11:29:55 -06:00

193 Commits