Currently these can be thrown off if the cluster is creating or removing
pools at the same time. Fix by taking a single snapshot of the pg stats
and basing our judgement on that.
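A minimal sketch of the idea, assuming a direct `ceph pg dump` call; the helper names are illustrative, not the actual task code:

```python
import json
import subprocess

def pg_stats_snapshot():
    """Take one consistent snapshot of all PG stats."""
    out = subprocess.check_output(['ceph', 'pg', 'dump', '--format=json'])
    dump = json.loads(out)
    # newer releases nest the stats under 'pg_map'; fall back if not
    return dump.get('pg_map', dump).get('pg_stats', [])

def all_active_clean(pg_stats):
    # every judgement is made against the same snapshot, so pools being
    # created or removed mid-check cannot skew the result
    return all('active+clean' in pg['state'] for pg in pg_stats)

snapshot = pg_stats_snapshot()
healthy = all_active_clean(snapshot)
```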
Signed-off-by: Sage Weil <sage@redhat.com>
to be specific, ignore errors when querying an erasure coded pool's
erasure-code-profile. the pool might be removed after
"test_pool_min_size" lists all pools and before it queries the pools'
erasure-code-profile. in that case, we should just continue on with the
next pool.
normally, the pools are created by the "radosbench" tasks, and they
don't delete the ec profiles after removing the ec pools that use them,
but i don't want to rely on this fact. so, in this change, the `try` block
guards both `ceph osd pool get <pool_name> erasure_code_profile`
and `ceph osd erasure-code-profile get <profile>` calls.
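a minimal sketch of that guard, assuming direct `ceph` CLI calls; the loop and helper are illustrative, not the actual task code:

```python
import json
import subprocess

def ec_profiles_of_pools(pools):
    """collect erasure-code profiles, skipping pools that vanish mid-run"""
    profiles = {}
    for pool in pools:
        try:
            out = subprocess.check_output(
                ['ceph', 'osd', 'pool', 'get', pool,
                 'erasure_code_profile', '--format=json'])
            name = json.loads(out)['erasure_code_profile']
            out = subprocess.check_output(
                ['ceph', 'osd', 'erasure-code-profile', 'get', name,
                 '--format=json'])
            profiles[name] = json.loads(out)
        except subprocess.CalledProcessError:
            # the pool (or its profile) was removed after we listed the
            # pools; just continue with the next pool
            continue
    return profiles
```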
Fixes: http://tracker.ceph.com/issues/40533
Signed-off-by: Kefu Chai <kchai@redhat.com>
If there are leftover merges at the end of the run they can take a long
time to get through, blowing our timeouts for waiting for PGs to become
active, for splitting/merging to stop, and for scrubbing PGs. Stop all of
that at the end of the run so that we don't have to wait so long.
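For illustration only (not necessarily the mechanism this change uses): a leftover merge shows up as a pool whose pg_num is still above its pg_num_target, so the stragglers can be spotted like this; the exact field names in the osdmap dump are my assumption:

```python
import json
import subprocess

def pools_with_pending_merges():
    """pools whose pg_num is still shrinking toward pg_num_target"""
    osdmap = json.loads(subprocess.check_output(
        ['ceph', 'osd', 'dump', '--format=json']))
    return [p['pool_name'] for p in osdmap['pools']
            if p.get('pg_num_target', p['pg_num']) < p['pg_num']]
```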
Signed-off-by: Sage Weil <sage@redhat.com>
Some tests have m=2,k=2 and this will break them. Sometimes, even if we
have 5 up OSDs, we end up with 4 and CRUSH gets picky, so build in a
buffer and only do this if we have 6 up.
We don't have an easy way from here to see the minimum number of up OSDs
needed for health... basically this map discontinuity test just sucks.
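A small sketch of the buffer check, counting up OSDs from `ceph osd dump`; the helper names are illustrative:

```python
import json
import subprocess

def up_osd_count():
    """count OSDs the cluster currently considers up"""
    osdmap = json.loads(subprocess.check_output(
        ['ceph', 'osd', 'dump', '--format=json']))
    return sum(1 for o in osdmap['osds'] if o['up'])

# k=2,m=2 pools need 4 acting OSDs; require a two-OSD buffer before
# deliberately taking any down for the discontinuity test
if up_osd_count() >= 6:
    print('enough headroom; proceed with the thrash step')
```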
Signed-off-by: Sage Weil <sage@redhat.com>
We currently import a portion of the PG if it has split. Merge is more
complicated, though, mainly because COT (ceph-objectstore-tool) is operating in a mode where it
fast-forwards the PG to the latest OSDMap epoch, which means it has to
implement any transformations to the PG (split/merge) independently.
Avoid doing this for merge.
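One way the caller could honor that restriction, sketched under the assumption that the pool's pg_num at export time is known; the helper name is illustrative:

```python
def safe_to_import(pg_num_at_export, pg_num_now):
    """ceph-objectstore-tool can fast-forward an exported PG across a
    split, but not across a merge, so skip the import when the pool has
    fewer PGs now than when the PG was exported."""
    return pg_num_now >= pg_num_at_export
```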
Signed-off-by: Sage Weil <sage@redhat.com>
Also:
- Do not print **offset** until it has been specified
- Count missing objects correctly (it used to be the primary's local missing count)
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
the default timeout is none in that case, and there are cases where it can
hang forever due to errors. Since this dumps quite a lot of info, the logs
grow to GBs; with a default timeout of 1200 seconds we avoid such huge logs
and fail sooner. Any test needing a higher timeout can pass the required value.
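A hedged sketch of the shape of this change; only the 1200-second default comes from the text, the dump helper and the commands it runs are illustrative:

```python
import subprocess

def dump_cluster_info(timeout=1200):
    """dump diagnostic state, but fail after `timeout` seconds instead of
    hanging forever and filling the logs"""
    for cmd in (['ceph', 'status'], ['ceph', 'pg', 'dump']):
        subprocess.run(cmd, check=True, timeout=timeout)

# a test that genuinely needs more time can override the default
dump_cluster_info(timeout=3600)
```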
Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
It's possible for tell osd.* to race against an osd we stopped but the
cluster doesn't know is down yet. In that case we'll get ENXIO on that
osd and the command will fail.
In this context, we don't care.
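A minimal sketch of tolerating that race; the injectargs example and the helper are illustrative:

```python
import subprocess

def tell_all_osds(*args):
    """broadcast to every OSD, tolerating the race against an OSD that is
    stopped but not yet marked down (which makes the command fail)"""
    try:
        subprocess.check_call(['ceph', 'tell', 'osd.*', *args])
    except subprocess.CalledProcessError:
        pass  # in this context, we don't care

tell_all_osds('injectargs', '--osd-max-backfills=1')
```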
Signed-off-by: Sage Weil <sage@redhat.com>
* move Thrasher._set_config() to CephManager, make it a public method,
and rename it to inject_args(),
* use this method instead of using 'tell ... injectargs ...' directly
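a hedged sketch of what such a public method might look like; the exact signature and option-name handling in CephManager may differ:

```python
import subprocess

class CephManager:
    def raw_cluster_cmd(self, *args):
        """thin wrapper over the ceph CLI (stand-in for the real method)"""
        return subprocess.check_output(('ceph',) + args)

    def inject_args(self, service_type, service_id, name, value):
        """inject a runtime option via 'ceph tell <type>.<id> injectargs ...'
        so callers don't build the tell command by hand"""
        opt = '--{} {}'.format(name.replace('_', '-'), value)
        self.raw_cluster_cmd('tell',
                             '{}.{}'.format(service_type, service_id),
                             'injectargs', opt)

# usage, replacing a hand-rolled 'tell ... injectargs ...':
# CephManager().inject_args('osd', '*', 'osd_max_backfills', 1)
```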
Signed-off-by: Kefu Chai <kchai@redhat.com>
an osd will refuse to create new pgs until its pg number is lower
than the max-pg-per-osd upper bound setting.
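a tiny sketch of the bound being described; the 250 default is an assumption on my part (the actual mon_max_pg_per_osd default has changed across releases):

```python
def osd_accepts_new_pg(pgs_on_osd, max_pg_per_osd=250):
    """an OSD refuses to instantiate another PG while it already holds
    max_pg_per_osd or more; creation resumes once the count drops below"""
    return pgs_on_osd < max_pg_per_osd

print(osd_accepts_new_pg(199))  # True
print(osd_accepts_new_pg(250))  # False
```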
Signed-off-by: Kefu Chai <kchai@redhat.com>
bluestore_fsck_on_mount and bluestore_fsck_on_mount_deep are enabled by
default, and bluestore is used as the default store backend. it takes
longer to perform the deep fsck with verbose logging, so prolong
revive_osd()'s timeout from 150 sec to 360 sec.
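a minimal sketch of the kind of wait that needs the longer deadline; only the 360-second figure comes from this change, the polling helper is illustrative:

```python
import json
import subprocess
import time

def wait_for_osd_up(osd_id, timeout=360):
    """poll until the revived OSD is marked up; a deep fsck on mount with
    verbose logging can easily exceed the old 150-second budget"""
    deadline = time.time() + timeout
    while time.time() < deadline:
        osdmap = json.loads(subprocess.check_output(
            ['ceph', 'osd', 'dump', '--format=json']))
        if any(o['osd'] == osd_id and o['up'] for o in osdmap['osds']):
            return
        time.sleep(3)
    raise RuntimeError('osd.%d still down after %d seconds' % (osd_id, timeout))
```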
Fixes: http://tracker.ceph.com/issues/21474
Signed-off-by: Kefu Chai <kchai@redhat.com>
PG states may all be active+clean when no recovery is going on,
so check them again before timing out.
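A sketch of the re-check, with an illustrative `get_pg_states` callable standing in for however the task reads PG states:

```python
import time

def wait_for_recovery(get_pg_states, timeout=300):
    """wait for recovery to finish, re-checking the PG states one last
    time before declaring a timeout: with no recovery in flight they may
    already all be active+clean"""
    deadline = time.time() + timeout
    while time.time() < deadline:
        if all(s == 'active+clean' for s in get_pg_states()):
            return
        time.sleep(5)
    # final re-check before failing, to avoid a spurious timeout
    assert all(s == 'active+clean' for s in get_pg_states()), \
        'recovery did not complete within %d seconds' % timeout
```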
Fixes: http://tracker.ceph.com/issues/21294
Signed-off-by: huangjun <huangjun@xsky.com>
We assume below that rerrosd is up, but it may not be up when we exit
the loop.
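A tiny sketch of checking that assumption explicitly; the original test is a shell script, so the Python helper and the `revive` callable here are purely illustrative:

```python
def ensure_osd_up(osd_id, osd_dump, revive):
    """check the 'OSD is up' assumption instead of relying on it.
    `osd_dump` is parsed `ceph osd dump --format=json`; `revive` is
    whatever the harness uses to restart the daemon (illustrative)."""
    if not any(o['osd'] == osd_id and o['up'] for o in osd_dump['osds']):
        revive(osd_id)
```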
Fixes: http://tracker.ceph.com/issues/21206
Signed-off-by: Sage Weil <sage@redhat.com>