RepoMirrors/ceph

mirror of https://github.com/ceph/ceph synced 2024-12-29 06:52:35 +00:00

Author	SHA1	Message	Date
Sage Weil	b92be2ca9b	qa/standalone/osd/osd-fast-mark-down: use v1 addr w/ simplemessenger Signed-off-by: Sage Weil <sage@redhat.com>	2019-01-03 11:17:31 -06:00
Igor Fedotov	d07c10dfc0	os/bluestore: add main device expand capability. One can do that via ceph-bluestore-tool's bluefs-bdev-expand command Signed-off-by: Igor Fedotov <ifedotov@suse.com>	2018-11-29 12:48:20 +03:00
Sage Weil	c8a8dc21fd	Merge PR #24828 into master * refs/pull/24828/head: qa/osd-bluefs-volume-ops: use ceph-bluestore-tool for fsck qa/osd-bluefs-volume-ops: reduce space usage for the test case Reviewed-by: David Zafman <dzafman@redhat.com>	2018-11-08 16:26:52 -06:00
Sage Weil	9ab9dcfc0d	Merge PR #24809 into master * refs/pull/24809/head: os/bluestore: omit redundant '/' in OSD path for ceph-bluestore-tool if os/bluestore: improve error handling for migrate ops in qa/standtalone/osd-bluefs-volume-ops: remove redundant code. Reviewed-by: Sage Weil <sage@redhat.com>	2018-10-30 15:09:45 -05:00
Igor Fedotov	f5520ea304	qa/osd-bluefs-volume-ops: use ceph-bluestore-tool for fsck Signed-off-by: Igor Fedotov <ifedotov@suse.com>	2018-10-30 15:38:16 +03:00
Igor Fedotov	80e67abdfd	qa/osd-bluefs-volume-ops: reduce space usage for the test case Signed-off-by: Igor Fedotov <ifedotov@suse.com>	2018-10-30 15:38:15 +03:00
Sage Weil	c40685ebdd	Merge PR #24787 into master * refs/pull/24787/head: Merge PR #24796 into nautilus osd: fix heartbeat_reset unlock Merge PR #24780 into nautilus Merge PR #24761 into nautilus Merge PR #24651 into nautilus osd: fix race between op_wq and context_queue test: Make sure kill_daemons failure will be easy to find test: Add flush_pg_stats to make test more deterministic	2018-10-29 08:36:34 -05:00
Igor Fedotov	5d38f8b49b	qa/standtalone/osd-bluefs-volume-ops: remove redundant code. Signed-off-by: Igor Fedotov <ifedotov@suse.com>	2018-10-29 16:30:36 +03:00
Kefu Chai	4af71e7c00	Merge pull request #23103 from ifed01/wip-ifed-bluefs-migrate os/bluestore: allow ceph-bluestore-tool to coalesce, add and migrate BlueFS backing volumes Reviewed-by: Sage Weil <sage@redhat.com>	2018-10-22 22:33:08 +08:00
David Zafman	da3c556aa2	test: Make sure kill_daemons failure will be easy to find Signed-off-by: David Zafman <dzafman@redhat.com>	2018-10-17 16:54:45 -07:00
David Zafman	b33edbc4f6	test: Add flush_pg_stats to make test more deterministic Signed-off-by: David Zafman <dzafman@redhat.com>	2018-10-17 16:54:45 -07:00
Igor Fedotov	02b5768a4f	tests: add qa test case for bluefs volume coalescence Signed-off-by: Igor Fedotov <ifedotov@suse.com>	2018-10-17 22:39:27 +03:00
huanwen ren	f1219d716d	qa/osd: fixup osd-rep-recov-eio.sh fails to parse pg dump Fixes: http://tracker.ceph.com/issues/36418 Signed-off-by: huanwen ren <ren.huanwen@zte.com.cn>	2018-10-16 02:18:22 +08:00
Sage Weil	9bf7c810a7	Merge PR #23985 into master * refs/pull/23985/head: ceph-objectstore-tool: add back pool dne check qa/suites/rados/singleton/reg11184: remove old test ceph-objectstore-tool: import pg at original epoch osd: handle null pg slot on startup ceph-objectstore-tool: drop support for ancient export files osd: avoid dropping osd_lock when pg osdmaps are not laggy qa/standalone/osd/pg-merge.sh: add merge vs pg import test Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn> Reviewed-by: Josh Durgin <jdurgin@redhat.com>	2018-09-21 08:21:53 -05:00
Kefu Chai	4b0e2c8ed4	qa: fix typos Signed-off-by: Kefu Chai <kchai@redhat.com>	2018-09-21 12:41:42 +08:00
Sage Weil	26cb966cab	ceph-objectstore-tool: import pg at original epoch - In the jewel era, we fast-forwarded the PG to the OSD's latest epoch and cleared past_intervals. - In mimic, as of `2347ecb961`, we brought the PG up to date while updating past_intervals. (At the same time we removed the OSD's parallel past_intervals regeneration.) The problem is that the tool then has to reimplement the past_intervals update logic, and also has to cope with splits and merges. Splits are somewhat easier (until now we enable partial import of a PG into a split child), but merges are not so easy. This patch changes it so we import the PG and leave the pg_epoch matching the import file. The OSD is then responsible for bringing it up to date with the latest map, and dealing with any intervening splits or merges. We also adjust the safety check to ensure that we don't collide with any existing PG, either a child we eventually split into, or a parent we eventually merge into. Fixes: http://tracker.ceph.com/issues/35955 Signed-off-by: Sage Weil <sage@redhat.com>	2018-09-20 12:58:00 -05:00
Sage Weil	da887c82ce	qa/standalone/osd/pg-merge.sh: add merge vs pg import test - You can't import the source half of a PG that has since merged. Sorry! We could implement this later. - You can import the target half, but the result will then be incomplete, and you rely on backfill to clean it up. - Map gaps don't affect this behavior. Signed-off-by: Sage Weil <sage@redhat.com>	2018-09-17 12:52:46 -05:00
David Zafman	ef6940fbb6	test: osd-backfill-stats.sh: Fix subtests to get primary which can change Fixes: http://tracker.ceph.com/issues/35982 Signed-off-by: David Zafman <dzafman@redhat.com>	2018-09-13 13:19:23 -07:00
Kefu Chai	510d9e1345	Merge pull request #23723 from xiexingguo/wip-list-missing osd/PrimaryLogPG: rename list_missing -> list_unfound command Reviewed-by: Josh Durgin <jdurgin@redhat.com> Reviewed-by: Sage Weil <sage@redhat.com>	2018-09-11 20:25:21 +08:00
Sage Weil	f47921f293	qa/standalone/osd/osd-backfill-stats: fixes Grep from the primary's log, not every osd's log. For the backfill_remapped task in particular, after the pg_temp change it just so happens that the primary changes across the pool size change and thus two different primaries do (some) backfill. Fix that test to pass the correct primary. Other tests are unaffected as they do not (happen to) trigger a primary change and already satisfied the (removed) check that only one OSD does backfill. Signed-off-by: Sage Weil <sage@redhat.com>	2018-09-07 17:11:18 -05:00
xie xingguo	85ba2f0a82	osd/PrimaryLogPG: s/list_missing/list_unfound/ Also: - Do not print offset until specified - Count missing objects correctly (used to be primary's local missing) Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>	2018-09-06 09:52:20 +08:00
Xie Xingguo	0857124d23	Merge pull request #23663 from xiexingguo/wip-incompat-async-fixes osd: some recovery improvements and cleanups Reviewed-by: Sage Weil <sage@redhat.com>	2018-09-01 14:27:27 +08:00
xie xingguo	22786cffa8	osd/PG: force auth_log_shard to be primary when appropriate So if there are a lot of missing objects on the primary, we can make use of auth_log_shard to restore client I/O quickly. Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>	2018-08-31 16:29:25 +08:00
Sage Weil	85083f39b5	Merge PR #23572 into master * refs/pull/23572/head: qa/standalone/osd/osd-force-create-pg: add force-create-pg test mon/MonCommands: fix 'osd force-create-pg' Reviewed-by: Kefu Chai <kchai@redhat.com>	2018-08-30 08:52:44 -05:00
Josh Durgin	cc41b51c6a	Merge pull request #23518 from dzafman/wip-25084 osd: When possible check CRC in build_push_op() so repair can eventually stop Reviewed-by: Josh Durgin <jdurgin@redhat.com>	2018-08-23 11:39:05 -07:00
David Zafman	bc33170310	test: Use pids instead of jobspecs which were wrong Fixes: http://tracker.ceph.com/issues/27056 Signed-off-by: David Zafman <dzafman@redhat.com>	2018-08-22 10:57:04 -07:00
David Zafman	c1b2bd7f16	test: Fix test to detect a test setup failure Signed-off-by: David Zafman <dzafman@redhat.com>	2018-08-15 15:45:44 -07:00
David Zafman	72c34949fc	test: Add test for filestore bad CRC in primary pull request Signed-off-by: David Zafman <dzafman@redhat.com>	2018-08-15 15:45:44 -07:00
Sage Weil	ed9ec42c42	qa/standalone/osd/osd-force-create-pg: add force-create-pg test Signed-off-by: Sage Weil <sage@redhat.com>	2018-08-15 06:47:47 -05:00
Sage Weil	4108ebc0ab	qa/standalone/osd/ec-error-rollforward: reproduce bug 24597 This reproduces http://tracker.ceph.com/issues/24597 Signed-off-by: Sage Weil <sage@redhat.com>	2018-07-11 16:15:49 -05:00
Sage Weil	4f9fdd98e2	qa/standalone/osd/repro_long_log.sh: fix test The log trimming case wasn't quite right. Before HEAD^ we were rolling forward too aggressively and miscalculating the can_rollforward_to, which affected the trim_to calculation. Signed-off-by: Sage Weil <sage@redhat.com>	2018-07-11 16:15:49 -05:00
David Zafman	33538aca35	test: Fix standalone main usage Signed-off-by: David Zafman <dzafman@redhat.com>	2018-06-18 14:09:14 -07:00
David Zafman	39fc43556f	test: Put files in private test directory Signed-off-by: David Zafman <dzafman@redhat.com>	2018-06-18 14:08:23 -07:00
Erwan Velu	e6e10246c6	tests: Protecting rados bench against endless loop If the cluster dies during the rados bench, the maximum running time is no longer honored and all emitted aios stay pending. rados bench never quits, and the global testing timeout (3600 sec: 1 hour) has to be reached to get a failure. This situation is dramatic for a background test or a CI run, as it locks the whole job for too long waiting for an event that will never occur. The ideal solution would be for 'rados bench' to report a failure once the timeout is reached while aios are pending. A possible workaround here is to run rados bench under the system command 'timeout' and fail if rados didn't complete on time. To avoid side effects, this patch sets that limit to twice the rados bench duration. If rados hasn't completed after twice the expected time, it has to fail to avoid locking the whole testing job. Please find below the way it worked on a real test case. We can see no IO after t>2, but despite timeout=4 the bench continues. Thanks to this patch, the bench is stopped at t=8 and returns 1. 5: /home/erwan/ceph/src/test/smoke.sh:55: TEST_multimon: timeout 8 rados -p foo bench 4 write -b 4096 --no-cleanup 5: hints = 1 5: Maintaining 16 concurrent writes of 4096 bytes to objects of size 4096 for up to 4 seconds or 0 objects 5: Object prefix: benchmark_data_mr-meeseeks_184960 5: sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s) 5: 0 0 0 0 0 0 - 0 5: 1 16 1144 1128 4.40538 4.40625 0.00412965 0.0141116 5: 2 16 2147 2131 4.16134 3.91797 0.00985654 0.0109079 5: 3 16 2147 2131 2.77424 0 - 0.0109079 5: 4 16 2147 2131 2.0807 0 - 0.0109079 5: 5 16 2147 2131 1.66456 0 - 0.0109079 5: 6 16 2147 2131 1.38714 0 - 0.0109079 5: 7 16 2147 2131 1.18897 0 - 0.0109079 5: /home/erwan/ceph/src/test/smoke.sh:55: TEST_multimon: return 1 5: /home/erwan/ceph/src/test/smoke.sh:18: run: return 1 Signed-off-by: Erwan Velu <erwan@redhat.com>	2018-06-14 11:06:52 +02:00
Neha Ojha	7f6f4f90fe	qa: modify TEST_recovery_sizeup() to handle async recovery Signed-off-by: Neha Ojha <nojha@redhat.com>	2018-03-15 11:13:34 -07:00
David Zafman	8a7e6c2349	Merge pull request #20220 from dzafman/wip-calc-stats3 osd: Improve recovery stat handling by using peer_missing and missing_loc info Reviewed-by: Sage Weil <sage@redhat.com>	2018-03-14 11:07:44 -07:00
David Zafman	af85f3cc48	test: osd-backfill-stats.sh parallel osd-recovery-stats.sh check() changes Signed-off-by: David Zafman <dzafman@redhat.com>	2018-03-14 10:07:11 -07:00
David Zafman	acc1f80684	test: Use "(est)" in log message when an osd doesn't have peer_missing Consolidate check() code and common script code TEST_recovery_multi() wasn't reliable due to delayed peer_missing Signed-off-by: David Zafman <dzafman@redhat.com>	2018-03-14 10:07:11 -07:00
David Zafman	12e331b742	test: osd-recovery-stats.sh: New test with different missing objs on multiple OSDs Signed-off-by: David Zafman <dzafman@redhat.com>	2018-03-14 10:07:11 -07:00
David Zafman	09b5697ba2	test: Correction for better degraded/misplaced handling Signed-off-by: David Zafman <dzafman@redhat.com>	2018-03-14 10:07:11 -07:00
David Zafman	d7fd9174b9	osd: Fix for handling more than 1 missing target Fix test case to test more than 1 target Signed-off-by: David Zafman <dzafman@redhat.com>	2018-03-14 10:07:03 -07:00
Josh Durgin	1c15458a00	PrimaryLogPG: only trim up to osd_pg_log_trim_max entries at once This prevents the fix for http://tracker.ceph.com/issues/22050 or potential future bugs from causing too much latency by trimming too many log entries at once. Signed-off-by: Josh Durgin <jdurgin@redhat.com>	2018-03-09 19:14:28 -05:00
Josh Durgin	b50186bfe6	PG, PrimaryLogPG: trim log and rollback info for error log entries Regular updates piggyback some osd state for this purpose with MOSDRepOp[Reply]. Do the same thing for pure log entry updates (write errors and lost/revert additions) via MOSDPGUpdateLogMissing[Reply]. Fixes: http://tracker.ceph.com/issues/22050 Signed-off-by: Josh Durgin <jdurgin@redhat.com>	2018-03-09 17:54:08 -05:00
Josh Durgin	2067f7c679	Merge pull request #20786 from dzafman/wip-zafman-log-trim tools/ceph-objectstore-tool: command to trim the pg log Reviewed-by: Josh Durgin <jdurgin@redhat.com>	2018-03-08 16:42:31 -08:00
Josh Durgin	b01e4ea5e2	tools: Add pg log trim command to ceph-objectstore-tool Add test script that verifies the command in qa/standalone/osd Fixes: http://tracker.ceph.com/issues/23242 Signed-off-by: Josh Durgin <jdurgin@redhat.com> Signed-off-by: David Zafman <dzafman@redhat.com>	2018-03-08 15:58:55 -08:00
Sage Weil	c9e974800f	qa: --no-mon-config for ceph-objectstore-tool --op mkfs .. Signed-off-by: Sage Weil <sage@redhat.com>	2018-03-06 14:44:50 -06:00
Kefu Chai	ac56a202fd	qa/standalone: extract delete_pool() Some tests, like osd-backfill-stats.sh, are using delete_pool() but don't have this function defined, while other standalone tests each define it separately. So it would be simpler to consolidate them in ceph-helper.sh. Signed-off-by: Kefu Chai <kchai@redhat.com>	2018-02-28 15:40:28 +08:00
David Zafman	7ccb7b7023	Merge pull request #19850 from dzafman/wip-calc-stats osd/PG: re-write of _update_calc_stats and improve pg degraded state Fixes: http://tracker.ceph.com/issues/20059 Reviewed-by: Sage Weil <sage@redhat.com> Reviewed-by: Josh Durgin <jdurgin@redhat.com>	2018-01-16 11:58:49 -08:00
Kefu Chai	7aba57b9b4	Merge pull request #18191 from hjwsm1989/osd-mark-down qa/standalone/osd/osd-mark-down: create pool to get updated osdmap faster Reviewed-by: Kefu Chai <kchai@redhat.com>	2018-01-15 11:09:02 +08:00
David Zafman	88ce0c1a91	test: Verify stat calculations during backfill Signed-off-by: David Zafman <dzafman@redhat.com>	2018-01-14 18:17:23 -08:00


63 Commits