RepoMirrors/ceph

mirror of https://github.com/ceph/ceph synced 2025-01-03 01:22:53 +00:00

Author	SHA1	Message	Date
Sage Weil	f5a1c57c94	qa/standalone/scrub/osd-scrub-snaps: snapmapper omap is now 'm' ...due to per-pool omap. Fixes 91f533be71e5fd9b2c0135a5b54d663425a1d9c4 Fixes: https://tracker.ceph.com/issues/41353 Signed-off-by: Sage Weil <sage@redhat.com>	2019-08-20 16:18:41 -05:00
Sage Weil	1e36be9567	qa/standalone/mon/health-mute.sh: fix up ratchet test Make sure we provide time for the mute to get cleared out by tick(). Signed-off-by: Sage Weil <sage@redhat.com>	2019-08-19 12:30:10 -05:00
Sage Weil	9352fc94ab	qa/standalone/mon/health-mute.sh: s/kill daemons/kill_daemons/ Signed-off-by: Sage Weil <sage@redhat.com>	2019-08-19 09:27:51 -05:00
Kefu Chai	fc55a51a87	Merge pull request #29579 from liewegas/wip-big-vs-bluestore osd: scrub error on big objects; make bluestore refuse to start on big objects Reviewed-by: David Zafman <dzafman@redhat.com> Reviewed-by: Neha Ojha <nojha@redhat.com>	2019-08-16 20:24:43 +08:00
Sage Weil	710fef96ea	qa/standalone/mon/health-mutes: add tests Make sure mute and unmute work. Make sure stick is sticky. Make sure counts can go down but if they go up the mute clears. Signed-off-by: Sage Weil <sage@redhat.com>	2019-08-14 20:40:08 -05:00
David Zafman	5928fe8ca0	osd/PG: scrub error when objects are larger than osd_max_object_size Signed-off-by: David Zafman <dzafman@redhat.com>	2019-08-14 20:25:12 -05:00
Kefu Chai	f13c7c83d9	Merge pull request #29342 from Jeegn-Chen/wip-scrub-extended-sleep osd: support osd_scrub_extended_sleep Reviewed-by: David Zafman <dzafman@redhat.com>	2019-08-13 09:09:52 +08:00
Jeegn Chen	3bfb5c2621	osd: support osd_scrub_extended_sleep 1. always take osd_scrub_sleep for manually initiated scrubs 2. when scrub_time_permit() returns true for scheduled ones, the existing osd_scrub_sleep is used 3. when scrub_time_permit() returns false for scheduled ones, there may be 2 scenarios 3.1 if osd_scrub_extended_sleep <= osd_scrub_sleep, let's take osd_scrub_sleep 3.2 otherwise, let's take osd_scrub_extended_sleep Fixes: http://tracker.ceph.com/issues/40955 Signed-off-by: Jeegn Chen <jeegnchen@tencent.com>	2019-08-12 16:54:36 +08:00
David Zafman	b1c14b7f6e	Merge pull request #29494 from dzafman/wip-scrub-test test: Bump sleep time for slower machines Reviewed-by: Neha Ojha <nojha@redhat.com>	2019-08-07 18:30:31 -07:00
David Zafman	74d294d70b	test: Bump sleep time for slower machines Signed-off-by: David Zafman <dzafman@redhat.com>	2019-08-05 07:40:09 -07:00
Changcheng Liu	43ad4bf0dc	ceph-objectstore-tool: set log date format Set datefmt parameter to track the log information %F Equivalent to %Y-%m-%d %T Equivalent to "%H:%M:%S" Signed-off-by: Robert Church <robert.church@windriver.com> Reviewed-by: Changcheng Liu <changcheng.liu@aliyun.com>	2019-07-25 09:39:19 +08:00
Sage Weil	1b46267cf7	Merge PR #28839 into master * refs/pull/28839/head: osd: support osd_repair_during_recovery Reviewed-by: David Zafman <dzafman@redhat.com>	2019-07-16 10:07:53 -05:00
Sage Weil	ff7813aa14	qa/standalone/scrub/osd-scrub-snaps.sh: adjust expected output SnapSet now dumps just seq, not a (fake) SnapContext. Signed-off-by: Sage Weil <sage@redhat.com>	2019-07-12 09:55:06 -05:00
Sage Weil	03b9c66080	ceph-objectstore-tool: fix use of SnapSet::snaps Instead, use clone_snaps to identify clones. Signed-off-by: Sage Weil <sage@redhat.com>	2019-07-12 09:55:06 -05:00
Sage Weil	23eaf7c498	qa/standalone/scrub/osd-scrub-snaps: fix kv grep SnapMapper keys are now SNA_, not MAP_. Fixes: http://tracker.ceph.com/issues/40725 Signed-off-by: Sage Weil <sage@redhat.com>	2019-07-12 08:11:21 -05:00
Sage Weil	b2eb5232de	Merge PR #28901 into master * refs/pull/28901/head: qa/standalone/scrub/osd-scrub-repair: fix 'scrub ok' grep osd/osd_types: remove 'snap_context' from SnapSet::dump() Reviewed-by: Kefu Chai <kchai@redhat.com> Reviewed-by: Josh Durgin <jdurgin@redhat.com>	2019-07-08 08:36:05 -05:00
Jeegn Chen	80f4e1f677	osd: support osd_repair_during_recovery osd_repair_during_recovery=true allows an explicitly requested repair to be scheduled on OSDs that are actively recovering. Fixes: http://tracker.ceph.com/issues/40620 Signed-off-by: Jeegn Chen <jeegnchen@tencent.com>	2019-07-08 09:26:27 +08:00
Sage Weil	a960f2faa7	qa/standalone/scrub/osd-scrub-repair: fix 'scrub ok' grep The log now also has a 'purged_snaps scrub ok' message that (generally) precedes the first scrubbed PG. Signed-off-by: Sage Weil <sage@redhat.com>	2019-07-04 18:27:37 -05:00
Sage Weil	70ad54a0b3	osd/osd_types: remove 'snap_context' from SnapSet::dump() We no longer have a snaps field with real values, so dumping this as a "snap_context" is silly. Instead, just dump the seq. Adjust qa/standalone/scrub/osd-scrub-repair.sh accordingly. Signed-off-by: Sage Weil <sage@redhat.com>	2019-07-04 18:24:41 -05:00
Sage Weil	71e5cba00b	Merge PR #28867 into master * refs/pull/28867/head: qa/standalone/ceph-helpers: more osd debug Reviewed-by: David Zafman <dzafman@redhat.com>	2019-07-03 21:27:20 -05:00
David Zafman	fe3b693d0f	Merge pull request #28334 from dzafman/wip-40073 osd: Fix the way that auto repair triggers after regular scrub Reviewed-by: Neha Ojha <nojha@redhat.com> Reviewed-by: Josh Durgin <jdurgin@redhat.com>	2019-07-03 15:27:27 -07:00
Sage Weil	0d0759531a	qa/standalone/ceph-helpers: more osd debug debug_ms=1 debug_monc=20 Hunting down http://tracker.ceph.com/issues/40666 Signed-off-by: Sage Weil <sage@redhat.com>	2019-07-03 16:53:00 -05:00
David Zafman	27918bb906	osd: Handle scrub interval changes Global changes reschedule all PG scrubs Pool changes reschedule pool PG scrubs Signed-off-by: David Zafman <dzafman@redhat.com>	2019-06-27 14:20:54 -07:00
Neha Ojha	bd15824567	Merge pull request #28204 from dzafman/wip-39555 mon: Improve health status for backfill_toofull and recovery_toofull Reviewed-by: Joao Eduardo Luis <joao@suse.de> Reviewed-by: Neha Ojha <nojha@redhat.com>	2019-06-20 11:12:10 -07:00
David Zafman	fa698e18e1	mon: Improve health status for backfill_toofull and recovery_toofull Treat backfill_toofull as a warning condition because it can resolve itself. Includes test case for PG_BACKFILL_FULL Includes test case for recovery_toofull / PG_RECOVERY_FULL Fixes: https://tracker.ceph.com/issues/39555 Signed-off-by: David Zafman <dzafman@redhat.com>	2019-06-20 02:22:01 +00:00
xie xingguo	ec27a162de	mgr, osd: 'ceph osd df' by pool Our test admin has been asking for this for the past few years:-) Besides, this is also useful for operating on large Ceph clusters with multiple storage pools possibly spanning over all osds. Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>	2019-06-18 20:29:40 +08:00
David Zafman	590b4138ae	Merge pull request #28302 from dzafman/wip-40078 test: Make sure that extra scheduled scrubs don't confuse test Reviewed-by: Josh Durgin <jdurgin@redhat.com>	2019-06-05 14:43:30 -07:00
Kefu Chai	cdba0f1420	qa/standalone/ceph-helpers: resurrect all OSD before waiting for health Address the regression introduced by e62cfceb. In e62cfceb, we wanted to test the newly introduced TOO_FEW_OSDS warning, so we increased the number of OSDs to the size of the pool, so that if the number of OSDs is less than the pool size, the monitor will send a warning message. But we need to bring all OSDs back if we are expecting a healthy cluster. In this change, all OSDs are resurrected before `wait_for_health_ok`. Signed-off-by: Kefu Chai <kchai@redhat.com>	2019-05-30 23:52:36 +08:00
Kefu Chai	f6b022bdbe	Merge pull request #27806 from ashitakasam/add-osd-alarm osd: Better error message when OSD count is less than osd_pool_default_size Reviewed-by: Neha Ojha <nojha@redhat.com>	2019-05-30 21:28:54 +08:00
David Zafman	893d227c82	test: Make sure that extra scheduled scrubs don't confuse test Fixes: http://tracker.ceph.com/issues/40078 Signed-off-by: David Zafman <dzafman@redhat.com>	2019-05-29 14:03:57 -07:00
David Zafman	7959159e83	test: Adding standalone test of log copy handling Signed-off-by: David Zafman <dzafman@redhat.com>	2019-05-10 15:31:51 -07:00
zjh	e62cfceb95	qa/standalone: remove osd_pool_default_size in test_wait_for_health_ok Signed-off-by: zjh <jhzeng93@foxmail.com>	2019-05-06 14:35:54 +08:00
Samuel Just	5ea5c47152	test-erasure-eio: first eio may be fixed during recovery The changes to the way EC/ReplicatedBackend communicate read errors had a side effect of making first eio on the object in TEST_rados_get_subread_eio_shard_[01] repair itself depending on the timing of the killed osd recovering. The test should be improved to actually test that behavior at some point. Signed-off-by: Samuel Just <sjust@redhat.com>	2019-05-01 11:22:28 -07:00
sjust@redhat.com	252d5c20cf	osd/: move stat updates and publishing to PeeringState Signed-off-by: Samuel Just <sjust@redhat.com>	2019-05-01 11:22:24 -07:00
David Zafman	66b041fa4a	Merge pull request #27769 from dzafman/wip-39333 osd-backfill-space.sh test failed in TEST_backfill_multi_partial() Reviewed-by: Neha Ojha <nojha@redhat.com>	2019-04-26 11:55:04 -07:00
David Zafman	9931023457	test: osd-backfill-space.sh doesn't matter which PG wins the race Fixes: http://tracker.ceph.com/issues/39333 Signed-off-by: David Zafman <dzafman@redhat.com>	2019-04-26 10:11:00 -07:00
David Zafman	39cc14bdc1	Merge pull request #27503 from dzafman/wip-39099 osd: Give recovery for inactive PGs a higher priority Reviewed-by: Sage Weil <sage@redhat.com> Reviewed-by: Neha Ojha <nojha@redhat.com>	2019-04-25 15:06:56 -07:00
David Zafman	71d254647a	test: osd-recovery-scrub.sh ignore error from kill_daemons() Another work around for http://tracker.ceph.com/issues/38195 Signed-off-by: David Zafman <dzafman@redhat.com>	2019-04-25 13:53:27 -07:00
David Zafman	71d82dbeb9	test: Add tests for pool recovery priority conversion Signed-off-by: David Zafman <dzafman@redhat.com>	2019-04-25 13:53:27 -07:00
David Zafman	444aa9f9fe	osd, mon: New pool recovery priority range -10 to 10 Use OSD_POOL_PRIORITY_MAX and OSD_POOL_PRIORITY_MIN constants Scale legacy priorities if exceeds maximum Signed-off-by: David Zafman <dzafman@redhat.com>	2019-04-25 13:53:27 -07:00
David Zafman	3a234164d0	Merge pull request #27279 from dzafman/wip-divergent Improvements to standalone tests Reviewed-by: Kefu Chai <kchai@redhat.com> Reviewed-by: Neha Ojha <nojha@redhat.com>	2019-04-24 10:58:11 -07:00
Sage Weil	a3a4af3454	Merge PR #27656 into master * refs/pull/27656/head: doc/dev/erasure-coded-pool: update doc/rados/operations/erasure-code*: update default ec profile references common/options: change default erasure-code-profile to k=2 m=2 Reviewed-by: Neha Ojha <nojha@redhat.com>	2019-04-24 08:14:55 -05:00
David Zafman	7e77898001	test: Divergent testing of _merge_object_divergent_entries() cases Case 1: A more recent update exists Case 2: The first entry in the divergent sequence is a create Case 3 NOT TESTED - Object currently missing Case 4: We can rollback all of the entries Case 5: We cannot rollback at least 1 of the entries Support starting OSDs even when "noup" is set (don't wait for up). Move create_ec_pool() to ceph-helpers.sh Fixes: https://tracker.ceph.com/issues/39162 Signed-off-by: David Zafman <dzafman@redhat.com>	2019-04-22 18:50:24 -07:00
Sage Weil	755e8c4ef2	Merge PR #27595 into master * refs/pull/27595/head: osd: add 'ceph osd stop <osd.nnn>' command Reviewed-by: Sage Weil <sage@redhat.com>	2019-04-20 08:52:01 -05:00
Sage Weil	3e86be7d50	common/options: change default erasure-code-profile to k=2 m=2 Signed-off-by: Sage Weil <sage@redhat.com>	2019-04-19 16:47:57 -05:00
xie xingguo	5dbae13ce0	osd: add 'ceph osd stop <osd.nnn>' command stop command can be used to force stopping a specified osd daemon, e.g., you don't have to pre-figure out where it is located. Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>	2019-04-18 13:55:02 +08:00
David Zafman	96861a8116	ceph-objectstore-tool: Rename dump-import to dump-export If user specifies dump-import it will still work, but isn't in the usage that way. Fixes: http://tracker.ceph.com/issues/39284 Signed-off-by: David Zafman <dzafman@redhat.com>	2019-04-12 13:17:45 -07:00
Sage Weil	dc97651cbd	Merge PR #27499 into master * refs/pull/27499/head: qa/standalone/osd/osd-markdown: fix dup command disabling Reviewed-by: Neha Ojha <nojha@redhat.com>	2019-04-12 06:54:58 -05:00
Sage Weil	f7216d0b2c	qa/standalone/osd/osd-markdown: fix dup command disabling The ceph cli tool checks for the presence of the variable, not its value. Fixes: http://tracker.ceph.com/issues/38359 Signed-off-by: Sage Weil <sage@redhat.com>	2019-04-10 16:44:38 -05:00
David Zafman	69fa515c95	test: Make most tests use default objectstore bluestore Change run_osd() to default objectstore bluestore Use run_osd_filestore() to use the non-default objectstore Fix inject_eio to handle any objectstore if config prefixed with type Remaining tests using filestore: osd-pool-create.sh TEST_pool_create_rep_expected_num_objects Test filestore directory creation qa/standalone/osd/osd-dup.sh TEST_filestore_to_bluestore Obvious qa/standalone/osd/osd-rep-recov-eio.sh TEST_rep_read_unfound Requires data digest in object info qa/standalone/scrub/osd-scrub-repair.sh multiple tests Erasure code pools append mode for filestore is tested qa/standalone/special/ceph_objectstore_tool.py Test code verifies COT by directly examining filestore contents Fixes: https://tracker.ceph.com/issues/39162 Signed-off-by: David Zafman <dzafman@redhat.com>	2019-04-10 08:55:04 -07:00
Kefu Chai	3805935ae0	Merge pull request #26806 from xiexingguo/wip-repair-eio-rep osd: automatically repair replicated replica on pulling error Reviewed-by: David Zafman <dzafman@redhat.com>	2019-04-08 19:46:36 +08:00
xie xingguo	6a8aedc107	qa: add new test case for pulling error Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>	2019-04-04 11:04:43 +08:00
David Zafman	11f072fee1	Add checking of num_shards_repaired in osd stats Signed-off-by: David Zafman <dzafman@redhat.com>	2019-04-04 11:04:42 +08:00
Sage Weil	3c9db396ae	Merge PR #27141 into master * refs/pull/27141/head: mon/OSDMonitor: fix osd boot feature vs require_osd_release check include/ceph_features: retire 7 other old features include/ceph_features: retire ERASURE_CODE_PLUGINS_V2 include/ceph_features: retire OSD_ERASURE_CODES include/ceph_features: update comment to align with N+2 upgrades include/ceph_features: adjust whitespace for retired and now usable features mon: remove check for jewel mons mds/FSMap: remove support for encoding jewel FSMap include/ceph_features: enable SERVER_OCTOPUS test/cli/osdmaptool/feature-set-unset-list: add octopus to output test/cli/osdmaptool/feature-set-unset-list: change unknown feature bit qa/releases/octopus.yaml: add octopus upgrade final step osd/OSDMap: octopus encoding features mon/OSDMonitor: add mon_debug_no_require_octopus mon/OSDMonitor: allow 'osd require-osd-release octopus' mon: add ondisk incompat octopus feature mon/mon_types: add mon feature for octopus include/ceph_features: SERVER_O -> SERVER_OCTOPUS Reviewed-by: Neha Ojha <nojha@redhat.com>	2019-04-03 14:59:03 -05:00
Sage Weil	d667228c2e	Merge PR #27146 into master * refs/pull/27146/head: mon/MonMap: add min_quorum_size() helper mon/MDSMonitor: add 'mds ok-to-stop' command mon: add 'mon ok-to-{stop,add-offline,rm}' commands Reviewed-by: Kefu Chai <kchai@redhat.com> Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>	2019-04-03 13:49:19 -05:00
Sage Weil	3760e8f918	mon/OSDMonitor: add mon_debug_no_require_octopus Signed-off-by: Sage Weil <sage@redhat.com>	2019-04-02 16:19:43 -05:00
Sage Weil	aa33a26e32	mon/MDSMonitor: add 'mds ok-to-stop' command Signed-off-by: Sage Weil <sage@redhat.com>	2019-04-01 14:58:50 -05:00
Sage Weil	fbfa772047	mon/mon_types: add mon feature for octopus Signed-off-by: Sage Weil <sage@redhat.com>	2019-04-01 11:26:33 -05:00
Sage Weil	cfba0acc01	mon: add 'mon ok-to-{stop,add-offline,rm}' commands Helpers to decide when it is safe to stop a mon, add a mon that is not started, or remove a mon. (Adding and starting a mon would always be safe, but it takes time to sync, so it's not really possible to do quickly.) Signed-off-by: Sage Weil <sage@redhat.com>	2019-04-01 11:05:52 -05:00
Sage Weil	420edba243	Merge PR #27169 into master * refs/pull/27169/head: common/config: parse --default-$option as a default value Reviewed-by: Sébastien Han <seb@redhat.com> Reviewed-by: Neha Ojha <nojha@redhat.com>	2019-03-27 09:48:33 -05:00
Sage Weil	fdd2000631	common/config: parse --default-$option as a default value Sometimes it is useful to specify an alternative default value for an option via the command line such that it has a lower priority than the mon config database, config file, the rest of the command line, or the environment. Signed-off-by: Sage Weil <sage@redhat.com>	2019-03-26 11:00:27 -05:00
David Zafman	57abdb11fa	osd, test: Add num_shards_repaired to osd_stat_t for pushes with repair set 3(3) Fixes: http://tracker.ceph.com/issues/38616 Signed-off-by: David Zafman <dzafman@redhat.com>	2019-03-25 16:03:36 -07:00
David Zafman	d2ca3d2feb	osd: Track num_objects_repaired in pg stats 2(3) Leave repair pg state on until recovery finishes or a new scrub starts Fixes: http://tracker.ceph.com/issues/38616 Signed-off-by: David Zafman <dzafman@redhat.com>	2019-03-25 16:03:36 -07:00
David Zafman	2202e5d0b1	test, osd: Improvements to auto_repair 1(3) Allow auto_repair for replicated bluestore pools Regular scrub within auto repair parameters will trigger deep scrub New state failed_repair if PG repair attempt could not fix everything Set failed_repair if not possible to repair anything Fixes: http://tracker.ceph.com/issues/38616 Signed-off-by: David Zafman <dzafman@redhat.com>	2019-03-23 09:52:40 -07:00
David Zafman	315d324889	test: osd-scrub-repair.sh: use corrupt_and_repair_lrc for lrc tests Fix for argument handling of create_ec_pool() Always pass a value for allow_overwrites for consistency Caused by: 3ca750d41dfe33c6efea4abc96d2bd426a9742b9 Signed-off-by: David Zafman <dzafman@redhat.com>	2019-03-23 09:52:40 -07:00
Sage Weil	be1187575b	Merge PR #27021 into master * refs/pull/27021/head: msg: remove XioMessenger qa/suites/rados/thrash-old-clients: add nautilus qa/suites/rados/thrash-old-clients: add mimic v1 variant qa/suites/rados/thrash-old-clients: add mimic qa/suites/rados/thrash-old-clients: collapse msgr and client choice qa: remove simplemessenger tests ceph_test_msgr: remove simple msg: remove SimpleMessenger Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn> Reviewed-by: Matt Benjamin <mbenjami@redhat.com> Reviewed-by: Kefu Chai <kchai@redhat.com>	2019-03-22 04:42:30 -05:00
Kefu Chai	f2b3bfa3aa	Merge pull request #26955 from liewegas/wip-slow-add crush: various fixes for weight-sets, the osd_crush_update_weight_set option, and tests Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>	2019-03-22 15:42:13 +08:00
Sage Weil	28b4392a71	qa: remove simplemessenger tests Signed-off-by: Sage Weil <sage@redhat.com>	2019-03-20 06:10:25 -05:00
Sage Weil	4c741c109d	qa/standalone/crush/crush-choose-args: add weight-set tests Verify we have the expected behavior for creates and moves that maintain bucket summation, both with and without the osd_crush_update_weight_set option enabled. Signed-off-by: Sage Weil <sage@redhat.com>	2019-03-20 04:57:51 -05:00
Sage Weil	f20c736e99	qa/standalone/crush/crush-choose-args: fix test - Make the initial weight-set actually consistent (summing) - Fix the intermediate state so that it reflects a correctly maintained summation. Signed-off-by: Sage Weil <sage@redhat.com>	2019-03-20 04:57:51 -05:00
Sage Weil	13d7c4f4ec	Merge PR #26898 into nautilus * refs/pull/26898/head: osd/PG: invalidate PG if merging with unexpected version osd,mon: include more pg merge metadata in pg_pool_t qa/standalone/osd/pg-split-merge.sh: reproduce pg merge problem with empty pgs osd: add osd_debug_no_{acting_change,purge_strays} Reviewed-by: Neha Ojha <nojha@redhat.com>	2019-03-14 22:37:18 -05:00
Sage Weil	4bb4f7a891	Merge PR #26894 into nautilus * refs/pull/26894/head: qa/standalone/erasure-code/test-erasure-code: adjust test to avoid m=0 erasure-code: ensure m >= 1 mon/OSDMonitor: set ec min_size to k + min(1, m - 1) Reviewed-by: Kefu Chai <kchai@redhat.com> Reviewed-by: Neha Ojha <nojha@redhat.com>	2019-03-13 22:07:45 -05:00
Sage Weil	52d5797c3d	qa/standalone/erasure-code/test-erasure-code: adjust test to avoid m=0 _DD is k=2 m=0, which we don't allow. Switch it to cDD. I confess I don't fully understand why this was _DD to begin with, but I'm pretty sure mapping is there to control the order of results so that it can be mapped to the CRUSH rule output sanely, and the coding portion is not relevant to the test. Signed-off-by: Sage Weil <sage@redhat.com>	2019-03-13 12:46:50 -05:00
Sage Weil	fb915c4805	osd/PG: invalidate PG if merging with unexpected version If the source or target PG version is 0'0, we may silently take the max of the source and target and still leave the PG complete. This specifically can happen with an empty PG, as seen with bug 38655. In theory we could encounter one of the PGs with some other last_update that doesn't match what we expect. If that ever happens, make sure the result is incomplete so that backfill can clean up. Additionally check that the pool metadata for the last merge matches the PGs at all. This could mismatch if we have an osdmap gap and are forced to do some merge without merge info at all... in which case we should definitely invalidate: there should be newer copies of the PG(s), and we have no idea whether the PGs we are merging are what we want. If this is some disaster recovery situation, an operator is always free to use ceph-objectstore-tool to re-mark a PG complete (at their own peril!). Fixes: http://tracker.ceph.com/issues/38655 Signed-off-by: Sage Weil <sage@redhat.com>	2019-03-12 10:08:46 -05:00
David Zafman	51a45e796e	qa/test-erasure-code.sh: Don't grep entire bluestore directory Bluestore caused grep crash with "grep: memory exhausted" due to size of "block" storage. Fixes: http://tracker.ceph.com/issues/38678 Signed-off-by: David Zafman <dzafman@redhat.com>	2019-03-11 18:47:29 -07:00
David Zafman	d4915ee503	qa: Don't create rbd pool because it creates an object This also reverts commit 10b9626ea7b09e7c124067a2ce08a76eea073c9c. Fixes: http://tracker.ceph.com/issues/38631 Signed-off-by: David Zafman <dzafman@redhat.com>	2019-03-11 16:57:51 -07:00
David Zafman	8114a2619b	qa: Can't wait for clean when there aren't any pools/PGs. Fixes: http://tracker.ceph.com/issues/38678 Signed-off-by: David Zafman <dzafman@redhat.com>	2019-03-11 16:02:48 -07:00
Sage Weil	f978b27d2b	qa/standalone/osd/pg-split-merge.sh: reproduce pg merge problem with empty pgs This reproduces http://tracker.ceph.com/issues/38655 Signed-off-by: Sage Weil <sage@redhat.com>	2019-03-11 17:10:28 -05:00
Sage Weil	2ad02fbfe3	qa/standalone/erasure-code/test-erasure-eio.sh: still need to create rbd pool Signed-off-by: Sage Weil <sage@redhat.com>	2019-03-09 09:34:49 -06:00
Sage Weil	10b9626ea7	qa/standalone/scrub/osd-scrub-repair: fix unfound grep It's now "1/2 unfound": 1/2 objects unfound (50.000%) ..presumably due to the rbd pool init creating the rbd_directory. Signed-off-by: Sage Weil <sage@redhat.com>	2019-03-08 18:23:48 -06:00
Sage Weil	30fc7f5e97	qa/standalone/ceph-helpers: fix test_wait_for_clean Signed-off-by: Sage Weil <sage@redhat.com>	2019-03-08 18:07:10 -06:00
Sage Weil	1e2b0c7252	qa/standalone/ceph-helpers.sh: fix test_run_mon - Only create each osd once - forget the first osdmap dump test; it's pointless Signed-off-by: Sage Weil <sage@redhat.com>	2019-03-08 17:43:00 -06:00
Sage Weil	bf74c1adc4	qa/standalone/osd/osd-rep-recov-eio: fix better - no need for the default pool size - no initial osds or it will collide with setup_osds later - no need for rbd pool at all Signed-off-by: Sage Weil <sage@redhat.com>	2019-03-08 17:41:11 -06:00
Sage Weil	62136d381a	Merge PR #26794 into master * refs/pull/26794/head: mon/MgrMonitor: only try to update always_on_modules if >= NAUTILUS qa/standalone/mon/msgr-v2-transition: add some tests for enabling msgr v2 mon/MonmapMonitor: add 'ceph mon set-addrs <name> <addrvec>' command Revert "mon/MonClient: disable ms_bind_msgr2 if NAUTILUS feature not set" mon/OSDMonitor: use legacy_equals to compare osd addrs msg/msg_types: make legacy_equals() symmetrical mon/MDSMonitor: stop using get_orig_source_inst() Reviewed-by: Patrick Donnelly <pdonnell@redhat.com> Reviewed-by: Neha Ojha <nojha@redhat.com>	2019-03-07 22:12:52 -06:00
Sage Weil	c939eefa16	qa/standalone/mon/msgr-v2-transition: add some tests for enabling msgr v2 Signed-off-by: Sage Weil <sage@redhat.com>	2019-03-07 16:35:35 -06:00
Sage Weil	b59ff3860f	qa/standalone/osd/osd-force-create-pg: create more pgs Avoid warnings about too few pgs. Signed-off-by: Sage Weil <sage@redhat.com>	2019-03-06 16:27:56 -06:00
Sage Weil	cba0483b09	qa/standalone: make sure an osd is running before create_rbd_pool 'rbd pool init' now does IO. Drop the pool, or change the pool size to 1. Fixes: http://tracker.ceph.com/issues/38585 Signed-off-by: Sage Weil <sage@redhat.com>	2019-03-06 16:27:56 -06:00
Sage Weil	01316aa7bd	qa/standalone/osd/pg-split-merge: fix import_after_merge_and_gap This test introduces a map gap. What should happen is that when there is such a gap, we cannot import. Previously, the test didn't reliably produce a map gap at all, and didn't check that import failed--it verified that it passed. Fix the test so that it reliably produces a gap and reports min_last_epoch_clean to the mon so we can trim. Then verify we fail to import, but can with --force. But remove the pg again, because if we force an import with a map gap the osd will refuse to start. Fixes: http://tracker.ceph.com/issues/38525 Signed-off-by: Sage Weil <sage@redhat.com>	2019-03-03 10:23:27 -06:00
Sage Weil	c6a7b2cbd1	qa/standalone/osd/osd-markdown: disable CLI command dups The markdown test is based on marking down a specific number of times, but the duplicate commands from the CLI may not get absorbed/batched by the mon, breaking the test. Override the default qa/tasks/workunit.py behavior of sending dups. Fixes: http://tracker.ceph.com/issues/38359 Signed-off-by: Sage Weil <sage@redhat.com>	2019-02-18 15:02:25 -06:00
David Zafman	64beabc4c6	test: Limit loops waiting for force-backfill/force-recovery to happen Fixes: http://tracker.ceph.com/issues/38309 Signed-off-by: David Zafman <dzafman@redhat.com>	2019-02-13 17:44:53 -08:00
David Zafman	910a95b9c8	test: osd-backfill-stats.sh Fix check of multi backfill OSDs, skip remapped test Signed-off-by: David Zafman <dzafman@redhat.com>	2019-02-07 20:05:58 -08:00
David Zafman	690ff9a21f	Merge pull request #26213 from dzafman/wip-38041 osd: Fix recovery and backfill priority handling Reviewed-by: Neha Ojha <nojha@redhat.com> Reviewed-by: Josh Durgin <jdurgin@redhat.com>	2019-02-07 17:26:34 -08:00
David Zafman	ca5cf14fa8	test: Add scripts to test backfill/recovery priority handling Signed-off-by: David Zafman <dzafman@redhat.com>	2019-02-07 15:46:23 -08:00
Sage Weil	dcdca44aa4	qa/standalone/ceph-helpers: fix health_ok test Stopping the osd daemon won't reliably get you HEALTH_WARN or ERR; you have to make sure it is also marked down. Signed-off-by: Sage Weil <sage@redhat.com>	2019-02-07 12:10:34 -06:00
David Zafman	36e305c4b6	test: Ignore kill_daemons() error Workaround for: http://tracker.ceph.com/issues/38195 Signed-off-by: David Zafman <dzafman@redhat.com>	2019-02-05 11:31:32 -08:00
David Zafman	bca4fe98b1	test: Fix kill_daemon() to check after last large sleep Signed-off-by: David Zafman <dzafman@redhat.com>	2019-02-05 11:30:04 -08:00
David Zafman	cc6339c0cd	test: Increase timeouts in osd-backfill-space.sh because of failure seen Fixes: http://tracker.ceph.com/issues/38027 Signed-off-by: David Zafman <dzafman@redhat.com>	2019-02-05 11:29:32 -08:00
David Zafman	70b5136208	test: Add option to wait_for_clean() to execute at every sleep Signed-off-by: David Zafman <dzafman@redhat.com>	2019-01-30 09:35:51 -08:00
David Zafman	553d83dd24	Merge pull request #25403 from liyichao/rdigest tools: Add clear-data-digest command to objectstore tool. Reviewed-by: David Zafman <dzafman@redhat.com>	2019-01-30 09:30:23 -08:00
David Zafman	894bdf080e	Merge pull request #26158 from dzafman/wip-38053 Add hashinfo testing for dump command of ceph-objectstore-tool Reviewed-by: Neha Ojha <nojha@redhat.com>	2019-01-30 09:29:01 -08:00
Kefu Chai	8d5ddb5817	Merge pull request #26091 from tchaikov/wip-36737 cmake: use $CMAKE_BINARY_DIR for default $CEPH_BUILD_VIRTUALENV Tested-by: Yuri Weinstein <yweins@redhat.com> Reviewed-by: Lenz Grimmer <lgrimmer@suse.com>	2019-01-30 21:35:52 +08:00
Kefu Chai	94a84b6f5a	test: listen on random port in tests which start ceph-mon See-also: http://tracker.ceph.com/issues/36737 Signed-off-by: Kefu Chai <kchai@redhat.com>	2019-01-27 21:16:54 +08:00
David Zafman	388e54d906	test: ceph-objectstore-tool cut down on large run Signed-off-by: David Zafman <dzafman@redhat.com>	2019-01-26 19:35:11 -08:00
David Zafman	07e4273c6a	test: ceph-objectstore-tool: Add test for EC object dump to check hinfo section Fixes: http://tracker.ceph.com/issues/38053 Signed-off-by: David Zafman <dzafman@redhat.com>	2019-01-26 19:35:11 -08:00
David Zafman	786b39f18f	test: ceph-objectstore-tool: Fix EC code handling so it doesn't skip EC objects Signed-off-by: David Zafman <dzafman@redhat.com>	2019-01-26 19:35:11 -08:00
David Zafman	0753dc2b17	test: Remove unnecessary shell code that breaks this python test Caused by: 8a694fc2f9e75864e064fe591feda9fab943c15e Signed-off-by: David Zafman <dzafman@redhat.com>	2019-01-26 19:35:11 -08:00
David Zafman	ef2dc05de0	osd, test: Add test case with osd support for overdue PG scrubs and deep scrubs Add trigger_deep_scrub osd command for testing Publish stats when trigger_scrub/trigger_deep_scrub is used for testing Add optional argument to trigger_scrub/trigger_deep_scrub for amount of extra time to change last scrub stamps Signed-off-by: David Zafman <dzafman@redhat.com>	2019-01-23 16:49:33 -08:00
David Zafman	879d89aace	test: Correct typo trying to call flush_pg_stats Signed-off-by: David Zafman <dzafman@redhat.com>	2019-01-23 16:49:33 -08:00
David Zafman	99ddd3666b	Merge pull request #22797 from dzafman/wip-19753 osd: Deny reservation if expected backfill size would put us over bac… Reviewed-by: Josh Durgin <jdurgin@redhat.com> Reviewed-by: Neha Ojha <nojha@redhat.com>	2019-01-18 07:42:00 -08:00
liyichao	da5832b2b4	tools: Add clear-data-digest command to objectstore tool. There may be a situation where data digest in object info is inconsistent with that computed from object data, then deep-scrub will fail even though all three replicas have the same object data. Fixes: https://tracker.ceph.com/issues/37935 Signed-off-by: Li Yichao <liyichao.good@gmail.com>	2019-01-17 11:03:51 +08:00
Vikhyat Umrao	8a694fc2f9	qa: specify filestore for misc tests Signed-off-by: Vikhyat Umrao <vumrao@redhat.com> Signed-off-by: Sage Weil <sage@redhat.com>	2019-01-16 13:09:19 -06:00
Sage Weil	2762955576	qa/standalone/mon/mon-handle-forward: fix grep path and check return results This makes the test more strict and less confusing. Signed-off-by: Sage Weil <sage@redhat.com>	2019-01-10 17:18:38 -06:00
Sage Weil	251f667ef8	Merge PR #25009 into master * refs/pull/25009/head: librbd: stringify locker name with get_legacy_str() osdc/Objecter: fix list_watchers addr rendering to match legacy test/crimson: disable unittest_seastar_messenger test msg/msg_types: encode entity_addr_t TYPE_ANY as TYPE_LEGACY for pre-nautilus client: make blacklist detection handle TYPE_ANY entries mon/OSDMonitor: maintain compat output for 'blacklist ls' client: maintain compat for {inst,addr}_str in status dump qa/tasks/ceph_manager: compare osd flush seq #'s as ints qa/suites/fs: make use of simple.yaml where appropriate qa/msgr: move msgr factet into generic re-usable dir crimson: fix monmap build for seastar doc/start/ceph.conf: trim the sample ceph.conf file doc/rados/operations: only describe --public-{addr,network} method for adding mons PendingReleaseNotes: deprecate 'mon addr' doc: fix some 'mon addr' references doc/rados/configuration: fix some 'mon addr' references doc/rados/configuration/network-config-ref: revise network docs somewhat doc/rados/configuration/network-config-ref: remove totally obsolete section qa/suites/rados: replace mon_seesaw.py task with a small bash script qa/suites/fs/upgrade: don't bind to v2 addrs qa/tasks/mon_thrash: avoid 'mon addr' in mon section mon/MonClient: disable ms_bind_msgr2 if NAUTILUS feature not set osd/OSDMap: maintain compat addr fields msg/msg_types: add get_legacy_str() mds/MDSMap.h: maintain compat addr field mon/MgrMap: maintain compat active_addr field mon/MonClient: reconnect to mon if it's addrvec appears to have changed qa/tasks/ceph.conf.template: increase mon_mgr_mkfs_grace msg/async/ProtocolV2: fill in IP for all peer_addrs msg/async: print all addrs on debug lines mon/MonMap: no noname- mon name prefix when for_mkfs ceph-monstore-tool: print initial monmap msg/async/ProtocolV2: advertise ourselves as a v2 addr when using v2 protocol msg/async: assert existing protocol matches current protocol msg/async: add missing modelines 
mon/MonMap: add missing modeline vstart.sh: put mon addrs in mon_host, not 'mon addr' msg/async: better debug around conn map lookups and updates mon/MonClient: dump initial monmap at debug level 10 qa/standalone/osd/osd-fast-mark-down: use v1 addr w/ simplemessenger qa/tasks/ceph: set initial monmap features with using addrvec addrs monmaptool: add --enable-all-features option qa/tasks/ceph: only use monmaptool --addv if addr has [,:v] qa/tasks/ceph_manager: make get_mon_status use mon addr qa/tasks/ceph: keep mon addrs in ctx namespace mon/OSDMonitor: log all osd addrs on boot msg/simple: behave when v2 and v1 addrs are present at target mon/MonClient: warn if global_id changes msg/Connection: add warning/note on get_peer_global_id mds/MDSDaemon: clean up handle_mds_map debug output a bit qa/suites/rados/upgrade: debug mds mds/MDSRank: improve is_stale_message to handle addrvecs msg/async: make loopback detect when sending to one of our many addrs qa/suites/rados/upgrade: no aggressive pg num changes mon/OSDMonitor: require nautilus mons for require_osd_release=nautilus mon/OSDMonitor: require mimic mons for require_osd_release=mimic qa/suites/rados/thrash-old-clients: use legacy addr syntax in ceph.conf msg/async: preserve peer features when replacing a connection qa/tasks/ceph.py: move methods from teuthology.git into ceph.py directly; support mon bind * options mon/MonMap: adjust build_initial behavior for mkfs vs probe mon/MonMap: improve ambiguous addr behavior qa/suites/rados/upgrade: spread mons a bit qa/rados/thrash-old-clients: keep mons on separate hosts qa/standalone/mon/misc.sh: tweak test to be more robust qa/tasks/mon_seesaw: expect v1/v2 prefix in addr osd/OSDMap: fix is_blacklisted() check to assume type ANY mon/OSDMonitor: use ANY addr type for blacklisting mon/msg_types: TYPE_V1ORV2 -> TYPE_ANY qa/workunits/cephtool: fix blacklist test qa/suites/upgrade: install old version with only v1 addrs common/options: by default, bind to both msgr v1 and 
v2 addresses vstart.sh: add --msgr1, --msgr2, --msgr21 options msg/async/ProtocolV2: be flexible with server identity check msg/msg_types: fix entity_addrvec_t::parse() with null end arg qa/suites/rados/basic/msgr: no msgr2 addrs in initial monmaps qa/tasks/ceph: add 'mon_bind_addrvec' and 'mon_bind_msgr2' options monmaptool: add --addv argument to pass in addrvec directly qa/suites/rados/basic/msgr: do not use msgr2 with simplemessenger qa/suites/rados/basic/msgr: async is not experimental messages/MOSDBoot: fix compat with pre-nautilus mon/MonMap: allow v1 or v2 to be explicitly specified along with part msg/msg_types: allow parsing of IPs without assuming v1 vs v2 msg/msg_types: default parse to v2 addrs msg: standarize on v1: and v2: prefixes for all entity_addr_t's vstart.sh: use msgr2 by default mon/MonMap: remove get_addr() methods ceph-mon: adjust startup/bind/join sequence to use addrs mon: use MonMap::get_addrs() (instead of get_addr()) mon/MonClient: change pending_cons to addrvec-based map mon/MonMap: fix set_addr() caller, kill wrapper mon/MonMap: remove addr-based add() monmaptool: fix --add to do either legacy or msgr2+legacy monmaptool: clean up iterator use a bit mon/MonMap: handle ambiguous mon addrs by trying both legacy and msgr mon/MonMap: take addrvec for set_initial_members mon/MonMap: use addrvecs for test instances mon: pass addrvec via MMonJoin mon/MonmapMonitor: fix 'mon add' to populate addrvec mon/MonMap: addr -> addrvec msg/async/ProtocolV2: only update socket_addr if we learned our addr osd: go active even if mon only accepted our v1 addr test/msgr: add test for msgr2 protocol msg/async/ProtocolV2: share socket_addr and all addrs during handshake msg/async: print socket_addr for the connection msg/async: msgr2 protocol placeholder msg/async: move ProtocolV1 class to its own source file msg/async: keep listen addr in ServerSocket, pass to new connections msg/async/AsyncMessenger: fix set_addr_unknowns Reviewed-by: Josh Durgin 
<jdurgin@redhat.com>	2019-01-04 13:42:09 -06:00
Sage Weil	16980bd12f	qa/suites/rados: replace mon_seesaw.py task with a small bash script The teuthology test did not like the change to remove 'mon addr' from ceph.conf. The standalone script is easier to test. Note that it avoids mon names 'a', 'b', 'c' since the MonMap::build_initial uses those. Signed-off-by: Sage Weil <sage@redhat.com>	2019-01-03 11:17:31 -06:00
Sage Weil	b92be2ca9b	qa/standalone/osd/osd-fast-mark-down: use v1 addr w/ simplemessenger Signed-off-by: Sage Weil <sage@redhat.com>	2019-01-03 11:17:31 -06:00
Sage Weil	7559a47f5b	qa/standalone/mon/misc.sh: tweak test to be more robust Signed-off-by: Sage Weil <sage@redhat.com>	2019-01-03 11:17:31 -06:00
David Zafman	554ea73cb5	test: Disable duplicate request command test during scrub testing Scrub testing requires an orderly control of scrubbing. Most but not all the time, the duplicate scrub request is ignored because the first request hasn't finished. Teuthology enables this environment variable in the workunit handling. Fixes: https://tracker.ceph.com/issues/36525 Signed-off-by: David Zafman <dzafman@redhat.com>	2018-12-21 18:28:23 -08:00
David Zafman	094d39aa09	test: Add testing for erasure code backfill out of space detection Signed-off-by: David Zafman <dzafman@redhat.com>	2018-12-18 09:30:44 -08:00
David Zafman	3b8f86c8b0	test: Add testing for backfill out of space detection Signed-off-by: David Zafman <dzafman@redhat.com>	2018-12-18 09:30:44 -08:00
David Zafman	975dbc5841	test: Minor improvement to create_ec_pool() Signed-off-by: David Zafman <dzafman@redhat.com>	2018-12-10 20:16:01 -08:00
Igor Fedotov	79fd227639	qa: replace raw_bytes_used field access in QA test cases Signed-off-by: Igor Fedotov <ifedotov@suse.com>	2018-12-06 18:54:21 +03:00
Igor Fedotov	d07c10dfc0	os/bluestore: add main device expand capability. One can do that via ceph-bluestore-tool's bluefs-bdev-expand command Signed-off-by: Igor Fedotov <ifedotov@suse.com>	2018-11-29 12:48:20 +03:00
David Zafman	1841928e28	test: Add test for requested scrub priority Signed-off-by: David Zafman <dzafman@redhat.com>	2018-11-14 23:57:20 -08:00
Josh Durgin	fd2a4c5733	Merge pull request #22476 from dzafman/wip-23875 Removal of snapshot with corrupt replica crashes osd Reviewed-by: Josh Durgin <jdurgin@redhat.com> Reviewed-by: Kefu Chai <kchai@redhat.com>	2018-11-09 15:15:01 -08:00
David Zafman	a159f162c5	test: osd-scrub-snaps.sh: After snapshot removal wait for snaptrim to complete Due to deliberate corruptions snaptrim_error means snaptrim is done Signed-off-by: David Zafman <dzafman@redhat.com>	2018-11-08 14:48:20 -08:00
David Zafman	e37f95ac27	test: osd-scrub-snaps.sh: Testing with new --rmtype in ceph-objectstore-tool Use --rmtype snapmap with new obj16 to remove snapmap only, check for repair message Use --rmtype nosnapmap to remove obj5 while leaving snapmap behind Signed-off-by: David Zafman <dzafman@redhat.com>	2018-11-08 14:48:20 -08:00
David Zafman	f43faf4ad7	test: cleanup: Remove redundant cat of log and handle errors in create_scenario() Signed-off-by: David Zafman <dzafman@redhat.com>	2018-11-08 14:48:19 -08:00
Sage Weil	c8a8dc21fd	Merge PR #24828 into master * refs/pull/24828/head: qa/osd-bluefs-volume-ops: use ceph-bluestore-tool for fsck qa/osd-bluefs-volume-ops: reduce space usage for the test case Reviewed-by: David Zafman <dzafman@redhat.com>	2018-11-08 16:26:52 -06:00
Sage Weil	5b9be42bf5	Merge PR #15047 into master * refs/pull/15047/head: tool/ceph_objectstore_tool: add new op that reset last_complete to last_update Reviewed-by: Sage Weil <sage@redhat.com>	2018-11-06 10:47:18 -06:00
Sage Weil	9ab9dcfc0d	Merge PR #24809 into master * refs/pull/24809/head: os/bluestore: omit redundant '/' in OSD path for ceph-bluestore-tool if os/bluestore: improve error handling for migrate ops in qa/standtalone/osd-bluefs-volume-ops: remove redundant code. Reviewed-by: Sage Weil <sage@redhat.com>	2018-10-30 15:09:45 -05:00
Igor Fedotov	f5520ea304	qa/osd-bluefs-volume-ops: use ceph-bluestore-tool for fsck Signed-off-by: Igor Fedotov <ifedotov@suse.com>	2018-10-30 15:38:16 +03:00
Igor Fedotov	80e67abdfd	qa/osd-bluefs-volume-ops: reduce space usage for the test case Signed-off-by: Igor Fedotov <ifedotov@suse.com>	2018-10-30 15:38:15 +03:00
Sage Weil	c40685ebdd	Merge PR #24787 into master * refs/pull/24787/head: Merge PR #24796 into nautilus osd: fix heartbeat_reset unlock Merge PR #24780 into nautilus Merge PR #24761 into nautilus Merge PR #24651 into nautilus osd: fix race between op_wq and context_queue test: Make sure kill_daemons failure will be easy to find test: Add flush_pg_stats to make test more deterministic	2018-10-29 08:36:34 -05:00
Igor Fedotov	5d38f8b49b	qa/standtalone/osd-bluefs-volume-ops: remove redundant code. Signed-off-by: Igor Fedotov <ifedotov@suse.com>	2018-10-29 16:30:36 +03:00
Xie Xingguo	e6f9241aeb	Merge pull request #24657 from xiexingguo/wip-rm-device-class-fix mon/OSDMonitor: two "ceph osd crush class rm" fixes Reviewed-by: Sage Weil <sage@redhat.com>	2018-10-27 09:49:57 +08:00
xie xingguo	5bcac35213	mon/OSDMonitor: do not remove device class still referenced by ec-profiles Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>	2018-10-23 21:17:56 +08:00
xie xingguo	4bc54587a1	mon/OSDMonitor: make "ceph osd crush class rm" idempotent Removing a non-existent device class should be generally okay. Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>	2018-10-23 21:17:56 +08:00
Yan Jun	1e98c72dfc	mon: drop repeated 'goodchars' and add osd crush ls testcase Signed-off-by: Yan Jun <yan.jun8@zte.com.cn>	2018-10-23 16:32:45 +08:00
Kefu Chai	4af71e7c00	Merge pull request #23103 from ifed01/wip-ifed-bluefs-migrate os/bluestore: allow ceph-bluestore-tool to coalesce, add and migrate BlueFS backing volumes Reviewed-by: Sage Weil <sage@redhat.com>	2018-10-22 22:33:08 +08:00
liuchang0812	7c008d279e	tool/ceph_objectstore_tool: add new op that reset last_complete to last_update Fixes: http://tracker.ceph.com/issues/19382 Signed-off-by: liuchang0812 <liuchang0812@gmail.com>	2018-10-22 11:03:06 +08:00
David Zafman	da3c556aa2	test: Make sure kill_daemons failure will be easy to find Signed-off-by: David Zafman <dzafman@redhat.com>	2018-10-17 16:54:45 -07:00
David Zafman	b33edbc4f6	test: Add flush_pg_stats to make test more deterministic Signed-off-by: David Zafman <dzafman@redhat.com>	2018-10-17 16:54:45 -07:00
Igor Fedotov	02b5768a4f	tests: add qa test case for bluefs volume coalescence Signed-off-by: Igor Fedotov <ifedotov@suse.com>	2018-10-17 22:39:27 +03:00
Sage Weil	54d539d79a	Merge PR #24603 into master * refs/pull/24603/head: crush: get "ceph osd crush class create/rm" back Reviewed-by: Sage Weil <sage@redhat.com>	2018-10-17 10:06:26 -05:00
xie xingguo	d7ff33e9fd	crush: get "ceph osd crush class create/rm" back This reverts a27fd9d25cb2819e25cc48b790c40afac0250464 and b863883ca783487401fde4f4480ed1d9b093363e. Quote from Sébastien Han: > IIRC at some point, we were able to create a device class from the CLI. Now it seems that the device class gets created when at least one OSD of a particular class starts. In ceph-ansible, we create pools after the initial monitors are up and we want to assign a device crush class on some of them. That's not possible at the moment since there is no device class available yet. Also, someone might want to create its own device class. Something as crazy as running Filestore with a tmpfs osd store and might want to isolate them. I know it's a very limited use case, but still, it could be desired. See also https://www.spinics.net/lists/ceph-devel/msg41152.html Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>	2018-10-16 08:45:49 +08:00
huanwen ren	f1219d716d	qa/osd: fixup osd-rep-recov-eio.sh fails to parse pg dump Fixes: http://tracker.ceph.com/issues/36418 Signed-off-by: huanwen ren <ren.huanwen@zte.com.cn>	2018-10-16 02:18:22 +08:00
John Spray	67d147c00d	Merge pull request #23622 from renhwztetecs/renhw-wip-25103 mgr: fixup pgs show in unknown state Reviewed-by: Kefu Chai <kchai@redhat.com> Reviewed-by: John Spray <john.spray@redhat.com>	2018-10-10 13:28:33 +01:00
huanwen ren	ed442447c0	qa: modify the format for add pgmap_ready. Signed-off-by: huanwen ren <ren.huanwen@zte.com.cn>	2018-09-27 23:22:50 +08:00
Sage Weil	9bf7c810a7	Merge PR #23985 into master * refs/pull/23985/head: ceph-objectstore-tool: add back pool dne check qa/suites/rados/singleton/reg11184: remove old test ceph-objectstore-tool: import pg at original epoch osd: handle null pg slot on startup ceph-objectstore-tool: drop support for ancient export files osd: avoid dropping osd_lock when pg osdmaps are not laggy qa/standalone/osd/pg-merge.sh: add merge vs pg import test Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn> Reviewed-by: Josh Durgin <jdurgin@redhat.com>	2018-09-21 08:21:53 -05:00
Kefu Chai	4b0e2c8ed4	qa: fix typos Signed-off-by: Kefu Chai <kchai@redhat.com>	2018-09-21 12:41:42 +08:00
Sage Weil	26cb966cab	ceph-objectstore-tool: import pg at original epoch - In the jewel era, we fast-forwarded the PG to the OSD's latest epoch and cleared past_intervals. - In mimic, as of 2347ecb9614b0cd4cd9eae1d67b03119cc7ad18e, we brought the PG up to date while updating past_intervals. (At the same time we removed the OSD's parallel past_intervals regeneration.) The problem is that the tool then has to reimplement the past_intervals update logic, and also has to cope with splits and merges. Splits are somewhat easier (until now we enable partial import of a PG into a split child), but merges are not so easy. This patch changes it so we import the PG and leave the pg_epoch matching the import file. The OSD is then responsible for bringing it up to date with the latest map, and dealing with any intervening splits or merges. We also adjust the safety check to ensure that we don't collide with any existing PG, either a child we eventually split into, or a parent we eventually merge into. Fixes: http://tracker.ceph.com/issues/35955 Signed-off-by: Sage Weil <sage@redhat.com>	2018-09-20 12:58:00 -05:00
Sage Weil	da887c82ce	qa/standalone/osd/pg-merge.sh: add merge vs pg import test - You can't import the source half of a PG that's since merged. Sorry! We could implement this later. - You can import the target half, but the result will then be incomplete, and you rely on backfill to clean it up. - Map gaps don't affect this behavior. Signed-off-by: Sage Weil <sage@redhat.com>	2018-09-17 12:52:46 -05:00
Kefu Chai	338612ad88	Merge pull request #24088 from dzafman/wip-35982 qa/standalone: Standalone test corrections Reviewed-by: Josh Durgin <jdurgin@redhat.com> Reviewed-by: Kefu Chai <kchai@redhat.com>	2018-09-17 22:35:43 +08:00
Kefu Chai	f46523e464	Merge pull request #23955 from wjwithagen/wjw-fix-ceph-helpers.sh test: Start using GNU awk and fix archiving directory Reviewed-by: Kefu Chai <kchai@redhat.com>	2018-09-17 15:44:06 +08:00
David Zafman	ef6940fbb6	test: osd-backfill-stats.sh: Fix subtests to get primary which can change Fixes: http://tracker.ceph.com/issues/35982 Signed-off-by: David Zafman <dzafman@redhat.com>	2018-09-13 13:19:23 -07:00
David Zafman	6d53e2c380	test: Fix for error message changed in ceph-objectstore-tool Fixes: http://tracker.ceph.com/issues/35982 Caused by: 6bd682f53dfe0b2f7c31b5c1ba081afb72f1dd6c Signed-off-by: David Zafman <dzafman@redhat.com>	2018-09-13 13:19:11 -07:00
David Zafman	7f83a24553	Merge pull request #24018 from dzafman/wip-35912 qa/standalone: Minor test improvements Reviewed-by: Kefu Chai <kchai@redhat.com>	2018-09-12 13:15:44 -07:00
Kefu Chai	1578875194	Merge pull request #24013 from dzafman/wip-35845 test: Use a grep pattern that works across releases Reviewed-by: Kefu Chai <kchai@redhat.com>	2018-09-12 23:00:39 +08:00
Kefu Chai	510d9e1345	Merge pull request #23723 from xiexingguo/wip-list-missing osd/PrimaryLogPG: rename list_missing -> list_unfound command Reviewed-by: Josh Durgin <jdurgin@redhat.com> Reviewed-by: Sage Weil <sage@redhat.com>	2018-09-11 20:25:21 +08:00
David Zafman	6e3f04365f	test: Trap termination so we can capture logs on teuthology timeout Signed-off-by: David Zafman <dzafman@redhat.com>	2018-09-10 12:23:07 -07:00
David Zafman	dc80f8585a	test: Use a grep pattern that works across releases Fixes: http://tracker.ceph.com/issues/35845 Signed-off-by: David Zafman <dzafman@redhat.com>	2018-09-10 08:21:36 -07:00
Sage Weil	4d2a73c7f1	Merge PR #23845 into master * refs/pull/23845/head: osd/OSDMap: include age in up and in counts for ceph status mon/OSDMonitor: set new_last_{up,in}_change osd/OSDMap: store last_up_change and last_in_change mgr/MgrMap: include mgr age in map printer mon/MgrMap: track active_changed timestamp mon: include mon quorum age in status include/utime: add utimespan_str helper Reviewed-by: John Spray <john.spray@redhat.com>	2018-09-10 07:45:58 -05:00
Sage Weil	f47921f293	qa/standalone/osd/osd-backfill-stats: fixes Grep from the primary's log, not every osd's log. For the backfill_remapped task in particular, after the pg_temp change it just so happens that the primary changes across the pool size change and thus two different primaries do (some) backfill. Fix that test to pass the correct primary. Other tests are unaffected as they do not (happen to) trigger a primary change and already satisfied the (removed) check that only one OSD does backfill. Signed-off-by: Sage Weil <sage@redhat.com>	2018-09-07 17:11:18 -05:00
Sage Weil	4fc02a7f48	osd/OSDMap: include age in up and in counts for ceph status Signed-off-by: Sage Weil <sage@redhat.com>	2018-09-07 09:07:50 -05:00
Willem Jan Withagen	bfe7a2afaa	test: Start using GNU awk and fix archiving directory awk uses some tests that the native FreeBSD awk does not support: like: BEGIN{print 0 < 90} And TESTDIR is not set when calling ceph-helpers from smoke.sh So fix with keeping the archive in /tmp Signed-off-by: Willem Jan Withagen <wjw@digiware.nl>	2018-09-06 15:50:20 +02:00
xie xingguo	85ba2f0a82	osd/PrimaryLogPG: s/list_missing/list_unfound/ Also: - Do not print offset until specified - Count missing objects correctly (used to be primary's local missing) Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>	2018-09-06 09:52:20 +08:00
Sage Weil	88df536908	Merge PR #23540 into master * refs/pull/23540/head: include/ceph_fs: rename old auid field PendingReleaseNotes: note about auid support removal radosgw-admin: remove -a --auth-uid arg rgw: remove auid member from RGWUserInfo auth: remove auid member from EntityAuth osd: remove auid session member mon: remove auid session member doc/dev/cephx_protocol: drop auid reference auth: remove auid args from handle_request and verify_authorizer mon/OSDMonitor: remove 'osd pool {get,set} <name> auid ...' mon/OSDMonitor: remove auid arg for 'osd lspools' and deprecate osd/OSDCap: remove auid from grammar osd/OSDCap: remove auid from is_capable() etc args auth: clean up cap parse error messages mon/AuthMonitor: raise health warning on invalid caps mon/AuthMonitor: drop ancient auth inc encoding compat messages/MPoolOp: drop auid member osdc/Objecter: drop change_pool_auid pybind/rados: drop auid arg to pool_create pybind/rados: drop change_auid rados: drop mkpool, rmpool commands rados: remove 'chown' command librados: deprecate calls that take auid librados: mark all auid calls deprecated mon/OSDMonitor: drop variable pool auid for prepare_new_pool mon/OSDMonitor: remove pool auid change support osdc/Objecter: do not pass auid to create_pool ceph-authtool: remove auid options qa/workunits/cephtool: remove auid tests Reviewed-by: Gregory Farnum <gfarnum@redhat.com>	2018-09-01 15:53:31 -05:00
Xie Xingguo	0857124d23	Merge pull request #23663 from xiexingguo/wip-incompat-async-fixes osd: some recovery improvements and cleanups Reviewed-by: Sage Weil <sage@redhat.com>	2018-09-01 14:27:27 +08:00
Sage Weil	2c26fb0fe1	rados: drop mkpool, rmpool commands - mkpool and rmpool users should use the normal cli/mon commands Signed-off-by: Sage Weil <sage@redhat.com>	2018-08-31 09:27:36 -05:00
xie xingguo	22786cffa8	osd/PG: force auth_log_shard to be primary when appropriate So if there are a lot of missing objects on primary, we can make use of auth_log_shard to restore client I/O quickly. Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>	2018-08-31 16:29:25 +08:00
Sage Weil	85083f39b5	Merge PR #23572 into master * refs/pull/23572/head: qa/standalone/osd/osd-force-create-pg: add force-create-pg test mon/MonCommands: fix 'osd force-create-pg' Reviewed-by: Kefu Chai <kchai@redhat.com>	2018-08-30 08:52:44 -05:00
David Zafman	b0d2c64d6b	Merge pull request #23376 from dzafman/wip-25108 object errors found in be_select_auth_object() aren't logged the same Reviewed-by: Kefu Chai <kchai@redhat.com>	2018-08-23 13:23:55 -07:00
Josh Durgin	cc41b51c6a	Merge pull request #23518 from dzafman/wip-25084 osd: When possible check CRC in build_push_op() so repair can eventually stop Reviewed-by: Josh Durgin <jdurgin@redhat.com>	2018-08-23 11:39:05 -07:00
David Zafman	687f63e599	test: Update tests for error message changes Signed-off-by: David Zafman <dzafman@redhat.com>	2018-08-23 11:09:22 -07:00
David Zafman	58c4d32203	test: Verify cluster logging of scrub error messages Signed-off-by: David Zafman <dzafman@redhat.com>	2018-08-23 11:09:22 -07:00
David Zafman	bc33170310	test: Use pids instead of jobspecs which were wrong Fixes: http://tracker.ceph.com/issues/27056 Signed-off-by: David Zafman <dzafman@redhat.com>	2018-08-22 10:57:04 -07:00
David Zafman	d0b260c272	test: Fix test to use -gt instead of creating an empty file "0" Signed-off-by: David Zafman <dzafman@redhat.com>	2018-08-17 19:33:44 -07:00
Noah Watkins	61e2648c19	qa/ceph_objectstore_tool.py: set mgr module path Signed-off-by: Noah Watkins <nwatkins@redhat.com>	2018-08-17 15:24:10 -07:00
Noah Watkins	7d3fa9bda3	qa/standalone/ceph-helpers.sh: fix mgr module path callers of get_python_path were not passing in a $1 parameter, so ceph_lib was an empty string resulting in an invalid path to the built cython modules. assume this is called from the `lib` parent directory. pass path to the manager modules when starting ceph-mgr. Signed-off-by: Noah Watkins <nwatkins@redhat.com>	2018-08-17 15:21:57 -07:00
David Zafman	c1b2bd7f16	test: Fix test to detect a test setup failure Signed-off-by: David Zafman <dzafman@redhat.com>	2018-08-15 15:45:44 -07:00
David Zafman	72c34949fc	test: Add test for filestore bad CRC in primary pull request Signed-off-by: David Zafman <dzafman@redhat.com>	2018-08-15 15:45:44 -07:00
Sage Weil	ed9ec42c42	qa/standalone/osd/osd-force-create-pg: add force-create-pg test Signed-off-by: Sage Weil <sage@redhat.com>	2018-08-15 06:47:47 -05:00
Sage Weil	662e883683	qa/standalone/crush/crush-choose-args: run mgr The osd purge command needs a running mgr. Fixes: `d2b41d4095` Signed-off-by: Sage Weil <sage@redhat.com>	2018-08-07 11:57:05 -05:00
David Zafman	67d9e44de6	test: Add test for repair of bad object info data_digest on all copies Signed-off-by: David Zafman <dzafman@redhat.com>	2018-07-26 07:50:23 -07:00
Sage Weil	4108ebc0ab	qa/standalone/osd/ec-error-rollforward: reproduce bug 24597 This reproduces http://tracker.ceph.com/issues/24597 Signed-off-by: Sage Weil <sage@redhat.com>	2018-07-11 16:15:49 -05:00
Sage Weil	4f9fdd98e2	qa/standalone/osd/repro_long_log.sh: fix test The log trimming case wasn't quite right. Before HEAD^ we were rolling forward too aggressively and miscalculating the can_rollforward_to, which affected the trim_to calculation. Signed-off-by: Sage Weil <sage@redhat.com>	2018-07-11 16:15:49 -05:00
David Zafman	fbc8bcfe05	test: test_get_timeout_delays() fix Caused by: `7b0d1c8b8a` Signed-off-by: David Zafman <dzafman@redhat.com>	2018-07-03 14:01:36 -07:00
Josh Durgin	9106dc56c2	Merge pull request #22761 from fullerdj/wip-djf-24686 osd/filestore: Change default filestore_merge_threshold to -10 Reviewed-by: Josh Durgin <jdurgin@redhat.com>	2018-07-02 17:36:00 -07:00
Douglas Fuller	75f55f2dfc	osd/filestore: Change default filestore_merge_threshold to -1 Performance evaluations of medium to large size Ceph clusters have demonstrated negligible performance impact from unnecessarily deep directory hierarchies but significant performance impact from filestore split and merge activity. Disable merges by default. Fixes: http://tracker.ceph.com/issues/24686 Signed-off-by: Douglas Fuller <dfuller@redhat.com>	2018-06-29 11:45:12 -04:00
David Zafman	663d96e934	Merge pull request #22727 from dzafman/wip-21664 qa/standalone/scrub: When possible show side-by-side diff in addition to regular diff Reviewed-by: Kefu Chai <kchai@redhat.com>	2018-06-28 19:59:21 -04:00
David Zafman	3ff56a82a4	Merge pull request #22763 from dzafman/wip-remove-sudo qa: Don't use sudo when moving logs Reviewed-by: Neha Ojha <nojha@redhat.com>	2018-06-28 18:37:24 -04:00
David Zafman	23ed63e15f	Merge pull request #22441 from ErwanAliasr1/evelu-makecheck Improving make check reliability Reviewed-by: Kefu Chai <kchai@redhat.com> Reviewed-by: David Zafman <dzafman@redhat.com>	2018-06-28 14:55:12 -04:00
David Zafman	808c628304	qa: Don't use sudo when moving logs Caused by: `f0964beac5` Signed-off-by: David Zafman <dzafman@redhat.com>	2018-06-28 09:17:06 -07:00
David Zafman	ebb05b2542	test: When possible show side-by-side diff in addition to regular diff Fixes: https://tracker.ceph.com/issues/21664 Signed-off-by: David Zafman <dzafman@redhat.com>	2018-06-26 18:23:07 -07:00
David Zafman	f0964beac5	qa: For teuthology copy logs to teuthology expected location Signed-off-by: David Zafman <dzafman@redhat.com>	2018-06-25 18:06:01 -07:00
Erwan Velu	57df91380b	qa/standalone/ceph-helpers.sh: Setup ulimit in setup() If ulimit is set to 1024, ceph-osd will segfault with the following error: filestore(td/smoke/0) error (24) Too many open files not handled on operation 0x55565d1fd004 (2182.1.0, or op 0, counting from 0) This patch ensures that a valid ulimit value is set up before starting ceph daemons in tests. Signed-off-by: Erwan Velu <erwan@redhat.com>	2018-06-25 22:09:14 +02:00
Erwan Velu	7b0d1c8b8a	qa/standalone/ceph-helpers.sh: Thinner resolution in get_timeout_delays() get_timeout_delays() is a generic function to compute delays for a long period of time without saturating the CPU in busy loops. It works fine when the timeout is short: requesting a 20-second timeout yields the series "0.1 0.2 0.4 0.8 1.6 3.2 6.4 7.3 ". Here the maximum delay between two loops is 7.3, which is perfectly fine. When the timeout reaches 300 seconds, the same code produces the following series: "0.1 0.2 0.4 0.8 1.6 3.2 6.4 12.8 25.6 51.2 102.4 95.3 " In this example there are delays of nearly 2 minutes! That is inefficient, as the expected event could arrive just after such a long sleep begins, resulting in a sleep of a minute or more for nothing. On a local system that could be acceptable, but on a CI, if all jobs behave this way, the overall run is quite inefficient because it generates useless CPU waits. This patch adds a maximum acceptable delay between two loops while keeping the same ramp-up behavior. On the same 300-second example, with MAX_TIMEOUT set to 10, we now have the following series: "0.1 0.2 0.4 0.8 1.6 3.2 6.4 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 7.3" We can see that the long 12/25/51/102/95 values have vanished and been replaced by a series of 10-second delays. It's up to every test to define the probability of the awaited event completing soon. The MAX_TIMEOUT is set to 15 seconds. Signed-off-by: Erwan Velu <erwan@redhat.com>	2018-06-25 22:09:14 +02:00
Sage Weil	3cd7d5eb22	Merge PR #22343 into master * refs/pull/22343/head: qa/standalone remove ceph-disk from activate_osd helper cmake: remove subman.sh tests test remove ceph-disk directory debian: remove ceph_detect_init python files from base qa/standalone remove virtualenv paths for ceph-disk and ceph-detect-init debian: remove ceph-disk ceph-detect-init python files rpm: remove ceph-disk ceph-detect-init python files alpine: remove ceph-disk ceph-detect-init python files alpine: remove ceph-osd and parttypeuuid udev rules debian: remove ceph-osd and parttypeuuid udev rules rpm: remove ceph-osd and parttypeuuid udev rules ceph-helpers.sh: remove ceph-disk, set up osds directly CMakeLists.txt: add back CEPH_BUILD_VIRTUALENV alpine: remove ceph-disk, add ceph-volume in APKBUILD.in upstart: remove ceph-disk activation call doc/install add anchor for manual osd deployment in freebsd guide doc/dev remove ceph-disk from freebsd guide, link to manual reference doc/dev/config-key remove ceph-disk references doc/dev remove ceph-disk.rst doc/dev: change ceph-disk suite examples for ceph-deploy doc/man_index: remove ceph-disk, ceph-detect-init refs doc/install: remove ceph-disk from freebsd examples doc/rados remove ceph-disk from man references doc/man remove ceph-disk ref from ceph-volume-systemd doc/man: update reference from ceph-disk to ceph-volume doc/man: remove ceph-disk, ceph-detect-init from cmake doc/man/ceph-volume remove doc reference to ceph-disk doc/man: remove ceph-disk, ceph-detect-init qa/suites: remove ceph-disk qa/run-standalone.sh: remove requirement for ceph-detect-init virtualenv qa/workunits: remove ceph-detect-init from rbdmapfile test qa/workunits: remove ceph-detect-init from ceph-helpers-root.sh qa/workunits: remove ceph-disk build: remove ceph-disk from freebsd script cmake: remove ceph-disk, ceph-detect-init tox tests init-ceph: remove ceph-disk cmake: remove top-level entries for ceph-disk, ceph-detect-init debian: remove ceph-detect-init 
references debian: remove ceph-disk references src: remove ceph-detect-init tool rpm: remove ceph-disk, ceph-detect-init from spec file test: remove subman script script: remove subman script udev: remove parttypeuuid rules for ceph-disk tool remove ceph-disk from ps-ceph.pl upstart: remove ceph-disk conf file systemd: remove ceph-disk from CMakeLists systemd: remove ceph-disk service udev: remove ceph-disk rules src: remove ceph-disk tool	2018-06-19 07:07:55 -05:00
David Zafman	fe09fc5e9d	test: Fail immediately if some operations fail Signed-off-by: David Zafman <dzafman@redhat.com>	2018-06-18 14:09:14 -07:00
David Zafman	33538aca35	test: Fix standalone main usage Signed-off-by: David Zafman <dzafman@redhat.com>	2018-06-18 14:09:14 -07:00
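
The capped ramp-up described in commit 7b0d1c8b8a can be sketched in Python. This is a minimal illustration under stated assumptions, not the shell code in qa/standalone/ceph-helpers.sh: the function name timeout_delays, its parameters, and the use of Decimal for exact arithmetic are all hypothetical choices made for this example.

```python
from decimal import Decimal


def timeout_delays(timeout, first_step="0.1", max_delay=None):
    """Return a list of sleep delays whose sum equals `timeout`.

    Delays double on each iteration (0.1, 0.2, 0.4, ...). If `max_delay`
    is given, no ramp-up step is allowed to exceed it. Hypothetical helper,
    written only to illustrate the behavior the commit message describes.
    """
    timeout = Decimal(str(timeout))
    step = Decimal(str(first_step))
    cap = None if max_delay is None else Decimal(str(max_delay))

    delays = []
    total = Decimal(0)
    # Keep doubling while the next step still fits in the remaining budget.
    while total + step <= timeout:
        delays.append(step)
        total += step
        step *= 2
        # Clamp the next step to the configured maximum, if any.
        if cap is not None and step > cap:
            step = cap
    # Spend whatever is left so the delays add up to the full timeout.
    if total < timeout:
        delays.append(timeout - total)
    return delays


if __name__ == "__main__":
    print(" ".join(str(d) for d in timeout_delays(20)))
    print(" ".join(str(d) for d in timeout_delays(300)))
    print(" ".join(str(d) for d in timeout_delays(300, max_delay=10)))
```

Under these assumptions, the three calls reproduce the three series quoted in the commit message: the uncapped 300-second case ends with sleeps of 102.4 and 95.3 seconds, while the capped case never sleeps longer than 10 seconds between checks.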

528 Commits