RepoMirrors/ceph

mirror of https://github.com/ceph/ceph synced 2025-01-02 09:02:34 +00:00

Author	SHA1	Message	Date
Yuri Weinstein	e419a29be5	Merge pull request #42735 from amathuria/wip-amathuria-scrub-stats osd/scrub: Add stats to PG dump for number of objects scrubbed Reviewed-by: Ronen Friedman <rfriedma@redhat.com>	2022-01-14 10:46:28 -08:00
Gabriel BenHanokh	a39b1f3cf7	tools/ceph-bluestore-tool: Fix bluefs-bdev-expand command Update allocation file when we expand-device Add the expanded space to the allocator and then force an update to the allocation file There is also a new standalone test case for expand Fixes: https://tracker.ceph.com/issues/53699 Signed-off-by: Gabriel Benhanokh <gbenhano@redhat.com>	2022-01-12 18:07:59 +02:00
Aishwarya Mathuria	91885f1a87	qa/standalone: add test to check if objects_scrubbed is equal to number of objects in a PG once a scrub finishes Signed-off-by: Aishwarya Mathuria <amathuri@redhat.com>	2022-01-12 14:57:40 +05:30
Yuri Weinstein	58faf5712e	Merge pull request #43919 from ronen-fr/wip-rf-test-nodeep osd/scrub (& qa/standalone): test for scrub behavior when no-scrub is set but no-deep-scrub is not Reviewed-by: Neha Ojha <nojha@redhat.com> Reviewed-by: Matan Breizman <Matan.Brz@gmail.com>	2021-12-08 13:04:57 -08:00
Neha Ojha	89d5b2a79e	Merge pull request #43336 from ifed01/wip-fix-bluefs-volumes-ops qa/osd-bluefs-volume-ops: fix bluefs volumes ops test case Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>	2021-12-02 08:39:41 -08:00
Ronen Friedman	20dd022715	qa/standalone: osd-scrub-repair.sh: fix expected "not scrubbed since" warnings count Following PR#43244, the 'ceph tell pg deep_scrub' now sets both deep-scrub and "regular" scrub time-stamps. This necessitated a modification to TEST_scrub_warning, as more PGs in this test are late for their regular scrubbing. Signed-off-by: Ronen Friedman <rfriedma@redhat.com>	2021-11-23 16:43:21 +00:00
Ronen Friedman	7008b85fc5	qa/standalone: test for scrub behavior when noscrub is set but nodeepscrub is not A bug (https://tracker.ceph.com/issues/52901 - now fixed) resulted in this combination of conditions leaving the PG in "scrubbing" state forever. That bug was fixed by PR#43521. The patch here adds a test to detect the (now fixed) wrong behavior. Signed-off-by: Ronen Friedman <rfriedma@redhat.com>	2021-11-23 15:08:42 +00:00
Ronen Friedman	9dda986bd5	qa/standalone: fix scrub/osd-scrub-dump following changes to 'pg dump pgs' output Make osd-scrub-dump test ignore the 'scrubbing' that might be late to disappear from the modified (PR #43403) 'pg dump' output. Signed-off-by: Ronen Friedman <rfriedma@redhat.com>	2021-11-12 18:43:41 +00:00
Ronen Friedman	10909c3cba	osd/scrub: update the stand-alone tests to check 'scrub scheduling' entries Analyzing and verifying the relevant entries in 'pg query' and 'pg dump' output. Signed-off-by: Ronen Friedman <rfriedma@redhat.com>	2021-11-05 17:07:57 +02:00
Igor Fedotov	efb67445c2	qa/osd-bluefs-volume-ops: retry data writing if spillover hasn't happened. Fixes: https://tracker.ceph.com/issues/52676 Signed-off-by: Igor Fedotov <ifedotov@suse.com>	2021-11-02 17:26:39 +03:00
Ronen Friedman	52e9fa16ef	tests: modify osd-scrub-repair to match PR #43239 changes PR #43239 has modified ECBackend::get_hash_info() behavior. Modified the standalone scrub test to match. Signed-off-by: Ronen Friedman <rfriedma@redhat.com>	2021-10-20 06:42:51 +00:00
Zack Cerza	b57539dc94	Revert "qa: support isal ec test for aarch64" This commit has been causing scheduled jobs to request e.g. aarch64 smithi machines, which don't exist. The dispatcher then tries to find them forever, requiring the dispatcher to be killed and restarted. The queue will sit idle until someone notices the problem. Signed-off-by: Zack Cerza <zack@redhat.com>	2021-10-12 12:53:58 -06:00
Dai Zhiwei	eaa385f3da	qa: support isal ec test for aarch64 modified: qa/standalone/erasure-code/test-erasure-code-plugins.sh new file: qa/suites/rados/thrash-erasure-code-isa/arch/aarch64.yaml Signed-off-by: Dai Zhiwei <daizhiwei3@huawei.com>	2021-10-08 14:37:25 +08:00
Aishwarya Mathuria	1b4e416f81	osd/scrub: Add scrub duration to pg dump stats Addition of a new column, SCRUB_DURATION, to the pg stats that stores the time taken for a PG scrub. Fixes: https://tracker.ceph.com/issues/52605 Signed-off-by: Aishwarya Mathuria <amathuri@redhat.com>	2021-10-01 13:27:27 +05:30
Neha Ojha	e273418bbb	Merge pull request #42604 from sseshasa/wip-skip-osd-benchmark osd: Add config option to skip running the osd benchmark during init and update documentation. Reviewed-by: Josh Durgin <jdurgin@redhat.com> Reviewed-by: Neha Ojha <nojha@redhat.com>	2021-09-08 11:03:09 -07:00
Sridhar Seshasayee	f539bedc96	qa/standalone: Add standalone test to validate osd-mclock-skip-benchmark option Add a standalone test - test_activate_osd_skip_benchmark() in ceph-helpers.sh that exercises the osd-mclock-skip-benchmark option. Fixes: https://tracker.ceph.com/issues/52025 Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>	2021-09-01 14:19:03 +05:30
Igor Fedotov	0b0f8ef12f	qa/osd-bluefs-volume-ops: reproduce bluefs migrate bug Reproduces: https://tracker.ceph.com/issues/40434 Signed-off-by: Igor Fedotov <ifedotov@suse.com>	2021-08-31 16:23:22 +03:00
Sridhar Seshasayee	464e9ea6c0	qa/standalone/misc: ver-health.sh: Increase wait_for_health_string() timeout Modified test cases: 1. ver-health.sh: a. TEST_check_version_health_1(): To avoid intermittent timeouts observed in wait_for_health_string(), increase the wait time to 20 secs. Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>	2021-07-30 18:16:00 +05:30
Sridhar Seshasayee	33d2a2c93b	qa/standalone/scrub: Force a subset of scrub tests to use "wpq" scheduler The following tests in the test files mentioned below use the "osd_scrub_sleep" option to introduce delays during scrubbing to help determine scrubbing states, validate reservations during scrubbing etc.. This works when using the "wpq" scheduler. But when the "mclock_scheduler" is enabled, the "osd_scrub_sleep" is disabled and overridden to 0. This is done to delegate the scheduling of the background scrubs to the "mclock_scheduler" based on the set QoS parameters. Due to this, the checks to verify the scrub states, reservations etc. fail since the window to check them is very short due to scrubs completing very quickly. This affects a small subset of scrub tests mentioned below, 1. osd-scrub-dump.sh -> TEST_recover_unexpected() 2. osd-scrub-repair.sh -> TEST_auto_repair_bluestore_tag() 3. osd-scrub-test.sh -> TEST_scrub_abort(), TEST_deep_scrub_abort() Only for the above tests, until there's a reliable way to query scrub states with "--osd-scrub-sleep" set to 0, the "osd_op_queue" config option is set to "wpq". Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>	2021-07-30 18:16:00 +05:30
Sridhar Seshasayee	f658ff3511	qa/standalone/erasure-code: Modify erasure-code tests for mclock scheduler Modified test cases: 1. test-erasure-eio.sh: a. Test_ec_backfill_unfound(): - Set osd_mclock_profile to high_recovery_ops profile. - Increase the wait for backfill_unfound timeout to 240 secs. Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>	2021-07-30 18:16:00 +05:30
Sridhar Seshasayee	bdf36cf045	qa/standalone/osd-backfill: Modify backfill tests for mclock scheduler Modified test cases: 1. osd-backfill-prio.sh: Set osd_op_queue = wpq for all tests since the mclock doesn't consider recovery priority as part of its scheduling algorithm. 2. osd-backfill-space.sh: Set osd_mclock_profile to high_recovery_ops and increase the wait for backfills timeout to 1200 secs for the following tests: - TEST_backfill_test_simple() - TEST_backfill_test_multi() - TEST_backfill_test_sametarget() - TEST_backfill_multi_partial() - TEST_ec_backfill_simple() - TEST_ec_backfill_multi() - SKIP_TEST_ec_backfill_multi_partial() 3. osd-backfill-stats: - TEST_backfill_ec_down_all_out(): Set osd_mclock_profile to high_recovery_ops and increase the wait for recovery timeout to 240 secs. Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>	2021-07-30 18:16:00 +05:30
Sridhar Seshasayee	2c577040cb	qa/standalone/osd: Modify osd tests for mclock scheduler Modified test cases: 1. osd-recovery-prio.sh: Set osd_op_queue = wpq for all tests since mclock doesn't consider recovery priority as part of its scheduling algorithm. 2. osd-recovery-stats.sh: a. TEST_recovery_undersized(): - Set osd_mclock_profile to high_recovery_ops profile. - Increase wait for recovery timeout to 300 secs. 3. osd-rep-recov-eio.sh: a. TEST_rep_backfill_unfound(): - Set osd_mclock_profile to high_recovery_ops profile. - Increase wait for backfill_unfound to 360 secs. 4. repeer-on-acting-back.sh: a. TEST_repeer_on_down_act(): - Set osd_mclock_profile to high_recovery_ops profile. (To improve the test duration) Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>	2021-07-30 18:16:00 +05:30
Sridhar Seshasayee	5a85a6a035	qa/standalone: Modify ceph-helpers.sh tests for mclock scheduler. List of changes: 1. Remove the enforcement to use osd_op_queue=wpq when an osd is brought up in the following functions: - run_osd() - run_osd_filestore() and - activate_osd() 2. New functions: - get_op_scheduler() - Get the current osd_op_queue for an osd. 3. Modified test cases: - test_run_osd() - Add check for osd_max_backfill count. The mclock scheduler overrides the count to 1000. 4. New test cases: - test_activate_osd_after_mark_down() - test_get_op_scheduler() Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>	2021-07-30 18:16:00 +05:30
Neha Ojha	2c528248df	Merge pull request #42410 from ronen-fr/wip-ronenf-standalone-repair qa/standalone: fixing the timings when waiting for deep-scrub to start Reviewed-by: Neha Ojha <nojha@redhat.com> Reviewed-by: Sridhar Seshasayee <sseshasa@redhat.com>	2021-07-21 06:57:41 -07:00
Ronen Friedman	ed45acee34	qa/standalone: fixing the timings when waiting for deep-scrub to start initiate_and_fetch_state() initiates a scrub, then polls the published PG state looking for 'scrubbing'. Calling flush_pg_stats() as part of the polling process might cause the scrub and the following recovery to be missed altogether. Note: this polling mechanism is definitely not robust. Will be redesigned in the future. Fixes: https://tracker.ceph.com/issues/51581 Signed-off-by: Ronen Friedman <rfriedma@redhat.com>	2021-07-20 08:57:37 +03:00
Sage Weil	01c006c2de	Merge PR #42041 into master * refs/pull/42041/head: mgr/restful: ignore min/max_size test/crush: drop min/max_size refs qa/workunits/mon/pool_ops: remove test for min/max_size check qa: scrub a few remaining mentions of ruleset qa/standalone/mon/osd-*: fix tests PendingReleaseNotes: note min/max_size removal mgr/dashboard: remove max/min_size and ruleset mon/OSDMonitor: fix calls to CrushTester crush: eliminate min_size and max_size test/cli/crushtool: renumber rulesets in test maps crushtool: require min/max or num-rep for --test crush: remove last traces of 'ruleset' test/cli/crushtool: use 'id' instead of 'ruleset' in crush inputs crushtool: take --min-rep and --max-rep explicitly crush/CrushTester: drop --ruleset doc: scrub 'ruleset' from docs src/erasure-code: rule, not ruleset mon/OSDMonitor: remove check_crush_rule() callers mon/OSDMonitor: rule, not ruleset crushtool: remove check for overlapped rules crush/CrushWrapper: get_osd_pool_default_crush_replicated_ruleset -> rule crush: remove find_rule() mon/OSDMonitor: use pool's crush rule directly osd/OSDMap: drop checks for ruleset == ruleid osd/OSDMap: use pool's crush rule_id directly mon/PGMap: use pool's crush_rule directly mon/OSDMonitor: remove crush ruleset->rule rewrite Reviewed-by: Ernesto Puerta <epuertat@redhat.com> Reviewed-by: Avan Thakkar <athakkar@redhat.com>	2021-07-14 14:38:59 -04:00
Sridhar Seshasayee	84cab65e3a	qa/standalone: Add missing teardowns to a subset of osd-scrub-repair tests Tests identified with missing teardown within osd-scrub-repair.sh: 1. TEST_periodic_scrub_replicated() 2. TEST_scrub_warning() 3. TEST_request_scrub_priority() Centralize setup and teardown within the run() function for all the tests. Fixes: https://tracker.ceph.com/issues/51580 Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>	2021-07-08 13:31:42 +05:30
Sridhar Seshasayee	a96c34f0ee	qa/standalone: Add missing teardowns to a subset of osd tests The following files and tests in them did not teardown the cluster after a test completed. 1. osd/osd-force-create.sh 2. osd/osd-reuse-id.sh 3. osd/pg-split-merge.sh This wouldn't cause issues if the tests are run individually. But when running all the tests in the files mentioned above, it could introduce unexpected test failures down the line. For e.g., multiple tests may create pools with same name and if they are not cleaned up properly, this could result in unexpected failures in a subsequent test. Fixes: https://tracker.ceph.com/issues/51580 Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>	2021-07-08 13:28:31 +05:30
Sage Weil	4fc7c3093c	qa/standalone/mon/osd-*: fix tests Signed-off-by: Sage Weil <sage@newdream.net>	2021-07-07 10:31:57 -04:00
Patrick Donnelly	d6c66f3fa6	qa,pybind/mgr: allow disabling .mgr pool This is mostly for testing: a lot of tests assume that there are no existing pools. These tests relied on a config to turn off creating the "device_health_metrics" pool which generally exists for any new Ceph cluster. It would be better to make these tests tolerant of the new .mgr pool but clearly there's a lot of these. So just convert the config to make it work. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2021-06-11 19:35:17 -07:00
Sridhar Seshasayee	94826eaadc	qa/standalone: Use osd op queue = wpq in activate_osd() This change is a follow-up to commit b6e9c0903d5ad9a699b675f9fa7739e9cce9a5f3 that set the scheduler to wpq in run_osd() and run_osd_filestore(). In addition, activate_osd() too has to set the scheduler type to 'wpq' in order to be consistent and avoid test failures. The above is a temporary measure until all the standalone tests are modified to run well with the mclock_scheduler. Fixes: https://tracker.ceph.com/issues/51074 Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>	2021-06-09 15:02:58 +05:30
Ronen Friedman	d6eb3e3a3c	test: recovery_scrub: do not display 'repair' status on auto-repair deep-scrub A new test: auto_repair_bluestore_tag. Based on auto_repair_bluestore_basic. Sets auto-repair, starts a periodic deep-scrub, then verifies that the PG state, while scrubbing, is 'scrubbing+deep' and not 'scrubbing+deep+repair'. Signed-off-by: Ronen Friedman <rfriedma@redhat.com>	2021-05-18 17:43:28 +03:00
Neha Ojha	b6e9c0903d	qa/standalone: use osd op queue = wpq mclock_scheduler is now the default and some of these tests need to be modified to run well with it. Continue using wpq until https://tracker.ceph.com/issues/50574 is addressed. Signed-off-by: Neha Ojha <nojha@redhat.com>	2021-05-06 17:54:38 +00:00
Loïc Dachary	7fe0ac7c11	qa: verify the benefits of mempool cacheline optimization There already is a test to verify the mempool sharding works, in the sense that it uses at least half of the variables available to count the number of allocated objects and their total size. This new test verifies that, with sharding, object counting is at least twice as fast as without sharding. It also collects cacheline contention data with the perf c2c tool. The manual analysis of this data shows the optimization gain is indeed related to cacheline contention. Fixes: https://tracker.ceph.com/issues/49896 Signed-off-by: Loïc Dachary <loic@dachary.org>	2021-04-30 12:11:13 +08:00
Ilya Dryomov	7eb9c5ddb2	Merge branch 'master' into wip-unauthorized-gids Sync up with master up to commit `3d8e73b266` ("Merge pull request #40731 from tchaikov/wip-yamlize-options"). Specifically, bring in src/common/options.cc yamlization and move new auth-related options into src/common/options/global.yaml.in. Conflicts: src/common/options.cc src/common/options/global.yaml.in Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2021-04-13 15:42:06 +02:00
Sage Weil	dcd90a1c8d	Merge PR #40626 into master * refs/pull/40626/head: qa/suites/rados/objectstore: separate store_test tests qa/standalone: split osd/ into 2 directories Reviewed-by: Josh Durgin <jdurgin@redhat.com>	2021-04-12 22:38:49 -04:00
Sage Weil	0f65e5cffa	qa/standalone: split osd/ into 2 directories The whole osd/ directory takes 3 hours to run. Of that, about half is osd-backfill*: 2021-04-05T20:38:55.932 INFO:tasks.workunit:Running workunit osd/osd-backfill-prio.sh... 2021-04-05T20:47:27.184 INFO:tasks.workunit:Running workunit osd/osd-backfill-recovery-log.sh... 2021-04-05T20:55:59.497 INFO:tasks.workunit:Running workunit osd/osd-backfill-space.sh... 2021-04-05T21:48:47.549 INFO:tasks.workunit:Running workunit osd/osd-backfill-stats.sh... 2021-04-05T22:17:09.197 INFO:tasks.workunit:Running workunit osd/osd-bench.sh... Signed-off-by: Sage Weil <sage@newdream.net>	2021-04-12 09:59:17 -05:00
Ronen Friedman	b8045f7b18	Revert "test: Add test for scrub parallelism" This reverts commit `dd63577ab3`. As `08c3ede084` (the tested functionality) is reverted. Signed-off-by: Ronen Friedman <rfriedma@redhat.com>	2021-04-07 08:37:03 +03:00
Sage Weil	72c4fc75ad	qa/standalone: default to disable insecure global id reclaim Signed-off-by: Sage Weil <sage@newdream.net>	2021-04-06 17:29:23 -04:00
Prashant D	92eb39ee6f	crush/CrushCompiler: print weight with uniform precision Fixes: https://tracker.ceph.com/issues/48508 Signed-off-by: Prashant D <pdhange@redhat.com>	2021-03-29 14:44:49 +11:00
David Zafman	eec821b6e5	test: osd-recovery-scrub.sh: Test fails if no scrubs happened for a recovering pg Change TEST_recovery_scrub_2 to create more objects and use osd_recovery_sleep to prevent recovery from finishing before we start to scrub. Verify that at least 1 scrub was started while the pg was recovering. Fixes: https://tracker.ceph.com/issues/49779 Signed-off-by: David Zafman <dzafman@redhat.com>	2021-03-14 16:19:46 -07:00
David Zafman	a4fd1d650e	Revert "qa/standalone/scrub/osd-recovery-scrub: fix unnoticed recovery state" This reverts commit `1323bdb839`. The test needs to scrub while recovery is in progress, so catching recovery from the logs after the fact isn't the proper setup. We can use osd_recovery_sleep config. Signed-off-by: David Zafman <dzafman@redhat.com>	2021-03-13 11:40:55 -08:00
David Zafman	dd63577ab3	test: Add test for scrub parallelism Signed-off-by: David Zafman <dzafman@redhat.com>	2021-03-05 11:41:26 -08:00
Sage Weil	5e197a21e6	Merge PR #39455 into master * refs/pull/39455/head: doc/man/8/ceph: document --max option src/test/osd/safe-to-destroy: adjust test ceph: print command output to stdout even on error mgr/DaemonServer: include details in 'osd ok-to-stop' output mgr: add --max <n> to 'osd ok-to-stop' command mgr: relax osd ok-to-stop condition on degraded pgs Reviewed-by: Neha Ojha <nojha@redhat.com> Reviewed-by: Josh Durgin <jdurgin@redhat.com> Reviewed-by: Kefu Chai <kchai@redhat.com>	2021-02-27 10:15:27 -05:00
Sage Weil	33dee7d7bf	crush/CrushWrapper: update shadow trees on update_item() insert_item() already does this, but update_item did not. Fixes: https://tracker.ceph.com/issues/48065 Signed-off-by: Sage Weil <sage@newdream.net>	2021-02-22 14:21:04 -06:00
Sage Weil	722f57dee1	mgr: add --max <n> to 'osd ok-to-stop' command Given an initial (set of) osd(s), provide up to N OSDs that can be stopped together without making PGs become unavailable. This can be used to quickly identify large(r) batches of OSDs that can be stopped together to (for example) upgrade. Signed-off-by: Sage Weil <sage@newdream.net>	2021-02-20 09:53:51 -05:00
Kefu Chai	8dc097ff46	qa/standalone/mon/misc: verify that len(monmap.features.persistent) == 9 in `beb62c029a`, FEATURE_QUINCY was added to ceph::features::mon::get_persistent(), so update the test accordingly. Signed-off-by: Kefu Chai <kchai@redhat.com>	2021-01-30 22:45:20 +08:00
Sage Weil	7bbc92eda3	mon: updates for quincy Signed-off-by: Sage Weil <sage@newdream.net>	2021-01-28 13:29:28 -06:00
Neha Ojha	5c11f40c12	Merge pull request #38856 from dzafman/wip-48789 test: Fix osd-scrub-scaps.sh to handle DB format change Reviewed-by: Ronen Friedman <rfriedma@redhat.com> Reviewed-by: Neha Ojha <nojha@redhat.com>	2021-01-15 16:27:59 -08:00
Neha Ojha	6fc9166af4	Merge pull request #38726 from ronen-fr/wip-ronenf-48720 qa/standalone/scrub/osd-recovery-scrub: handle primary change when waiting for scrub Reviewed-by: David Zafman <dzafman@redhat.com>	2021-01-15 13:46:30 -08:00


599 Commits