RepoMirrors/ceph

mirror of https://github.com/ceph/ceph synced 2024-12-30 15:33:31 +00:00

Author	SHA1	Message	Date
Neha Ojha	2c528248df	Merge pull request #42410 from ronen-fr/wip-ronenf-standalone-repair qa/standalone: fixing the timings when waiting for deep-scrub to start Reviewed-by: Neha Ojha <nojha@redhat.com> Reviewed-by: Sridhar Seshasayee <sseshasa@redhat.com>	2021-07-21 06:57:41 -07:00
Ronen Friedman	ed45acee34	qa/standalone: fixing the timings when waiting for deep-scrub to start initiate_and_fetch_state() initiates a scrub, then polls the published PG state looking for 'scrubbing'. Calling flush_pg_stats() as part of the polling process might cause the scrub and the following recovery to be missed altogether. Note: this polling mechanism is definitely not robust. Will be redesigned in the future. Fixes: https://tracker.ceph.com/issues/51581 Signed-off-by: Ronen Friedman <rfriedma@redhat.com>	2021-07-20 08:57:37 +03:00
Sage Weil	01c006c2de	Merge PR #42041 into master * refs/pull/42041/head: mgr/restful: ignore min/max_size test/crush: drop min/max_size refs qa/workunits/mon/pool_ops: remove test for min/max_size check qa: scrub a few remaining mentions of ruleset qa/standalone/mon/osd-*: fix tests PendingReleaseNotes: note min/max_size removal mgr/dashboard: remove max/min_size and ruleset mon/OSDMonitor: fix calls to CrushTester crush: eliminate min_size and max_size test/cli/crushtool: renumber rulesets in test maps crushtool: require min/max or num-rep for --test crush: remove last traces of 'ruleset' test/cli/crushtool: use 'id' instead of 'ruleset' in crush inputs crushtool: take --min-rep and --max-rep explicitly crush/CrushTester: drop --ruleset doc: scrub 'ruleset' from docs src/erasure-code: rule, not ruleset mon/OSDMonitor: remove check_crush_rule() callers mon/OSDMonitor: rule, not ruleset crushtool: remove check for overlapped rules crush/CrushWrapper: get_osd_pool_default_crush_replicated_ruleset -> rule crush: remove find_rule() mon/OSDMonitor: use pool's crush rule directly osd/OSDMap: drop checks for ruleset == ruleid osd/OSDMap: use pool's crush rule_id directly mon/PGMap: use pool's crush_rule directly mon/OSDMonitor: remove crush ruleset->rule rewrite Reviewed-by: Ernesto Puerta <epuertat@redhat.com> Reviewed-by: Avan Thakkar <athakkar@redhat.com>	2021-07-14 14:38:59 -04:00
Sridhar Seshasayee	84cab65e3a	qa/standalone: Add missing teardowns to a subset of osd-scrub-repair tests Tests identified with missing teardown within osd-scrub-repair.sh: 1. TEST_periodic_scrub_replicated() 2. TEST_scrub_warning() 3. TEST_request_scrub_priority() Centralize setup and teardown within the run() function for all the tests. Fixes: https://tracker.ceph.com/issues/51580 Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>	2021-07-08 13:31:42 +05:30
Sridhar Seshasayee	a96c34f0ee	qa/standalone: Add missing teardowns to a subset of osd tests The following files and tests in them did not teardown the cluster after a test completed. 1. osd/osd-force-create.sh 2. osd/osd-reuse-id.sh 3. osd/pg-split-merge.sh This wouldn't cause issues if the tests are run individually. But when running all the tests in the files mentioned above, it could introduce unexpected test failures down the line. For e.g., multiple tests may create pools with same name and if they are not cleaned up properly, this could result in unexpected failures in a subsequent test. Fixes: https://tracker.ceph.com/issues/51580 Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>	2021-07-08 13:28:31 +05:30
Sage Weil	4fc7c3093c	qa/standalone/mon/osd-*: fix tests Signed-off-by: Sage Weil <sage@newdream.net>	2021-07-07 10:31:57 -04:00
Patrick Donnelly	d6c66f3fa6	qa,pybind/mgr: allow disabling .mgr pool This is mostly for testing: a lot of tests assume that there are no existing pools. These tests relied on a config to turn off creating the "device_health_metrics" pool which generally exists for any new Ceph cluster. It would be better to make these tests tolerant of the new .mgr pool but clearly there's a lot of these. So just convert the config to make it work. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2021-06-11 19:35:17 -07:00
Sridhar Seshasayee	94826eaadc	qa/standalone: Use osd op queue = wpq in activate_osd() This change is a follow-up to commit b6e9c0903d5ad9a699b675f9fa7739e9cce9a5f3 that set the scheduler to wpq in run_osd() and run_osd_filestore(). In addition, activate_osd() too has to set the scheduler type to 'wpq' in order to be consistent and avoid test failures. The above is a temporary measure until all the standalone tests are modified to run well with the mclock_scheduler. Fixes: https://tracker.ceph.com/issues/51074 Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>	2021-06-09 15:02:58 +05:30
Ronen Friedman	d6eb3e3a3c	test: recovery_scrub: do not display 'repair' status on auto-repair deep-scrub A new test: auto_repair_bluestore_tag. Based on auto_repair_bluestore_basic. Sets auto-repair, starts a periodic deep-scrub, then verifies that the PG state, while scrubbing, is 'scrubbing+deep' and not 'scrubbing+deep+repair'. Signed-off-by: Ronen Friedman <rfriedma@redhat.com>	2021-05-18 17:43:28 +03:00
Neha Ojha	b6e9c0903d	qa/standalone: use osd op queue = wpq mclock_scheduler is now the default and some of these tests need to be modified to run well with it. Continue using wpq until https://tracker.ceph.com/issues/50574 is addressed. Signed-off-by: Neha Ojha <nojha@redhat.com>	2021-05-06 17:54:38 +00:00
Loïc Dachary	7fe0ac7c11	qa: verify the benefits of mempool cacheline optimization There already is a test to verify the mempool sharding works, in the sense that it uses at least half of the variables available to count the number of allocated objects and their total size. This new test verifies that, with sharding, object counting is at least twice faster than without sharding. It also collects cacheline contention data with the perf c2c tool. The manual analysis of this data shows the optimization gain is indeed related to cacheline contention. Fixes: https://tracker.ceph.com/issues/49896 Signed-off-by: Loïc Dachary <loic@dachary.org>	2021-04-30 12:11:13 +08:00
Ilya Dryomov	7eb9c5ddb2	Merge branch 'master' into wip-unauthorized-gids Sync up with master up to commit `3d8e73b266` ("Merge pull request #40731 from tchaikov/wip-yamlize-options"). Specifically, bring in src/common/options.cc yamlization and move new auth-related options into src/common/options/global.yaml.in. Conflicts: src/common/options.cc src/common/options/global.yaml.in Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2021-04-13 15:42:06 +02:00
Sage Weil	dcd90a1c8d	Merge PR #40626 into master * refs/pull/40626/head: qa/suites/rados/objectstore: separate store_test tests qa/standalone: split osd/ into 2 directories Reviewed-by: Josh Durgin <jdurgin@redhat.com>	2021-04-12 22:38:49 -04:00
Sage Weil	0f65e5cffa	qa/standalone: split osd/ into 2 directories The whole osd/ directory takes 3 hours to run. Of that, about half is osd-backfill*: 2021-04-05T20:38:55.932 INFO:tasks.workunit:Running workunit osd/osd-backfill-prio.sh... 2021-04-05T20:47:27.184 INFO:tasks.workunit:Running workunit osd/osd-backfill-recovery-log.sh... 2021-04-05T20:55:59.497 INFO:tasks.workunit:Running workunit osd/osd-backfill-space.sh... 2021-04-05T21:48:47.549 INFO:tasks.workunit:Running workunit osd/osd-backfill-stats.sh... 2021-04-05T22:17:09.197 INFO:tasks.workunit:Running workunit osd/osd-bench.sh... Signed-off-by: Sage Weil <sage@newdream.net>	2021-04-12 09:59:17 -05:00
Ronen Friedman	b8045f7b18	Revert "test: Add test for scrub parallelism" This reverts commit `dd63577ab3`. As `08c3ede084` (the tested functionality) is reverted. Signed-off-by: Ronen Friedman <rfriedma@redhat.com>	2021-04-07 08:37:03 +03:00
Sage Weil	72c4fc75ad	qa/standalone: default to disable insecure global id reclaim Signed-off-by: Sage Weil <sage@newdream.net>	2021-04-06 17:29:23 -04:00
Prashant D	92eb39ee6f	crush/CrushCompiler: print weight with uniform precision Fixes: https://tracker.ceph.com/issues/48508 Signed-off-by: Prashant D <pdhange@redhat.com>	2021-03-29 14:44:49 +11:00
David Zafman	eec821b6e5	test: osd-recovery-scrub.sh: Test fails if no scrubs happened for a recovering pg Change TEST_recovery_scrub_2 to create more objects and use osd_recovery_sleep to prevent recovery from finishing before we start to scrub. Verify that at least 1 scrub was started while the pg was recovering. Fixes: https://tracker.ceph.com/issues/49779 Signed-off-by: David Zafman <dzafman@redhat.com>	2021-03-14 16:19:46 -07:00
David Zafman	a4fd1d650e	Revert "qa/standalone/scrub/osd-recovery-scrub: fix unnoticed recovery state" This reverts commit `1323bdb839`. The tests needs to scrub while recovery is in progress, so catching recovery from the logs after the fact isn't the proper setup. We can use osd_recovery_sleep config. Signed-off-by: David Zafman <dzafman@redhat.com>	2021-03-13 11:40:55 -08:00
David Zafman	dd63577ab3	test: Add test for scrub parallelism Signed-off-by: David Zafman <dzafman@redhat.com>	2021-03-05 11:41:26 -08:00
Sage Weil	5e197a21e6	Merge PR #39455 into master * refs/pull/39455/head: doc/man/8/ceph: document --max option src/test/osd/safe-to-destroy: adjust test ceph: print command output to stdout even on error mgr/DaemonServer: include details in 'osd ok-to-stop' output mgr: add --max <n> to 'osd ok-to-stop' command mgr: relax osd ok-to-stop condition on degraded pgs Reviewed-by: Neha Ojha <nojha@redhat.com> Reviewed-by: Josh Durgin <jdurgin@redhat.com> Reviewed-by: Kefu Chai <kchai@redhat.com>	2021-02-27 10:15:27 -05:00
Sage Weil	33dee7d7bf	crush/CrushWrapper: update shadow trees on update_item() insert_item() already does this, but update_item did not. Fixes: https://tracker.ceph.com/issues/48065 Signed-off-by: Sage Weil <sage@newdream.net>	2021-02-22 14:21:04 -06:00
Sage Weil	722f57dee1	mgr: add --max <n> to 'osd ok-to-stop' command Given an initial (set of) osd(s), provide up to N OSDs that can be stopped together without making PGs become unavailable. This can be used to quickly identify large(r) batches of OSDs that can be stopped together to (for example) upgrade. Signed-off-by: Sage Weil <sage@newdream.net>	2021-02-20 09:53:51 -05:00
Kefu Chai	8dc097ff46	qa/standalone/mon/misc: verify that len(monmap.features.persistent) == 9 in `beb62c029a`, FEATURE_QUINCY was added to ceph::features::mon::get_persistent(), so update the test accordingly. Signed-off-by: Kefu Chai <kchai@redhat.com>	2021-01-30 22:45:20 +08:00
Sage Weil	7bbc92eda3	mon: updates for quincy Signed-off-by: Sage Weil <sage@newdream.net>	2021-01-28 13:29:28 -06:00
Neha Ojha	5c11f40c12	Merge pull request #38856 from dzafman/wip-48789 test: Fix osd-scrub-snaps.sh to handle DB format change Reviewed-by: Ronen Friedman <rfriedma@redhat.com> Reviewed-by: Neha Ojha <nojha@redhat.com>	2021-01-15 16:27:59 -08:00
Neha Ojha	6fc9166af4	Merge pull request #38726 from ronen-fr/wip-ronenf-48720 qa/standalone/scrub/osd-recovery-scrub: handle primary change when waiting for scrub Reviewed-by: David Zafman <dzafman@redhat.com>	2021-01-15 13:46:30 -08:00
David Zafman	af9befb0f4	test: Fix osd-scrub-snaps.sh to handle DB format change Caused by: `f9c95fa7fc` Fixes: https://tracker.ceph.com/issues/48789 Signed-off-by: David Zafman <dzafman@redhat.com>	2021-01-15 10:35:30 -08:00
David Zafman	4814648155	test: osd-recovery-prio.sh replace sleep with wait for both PGs recovering fixes: https://tracker.ceph.com/issues/48842 Signed-off-by: David Zafman <dzafman@redhat.com>	2021-01-11 17:30:00 -08:00
Ronen Friedman	1323bdb839	qa/standalone/scrub/osd-recovery-scrub: fix unnoticed recovery state The 'recovering' state is transitory. Existing code looks for it by polling 'pg stat', missing from time to time. New version searches the tails of the relevant OSDs' logs. Fixes: https://tracker.ceph.com/issues/48719 Signed-off-by: Ronen Friedman <rfriedma@redhat.com>	2021-01-04 13:29:41 +02:00
Ronen Friedman	bb848cfd90	qa/standalone/ceph-helpers.sh: log meaningful PIDs for run_in_background() While the relevant comment says: '# Execute the command and prepend the output with its pid' the actual PID logged is the same for all background processes, which isn't very helpful. Signed-off-by: Ronen Friedman <rfriedma@redhat.com>	2020-12-28 10:47:02 +02:00
Ronen Friedman	445db7f171	qa/standalone/scrub/osd-recovery-scrub: handle a Primary change Stop waiting for a scrub to happen if the Primary for the target PG changes. Fixes: https://tracker.ceph.com/issues/48720 Signed-off-by: Ronen Friedman <rfriedma@redhat.com>	2020-12-28 10:42:41 +02:00
Ronen Friedman	dff7faaf3c	qa/standalone/scrub/osd-scrub-snaps.sh: fix Python print syntax Fixes: https://tracker.ceph.com/issues/48690 Signed-off-by: Ronen Friedman <rfriedma@redhat.com>	2020-12-21 16:52:27 +02:00
Kefu Chai	694ed23e9d	qa/standalone/misc/ver-health.sh: include the bootup-time in my test bed, it takes 11 seconds to boot the 3 OSDs and to restart one of them, this fails the test. so we need to take the time into consideration. in this change, the delay is added to the total "warn_older_version_delay", so the monitor does not start sending warning earlier than expected. Signed-off-by: Kefu Chai <kchai@redhat.com>	2020-12-11 16:14:03 +08:00
Kefu Chai	4bcfa139ab	mon/HealthMonitor: use timespan for mon_warn_older_version_delay for better user experience Signed-off-by: Kefu Chai <kchai@redhat.com>	2020-12-11 16:12:47 +08:00
Kefu Chai	1f5406a752	src/*: do not pass cct to ceph_version_to_str() in `e5b1ae5554`, a new option named "debug_version_for_testing" is introduced to override the version so we can test version check. in crimson, we have two families of shared functions. - one of them is used by alien store. they are compiled with -DWITH_SEASTAR and -DWITH_ALIEN, to enable the shim code between seastar and POSIX thread. - another is used by crimson in general. where no lock is allowed. currently, we use the "crimson" and "ceph" namespace to differentiate these two families of functions, so they can colocate in the same executable without violating the ODR. see src/include/common_fwd.h for more details. the functions defined in src/common/version.cc are also shared by alien store and crimson code. and because we have different implementations of `CephContext` in crimson and in classic OSD (i.e. alienstore), we have to have different implementations of this function as well, if we follow the same approach. but since these functions are very simple and are non-blocking, there is not much value in differentiating them, it is better to inject the test settings using environment variable instead of using ceph option subsystem. in this change, "ceph_debug_version_for_testing" environment variable is checked instead, so that crimson and alienstore can share the same compilation unit of version.cc. and "debug_version_for_testing" option is removed. Signed-off-by: Kefu Chai <kchai@redhat.com>	2020-12-10 18:26:39 +08:00
Ronen Friedman	43b1129030	test: cancelling both noscrub and nodeep-scrub as part of osd-scrub-test.sh. Signed-off-by: Ronen Friedman <rfriedma@redhat.com>	2020-12-09 20:16:23 +02:00
haoyixing	0e7e036aa7	doc/dev: use http://docs.ceph.com/en/latest/ instead of /docs/master/ for docs Several links under http://docs.ceph.com/docs/master/ were inaccessible. Change them to http://docs.ceph.com/en/latest so we can access them directly. Signed-off-by: haoyixing <haoyixing@kuaishou.com>	2020-11-24 12:49:47 +08:00
David Zafman	89af82bf4f	Merge pull request #38054 from dzafman/wip-test-fixes test: Fix osd-scrub-test.sh and ver-health.sh tests Reviewed-by: Neha Ojha <nojha@redhat.com>	2020-11-18 08:52:28 -08:00
David Zafman	38c3130654	test: Fix TEST_scrub_extended_sleep test (corrected test name) Didn't really test extended sleep in original code. Caused by: `3bfb5c2621` Signed-off-by: David Zafman <dzafman@redhat.com>	2020-11-16 18:30:14 -08:00
David Zafman	0a0ed890c2	test: Improve version checking test, to improve reliability Signed-off-by: David Zafman <dzafman@redhat.com>	2020-11-16 18:30:14 -08:00
Kefu Chai	0463a774c9	Merge pull request #37908 from dzafman/wip-47930 test: Fix race in TEST_recovery_scrub test Reviewed-by: Neha Ojha <nojha@redhat.com>	2020-11-16 01:00:56 +08:00
David Zafman	870bde04a5	test: Changes based on code review comments Signed-off-by: David Zafman <dzafman@redhat.com>	2020-11-11 15:31:26 -08:00
David Zafman	93373746f5	osd test: Delay reporting until mon_warn_older_version_delay has passed Move release notes description to 16.0.0 and update Update documentation Signed-off-by: David Zafman <dzafman@redhat.com>	2020-11-11 15:10:11 -08:00
David Zafman	9d988c3dbc	test: Simple test case for version health warning Signed-off-by: David Zafman <dzafman@redhat.com>	2020-11-11 15:10:11 -08:00
David Zafman	410e230d09	test: Fix race in TEST_recovery_scrub test Fixes: https://tracker.ceph.com/issues/47930 Signed-off-by: David Zafman <dzafman@redhat.com>	2020-11-10 00:45:13 +00:00
David Zafman	d3cc647583	osd: Eliminate day of week 7 and hour 24 Add test case for permitted hours to make sure scrub doesn't start Remove permitted hours in extended sleep test Fixes: https://tracker.ceph.com/issues/48077 Signed-off-by: David Zafman <dzafman@redhat.com>	2020-11-09 22:47:00 +00:00
David Zafman	ef47a3e708	test: set mon_allow_pool_size_one for consistency with original test intention Signed-off-by: David Zafman <dzafman@redhat.com>	2020-11-03 21:49:00 +00:00
Neha Ojha	343107766e	Merge pull request #37483 from dzafman/wip-46405 osd/osd-rep-recov-eio.sh: TEST_rados_repair_warning: return 1 Reviewed-by: Brad Hubbard <bhubbard@redhat.com> Reviewed-by: Neha Ojha <nojha@redhat.com>	2020-10-08 11:44:00 -07:00
David Zafman	3ba7ebd3e2	test: Avoid races by waiting for PGs go clean before query Fixes: https://tracker.ceph.com/issues/46405 Signed-off-by: David Zafman <dzafman@redhat.com>	2020-10-01 19:43:57 +00:00

576 Commits