Modified test cases:
1. osd-recovery-prio.sh:
Set osd_op_queue = wpq for all tests since mclock
doesn't consider recovery priority as part of its
scheduling algorithm.
2. osd-recovery-stats.sh:
a. TEST_recovery_undersized():
- Set osd_mclock_profile to high_recovery_ops profile.
- Increase wait for recovery timeout to 300 secs.
3. osd-rep-recov-eio.sh:
a. TEST_rep_backfill_unfound():
- Set osd_mclock_profile to high_recovery_ops profile.
- Increase wait for backfill_unfound to 360 secs.
4. repeer-on-acting-back.sh:
a. TEST_repeer_on_down_act():
- Set osd_mclock_profile to high_recovery_ops profile.
(to reduce the test duration; the shared pattern behind these changes is sketched below)
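The shared pattern, as a minimal sketch (osd_mclock_profile and
high_recovery_ops are real config names; the timeout loop is illustrative):
```
# favor recovery in mclock's IOPS allocation at runtime
ceph config set osd osd_mclock_profile high_recovery_ops
# then wait up to 300 secs for recovery instead of the default timeout
for ((i = 0; i < 300; i++)); do
    ceph pg stat | grep -q 'active+clean' && break
    sleep 1
done
```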
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
List of changes:
1. Remove the enforcement to use osd_op_queue=wpq when an osd is brought
up in the following functions:
- run_osd()
- run_osd_filestore() and
- activate_osd()
2. New functions:
- get_op_scheduler() - Get the current osd_op_queue for an osd (a sketch follows this list).
3. Modified test cases:
- test_run_osd() - Add a check for the osd_max_backfills count.
The mclock scheduler overrides the count to 1000.
4. New test cases:
- test_activate_osd_after_mark_down()
- test_get_op_scheduler()
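A plausible shape for get_op_scheduler(), reusing the existing get_config()
helper from ceph-helpers.sh (the exact body is an assumption):
```
function get_op_scheduler() {
    local id=$1
    # ask the running osd.$id which op queue it is actually using
    get_config osd $id osd_op_queue
}

# usage in a test:
test "$(get_op_scheduler 0)" = "mclock_scheduler" || return 1
```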
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
qa/standalone: fixing the timings when waiting for deep-scrub to start
Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Sridhar Seshasayee <sseshasa@redhat.com>
initiate_and_fetch_state() initiates a scrub, then polls the published
PG state looking for 'scrubbing'. Calling flush_pg_stats() as part of
the polling process might cause the scrub and the following recovery to
be missed altogether.
Note: this polling mechanism is definitely not robust. Will be
redesigned in the future.
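Until then, a more conservative poll might look like the following sketch
(jq and the helper name are illustrative); the point is to sample the
published state without flushing stats mid-loop:
```
function wait_for_scrubbing() {
    local pgid=$1
    for ((i = 0; i < 1500; i++)); do
        # sample the published state; no flush_pg_stats inside the loop
        ceph pg $pgid query | jq -r '.state' | grep -q scrubbing && return 0
        sleep 0.2
    done
    return 1
}
```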
Fixes: https://tracker.ceph.com/issues/51581
Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
* refs/pull/42041/head:
mgr/restful: ignore min/max_size
test/crush: drop min/max_size refs
qa/workunits/mon/pool_ops: remove test for min/max_size check
qa: scrub a few remaining mentions of ruleset
qa/standalone/mon/osd-*: fix tests
PendingReleaseNotes: note min/max_size removal
mgr/dashboard: remove max/min_size and ruleset
mon/OSDMonitor: fix calls to CrushTester
crush: eliminate min_size and max_size
test/cli/crushtool: renumber rulesets in test maps
crushtool: require min/max or num-rep for --test
crush: remove last traces of 'ruleset'
test/cli/crushtool: use 'id' instead of 'ruleset' in crush inputs
crushtool: take --min-rep and --max-rep explicitly
crush/CrushTester: drop --ruleset
doc: scrub 'ruleset' from docs
src/erasure-code: rule, not ruleset
mon/OSDMonitor: remove check_crush_rule() callers
mon/OSDMonitor: rule, not ruleset
crushtool: remove check for overlapped rules
crush/CrushWrapper: get_osd_pool_default_crush_replicated_ruleset -> rule
crush: remove find_rule()
mon/OSDMonitor: use pool's crush rule directly
osd/OSDMap: drop checks for ruleset == ruleid
osd/OSDMap: use pool's crush rule_id directly
mon/PGMap: use pool's crush_rule directly
mon/OSDMonitor: remove crush ruleset->rule rewrite
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Tests identified with missing teardown within osd-scrub-repair.sh:
1. TEST_periodic_scrub_replicated()
2. TEST_scrub_warning()
3. TEST_request_scrub_priority()
Centralize setup and teardown within the run() function for all the tests.
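The centralized pattern follows the usual ceph-helpers.sh convention (sketch):
```
function run() {
    local dir=$1
    shift
    local funcs=${@:-$(set | sed -n -e 's/^\(TEST_[0-9a-z_]*\) .*/\1/p')}
    for func in $funcs ; do
        setup $dir || return 1      # fresh test dir and cluster for each test
        $func $dir || return 1
        teardown $dir || return 1   # kill daemons and remove $dir
    done
}
```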
Fixes: https://tracker.ceph.com/issues/51580
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
The tests in the following files did not tear down the
cluster after a test completed.
1. osd/osd-force-create.sh
2. osd/osd-reuse-id.sh
3. osd/pg-split-merge.sh
This wouldn't cause issues if the tests are run individually. But when
running all the tests in the files mentioned above, it could introduce
unexpected test failures down the line. For example, multiple tests may
create pools with the same name, and if they are not cleaned up properly, this
could result in unexpected failures in a subsequent test.
Fixes: https://tracker.ceph.com/issues/51580
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
This is mostly for testing: a lot of tests assume that there are no
existing pools. These tests relied on a config to turn off creating the
"device_health_metrics" pool which generally exists for any new Ceph
cluster. It would be better to make these tests tolerant of the new .mgr
pool, but clearly there are a lot of them. So just convert the config to
make it work.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
This change is a follow-up to commit
b6e9c0903d that set the scheduler to wpq in
run_osd() and run_osd_filestore(). In addition, activate_osd() too has to
set the scheduler type to 'wpq' in order to be consistent and avoid test
failures.
The above is a temporary measure until all the standalone tests are
modified to run well with the mclock_scheduler.
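The shape of the change is a one-line addition to activate_osd(), mirroring
run_osd() (sketch with surrounding lines abridged):
```
function activate_osd() {
    # ...existing argument and directory handling...
    local ceph_args="$CEPH_ARGS"
    # keep parity with run_osd()/run_osd_filestore() until the standalone
    # tests are adapted to mclock_scheduler
    ceph_args+=" --osd-op-queue=wpq"
    # ...start ceph-osd with $ceph_args...
}
```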
Fixes: https://tracker.ceph.com/issues/51074
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
A new test: auto_repair_bluestore_tag.
Based on auto_repair_bluestore_basic. Sets auto-repair, starts a periodic
deep-scrub, then verifies that the PG state, while scrubbing, is 'scrubbing+deep'
and not 'scrubbing+deep+repair'.
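The key assertion might look like this sketch (pgid lookup elided; jq assumed):
```
state=$(ceph pg $pgid query | jq -r '.state')
# a periodic deep scrub under auto-repair must still be tagged as a plain
# deep scrub rather than as a repair
echo "$state" | grep -q 'scrubbing+deep' || return 1
if echo "$state" | grep -q 'repair' ; then
    return 1
fi
```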
Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
mclock_scheduler is now the default and some of these tests need to be modified
to run well with it. Continue using wpq until
https://tracker.ceph.com/issues/50574 is addressed.
Signed-off-by: Neha Ojha <nojha@redhat.com>
There already is a test to verify the mempool sharding works, in the sense that
it uses at least half of the variables available to count the number of
allocated objects and their total size. This new test verifies that, with
sharding, object counting is at least twice as fast as without sharding. It
also collects cacheline contention data with the perf c2c tool. The manual
analysis of this data shows the optimization gain is indeed related to cacheline
contention.
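For reference, the contention data can be collected with perf(1); the binary
name assumes the usual ceph build layout:
```
# record cacheline-contention events while the mempool unit test runs
perf c2c record -- ./bin/unittest_mempool
# then inspect the HITM (modified-cacheline hit) statistics
perf c2c report --stdio
```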
Fixes: https://tracker.ceph.com/issues/49896
Signed-off-by: Loïc Dachary <loic@dachary.org>
Sync up with master up to commit 3d8e73b266 ("Merge pull request
#40731 from tchaikov/wip-yamlize-options"). Specifically, bring in
src/common/options.cc yamlization and move new auth-related options
into src/common/options/global.yaml.in.
Conflicts:
src/common/options.cc
src/common/options/global.yaml.in
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Change TEST_recovery_scrub_2 to create more objects and use
osd_recovery_sleep to prevent recovery from finishing before
we start to scrub. Verify that at least 1 scrub was started
while the PG was recovering.
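A sketch of the idea (values illustrative; osd_recovery_sleep injects a
delay between recovery operations):
```
# slow recovery down so the scrub request lands mid-recovery
ceph config set osd osd_recovery_sleep 1
for i in $(seq 1 500); do
    rados -p test put obj$i /etc/group
done
# ...restart an osd to trigger recovery, then scrub while it is ongoing:
ceph pg scrub 1.0
```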
Fixes: https://tracker.ceph.com/issues/49779
Signed-off-by: David Zafman <dzafman@redhat.com>
This reverts commit 1323bdb839.
The test needs to scrub while recovery is in progress, so catching
recovery from the logs after the fact isn't the proper setup.
We can use the osd_recovery_sleep config instead.
Signed-off-by: David Zafman <dzafman@redhat.com>
Given an initial (set of) OSD(s), provide up to N OSDs that can be
stopped together without making PGs unavailable.
This can be used to quickly identify large(r) batches of OSDs that can be
stopped together to (for example) upgrade.
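A hypothetical invocation (the --max flag name is an assumption):
```
# starting from osd.0, ask for up to 8 osds that can be stopped together
ceph osd ok-to-stop 0 --max 8
```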
Signed-off-by: Sage Weil <sage@newdream.net>
in beb62c029a, FEATURE_QUINCY was added to
ceph::features::mon::get_persistent(), so update the test accordingly.
Signed-off-by: Kefu Chai <kchai@redhat.com>
The 'recovering' state is transitory. Existing code looks for it by
polling 'pg stat', missing it from time to time.
New version searches the tails of the relevant OSDs' logs.
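A sketch of the log-based detection (the exact state-machine token and the
required debug_osd level are assumptions):
```
# inside the test function: scan each relevant osd's log tail
for osd in $(seq 0 2); do
    tail -n 2000 $dir/osd.$osd.log | \
        grep -q 'state<Started/Primary/Active/Recovering>' && return 0
done
return 1
```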
Fixes: https://tracker.ceph.com/issues/48719
Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
While the relevant comment says:
'# Execute the command and prepend the output with its pid'
the actual PID logged is the same for all background processes,
which isn't very helpful.
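One way to get a distinct prefix per job is $BASHPID, which, unlike $$,
expands inside each background subshell (sketch; helper name illustrative):
```
function run_in_background() {
    # $$ is identical for every job; $BASHPID differs per subshell, so each
    # job's output gets its own prefix
    ( "$@" 2>&1 | sed "s/^/${BASHPID}: /" ) &
}
```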
Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
Stop waiting for a scrub to happen if the Primary for the target
PG changes.
Fixes: https://tracker.ceph.com/issues/48720
Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
in my test bed, it takes 11 seconds to boot the 3 OSDs and to restart
one of them, and this fails the test.
so we need to take that time into consideration. in this change, the
delay is added to the total "warn_older_version_delay", so the monitor
does not start sending warnings earlier than expected.
Signed-off-by: Kefu Chai <kchai@redhat.com>
in e5b1ae5554c4d8a20f9f0ff562b231ad0b0ba0ab, a new option named
"debug_version_for_testing" is introduced to override the version so
we can test version check.
in crimson, we have two families of shared functions.
- one of them is used by alien store. they are compiled with
-DWITH_SEASTAR and -DWITH_ALIEN, to enable the shim code between
seastar and POSIX thread.
- another is used by crimson in general, where no lock is allowed.
currently, we use the "crimson" and "ceph" namespace to differentiate
these two families of functions, so they can colocate in the same
executable without violating the ODR. see src/include/common_fwd.h for
more details.
the functions defined in src/common/version.cc are also shared by
alien store and crimson code. and because we have different
implementations of `CephContext` in crimson and in classic OSD (i.e.
alienstore), we have to have different implementations of this function
as well, if we follow the same approach. but since these functions are
very simple and are non-blocking, there is not much value in
differentiating them; it is better to inject the test settings using an
environment variable instead of the ceph option subsystem.
in this change, "ceph_debug_version_for_testing" environment variable is
checked instead, so that crimson and alienstore can share the same
compilation unit of version.cc. and "debug_version_for_testing" option
is removed.
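Hypothetical usage from a standalone test (the accepted value format is an
assumption):
```
# make a restarted osd report a fake version so the mon's
# version-mismatch warning can be exercised
export ceph_debug_version_for_testing=01.00.00-gversion-test
run_osd $dir 0 || return 1
```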
Signed-off-by: Kefu Chai <kchai@redhat.com>
Add a test case for permitted hours to make sure scrub doesn't start.
Remove permitted hours in the extended sleep test.
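The permitted-hours setup might look like this sketch
(osd_scrub_begin_hour/osd_scrub_end_hour are real options; the window
arithmetic is illustrative):
```
# pin the permitted scrub window to hours that exclude "now"
hour=$(date +%-H)
ceph config set osd osd_scrub_begin_hour $(( (hour + 2) % 24 ))
ceph config set osd osd_scrub_end_hour   $(( (hour + 3) % 24 ))
# a periodic scrub must now stay pending
ceph pg dump pgs | grep scrubbing && echo "ERROR: scrub started anyway"
```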
Fixes: https://tracker.ceph.com/issues/48077
Signed-off-by: David Zafman <dzafman@redhat.com>
While creating an erasure-coded profile, make sure
that the user specifies a valid crush-failure-domain.
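The check can be exercised from the CLI, e.g. (sketch):
```
# a known failure domain is accepted
ceph osd erasure-code-profile set goodprofile k=2 m=1 crush-failure-domain=osd
# an unknown one must now be rejected
ceph osd erasure-code-profile set badprofile k=2 m=1 crush-failure-domain=bogus \
    && echo "ERROR: bogus failure domain accepted"
```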
Fixes: https://tracker.ceph.com/issues/47452
Signed-off-by: Prashant Dhange <pdhange@redhat.com>
This overrides what the CephContext believes to be the current quorum of
monitors (retrieved from other instances of the MonClient), introduced
by [1]. Tests need to be able to target a specific monitor for
exercising forwarding and other things.
[1] 731e2db9fb4611f767446a3c8e778a097ce70d35
Fixes: https://tracker.ceph.com/issues/47180
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
The test should mark the OSD out to check if only "in" OSDs are considered by
the osdmap trimming logic.
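The missing step amounts to the following (sketch; reading the trim boundary
back via ceph report is an assumption):
```
# mark the osd out, not just down: trimming only considers "in" osds
ceph osd out osd.0
# the trim boundary should then advance
ceph report | jq .osdmap_first_committed
```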
Fixes: https://tracker.ceph.com/issues/47309
Signed-off-by: Neha Ojha <nojha@redhat.com>
we could pass `text=True` for better readability, but that was only introduced
in python3.7; or pass `errors="ignore"`, but that's too long.
Signed-off-by: Kefu Chai <kchai@redhat.com>
no need to check for their existence and prepare a replacement,
because we've migrated to python3, and we only support python3.6 and up.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Test that the osd doesn't crash when it gets a bad incremental osdmap.
Related-to: https://tracker.ceph.com/issues/46443
Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
I have absolutely no idea why it's counting features, but
apparently it is and bumping the value to 7 makes it pass.
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
Include a test case.
Configurable by setting mon_osd_warn_num_repaired (default 10).
Ignore the new health warning in the random eio injection test.
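A sketch of how a test can make the warning easy to trigger (the
OSD_TOO_MANY_REPAIRS health code name is an assumption):
```
# lower the threshold so a couple of injected-read-error repairs trip it
ceph config set mon mon_osd_warn_num_repaired 2
# ...inject eio errors and read the objects back to force repairs, then:
ceph health detail | grep OSD_TOO_MANY_REPAIRS
```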
Fixes: https://tracker.ceph.com/issues/41564
Signed-off-by: David Zafman <dzafman@redhat.com>
a0b453ad335671bd92f165115d6ee984d2412448 added the wait state, which can
make PGs stay in active+clean+wait for a while instead of going into
active+clean directly. As far as TEST_auto_repair_bluestore_failed is
concerned, we only care about the repair state being cleared.
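In other words, the wait condition should be the repair tag itself rather
than active+clean (sketch):
```
# wait for the repair tag to clear; the pg may sit in active+clean+wait
while ceph pg $pgid query | jq -r '.state' | grep -q repair ; do
    sleep 1
done
```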
Fixes: https://tracker.ceph.com/issues/45075
Signed-off-by: Neha Ojha <nojha@redhat.com>
v2 was introduced in nautilus, and we don't support mimic -> pacific
upgrades (only mimic -> octopus). This test can be removed!
Signed-off-by: Sage Weil <sage@redhat.com>
to address the test failures like
```
2020-04-07T15:44:58.693 INFO:tasks.workunit.client.0.smithi049.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/scrub/osd-scrub-repair.sh:498: TEST_auto_repair_bluestore_failed: ceph pg dump
pgs
2020-04-07T15:44:58.694 INFO:tasks.workunit.client.0.smithi049.stderr://home/ubuntu/cephtest/clone.client.0/qa/standalone/scrub/osd-scrub-repair.sh:498: TEST_auto_repair_bluestore_failed: pgid
2020-04-07T15:44:58.694 INFO:tasks.workunit.client.0.smithi049.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/scrub/osd-scrub-repair.sh: line 498: pgid: command not found
```
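One common cause of failures with this shape is whitespace around '=', which
makes bash execute the would-be variable as a command (illustrative):
```
pgid = $(ceph pg dump pgs)   # wrong: executes 'pgid' -> "command not found"
pgid=$(ceph pg dump pgs)     # correct: assigns the command's output
```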
Signed-off-by: Kefu Chai <kchai@redhat.com>
It is possible for the pg dump to not be the latest when we check for newprimary
in _common_test(). This is because mgr_stats_period is 5 seconds, and we may not
have fetched the latest stats just yet. This causes the test to look at the same
stats before and after wait_for_clean.
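The usual remedy in these tests is to flush the mgr's stats right before
sampling (flush_pg_stats is an existing ceph-helpers.sh helper):
```
# force fresh pg stats instead of racing the 5-second mgr_stats_period
flush_pg_stats || return 1
ceph pg dump pgs
```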
Fixes: https://tracker.ceph.com/issues/43807 (2)
Signed-off-by: Neha Ojha <nojha@redhat.com>
Mon might fail to share the newest map with any of up osds, e.g.,
due to an injected broken pipe. Since we don't have any client
activities during the osd-markdown tests, osds might be unaware of
the map changes made through CLI. Make sure osds have pulled the
newest map down before we can test its reaction correctly.
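Such a barrier might look like this sketch (the newest_map field comes from
the osd admin socket's status output):
```
# block until every up osd has caught up with the cluster's osdmap epoch
epoch=$(ceph osd dump --format=json | jq .epoch)
for id in $(ceph osd ls); do
    while (( $(ceph daemon osd.$id status | jq .newest_map) < epoch )); do
        sleep 1
    done
done
```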
Fixes: https://tracker.ceph.com/issues/44662
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
* refs/pull/33885/head:
Merge pull request #33848 from mchangir/octopus-tests-remove-suprious-whitespace
Merge PR #33746 into octopus
Merge PR #33830 into octopus
Merge PR #33732 into octopus
Merge PR #33620 into octopus
Merge pull request #33876 from tchaikov/octopus-cephadm-mypy
cephadm: add "assert foo is not None" for mypy check
Merge pull request #33067 from tspmelo/wip-rbd-delete-with-snapshot
cephadm: add grafana adopt
Merge PR #33771 into octopus
Merge PR #33850 into octopus
Merge PR #33853 into octopus
Merge PR #33857 into octopus
Merge PR #32990 into octopus
Merge PR #33713 into octopus
Merge PR #33838 into octopus
qa/tasks/cephadm: no default mon|mgr|crash service specs
qa/suites/rados/cephadm/upgrade: upgrade start point that supports the no-spec option
Merge PR #33832 into octopus
cephadm: bootstrap: wait for mgr to restart after enabling a module
mgr: add 'mgr_status' tell command
Merge pull request #33839 from rhcs-dashboard/44538-fix-rgw-grafana-get-put-latencies
Merge pull request #33743 from votdev/issue_43869_fix_qa_test
cephadm: create initial mon and mgr service specs too
cephadm: no need to pregenerate a crash key for the bootstrap host
mgr/cephadm: do not complain when we don't have enough hosts
mgr/cephadm: remove orphan daemons
mgr/cephadm: report size=0 for fabricated ServiceDescription
mgr/cephadm: safety check to prevent removing all mon|mgr daemons
mgr/cephadm: prevent scaling mon|mgr below count=1
mgr/cephadm: do not remove daemons from remove_service
Merge pull request #33805 from tchaikov/wip-44500
spec: Podman (temporarily) requires apparmor-abstractions on suse
mgr/cephadm: Make sure we don't co-locate the same daemon
monitoring: fix RGW grafana chart 'Average GET/PUT Latencies'
tests: remove spurious whitespace
mgr/cephadm: fix service list filtering
Merge PR #33825 into octopus
Merge PR #33811 into octopus
Revert "Merge pull request #33673 from cbodley/wip-denc-enum"
mgr/cephadm: fix upgrade order
Merge PR #33801 into octopus
Merge PR #33822 into octopus
cephadm: bootstrap: tolerate error return from -h
Merge PR #33809 into octopus
Merge PR #32678 into octopus
cephadm: use `sh` instead of `bash` during enter
ceph.in: only shut down rados on clean exit
common/ceph_timer: Pass reference to waited time on stack
common/ceph_timer: Add test
common/ceph_timer: Use unique_function, allowing noncopyable events
common/ceph_timer: Couple cleanups
common/ceph_timer: Fix namespaces
common/ceph_timer: Add missing includes
common/ceph_timer.h: Don't indent contents of a namespace
mgr/dashboard: Crush rule modal
mgr/dashboard: Preserve rule selection on pool type change
mgr/dashboard: Crush rule is only send during replicated pool creation
mgr/dashboard: Explicit returns in pool form
mgr/dashboard: Removes fork join in pool form
mgr/dashboard: Hide ECP actions during ec pool edit
mgr/dashboard: Pool form erasure/replicated boolean
mgr/dashboard: Change pool info API endpoint
mgr/dashboard: Moves ECP info endpoint to UI-API
mgr/cephadm: add _remove_osds_bg back to main loop
mgr/cephadm/osd: update removal report immediately
qa/tasks/ceph_manager: use StringIO for capturing COT output
qa/standalone/scrub/osd-scrub-repair: force osdmap prop to osds
qa/standalone/scrub/osd-scrub-test: wait longer for update
qa/tasks/ceph_manager: capture stderr for COT
qa/suites/rados/ceph: drop opensuse for now
mon/MonClient: send logs to mon on separate schedule than pings
mgr/dashboard: Fix missing ImageSpec usage
mgr/dashboard: Allow removing RBD with snapshots
mgr/dashboard: Refactor and cleanup tasks.mgr.dashboard.test_user
mgr/dashboard: support multiple DriveGroups when creating OSDs
mon/MonClient: send logs to mon even if we have no keepalive2
cephadm: flag dashboard user to change password
Reviewed-by: Sebastian Wagner <swagner@suse.com>