RepoMirrors/ceph

mirror of https://github.com/ceph/ceph synced 2025-02-19 17:08:05 +00:00

Author	SHA1	Message	Date
Kamoltat	f06da20dff	pybind/mgr/progress: disable pg recovery event by default The progress module disabled the pg recovery event by default since the event is expensive and has interrupted other serviceis when there is OSDs being marked in/out from the the cluster. To turn the event on manually: ceph config set mgr mgr/progress/allow_pg_recovery_event true Updated qa/tasks/mgr/test_progress.py to enable the pg recovery event when testing the progress module. Signed-off-by: Kamoltat <ksirivad@redhat.com>	2022-02-03 17:51:42 +00:00
Kamoltat	5f33f2f6e0	mgr/test_progress.py: Delay recover in test_progress Changes some the tests in teuthology to make the test more deterministic. Using: `ceph osd set norecover` and `ceph osd set nobackfill` when marking osds in or out. As this will delay the recovery and make sure it the test cases get the chance to check that there is actually events poping up in the progress module. took out test_osd_cannot_recover from tasks/mgr/test_progress.py since it is no longer a relevant test case since recovery will get triggered regardless if pg is unmoved. Ignoring `OSDMAP_FLAGS` in teuthology because we are using norecover and nobackfill to delay the recovery process, therefore, it will create a health warning and fails the teuthology test. Signed-off-by: Kamoltat <ksirivad@redhat.com>	2021-07-13 19:33:20 +00:00
Kamoltat	4b00f1c2bd	pybind/mg/progress: Disregard unreported pgs The global recovery event progress calculations only takes into account pgs with `reported_epoch < start_epoch_of_event` but sometimes the pgs doesn't get move before or after the creation of the global recovery event, therefore this might result in a bug where the global event gets stuck forever unless there is another event that specifically makes the pgs that get stuck moves and updates its `reported_epoch`. Therefore, we decided to disregard pgs that are in active+clean state but has `reported_epoch < start_epoch_of_event`. Fixes: https://tracker.ceph.com/issues/49988 Signed-off-by: Kamoltat <ksirivad@redhat.com>	2021-06-09 15:11:32 +00:00
Sridhar Seshasayee	328271d587	qa/tasks: Enhance wait_until_true() to check & retry recovery progress With mclock scheduler enabled, the recovery throughput is throttled based on factors like the type of mclock profile enabled, the OSD capacity among others. Due to this the recovery times may vary and therefore the existing timeout of 120 secs may not be sufficient. To address the above, a new method called _is_inprogress_or_complete() is introduced in the TestProgress Class that checks if the event with the specified 'id' is in progress by checking the 'progress' key of the progress command response. This method also handles the corner case where the event completes just before it's called. The existing wait_until_true() method in the CephTestCase Class is modified to accept another function argument called "check_fn". This is set to the _is_inprogress_or_complete() function described earlier in the "test_turn_off_module" test that has been observed to fail due to the reasons already described above. A retry mechanism of a maximum of 5 attempts is introduced after the first timeout is hit. This means that the wait can extend up to a maximum of 600 secs (120 secs * 5) as long as there is recovery progress reported by the 'ceph progress' command result. Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>	2021-06-02 14:19:48 +05:30
Neha Ojha	1523bf9bdb	Merge pull request #38107 from ceph/wip-mgr-progress-fix-48217 qa/mgr/test_progress: add _get_osd_in_out_events to account for osd marked in/out events Reviewed-by: Neha Ojha <nojha@redhat.com>	2020-11-18 07:15:22 -08:00
Kamoltat	52fe9dbdae	qa/mgr/test_progress: fix bug 48217 Fixes a failing test case regarding osd coming back after being marked out. The old test case wasn't accounting for a specific event, therefore this resulted in the failure. The fix basically accounts for a specific event of osd being marked in/out. Fixes: https://tracker.ceph.com/issues/48217 Signed-off-by: Kamoltat <ksirivad@redhat.com>	2020-11-17 07:54:35 +00:00
Kamoltat	993bb02b30	mgr/progress: introduce turn off/on feature progress module can be turned off/on by using the commands: 'progress off' and 'progress on' As well as refractoring teuthology test suite to prevent future bugs that can possibly occur fixes: https://tracker.ceph.com/issues/47238 Signed-off-by: kamoltat <ksirivad@redhat.com>	2020-11-16 03:46:42 +00:00
Kamoltat	2af2afa5e9	mgr/progress: Global Recovery Event in ceph -s Modified the progress module and BaseMgrModule to support Global Recovert Event. Adding more arguments to update_progress_event, ceph_update_progress_event. To only show global recovery event progress with `ceph -s`. All sub events have been move to `ceph progress` Signed-off-by: Kamoltat <ksirivad@redhat.com>	2020-10-22 16:44:50 +00:00
Kefu Chai	7d37226548	qa/tasks/mgr: use relative import for better readability, and to ease the pain of developer to track back to the top level python package for referencing a submodule Signed-off-by: Kefu Chai <kchai@redhat.com>	2020-03-27 14:51:24 +08:00
Kefu Chai	947a74349d	qa: import with full path to be py3 compatible Signed-off-by: Kefu Chai <kchai@redhat.com>	2020-03-24 18:27:55 +08:00
Kefu Chai	7d262db114	qa/tasks: call super class's setUp() to address the regression introduced by `8729281121` Signed-off-by: Kefu Chai <kchai@redhat.com>	2020-02-15 12:39:08 +08:00
Ricardo Dias	b03537949a	qa/mgr/progress: fix timeout error when waiting for osd in event Fixes: https://tracker.ceph.com/issues/40618 Signed-off-by: Ricardo Dias <rdias@suse.com>	2019-09-03 11:44:05 +01:00
Kamoltat (Junior) Sirivadhna	baa714117c	qa/tasks/mgr/test_progress.py: fix bug in `9b4dbf0` follow-up-fix for `9b4dbf0` basically we wanna look at the list that has inprogress events to inprogress+complete Fixes: http://pulpito.ceph.com/kchai-2019-07-28_14:30:09-rados-wip-kefu2-testing-2019-07-28-1941-distro-basic-mira/4160881/ Signed-off-by: Kamoltat (Junior) Sirivadhna <ksirivad@redhat.com>	2019-08-05 11:03:33 -04:00
Kefu Chai	9b4dbf0749	qa/tasks/mgr/test_progress.py: s/ev/new_event/ as a follow-up fix for `5604ba4e` Fixes: http://tracker.ceph.com/issues/40618 Signed-off-by: Kefu Chai <kchai@redhat.com>	2019-07-28 19:32:24 +08:00
Kamoltat (Junior) Sirivadhna	5604ba4ec1	qa/mgr/progress: Update the test suite for progress module Update the test suite to reflect a feature change that has been merged to master in progress module where you create an event when an osd is marked in. Fixes: http://tracker.ceph.com/issues/40618 with success QA run in sepia: http://pulpito.ceph.com/kchai-2019-07-16_10:10:01-rados-master-distro-basic-mira/ Signed-off-by: Kamoltat (Junior) Sirivadhna <ksirivad@redhat.com>	2019-07-22 11:09:42 -04:00
Patrick Donnelly	1071f73c76	qa: use skipTest method instead of exception This is the recommended method to skip a test according to [1]. It also lets us avoid an unnecessary import. [1] https://docs.python.org/2/library/unittest.html#unittest.TestCase.skipTest Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2019-04-24 09:38:52 -07:00
Sage Weil	1d305f1264	mgr/progress: revise message syntax a bit "osd.0", not "OSD 0" Signed-off-by: Sage Weil <sage@redhat.com>	2019-02-08 13:50:27 -06:00
John Spray	5ecd69099d	qa: add tests for progress module Signed-off-by: John Spray <john.spray@redhat.com>	2018-09-11 11:21:35 +01:00

18 Commits