This test passes on CentOS and RHEL, but fails on Ubuntu with an invalid
pointer error. Since the envlibrados RocksDB tests are experimental
and don't have any actual users, we can just run them on RHEL and
CentOS.
The actual bug is not yet fully understood, but fixing it was deemed
low priority, so removing the test from the problematic distro is
acceptable for the time being. This commit is a workaround rather than
a fix for the underlying issue.
Related tracker: https://tracker.ceph.com/issues/57632
Signed-off-by: Laura Flores <lflores@redhat.com>
Set the osd_mclock_override_recovery_settings option to true for tests that
modify recovery/backfill configuration options. This prevents the cluster
warning from being logged when the recovery/backfill limits are modified.
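For reference, a minimal sketch of the override, assuming the usual teuthology conf-override form (the exact placement in the suite files is not shown here):
```
overrides:
  ceph:
    conf:
      osd:
        osd_mclock_override_recovery_settings: true
```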
Fixes: https://tracker.ceph.com/issues/57529
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
- remove upgrades from octopus
- stubs for completing upgrade to reef
Still missing the quincy-x upgrade tests.
`c8e1f4c2b547a152e049af2b529bf415f6d76e59` has moved
the `thrash-old-clients` tests back to the rados suite.
This commit updates `release-checklists.rst` accordingly.
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
All `rados/thrash-erasure-code-big` tests that die due to the “wait_for_recovery” timeout have one thing in common: They contain either `thrashers/pggrow` or `thrashers/mapgap`.
What distinguishes pggrow and mapgap from all the non-offending thrashers (default, careful, fastread, and morepggrow) is that they lack an override setting for `osd max backfills`. `osd max backfills` is the maximum number of backfill operations allowed to/from an OSD; the higher the number, the quicker the recovery. By default, this value is 1. Each of the non-offending thrashers overrides that default in its .yaml file with a value > 1; pggrow and mapgap do not.
The mclock op scheduler is known to override `osd max backfills` with a high value, but all of the thrash-erasure-code-big thrashers have their op queue set to “debug_random”, which chooses randomly between op queues (the debug_random op queue is set to override the default mclock_scheduler in qa/config/rados.yaml). So, coupled with the “debug_random” op queue, the low `osd max backfills` setting is causing some tests to time out in recovery.
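For illustration, the kind of per-thrasher override the non-offending thrashers already carry, and which this PR adds for pggrow and mapgap, looks roughly like this (the exact value shown is an assumption, not necessarily the one merged):
```
overrides:
  ceph:
    conf:
      osd:
        osd max backfills: 4
```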
WITHOUT an `osd max backfills` override, as they are now, “mapgap” and “pggrow” tests die due to timed-out recovery about 17/100 times, as seen here with a pggrow test: http://pulpito.front.sepia.ceph.com/lflores-2022-05-18_14:24:29-rados:thrash-erasure-code-big-master-distro-default-smithi/
WITH `osd max backfills` specified, as I have suggested in this PR, 99/100 tests passed, with one test failing for a different reason:
http://pulpito.front.sepia.ceph.com/lflores-2022-05-17_22:40:27-rados:thrash-erasure-code-big-master-distro-default-smithi/
I also scheduled 145 tests WITH `osd max backfills` that are a mix of pggrow and mapgap thrashers. 144/145 tests passed, with one test failing for a different reason. http://pulpito.front.sepia.ceph.com/lflores-2022-05-17_15:27:54-rados:thrash-erasure-code-big-master-distro-default-smithi/
Fixes: https://tracker.ceph.com/issues/51076
Signed-off-by: Laura Flores <lflores@redhat.com>
Currently, every rados run of ~400 jobs is running ~150 cephadm tests,
which is unnecessary and redundant. With this change, we will run some
basic cephadm tests within the rados suite. The following seem like
a good start:
qa/suites/rados/cephadm/osds
qa/suites/rados/cephadm/smoke
qa/suites/rados/cephadm/smoke-singlehost
qa/suites/rados/cephadm/workunits
Signed-off-by: Neha Ojha <nojha@redhat.com>
Set and unset the noautoscale flag and evaluate
whether the results are what we expect. Also
evaluate whether the flag behaves correctly when
we create new pools.
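A minimal sketch of the flow being tested, written as a teuthology exec task; the role name, pool name, and exact command forms are assumptions here (the real test lives in a Python task):
```
tasks:
- exec:
    mon.a:
      - ceph osd pool set noautoscale
      - ceph osd pool get noautoscale      # flag should report as set
      - ceph osd pool create test-pool     # a newly created pool should respect the flag
      - ceph osd pool unset noautoscale
      - ceph osd pool get noautoscale      # flag should report as unset
```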
Signed-off-by: Kamoltat <ksirivad@redhat.com>
After the https://github.com/ceph/ceph/pull/42526 and https://github.com/ceph/ceph/pull/43725 merges,
the following files no longer exist, but there were still references to them:
- src/pybind/mgr/dashboard/services/ganesha.py
- qa/tasks/mgr/dashboard/test_ganesha.py
The following files were renamed, but there were still references to their old names:
- src/pybind/mgr/dashboard/controllers/nfsganesha.py: nfsganesha.py --> nfs.py
- src/pybind/mgr/dashboard/tests/test_ganesha.py: test_ganesha.py --> test_nfs.py
Other changes in qa/suites/rados/dashboard/tasks/dashboard.yaml:
- Add missing task: tasks.mgr.dashboard.test_api
- Sort dashboard tasks alphabetically.
Fixes: https://tracker.ceph.com/issues/53123
Signed-off-by: Alfonso Martínez <almartin@redhat.com>
I think we have enough coverage. Always testing all
objectstores is a bit excessive, in my opinion.
Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
Add the host section of the cluster creation workflow.
1. Fix a bug in the modal where going forward one step in the wizard and coming back opens up the Add Host modal.
2. Rename Create Cluster to Expand Cluster, as per the discussions.
3. Add a skip confirmation modal to warn the user when they try to skip the
cluster creation.
4. Adapt all the tests.
5. Make some UI improvements, like fixing and aligning the styles and
colors.
- Use a routed modal for the host addition form.
- Rename Create to Add in the host form.
Fixes: https://tracker.ceph.com/issues/51517
Fixes: https://tracker.ceph.com/issues/51640
Fixes: https://tracker.ceph.com/issues/50336
Fixes: https://tracker.ceph.com/issues/50565
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
Signed-off-by: Nizamudeen A <nia@redhat.com>
This commit has been causing scheduled jobs to request e.g. aarch64
smithi machines, which don't exist. The dispatcher then tries to find
them forever and has to be killed and restarted; the queue will sit
idle until someone notices the problem.
Signed-off-by: Zack Cerza <zack@redhat.com>
modified: qa/standalone/erasure-code/test-erasure-code-plugins.sh
new file: qa/suites/rados/thrash-erasure-code-isa/arch/aarch64.yaml
Signed-off-by: Dai Zhiwei <daizhiwei3@huawei.com>
Force a subset of tests that explicitly employ the filestore backend to
use the WPQ scheduler. This is because the mclock scheduler will not be
optimized for filestore.
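A sketch of the override applied to those tests, assuming the standard teuthology conf-override form:
```
overrides:
  ceph:
    conf:
      osd:
        osd op queue: wpq
```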
Fixes: https://tracker.ceph.com/issues/52025
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
This is no longer required because we removed the cosbench workloads in
fd350fd015. Removing it is also needed to prevent failures like the
following, which can be triggered by any change that breaks the rgw task:
```
2021-08-06T20:13:25.812 INFO:teuthology.orchestra.run.smithi060.stderr:curl: (7) Failed to connect to smithi060.front.sepia.ceph.com port 80: Connection refused
2021-08-06T20:15:33.813 ERROR:teuthology.contextutil:Saw exception from nested tasks
Traceback (most recent call last):
File "/home/teuthworker/src/git.ceph.com_git_teuthology_04c2febe7099917d97a71271f17abb5710030132/teuthology/contextutil.py", line 31, in nested
vars.append(enter())
File "/usr/lib/python3.6/contextlib.py", line 81, in __enter__
return next(self.gen)
File "/home/teuthworker/src/github.com_ceph_ceph-c_3c0f8c8164075af7aac4d1f2805d3f4580709461/qa/tasks/rgw.py", line 191, in start_rgw
wait_for_radosgw(url, remote)
File "/home/teuthworker/src/github.com_ceph_ceph-c_3c0f8c8164075af7aac4d1f2805d3f4580709461/qa/tasks/util/rgw.py", line 94, in wait_for_radosgw
assert exit_status == 0
AssertionError
```
Signed-off-by: Neha Ojha <nojha@redhat.com>
Changes some of the tests in teuthology to make
them more deterministic.
Use `ceph osd set norecover` and
`ceph osd set nobackfill` when marking OSDs in
or out. This delays recovery and makes
sure the test cases get the chance to check
that events are actually popping up in
the progress module.
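As an illustration of the intended ordering, expressed as a teuthology exec task (the real checks live in tasks/mgr/test_progress.py; the role name and OSD id are assumptions):
```
tasks:
- exec:
    mon.a:
      - ceph osd set norecover
      - ceph osd set nobackfill
      - ceph osd out 0
      # recovery is held off at this point, so the progress module's
      # event for the marked-out OSD can be observed before it completes
      - ceph osd unset nobackfill
      - ceph osd unset norecover
```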
Took out test_osd_cannot_recover from
tasks/mgr/test_progress.py since it is no longer
a relevant test case: recovery will get
triggered regardless, even if the PG is unmoved.
Ignore `OSDMAP_FLAGS` in teuthology
because we are using norecover and nobackfill
to delay the recovery process; those flags
raise a health warning that would otherwise fail the
teuthology test.
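The ignore entry is a standard teuthology override; a minimal sketch:
```
overrides:
  ceph:
    log-ignorelist:
      - \(OSDMAP_FLAGS\)
```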
Signed-off-by: Kamoltat <ksirivad@redhat.com>
It's a regression introduced by the restructure of the test suites;
let's pin the test to CentOS 8.
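A sketch of the pin, assuming the usual teuthology distro keys (the exact version string is an assumption):
```
os_type: centos
os_version: '8'
```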
See-also: https://tracker.ceph.com/issues/49638
Signed-off-by: Kefu Chai <kchai@redhat.com>
This is mostly for testing: a lot of tests assume that there are no
existing pools. These tests relied on a config to turn off creating the
"device_health_metrics" pool, which generally exists for any new Ceph
cluster. It would be better to make these tests tolerant of the new .mgr
pool, but clearly there are a lot of them. So just convert the config to
make it work.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
"get_heap_property *" asock commands are exposed to operators
to check the tcmalloc internals for understanding the performance
of the memory subsystem. but crimson uses the builtin seastar allocator
which is not backed by tcmalloc. but we can dump the metrics using
the "dump_metrics" asock command which is only available from
crimson-osd.
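For example, the metrics could be fetched through the admin socket of a running crimson-osd; a sketch as a teuthology exec task (role and daemon name are assumptions):
```
tasks:
- exec:
    osd.0:
      - ceph daemon osd.0 dump_metrics
```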
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Signed-off-by: Kefu Chai <kchai@redhat.com>