Commit Graph

831 Commits

Author SHA1 Message Date
Neha Ojha
f849f1554c qa/suites/rados: reduce the number of cephadm tests
Currently, every rados run of ~400 jobs is running ~150 cephadm tests,
which is unnecessary and redundant. With this change, we will run some
basic cephadm tests within the rados suite. The following seems to be
a good start.

qa/suites/rados/cephadm/osds
qa/suites/rados/cephadm/smoke
qa/suites/rados/cephadm/smoke-singlehost
qa/suites/rados/cephadm/workunits

Signed-off-by: Neha Ojha <nojha@redhat.com>
2022-01-21 23:38:53 +00:00
Pere Diaz Bou
15dfa71cf7 mgr: TTLCache basic implementation
Signed-off-by: Pere Diaz Bou <pdiazbou@redhat.com>
Fixes: https://tracker.ceph.com/issues/48388
2022-01-05 10:11:58 +01:00
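
Note on 15dfa71cf7: the commit message does not describe how the cache works, so the following is only a minimal sketch of the general time-to-live caching idea, written in Python for brevity. It is not the code added to the mgr. The class name `TTLCache` is borrowed from the commit title; the `put`/`get` methods, the `ttl_seconds` parameter, and the injectable `clock` are assumptions made for illustration.

```python
import time
from typing import Callable, Dict, Generic, Hashable, Optional, Tuple, TypeVar

V = TypeVar("V")


class TTLCache(Generic[V]):
    """Illustrative time-to-live cache (hypothetical API, not the Ceph mgr code)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests can control time
        # key -> (expiry timestamp, cached value)
        self._entries: Dict[Hashable, Tuple[float, V]] = {}

    def put(self, key: Hashable, value: V) -> None:
        """Store a value that expires ttl_seconds from now."""
        self._entries[key] = (self._clock() + self._ttl, value)

    def get(self, key: Hashable) -> Optional[V]:
        """Return the cached value, or None if it is missing or expired."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired entries are dropped lazily, on lookup.
            del self._entries[key]
            return None
        return value

    def __len__(self) -> int:
        return len(self._entries)
```

Expiry here is lazy: an entry is only evicted when it is looked up after its deadline. A real implementation might also bound the cache size or sweep expired entries periodically.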
Kamoltat
bb42c71e7e qa: Added workunit test for noautoscale flag
Set and unset the noautoscale flag and
evaluate whether the results are what
we expect. Also evaluate whether
the flag is correct when we
create new pools.

Signed-off-by: Kamoltat <ksirivad@redhat.com>
2021-12-22 21:42:28 +00:00
Kamoltat
c194f4a3eb qa/workunits/mon/pg_autoscaler: modified test script
Modified test script to include `bulk` and
remove all `profile` options.

Signed-off-by: Kamoltat <ksirivad@redhat.com>
2021-12-20 21:46:37 +00:00
Sage Weil
b430fd538f qa/suites/rados/thrash-old-clients: use better-supported cephadm distro/podman
Signed-off-by: Sage Weil <sage@newdream.net>
2021-11-30 10:47:53 -06:00
Ernesto Puerta
515af762bb
Merge pull request #43987 from rhcs-dashboard/53123-dashboard-nfs-cleanup
mgr/dashboard: NFS non-existent files cleanup

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: ljflores <NOT@FOUND>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
2021-11-19 20:40:41 +01:00
Sage Weil
411b2d39c2 qa/suites/rados/dashboard: use single-container-host.yaml
Signed-off-by: Sage Weil <sage@newdream.net>
2021-11-17 09:02:42 -06:00
Alfonso Martínez
045d2d0f76 mgr/dashboard: NFS non-existent files cleanup
After https://github.com/ceph/ceph/pull/42526 and https://github.com/ceph/ceph/pull/43725 merges,
the following files do not exist but there were still references to them:
- src/pybind/mgr/dashboard/services/ganesha.py
- qa/tasks/mgr/dashboard/test_ganesha.py

The following files were renamed but there were still references to old names:
- src/pybind/mgr/dashboard/controllers/nfsganesha.py:  nfsganesha.py --> nfs.py
- src/pybind/mgr/dashboard/tests/test_ganesha.py:  test_ganesha.py --> test_nfs.py

Other changes in qa/suites/rados/dashboard/tasks/dashboard.yaml:
- Add missing task: tasks.mgr.dashboard.test_api
- Sort dashboard tasks alphabetically.

Fixes: https://tracker.ceph.com/issues/53123
Signed-off-by: Alfonso Martínez <almartin@redhat.com>
2021-11-17 13:25:17 +01:00
Sebastian Wagner
116a8c4208
qa/suites/rados/mgr: use only one objectstore instead of all
I think we have enough coverage. Always testing all
objectstores is a bit excessive in my opinion

Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
2021-10-28 15:01:29 +02:00
Kefu Chai
70b049ffdb
Merge pull request #43239 from trociny/wip-48959
osd: handle inconsistent hash info during backfill and deep scrub gracefully

Reviewed-by: Samuel Just <sjust@redhat.com>
2021-10-14 22:43:16 +08:00
Ernesto Puerta
90bbcab09f
Merge pull request #42557 from ceph/feature-50336-cluster-creation-wizard
mgr/dashboard: Cluster Creation/Expansion Wizard

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: sebastian-philipp <NOT@FOUND>
Reviewed-by: Volker Theile <vtheile@suse.com>
2021-10-14 15:12:42 +02:00
Nizamudeen A
59cbf97e6c mgr/dashboard: Cluster Creation Add Host Section and e2es
Add host section of the cluster creation workflow.

1. Fix bug in the modal where going forward one step on the wizard and coming back opens up the add host modal.
2. Rename Create Cluster to Expand Cluster as per the discussions
3. A skip confirmation modal to warn users when they try to skip the
   cluster creation
4. Adapted all the tests
5. Did some UI improvements like fixing and aligning the styles and
   colors.
- Used routed modal for host Addition form
- Renamed the Create to Add in Host Form

Fixes: https://tracker.ceph.com/issues/51517
Fixes: https://tracker.ceph.com/issues/51640
Fixes: https://tracker.ceph.com/issues/50336
Fixes: https://tracker.ceph.com/issues/50565
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
Signed-off-by: Nizamudeen A <nia@redhat.com>
2021-10-13 15:55:23 +05:30
Zack Cerza
b57539dc94 Revert "qa: support isal ec test for aarch64"
This commit has been causing scheduled jobs to request e.g. aarch64
smithi machines, which don't exist. The dispatcher then tries to find
them forever, requiring the dispatcher to be killed and restarted. The
queue will sit idle until someone notices the problem.

Signed-off-by: Zack Cerza <zack@redhat.com>
2021-10-12 12:53:58 -06:00
Dai Zhiwei
eaa385f3da qa: support isal ec test for aarch64
modified:   qa/standalone/erasure-code/test-erasure-code-plugins.sh
	new file:   qa/suites/rados/thrash-erasure-code-isa/arch/aarch64.yaml

Signed-off-by: Dai Zhiwei <daizhiwei3@huawei.com>
2021-10-08 14:37:25 +08:00
Kefu Chai
958b22e3ab
Merge pull request #43335 from liewegas/debug-51815
mon,auth: fix proposal (and mon db rebuild) of rotating secrets

Reviewed-by: Neha Ojha <nojha@redhat.com>
2021-10-07 06:45:45 +08:00
Neha Ojha
363b223844
Merge pull request #42964 from trociny/wip-52448
osd: re-cache peer_bytes on every peering state activate

Reviewed-by: Neha Ojha <nojha@redhat.com>
2021-10-06 09:26:16 -07:00
Sage Weil
eddfbbc421 qa/suites/rados/singleton/rebuild-mon-db: debug auth 30
Hunting https://tracker.ceph.com/issues/51815

Signed-off-by: Sage Weil <sage@newdream.net>
2021-10-01 14:42:23 -04:00
Mykola Golub
d35920da5e qa/suites/rados: add inconsistent hinfo test
Signed-off-by: Mykola Golub <mgolub@suse.com>
2021-09-28 16:43:02 +01:00
Sage Weil
0b361fc8b9 qa/packages: install ceph-volume
Signed-off-by: Sage Weil <sage@newdream.net>
2021-09-19 21:51:19 -04:00
Mykola Golub
76743e0058 qa/suites/rados: add backfill_toofull test
Signed-off-by: Mykola Golub <mgolub@suse.com>
2021-09-15 17:21:11 +03:00
Sridhar Seshasayee
7dcede75df qa: Use osd_op_queue=wpq for tests using filestore backend.
Force a subset of tests that explicitly employ the filestore backend to
use WPQ scheduler. This is because mclock scheduler will not be
optimized for filestore.

Fixes: https://tracker.ceph.com/issues/52025
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
2021-09-02 18:15:54 +05:30
Mykola Golub
7311f6656f qa/suites/rados: add crushdiff test
Signed-off-by: Mykola Golub <mykola.golub@clyso.com>
2021-08-27 17:45:40 +03:00
Sebastian Wagner
e436483c77
qa/distro: Add centos_8.2_container_tools_3.0.yaml
Let's avoid latest kubic stable

Fixes: https://tracker.ceph.com/issues/52279
Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
2021-08-20 10:53:11 +02:00
Neha Ojha
119544bb29 qa/suites/rados/perf/ceph.yaml: remove rgw
rgw is no longer required because we removed cosbench workloads in
fd350fd015. Removing it is also needed to prevent failures like the
following, or any other changes that break the rgw task:

```
2021-08-06T20:13:25.812 INFO:teuthology.orchestra.run.smithi060.stderr:curl: (7) Failed to connect to smithi060.front.sepia.ceph.com port 80: Connection refused
2021-08-06T20:15:33.813 ERROR:teuthology.contextutil:Saw exception from nested tasks
Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_04c2febe7099917d97a71271f17abb5710030132/teuthology/contextutil.py", line 31, in nested
    vars.append(enter())
  File "/usr/lib/python3.6/contextlib.py", line 81, in __enter__
    return next(self.gen)
  File "/home/teuthworker/src/github.com_ceph_ceph-c_3c0f8c8164075af7aac4d1f2805d3f4580709461/qa/tasks/rgw.py", line 191, in start_rgw
    wait_for_radosgw(url, remote)
  File "/home/teuthworker/src/github.com_ceph_ceph-c_3c0f8c8164075af7aac4d1f2805d3f4580709461/qa/tasks/util/rgw.py", line 94, in wait_for_radosgw
    assert exit_status == 0
AssertionError
```

Signed-off-by: Neha Ojha <nojha@redhat.com>
2021-08-09 15:08:11 +00:00
Neha Ojha
c9f8846b7f
Merge pull request #41907 from kamoltat/wip-ksirivad-progress-time-interval
pybind/mgr/progress: introduce 5 second sleep interval

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2021-07-21 16:53:38 -07:00
Volker Theile
f7f163e75c mgr/dashboard: Add configurable MOTD or wall notification
Fixes: https://tracker.ceph.com/issues/51408

Signed-off-by: Volker Theile <vtheile@suse.com>
2021-07-14 10:48:49 +02:00
Kamoltat
5f33f2f6e0 mgr/test_progress.py: Delay recover in test_progress
Changes some of the tests in teuthology to make
the tests more deterministic.
Using:

`ceph osd set norecover` and
`ceph osd set nobackfill` when marking osds in
or out. This delays the recovery and makes
sure the test cases get the chance to check
that there are actually events popping up in
the progress module.

Took out test_osd_cannot_recover from
tasks/mgr/test_progress.py since it is no longer
a relevant test case: recovery will get
triggered regardless of whether the pg is unmoved.

Ignoring `OSDMAP_FLAGS` in teuthology
because we are using norecover and nobackfill
to delay the recovery process, which
creates a health warning that would otherwise
fail the teuthology test.

Signed-off-by: Kamoltat <ksirivad@redhat.com>
2021-07-13 19:33:20 +00:00
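
Note on 5f33f2f6e0: the flag handling described in the message above can be sketched as a small Python helper. This is an illustration under stated assumptions, not the code in tasks/mgr/test_progress.py. The `ceph osd set` and `ceph osd unset` commands come from the commit message and the Ceph CLI; the names `run_ceph`, `recovery_paused`, and the injectable `run` parameter are hypothetical.

```python
import subprocess
from contextlib import contextmanager
from typing import Callable, Iterator, List, Sequence

Runner = Callable[[Sequence[str]], None]


def run_ceph(args: Sequence[str]) -> None:
    """Run a ceph CLI command and raise if it exits non-zero."""
    subprocess.run(["ceph", *args], check=True)


@contextmanager
def recovery_paused(run: Runner = run_ceph) -> Iterator[None]:
    """Hold recovery and backfill while the body of the with-block runs.

    Hypothetical helper: sets norecover and nobackfill on entry and
    unsets whichever flags were applied on exit, even if the body raises.
    """
    applied: List[str] = []
    try:
        for flag in ("norecover", "nobackfill"):
            run(["osd", "set", flag])
            applied.append(flag)
        yield
    finally:
        # Unset in reverse order so a partially applied state is cleaned up.
        for flag in reversed(applied):
            run(["osd", "unset", flag])
```

A test would mark an OSD in or out inside the `with recovery_paused():` block and check for the progress event there, so that recovery cannot finish before the check runs. While the flags are set the cluster reports `OSDMAP_FLAGS`, which is why the commit ignores that warning.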
Kefu Chai
15fa32dc86 qa: run e2e test on centos only
this change is a follow up of 02b8b0f490,
which failed to remove the random facet for distro.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2021-07-02 23:06:27 +08:00
Kefu Chai
e5c9315b11
Merge pull request #42084 from tchaikov/wip-49638
qa: run e2e test on centos only

Reviewed-by: Laura Paduano <lpaduano@suse.com>
2021-06-30 19:26:42 +08:00
Kefu Chai
812e58c597
Merge pull request #42013 from ronen-fr/wip-ronenf-scrubs-config
qa/suites/rados: add simultaneous scrubs to the thrasher

Reviewed-by: Neha Ojha <nojha@redhat.com>
2021-06-29 16:21:52 +08:00
Kefu Chai
02b8b0f490 qa: run e2e test on centos only
it's a regression introduced by the restructuring of the test suites;
let's pin the test to CentOS8.

See-also: https://tracker.ceph.com/issues/49638
Signed-off-by: Kefu Chai <kchai@redhat.com>
2021-06-29 13:09:53 +08:00
Kefu Chai
29064f1bf8
Merge pull request #41937 from liewegas/mgr-crash
mgr: generate crash dumps for Python exceptions in mgr modules

Reviewed-by: Kefu Chai <kchai@redhat.com>
2021-06-26 22:18:14 +08:00
Sage Weil
3edc04a46b qa/suites/rados/mgr: whitelist module crash during selftest
One of the selftests triggers an exception from serve().

Signed-off-by: Sage Weil <sage@newdream.net>
2021-06-25 13:48:45 -04:00
Ronen Friedman
d232c4e8d8 qa/suites/rados: add simultaneous scrubs (multiple options) to the thrasher
Setting osd-max-scrubs to either 2 or 3.

Triggered by https://tracker.ceph.com/issues/50346

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
2021-06-24 18:53:50 +03:00
Sage Weil
fe9963b03c qa/suites/rados/dashboard: fix e2e test
Move roles into task yaml.  Rename e2e.

Fixes: https://tracker.ceph.com/issues/51292
Signed-off-by: Sage Weil <sage@newdream.net>
2021-06-23 09:54:40 -05:00
Sage Weil
9074e87611 Merge PR #41827 into master
* refs/pull/41827/head:
	qa: move dashboard e2e from cephadm -> rados suite

Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
2021-06-14 09:11:04 -04:00
Sage Weil
ac05b3568f qa: move dashboard e2e from cephadm -> rados suite
This test fails ~20% of the time.

Signed-off-by: Sage Weil <sage@newdream.net>
2021-06-12 07:52:54 -05:00
Patrick Donnelly
d6c66f3fa6
qa,pybind/mgr: allow disabling .mgr pool
This is mostly for testing: a lot of tests assume that there are no
existing pools. These tests relied on a config to turn off creating the
"device_health_metrics" pool which generally exists for any new Ceph
cluster. It would be better to make these tests tolerant of the new .mgr
pool but clearly there's a lot of these. So just convert the config to
make it work.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-06-11 19:35:17 -07:00
Radoslaw Zarzynski
cec7c15f19 qa: use dump_metrics as alternative of get_heap_property
"get_heap_property *" asock commands are exposed to operators
to check the tcmalloc internals for understanding the performance
of the memory subsystem. But crimson uses the builtin seastar allocator,
which is not backed by tcmalloc. Instead, we can dump the metrics using
the "dump_metrics" asock command, which is only available from
crimson-osd.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Signed-off-by: Kefu Chai <kchai@redhat.com>
2021-06-03 14:24:23 +08:00
Neha Ojha
9241144022
Merge pull request #41487 from neha-ojha/wip-toc
qa/suites/rados/thrash-old-clients: remove luminous and mimic and use centos_latest

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2021-05-24 14:44:18 -07:00
Neha Ojha
ece5ed1ac9
Merge pull request #41486 from neha-ojha/wip-49139-new
qa: use ubuntu_latest for perf suites and remove cosbench workloads

Reviewed-by: Kefu Chai <kchai@redhat.com>
2021-05-24 12:53:46 -07:00
Neha Ojha
30eb7467aa qa/suites/rados/thrash-old-clients: use centos_latest.yaml
use centos_latest instead of bionic because this is the only common
distro for which we build packages for nautilus and above.

Signed-off-by: Neha Ojha <nojha@redhat.com>
2021-05-24 18:34:04 +00:00
Sage Weil
04ce0496e8 Merge PR #41451 into master
* refs/pull/41451/head:
	qa/suites/rados: include rook test in rados

Reviewed-by: Yuri Weinstein <yweins@redhat.com>
2021-05-24 13:30:27 -04:00
Neha Ojha
b7237c9e2d qa/suites/rados/thrash-old-clients: remove luminous and mimic
We support N-3 client versions.

Signed-off-by: Neha Ojha <nojha@redhat.com>
2021-05-21 22:03:41 +00:00
Neha Ojha
fd350fd015 qa: remove cosbench workloads from perf suites
Due to https://tracker.ceph.com/issues/49139

Signed-off-by: Neha Ojha <nojha@redhat.com>
2021-05-21 20:17:11 +00:00
Neha Ojha
5957d1797a qa: use ubuntu_latest for perf suites
Signed-off-by: Neha Ojha <nojha@redhat.com>
2021-05-21 17:39:44 +00:00
Sage Weil
5db5c8c292 qa/suites/rados: include rook test in rados
This is just to make sure we don't break mgr/orchestrator.

Note that we already symlink ../orch/cephadm, so this makes rados
include all of orch/.

Signed-off-by: Sage Weil <sage@newdream.net>
2021-05-20 12:41:52 -05:00
Sage Weil
5b25f8a2e5 qa/suites: move rados/cephadm -> orch/cephadm; symlink
Move cephadm under orch/ top-level suite.  Symlink so that we
still include it in a rados run.

Signed-off-by: Sage Weil <sage@newdream.net>
2021-05-18 11:14:14 -05:00
Neha Ojha
d3692a3e92
Merge pull request #40016 from neha-ojha/wip-default-mclock
use mclock_scheduler as the default scheduler

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Sridhar Seshasayee <sseshasa@redhat.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Sunny Kumar <sunkumar@redhat.com>
2021-05-07 08:08:39 -07:00
Neha Ojha
c8e48c5c25 qa/suites/rados/standalone: remove mon_election symlink
The standalone tests need parameters to be passed as ceph_args to
override defaults.

This was just doubling the number of standalone tests being run in each rados
run with no effect!

Signed-off-by: Neha Ojha <nojha@redhat.com>
2021-05-07 00:42:53 +00:00