Commit Graph

818 Commits

Author SHA1 Message Date
Nizamudeen A
59cbf97e6c mgr/dashboard: Cluster Creation Add Host Section and e2es
Add host section of the cluster creation workflow.

1. Fix bug in the modal where going forward one step on the wizard and coming back opens up the add host modal.
2. Rename Create Cluster to Expand Cluster as per the discussions
3. Add a skip confirmation modal to warn users when they try to skip
   cluster creation
4. Adapted all the tests
5. Made some UI improvements, such as fixing and aligning the styles
   and colors.
- Used routed modal for host Addition form
- Renamed the Create to Add in Host Form

Fixes: https://tracker.ceph.com/issues/51517
Fixes: https://tracker.ceph.com/issues/51640
Fixes: https://tracker.ceph.com/issues/50336
Fixes: https://tracker.ceph.com/issues/50565
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
Signed-off-by: Nizamudeen A <nia@redhat.com>
2021-10-13 15:55:23 +05:30
Dai Zhiwei
eaa385f3da qa: support isal ec test for aarch64
modified: qa/standalone/erasure-code/test-erasure-code-plugins.sh
new file: qa/suites/rados/thrash-erasure-code-isa/arch/aarch64.yaml

Signed-off-by: Dai Zhiwei <daizhiwei3@huawei.com>
2021-10-08 14:37:25 +08:00
Kefu Chai
958b22e3ab
Merge pull request #43335 from liewegas/debug-51815
mon,auth: fix proposal (and mon db rebuild) of rotating secrets

Reviewed-by: Neha Ojha <nojha@redhat.com>
2021-10-07 06:45:45 +08:00
Neha Ojha
363b223844
Merge pull request #42964 from trociny/wip-52448
osd: re-cache peer_bytes on every peering state activate

Reviewed-by: Neha Ojha <nojha@redhat.com>
2021-10-06 09:26:16 -07:00
Sage Weil
eddfbbc421 qa/suites/rados/singleton/rebuild-mon-db: debug auth 30
Hunting https://tracker.ceph.com/issues/51815

Signed-off-by: Sage Weil <sage@newdream.net>
2021-10-01 14:42:23 -04:00
Sage Weil
0b361fc8b9 qa/packages: install ceph-volume
Signed-off-by: Sage Weil <sage@newdream.net>
2021-09-19 21:51:19 -04:00
Mykola Golub
76743e0058 qa/suites/rados: add backfill_toofull test
Signed-off-by: Mykola Golub <mgolub@suse.com>
2021-09-15 17:21:11 +03:00
Sridhar Seshasayee
7dcede75df qa: Use osd_op_queue=wpq for tests using filestore backend.
Force a subset of tests that explicitly employ the filestore backend to
use WPQ scheduler. This is because mclock scheduler will not be
optimized for filestore.

Fixes: https://tracker.ceph.com/issues/52025
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
2021-09-02 18:15:54 +05:30
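The scheduler override described in the commit above can be sketched roughly as follows. This is only an illustration, not the change itself (which presumably lives in the suite YAML): `force_wpq_for_filestore` is a hypothetical helper, and the nested-dict layout of the config is an assumption. Only the option names `osd_objectstore` and `osd_op_queue` and the values `filestore` and `wpq` are taken from Ceph.

```python
def force_wpq_for_filestore(conf):
    """Return a copy of conf with osd_op_queue=wpq when filestore is in use.

    Hypothetical helper: `conf` is assumed to be a dict of per-section
    option dicts, e.g. {"osd": {"osd_objectstore": "filestore"}}.
    """
    osd = dict(conf.get("osd", {}))  # copy so the caller's dict is untouched
    if osd.get("osd_objectstore") == "filestore":
        osd["osd_op_queue"] = "wpq"
    return {**conf, "osd": osd}


filestore_conf = {"osd": {"osd_objectstore": "filestore"}}
bluestore_conf = {"osd": {"osd_objectstore": "bluestore"}}

patched = force_wpq_for_filestore(filestore_conf)
untouched = force_wpq_for_filestore(bluestore_conf)
```

Under these assumptions only the filestore configuration gains the `wpq` setting; other backends keep whatever default scheduler is in effect.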
Mykola Golub
7311f6656f qa/suites/rados: add crushdiff test
Signed-off-by: Mykola Golub <mykola.golub@clyso.com>
2021-08-27 17:45:40 +03:00
Sebastian Wagner
e436483c77
qa/distro: Add centos_8.2_container_tools_3.0.yaml
Let's avoid latest kubic stable

Fixes: https://tracker.ceph.com/issues/52279
Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
2021-08-20 10:53:11 +02:00
Neha Ojha
119544bb29 qa/suites/rados/perf/ceph.yaml: remove rgw
The rgw task is no longer required because we removed cosbench workloads in
fd350fd015. Removing it also prevents
failures like the following, as well as any other changes that break the rgw task:

```
2021-08-06T20:13:25.812 INFO:teuthology.orchestra.run.smithi060.stderr:curl: (7) Failed to connect to smithi060.front.sepia.ceph.com port 80: Connection refused
2021-08-06T20:15:33.813 ERROR:teuthology.contextutil:Saw exception from nested tasks
Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_04c2febe7099917d97a71271f17abb5710030132/teuthology/contextutil.py", line 31, in nested
    vars.append(enter())
  File "/usr/lib/python3.6/contextlib.py", line 81, in __enter__
    return next(self.gen)
  File "/home/teuthworker/src/github.com_ceph_ceph-c_3c0f8c8164075af7aac4d1f2805d3f4580709461/qa/tasks/rgw.py", line 191, in start_rgw
    wait_for_radosgw(url, remote)
  File "/home/teuthworker/src/github.com_ceph_ceph-c_3c0f8c8164075af7aac4d1f2805d3f4580709461/qa/tasks/util/rgw.py", line 94, in wait_for_radosgw
    assert exit_status == 0
AssertionError
```

Signed-off-by: Neha Ojha <nojha@redhat.com>
2021-08-09 15:08:11 +00:00
Neha Ojha
c9f8846b7f
Merge pull request #41907 from kamoltat/wip-ksirivad-progress-time-interval
pybind/mgr/progress: introduce 5 second sleep interval

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2021-07-21 16:53:38 -07:00
Volker Theile
f7f163e75c mgr/dashboard: Add configurable MOTD or wall notification
Fixes: https://tracker.ceph.com/issues/51408

Signed-off-by: Volker Theile <vtheile@suse.com>
2021-07-14 10:48:49 +02:00
Kamoltat
5f33f2f6e0 mgr/test_progress.py: Delay recover in test_progress
Changes some of the tests in teuthology to make
them more deterministic.
Using:

`ceph osd set norecover` and
`ceph osd set nobackfill` when marking osds in
or out. This delays the recovery and makes
sure the test cases get the chance to check
that events are actually popping up in
the progress module.

Took out test_osd_cannot_recover from
tasks/mgr/test_progress.py since it is no longer
a relevant test case: recovery will get
triggered regardless of whether the pg is unmoved.

Ignoring `OSDMAP_FLAGS` in teuthology
because we are using norecover and nobackfill
to delay the recovery process; setting these flags
creates a health warning that would otherwise fail the
teuthology test.

Signed-off-by: Kamoltat <ksirivad@redhat.com>
2021-07-13 19:33:20 +00:00
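The flag handling described in the commit above could be sketched like this. It is a minimal illustration, not the code in tasks/mgr/test_progress.py: `recovery_paused` and the `run_ceph` callback are hypothetical names, and the callback is assumed to take the argument list that would follow `ceph` on the command line.

```python
from contextlib import contextmanager


@contextmanager
def recovery_paused(run_ceph):
    """Set norecover/nobackfill for the duration of the block, then clear them.

    `run_ceph` is a hypothetical callback that receives the arguments
    of a `ceph` command as a list of strings.
    """
    flags = ("norecover", "nobackfill")
    for flag in flags:
        run_ceph(["osd", "set", flag])
    try:
        yield
    finally:
        # Always clear the flags so recovery can proceed afterwards.
        for flag in flags:
            run_ceph(["osd", "unset", flag])


# Record the commands instead of running them against a cluster.
issued = []
with recovery_paused(issued.append):
    issued.append(["osd", "out", "0"])  # mark an OSD out while recovery is held
```

With recovery held back, a test has time to observe the event in the progress module before the flags are cleared.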
Kefu Chai
15fa32dc86 qa: run e2e test on centos only
This change is a follow-up to 02b8b0f490,
which failed to remove the random facet for distro.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2021-07-02 23:06:27 +08:00
Kefu Chai
e5c9315b11
Merge pull request #42084 from tchaikov/wip-49638
qa: run e2e test on centos only

Reviewed-by: Laura Paduano <lpaduano@suse.com>
2021-06-30 19:26:42 +08:00
Kefu Chai
812e58c597
Merge pull request #42013 from ronen-fr/wip-ronenf-scrubs-config
qa/suites/rados: add simultaneous scrubs to the thrasher

Reviewed-by: Neha Ojha <nojha@redhat.com>
2021-06-29 16:21:52 +08:00
Kefu Chai
02b8b0f490 qa: run e2e test on centos only
It's a regression introduced by the restructure of the test suites;
let's pin the test to CentOS 8.

See-also: https://tracker.ceph.com/issues/49638
Signed-off-by: Kefu Chai <kchai@redhat.com>
2021-06-29 13:09:53 +08:00
Kefu Chai
29064f1bf8
Merge pull request #41937 from liewegas/mgr-crash
mgr: generate crash dumps for Python exceptions in mgr modules

Reviewed-by: Kefu Chai <kchai@redhat.com>
2021-06-26 22:18:14 +08:00
Sage Weil
3edc04a46b qa/suites/rados/mgr: whitelist module crash during selftest
One of the selftests triggers an exception from serve().

Signed-off-by: Sage Weil <sage@newdream.net>
2021-06-25 13:48:45 -04:00
Ronen Friedman
d232c4e8d8 qa/suites/rados: add simultaneous scrubs (multiple options) to the thrasher
Setting osd-max-scrubs to either 2 or 3.

Triggered by https://tracker.ceph.com/issues/50346

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
2021-06-24 18:53:50 +03:00
Sage Weil
fe9963b03c qa/suites/rados/dashboard: fix e2e test
Move roles into task yaml.  Rename e2e.

Fixes: https://tracker.ceph.com/issues/51292
Signed-off-by: Sage Weil <sage@newdream.net>
2021-06-23 09:54:40 -05:00
Sage Weil
9074e87611 Merge PR #41827 into master
* refs/pull/41827/head:
	qa: move dashboard e2e from cephadm -> rados suite

Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
2021-06-14 09:11:04 -04:00
Sage Weil
ac05b3568f qa: move dashboard e2e from cephadm -> rados suite
This test fails ~20% of the time.

Signed-off-by: Sage Weil <sage@newdream.net>
2021-06-12 07:52:54 -05:00
Patrick Donnelly
d6c66f3fa6
qa,pybind/mgr: allow disabling .mgr pool
This is mostly for testing: a lot of tests assume that there are no
existing pools. These tests relied on a config to turn off creating the
"device_health_metrics" pool which generally exists for any new Ceph
cluster. It would be better to make these tests tolerant of the new .mgr
pool, but there are clearly a lot of them. So just convert the config to
make it work.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-06-11 19:35:17 -07:00
Radoslaw Zarzynski
cec7c15f19 qa: use dump_metrics as alternative of get_heap_property
The "get_heap_property *" asock commands are exposed to operators
to check the tcmalloc internals for understanding the performance
of the memory subsystem. However, crimson uses the builtin seastar allocator,
which is not backed by tcmalloc. Instead, we can dump the metrics using
the "dump_metrics" asock command, which is only available from
crimson-osd.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Signed-off-by: Kefu Chai <kchai@redhat.com>
2021-06-03 14:24:23 +08:00
Neha Ojha
9241144022
Merge pull request #41487 from neha-ojha/wip-toc
qa/suites/rados/thrash-old-clients: remove luminous and mimic and use centos_latest

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2021-05-24 14:44:18 -07:00
Neha Ojha
ece5ed1ac9
Merge pull request #41486 from neha-ojha/wip-49139-new
qa: use ubuntu_latest for perf suites and remove cosbench workloads

Reviewed-by: Kefu Chai <kchai@redhat.com>
2021-05-24 12:53:46 -07:00
Neha Ojha
30eb7467aa qa/suites/rados/thrash-old-clients: use centos_latest.yaml
Use centos_latest instead of bionic because it is the only common
distro for which we build packages for nautilus and above.

Signed-off-by: Neha Ojha <nojha@redhat.com>
2021-05-24 18:34:04 +00:00
Sage Weil
04ce0496e8 Merge PR #41451 into master
* refs/pull/41451/head:
	qa/suites/rados: include rook test in rados

Reviewed-by: Yuri Weinstein <yweins@redhat.com>
2021-05-24 13:30:27 -04:00
Neha Ojha
b7237c9e2d qa/suites/rados/thrash-old-clients: remove luminous and mimic
We support N-3 client versions.

Signed-off-by: Neha Ojha <nojha@redhat.com>
2021-05-21 22:03:41 +00:00
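The "N-3" window mentioned in the commit above can be worked through with a short sketch. It assumes that the development release at the time of the commit was quincy; `RELEASES` and `supported_clients` are illustrative names, not part of the qa suite.

```python
# Ceph release names in order, oldest first.
RELEASES = ["luminous", "mimic", "nautilus", "octopus", "pacific", "quincy"]


def supported_clients(current, window=3):
    """Return the `window` releases that immediately precede `current`."""
    idx = RELEASES.index(current)
    return RELEASES[max(0, idx - window):idx]


# Assumption: quincy is release N when this commit was made.
clients = supported_clients("quincy")  # ['nautilus', 'octopus', 'pacific']
dropped = [r for r in RELEASES[:RELEASES.index("quincy")] if r not in clients]
```

Under that assumption, `dropped` comes out as luminous and mimic, the two releases the commit removes.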
Neha Ojha
fd350fd015 qa: remove cosbench workloads from perf suites
Due to https://tracker.ceph.com/issues/49139

Signed-off-by: Neha Ojha <nojha@redhat.com>
2021-05-21 20:17:11 +00:00
Neha Ojha
5957d1797a qa: use ubuntu_latest for perf suites
Signed-off-by: Neha Ojha <nojha@redhat.com>
2021-05-21 17:39:44 +00:00
Sage Weil
5db5c8c292 qa/suites/rados: include rook test in rados
This is just to make sure we don't break mgr/orchestrator.

Note that we already symlink ../orch/cephadm, so this makes rados
include all of orch/.

Signed-off-by: Sage Weil <sage@newdream.net>
2021-05-20 12:41:52 -05:00
Sage Weil
5b25f8a2e5 qa/suites: move rados/cephadm -> orch/cephadm; symlink
Move cephadm under orch/ top-level suite.  Symlink so that we
still include it in a rados run.

Signed-off-by: Sage Weil <sage@newdream.net>
2021-05-18 11:14:14 -05:00
Neha Ojha
d3692a3e92
Merge pull request #40016 from neha-ojha/wip-default-mclock
use mclock_scheduler as the default scheduler

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Sridhar Seshasayee <sseshasa@redhat.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Sunny Kumar <sunkumar@redhat.com>
2021-05-07 08:08:39 -07:00
Neha Ojha
c8e48c5c25 qa/suites/rados/standalone: remove mon_election symlink
The standalone tests need parameters to be passed as ceph_args to
override defaults.

This was just doubling the number of standalone tests being run in each rados
run with no effect!

Signed-off-by: Neha Ojha <nojha@redhat.com>
2021-05-07 00:42:53 +00:00
Sridhar Seshasayee
cc1fc98ea4 qa/suites/rados/mgr/tasks/progress: use high_recovery_ops for faster recovery
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
2021-05-06 17:54:38 +00:00
Kefu Chai
fee3028abc
Merge pull request #41014 from smithfarm/wip-mempool-cacheline-49781
qa: verify the benefits of mempool cacheline optimization

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2021-04-30 19:36:17 +08:00
Loïc Dachary
7fe0ac7c11 qa: verify the benefits of mempool cacheline optimization
There already is a test to verify the mempool sharding works, in the sense that
it uses at least half of the variables available to count the number of
allocated objects and their total size. This new test verifies that, with
sharding, object counting is at least twice as fast as without sharding. It
also collects cacheline contention data with the perf c2c tool. The manual
analysis of this data shows the optimization gain is indeed related to cacheline
contention.

Fixes: https://tracker.ceph.com/issues/49896

Signed-off-by: Loïc Dachary <loic@dachary.org>
2021-04-30 12:11:13 +08:00
Josh Durgin
0e273e6760
Merge pull request #40593 from ideepika/wip-new-testing-params
qa/config/rados: add dispatch delay testing params

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Sam Just <sjust@redhat.com>
2021-04-28 11:26:58 -07:00
Sage Weil
75480f52e4 Merge PR #40941 into master
* refs/pull/40941/head:
	qa/suites/rados/cephadm/smoke-roleless: test client-keyring
	qa/tasks/cephadm.py: adjust client.admin key mode; place on all hosts
	cephadm: distribute client.admin keyring+conf to label:_admin on bootstrap
	doc/cephadm: document the default 'admin' label
	mgr/cephadm: 'ceph orch client-keyring ...' commands to manage keyring files
	mgr/cephadm: reimplement ceph.conf pushing
	mgr/cephadm: use _write_remote_file for ceph.conf
	mgr/cephadm: _write_remote_file helper
	mgr/cephadm: add placementspec for which hosts get ceph.conf

Reviewed-by: Sebastian Wagner <swagner@suse.com>
Reviewed-by: Adam King <adking@redhat.com>
2021-04-28 14:26:35 -04:00
Sage Weil
e41931d042 qa/suites/rados/cephadm/smoke-roleless: test client-keyring
Signed-off-by: Sage Weil <sage@newdream.net>
2021-04-27 18:29:50 -04:00
Sage Weil
b0dcaf2cfa qa/tasks/cephadm.py: adjust client.admin key mode; place on all hosts
Except during upgrades, since it is not supported there.

Signed-off-by: Sage Weil <sage@newdream.net>
2021-04-27 18:29:50 -04:00
Yuri Weinstein
b6c84d5621 qa/tests: changed symlink to upgrade/parallel only
Signed-off-by: Yuri Weinstein <yweinste@redhat.com>
2021-04-23 08:20:01 -07:00
Deepika Upadhyay
b2c2a4326c qa/config/rados: add dispatch delay testing params
These parameters have proven to catch some previously uncaught bugs, such as
https://tracker.ceph.com/issues/48417; adopting them will help
prevent more such hard-to-debug bugs.

Signed-off-by: Deepika Upadhyay <dupadhya@redhat.com>
2021-04-19 12:28:18 +05:30
Sage Weil
9c1f128885 qa/suites/rados/cephadm/smoke-*: use cephadm.wait_for_service
Signed-off-by: Sage Weil <sage@newdream.net>
2021-04-16 16:00:31 -04:00
Sage Weil
c20323d114 qa/suites/rados/cephadm/smoke-singlehost: test --single-host-defaults
Signed-off-by: Sage Weil <sage@newdream.net>
2021-04-16 16:00:31 -04:00
Sage Weil
16b30f2858 qa/suites/rados/cephadm/smoke-*: use cephadm.wait_for_service
Signed-off-by: Sage Weil <sage@newdream.net>
2021-04-16 09:49:45 -05:00
Sage Weil
3ff3f697b4 qa/suites/rados/cephadm/smoke-roleless: test rgw-ingress
Test this properly by downing each rgw and haproxy in turn and ensuring
that things remain up.

Signed-off-by: Sage Weil <sage@newdream.net>
2021-04-16 09:49:45 -05:00
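The "down each daemon in turn" check described in the commit above might look roughly like the sketch below. It is not the suite's actual implementation: `daemons_breaking_service`, the `stop`/`start`/`service_is_up` callbacks, and the daemon names are all hypothetical, and the cluster is simulated with a plain set.

```python
def daemons_breaking_service(daemons, stop, start, service_is_up):
    """Stop each daemon in turn; return those whose loss takes the service down."""
    broken = []
    for daemon in daemons:
        stop(daemon)
        try:
            if not service_is_up():
                broken.append(daemon)
        finally:
            start(daemon)  # restore the daemon before testing the next one
    return broken


def make_health_check(running):
    """Service counts as up while at least one rgw and one haproxy are running."""
    def service_is_up():
        has_rgw = any(name.startswith("rgw.") for name in running)
        has_proxy = any(name.startswith("haproxy.") for name in running)
        return has_rgw and has_proxy
    return service_is_up


# Simulated cluster with two rgw daemons and two haproxy daemons.
running = {"rgw.a", "rgw.b", "haproxy.a", "haproxy.b"}
broken = daemons_breaking_service(
    sorted(running), running.discard, running.add, make_health_check(running)
)
```

In this simulation the redundant layout tolerates every single failure, so `broken` stays empty; a layout with one daemon of each kind would list both of them.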