Commit Graph

4219 Commits

Author SHA1 Message Date
David Zafman
f43faf4ad7 test: cleanup: Remove redundant cat of log and handle errors in create_scenario()
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-11-08 14:48:19 -08:00
Sage Weil
c40685ebdd Merge PR #24787 into master
* refs/pull/24787/head:
	Merge PR #24796 into nautilus
	osd: fix heartbeat_reset unlock
	Merge PR #24780 into nautilus
	Merge PR #24761 into nautilus
	Merge PR #24651 into nautilus
	osd: fix race between op_wq and context_queue
	test: Make sure kill_daemons failure will be easy to find
	test: Add flush_pg_stats to make test more deterministic
2018-10-29 08:36:34 -05:00
Sage Weil
143f601155 Merge PR #24651 into nautilus
* refs/pull/24651/head:
	test: Make sure kill_daemons failure will be easy to find
	test: Add flush_pg_stats to make test more deterministic

Reviewed-by: Neha Ojha <nojha@redhat.com>
2018-10-26 21:07:09 -05:00
Xie Xingguo
e6f9241aeb
Merge pull request #24657 from xiexingguo/wip-rm-device-class-fix
mon/OSDMonitor: two "ceph osd crush class rm" fixes

Reviewed-by: Sage Weil <sage@redhat.com>
2018-10-27 09:49:57 +08:00
Sage Weil
80bb0664d6 Merge PR #24713 into master
* refs/pull/24713/head:
	mon: drop repeated 'goodchars' and add osd crush ls testcase

Reviewed-by: João Eduardo Luís <joao@suse.de>
2018-10-26 09:18:13 -05:00
John Spray
c5fd31dfcc
Merge pull request #24767 from votdev/issue_36581
mgr/dashboard/qa: Fix various vstart_runner.py issues

Reviewed-by: Tatjana Dehler <tdehler@suse.com>
Reviewed-by: John Spray <john.spray@redhat.com>
2018-10-26 15:09:59 +01:00
Lenz Grimmer
0c84be2306
Merge pull request #24727 from zmc/wip-dashboard-gzip
mgr/dashboard: Enable gzip compression

Reviewed-by: Ricardo Dias <rdias@suse.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>
2018-10-26 11:05:27 +02:00
Volker Theile
00e3022710 mgr/dashboard/qa: CephfsTest - admin_socket() got an unexpected keyword argument 'timeout'
Adapt method arguments of LocalRemote::run() according to teuthology.orchestra.run.run() (see https://github.com/ceph/teuthology/blob/master/teuthology/orchestra/run.py#L364) to be able to run QA tests locally in a vstart cluster.

Fixes: http://tracker.ceph.com/issues/36581
Signed-off-by: Volker Theile <vtheile@suse.com>
2018-10-26 09:59:11 +02:00
Zack Cerza
03de8f9557 mgr/dashboard: Enable gzip compression
This is related to http://tracker.ceph.com/issues/36453. It is far from
a complete solution, but seems like a positive move.

I tested this change by first disabling my browser cache, and then used
the /docs endpoint to query /api/dashboard/health. Before compression:
  Content-Length: 60748
  Time: 615ms
After:
  Content-Length: 7505
  Time: 92ms

Then, I logged into the dashboard as normal and reloaded the page once I
was in. Some values for the reload operation before compression:
  Total page load time: 58.48s
  vendor.js Content-Length: 6486025
  vendor.js time: 48.09s
After:
  Total page load time: 14.55s
  vendor.js Content-Length: 1143178
  vendor.js time: 4.50s

Signed-off-by: Zack Cerza <zack@redhat.com>
2018-10-24 16:04:37 -06:00
Zack Cerza
bd09bc0462 DashboardTestCase: add assertHeaders()
Signed-off-by: Zack Cerza <zack@redhat.com>
2018-10-24 16:04:37 -06:00
Zack Cerza
b134972035 LocalCephManager.admin_socket: add timeout kwarg
This fixes "TypeError: admin_socket() got an unexpected keyword argument
'timeout'". The value is never used.

Signed-off-by: Zack Cerza <zack@redhat.com>
2018-10-24 16:04:37 -06:00
Jason Dillaman
484dc12089 qa/tasks/qemu: use unique clone directory to avoid race with workunit
If there is a workunit task associated with the same client, the two
tasks will attempt to clone the suite repo to the same directory.
Worse, if it's parallel tasks, the two clones will clobber each
other.

Fixes: http://tracker.ceph.com/issues/36542
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 5d56014c61)
2018-10-24 10:30:43 -04:00
Patrick Donnelly
60c63f71f6
Merge PR #24533 into master
* refs/pull/24533/head:
	qa: add timeouts for remote ops for client mounts

Reviewed-by: Zheng Yan <zyan@redhat.com>
2018-10-23 14:46:34 -07:00
xie xingguo
5bcac35213 mon/OSDMonitor: do not remove device class still referenced by ec-profiles
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2018-10-23 21:17:56 +08:00
xie xingguo
4bc54587a1 mon/OSDMonitor: make "ceph osd crush class rm" idempotent
Removing a non-existent device class should be generally okay.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2018-10-23 21:17:56 +08:00
Yan Jun
1e98c72dfc mon: drop repeated 'goodchars' and add osd crush ls testcase
Signed-off-by: Yan Jun <yan.jun8@zte.com.cn>
2018-10-23 16:32:45 +08:00
Mykola Golub
5dd0599bdf
Merge pull request #24696 from dillaman/wip-36542
qa/tasks/qemu: use unique clone directory to avoid race with workunit

Reviewed-by: Mykola Golub <mgolub@suse.com>
2018-10-23 09:41:43 +03:00
Sage Weil
a350131d16 Merge PR #24698 into master
* refs/pull/24698/head:
	Merge PR #24697 into nautilus
	ceph_test_msgr: fix authorizer behavior
	Merge pull request #24667 from liewegas/wip-ec-thrash-full
	Merge PR #24689 into nautilus
	qa/suites/rados/thrash-erasure-code*/thrashers/*: less likely resv rejection injection

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2018-10-22 22:46:23 -05:00
Jason Dillaman
5d56014c61 qa/tasks/qemu: use unique clone directory to avoid race with workunit
If there is a workunit task associated with the same client, the two
tasks will attempt to clone the suite repo to the same directory.
Worse, if it's parallel tasks, the two clones will clobber each
other.

Fixes: http://tracker.ceph.com/issues/36542
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2018-10-22 10:44:40 -04:00
Josh Durgin
36ca230776
Merge pull request #24667 from liewegas/wip-ec-thrash-full
qa/suites/rados/thrash-erasure-code*/thrashers/*: less likely resv rejection injection

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2018-10-22 07:39:26 -07:00
Kefu Chai
4af71e7c00
Merge pull request #23103 from ifed01/wip-ifed-bluefs-migrate
os/bluestore: allow ceph-bluestore-tool to coalesce, add and migrate BlueFS backing volumes

Reviewed-by:  Sage Weil <sage@redhat.com>
2018-10-22 22:33:08 +08:00
Sage Weil
ae583f5dde Merge PR #24689 into master
* refs/pull/24689/head:
	qa/tasks/ceph_manager: fix get_stuck_pgs from pg dump change
	Merge PR #24625 into nautilus
	qa/suites/rados/mgr/tasks/module_selftest: whitelist 'foo bar security'

Reviewed-by: Kefu Chai <kchai@redhat.com>
2018-10-22 09:19:46 -05:00
Sage Weil
b678356594 qa/tasks/ceph_manager: fix get_stuck_pgs from pg dump change
Fixes 95b7d2340c

Fixes: http://tracker.ceph.com/issues/36485
Signed-off-by: Sage Weil <sage@redhat.com>
2018-10-21 10:52:38 -05:00
Sage Weil
98fc7ebc99 Merge PR #24184 into master
* refs/pull/24184/head:
	mgr/DaemonServer: remove any upmaps on merging PGs
	mgr/DaemonServer: prevent merge if either pg is remapped|upmap
	mgr/DaemonServer: move pending merge check for more consistent code
	qa/suites/rados/thrash*/thrashers/careful.yaml: thrash with mgr controller
	mgr/DaemonServer: add option to bypass careful throttling for thrasher
	PendingReleaseNotes: note about mgr/balancer/max_misplaced change
	mgr/DaemonServer: remove stale/misleading check
	mgr/DaemonServer: throttle pgp_num changes based on misplaced %
	mgr/DaemonServer: block pg_num decrease(merge) until pgp_num is reduced
	mgr/DaemonServer: adjust_pgs(): cosmetic change to debug output
	mon/PGMap: add get_recovery_stats()
	mgr/balancer: mgr/balancer/max_misplaced -> pg_max_misplaced
	pybind/mgr/mgr_module: add get_option()
	mgr/DaemonServer: allow pg_num increases that abort pending merges
	mon/OSDMonitor: resent pre-nautilus client ops on aborted merge
	mon/OSDMonitor: make pgp_num track pg_num more consistently

Reviewed-by: John Spray <john.spray@redhat.com>
Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
2018-10-20 16:40:22 -05:00
Sage Weil
86ae8fb6b8 qa/suites/rados/thrash*/thrashers/careful.yaml: thrash with mgr controller
Thrash such that we still exercise the careful throttling in the mgr.

Signed-off-by: Sage Weil <sage@redhat.com>
2018-10-20 15:21:58 -05:00
Sage Weil
70ec5bda23 mgr/DaemonServer: add option to bypass careful throttling for thrasher
Signed-off-by: Sage Weil <sage@redhat.com>
2018-10-20 15:21:58 -05:00
Sage Weil
c22d5a0fce qa/suites/rados/thrash-erasure-code*/thrashers/*: less likely resv rejection injection
For EC pools we have a lot of shards, and 30% probability on each one
means we are very like to repeatedly fail backfill reservations.. long
enough that teuthology gives up waiting.

Signed-off-by: Sage Weil <sage@redhat.com>
2018-10-18 17:17:41 -05:00
David Zafman
da3c556aa2 test: Make sure kill_daemons failure will be easy to find
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-10-17 16:54:45 -07:00
David Zafman
b33edbc4f6 test: Add flush_pg_stats to make test more deterministic
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-10-17 16:54:45 -07:00
Igor Fedotov
02b5768a4f tests: add qa test case for bluefs volume coalescence
Signed-off-by: Igor Fedotov <ifedotov@suse.com>
2018-10-17 22:39:27 +03:00
Patrick Donnelly
092801ae34
qa: add timeouts for remote ops for client mounts
Fixes: https://tracker.ceph.com/issues/36390

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2018-10-17 10:39:13 -07:00
Patrick Donnelly
b1dca00f90
Merge PR #24503 into master
* refs/pull/24503/head:
	qa: increase timeout for cleanup

Reviewed-by: Jeff Layton <jlayton@redhat.com>
2018-10-17 10:18:50 -07:00
Patrick Donnelly
6345c3f80d
Merge PR #24455 into master
* refs/pull/24455/head:
	qa: use timeout for fs asok operations

Reviewed-by: Jeff Layton <jlayton@redhat.com>
2018-10-17 10:17:29 -07:00
Lenz Grimmer
a848953f28
Merge pull request #24617 from p-na/fix-python3-issue
mgr/dashboard: Fix Python3 issue

Reviewed-by: Ricardo Marques <rimarques@suse.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>
2018-10-17 18:06:58 +02:00
Sage Weil
f8b6cbda34 Merge PR #24359 into master
* refs/pull/24359/head:
	qa/tests: update ansible version to 2.6 for master branch testing.
	qa/tests: use lvm as default for ceph-ansible testing, this should also work with raw devices

Reviewed-by: Alfredo Deza <adeza@redhat.com>
2018-10-17 10:06:42 -05:00
Sage Weil
54d539d79a Merge PR #24603 into master
* refs/pull/24603/head:
	crush: get "ceph osd crush class create/rm" back

Reviewed-by: Sage Weil <sage@redhat.com>
2018-10-17 10:06:26 -05:00
Sage Weil
b833d35d9b qa/suites/rados/mgr/tasks/module_selftest: whitelist 'foo bar security'
Avoid failures like

"2018-10-16 20:36:00.437153 mgr.y (mgr.25609) 6 : cluster [SEC] foo bar security" in cluster log

Signed-off-by: Sage Weil <sage@redhat.com>
2018-10-17 07:09:15 -05:00
Patrick Nawracay
5c0c122597 mgr/dashboard: Fix Python3 issue
Which results in a 500 error when trying to access the `Performance
Counter` tab on the OSD list.

Signed-off-by: Patrick Nawracay <pnawracay@suse.com>
2018-10-17 14:02:12 +02:00
Sage Weil
05faeb4d12 Merge PR #24579 into master
* refs/pull/24579/head:
	qa/osd: fixup osd-rep-recov-eio.sh fails to parse pg dump

Reviewed-by: John Spray <john.spray@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2018-10-16 07:17:59 -05:00
John Spray
e6a26aeff7
Merge pull request #24597 from batrick/i36450
qa: fix run call args

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: John Spray <john.spray@redhat.com>
Reviewed-by: Lenz Grimmer <lgrimmer@suse.com>
2018-10-16 13:09:08 +01:00
Patrick Donnelly
6a4cc58a9d
Merge PR #24292 into master
* refs/pull/24292/head:
	qa: add test for rctime on root inode
	mds: set rctime on new system inode
	mds: small refactor

Reviewed-by: Zheng Yan <zyan@redhat.com>
2018-10-15 21:31:04 -07:00
xie xingguo
d7ff33e9fd crush: get "ceph osd crush class create/rm" back
This reverts a27fd9d25c and
b863883ca7.

Quote form Sébastien Han:
> IIRC at some point, we were able to create a device class from the CLI.
Now it seems that the device class gets created when at least one OSD
of a particular class starts.
In ceph-ansible, we create pools after the initial monitors are up and
we want to assign a device crush class on some of them.
That's not possible at the moment since there no device class available yet.
Also, someone might want to create its own device class.
Something as crazy as running Filestore with a tmpfs osd store and
might want to isolate them.
I know it's a very limited use case, but still, it could be desired.

See also https://www.spinics.net/lists/ceph-devel/msg41152.html

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2018-10-16 08:45:49 +08:00
Patrick Donnelly
d491227956
qa: fix run call args
Fixes: http://tracker.ceph.com/issues/36450
Introduced-by: 95746ecce9
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2018-10-15 14:45:18 -07:00
huanwen ren
f1219d716d qa/osd: fixup osd-rep-recov-eio.sh fails to parse pg dump
Fixes: http://tracker.ceph.com/issues/36418
Signed-off-by: huanwen ren <ren.huanwen@zte.com.cn>
2018-10-16 02:18:22 +08:00
Sage Weil
7ac6ab4b2f Merge PR #24494 into master
* refs/pull/24494/head:
	ceph-kvstore-tool: rename repair -> destructive-repair

Reviewed-by: Neha Ojha <nojha@redhat.com>
2018-10-14 13:11:11 -05:00
Sage Weil
8cc6369511 ceph-kvstore-tool: rename repair -> destructive-repair
This is shown to corrupt otherwise healthy rocksdb databases.  Rename to
make it clear that it is generally not safe to run and shoud only be used
as a last resort.

Signed-off-by: Sage Weil <sage@redhat.com>
2018-10-14 11:41:24 -05:00
Mykola Golub
1d92788f71
Merge pull request #24563 from dillaman/wip-36410
test: move OpenStack devstack test to rocky release

Reviewed-by: Mykola Golub <mgolub@suse.com>
2018-10-14 10:40:05 +03:00
Sage Weil
9db328f2ab Merge PR #24204 into master
* refs/pull/24204/head:
	qa/suites/rgw/tempest: valgrind on centos only

Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: Yuri Weinstein <yweins@redhat.com>
2018-10-12 16:29:44 -05:00
Jason Dillaman
1e3dc02604 qa/tasks/workunit: use suite branch/SHA1 when cloning workunits
Right now it's using the Ceph branch/SHA1 but it's using the suite
Git URL.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2018-10-12 12:41:58 -04:00
Jason Dillaman
2a1dca3fca qa/workunits/rbd: switch devstack to rocky branch and tempest to 19.0.0 tag
Fixes: http://tracker.ceph.com/issues/36410
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2018-10-12 08:40:44 -04:00