Commit Graph

5554 Commits

Author SHA1 Message Date
Ilya Dryomov
5011cc926c qa/suites/krbd: run unmap subsuite with msgr1 only
pre-single-major.yaml kernel doesn't have any of the monitor client
fixes that came in 4.6.  If the connection is closed, it closes the
session and retries only after 10 seconds.  On top of that, there is
nothing to prevent it from picking the same monitor when reconnecting.
This means that when given both v1 and v2 ports (which look like two
different monitors), it is susceptible to mount_timeout (60 seconds):

  $ sudo rbd map img
  rbd: sysfs write failed
  In some cases useful info is found in syslog - try "dmesg | tail".
  rbd: map failed: (5) Input/output error

  [  822.242313] libceph: mon0 172.21.15.132:3300 socket closed (con state CONNECTING)
  [  832.265494] libceph: mon0 172.21.15.132:3300 socket closed (con state CONNECTING)
  [  842.296175] libceph: mon0 172.21.15.132:3300 socket closed (con state CONNECTING)
  [  852.326924] libceph: mon0 172.21.15.132:3300 socket closed (con state CONNECTING)
  [  862.357611] libceph: mon0 172.21.15.132:3300 socket closed (con state CONNECTING)
  [  872.388373] libceph: mon0 172.21.15.132:3300 socket closed (con state CONNECTING)
  [  882.676136] libceph: mon0 172.21.15.132:3300 socket closed (con state CONNECTING)

Unlike newer kernels that return ETIMEDOUT, it returns EIO.

Newer kernels are much more aggressive about retries and will pick
a different monitor when reconnecting, hence they are always able to
establish the session in time.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2019-10-30 19:51:55 +01:00
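The reasoning in 5011cc926c can be made concrete with a little arithmetic. This is only a sketch: it assumes (the commit message does not say so) that the old kernel picks the address to reconnect to uniformly at random on every attempt, and that a msgr1-only client can never establish a session on the v2 port. The function name and its parameters are hypothetical.

```python
def all_attempts_fail_probability(num_addrs, num_unusable, attempts):
    """Probability that every reconnect attempt lands on an unusable address.

    Assumes each attempt picks one of num_addrs addresses uniformly at
    random, independently of earlier picks (an illustrative assumption).
    """
    if num_addrs <= 0 or not 0 <= num_unusable <= num_addrs:
        raise ValueError("need num_addrs > 0 and 0 <= num_unusable <= num_addrs")
    return (float(num_unusable) / num_addrs) ** attempts


# A single monitor advertised on both a v1 and a v2 port looks like two
# monitors to the old kernel. The log above shows seven attempts spaced
# about 10 seconds apart before the map fails.
p = all_attempts_fail_probability(num_addrs=2, num_unusable=1, attempts=7)
print("chance that all 7 attempts hit the v2 port: {:.4f}".format(p))
```

Under these assumptions the failure is rare per map attempt but not negligible across a test suite that maps and unmaps many times, which is consistent with restricting the subsuite to msgr1.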
Ilya Dryomov
9c17ca0aa7
Merge pull request #31023 from idryomov/wip-krbd-udev-enumerate-retry
krbd: retry on transient errors from udev_enumerate_scan_devices()

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
2019-10-29 11:40:45 +01:00
Patrick Donnelly
eb00dcd660
Merge PR #31063 into master
* refs/pull/31063/head:
	qa: disable too few PG warning during Mimic deploy

Reviewed-by: Nathan Cutler <ncutler@suse.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2019-10-28 20:37:31 -07:00
Sage Weil
d927374bb4 Merge PR #31168 into master
* refs/pull/31168/head:
	ceph-daemon: try py2 import before py3
	qa/suites/rados/singleton-nomsgr/ceph-daemon: make sure python3 is installed
	qa/standalone/test_ceph_damon.sh: test with python2 and python3
	mgr/ssh: python, not python3
	ceph-daemon: python, not python3
	ceph-daemon: os.makedirs
	ceph-daemon: configparser is ConfigParser on py2
	ceph-daemon: avoid py3-isms

Reviewed-by: Sebastian Wagner <swagner@suse.com>
Reviewed-by: Alfredo Deza <adeza@redhat.com>
2019-10-28 14:59:43 -05:00
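Two of the commits merged in PR #31168 ("try py2 import before py3" and "configparser is ConfigParser on py2") describe a common compatibility pattern. A minimal sketch of that pattern follows; it is not the ceph-daemon code itself, and the `read_option` helper and the sample text are made up for illustration.

```python
import io

try:
    from ConfigParser import ConfigParser  # Python 2 module name
except ImportError:
    from configparser import ConfigParser  # Python 3 module name


def read_option(text, section, option):
    """Parse INI-style text and return one option value (hypothetical helper)."""
    parser = ConfigParser()
    buf = io.StringIO(text)
    # read_file() is the current name; readfp() is the older spelling that
    # Python 2 provides.
    if hasattr(parser, "read_file"):
        parser.read_file(buf)
    else:
        parser.readfp(buf)
    return parser.get(section, option)


sample = u"[global]\nfsid = 00000000-0000-0000-0000-000000000000\n"
fsid = read_option(sample, "global", "fsid")
```

Trying the Python 2 name first means the fallback only runs on interpreters where the old module name is gone.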
Sage Weil
9fe9653c8c qa/suites/rados/singleton-nomsgr/ceph-daemon: make sure python3 is installed
Centos7 doesn't have it by default.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-10-28 12:15:47 -05:00
Sage Weil
debde146d2 qa/standalone/test_ceph_damon.sh: test with python2 and python3
Signed-off-by: Sage Weil <sage@redhat.com>
2019-10-28 12:15:47 -05:00
Kefu Chai
5047b3222c
Merge pull request #31171 from liewegas/bug-42496
qa/tasks/cbt: run stop-all.sh while shutting down

Reviewed-by: Kefu Chai <kchai@redhat.com>
2019-10-27 23:57:57 +08:00
Kefu Chai
674bd8a9e6
Merge pull request #30434 from smithfarm/wip-41820
qa: enable dashboard tests to be run with "--suite rados/dashboard"

Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2019-10-27 09:18:16 +08:00
Kefu Chai
c121234b79
Merge pull request #31005 from tchaikov/wip/qa/tasks/ceph/cleanup
qa/tasks/ceph.py: remove unused variables

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
2019-10-27 09:16:37 +08:00
Sage Weil
56e99ba5f0 qa/tasks/cbt: run stop-all.sh when finishing up
stop-all.sh will work if the right deps are there (currently we lack 'nc')

also killall -9 java to be sure.

Fixes: https://tracker.ceph.com/issues/42496
Signed-off-by: Sage Weil <sage@redhat.com>
2019-10-26 13:25:39 -05:00
Ilya Dryomov
b7a0e2adcb qa: add script to stress udev_enumerate_scan_devices()
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2019-10-25 22:05:38 +02:00
Sage Weil
b830870bbb Merge PR #31130 into master
* refs/pull/31130/head:
	ceph-daemon: only set up crash dir mount if it exists

Reviewed-by: Sebastian Wagner <swagner@suse.com>
2019-10-25 09:18:20 -05:00
Sage Weil
f56a8db34d ceph-daemon: only set up crash dir mount if it exists
Sometimes we run containers on a host that doesn't have a crash dir set
up (because no daemon has been deployed).  Examples include shell and
ceph-volume.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-10-24 20:06:23 -05:00
David Zafman
4ea43f7342
Merge pull request #31133 from dzafman/wip-42476
ceph-objectstore-tool: call collection_bits() crashes on the meta col…

Reviewed-by: Sage Weil <sage@redhat.com>
2019-10-24 17:23:48 -07:00
Patrick Donnelly
e47d0e4cb3
Merge PR #31095 into master
* refs/pull/31095/head:
	qa: do not check pg count for new data_isolated volume

Reviewed-by: Sage Weil <sage@redhat.com>
2019-10-24 15:17:48 -07:00
Patrick Donnelly
8fb4e4c1e7
qa: disable too few PG warning during Mimic deploy
Mimic will raise this warning when we use 8 PGs for CephFS metadata/data
pools.

Fixes: fc88e6c6c5
Fixes: https://tracker.ceph.com/issues/42434
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2019-10-24 15:12:43 -07:00
David Zafman
2d79e77b6a ceph-objectstore-tool: call collection_bits() crashes on the meta collection
Skip new check for meta collection
test:
    Turn off osd_pool_default_pg_autoscale_mode just like bash tests do
    Fix test by checking for new error message

Caused by: f88b353454

Fixes: https://tracker.ceph.com/issues/42476

Signed-off-by: David Zafman <dzafman@redhat.com>
2019-10-24 11:37:30 -07:00
Nathan Cutler
54020b75bc
Merge pull request #31112 from smithfarm/wip-distros-sle
qa/distros: add SLE-12-SP3 and SLE-15-SP1

Reviewed-by: Kyr Shatskyy <kyrylo.shatskyy@suse.de>
2019-10-24 11:28:41 +02:00
Nathan Cutler
33a9e4a671 qa/distros: add SLE-12-SP3 and SLE-15-SP1
Ceph luminous is known to run on SLE-12-SP3 and nautilus on SLE-15-SP1, so add
these two to qa/distros/all.

Signed-off-by: Nathan Cutler <ncutler@suse.com>
2019-10-24 09:47:22 +02:00
Sage Weil
29c97547a9 Merge PR #30859 into master
* refs/pull/30859/head:
	auth: EACCES, not EPERM
	mon: shunt old tell commands from cli interface to asok
	mon: allow mgr to tell mon.foo smart
	mon: include quorum features in quorum_status
	qa/workunits/mon/caps.sh: fix test
	ceph_test_rados_api_cmd: fix MonDescribe test
	Merge branch 'vstart-fs-auth' of git://github.com/batrick/ceph into wip-cleanup-mon-asok
	test/pybind/test_ceph_argparse: fix tests
	vstart: add volume client keys to keyring
	vstart: use fs authorize to create master client key
	vstart: redirect some output to stderr
	vstart: output command strings to stderr
	qa/workunits/cephtool/test.sh: fix 'quorum enter' caller
	qa: change mon_status calls to quorum_status or tell commands
	mon: fix 'heap ...' command
	mon: consolidate 'sync force' commands
	mon: allow asok commands to return an error code
	mon: move 'quorum enter|exit' and 'mon_status' to asok
	mon: fix 'smart' asok command
	mon: remove old 'config set' and 'injectargs'

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2019-10-23 21:05:42 -05:00
Sage Weil
bf09a04d22 Merge PR #31094 into master
* refs/pull/31094/head:
	ceph-daemon: remove redundant --privileged
	test_ceph_daemon: test unit, enter, shell
	ceph-daemon: drop exec
	ceph-daemon: fix exit code for run, shell, enter, exec
	ceph-daemon: allow optional command for 'enter'
	ceph-daemon: fix LANG for 'enter' command
	ceph-daemon: allow shell to take optional command
	qa/suites/rados/singleton-nomsgr/ceph-daemon: run test_ceph_daemon.sh
	qa/standalone/test_ceph_daemon.sh: add new functional tests
	test_ceph_daemon.sh: use newer image
	ceph-daemon: unconditionally enable and start crash unit
	ceph-daemon: fix crash unit cleanup
	ceph-daemon: include 'crash' unit/item in 'ls' output
	ceph-daemon: fix 'ls'
	mgr/orchestrator: s/sdd/ssd/
	mgr/ssh: remove stdout/stderr kludges
	ceph-daemon: fix ceph-volume command to write stdout to stdout

Reviewed-by: Sebastian Wagner <swagner@suse.com>
2019-10-23 19:46:06 -05:00
Gregory Farnum
e190825f26
Merge pull request #30650 from athanatos/sjust/wip-dmclock-server-only
dmclock server side refactor

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>
2019-10-23 13:35:13 -07:00
Samuel Just
2157ac7d14 osd/: remove legacy schedulers
The consensus seems to be that PrioritizedQueue is strictly worse than
WeightedPriorityQueue.

mClockClientQueue and mClockClassQueue are superseded by
mClockScheduler.

Signed-off-by: Samuel Just <sjust@redhat.com>
2019-10-23 13:33:59 -07:00
Sage Weil
adf22b9e59 Merge PR #31054 into master
* refs/pull/31054/head:
	qa/suites/upgrade/*-x-singleton: suppress TOO_FEW_PGS warning

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2019-10-23 15:20:43 -05:00
Sage Weil
d7bd029b51 test_ceph_daemon: test unit, enter, shell
Signed-off-by: Sage Weil <sage@redhat.com>
2019-10-23 15:08:55 -05:00
Sage Weil
86b2c8dd60 ceph-daemon: drop exec
It's not identical to enter.  enter seems more intuitive to me, but that
may be because I'm not a longtime docker user.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-10-23 15:08:55 -05:00
Sage Weil
47777b9c0d qa/suites/rados/singleton-nomsgr/ceph-daemon: run test_ceph_daemon.sh
Signed-off-by: Sage Weil <sage@redhat.com>
2019-10-23 15:08:55 -05:00
Sage Weil
202d615d38 qa/standalone/test_ceph_daemon.sh: add new functional tests
- sudo as needed
- clean up afterward

There is still a bit of missing coverage, but this captures most of it.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-10-23 15:08:55 -05:00
Patrick Donnelly
be1fa70b84
qa: do not check pg count for new data_isolated volume
We don't need to specify the number of PGs for a new data pool anymore
since b1b821f608 and other related changes. The related health warnings
are also deprecated/gone. So this no longer needs to be done.

Fixes: b1b821f608
Fixes: https://tracker.ceph.com/issues/42436
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2019-10-23 11:14:28 -07:00
Casey Bodley
250a65e045
Merge pull request #30997 from cbodley/wip-qa-rgw-objectstores
qa/rgw: drop some objectstore types

Reviewed-by: Matt Benjamin <mbenjamin@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2019-10-23 11:37:32 -04:00
Casey Bodley
604db96bbb
Merge pull request #28421 from pritha-srivastava/wip-rgw-omap-offload
rgw: add cls_queue and cls_rgw_gc for omap offload

Reviewed-by: Casey Bodley <cbodley@redhat.com>
2019-10-23 09:59:28 -04:00
Patrick Donnelly
f9d5c6de86
Merge PR #30710 into master
* refs/pull/30710/head:
	mds: add configurable snapshot limit

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2019-10-22 20:55:19 -07:00
Sage Weil
9f912c2158 qa/suites/upgrade/*-x-singleton: suppress TOO_FEW_PGS warning
Signed-off-by: Sage Weil <sage@redhat.com>
2019-10-22 15:53:05 -05:00
Patrick Donnelly
3c4328c0a4
Merge PR #30813 into master
* refs/pull/30813/head:
	qa: get rid of iteritems for python3 compatibility

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2019-10-21 21:22:00 -07:00
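The change merged in PR #30813 removes `iteritems()` calls. A small sketch of the portable spelling, using a made-up dictionary rather than anything from the qa code:

```python
# Hypothetical data for illustration only.
daemons = {"mon.a": "running", "osd.0": "stopped", "osd.1": "running"}

# dict.iteritems() exists only on Python 2. dict.items() exists on both:
# it builds a list on Python 2 and returns a view on Python 3, and either
# can be iterated the same way.
stopped = sorted(name for name, state in daemons.items() if state == "stopped")
```

The cost on Python 2 is an intermediate list, which rarely matters for the small dictionaries typical of test code.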
Patrick Donnelly
581be7595b
Merge PR #30971 into master
* refs/pull/30971/head:
	qa: whitelist "Error recovering journal" for cephfs-data-scan

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2019-10-21 21:18:36 -07:00
Patrick Donnelly
16ad4e78e9
Merge PR #30986 into master
* refs/pull/30986/head:
	qa: allow client mount to reset fully

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2019-10-21 21:14:07 -07:00
Sage Weil
19984d729f qa/workunits/mon/caps.sh: fix test
I'm not really sure why this test expected EPERM before when it expects 0
a bit earlier, but it should certainly expect EPERM after the user is
deleted.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-10-21 21:43:43 -05:00
Xie Xingguo
ad9b7fea90
Merge pull request #30897 from tchaikov/wip-bluestore/avl-allocator
os/bluestore: AVL-tree & extent - based space allocator

Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
2019-10-22 08:33:54 +08:00
Sage Weil
da533bdd4d Merge branch 'vstart-fs-auth' of git://github.com/batrick/ceph into wip-cleanup-mon-asok
2019-10-21 11:43:01 -05:00
Venky Shankar
dcbdc726a6 qa: allow client mount to reset fully
... without this there is a subtle race: a hard reset of a client
mount takes a bit of time to complete, so further checks fail
because the (still up) client is still connected to the MDS.

Fixes: http://tracker.ceph.com/issues/42213
Signed-off-by: Venky Shankar <vshankar@redhat.com>
2019-10-21 08:38:37 -04:00
Nathan Cutler
493ee6d78f qa: enable dashboard tests to be run with "--suite rados/dashboard"
This moves dashboard.yaml from rados/mgr into a new, separate rados/dashboard
suite. The common elements it uses are moved from rados/mgr into qa/ and
replaced with symlinks.

Fixes: https://tracker.ceph.com/issues/41820
Signed-off-by: Nathan Cutler <ncutler@suse.com>
2019-10-21 12:31:51 +02:00
Ilya Dryomov
a80185d02c
Merge pull request #30965 from idryomov/wip-krbd-udev-socket-overrun
krbd: avoid udev netlink socket overrun

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
2019-10-21 11:00:46 +02:00
Kefu Chai
02a2a04ad2 qa/tasks/ceph.py: remove unused variables
Signed-off-by: Kefu Chai <kchai@redhat.com>
2019-10-21 12:37:56 +08:00
Kefu Chai
09863ef3d9 qa/tasks/ceph: tolerate 'T' or ' ' as date and time separator
str.replace() does not change the string in-place, so we need to assign
its return value to `t`.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2019-10-21 12:18:58 +08:00
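The bug described in 09863ef3d9 is easy to reproduce in isolation. The sketch below is not the code from qa/tasks/ceph.py; the function name and the timestamp format are assumptions chosen to show why the return value of `str.replace()` has to be assigned.

```python
from datetime import datetime


def parse_stamp(t):
    """Parse a timestamp that uses either 'T' or ' ' between date and time."""
    # Strings are immutable, so replace() returns a new string. Calling
    # t.replace("T", " ") without assigning the result would leave t as-is.
    t = t.replace("T", " ")
    return datetime.strptime(t, "%Y-%m-%d %H:%M:%S")


with_t = parse_stamp("2019-10-21T12:18:58")
with_space = parse_stamp("2019-10-21 12:18:58")
```

Both calls yield the same `datetime`, so callers do not need to know which separator the log used.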
Kefu Chai
548098668e qa/tasks/ceph_manager: do not panic if "pg_num_target" is missing
we don't have "pg_num_target" in "osd dump" back in mimic, so we don't
need to check it if it is missing when performing an upgrade test.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2019-10-21 12:18:58 +08:00
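The tolerance described in 548098668e amounts to treating a missing key as "nothing to check". A hedged sketch, assuming a pool is represented as a plain dictionary; `pg_num_settled` is a hypothetical helper, not a function from qa/tasks/ceph_manager:

```python
def pg_num_settled(pool):
    """Return True if pg_num has reached pg_num_target.

    A pool entry without "pg_num_target" (as the commit describes for
    mimic) is treated as settled instead of raising KeyError.
    """
    target = pool.get("pg_num_target")
    if target is None:
        return True
    return pool["pg_num"] == target


mimic_pool = {"pool_name": "rbd", "pg_num": 8}
newer_pool = {"pool_name": "rbd", "pg_num": 8, "pg_num_target": 16}
```

Using `dict.get()` keeps one code path for both old and new "osd dump" output during an upgrade test.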
Kefu Chai
18ccae63fa
Merge pull request #30758 from kshtsk/wip-python3-print
tests: use python3 compatible print

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Gregory Farnum <gfarnum@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2019-10-21 00:34:18 +08:00
Casey Bodley
0e76d40aa1 test/rgw: run ceph_test_rgw_gc_log in rgw verify suite
since it requires a running ceph cluster, it can't run in 'make check'
as a unittest. add it to the rgw/verify suite instead

Signed-off-by: Casey Bodley <cbodley@redhat.com>
2019-10-19 13:28:18 +05:30
Ilya Dryomov
898c113f93 qa: add script to test udev event reaping
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2019-10-18 21:56:30 +02:00
Casey Bodley
85a37896b8 qa/rgw: drop some objectstore types
use the subset of objectstore configurations from .qa/objectstore_cephfs
instead of .qa/objectstore

Signed-off-by: Casey Bodley <cbodley@redhat.com>
2019-10-18 13:20:20 -04:00
Ilya Dryomov
340d6f61b3
Merge pull request #30978 from idryomov/wip-krbd-modprobe
krbd: modprobe before calling build_map_buf()

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
2019-10-18 11:11:07 +02:00