Commit Graph

165 Commits

Author SHA1 Message Date
Patrick Donnelly
0c8899c985
qa: add upgrade test for volume upgrade from legacy
This tests that volumes created using the ceph_volume_client.py library
continue to be accessible/function via the Nautilus/Octopus ceph-mgr
volumes plugin.

Fixes: https://tracker.ceph.com/issues/42723
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2020-03-02 20:27:15 -08:00
Sridhar Seshasayee
e527067666 qa: Whitelist 'slow request' within a bunch of tests
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
2020-02-24 19:59:56 +05:30
Patrick Donnelly
48ca559224
qa: update cluster warning message for removed MDS
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2020-02-13 07:51:10 -08:00
Patrick Donnelly
1fc33c54f8
qa: specify random distros in multimds
Note: the name is important so that kclient mount can override the
distro setting.

Fixes: https://tracker.ceph.com/issues/43968
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2020-02-05 12:36:50 -08:00
Venky Shankar
b5970ff80d test: add subvolume clone tests
Signed-off-by: Venky Shankar <vshankar@redhat.com>
2020-01-31 05:09:14 -05:00
Sage Weil
7ce7ac8bfc qa/suites/fs/upgrade: finish at octopus
Signed-off-by: Sage Weil <sage@redhat.com>
2020-01-25 16:04:28 -06:00
Sage Weil
3d94bc42db qa/suites/fs/upgrade: set min-compat-client to octopus
Signed-off-by: Sage Weil <sage@redhat.com>
2020-01-25 13:31:08 -06:00
Sage Weil
41c03aa143 qa/suites/fs/upgrade: set pg_autoscale_mode=off after upgrade
Signed-off-by: Sage Weil <sage@redhat.com>
2020-01-24 21:01:07 -06:00
Patrick Donnelly
1a0258ed3c
Merge PR #32644 into master
* refs/pull/32644/head:
	qa: ignore trimmed cache items for dead cache drop
	qa: use unit test comparisons

Reviewed-by: Zheng Yan <zyan@redhat.com>
2020-01-22 08:21:04 -08:00
Patrick Donnelly
590368e956
qa: ignore trimmed cache items for dead cache drop
Fixes: https://tracker.ceph.com/issues/42986
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2020-01-20 17:52:12 -08:00
Vikhyat Umrao
e0abf8c13a test: test case for openfiletable MAX_ITEMS_PER_OBJ value verification
Signed-off-by: Vikhyat Umrao <vikhyat@redhat.com>
2020-01-17 19:35:46 -08:00
Patrick Donnelly
2cdb2972cd
qa: define centos version for fs:verify
Otherwise it uses the teuthology default of 7.6.

Fixes: https://tracker.ceph.com/issues/43516
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2020-01-07 13:20:00 -08:00
Patrick Donnelly
02b3883dd0
Merge PR #32363 into master
* refs/pull/32363/head:
	qa: add .qa link

Reviewed-by: Sage Weil <sage@redhat.com>
2020-01-06 12:18:12 -08:00
Patrick Donnelly
4562823a19
qa: add .qa link
Continuation of 716db6e2fd.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2019-12-19 14:31:09 -08:00
Kefu Chai
4148ff42b5 qa: no need to exclude ceph-mgr-diskprediction-cloud from package list to be installed
Signed-off-by: Kefu Chai <kchai@redhat.com>
2019-12-17 21:52:18 +08:00
Sage Weil
2184641d7f qa: fix lingering ceph-mgr-ssh -> ceph-mgr-cephadm refs
Signed-off-by: Sage Weil <sage@redhat.com>
2019-12-13 12:48:06 -06:00
Sage Weil
c8750b7066 files,rpm,deb: rename ceph-daemon -> cephadm
This is just renaming the files and adjusting the packages.  Lots of
cleanup to do still.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-12-11 19:14:09 -06:00
Patrick Donnelly
e8368d61be
Merge PR #29421 into master
* refs/pull/29421/head:
	qa/cephfs: add tests for ACLs
	qa/cephfs: allow running tests from xfstests-dev
	qa/tasks: add methods to get monitor's sockets
	qa/cephfs: don't crash if mountpoint dir is already deleted
	vstart_runner.py: set omit_sudo's default value to False
	qa/vstart_runner.py: fix get_keyring_path()
	qa/cephfs: don't abort if mountpoint is already present
	qa/cephfs: allow specifying mountpoint for kernel mounts
	qa/cephfs: allow specifying mountpoints for FUSE mounts
	qa/vstart_runner.py: allow specifying mountpoint for local FUSE mounts
	qa/mount.py: allow setting mountpoint
	qa/vstart_runner.py: add a method to create a temporary directory

Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2019-12-05 13:25:03 -08:00
Rishabh Dave
a9db23fd18 qa/cephfs: add tests for ACLs
Add code to run tests for ACLs from xfstests-dev against kernel and
FUSE CephFS mounts.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2019-12-03 18:17:18 +05:30
Sage Weil
cf352c3ac0 osd: add osd_fast_shutdown option (default true)
If we get a SIGINT or SIGTERM or are deleted from the OSDMap, do a fast
shutdown by exiting immediately.  This has a few important benefits:

 - We immediately stop responding (binding) to any sockets, which means
   other OSDs will immediately decide we are down (and dead!).  This
   minimizes IO interruption.
 - We avoid the complex "clean" shutdown process, which is historically a
   source of bugs.

In reality, the only purpose of the "clean" shutdown is to try to tear down
everything in memory so we can do memory leak checking with valgrind.  Set
this option to false for valgrind QA runs so we can still do that.

Note that with the new read leases in octopus, we rely on the default
behavior that an ECONNREFUSED is taken to mean that the OSD is fully dead,
so that we don't have to wait for any leases to time out.  This works in
sane environments with normal IP networks, but that behavior could
conceivably be a bad idea if there are some weird network shenanigans
going on.  If osd_fast_fail_on_connection_refused were disabled, then this
fast shutdown procedure might be *worse* than the clean shutdown because
we would have to wait for the heartbeat timeout.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-11-15 09:31:50 -06:00
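The two shutdown paths described in cf352c3ac0 can be sketched roughly as follows. This is an illustrative Python model only, not the OSD's actual C++ code: the names make_shutdown_handler, clean_shutdown, and exit_now are hypothetical, and the only detail taken from the commit message is that the fast path exits immediately while the clean path tears down in-memory state first.

```python
import os
import signal


def make_shutdown_handler(fast_shutdown, clean_shutdown, exit_now=os._exit):
    """Return a signal handler modelling the two shutdown paths.

    fast_shutdown  -- hypothetical stand-in for the osd_fast_shutdown option
    clean_shutdown -- hypothetical callable that frees in-memory state
    exit_now       -- exit function; injectable so the sketch can be tested
    """
    def handler(signum, frame):
        if not fast_shutdown:
            # Clean path: tear everything down so a leak checker such as
            # valgrind sees no outstanding allocations at exit.
            clean_shutdown()
        # Fast path (and the end of the clean path): leave the process.
        # Exiting closes the listening sockets, so peers get ECONNREFUSED.
        exit_now(0)
    return handler


def install(fast_shutdown, clean_shutdown):
    """Register the handler for the signals named in the commit message."""
    handler = make_shutdown_handler(fast_shutdown, clean_shutdown)
    for sig in (signal.SIGINT, signal.SIGTERM):
        signal.signal(sig, handler)
```

Under this model, a valgrind QA run would call install(False, ...) so the teardown still executes, while a normal deployment would keep the default of True.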
Patrick Donnelly
9dc07d8096
qa: add tests for CephFS admin commands
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2019-10-30 11:44:26 -07:00
Patrick Donnelly
8fb4e4c1e7
qa: disable too few PG warning during Mimic deploy
Mimic will raise this warning when we use 8 PGs for CephFS metadata/data
pools.

Fixes: fc88e6c6c5
Fixes: https://tracker.ceph.com/issues/42434
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2019-10-24 15:12:43 -07:00
Yan, Zheng
c4c7df8bf0 qa: whitelist "Error recovering journal" for cephfs-data-scan
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Fixes: https://tracker.ceph.com/issues/41836
2019-10-17 21:19:54 +08:00
Sage Weil
71d74aa8c6 qa: more tries for mon tell when injecting msgr failures
With failure injection, the default of 2 tries isn't quite enough.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-10-11 14:16:42 -05:00
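The idea behind 71d74aa8c6, allowing more attempts when failures are injected, can be illustrated with a generic retry helper. This is a sketch under stated assumptions: run_with_retries and the use of ConnectionError are hypothetical and do not reflect how the mon tell code path or the qa suite implements retries.

```python
def run_with_retries(action, tries):
    """Call action() up to `tries` times, returning its first success.

    Only ConnectionError is retried here; that choice is an assumption
    made for the sketch, standing in for an injected messenger failure.
    """
    if tries < 1:
        raise ValueError("tries must be at least 1")
    last_error = None
    for _ in range(tries):
        try:
            return action()
        except ConnectionError as exc:
            # Remember the failure and try again if attempts remain.
            last_error = exc
    # Every attempt failed: surface the most recent error.
    raise last_error
```

With a command that fails on its first two attempts, a budget of 2 tries raises the error, while a larger budget lets the third attempt succeed.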
Sage Weil
6fd67f19e3 Merge PR #30603 into master
* refs/pull/30603/head:
	ceph-daemon: -n type.id instead of -i id
	ceph-daemon: drop unused VERSION
	ceph-daemon: clean up dir helpers, tighten up permissions
	ceph-daemon: fchmod before writing to keyring file
	test_ceph_daemon.sh: skip ssh until container image has remoto
	ceph-daemon: decode utf-8 in run() helper
	mgr/ssh: clean up debug cruft
	mgr/ssh: clean up bare except: block
	ceph-daemon: clean up bare except: blocks
	ceph-daemon: all imports to top
	ceph-volume: no_tmpfs -> tmpfs
	doc/bootstrap: add new bootstrap documentation
	ceph-daemon: add --output-pub-ssh-key for bootstrap
	ceph-daemon: make 'shell' easier to use
	ceph-daemon: support docker; prefer podman
	qa: add ceph-daemon
	debian: ceph-daemon package, required by ceph-mgr-ssh
	ceph.spec.in: ceph-daemon package, required by ceph-mgr
	common/options: cleanup whitespace
	mgr/ssh: simplify getting the cluster fsid
	mgr/ssh: pipe ceph-daemon script to stdin of python3
	ceph-daemon: add support for args and/or stdin from top of script
	ceph-daemon: make ceph-volume use get_config_and_keyring
	ceph-daemon: ls: behave if /var/log/ceph doesn't exist
	ceph-daemon: implement 'adopt' for legacy style daemons
	ceph-daemon: fix fsid detection for legacy osds
	ceph-daemon: make rm-cluster clean up system-ceph*.slice too
	ceph-daemon: configure ssh orchestrator
	ceph-daemon: be more restrictive with file permissions
	mgr/ssh: create osd with ceph-daemon
	mgr/ssh: pass daemon id separately to _create_daemon
	ceph-daemon: add --config-and-keyring to ceph-volume command
	ceph-daemon: create log path for shell (if needed)
	mgr/ssh: use _run_ceph_daemon for _create_daemon
	mgr/ssh: factor _run_ceph_daemon out of _get_device_inventory
	mon/ConfigMonitor: allow entity type only for 'config get'
	ceph-daemon: add ceph-volume subcommand
	ceph-daemon: remove unused CephContainer dname property
	ceph-daemon: drop useless uid/gid checks
	mgr/ssh: deploy new mgrs with ceph-daemon
	mgr/ssh: factor _create_daemon out of create_mon
	mon/MonCap: allow mgr to create new auth keys
	mgr/ssh: run c-v with podman when getting inventory
	mgr/ssh: simplify ssh connection management
	mgr/ssh: use ceph-daemon for deploying mon
	ceph-daemon: allow --mon-network for deploying new mon (vs specifying IP)
	ceph-daemon: --config-and-keyring (not key)
	common/options: add 'image' config option
	test_ceph_daemon: specify image name
	vstart.sh: add --ssh to enable+configure ssh orchestrator
	mgr/ssh: use ssh identity from config-key, if present
	mgr/ssh: hardcode default ssh_config
	ceph-daemon: store ssh identity in mon config-key store
	ceph-daemon: --privileged arg for 'exec'
	ceph-daemon: make deploy work for osd (do a c-v prepare)
	ceph-daemon: make shell privileged
	ceph-daemon: move get_container_mounts to a helper
	ceph-daemon: pass full path for entrypoint
	ceph-daemon: make id portion of 'shell' optional
	ceph-volume: accept --no-tmpfs argument for bluestore
	ceph-daemon: 'unit' command
	ceph-daemon: fix run command to use call(), not check_output()
	src/ceph-daemon: whitespace
	ceph-daemon: add 'enter', 'exec' commands
	ceph-daemon: bind config to default location
	test_ceph_daemon.sh: test deploy mds too
	ceph-daemon: generate ssh keys
	ceph-daemon: --config, not --conf
	ceph-daemon: long lines
	ceph-daemon: add --config to bootstrap
	ceph-daemon: add 'shell' command
	ceph-daemon: do not import subprocess symbols directly
	ceph-daemon: add mons with 'deploy mon.x ...'
	ceph-daemon: add 'ls'
	ceph-daemon: simplify uid/gid a bit
	ceph-daemon: fix libudev
	ceph-daemon: autodetect uid/gid from container image
	ceph-daemon: default to empty log files, log to stderr (systemd journal)
	ceph-daemon: rm-{daemon,cluster}
	ceph-daemon: fix bootstrap config
	ceph-daemon: fix args.fsid usage
	ceph-daemon: be careful overwriting live files
	ceph-daemon: slurp some options over from the standard systemd unit
	ceph-daemon: add ceph.target and ceph-$fsid.target units
	test_ceph_daemon.sh: stupid test script
	ceph-daemon: bootstrap and deploy (mgr) work
	ceph-daemon: initial checkin
	ceph-mon: fix debug print of public_addr
2019-10-07 15:31:14 -05:00
Sage Weil
f2e2cb1541 qa: add ceph-daemon
Signed-off-by: Sage Weil <sage@redhat.com>
2019-10-04 20:33:35 -05:00
David Zafman
ded58ef91d test: Ignore OSD_SLOW_PING_TIME* if injecting socket failures
Fixes: https://tracker.ceph.com/issues/41743

Signed-off-by: David Zafman <dzafman@redhat.com>
2019-10-03 09:09:10 -07:00
Jeff Layton
9c406d0ab3 mon: deprecate CephFS inline_data support
The plan is to start deprecating this feature now so that we can remove
it in a future release. Change it to require the
--yes-i-really-really-mean-it flag, and to emit a custom
warning when that isn't specified.

For now, we leave the testing in place since we do want to be notified
if something breaks before we're ready to rip it out completely.

Fixes: https://tracker.ceph.com/issues/41311
Signed-off-by: Jeff Layton <jlayton@redhat.com>
2019-09-19 09:15:13 -04:00
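The gating described in 9c406d0ab3 follows a common pattern: a deprecated setting is refused unless the caller passes an explicit confirmation flag. The sketch below models that pattern in Python. The function set_inline_data, its return shape, and the message wording are hypothetical; only the flag name comes from the commit message, and the choice to gate enabling but not disabling is an assumption.

```python
CONFIRM_FLAG = "--yes-i-really-really-mean-it"


def set_inline_data(enable, args):
    """Return (accepted, message) for a request to toggle inline_data.

    args is the list of extra command-line arguments supplied by the
    caller. Enabling without the confirmation flag is refused with a
    warning; the warning text here is illustrative, not Ceph's.
    """
    if enable and CONFIRM_FLAG not in args:
        warning = ("inline_data is deprecated; pass %s to enable it anyway"
                   % CONFIRM_FLAG)
        return False, warning
    state = "enabled" if enable else "disabled"
    return True, "inline_data %s" % state
```

For example, set_inline_data(True, []) is refused with the warning, while set_inline_data(True, [CONFIRM_FLAG]) is accepted.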
Patrick Donnelly
aba296aab8
qa: add debugging failed osd-release setting
See-also: https://tracker.ceph.com/issues/40773
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2019-08-22 07:19:51 -07:00
Patrick Donnelly
d1ce58257e
Merge PR #29431 into master
* refs/pull/29431/head:
	qa: fix malformed suite config

Reviewed-by: Zheng Yan <zyan@redhat.com>
2019-08-14 15:21:51 -07:00
Sage Weil
f011c13547 Merge PR #29292 into master
* refs/pull/29292/head:
	os/bluestore: warn on no per-pool omap
	os/bluestore: fsck: warning (not error) by default on no per-pool omap
	os/bluestore: fsck: int64_t for error count
	os/bluestore: default size of 1 TB for testing
	os/bluestore: behave if we *do* set PGMETA and PERPOOL flags
	os/bluestore: do not set both PGMETA_OMAP and PERPOOL_OMAP
	os/bluestore: fsck: only generate 1 error per omap_head
	os/bluestore: make fsck repair convert to per-pool omap
	os/bluestore: teach fsck to tolerate per-pool omap
	os/bluestore: ondisk format change to 3 for per-pool omap
	mon/PGMap: add data/omap breakouts for 'df detail' view
	osd/osd_types: separate get_{user,allocated}_bytes() into data and omap variants
	mon/PGMap: fix stored_raw calculation
	mon/PGMap: add in actual omap usage into per-pool stats
	osd: report per-pool omap support via store_statfs_t
	os/bluestore: set per_pool_omap key on mkfs
	osd/osd_types: count per-pool omap capable OSDs
	os/bluestore: report omap_allocated per-pool
	os/bluestore: add pool prefix to omap keys
	kv/KeyValueDB: take key_prefix for estimate_prefix_size()
	os/bluestore: fix manual omap key manipulation to use Onode::get_omap_key()
	os/bluestore: make omap key helpers Onode methods
	os/bluestore: add Onode::get_omap_prefix() helper
	os/bluestore: change _do_omap_clear() args

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2019-08-09 10:40:45 -05:00
Sage Weil
b8501164ef os/bluestore: warn on no per-pool omap
Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-09 08:21:18 -05:00
Patrick Donnelly
31492bb095
qa: fix malformed suite config
Fixes: https://tracker.ceph.com/issues/41031
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2019-07-31 10:11:45 -07:00
Sage Weil
9cb6108eb7 Merge PR #29363 into master
* refs/pull/29363/head:
	qa/suites/multimds/basic/tasks/ceph_test_snapshots: disable RECENT_CRASH
	qa/suites/kcephfs/recovery/failover.yaml: disable RECENT_CRASH
	qa/suites/fs/multifs/tasks/failover.yaml: disable RECENT_CRASH

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2019-07-30 22:46:55 -05:00
Patrick Donnelly
5e08dac8c1
qa: ignore expected MDS_CLIENT_LATE_RELEASE warning
Fixes: http://tracker.ceph.com/issues/40968
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2019-07-26 13:37:23 -07:00
Sage Weil
2c87a46364 qa/suites/fs/multifs/tasks/failover.yaml: disable RECENT_CRASH
This test deliberately crashes daemons.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-07-26 15:00:11 -05:00
Patrick Donnelly
0e25e4bb4e
Merge PR #27073 into master
* refs/pull/27073/head:
	qa/tasks: Check MDS failover during mon_thrash
	qa/tasks: Compare two FSStatuses
	qa/suites/fs: renamed default.yaml to mds.yaml
	qa/suites/fs: mon_thrash test for fs
	qa/tasks: Fix typo in the comment

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2019-07-01 15:31:55 -07:00
Patrick Donnelly
ff1f04f4d5
qa: elide python version config
This test doesn't actually use the config and cephfs-shell is py3 only.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2019-06-19 18:22:51 -07:00
Venky Shankar
d92840b59e test: port fs/volume related tests to python
... and add subvolume related tests.

Signed-off-by: Venky Shankar <vshankar@redhat.com>
2019-06-12 12:43:17 -04:00
Patrick Donnelly
66f18ecd09
qa: use mimic-O upgrade process
Fixes: https://tracker.ceph.com/issues/39436

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2019-06-06 13:06:56 -07:00
Jos Collin
5e09b4681a
qa/suites/fs: renamed default.yaml to mds.yaml
Signed-off-by: Jos Collin <jcollin@redhat.com>
2019-04-12 10:22:31 +05:30
Jos Collin
9dace9258f
qa/suites/fs: mon_thrash test for fs
Created a mon.yaml in fs suite that calls mon_thrash test for fs and multimds suites.

Fixes: http://tracker.ceph.com/issues/17309
Signed-off-by: Jos Collin <jcollin@redhat.com>
2019-04-12 10:22:04 +05:30
Patrick Donnelly
40c6319a55
qa: test featureful client with mimic base
Fixes: http://tracker.ceph.com/issues/39020
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2019-04-03 14:58:57 -07:00
Patrick Donnelly
6168791373
qa: remove obsolete snap upgrade tests
Direct upgrades from Luminous to Octopus are not supported. These snap format
upgrade tests are now only going to be run in the mimic/nautilus branches.

Fixes: http://tracker.ceph.com/issues/39020
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2019-04-03 14:58:11 -07:00
Patrick Donnelly
f20de0897c
qa: remove requirement on simple msgr
Fixes: http://tracker.ceph.com/issues/39079
Introduced-by: 28b4392a71
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2019-04-01 17:06:24 -07:00
Patrick Donnelly
7de8cb405c
Merge PR #26935 into nautilus
* refs/pull/26935/head:
	qa: extend MDS heartbeat grace for valgrind

Reviewed-by: Sage Weil <sage@redhat.com>
2019-03-13 20:37:03 -07:00
Patrick Donnelly
7b520755ce
qa: extend MDS heartbeat grace for valgrind
Valgrind makes the MDS slowwwww. The newish mds_heartbeat_grace config allows
us to keep sending beacons to the mons even if the internal heartbeat is slow.
This avoids the laggy messages which are useful to grep for unrelated messaging
issues.

Fixes: http://tracker.ceph.com/issues/38723
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2019-03-13 09:18:32 -07:00
Patrick Donnelly
1ceadf0f07
qa: ignore MON_DOWN for volume-client testing
The test restarts the monitors.

Fixes: http://tracker.ceph.com/issues/38704
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2019-03-12 10:38:55 -07:00
Patrick Donnelly
897a1f7385
qa: stop testing simple messenger in CephFS suites
Simple messenger is on its way out and it doesn't work with msgr2.

Fixes: http://tracker.ceph.com/issues/38676
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2019-03-11 09:06:32 -07:00
Sage Weil
e79dc454db qa/suites: disable valgrind leak checks on ceph-mgr
We've disabled the "clean" shutdown in ceph-mgr due to
https://tracker.ceph.com/issues/38621

Until that issue is fixed, no valgrind leak checks!

Signed-off-by: Sage Weil <sage@redhat.com>
2019-03-07 13:03:28 -06:00