135 Commits

Author SHA1 Message Date
Patrick Donnelly
2e44b87141
Merge PR #19263 into master
* refs/pull/19263/head:
	qa: ignore bad backtrace cluster wrn
	qa/cephfs: Add tests to validate scrub functionality
	cephfs: Add option to load invalid metadata from disk
	cephfs: Reset scrub data when inodes move

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2018-02-13 14:43:32 -08:00
Douglas Fuller
07339e2d1d qa/cephfs: Add tests to validate scrub functionality
Add tests to ensure the scrub operation is not adversely affected
by certain metadata pathologies.

Signed-off-by: Douglas Fuller <dfuller@redhat.com>
2018-02-13 14:07:28 -05:00
Yan, Zheng
27b1ca076e qa: adjust cephfs full test for kclient
Fixes: http://tracker.ceph.com/issues/22886
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2018-02-05 18:00:57 +08:00
Patrick Donnelly
f6e1a797d4
Revert "Merge PR #19369 into master"
This reverts commit 3189ba19a7, reversing
changes made to b7620de020.

Despite the change in json format being positive, the unfortunate side-effect
is that it broke upgrade testing (because the QA framework must handle the
transition of mdsmap["info"] to a list from object) and the ceph-mgr.

Fixes: http://tracker.ceph.com/issues/22527
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2018-01-04 09:42:37 -08:00
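
The format transition described in this revert (mdsmap["info"] as an object versus a list) can be handled by a consumer that accepts both shapes. The following is a minimal sketch, not code from the QA framework: the helper name mds_info_list and the sample keys ("gid_4242", "name", "state") are illustrative assumptions.

```python
def mds_info_list(mdsmap):
    """Return the MDS info entries as a list, whatever shape "info" has.

    Hypothetical helper: accepts either an object keyed by an identifier
    or a plain list of entries.
    """
    info = mdsmap.get("info", {})
    if isinstance(info, dict):
        # Object form: the keys are identifiers, the values are the entries.
        return list(info.values())
    # List form: the entries are already a sequence.
    return list(info)


# Sample data is made up for illustration only.
object_form = {"info": {"gid_4242": {"name": "a", "state": "up:active"}}}
list_form = {"info": [{"name": "a", "state": "up:active"}]}

print(mds_info_list(object_form) == mds_info_list(list_form))
```

A caller written against mds_info_list would not need to know which format the cluster under test emits, which is the kind of handling an upgrade test requires.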
Sage Weil
819a3578fa Merge tag 'v13.0.1' 2018-01-03 10:04:20 -06:00
Patrick Donnelly
3189ba19a7
Merge PR #19369 into master
* refs/pull/19369/head:
	qa: update handling of fs status format
	PendingReleaseNotes: add note for format change
	mds/MDSMap : use arrary_section for mds stat

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Zheng Yan <zyan@redhat.com>
Reviewed-by: Xiaoxi Chen <xiaoxchen@ebay.com>
2017-12-21 20:21:18 -08:00
Patrick Donnelly
1f1a2a27ef
qa: update handling of fs status format
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-12-21 10:35:09 -08:00
Sage Weil
f33ab7e03a Merge remote-tracking branch 'gh/mimic-dev1' 2017-12-20 15:08:30 -06:00
Patrick Donnelly
6e046dfc90
qa: check pool full flags
Cluster-wide flag removed in b4ca5ae462.

Fixes: http://tracker.ceph.com/issues/22475

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-12-18 21:59:57 -08:00
Patrick Donnelly
b2284f23b8
qa: don't configure ec data pool with memstore
Fixes: http://tracker.ceph.com/issues/22436

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-12-18 21:12:22 -08:00
Patrick Donnelly
67ca6cd229
mds: obsolete MDSMap option configs
These configs were used for initialization but it is more appropriate to
require setting these file system attributes via `ceph fs set`. This is similar
to what was already done with max_mds. There are new variables added for `fs
set` where missing.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-12-13 18:30:52 -08:00
Patrick Donnelly
df43e415c6
Merge PR #18274 into master
* refs/pull/18274/head:
	mds: fold mds_revoke_cap_timeout into mds_session_timeout
	client: add new delegation testcases
	client: add delegation support for cephfs
	common: remove data_dir_option from common_preinit and global_pre_init

Reviewed-by: Gregory Farnum <gfarnum@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2017-11-20 21:48:19 -08:00
Jeff Layton
3321cc7b37 mds: fold mds_revoke_cap_timeout into mds_session_timeout
Right now, we have two different timeout settings -- one for when the
client is just not responding at all (mds_session_timeout), and one for
when the client is otherwise responding but isn't returning caps in a
timely fashion (mds_cap_revoke_timeout).

The default settings on them are equivalent (60s), but only the
mds_session_timeout is communicated via the mdsmap. The
mds_cap_revoke_timeout is known only to the MDS. Neither timeout results
in anything other than warnings in the current codebase.

There is a third setting (mds_session_autoclose) that is also
communicated via the MDSmap. Exceeding that value (default of 300s)
could eventually result in the client being blacklisted from the
cluster. The code to implement that doesn't exist yet, however.

The current codebase doesn't do any real sanity checking of these
timeouts, so the potential for admins to get them wrong is rather high.
It's hard to concoct a use-case where we'd want to warn about these
events at different intervals.

Simplify this by just removing the mds_cap_revoke_timeout setting, and
replace its use in the code with the mds_session_timeout. With that, the
client can at least determine when warnings might start showing up in
the MDS' logs.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
2017-11-14 07:27:01 -05:00
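
The relationship between the two remaining timeouts in the commit above can be sketched as follows. This is an illustration only, not MDS code: the function session_status and its return values are hypothetical, and the sanity check is an assumption about what a check might look like. The defaults (60s and 300s) are the ones quoted in the commit message.

```python
def session_status(idle_seconds, session_timeout=60, session_autoclose=300):
    """Classify a client by how long it has been unresponsive.

    Hypothetical sketch: one threshold for warnings (mds_session_timeout)
    and a later one for closing the session (mds_session_autoclose).
    """
    if session_autoclose < session_timeout:
        # Assumed sanity check: autoclose should not fire before the warning.
        raise ValueError("session_autoclose must be >= session_timeout")
    if idle_seconds >= session_autoclose:
        return "autoclose"
    if idle_seconds >= session_timeout:
        return "warn"
    return "ok"


for idle in (10, 90, 400):
    print(idle, session_status(idle))
```

With a single warning threshold, a client that knows mds_session_timeout from the mdsmap can tell when warnings about it may start to appear.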
Patrick Donnelly
2bba5d8e0f
Merge PR #18192 into master
* refs/pull/18192/head:
	qa/cephfs: test ec data pool
	qa/suites/fs/basic_functional/clusters: more osds

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2017-10-25 17:05:38 -07:00
Patrick Donnelly
c58161f25b
Merge PR #17266 into master
* refs/pull/17266/head:
	qa: update test_ceph_argparse to test fs cmds
	qa: use fs rm_data_pool
	qa: fix mdsmap lookup
	qa: remove usage of mds dump
	PendingReleaseNotes: add obsoleted mds commands
	qa: remove use of obsolete mds commands
	ceph_volume_client: remove use of obsolete mds cmd
	doc: update on obsolete mds commands
	cephfs: obsolete deprecated mds commands

Reviewed-by: Douglas Fuller <dfuller@redhat.com>
2017-10-24 16:37:14 -07:00
Patrick Donnelly
3a5f090a1e
qa: remove usage of mds dump
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-10-24 11:32:43 -07:00
Sage Weil
d0732fc96f qa/cephfs: test ec data pool
Signed-off-by: Sage Weil <sage@redhat.com>
2017-10-23 21:11:24 -05:00
Zack Cerza
e606386626 qa/tasks/cephfs/filesystem: Check for mds failure
... inside Filesystem.are_daemons_healthy()

Signed-off-by: Zack Cerza <zack@redhat.com>
2017-10-18 12:59:09 -06:00
Patrick Donnelly
183646c919
qa: remove use of obsolete mds commands
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-10-01 17:22:36 -07:00
Patrick Donnelly
534c30aca4
Merge PR #18041 into master
* refs/remotes/upstream/pull/18041/head:
	qa: relax cap expected value check
2017-09-30 17:43:56 -07:00
Patrick Donnelly
b37c7f7db7
qa: relax cap expected value check
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-09-29 08:48:14 -07:00
Ramana Raja
baf3b88800 ceph_volume_client: fix setting caps for IDs
... that have empty OSD and MDS caps. Don't add a ',' at the
start of OSD and MDS caps.

Fixes: http://tracker.ceph.com/issues/21501
Signed-off-by: Ramana Raja <rraja@redhat.com>
2017-09-29 17:06:05 +05:30
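
The fix above concerns how a new cap is appended to a possibly empty cap string. A minimal sketch of the idea, assuming a hypothetical helper append_cap and made-up cap strings (this is not the ceph_volume_client implementation):

```python
def append_cap(existing_caps, new_cap):
    """Append new_cap to a comma-separated cap string.

    Hypothetical helper: when existing_caps is empty, return new_cap
    alone so the result never starts with ','.
    """
    if not existing_caps:
        return new_cap
    return existing_caps + "," + new_cap


# Illustrative cap strings, not taken from the commit.
print(append_cap("", "allow rw path=/volumes/a"))
print(append_cap("allow rw path=/volumes/a", "allow rw path=/volumes/b"))
```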
Patrick Donnelly
1aef50a1ed
Merge PR #17697 into master
* refs/remotes/upstream/pull/17697/head:
	pybind/ceph_volume_client: add get, put, and delete object interfaces
	pybind/ceph_volume_client: remove 'compat_version'
	pybind/ceph_volume_client: set the version

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2017-09-28 18:26:06 -07:00
Patrick Donnelly
1da4a5090a
Merge PR #16036 into HEAD
* refs/remotes/upstream/pull/16036/head:
	mds: improve cap min/max ratio descriptions
	mds: fix whitespace
	mds: cap client recall to min caps per client
	mds: fix conf types
	mds: fix whitespace
	doc/cephfs: add client min cache and max cache ratio describe
	mds: adding tunable features for caps_per_client

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Zheng Yan <zyan@redhat.com>
2017-09-28 17:00:39 -07:00
Patrick Donnelly
538834171f
mds: cap client recall to min caps per client
Fixes: http://tracker.ceph.com/issues/21575

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-09-28 15:55:57 -07:00
Ramana Raja
d1bd171d6b pybind/ceph_volume_client: add get, put, and delete object interfaces
Wrap low-level rados APIs to allow ceph_volume_client to get, put, and
delete objects. The interfaces would allow OpenStack Manila's
cephfs driver to store config data in a shared storage to implement
highly available Manila deployments. Restrict write (put) and
read (get) object sizes to the 'osd_max_size' config setting.

Signed-off-by: Ramana Raja <rraja@redhat.com>
2017-09-22 16:24:38 +05:30
Patrick Donnelly
8a535d9c72
qa: get config only on running MDS
Fixes: http://tracker.ceph.com/issues/21466

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-09-20 15:47:28 -07:00
Sage Weil
6767f841e5 Merge pull request #17427 from liewegas/wip-pg-num-limits
mon/OSDMonitor: implement cluster pg limit

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2017-09-19 12:57:10 -05:00
Patrick Donnelly
3c727d9a36
Merge PR #17701 into master
* refs/remotes/upstream/pull/17701/head:
	qa/cephfs: Fix error in test_filtered_df

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2017-09-15 14:12:35 -07:00
Patrick Donnelly
8a54e101e5
Merge PR #17694 into master
* refs/remotes/upstream/pull/17694/head:
	qa/cephfs: kill mount if it gets evicted by mds
	qa/cephfs: fix test_evict_client

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2017-09-15 14:12:33 -07:00
Sage Weil
986b86fbeb mon: rename mon_pg_warn_max_per_osd -> mon_max_pg_per_osd
Signed-off-by: Sage Weil <sage@redhat.com>
2017-09-14 16:00:31 -04:00
Patrick Donnelly
d929dae49b
Merge PR #17657 into master
* refs/remotes/upstream/pull/17657/head:
	mds: optimize MDCache::rejoin_scour_survivor_replicas()
	mds: fix MDSCacheObject::clear_replica_map
	mds: support limiting cache by memory
	common: refactor of lru
	mds: resolve unsigned coercion compiler warning
	common: use safer uint64_t for list size
	common: add bytes2str pretty print function
	mds: check if waiting is allocated before use
	mds: go back to compact_map for replicas
	mds: use mempool for cache objects
	mds: cleanup replica_map access
	common: add alloc_ptr smart pointer
	common: add warning on base class use of mempool
	common: use atomic uin64_t for counter

Reviewed-by: Zheng Yan <zyan@redhat.com>
2017-09-13 20:08:51 -07:00
Douglas Fuller
b059cb6290 qa/cephfs: Fix error in test_filtered_df
ceph df accounts for pool size, so there is no need to do it in the test.

Fixes: http://tracker.ceph.com/issues/21381
Signed-off-by: Douglas Fuller <dfuller@redhat.com>
2017-09-13 14:02:24 -04:00
Yan, Zheng
98d86a0752 qa/cephfs: kill mount if it gets evicted by mds
Otherwise, teardown() hangs at umount.

Fixes: http://tracker.ceph.com/issues/21275
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2017-09-13 21:30:51 +08:00
Yan, Zheng
8433ced847 qa/cephfs: fix test_evict_client
Executing mount_a.kill() twice, then executing mount_b.kill_cleanup()
twice, does not make sense.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2017-09-13 16:17:42 +08:00
Patrick Donnelly
06c94de584
mds: support limiting cache by memory
This introduces two config parameters:

    mds_cache_memory_limit: Sets the soft maximum of the cache to the given
    byte count. (Like mds_cache_size, this doesn't actually limit the maximum
    size of the cache. It just dictates the steady-state size.)

    mds_cache_reservation: This replaces mds_health_cache_threshold everywhere
    except the Beacon heartbeat sent to the mons. The idea here is to specify a
    reservation of memory (5% by default) for operations and the MDS tries to
    always maintain that reservation. So, the MDS will recall caps from clients
    when it begins dipping into its reservation of memory.

mds_cache_size still limits the cache by Inode count but is now by-default 0
(i.e. unlimited). The new preferred way of specifying cache limits is by memory
size. The default is 1GB.

Fixes: http://tracker.ceph.com/issues/20594
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1464976

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-09-12 20:02:41 -07:00
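
One way to read the reservation described above is as a threshold below the memory limit: once cache usage crosses it, the MDS starts recalling caps. The sketch below is an interpretation under that assumption, not the MDS implementation; the function name should_recall_caps is hypothetical, and the defaults (1GB limit, 5% reservation) are the ones stated in the commit message.

```python
GIB = 1 << 30
MIB = 1 << 20


def should_recall_caps(cache_bytes, memory_limit=GIB, reservation=0.05):
    """Return True once cache usage dips into the reserved memory.

    Hypothetical sketch: the reserved region is taken to be the top
    `reservation` fraction of `memory_limit`.
    """
    threshold = memory_limit * (1.0 - reservation)
    return cache_bytes > threshold


print(should_recall_caps(900 * MIB))
print(should_recall_caps(1000 * MIB))
```

Note that, as the commit message says, the limit is soft: nothing in this sketch prevents the cache from growing past memory_limit.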
Patrick Donnelly
b4f962a486
qa: log ceph-fuse kill/cleanup
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-09-06 13:40:11 -07:00
Douglas Fuller
6af2ae80d3 qa/cephfs: test CephFS recovery pools
Test recovering metadata into a separate RADOS pool with
cephfs_data_scan and friends.

Signed-off-by: Douglas Fuller <dfuller@redhat.com>
2017-08-30 09:02:44 -04:00
Douglas Fuller
8f9a252020 qa/cephfs: support CephFS recovery pools
Add support for testing recovery of CephFS metadata into an alternate
RADOS pool, useful as a disaster recovery mechanism that avoids
modifying the metadata in-place.

Signed-off-by: Douglas Fuller <dfuller@redhat.com>
2017-08-30 09:02:44 -04:00
Douglas Fuller
c85562c94a qa/ceph_test_case: support CephFS recovery pools
Add support for testing recovery of CephFS metadata into an alternate
RADOS pool, useful as a disaster recovery mechanism that avoids
modifying the metadata in-place.

Signed-off-by: Douglas Fuller <dfuller@redhat.com>
2017-08-30 09:02:44 -04:00
Douglas Fuller
5fafc03cb9 qa/cephfs: Allow deferred fs creation
Permit Filesystem objects to be created and settings modified before
calling Filesystem.create().

Signed-off-by: Douglas Fuller <dfuller@redhat.com>
2017-08-30 09:02:44 -04:00
Douglas Fuller
47318f8ac4 qa/cephfs: Refactor alternate pool test
Remove the alternate pool recovery test from test_data_scan. Newer
commits will place the test in its own file.

Signed-off-by: Douglas Fuller <dfuller@redhat.com>
2017-08-30 09:02:44 -04:00
Greg Farnum
c85af7b146 qa: test that "fs new" correctly set the application_metadata
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
2017-08-10 11:09:38 -07:00
Patrick Donnelly
eabe662614
Merge PR #16378 into master
* refs/remotes/upstream/pull/16378/head:
	doc: remove accidental additions to release notes
	qa/cephfs: Fix race in test_volume_client
	qa/cephfs: Test filtered df
	PendingReleaseNotes: add note about df filtering
	client: Support new, filtered MStatfs
	objecter: Support new, filtered MStatfs
	mon/PGMap stats: Support new, filtered MStatfs
	messages: Add optional data pool to MStatfs

Reviewed-by: John Spray <john.spray@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2017-08-08 09:33:52 -07:00
Douglas Fuller
552225f329 qa/cephfs: Fix race in test_volume_client
Signed-off-by: Douglas Fuller <dfuller@redhat.com>
2017-08-04 14:38:50 -04:00
Patrick Donnelly
66756c4f65
Merge PR #16292 into master
* refs/remotes/upstream/pull/16292/head:
	qa: use new hex rep of inode
	qa: fix whitelist error message
	mds: refine "Scrub error" cluster log message
	mds: polish clog messages
	doc: developer logging guidance

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2017-08-03 13:55:21 -07:00
Douglas Fuller
b9d11af92b qa/cephfs: Test filtered df
Add a test for filtered df for file systems with single data pools.

Signed-off-by: Douglas Fuller <dfuller@redhat.com>
2017-08-03 14:11:47 -04:00
Patrick Donnelly
8d33cbbf5c
qa: use new hex rep of inode
Resolves a failure from QA:

    2017-08-02T19:23:27.489 INFO:tasks.cephfs_test_runner:======================================================================
    2017-08-02T19:23:27.489 INFO:tasks.cephfs_test_runner:FAIL: test_oversize (tasks.cephfs.test_fragment.TestFragmentation)
    2017-08-02T19:23:27.489 INFO:tasks.cephfs_test_runner:----------------------------------------------------------------------
    2017-08-02T19:23:27.490 INFO:tasks.cephfs_test_runner:Traceback (most recent call last):
    2017-08-02T19:23:27.490 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/git.ceph.com_ceph-c_wip-pdonnell-testing-20170802/qa/tasks/cephfs/test_fragment.py", line 71, in test_oversize
    2017-08-02T19:23:27.490 INFO:tasks.cephfs_test_runner:    self.assertEqual(frags[0]['dirfrag'], "10000000000.0*")
    2017-08-02T19:23:27.490 INFO:tasks.cephfs_test_runner:AssertionError: u'0x10000000000.0*' != '10000000000.0*'
    2017-08-02T19:23:27.490 INFO:tasks.cephfs_test_runner:

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-08-02 21:39:48 -07:00
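
The assertion failure quoted above comes from the dirfrag name gaining a "0x" prefix on the inode number. A small sketch of building the expected string in the new form, assuming a hypothetical helper dirfrag_name (the "0*" fragment suffix is copied from the log, not derived):

```python
def dirfrag_name(ino, frag="0*"):
    """Format a dirfrag name with the inode in 0x-prefixed hex.

    Hypothetical helper for illustration; `frag` is passed through as-is.
    """
    return "{:#x}.{}".format(ino, frag)


print(dirfrag_name(0x10000000000))
```

A test written this way compares against "0x10000000000.0*" rather than the old "10000000000.0*".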
Patrick Donnelly
8db2c43e79
qa: test export_pin is correct in dumped subtree
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-07-31 15:33:49 -07:00
Patrick Donnelly
019f20ff98
Merge PR #16640 into master
* refs/remotes/upstream/pull/16640/head:
	qa: fix wait for wrong health message

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2017-07-28 09:55:49 -07:00