Commit Graph

108 Commits

Author SHA1 Message Date
Sage Weil
6767f841e5 Merge pull request #17427 from liewegas/wip-pg-num-limits
mon/OSDMonitor: implement cluster pg limit

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2017-09-19 12:57:10 -05:00
Patrick Donnelly
3c727d9a36
Merge PR #17701 into master
* refs/remotes/upstream/pull/17701/head:
	qa/cephfs: Fix error in test_filtered_df

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2017-09-15 14:12:35 -07:00
Patrick Donnelly
8a54e101e5
Merge PR #17694 into master
* refs/remotes/upstream/pull/17694/head:
	qa/cephfs: kill mount if it gets evicted by mds
	qa/cephfs: fix test_evict_client

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2017-09-15 14:12:33 -07:00
Sage Weil
986b86fbeb mon: rename mon_pg_warn_max_per_osd -> mon_max_pg_per_osd
Signed-off-by: Sage Weil <sage@redhat.com>
2017-09-14 16:00:31 -04:00
Patrick Donnelly
d929dae49b
Merge PR #17657 into master
* refs/remotes/upstream/pull/17657/head:
	mds: optimize MDCache::rejoin_scour_survivor_replicas()
	mds: fix MDSCacheObject::clear_replica_map
	mds: support limiting cache by memory
	common: refactor of lru
	mds: resolve unsigned coercion compiler warning
	common: use safer uint64_t for list size
	common: add bytes2str pretty print function
	mds: check if waiting is allocated before use
	mds: go back to compact_map for replicas
	mds: use mempool for cache objects
	mds: cleanup replica_map access
	common: add alloc_ptr smart pointer
	common: add warning on base class use of mempool
	common: use atomic uint64_t for counter

Reviewed-by: Zheng Yan <zyan@redhat.com>
2017-09-13 20:08:51 -07:00
Douglas Fuller
b059cb6290 qa/cephfs: Fix error in test_filtered_df
ceph df accounts for pool size, so there is no need to do it in the test.

Fixes: http://tracker.ceph.com/issues/21381
Signed-off-by: Douglas Fuller <dfuller@redhat.com>
2017-09-13 14:02:24 -04:00
Yan, Zheng
98d86a0752 qa/cephfs: kill mount if it gets evicted by mds
Otherwise, teardown() hangs at umount.

Fixes: http://tracker.ceph.com/issues/21275
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2017-09-13 21:30:51 +08:00
Yan, Zheng
8433ced847 qa/cephfs: fix test_evict_client
Executing mount_a.kill() twice and then executing mount_b.kill_cleanup()
twice does not make sense.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2017-09-13 16:17:42 +08:00
Patrick Donnelly
06c94de584
mds: support limiting cache by memory
This introduces two config parameters:

    mds_cache_memory_limit: Sets the soft maximum of the cache to the given
    byte count. (Like mds_cache_size, this doesn't actually limit the maximum
    size of the cache. It just dictates the steady-state size.)

    mds_cache_reservation: This replaces mds_health_cache_threshold everywhere
    except the Beacon heartbeat sent to the mons. The idea here is to specify a
    reservation of memory (5% by default) for operations and the MDS tries to
    always maintain that reservation. So, the MDS will recall caps from clients
    when it begins dipping into its reservation of memory.

mds_cache_size still limits the cache by inode count but now defaults to 0
(i.e. unlimited). The new preferred way of specifying cache limits is by memory
size. The default memory limit is 1GB.

Fixes: http://tracker.ceph.com/issues/20594
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1464976

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-09-12 20:02:41 -07:00
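A minimal sketch of how the two parameters in the commit above relate, assuming that "dipping into its reservation" means cache usage has risen above the limit minus the reserved fraction. The function names are hypothetical and this is not the MDS implementation. It also assumes that the 1GB default means 2^30 bytes.

```python
GIB = 1 << 30  # assumption: the "1GB" default is 2**30 bytes


def reservation_bytes(memory_limit, reservation=0.05):
    """Bytes set aside for operations (5% of the limit by default)."""
    return int(memory_limit * reservation)


def dipping_into_reservation(cache_bytes, memory_limit=GIB, reservation=0.05):
    """True once cache usage eats into the reserved part of the limit.

    Hypothetical helper: under the assumption above, this is the point
    at which the MDS would start recalling caps from clients.
    """
    threshold = memory_limit - reservation_bytes(memory_limit, reservation)
    return cache_bytes > threshold


if __name__ == "__main__":
    for used in (GIB // 2, GIB - 1024, GIB):
        print(used, dipping_into_reservation(used))
```

Under these assumptions, a cache at half the limit leaves the reservation untouched, while a cache within a few kilobytes of the limit is already inside it.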
Patrick Donnelly
b4f962a486
qa: log ceph-fuse kill/cleanup
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-09-06 13:40:11 -07:00
Douglas Fuller
6af2ae80d3 qa/cephfs: test CephFS recovery pools
Test recovering metadata into a separate RADOS pool with
cephfs_data_scan and friends.

Signed-off-by: Douglas Fuller <dfuller@redhat.com>
2017-08-30 09:02:44 -04:00
Douglas Fuller
8f9a252020 qa/cephfs: support CephFS recovery pools
Add support for testing recovery of CephFS metadata into an alternate
RADOS pool, useful as a disaster recovery mechanism that avoids
modifying the metadata in-place.

Signed-off-by: Douglas Fuller <dfuller@redhat.com>
2017-08-30 09:02:44 -04:00
Douglas Fuller
c85562c94a qa/ceph_test_case: support CephFS recovery pools
Add support for testing recovery of CephFS metadata into an alternate
RADOS pool, useful as a disaster recovery mechanism that avoids
modifying the metadata in-place.

Signed-off-by: Douglas Fuller <dfuller@redhat.com>
2017-08-30 09:02:44 -04:00
Douglas Fuller
5fafc03cb9 qa/cephfs: Allow deferred fs creation
Permit Filesystem objects to be created and settings modified before
calling Filesystem.create().

Signed-off-by: Douglas Fuller <dfuller@redhat.com>
2017-08-30 09:02:44 -04:00
Douglas Fuller
47318f8ac4 qa/cephfs: Refactor alternate pool test
Remove the alternate pool recovery test from test_data_scan. Newer
commits will place the test in its own file.

Signed-off-by: Douglas Fuller <dfuller@redhat.com>
2017-08-30 09:02:44 -04:00
Greg Farnum
c85af7b146 qa: test that "fs new" correctly sets the application_metadata
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
2017-08-10 11:09:38 -07:00
Patrick Donnelly
eabe662614
Merge PR #16378 into master
* refs/remotes/upstream/pull/16378/head:
	doc: remove accidental additions to release notes
	qa/cephfs: Fix race in test_volume_client
	qa/cephfs: Test filtered df
	PendingReleaseNotes: add note about df filtering
	client: Support new, filtered MStatfs
	objecter: Support new, filtered MStatfs
	mon/PGMap stats: Support new, filtered MStatfs
	messages: Add optional data pool to MStatfs

Reviewed-by: John Spray <john.spray@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2017-08-08 09:33:52 -07:00
Douglas Fuller
552225f329 qa/cephfs: Fix race in test_volume_client
Signed-off-by: Douglas Fuller <dfuller@redhat.com>
2017-08-04 14:38:50 -04:00
Patrick Donnelly
66756c4f65
Merge PR #16292 into master
* refs/remotes/upstream/pull/16292/head:
	qa: use new hex rep of inode
	qa: fix whitelist error message
	mds: refine "Scrub error" cluster log message
	mds: polish clog messages
	doc: developer logging guidance

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2017-08-03 13:55:21 -07:00
Douglas Fuller
b9d11af92b qa/cephfs: Test filtered df
Add a test for filtered df for file systems with single data pools.

Signed-off-by: Douglas Fuller <dfuller@redhat.com>
2017-08-03 14:11:47 -04:00
Patrick Donnelly
8d33cbbf5c
qa: use new hex rep of inode
Resolves a failure from QA:

    2017-08-02T19:23:27.489 INFO:tasks.cephfs_test_runner:======================================================================
    2017-08-02T19:23:27.489 INFO:tasks.cephfs_test_runner:FAIL: test_oversize (tasks.cephfs.test_fragment.TestFragmentation)
    2017-08-02T19:23:27.489 INFO:tasks.cephfs_test_runner:----------------------------------------------------------------------
    2017-08-02T19:23:27.490 INFO:tasks.cephfs_test_runner:Traceback (most recent call last):
    2017-08-02T19:23:27.490 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/git.ceph.com_ceph-c_wip-pdonnell-testing-20170802/qa/tasks/cephfs/test_fragment.py", line 71, in test_oversize
    2017-08-02T19:23:27.490 INFO:tasks.cephfs_test_runner:    self.assertEqual(frags[0]['dirfrag'], "10000000000.0*")
    2017-08-02T19:23:27.490 INFO:tasks.cephfs_test_runner:AssertionError: u'0x10000000000.0*' != '10000000000.0*'
    2017-08-02T19:23:27.490 INFO:tasks.cephfs_test_runner:

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-08-02 21:39:48 -07:00
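The mismatch in the traceback above comes down to the inode number now carrying a 0x prefix. The following is an illustrative sketch only: `dirfrag_name` is a hypothetical helper, not code from the MDS or from test_fragment.py, and it simply shows how the old and new expected strings differ.

```python
def dirfrag_name(ino, frag="0*"):
    """Format a dirfrag label with the inode number in 0x-prefixed hex.

    Hypothetical helper for illustration; the real label is produced by
    the MDS, not by the test.
    """
    return "{:#x}.{}".format(ino, frag)


old_expected = "10000000000.0*"             # string the test used to expect
new_expected = dirfrag_name(0x10000000000)  # string in the new representation

if __name__ == "__main__":
    print(old_expected)
    print(new_expected)
```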
Patrick Donnelly
8db2c43e79
qa: test export_pin is correct in dumped subtree
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-07-31 15:33:49 -07:00
Patrick Donnelly
019f20ff98
Merge PR #16640 into master
* refs/remotes/upstream/pull/16640/head:
	qa: fix wait for wrong health message

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2017-07-28 09:55:49 -07:00
Patrick Donnelly
6fc2ee383f
Merge PR #16413 into master
* refs/remotes/upstream/pull/16413/head:
	qa/cephfs: lsof if umount fails

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2017-07-28 09:55:23 -07:00
Patrick Donnelly
ced01a2335
qa: fix wait for wrong health message
Fixes: http://tracker.ceph.com/issues/20805

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-07-27 14:40:05 -07:00
Patrick Donnelly
9506789ce1
Merge PR 16379 into master
* refs/remotes/upstream/pull/16379/head:
	qa: fix MDS_CLIENT_RECALL copy error

Reviewed-by: Zheng Yan <zyan@redhat.com>
2017-07-21 13:23:07 -07:00
Patrick Donnelly
23e3d40751
Merge PR 16226 into master
* refs/remotes/upstream/pull/16226/head:
	qa: wait for OSDMap to propagate for snap purge

Reviewed-by: Zheng Yan <zyan@redhat.com>
2017-07-21 13:22:47 -07:00
Sage Weil
572a942f8f mon: 'auth list' -> 'auth ls'
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-19 12:33:14 -04:00
Yan, Zheng
b49d6d8ead qa/cephfs: lsof if umount fails
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2017-07-19 15:32:37 +08:00
Patrick Donnelly
f8e0571982
qa: fix MDS_CLIENT_RECALL copy error
Fixes: http://tracker.ceph.com/issues/20682

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-07-18 16:06:20 -07:00
Yan, Zheng
e4844706b0 qa/cephfs: don't use int() to convert a string containing a floating-point number
Fixes: http://tracker.ceph.com/issues/20582
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2017-07-13 15:55:22 +08:00
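For context on the commit above: Python's `int()` raises `ValueError` when given a string such as "12.0". A minimal sketch of one way around that follows, assuming truncation toward zero is acceptable. `parse_whole_number` is a hypothetical name and not the fix applied in the commit.

```python
def parse_whole_number(text):
    """Convert numeric text such as "12" or "12.0" to an int.

    int("12.0") raises ValueError, so fall back to parsing the text as a
    float first and then truncating toward zero.
    """
    try:
        return int(text)
    except ValueError:
        return int(float(text))


if __name__ == "__main__":
    for sample in ("12", "12.0", "3.7"):
        print(sample, parse_whole_number(sample))
```

Going through `float` loses precision for very large values, so parsing with `int()` first keeps exact integers exact.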
Sage Weil
25717f7e84 qa/tasks/ceph_test_case.py: update health check helpers
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-12 12:52:03 -04:00
Patrick Donnelly
62d008436b
qa: wait for OSDMap to propagate for snap purge
Note: unmounting the client is not necessary for purging snapshots.

Fixes: http://tracker.ceph.com/issues/20072

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-07-07 15:12:42 -07:00
Patrick Donnelly
5b87301192
Merge remote-tracking branch 'upstream/pull/15822/head' into master
* upstream/pull/15822/head:
  qa: add timeout/repeat to pool df

Reviewed-by: John Spray <jspray@redhat.com>
2017-07-06 22:14:32 -07:00
Patrick Donnelly
97cdb1e34a
Merge remote-tracking branch 'upstream/pull/15817/head' into master
* upstream/pull/15817/head:
  qa: wait for healthy cluster before testing pins

Reviewed-by: Zheng Yan <zyan@redhat.com>
2017-07-06 21:36:34 -07:00
Patrick Donnelly
2cb42a4dbf
Merge remote-tracking branch 'upstream/pull/13770/head' into master
* upstream/pull/13770/head:
  tasks/cephfs: add TestStrays.test_replicated_delete_speed

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2017-07-06 21:33:03 -07:00
John Spray
623f1240a2 tasks/cephfs: add TestStrays.test_replicated_delete_speed
Reproducer for http://tracker.ceph.com/issues/16914

Signed-off-by: John Spray <john.spray@redhat.com>
2017-06-29 17:21:57 +01:00
Patrick Donnelly
95c0ca6a2b
qa: add timeout/repeat to pool df
Fixes: http://tracker.ceph.com/issues/20212

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-06-27 06:50:15 -07:00
John Spray
b6cfa35458 qa: no longer need to explicitly enable multimds
Signed-off-by: John Spray <john.spray@redhat.com>
2017-06-23 17:07:34 +01:00
John Spray
38dccd2c72 Merge pull request #15548 from ukernel/wip-20196
mds: improvements for stray reintegration

Reviewed-by: John Spray <john.spray@redhat.com>
2017-06-22 06:46:27 -04:00
Patrick Donnelly
d4870a093c
qa: wait for healthy cluster before testing pins
Fixes: http://tracker.ceph.com/issues/20318

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-06-21 13:21:32 -07:00
Yan, Zheng
57e82edc9c qa/cephfs: use ceph.dir.pin to trigger migration
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2017-06-20 17:39:46 +08:00
John Spray
18fbf24c7a Merge pull request #15308 from jcsp/wip-19706
mon: don't kill MDSs unless some beacons are getting through

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2017-06-15 10:50:44 -04:00
Yan, Zheng
5e1d8879ee qa/cephfs: update stray reintegration test case
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2017-06-12 09:46:06 +08:00
John Spray
7e1be30b9a qa: clean up test_exports.py
Mainly just using the setfattr helper
instead of run_shell.

Signed-off-by: John Spray <john.spray@redhat.com>
2017-06-01 07:18:03 -04:00
John Spray
6ef30d1ed3 qa: explicitly set up standby replay in test_journal_migration
Previously this relied on being run in a special cluster configuration
that set up standby replay daemons.  This change will allow it
to live alongside all the 'normal' functional tests.

Signed-off-by: John Spray <john.spray@redhat.com>
2017-06-01 07:18:03 -04:00
John Spray
3326321858 qa: fix daemon restart between tests
Previously, calling mds_stop without mds_fail meant
that if the filesystem creation was not quick, then
we would see those daemons go laggy.  This starts
to trigger failures now that we have cluster log
messages that fire when a daemon gets failed out
due to being laggy.

Signed-off-by: John Spray <john.spray@redhat.com>
2017-05-31 18:00:43 -04:00
Patrick Donnelly
76335b0e0f
qa: improve debug message for subtree wait
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-05-30 09:08:27 -07:00
John Spray
f80e0973f5 Merge pull request #15062 from ukernel/wip-19912
qa/tasks/cephfs: use getattr to guarantee inode is in client cache

Reviewed-by: John Spray <john.spray@redhat.com>
2017-05-25 18:44:54 +01:00
John Spray
ef9d555916 Merge pull request #15105 from ukernel/wip-19892
qa/cephfs: disable mds_bal_frag for TestStrays.test_purge_queue_op_rate

Reviewed-by: John Spray <john.spray@redhat.com>
2017-05-24 16:41:45 +01:00