77842 Commits

Author SHA1 Message Date
Patrick Donnelly
06c94de584
mds: support limiting cache by memory
This introduces two config parameters:

    mds_cache_memory_limit: Sets the soft maximum of the cache to the given
    byte count. (Like mds_cache_size, this doesn't actually limit the maximum
    size of the cache. It just dictates the steady-state size.)

    mds_cache_reservation: This replaces mds_health_cache_threshold everywhere
    except the Beacon heartbeat sent to the mons. The idea here is to specify a
    reservation of memory (5% by default) for operations and the MDS tries to
    always maintain that reservation. So, the MDS will recall caps from clients
    when it begins dipping into its reservation of memory.

mds_cache_size still limits the cache by inode count but now defaults to 0
(i.e. unlimited). The new preferred way of specifying cache limits is by memory
size. The default memory limit is 1GB.
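
As a rough illustration of how a soft memory limit and a reservation could interact, here is a minimal sketch. The struct and method names (CacheLimits, reservation_threshold, should_recall_caps, over_limit) are hypothetical and are not taken from the MDS source; only the two config names and the 5% and 1GB defaults come from the message above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration only; these names are not from the Ceph source.
struct CacheLimits {
  std::uint64_t memory_limit;  // soft limit in bytes (cf. mds_cache_memory_limit)
  double reservation;          // fraction kept free (cf. mds_cache_reservation)

  // Bytes the cache may use before it starts eating into the reservation.
  std::uint64_t reservation_threshold() const {
    assert(reservation >= 0.0 && reservation <= 1.0);
    return static_cast<std::uint64_t>(memory_limit * (1.0 - reservation));
  }

  // True once usage dips into the reserved headroom. Per the commit message,
  // this is the point at which the MDS recalls caps from clients.
  bool should_recall_caps(std::uint64_t used_bytes) const {
    return used_bytes > reservation_threshold();
  }

  // True when usage exceeds the soft limit itself. The limit is soft, so
  // usage can exceed it; it only describes the steady-state target.
  bool over_limit(std::uint64_t used_bytes) const {
    return used_bytes > memory_limit;
  }
};
```

With a 1GB limit and a 5% reservation, this sketch would start recalling caps somewhat below 1GB of cache usage, before the soft limit itself is reached.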

Fixes: http://tracker.ceph.com/issues/20594
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1464976

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-09-12 20:02:41 -07:00
Patrick Donnelly
12d615b3c5
common: refactor of lru
Avoids an unnecessary "max" size of the LRU which was used to calculate the
midpoint. Instead, just dynamically move the LRUObjects between top and bottom
on-the-fly.

This change is necessary for a cache which does not limit by the number
of objects but by some other metric. (In this case, memory.)
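
A minimal sketch of the idea, assuming a two-list LRU whose midpoint is a fraction of the current element count rather than of a configured maximum. The class MidpointLRU and its methods are hypothetical and simplified (no pinning, string payloads); this is not the Ceph LRU class.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>

// Hypothetical sketch; not the Ceph LRU implementation.
class MidpointLRU {
public:
  explicit MidpointLRU(double midpoint = 0.5) : midpoint_(midpoint) {
    assert(midpoint_ >= 0.0 && midpoint_ <= 1.0);
  }

  // New items enter at the head of the top (hot) list.
  void insert(const std::string& item) {
    top_.push_front(item);
    adjust();
  }

  // Evict the coldest item: the tail of bottom, or the tail of top if
  // bottom is empty. Returns false when the LRU holds nothing.
  bool expire(std::string* out) {
    std::list<std::string>& src = bottom_.empty() ? top_ : bottom_;
    if (src.empty()) return false;
    *out = src.back();
    src.pop_back();
    adjust();
    return true;
  }

  std::size_t top_size() const { return top_.size(); }
  std::size_t bottom_size() const { return bottom_.size(); }

private:
  // Rebalance using only the current element count; no preconfigured max.
  void adjust() {
    const std::size_t total = top_.size() + bottom_.size();
    const std::size_t want_top = static_cast<std::size_t>(total * midpoint_);
    while (top_.size() > want_top) {
      bottom_.push_front(top_.back());  // demote the coldest hot item
      top_.pop_back();
    }
  }

  double midpoint_;
  std::list<std::string> top_;
  std::list<std::string> bottom_;
};
```

With a midpoint of 0.5, inserting four items leaves two in each list. Because the split depends only on the current size, the same structure works whether eviction is driven by object count or by memory usage.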

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-09-12 15:46:24 -07:00
Patrick Donnelly
0c2032c287
mds: resolve unsigned coercion compiler warning
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-09-12 15:46:24 -07:00
Patrick Donnelly
0ddd260a32
common: use safer uint64_t for list size
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-09-12 15:46:23 -07:00
Patrick Donnelly
7fff24e10e
common: add bytes2str pretty print function
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-09-12 15:46:23 -07:00
Patrick Donnelly
055020ce80
mds: check if waiting is allocated before use
This prevents accidental allocation of the map.

Also, privatize the variable to protect from this in child classes.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-09-12 15:46:23 -07:00
Patrick Donnelly
5d67b5cc57
mds: go back to compact_map for replicas
Zheng observed that an alloc_ptr doesn't really work in this case since any
call to get_replicas() will cause the map to be allocated, nullifying the
benefit. Use a compact_map until a better solution can be written. (This means
that the map will be allocated outside the mempool.)

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-09-12 15:46:23 -07:00
Patrick Donnelly
e035b64fcb
mds: use mempool for cache objects
The purpose of this is to allow us to track memory usage by cached objects so
we can limit cache size based on memory available/allocated to the MDS.

This commit is a first step: it adds CInode, CDir, and CDentry to the mempool
but not all of the containers in these classes (e.g. std::map). However,
MDSCacheObject has been changed to allocate its containers through the mempool
by converting compact_* containers to the std versions offered through mempool
via the new alloc_ptr.

(A compact_* class simply wraps a pointer to the std:: version to reduce memory
usage of an object when the container is only occasionally used. The alloc_ptr
allows us to achieve the same thing explicitly with only a little handholding:
when all entries in the wrapped container are deleted, the caller must call
alloc_ptr.release().)
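
To illustrate what tracking memory through an allocator can look like, here is a minimal sketch of a standard-conforming allocator that charges every allocation to a shared byte counter. The names pool_bytes, counting_allocator, and tracked_map are hypothetical; Ceph's mempool has its own interface, which is not reproduced here.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <new>
#include <utility>

// Hypothetical byte counter shared by every allocation charged to the pool.
inline std::atomic<std::uint64_t>& pool_bytes() {
  static std::atomic<std::uint64_t> bytes{0};
  return bytes;
}

// Minimal allocator that adds to the counter on allocate and subtracts on
// deallocate. Illustration only; not the mempool allocator.
template <typename T>
struct counting_allocator {
  using value_type = T;

  counting_allocator() noexcept = default;
  template <typename U>
  counting_allocator(const counting_allocator<U>&) noexcept {}

  T* allocate(std::size_t n) {
    pool_bytes() += n * sizeof(T);
    return static_cast<T*>(::operator new(n * sizeof(T)));
  }

  void deallocate(T* p, std::size_t n) noexcept {
    const std::uint64_t bytes = n * sizeof(T);
    assert(pool_bytes().load() >= bytes);  // guard against counter underflow
    pool_bytes() -= bytes;
    ::operator delete(p);
  }
};

template <typename T, typename U>
bool operator==(const counting_allocator<T>&, const counting_allocator<U>&) {
  return true;
}
template <typename T, typename U>
bool operator!=(const counting_allocator<T>&, const counting_allocator<U>&) {
  return false;
}

// A std::map whose nodes are charged to the pool counter.
using tracked_map =
    std::map<int, int, std::less<int>,
             counting_allocator<std::pair<const int, int>>>;
```

A container declared with the plain std::allocator would not show up in the counter, which is why the message distinguishes objects placed in the mempool from the containers inside them.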

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-09-12 15:46:23 -07:00
Patrick Donnelly
d1b6cadd6c
mds: cleanup replica_map access
The gymnastics protecting the map failed as the code evolved. Just expose it
normally with a getter.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-09-12 15:46:22 -07:00
Patrick Donnelly
5fa557d271
common: add alloc_ptr smart pointer
This ptr is like a unique_ptr except it allocates the underlying object on
access. The idea being that we can save memory if the object is only needed
sometimes.
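
A minimal sketch of an allocate-on-access pointer, under the assumption that it wraps a std::unique_ptr and default-constructs the object on first dereference. The class name lazy_ptr is hypothetical, and the real alloc_ptr interface may differ.

```cpp
#include <cassert>
#include <map>
#include <memory>

// Hypothetical sketch of an allocate-on-access pointer; not the real alloc_ptr.
template <typename T>
class lazy_ptr {
public:
  // Dereferencing allocates a default-constructed T if none exists yet.
  T& operator*() {
    ensure();
    return *ptr_;
  }
  T* operator->() {
    ensure();
    return ptr_.get();
  }

  // Lets callers test for an object without triggering an allocation.
  explicit operator bool() const { return static_cast<bool>(ptr_); }

  // Frees the object, e.g. once the wrapped container has become empty.
  void release() { ptr_.reset(); }

private:
  void ensure() {
    if (!ptr_) ptr_.reset(new T());
    assert(ptr_ != nullptr);
  }

  std::unique_ptr<T> ptr_;
};
```

Note that in this sketch any dereference allocates, including a read-only lookup. Callers that only want to inspect the object need to test the pointer first, otherwise the memory saving is lost.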

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-09-12 15:46:12 -07:00
Patrick Donnelly
c0d0fa804e
common: add warning on base class use of mempool
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-09-06 21:22:09 -07:00
Patrick Donnelly
59b5931a2f
common: use atomic uint64_t for counter
Making this interface thread-safe...

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-09-06 21:22:09 -07:00
Patrick Donnelly
c9994788ca
Merge PR #17340 into master
* refs/remotes/upstream/pull/17340/head:
	mds: avoid sending cap import message when inode is frozen
	client: fix message order check in handle_cap_export()

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2017-09-06 21:05:20 -07:00
Patrick Donnelly
d269c851c6
Merge PR #16778 into master
* refs/remotes/upstream/pull/16778/head:
	mds: fix return value of MDCache::dump_cache
	mds: new cap message flags indicate if there is pending capsnap
	mds: properly do null snapflush part2
	mds: track snap inodes through sorted map
	mds: properly drop wrlock when finishing snapflush

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2017-09-06 21:05:18 -07:00
Patrick Donnelly
28c7813f4e
Merge PR #17291 into master
* refs/remotes/upstream/pull/17291/head:
	mds: fix 'dirfrag end' check in Server::handle_client_readdir

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: huanwen ren <ren.huanwen@zte.com.cn>
2017-09-06 14:45:16 -07:00
Patrick Donnelly
00629ad52f
Merge PR #17289 into master
* refs/remotes/upstream/pull/17289/head:
	osd, mds, tools: drop the invalid comment and some unused variables

Reviewed-by: Jos Collin <jcollin@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2017-09-06 14:45:14 -07:00
Patrick Donnelly
8f79a7eccc
Merge PR #17219 into master
* refs/remotes/upstream/pull/17219/head:
	mds: fix StrayManager::truncate()

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Amit Kumar <amitkuma@redhat.com>
2017-09-06 14:45:12 -07:00
Sage Weil
1e272575ad Merge pull request #17505 from liewegas/wip-20910
qa/objectstore/bluestore*: less debug output

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2017-09-06 16:02:01 -05:00
Sage Weil
267750e457 Merge pull request #17459 from xiexingguo/wip-bs-tracked-key
os/bluestore: add bluestore_prefer_deferred_size_hdd/ssd to tracked keys

Reviewed-by: Pan Liu <liupan1111@gmail.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-09-06 15:55:54 -05:00
Sage Weil
32d5722003 Merge pull request #17463 from tchaikov/wip-ceph-tell-mds-star
ceph: fixes for "tell <service>.*" command

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Chang Liu <liuchang0812@gmail.com>
2017-09-06 15:55:25 -05:00
Sage Weil
b647184233 Merge pull request #17503 from liewegas/wip-21250
os/bluestore/BlueFS: prevent _compact_log_async reentry

Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
Reviewed-by: Varada Kari <varada.kari@gmail.com>
2017-09-06 15:52:29 -05:00
Sage Weil
bd52ddd681 Merge pull request #17510 from liewegas/wip-crush-fix-rule-lookup
crush: fix fast rule lookup when uniform

Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-09-06 13:41:55 -05:00
Yuri Weinstein
49d307211a Merge pull request #17356 from shashalu/bucket_link/unlink_olh
rgw: don't write bucket_header when it is not changed in bucket_link/unlink

Reviewed-by: Casey Bodley <cbodley@redhat.com>
2017-09-06 08:50:23 -07:00
Yuri Weinstein
64a445add0 Merge pull request #17434 from iliul/remove-useless-output
rgw: Remove the useless output when list zones

Reviewed-by: Jos Collin <jcollin@redhat.com>
2017-09-06 08:49:27 -07:00
Jos Collin
2e6c65b90a Merge pull request #17518 from wjwithagen/wjw-githubmap
.githubmap: Add wjwithagen as a known Ceph reviewer

Reviewed-by: Jos Collin <jcollin@redhat.com>
2017-09-06 15:14:47 +00:00
Yan, Zheng
f519fca9dd mds: fix return value of MDCache::dump_cache
previous commit "mds: track snap inodes through sorted map" makes
MDCache::dump_cache return 1 on success.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2017-09-06 21:32:43 +08:00
Willem Jan Withagen
7b7fe48f65 .githubmap: Add myself
Signed-off-by: Willem Jan Withagen <wjw@digiware.nl>
2017-09-06 14:18:57 +02:00
Orit Wasserman
f71af81251 Merge pull request #16145 from yehudasa/wip-20234
rgw: add tail tag to track tail instance
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
2017-09-06 10:23:30 +03:00
Mykola Golub
86c6f429fc Merge pull request #17502 from dillaman/wip-21248
librbd: rename of non-existent image results in seg fault

Reviewed-by: Nathan Cutler <ncutler@suse.com>
2017-09-06 10:14:04 +03:00
Mykola Golub
b1e9cabe82 Merge pull request #17375 from liupan1111/wip-final-fix-nbd
rbd-nbd: fix generic option issue

Reviewed-by: Mykola Golub <mgolub@mirantis.com>
2017-09-06 09:27:38 +03:00
Jos Collin
01f29ef615 Merge pull request #17507 from batrick/githubmap-update
githubmap: add some known Ceph reviewers

Reviewed-by: Jos Collin <jcollin@redhat.com>
2017-09-06 04:18:39 +00:00
Patrick Donnelly
73e927293c
Merge PR #17319 into master
* refs/remotes/upstream/pull/17319/head:
	qa: whitelist expected rstat warning
	qa: sync whitelist with fs/basic_functional
	qa: whitelist expected MDS_CACHE_OVERSIZED

Reviewed-by: Zheng Yan <zyan@redhat.com>
2017-09-05 20:49:54 -07:00
Patrick Donnelly
a962708d56
Merge PR #17301 into master
* refs/remotes/upstream/pull/17301/head:
	mds: fix "1 filesystem is have a..." message

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2017-09-05 20:49:19 -07:00
Patrick Donnelly
eaa8c5d6d8
Merge PR #17263 into master
* refs/remotes/upstream/pull/17263/head:
	mds: remove unused method
	mds: move EMetaBlob cons to header

Reviewed-by: Zheng Yan <zyan@redhat.com>
2017-09-05 20:45:35 -07:00
Patrick Donnelly
96db892db2
Merge PR #17178 into master
* refs/remotes/upstream/pull/17178/head:
	ceph-dencoder: simplify decoding/encoding cephfs inode

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2017-09-05 20:44:39 -07:00
Patrick Donnelly
2302b6c521
Merge PR #17095 into master
* refs/remotes/upstream/pull/17095/head:
	client: reset unmounting flag to false when starting a new mount
	client: add mountedness check inside client_lock
	client: rework Client::get_local_osd() return codes
	client: remove misleading comment in get_cap_ref

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Douglas Fuller <dfuller@redhat.com>
2017-09-05 20:44:04 -07:00
Patrick Donnelly
f37f2ea10c
Merge PR #16562 into master
* refs/remotes/upstream/pull/16562/head:
	cephfs/fuse: set big_writes default is false

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2017-09-05 20:43:28 -07:00
Patrick Donnelly
28ca766cd2
Merge PR #16305 into master
* refs/remotes/upstream/pull/16305/head:
	qa/cephfs: test CephFS recovery pools
	qa/cephfs: support CephFS recovery pools
	qa/ceph_test_case: support CephFS recovery pools
	qa/cephfs: Allow deferred fs creation
	qa/cephfs: Refactor alternate pool test

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2017-09-05 20:42:30 -07:00
Patrick Donnelly
4cb459a19d
githubmap: add some known GitHub reviewers
Selection from [1] where the GitHub username is available.

[1] http://pad.ceph.com/p/reviewers

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-09-05 20:29:11 -07:00
Kefu Chai
0717fc3488 Merge pull request #17447 from tchaikov/wip-freebsd-coredump
test/coredumpctl: support freebsd

Reviewed-by: Willem Jan Withagen <wjw@digiware.nl>
2017-09-06 11:16:08 +08:00
Sage Weil
f24095e0e9 crush: fix fast rule lookup when uniform
Older clients will search for the first rule with a matching ruleset,
type, and size.  The has_uniform_rules bool is only set if we have rule
ids and rulesets that line up, but we must also verify that the rest of the
mask matches.  Otherwise we can get a different CRUSH mapping result: when
the mask does not match, old clients will fail to find a rule while we will
find one.  We also can't just check the ruleset, as the legacy clients
find the *first* (of potentially many) matching rules; hence we only do
the fast check if all rulesets == rule id.
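
The relationship between the legacy scan and the fast path can be sketched as follows. This is a simplified model with hypothetical names (Rule, mask_matches, find_rule_legacy, find_rule); it does not use the CRUSH data structures, and it recomputes uniformity on each call where real code would presumably cache it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified rule mask; not the CRUSH structures.
struct Rule {
  int ruleset;
  int type;
  int min_size;
  int max_size;
};

static bool mask_matches(const Rule& r, int ruleset, int type, int size) {
  return r.ruleset == ruleset && r.type == type &&
         r.min_size <= size && size <= r.max_size;
}

// Legacy behaviour: index of the first rule whose whole mask matches, or -1.
int find_rule_legacy(const std::vector<Rule>& rules, int ruleset, int type,
                     int size) {
  for (std::size_t i = 0; i < rules.size(); ++i) {
    if (mask_matches(rules[i], ruleset, type, size)) return static_cast<int>(i);
  }
  return -1;
}

// True when every rule's ruleset equals its own index (its rule id).
bool has_uniform_rules(const std::vector<Rule>& rules) {
  for (std::size_t i = 0; i < rules.size(); ++i) {
    if (rules[i].ruleset != static_cast<int>(i)) return false;
  }
  return true;
}

// Fast path: index directly by ruleset, but still verify type and size so
// that the result agrees with the legacy scan.
int find_rule(const std::vector<Rule>& rules, int ruleset, int type, int size) {
  if (!has_uniform_rules(rules)) {
    return find_rule_legacy(rules, ruleset, type, size);
  }
  if (ruleset < 0 || static_cast<std::size_t>(ruleset) >= rules.size()) {
    return -1;
  }
  assert(rules[ruleset].ruleset == ruleset);  // follows from uniformity
  return mask_matches(rules[ruleset], ruleset, type, size) ? ruleset : -1;
}
```

When rulesets equal rule ids, at most one rule can carry a given ruleset, so checking that single rule's full mask gives the same answer as scanning for the first match.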

Signed-off-by: Sage Weil <sage@redhat.com>
2017-09-05 22:27:05 -04:00
Sage Weil
444f5aa085 qa/objectstore/bluestore*: less debug output
Let's see if this makes the spurious MON_DOWN failures go away?  (See
http://tracker.ceph.com/issues/20910)

Signed-off-by: Sage Weil <sage@redhat.com>
2017-09-05 17:43:28 -04:00
Patrick Donnelly
f0f93c6645
Merge PR #17373 into master
* refs/remotes/upstream/pull/17373/head:
	doc/cephfs: add info on using EC pools with CephFS

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2017-09-05 12:48:41 -07:00
Sage Weil
6bd9db304f os/bluestore/BlueFS: prevent _compact_log_async reentry
_should_compact_log uses new_log != nullptr to tell whether compaction is
already in progress, but we don't set it until we are midway through the
process.  Set it at the top of the method to prevent reentry.

See 455cc6cea2, which failed to implement
this properly.
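
The reentry guard can be sketched in isolation as below. The Compactor class and its method names are hypothetical and single-threaded; the sketch only shows why the in-progress marker has to be set before the rest of the work, and is not the BlueFS code.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the new log object; not a BlueFS type.
struct Log {};

class Compactor {
public:
  // Mirrors the check described above: compaction counts as in progress
  // whenever new_log_ is set, so a new compaction is allowed only if it is
  // still null.
  bool should_compact() const { return new_log_ == nullptr; }

  void begin_compaction() {
    assert(should_compact());   // reentry would trip this
    new_log_.reset(new Log());  // set first, so later checks see it
    // ... the remaining, possibly lengthy, compaction steps would run here ...
  }

  void finish_compaction() { new_log_.reset(); }

private:
  std::unique_ptr<Log> new_log_;
};
```

If the marker were set only midway through begin_compaction, a second caller checking should_compact() during the early steps would still see null and start a second compaction.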

Fixes: http://tracker.ceph.com/issues/21250
Signed-off-by: Sage Weil <sage@redhat.com>
2017-09-05 15:01:59 -04:00
Kefu Chai
dd702cc94e ceph: collect all mds in mdsids()
Otherwise, only the active MDS daemons are returned.

Fixes: http://tracker.ceph.com/issues/21230
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-09-06 01:49:20 +08:00
Kefu Chai
b682e61ddc ceph: always populate targets with ids_by_service()
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-09-06 01:49:20 +08:00
Kefu Chai
25639f6691 ceph: extract ids_by_service() so it can be reused
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-09-06 01:49:20 +08:00
Jason Dillaman
4a75ee43d3 librbd: rename of non-existent image results in seg fault
Fixes: http://tracker.ceph.com/issues/21248
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2017-09-05 12:11:45 -04:00
Casey Bodley
683212ae1f Merge pull request #17141 from theanalyst/doc-civetweb-ports
doc: rgw: mention the civetweb support for binding to multiple ports

Reviewed-by: Casey Bodley <cbodley@redhat.com>
2017-09-05 11:39:12 -04:00
Jason Dillaman
149778edde Merge pull request #17436 from ashishkumsingh/wip-doc-fix-snapshot-flatten-example
doc: Fixes rbd snapshot flatten example

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
2017-09-05 10:13:50 -04:00