20 Commits

Author SHA1 Message Date
Patrick Donnelly
b810bc9c54
doc/cephfs: document MDS_CLIENTS_BROKEN_ROOTSQUASH health error
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2024-05-07 08:19:28 -04:00
Dhairya Parmar
7c8e794414 doc/cephfs: document MDS_CLIENTS_LAGGY health warning
Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
2023-05-17 14:39:42 +05:30
Venky Shankar
d01acd531a mds: record and dump last tid for trimming completed requests (or flushes)
CephFS clients include `oldest_tid` as part of the client request
to the MDS. This field is the tid of the oldest incomplete MDS
request (excluding setfilelock requests). The MDS uses this to
trim completed requests (and flushes). In one case, a Ceph
cluster had an extremely high completed-requests count, meaning
that for some reason the client was not advancing its `oldest_tid`
field even though the MDS had successfully "safe replied" the
request back to the client.

This change adds a debug aid for recording and dumping this
field. It might be possible to fetch this from clients (if
not, we should add that!), but it makes sense to have this
information available from the MDS.
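
The trimming behaviour described above can be illustrated with a minimal
Python sketch. It is not the MDS implementation: the `Session` class, its
field names, and the "strictly older than `oldest_tid`" rule are assumptions
made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """Hypothetical per-client session state (illustration only)."""

    # tids the MDS has already "safe replied" to the client
    completed: set = field(default_factory=set)
    # debug aid: the last oldest_tid value that was used for trimming
    last_trim_tid: int = 0

    def mark_completed(self, tid: int) -> None:
        self.completed.add(tid)

    def trim(self, oldest_tid: int) -> int:
        """Drop completed tids older than the client's oldest_tid.

        Assumption: only tids strictly below oldest_tid are trimmed.
        Returns the number of entries removed.
        """
        stale = {tid for tid in self.completed if tid < oldest_tid}
        self.completed -= stale
        self.last_trim_tid = oldest_tid
        return len(stale)

    def dump(self) -> dict:
        """Expose the counters an operator might want to inspect."""
        return {
            "num_completed_requests": len(self.completed),
            "last_trim_tid": self.last_trim_tid,
        }


session = Session()
for tid in range(1, 6):
    session.mark_completed(tid)

# The client reports tid 4 as its oldest incomplete request.
trimmed = session.trim(oldest_tid=4)
state = session.dump()
```

If a client never advances `oldest_tid`, `trim` removes nothing and the
completed set keeps growing, which is the symptom the dumped value is meant
to help diagnose.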

Partially-Fixes: http://tracker.ceph.com/issues/57985
Signed-off-by: Venky Shankar <vshankar@redhat.com>
2023-01-17 17:28:22 +05:30
Patrick Donnelly
61de84a522
Merge PR #44315 into master
* refs/pull/44315/head:
	doc/cephfs: mds default cache memory limit is now 4GB

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
2021-12-20 14:41:43 -05:00
Dimitri Papadopoulos
7677651618
doc,man: typos found by codespell
Signed-off-by: Dimitri Papadopoulos <3234522+DimitriPapadopoulos@users.noreply.github.com>
2021-12-15 12:04:36 +01:00
wangxinyu
8dcbb2e500 doc/cephfs: mds default cache memory limit is now 4GB
MDS default cache memory limit is now 4GB.

Signed-off-by: wangxinyu <wangxinyu@inspur.com>
2021-12-15 14:18:41 +08:00
Venky Shankar
3d97d6d98f doc / cephfs: health message codes should be permalinks
... so that such links can be included in alert warnings.

Additionally, document some other health warnings. Credit to @pcuzner
for pointing out that not all health warnings have been documented.

Signed-off-by: Venky Shankar <vshankar@redhat.com>
2021-10-14 10:21:07 +05:30
haoyixing
fb0a650ba2 doc/cephfs/health-messages: add dot between mds identifier and its name
After PR #37608, the `ceph health detail` output message
changed when an MDS has slow requests, so update the
doc to match the output.

Signed-off-by: haoyixing <haoyixing@kuaishou.com>
2020-11-18 09:42:48 +08:00
Kefu Chai
1a0c45148b doc/cephfs: reformat the health checks
otherwise the "Message" and "Code" of each check are cluttered in the
same paragraph.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2020-11-03 13:02:27 +08:00
Paul Emmerich
d905678a87 mds: make threshold for MDS_TRIM configurable
Fixes: https://tracker.ceph.com/issues/45906
Signed-off-by: Paul Emmerich <paul.emmerich@croit.io>
2020-06-30 18:26:37 +02:00
Ramana Raja
aeaef1b4c5 mds: obsoleting 'mds_cache_size'
Remove last bits of support for 'mds_cache_size'.
'mds_cache_memory_limit' is preferred.

Fixes: https://tracker.ceph.com/issues/41951
Signed-off-by: Ramana Raja <rraja@redhat.com>
2019-12-02 14:51:25 +05:30
Patrick Donnelly
e7a7cf429e
doc: filesystem to file system
"Filesystem" is not a word (although fairly common in use).

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2019-09-10 08:43:28 -07:00
Patrick Donnelly
35412684b6
doc: update doc on new recall config
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2019-02-05 10:08:15 -08:00
ashitakasam
a2b14a741c doc: 'daemon' is misspelled in doc/cephfs/health-messages.rst and src/tools/rbd_recover_tool/README
Signed-off-by: ashitakasam <694240887@qq.com>
2018-03-19 10:21:38 +08:00
Patrick Donnelly
67ca6cd229
mds: obsolete MDSMap option configs
These configs were used for initialization but it is more appropriate to
require setting these file system attributes via `ceph fs set`. This is similar
to what was already done with max_mds. There are new variables added for `fs
set` where missing.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-12-13 18:30:52 -08:00
Jeff Layton
3321cc7b37 mds: fold mds_revoke_cap_timeout into mds_session_timeout
Right now, we have two different timeout settings -- one for when the
client is just not responding at all (mds_session_timeout), and one for
when the client is otherwise responding but isn't returning caps in a
timely fashion (mds_cap_revoke_timeout).

The default settings on them are equivalent (60s), but only the
mds_session_timeout is communicated via the mdsmap. The
mds_cap_revoke_timeout is known only to the MDS. Neither timeout results
in anything other than warnings in the current codebase.

There is also a third setting (mds_session_autoclose) that is also
communicated via the MDSmap. Exceeding that value (default of 300s)
could eventually result in the client being blacklisted from the
cluster. The code to implement that doesn't exist yet, however.

The current codebase doesn't do any real sanity checking of these
timeouts, so the potential for admins to get them wrong is rather high.
It's hard to concoct a use-case where we'd want to warn about these
events at different intervals.

Simplify this by just removing the mds_cap_revoke_timeout setting, and
replace its use in the code with the mds_session_timeout. With that, the
client can at least determine when warnings might start showing up in
the MDS' logs.
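
As a rough illustration of the simplified model, the Python sketch below
classifies a client by how long it has been unresponsive. The function name,
the returned labels, and the sanity check are hypothetical; the defaults of
60s and 300s are taken from the message above, and the autoclose handling is
only a placeholder because that code did not exist at the time.

```python
def classify_client(idle_seconds: float,
                    session_timeout: float = 60.0,
                    session_autoclose: float = 300.0) -> str:
    """Return a label for a client that has been idle for idle_seconds.

    Hypothetical helper: both thresholds are passed in explicitly so the
    relationship between them can be sanity-checked in one place.
    """
    # Assumed sanity check: warnings should start before autoclose applies.
    if session_timeout <= 0 or session_timeout > session_autoclose:
        raise ValueError("expected 0 < session_timeout <= session_autoclose")

    if idle_seconds >= session_autoclose:
        return "autoclose-candidate"
    if idle_seconds >= session_timeout:
        return "warn"
    return "ok"


labels = [classify_client(s) for s in (10, 60, 299, 300)]
```

With a single warning threshold, the unresponsive-client case and the
slow-cap-return case share one interval, so there is one fewer value for an
administrator to misconfigure.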

Signed-off-by: Jeff Layton <jlayton@redhat.com>
2017-11-14 07:27:01 -05:00
Patrick Donnelly
06c94de584
mds: support limiting cache by memory
This introduces two config parameters:

    mds_cache_memory_limit: Sets the soft maximum of the cache to the given
    byte count. (Like mds_cache_size, this doesn't actually limit the maximum
    size of the cache. It just dictates the steady-state size.)

    mds_cache_reservation: This replaces mds_health_cache_threshold everywhere
    except the Beacon heartbeat sent to the mons. The idea here is to specify a
    reservation of memory (5% by default) for operations and the MDS tries to
    always maintain that reservation. So, the MDS will recall caps from clients
    when it begins dipping into its reservation of memory.

mds_cache_size still limits the cache by inode count but now defaults to 0
(i.e. unlimited). The new preferred way of specifying cache limits is by memory
size. The default is 1GB.
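
To make the reservation arithmetic concrete, here is a small Python sketch
using the defaults stated above (a 1GB limit and a 5% reservation). The
helper names and the exact comparison are assumptions for illustration; the
MDS's own accounting may differ.

```python
GIB = 1 << 30  # bytes in 1 GiB


def recall_threshold(memory_limit: int, reservation: float) -> int:
    """Bytes of cache the MDS can use before dipping into its reservation."""
    if not 0.0 <= reservation < 1.0:
        raise ValueError("reservation must be a fraction in [0, 1)")
    return int(memory_limit * (1.0 - reservation))


def should_recall_caps(cache_bytes: int,
                       memory_limit: int = GIB,
                       reservation: float = 0.05) -> bool:
    """Assumed rule: recall caps once usage exceeds the unreserved part."""
    return cache_bytes > recall_threshold(memory_limit, reservation)


threshold = recall_threshold(GIB, 0.05)
below = should_recall_caps(900 * (1 << 20))   # 900 MiB in use
above = should_recall_caps(1000 * (1 << 20))  # 1000 MiB in use
```

Under these assumptions the MDS would start recalling caps at roughly 973 MiB
of cache usage, leaving about 51 MiB of headroom for in-flight operations.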

Fixes: http://tracker.ceph.com/issues/20594
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1464976

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-09-12 20:02:41 -07:00
Alfredo Deza
d8b287011c doc/cephfs add label to health messages for use in refs linking
Signed-off-by: Alfredo Deza <adeza@redhat.com>
2017-08-16 08:20:00 -04:00
Patrick Donnelly
7278543d74
mds: warn if insufficient standbys exist
Fixes: http://tracker.ceph.com/issues/17604

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-02-28 14:57:59 -05:00
John Spray
ef1405ab19 doc/cephfs: explain the various health messages
Signed-off-by: John Spray <john.spray@redhat.com>
2016-07-21 12:32:05 +01:00