ceph/doc/cephfs/administration.rst
Ramana Raja 67bb13859a mon/FSCommands: add 'recover' flag in fs new command
Currently, to recover a file system after recovering monitor store, you
need to stop all the MDSs; create FSMap with defaults using `fs new`
command; execute `fs reset` command to get the file system's rank 0 into
existing but failed state; and then restart MDSs.

Add 'recover' flag to the `fs new` command that sets the file system's
rank 0 to existing but failed state, and sets the file system's
'joinable' setting to False. Using the `fs new` command with 'recover'
flag gets rid of the steps to stop all the MDSs and execute `fs reset`
command when recovering the file system after recovering monitor store.

Fixes: https://tracker.ceph.com/issues/51716
Signed-off-by: Ramana Raja <rraja@redhat.com>
2021-09-13 00:15:39 -04:00


.. _cephfs-administration:

CephFS Administrative commands
==============================

File Systems
------------

.. note:: The names of the file systems, metadata pools, and data pools can
          only have characters in the set [a-zA-Z0-9\_-.].

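
The character-set rule in the note can be checked before a name is passed to
any command. The following is a minimal sketch, not part of Ceph: the helper
name ``is_valid_cephfs_name`` is hypothetical, and rejecting the empty string
is an assumption rather than something the note states.

```shell
# Hypothetical helper: exits with status 0 only if the name is non-empty and
# every character is in the set [a-zA-Z0-9_.-].
is_valid_cephfs_name() {
    case "$1" in
        "") return 1 ;;                  # assumption: empty names are rejected
        *[!a-zA-Z0-9_.-]*) return 1 ;;   # a character outside the allowed set
        *) return 0 ;;
    esac
}

for name in cephfs_a backup-01.meta "bad name" "data/pool"; do
    if is_valid_cephfs_name "$name"; then
        echo "ok:      $name"
    else
        echo "invalid: $name"
    fi
done
```

The same check applies to file system, metadata pool, and data pool names.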
These commands operate on the CephFS file systems in your Ceph cluster.
Note that by default only one file system is permitted: to enable
creation of multiple file systems use ``ceph fs flag set enable_multiple true``.

::

    fs new <file system name> <metadata pool name> <data pool name>

This command creates a new file system. The file system name and metadata pool
name are self-explanatory. The specified data pool is the default data pool and
cannot be changed once set. Each file system has its own set of MDS daemons
assigned to ranks, so ensure that you have sufficient standby daemons available
to accommodate the new file system.
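
As an illustration, the sequence for adding a second file system might look
like the sketch below. It only prints the commands so that they can be
reviewed first. The function name and all file system and pool names are
placeholders, and ``ceph osd pool create`` is assumed as the way the two pools
are created beforehand. Depending on the release, setting ``enable_multiple``
may also ask for a confirmation string.

```shell
# Hypothetical helper: prints (does not run) the commands that would create
# a new file system together with its two pools.
print_fs_create_commands() {
    fs_name=$1
    metadata_pool=$2
    data_pool=$3
    echo "ceph fs flag set enable_multiple true"
    echo "ceph osd pool create ${metadata_pool}"
    echo "ceph osd pool create ${data_pool}"
    echo "ceph fs new ${fs_name} ${metadata_pool} ${data_pool}"
}

# Placeholder names, chosen for illustration only.
print_fs_create_commands cephfs_b cephfs_b_metadata cephfs_b_data
```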

::

    fs ls

List all file systems by name.

::

    fs lsflags <file system name>

List all the flags set on a file system.

::

    fs dump [epoch]

This dumps the FSMap at the given epoch (default: current) which includes all
file system settings, MDS daemons and the ranks they hold, and the list of
standby MDS daemons.

::

    fs rm <file system name> [--yes-i-really-mean-it]

Destroy a CephFS file system. This wipes information about the state of the
file system from the FSMap. The metadata pool and data pools are untouched and
must be destroyed separately.

::

    fs get <file system name>

Get information about the named file system, including settings and ranks. This
is a subset of the same information from the ``fs dump`` command.

::

    fs set <file system name> <var> <val>

Change a setting on a file system. These settings are specific to the named
file system and do not affect other file systems.

::

    fs add_data_pool <file system name> <pool name/id>

Add a data pool to the file system. This pool can be used for file layouts
as an alternate location to store file data.

::

    fs rm_data_pool <file system name> <pool name/id>

This command removes the specified pool from the list of data pools for the
file system. If any files have layouts for the removed data pool, the file
data will become unavailable. The default data pool (the one specified when
the file system was created) cannot be removed.

::

    fs rename <file system name> <new file system name> [--yes-i-really-mean-it]

Rename a Ceph file system. This also changes the application tags on the data
pools and metadata pool of the file system to the new file system name.
The CephX IDs authorized to the old file system name need to be reauthorized
to the new name. Any ongoing operations of the clients using these IDs may be
disrupted. Mirroring is expected to be disabled on the file system.

Settings
--------

::

    fs set <fs name> max_file_size <size in bytes>

CephFS has a configurable maximum file size, which is 1TB by default.
You may wish to set this limit higher if you expect to store large files
in CephFS. It is a 64-bit field.

Setting ``max_file_size`` to 0 does not disable the limit. It simply
limits clients to creating only empty files.
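
Because the value is given in bytes, it can help to compute it rather than
type it by hand. The sketch below assumes that the 1TB default denotes
2\ :sup:`40` bytes. The helper name ``tib_to_bytes`` and the file system name
``cephfs`` are placeholders. The final command is only printed, not run.

```shell
# Hypothetical helper: converts a whole number of TiB (2^40 bytes) to bytes.
tib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 * 1024 ))
}

fs_name="cephfs"   # placeholder file system name

# Print the command that would raise the limit to 4 TiB.
echo "ceph fs set ${fs_name} max_file_size $(tib_to_bytes 4)"
```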

Maximum file sizes and performance
----------------------------------

CephFS enforces the maximum file size limit at the point of appending to
files or setting their size. It does not affect how anything is stored.

When users create a file of an enormous size (without necessarily
writing any data to it), some operations (such as deletes) cause the MDS
to have to do a large number of operations to check whether any of the RADOS
objects within the range that could exist (according to the file size)
really exist.

The ``max_file_size`` setting prevents users from creating files that
appear to be, e.g., exabytes in size, which would cause load on the MDS as it
tries to enumerate the objects during operations like stats or deletes.
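
To get a feel for the scale involved, the number of objects that could exist
for a given apparent file size can be estimated. This is a rough sketch only:
it assumes a simple layout in which a file is cut into fixed-size objects of
4 MiB (the object size is an assumption here and is configurable per file
layout), and it ignores striping. The helper name is hypothetical.

```shell
# Hypothetical helper: upper bound on the number of objects for a file of
# the given apparent size, assuming fixed-size objects (default 4 MiB).
max_objects_for_size() {
    size_bytes=$1
    object_size=${2:-4194304}   # assumption: 4 MiB objects
    # Ceiling division: a partially filled last object still counts.
    echo $(( (size_bytes + object_size - 1) / object_size ))
}

max_objects_for_size $(( 1 << 40 ))   # apparent size of 1 TiB
max_objects_for_size $(( 1 << 60 ))   # apparent size of 1 EiB
```

Under these assumptions a 1 TiB file spans at most 262144 objects, while a
file that appears to be 1 EiB spans roughly a million times as many.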

Taking the cluster down
-----------------------

Taking a CephFS cluster down is done by setting the down flag:

::

    fs set <fs_name> down true

To bring the cluster back online:

::

    fs set <fs_name> down false

This will also restore the previous value of max_mds. MDS daemons are brought
down in a way such that journals are flushed to the metadata pool and all
client I/O is stopped.

Taking the cluster down rapidly for deletion or disaster recovery
-----------------------------------------------------------------

To allow rapidly deleting a file system (for testing) or to quickly bring the
file system and MDS daemons down, use the ``fs fail`` command:

::

    fs fail <fs_name>

This command sets a file system flag to prevent standbys from
activating on the file system (the ``joinable`` flag).

This process can also be done manually by doing the following:

::

    fs set <fs_name> joinable false

Then the operator can fail all of the ranks which causes the MDS daemons to
respawn as standbys. The file system will be left in a degraded state.

::

    # For all ranks, 0-N:
    mds fail <fs_name>:<n>

Once all ranks are inactive, the file system may also be deleted or left in
this state for other purposes (perhaps disaster recovery).

To bring the cluster back up, simply set the joinable flag:

::

    fs set <fs_name> joinable true
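
The manual procedure above can be sketched as a small loop. The helper below
is hypothetical and only prints the commands for review. The file system name
and the highest rank are placeholders that the operator supplies.

```shell
# Hypothetical helper: prints the manual equivalent of "fs fail" for a file
# system whose ranks run from 0 to max_rank.
print_manual_fail_commands() {
    fs_name=$1
    max_rank=$2
    echo "ceph fs set ${fs_name} joinable false"
    n=0
    while [ "$n" -le "$max_rank" ]; do
        echo "ceph mds fail ${fs_name}:${n}"
        n=$(( n + 1 ))
    done
}

# Example: a file system named "cephfs" with ranks 0, 1, and 2.
print_manual_fail_commands cephfs 2
```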

Daemons
-------

Most commands manipulating MDSs take a ``<role>`` argument which can take one
of three forms:

::

    <fs_name>:<rank>
    <fs_id>:<rank>
    <rank>
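
The three forms differ only in whether a file system part precedes the rank.
As a purely syntactic illustration (this is a sketch and not how the Monitors
parse roles), a role string can be split as follows. The helper name is
hypothetical, and it does not try to tell a file system name from a file
system ID.

```shell
# Hypothetical helper: splits a role string into its file system part (if
# present) and its rank.
describe_role() {
    case "$1" in
        *:*) echo "fs=${1%%:*} rank=${1##*:}" ;;   # <fs_name>:<rank> or <fs_id>:<rank>
        *)   echo "rank=$1" ;;                     # bare <rank>
    esac
}

describe_role cephfs:0
describe_role 1:0
describe_role 0
```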

Commands to manipulate MDS daemons:

::

    mds fail <gid/name/role>

Mark an MDS daemon as failed. This is equivalent to what the cluster
would do if an MDS daemon had failed to send a message to the Monitor
for ``mds_beacon_grace`` seconds. If the daemon was active and a suitable
standby is available, using ``mds fail`` will force a failover to the standby.

If the MDS daemon was in reality still running, then using ``mds fail``
will cause the daemon to restart. If it was active and a standby was
available, then the "failed" daemon will return as a standby.

::

    tell mds.<daemon name> command ...

Send a command to the MDS daemon(s). Use ``mds.*`` to send a command to all
daemons. Use ``ceph tell mds.* help`` to learn available commands.

::

    mds metadata <gid/name/role>

Get metadata about the given MDS known to the Monitors.

::

    mds repaired <role>

Mark the file system rank as repaired. Contrary to what the name suggests, this
command does not change an MDS; it manipulates the file system rank which has
been marked damaged.

Required Client Features
------------------------

It is sometimes desirable to set features that clients must support to talk to
CephFS. Clients without those features may disrupt other clients or behave in
surprising ways. Or, you may want to require newer features to prevent older
and possibly buggy clients from connecting.

Commands to manipulate required client features of a file system:

::

    fs required_client_features <fs name> add reply_encoding
    fs required_client_features <fs name> rm reply_encoding

To list all CephFS features:

::

    fs feature ls

Clients that are missing newly added features will be evicted automatically.
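
When several features are to be required at once, the ``add`` form can be
repeated per feature. The sketch below only prints the commands. The helper
name and the file system name ``cephfs`` are placeholders. Because clients
lacking a newly required feature are evicted, reviewing the list first is
advisable.

```shell
# Hypothetical helper: prints one "add" command per feature name given.
print_require_features() {
    fs_name=$1
    shift
    for feature in "$@"; do
        echo "ceph fs required_client_features ${fs_name} add ${feature}"
    done
}

print_require_features cephfs reply_encoding lazy_caps_wanted
```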

Here are the current CephFS features and the first release in which each appeared:

+------------------+--------------+-----------------+
| Feature          | Ceph release | Upstream Kernel |
+==================+==============+=================+
| jewel            | jewel        | 4.5             |
+------------------+--------------+-----------------+
| kraken           | kraken       | 4.13            |
+------------------+--------------+-----------------+
| luminous         | luminous     | 4.13            |
+------------------+--------------+-----------------+
| mimic            | mimic        | 4.19            |
+------------------+--------------+-----------------+
| reply_encoding   | nautilus     | 5.1             |
+------------------+--------------+-----------------+
| reclaim_client   | nautilus     | N/A             |
+------------------+--------------+-----------------+
| lazy_caps_wanted | nautilus     | 5.1             |
+------------------+--------------+-----------------+
| multi_reconnect  | nautilus     | 5.1             |
+------------------+--------------+-----------------+
| deleg_ino        | octopus      | 5.6             |
+------------------+--------------+-----------------+
| metric_collect   | pacific      | N/A             |
+------------------+--------------+-----------------+
| alternate_name   | pacific      | PLANNED         |
+------------------+--------------+-----------------+
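
Before requiring a feature, it can be useful to check which kernel clients
would still be able to connect. The lookup below simply transcribes the
"Upstream Kernel" column of the table above. The helper name is hypothetical,
and the values are only as current as the table.

```shell
# Hypothetical helper: prints the "Upstream Kernel" entry from the table for
# a given feature name, or fails for a name that is not in the table.
min_kernel_for_feature() {
    case "$1" in
        jewel) echo "4.5" ;;
        kraken|luminous) echo "4.13" ;;
        mimic) echo "4.19" ;;
        reply_encoding|lazy_caps_wanted|multi_reconnect) echo "5.1" ;;
        deleg_ino) echo "5.6" ;;
        reclaim_client|metric_collect) echo "N/A" ;;
        alternate_name) echo "PLANNED" ;;
        *) echo "unknown feature: $1" >&2; return 1 ;;
    esac
}

min_kernel_for_feature reply_encoding
min_kernel_for_feature deleg_ino
```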

CephFS Feature Descriptions

::

    reply_encoding

The MDS encodes request replies in an extensible format if the client supports
this feature.

::

    reclaim_client

The MDS allows a new client to reclaim the state of another (dead) client. This
feature is used by NFS-Ganesha.

::

    lazy_caps_wanted

When a stale client resumes, if the client supports this feature, the MDS only
needs to re-issue the caps that are explicitly wanted.

::

    multi_reconnect

When an MDS fails over, the client sends reconnect messages to the MDS to
reestablish cache states. If the MDS supports this feature, the client can
split a large reconnect message into multiple ones.

::

    deleg_ino

The MDS delegates inode numbers to the client if the client supports this
feature. Having delegated inode numbers is a prerequisite for the client to do
async file creation.

::

    metric_collect

Clients can send performance metrics to the MDS if the MDS supports this
feature.

::

    alternate_name

Clients can set and understand "alternate names" for directory entries. This is
to be used for encrypted file name support.

Global settings
---------------

::

    fs flag set <flag name> <flag val> [<confirmation string>]

Sets a global CephFS flag (i.e. not specific to a particular file system).
Currently, the only flag is ``enable_multiple``, which allows having
multiple CephFS file systems.

Some flags require you to confirm your intentions with "--yes-i-really-mean-it"
or a similar string they will prompt you with. Consider these actions carefully
before proceeding; they are placed on especially dangerous activities.

.. _advanced-cephfs-admin-settings:

Advanced
--------

These commands are not required in normal operation, and exist
for use in exceptional circumstances. Incorrect use of these
commands may cause serious problems, such as an inaccessible
file system.

::

    mds rmfailed

This removes a rank from the failed set.

::

    fs reset <file system name>

This command resets the file system state to defaults, except for the name and
pools. Non-zero ranks are saved in the stopped set.

::

    fs new <file system name> <metadata pool name> <data pool name> --fscid <fscid> --force

This command creates a file system with a specific **fscid** (file system cluster ID).
You may want to do this when an application expects the file system's ID to be
stable after it has been recovered, e.g., after monitor databases are lost and
rebuilt. Consequently, file system IDs don't always keep increasing with newer
file systems.
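
As a sketch of how this might be used during recovery, the helper below
assembles the command from values recorded before the monitor databases were
lost. It prints the command rather than running it. The helper name and all
argument values are placeholders, and the check that the fscid is a
non-negative integer is an assumption made for this example.

```shell
# Hypothetical helper: prints the command that would recreate a file system
# with a previously recorded fscid.
print_fs_recreate_command() {
    fs_name=$1
    metadata_pool=$2
    data_pool=$3
    fscid=$4
    case "$fscid" in
        ""|*[!0-9]*)
            # assumption: an fscid is a non-negative integer
            echo "fscid must be a non-negative integer" >&2
            return 1 ;;
    esac
    echo "ceph fs new ${fs_name} ${metadata_pool} ${data_pool} --fscid ${fscid} --force"
}

# Placeholder values, for illustration only.
print_fs_recreate_command cephfs cephfs_metadata cephfs_data 1
```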