
Recovering the file system after catastrophic Monitor store loss
================================================================

On rare occasions, all the monitor stores of a cluster may become corrupted
or lost. To recover the cluster in such a scenario, you need to rebuild the
monitor stores using the OSDs (see :ref:`mon-store-recovery-using-osds`)
and get the pools back intact (active+clean state). However, the rebuilt monitor
stores do not restore the file system maps ("FSMap"), so additional steps are
required to bring back the file system. The steps to recover a file system with
multiple active MDS daemons, or a cluster with multiple file systems, are yet to
be identified. Currently, only the steps to recover a **single active MDS** file
system with no additional file systems in the cluster have been identified and
tested. Briefly, the steps are: recreate the FSMap with basic defaults, then
allow the MDS daemons to recover from the journal and metadata stored in the
file system's pools. The steps are described in more detail below.

First, recreate the file system using the recovered file system pools. The
new FSMap will have the file system's default settings. User-defined
file system settings such as ``standby_count_wanted``, ``required_client_features``,
and extra data pools are lost and need to be reapplied later.

::

    ceph fs new <fs_name> <metadata_pool> <data_pool> --force --recover

The ``recover`` flag sets the state of the file system's rank 0 to existing but
failed, so when an MDS daemon eventually picks up rank 0, the daemon reads the
existing in-RADOS metadata and does not overwrite it. The flag also sets the
file system's ``joinable`` setting to false, which prevents standby MDS daemons
from activating the file system.

The file system cluster ID (``fscid``) of the file system is not preserved.
This behaviour may not be desirable for certain applications (e.g., Ceph CSI)
that expect the file system to be unchanged across recovery. To avoid this, you
can optionally set the ``fscid`` option in the above command (see
:ref:`advanced-cephfs-admin-settings`).

Allow standby MDS daemons to join the file system.

::

    ceph fs set <fs_name> joinable true


Check that the file system is no longer in a degraded state and has an active
MDS.

::

    ceph fs status
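
The check above can also be scripted. The following is a minimal sketch, not a
tested procedure: it assumes that ``ceph fs status --format json`` is available
on your release and that its output contains an ``mdsmap`` list whose entries
carry ``rank`` and ``state`` keys. Both the JSON shape and the helper name
``has_active_rank0`` are assumptions made for illustration; verify them against
the output of your own cluster before relying on them.

```python
import json


def has_active_rank0(status):
    """Return True if some MDS entry reports rank 0 in the 'active' state.

    `status` is the parsed JSON document. The "mdsmap" list and its
    "rank"/"state" keys are an ASSUMED shape, not a documented contract.
    """
    return any(
        mds.get("rank") == 0 and mds.get("state") == "active"
        for mds in status.get("mdsmap", [])
    )


# Hand-written sample in the assumed shape, not captured from a real cluster.
sample = json.loads(
    '{"mdsmap": [{"name": "a", "rank": 0, "state": "active"},'
    ' {"name": "b", "state": "standby"}]}'
)
print(has_active_rank0(sample))
```

A standby entry has no rank in this assumed shape, so the helper uses
``dict.get`` rather than indexing and treats a missing key as "not active".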

Reapply any other custom file system settings.
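
For reference, the command sequence on this page can be assembled
programmatically. The sketch below only builds and prints the three commands
shown above; it does not contact a cluster. The function name
``recovery_commands`` and the file system and pool names in the example are
hypothetical placeholders, not values taken from this page.

```python
import shlex


def recovery_commands(fs_name, metadata_pool, data_pool):
    """Return the recovery commands from this page, in order, as argv lists."""
    return [
        # Recreate the FSMap on the recovered pools. With --recover, rank 0
        # is left "existing but failed" and the file system is not joinable.
        ["ceph", "fs", "new", fs_name, metadata_pool, data_pool,
         "--force", "--recover"],
        # Allow standby MDS daemons to join and pick up rank 0.
        ["ceph", "fs", "set", fs_name, "joinable", "true"],
        # Inspect the result.
        ["ceph", "fs", "status"],
    ]


if __name__ == "__main__":
    # Hypothetical names; substitute those of the recovered cluster.
    for argv in recovery_commands("cephfs", "cephfs_metadata", "cephfs_data"):
        print(shlex.join(argv))
```

Keeping each command as an argument list rather than a single string avoids
quoting problems if the lists are later passed to ``subprocess.run``. Custom
settings still have to be reapplied separately afterwards.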