btrfs-progs: docs: enhance the scrub chapter

- Explain that scrub is device based

- Add extra warning on NOCOW files
  Which implies NODATASUM, and can cause unexpected stale data to be
  returned.

- Explain the limitation of scrub
  As it can only do very basic checksum verification and very basic
  mirror based repair.

Signed-off-by: Colin Snover <csnover@users.noreply.github.com>
[ Add an SoB line and commit message, remove the mention of btrfs-check errors,
  as there is no evidence/example where btrfs-check failed to choose a good mirror. ]
Signed-off-by: Qu Wenruo <wqu@suse.com>
Colin Snover 2024-08-05 18:54:27 -05:00 committed by David Sterba
parent 960b2991f4
commit dbebb78051


@@ -1,14 +1,50 @@
Scrub is a pass over all filesystem data and metadata and verifying the
checksums. If a valid copy is available (replicated block group profiles) then
the damaged one is repaired. All copies of the replicated profiles are validated.
Scrub is a validation pass over all filesystem data and metadata that detects
data checksum errors, basic super block errors, basic metadata block header errors,
and disk read errors.
Scrub is done on a per-device basis. If a device is specified to :command:`btrfs scrub start`,
then only that device will be scrubbed, although btrfs will still try to read
other devices to find a good copy if the mirror on the specified device fails
to be read or fails verification.
If a path on a btrfs filesystem is specified to :command:`btrfs scrub start`, btrfs will scrub
all devices in parallel.
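The difference between scrubbing one device and scrubbing a whole filesystem comes down to the argument passed to :command:`btrfs scrub start`. The following is a minimal sketch, not part of btrfs-progs: the helper names ``scrub_command`` and ``run_scrub`` and the example targets ``/dev/sdb`` and ``/mnt/data`` are hypothetical, and it assumes the ``-B``, ``-r`` and ``-d`` options of :command:`btrfs scrub start` behave as described in the comments.

```python
import subprocess


def scrub_command(target, readonly=False, per_device_stats=False):
    """Build an argv list for `btrfs scrub start` (hypothetical helper).

    `target` is either a block device (scrub only that device) or a path
    on a mounted btrfs filesystem (scrub all of its devices).
    """
    argv = ["btrfs", "scrub", "start", "-B"]  # -B: stay in the foreground
    if readonly:
        argv.append("-r")  # -r: read-only scrub, report but do not repair
    if per_device_stats:
        argv.append("-d")  # -d: print separate statistics per device
    argv.append(target)
    return argv


def run_scrub(target, **options):
    """Run the scrub; this needs root and a mounted btrfs filesystem."""
    return subprocess.run(scrub_command(target, **options), check=False)


# Example targets are assumptions, adjust them to the local system.
print(scrub_command("/dev/sdb"))                  # single device
print(scrub_command("/mnt/data", readonly=True))  # whole filesystem
```

Building the argument list separately from running it keeps the device-versus-path choice visible and lets the command be inspected before anything touches the disks.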
On filesystems that use replicated block group profiles (e.g. RAID1), read-write
scrub will also automatically repair any damage by copying verified good data
from one of the other replicas.
Such automatic repair is also carried out when reading metadata or data from a
read-write mounted filesystem.
.. warning::
As currently implemented, setting the ``NOCOW`` file attribute (by
:command:`chattr +C`) on a file implicitly enables
``NODATASUM``. This means that while metadata for these files continues to
be validated and corrected by scrub, the actual file data is not.
Furthermore, btrfs does not currently mark missing or failed disks as
unreliable, so it will continue to load-balance reads to potentially damaged
replicas. This is not a problem normally because damage is detected by
checksum validation, but because ``NOCOW`` files are
not protected by checksums, btrfs cannot tell which mirror is good and may
return bad contents to user space.
Detecting and recovering from such a failure requires manual intervention.
Notably, `systemd sets +C on journals by default <https://github.com/systemd/systemd/commit/11689d2a021d95a8447d938180e0962cd9439763>`__,
and `libvirt ≥ 6.6 sets +C on storage pool directories by default <https://www.libvirt.org/news.html#v6-6-0-2020-08-02>`__.
Other applications or distributions may also set ``+C`` to try to improve
performance.
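Because ``+C`` can be set by applications without the administrator noticing, it can help to list which files carry the attribute. The following is a minimal sketch, assuming the usual :command:`lsattr` output format (a flag field, a space, then the path) in which an uppercase ``C`` marks ``NOCOW``. The helper name ``nocow_paths`` and the sample paths are hypothetical.

```python
import re

# A flag field as printed by lsattr: letters and dashes only.
_FLAGS = re.compile(r"[A-Za-z-]+")


def nocow_paths(lsattr_output):
    """Return the paths whose lsattr flag field contains 'C' (NOCOW).

    Blank lines and the "dir:" headers printed by `lsattr -R` are skipped
    because they do not have a valid flag field followed by a path.
    """
    found = []
    for line in lsattr_output.splitlines():
        flags, _, path = line.partition(" ")
        if not path or not _FLAGS.fullmatch(flags):
            continue
        if "C" in flags:  # uppercase C is NOCOW; lowercase c is compression
            found.append(path)
    return found


# Hypothetical sample, shaped like `lsattr -R` output.
sample = (
    "---------------C------ ./journal/system.journal\n"
    "---------------------- ./notes.txt\n"
    "\n"
    "./sub:\n"
    "--------c------------- ./sub/data.bin\n"
)
print(nocow_paths(sample))
```

In practice the input would come from running :command:`lsattr -R` on the directory of interest. Files reported this way are not covered by scrub's data checksum verification.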
.. note::
Scrub is not a filesystem checker (fsck) and does not verify nor repair
structural damage in the filesystem. It really only checks checksums of data
and tree blocks, it doesn't ensure the content of tree blocks is valid and
consistent. There's some validation performed when metadata blocks are read
from disk (:doc:`Tree-checker`) but it's not extensive and cannot substitute
full :doc:`btrfs-check` run.
Scrub is not a filesystem checker (fsck, :doc:`btrfs-check`). It can only detect
filesystem damage through checksum validation, and it can only repair
filesystem damage by copying from other known good replicas.
:doc:`btrfs-check` performs more exhaustive checking and can sometimes be
used, with expert guidance, to rebuild certain corrupted filesystem structures
in the absence of any good replica.
The user is supposed to run it manually or via a periodic system service. The
recommended period is a month but it could be less. The estimated device bandwidth