From c6be84840fa740433bebb5ddebb044c4d9a07c8c Mon Sep 17 00:00:00 2001
From: David Sterba
Date: Thu, 9 Dec 2021 20:46:42 +0100
Subject: [PATCH] btrfs-progs: docs: add more chapters (part 2)

The feature pages share their contents with the manual page section 5,
so put the contents into separate files.

Progress: 2/3.

Signed-off-by: David Sterba
---
 Documentation/Auto-repair.rst       |   6 +-
 Documentation/Convert.rst           |   2 +-
 Documentation/Deduplication.rst     |  42 +++++-
 Documentation/Defragmentation.rst   |  20 ++-
 Documentation/Flexibility.rst       |  16 ++-
 Documentation/Qgroups.rst           |   2 +-
 Documentation/Reflink.rst           |  27 +++-
 Documentation/Resize.rst            |  10 +-
 Documentation/Scrub.rst             |   2 +-
 Documentation/Send-receive.rst      |  25 +++-
 Documentation/btrfs-convert.rst     |  97 +-------------
 Documentation/btrfs-quota.rst       | 197 +--------------------------
 Documentation/btrfs-scrub.rst       |  28 +---
 Documentation/ch-checksumming.rst   |  76 +++++++++++
 Documentation/ch-compression.rst    | 153 +++++++++++++++++++++
 Documentation/ch-convert-intro.rst  |  97 ++++++++++++++
 Documentation/ch-quota-intro.rst    | 198 ++++++++++++++++++++++++++++
 Documentation/ch-scrub-intro.rst    |  28 ++++
 Documentation/ch-seeding-device.rst |  78 +++++++++++
 19 files changed, 772 insertions(+), 332 deletions(-)
 create mode 100644 Documentation/ch-checksumming.rst
 create mode 100644 Documentation/ch-compression.rst
 create mode 100644 Documentation/ch-convert-intro.rst
 create mode 100644 Documentation/ch-quota-intro.rst
 create mode 100644 Documentation/ch-scrub-intro.rst
 create mode 100644 Documentation/ch-seeding-device.rst

diff --git a/Documentation/Auto-repair.rst b/Documentation/Auto-repair.rst
index 1d6c60b7..31760d09 100644
--- a/Documentation/Auto-repair.rst
+++ b/Documentation/Auto-repair.rst
@@ -1,4 +1,8 @@
 Auto-repair on read
 ===================
 
-...
+Data or metadata that are found to be damaged (eg. because the checksum does
+not match) at the time they're read from the device can be salvaged if the
+filesystem has another valid copy, ie. when using a block group profile with
+redundancy (DUP, RAID1, RAID5/6). The correct data are returned to the user
+application and the damaged copy is replaced by it.
diff --git a/Documentation/Convert.rst b/Documentation/Convert.rst
index c1f85959..0c13cc8a 100644
--- a/Documentation/Convert.rst
+++ b/Documentation/Convert.rst
@@ -1,4 +1,4 @@
 Convert
 =======
 
-...
+.. include:: ch-convert-intro.rst
diff --git a/Documentation/Deduplication.rst b/Documentation/Deduplication.rst
index 9f491a91..0a3abeed 100644
--- a/Documentation/Deduplication.rst
+++ b/Documentation/Deduplication.rst
@@ -1,4 +1,44 @@
 Deduplication
 =============
 
-...
+In the context of filesystems, deduplication is a process of looking up
+identical data blocks that are tracked separately and creating a shared
+logical link while removing one of the copies of the data blocks. This leads
+to data space savings while it increases metadata consumption.
+
+There are two main deduplication types:
+
+* **in-band** *(sometimes also called on-line)* -- all newly written data are
+  considered for deduplication before writing
+* **out-of-band** *(sometimes also called offline)* -- data for deduplication
+  have to be actively looked for and deduplicated by the user application
+
+Both have their pros and cons. BTRFS implements only the **out-of-band** type.
+
+BTRFS provides the basic building blocks for deduplication, allowing other
+tools to choose the strategy and scope of the deduplication.
+There are multiple tools that take different approaches to deduplication,
+offer additional features or make different trade-offs. The following table
+lists tools that are known to be up-to-date, maintained and widely used.
+
+.. list-table::
+   :header-rows: 1
+
+   * - Name
+     - File based
+     - Block based
+     - Incremental
+   * - `BEES `_
+     - No
+     - Yes
+     - Yes
+   * - `duperemove `_
+     - Yes
+     - No
+     - Yes
+
+Legend:
+
+- *File based*: the tool takes a list of files and deduplicates blocks only from that set
+- *Block based*: the tool enumerates blocks and looks for duplicates
+- *Incremental*: repeated runs of the tool utilize information gathered from previous runs
diff --git a/Documentation/Defragmentation.rst b/Documentation/Defragmentation.rst
index 89f4fc1f..87bed47d 100644
--- a/Documentation/Defragmentation.rst
+++ b/Documentation/Defragmentation.rst
@@ -1,4 +1,22 @@
 Defragmentation
 ===============
 
-...
+Defragmentation of files is supposed to make the layout of the file extents
+more linear, or at least to coalesce the file extents into larger ones that
+can be stored on the device more efficiently. The need for defragmentation
+stems from the COW design that BTRFS is built on and is inherent to it.
+Fragmentation is caused by in-place rewrites of the same file data, which have
+to be handled by creating a new copy that may lie at a distant location on the
+physical device. Fragmentation is the worst problem on rotational hard disks
+due to the delay caused by moving the drive heads to the distant location.
+With modern seek-less devices it's not a problem, though defragmentation may
+still make sense because it reduces the size of the metadata that's needed to
+track the scattered extents.
+
+File data that are in use can be safely defragmented because the whole process
+happens inside the page cache, which is the central point caching the file data
+and takes care of synchronization. Once a filesystem sync or flush is started
+(either manually or automatically) all the dirty data get written to the
+devices. This however reduces the chances to find an optimal layout as the
+writes happen together with other data and the result depends on the remaining
+free space layout and fragmentation.
diff --git a/Documentation/Flexibility.rst b/Documentation/Flexibility.rst
index e0c00e63..09cef47e 100644
--- a/Documentation/Flexibility.rst
+++ b/Documentation/Flexibility.rst
@@ -1,6 +1,18 @@
 Flexibility
 ===========
 
-* dynamic inode creation (no preallocated space)
+The underlying design of the BTRFS data structures allows a lot of flexibility,
+so changes can be made after filesystem creation, like resizing, adding/removing
+space or enabling some features on-the-fly.
 
-* block group profile change on-the-fly
+* **dynamic inode creation** -- there's no fixed space or tables for tracking
+  inodes so the number of inodes that can be created is bounded only by the
+  metadata space and its utilization
+
+* **block group profile change on-the-fly** -- the block group profiles can be
+  changed on a mounted filesystem by running the balance operation and
+  specifying the conversion filters
+
+* **resize** -- the space occupied by the filesystem on each device can be
+  resized up (grow) or down (shrink) as long as the amount of data can still
+  be contained on the device
diff --git a/Documentation/Qgroups.rst b/Documentation/Qgroups.rst
index 3f9cb701..dde68744 100644
--- a/Documentation/Qgroups.rst
+++ b/Documentation/Qgroups.rst
@@ -1,4 +1,4 @@
 Quota groups
 ============
 
-...
+.. include:: ch-quota-intro.rst
diff --git a/Documentation/Reflink.rst b/Documentation/Reflink.rst
index 00efe09b..98c1e232 100644
--- a/Documentation/Reflink.rst
+++ b/Documentation/Reflink.rst
@@ -1,4 +1,29 @@
 Reflink
 =======
 
-...
+Reflink is a type of shallow copy of file data that shares the blocks but
+otherwise leaves the files independent, so any change to one of them will not
+affect the other. This builds on the underlying COW mechanism. A reflink
+effectively creates only separate metadata pointing to the shared blocks, which
+is typically much faster than a deep copy of all blocks.
+
+A reflink is typically made for whole files, but a partial file range can also
+be copied, though there are no ready-made tools for that.
+
+.. code-block:: shell
+
+   cp --reflink=always source target
+
+There are some constraints:
+
+- cross-filesystem reflink is not possible, the filesystems have nothing in
+  common so the block sharing can't work
+- reflink crossing two mount points of the same filesystem does not work due
+  to an artificial limitation in VFS (this may change in the future)
+- reflink requires that the source and target file have the same status
+  regarding NOCOW and checksums, for example if the source file is NOCOW (once
+  created with the chattr +C attribute) then the above command won't work
+  unless the target file is pre-created with the +C attribute as well, or the
+  NOCOW attribute is inherited from the parent directory (chattr +C on the
+  directory), or the whole filesystem is mounted with *-o nodatacow*, which
+  would create the NOCOW files by default
diff --git a/Documentation/Resize.rst b/Documentation/Resize.rst
index 0ffdf672..5efca120 100644
--- a/Documentation/Resize.rst
+++ b/Documentation/Resize.rst
@@ -1,4 +1,12 @@
 Resize
 ======
 
-...
+A mounted BTRFS filesystem can be resized after creation, grown or shrunk. On a
+multi-device filesystem the space occupied on each device can be resized
+independently. Data that reside in the area that would be beyond the new size
+are relocated to the remaining space below the limit, so this constrains the
+minimum size to which a filesystem can be shrunk.
+
+Growing a filesystem is quick as it only needs to take note of the available
+space, while shrinking a filesystem needs to relocate potentially lots of data
+and this is IO intensive. It is possible to shrink a filesystem in smaller steps.
diff --git a/Documentation/Scrub.rst b/Documentation/Scrub.rst
index 35199289..8b076e76 100644
--- a/Documentation/Scrub.rst
+++ b/Documentation/Scrub.rst
@@ -1,4 +1,4 @@
 Scrub
 =====
 
-...
+.. include:: ch-scrub-intro.rst
diff --git a/Documentation/Send-receive.rst b/Documentation/Send-receive.rst
index 29e0b4df..a965ff6a 100644
--- a/Documentation/Send-receive.rst
+++ b/Documentation/Send-receive.rst
@@ -1,4 +1,23 @@
-Balance
-=======
+Send/receive
+============
 
-...
+Send and receive are complementary features that allow transferring data from
+one filesystem to another in a streamable format. The send part traverses a
+given read-only subvolume and either creates a full stream representation of
+its data and metadata (*full mode*), or, given a set of subvolumes for
+reference, generates a difference relative to that set (*incremental mode*).
+
+Receive on the other hand takes the stream and reconstructs a subvolume with
+files and directories equivalent to those on the filesystem that was used to
+produce the stream. The result is not exactly 1:1, eg. inode numbers can be
+different and other unique identifiers can also differ (like the subvolume UUIDs).
The full +mode starts with an empty subvolume, creates all the files and then turns the +subvolume to read-only. At this point it could be used as a starting point for a +future incremental send stream, provided it would be generated from the same +source subvolume on the other filesystem. + +The stream is a sequence of encoded commands that change eg. file metadata +(owner, permissions, extended attributes), data extents (create, clone, +truncate), whole file operations (rename, delete). The stream can be sent over +network, piped directly to the receive command or saved to a file. Each command +in the stream is protected by a CRC32C checksum. diff --git a/Documentation/btrfs-convert.rst b/Documentation/btrfs-convert.rst index da61c290..4c7da323 100644 --- a/Documentation/btrfs-convert.rst +++ b/Documentation/btrfs-convert.rst @@ -9,102 +9,7 @@ SYNOPSIS DESCRIPTION ----------- -**btrfs-convert** is used to convert existing source filesystem image to a btrfs -filesystem in-place. The original filesystem image is accessible in subvolume -named like *ext2_saved* as file *image*. - -Supported filesystems: - -* ext2, ext3, ext4 -- original feature, always built in - -* reiserfs -- since version 4.13, optionally built, requires libreiserfscore 3.6.27 - -* ntfs -- external tool https://github.com/maharmstone/ntfs2btrfs - -The list of supported source filesystem by a given binary is listed at the end -of help (option *--help*). - -.. warning:: - If you are going to perform rollback to the original filesystem, you - should not execute **btrfs balance** command on the converted filesystem. This - will change the extent layout and make **btrfs-convert** unable to rollback. - -The conversion utilizes free space of the original filesystem. The exact -estimate of the required space cannot be foretold. The final btrfs metadata -might occupy several gigabytes on a hundreds-gigabyte filesystem. - -If the ability to rollback is no longer important, the it is recommended to -perform a few more steps to transition the btrfs filesystem to a more compact -layout. This is because the conversion inherits the original data blocks' -fragmentation, and also because the metadata blocks are bound to the original -free space layout. - -Due to different constraints, it is only possible to convert filesystems that -have a supported data block size (ie. the same that would be valid for -**mkfs.btrfs**). This is typically the system page size (4KiB on x86_64 -machines). - -**BEFORE YOU START** - -The source filesystem must be clean, eg. no journal to replay or no repairs -needed. The respective **fsck** utility must be run on the source filesytem prior -to conversion. Please refer to the manual pages in case you encounter problems. - -For ext2/3/4: - -.. code-block:: bash - - # e2fsck -fvy /dev/sdx - -For reiserfs: - -.. code-block:: bash - - # reiserfsck -fy /dev/sdx - -Skipping that step could lead to incorrect results on the target filesystem, -but it may work. - -**REMOVE THE ORIGINAL FILESYSTEM METADATA** - -By removing the subvolume named like *ext2_saved* or *reiserfs_saved*, all -metadata of the original filesystem will be removed: - -.. code-block:: bash - - # btrfs subvolume delete /mnt/ext2_saved - -At this point it is not possible to do a rollback. The filesystem is usable but -may be impacted by the fragmentation inherited from the original filesystem. - -**MAKE FILE DATA MORE CONTIGUOUS** - -An optional but recommended step is to run defragmentation on the entire -filesystem. 
This will attempt to make file extents more contiguous. - -.. code-block:: bash - - # btrfs filesystem defrag -v -r -f -t 32M /mnt/btrfs - -Verbose recursive defragmentation (*-v*, *-r*), flush data per-file (*-f*) with -target extent size 32MiB (*-t*). - -**ATTEMPT TO MAKE BTRFS METADATA MORE COMPACT** - -Optional but recommended step. - -The metadata block groups after conversion may be smaller than the default size -(256MiB or 1GiB). Running a balance will attempt to merge the block groups. -This depends on the free space layout (and fragmentation) and may fail due to -lack of enough work space. This is a soft error leaving the filesystem usable -but the block group layout may remain unchanged. - -Note that balance operation takes a lot of time, please see also -``btrfs-balance(8)``. - -.. code-block:: bash - - # btrfs balance start -m /mnt/btrfs +.. include:: ch-convert-intro.rst OPTIONS ------- diff --git a/Documentation/btrfs-quota.rst b/Documentation/btrfs-quota.rst index a81ad9b9..da26e754 100644 --- a/Documentation/btrfs-quota.rst +++ b/Documentation/btrfs-quota.rst @@ -36,202 +36,7 @@ gradually improving and issues found and fixed. HIERARCHICAL QUOTA GROUP CONCEPTS --------------------------------- -The concept of quota has a long-standing tradition in the Unix world. Ever -since computers allow multiple users to work simultaneously in one filesystem, -there is the need to prevent one user from using up the entire space. Every -user should get his fair share of the available resources. - -In case of files, the solution is quite straightforward. Each file has an -'owner' recorded along with it, and it has a size. Traditional quota just -restricts the total size of all files that are owned by a user. The concept is -quite flexible: if a user hits his quota limit, the administrator can raise it -on the fly. - -On the other hand, the traditional approach has only a poor solution to -restrict directories. -At installation time, the harddisk can be partitioned so that every directory -(eg. /usr, /var/, ...) that needs a limit gets its own partition. The obvious -problem is that those limits cannot be changed without a reinstallation. The -btrfs subvolume feature builds a bridge. Subvolumes correspond in many ways to -partitions, as every subvolume looks like its own filesystem. With subvolume -quota, it is now possible to restrict each subvolume like a partition, but keep -the flexibility of quota. The space for each subvolume can be expanded or -restricted on the fly. - -As subvolumes are the basis for snapshots, interesting questions arise as to -how to account used space in the presence of snapshots. If you have a file -shared between a subvolume and a snapshot, whom to account the file to? The -creator? Both? What if the file gets modified in the snapshot, should only -these changes be accounted to it? But wait, both the snapshot and the subvolume -belong to the same user home. I just want to limit the total space used by -both! But somebody else might not want to charge the snapshots to the users. - -Btrfs subvolume quota solves these problems by introducing groups of subvolumes -and let the user put limits on them. It is even possible to have groups of -groups. In the following, we refer to them as 'qgroups'. - -Each qgroup primarily tracks two numbers, the amount of total referenced -space and the amount of exclusively referenced space. 
- -referenced - space is the amount of data that can be reached from any of the - subvolumes contained in the qgroup, while -exclusive - is the amount of data where all references to this data can be reached - from within this qgroup. - -SUBVOLUME QUOTA GROUPS -^^^^^^^^^^^^^^^^^^^^^^ - -The basic notion of the Subvolume Quota feature is the quota group, short -qgroup. Qgroups are notated as 'level/id', eg. the qgroup 3/2 is a qgroup of -level 3. For level 0, the leading '0/' can be omitted. -Qgroups of level 0 get created automatically when a subvolume/snapshot gets -created. The ID of the qgroup corresponds to the ID of the subvolume, so 0/5 -is the qgroup for the root subvolume. -For the *btrfs qgroup* command, the path to the subvolume can also be used -instead of '0/ID'. For all higher levels, the ID can be chosen freely. - -Each qgroup can contain a set of lower level qgroups, thus creating a hierarchy -of qgroups. Figure 1 shows an example qgroup tree. - -.. code-block:: none - - +---+ - |2/1| - +---+ - / \ - +---+/ \+---+ - |1/1| |1/2| - +---+ +---+ - / \ / \ - +---+/ \+---+/ \+---+ - qgroups |0/1| |0/2| |0/3| - +-+-+ +---+ +---+ - | / \ / \ - | / \ / \ - | / \ / \ - extents 1 2 3 4 - - Figure1: Sample qgroup hierarchy - -At the bottom, some extents are depicted showing which qgroups reference which -extents. It is important to understand the notion of *referenced* vs -*exclusive*. In the example, qgroup 0/2 references extents 2 and 3, while 1/2 -references extents 2-4, 2/1 references all extents. - -On the other hand, extent 1 is exclusive to 0/1, extent 2 is exclusive to 0/2, -while extent 3 is neither exclusive to 0/2 nor to 0/3. But because both -references can be reached from 1/2, extent 3 is exclusive to 1/2. All extents -are exclusive to 2/1. - -So exclusive does not mean there is no other way to reach the extent, but it -does mean that if you delete all subvolumes contained in a qgroup, the extent -will get deleted. - -Exclusive of a qgroup conveys the useful information how much space will be -freed in case all subvolumes of the qgroup get deleted. - -All data extents are accounted this way. Metadata that belongs to a specific -subvolume (i.e. its filesystem tree) is also accounted. Checksums and extent -allocation information are not accounted. - -In turn, the referenced count of a qgroup can be limited. All writes beyond -this limit will lead to a 'Quota Exceeded' error. - -INHERITANCE -^^^^^^^^^^^ - -Things get a bit more complicated when new subvolumes or snapshots are created. -The case of (empty) subvolumes is still quite easy. If a subvolume should be -part of a qgroup, it has to be added to the qgroup at creation time. To add it -at a later time, it would be necessary to at least rescan the full subvolume -for a proper accounting. - -Creation of a snapshot is the hard case. Obviously, the snapshot will -reference the exact amount of space as its source, and both source and -destination now have an exclusive count of 0 (the filesystem nodesize to be -precise, as the roots of the trees are not shared). But what about qgroups of -higher levels? If the qgroup contains both the source and the destination, -nothing changes. If the qgroup contains only the source, it might lose some -exclusive. - -But how much? The tempting answer is, subtract all exclusive of the source from -the qgroup, but that is wrong, or at least not enough. There could have been -an extent that is referenced from the source and another subvolume from that -qgroup. 
This extent would have been exclusive to the qgroup, but not to the -source subvolume. With the creation of the snapshot, the qgroup would also -lose this extent from its exclusive set. - -So how can this problem be solved? In the instant the snapshot gets created, we -already have to know the correct exclusive count. We need to have a second -qgroup that contains all the subvolumes as the first qgroup, except the -subvolume we want to snapshot. The moment we create the snapshot, the -exclusive count from the second qgroup needs to be copied to the first qgroup, -as it represents the correct value. The second qgroup is called a tracking -qgroup. It is only there in case a snapshot is needed. - -USE CASES -^^^^^^^^^ - -Below are some usecases that do not mean to be extensive. You can find your -own way how to integrate qgroups. - -SINGLE-USER MACHINE -""""""""""""""""""" - -``Replacement for partitions`` - -The simplest use case is to use qgroups as simple replacement for partitions. -Btrfs takes the disk as a whole, and /, /usr, /var, etc. are created as -subvolumes. As each subvolume gets it own qgroup automatically, they can -simply be restricted. No hierarchy is needed for that. - -``Track usage of snapshots`` - -When a snapshot is taken, a qgroup for it will automatically be created with -the correct values. 'Referenced' will show how much is in it, possibly shared -with other subvolumes. 'Exclusive' will be the amount of space that gets freed -when the subvolume is deleted. - -MULTI-USER MACHINE -"""""""""""""""""" - -``Restricting homes`` - -When you have several users on a machine, with home directories probably under -/home, you might want to restrict /home as a whole, while restricting every -user to an individual limit as well. This is easily accomplished by creating a -qgroup for /home , eg. 1/1, and assigning all user subvolumes to it. -Restricting this qgroup will limit /home, while every user subvolume can get -its own (lower) limit. - -``Accounting snapshots to the user`` - -Let's say the user is allowed to create snapshots via some mechanism. It would -only be fair to account space used by the snapshots to the user. This does not -mean the user doubles his usage as soon as he takes a snapshot. Of course, -files that are present in his home and the snapshot should only be accounted -once. This can be accomplished by creating a qgroup for each user, say -'1/UID'. The user home and all snapshots are assigned to this qgroup. -Limiting it will extend the limit to all snapshots, counting files only once. -To limit /home as a whole, a higher level group 2/1 replacing 1/1 from the -previous example is needed, with all user qgroups assigned to it. - -``Do not account snapshots`` - -On the other hand, when the snapshots get created automatically, the user has -no chance to control them, so the space used by them should not be accounted to -him. This is already the case when creating snapshots in the example from -the previous section. - -``Snapshots for backup purposes`` - -This scenario is a mixture of the previous two. The user can create snapshots, -but some snapshots for backup purposes are being created by the system. The -user's snapshots should be accounted to the user, not the system. The solution -is similar to the one from section 'Accounting snapshots to the user', but do -not assign system snapshots to user's qgroup. +.. 
include:: ch-quota-intro.rst
 
 SUBCOMMAND
 ----------
diff --git a/Documentation/btrfs-scrub.rst b/Documentation/btrfs-scrub.rst
index 5f19365e..75079eec 100644
--- a/Documentation/btrfs-scrub.rst
+++ b/Documentation/btrfs-scrub.rst
@@ -9,33 +9,7 @@ SYNOPSIS
 DESCRIPTION
 -----------
 
-**btrfs scrub** is used to scrub a mounted btrfs filesystem, which will read all
-data and metadata blocks from all devices and verify checksums. Automatically
-repair corrupted blocks if there's a correct copy available.
-
-.. note::
-   Scrub is not a filesystem checker (fsck) and does not verify nor repair
-   structural damage in the filesystem. It really only checks checksums of data
-   and tree blocks, it doesn't ensure the content of tree blocks is valid and
-   consistent. There's some validation performed when metadata blocks are read
-   from disk but it's not extensive and cannot substitute full *btrfs check*
-   run.
-
-The user is supposed to run it manually or via a periodic system service. The
-recommended period is a month but could be less. The estimated device bandwidth
-utilization is about 80% on an idle filesystem. The IO priority class is by
-default *idle* so background scrub should not significantly interfere with
-normal filesystem operation. The IO scheduler set for the device(s) might not
-support the priority classes though.
-
-The scrubbing status is recorded in */var/lib/btrfs/* in textual files named
-*scrub.status.UUID* for a filesystem identified by the given UUID. (Progress
-state is communicated through a named pipe in file *scrub.progress.UUID* in the
-same directory.) The status file is updated every 5 seconds. A resumed scrub
-will continue from the last saved position.
-
-Scrub can be started only on a mounted filesystem, though it's possible to
-scrub only a selected device. See **scrub start** for more.
+.. include:: ch-scrub-intro.rst
 
 SUBCOMMAND
 ----------
diff --git a/Documentation/ch-checksumming.rst b/Documentation/ch-checksumming.rst
new file mode 100644
index 00000000..96cd27a4
--- /dev/null
+++ b/Documentation/ch-checksumming.rst
@@ -0,0 +1,76 @@
+Data and metadata are checksummed by default, the checksum is calculated before
+write and verified after reading the blocks. There are several checksum
+algorithms supported. The default and backward compatible one is *crc32c*. Since
+kernel 5.5 there are three more with different characteristics and trade-offs
+regarding speed and strength. The following list may help you to decide which
+one to select.
+
+CRC32C (32bit digest)
+   default, best backward compatibility, very fast, modern CPUs have
+   instruction-level support, not collision-resistant but still good error
+   detection capabilities
+
+XXHASH (64bit digest)
+   can be used as a CRC32C successor, very fast, optimized for modern CPUs
+   utilizing instruction pipelining, good collision resistance and error
+   detection
+
+SHA256 (256bit digest)
+   a cryptographic-strength hash, relatively slow but with possible CPU
+   instruction acceleration or specialized hardware cards, FIPS certified and
+   in wide use
+
+BLAKE2b (256bit digest)
+   a cryptographic-strength hash, relatively fast with possible CPU acceleration
+   using SIMD extensions, not standardized but based on BLAKE which was a SHA3
+   finalist, in wide use, the algorithm used is BLAKE2b-256 that's optimized for
+   64bit platforms
+
+The *digest size* affects the overall size of the data block checksums stored in
+the filesystem. The metadata blocks have a fixed area of up to 256 bits (32 bytes),
+so there's no increase.
Each data block has a separate checksum stored, with +additional overhead of the b-tree leaves. + +Approximate relative performance of the algorithms, measured against CRC32C +using reference software implementations on a 3.5GHz intel CPU: + + +======== ============ ======= ================ +Digest Cycles/4KiB Ratio Implementation +======== ============ ======= ================ +CRC32C 1700 1.00 CPU instruction +XXHASH 2500 1.44 reference impl. +SHA256 105000 61 reference impl. +SHA256 36000 21 libgcrypt/AVX2 +SHA256 63000 37 libsodium/AVX2 +BLAKE2b 22000 13 reference impl. +BLAKE2b 19000 11 libgcrypt/AVX2 +BLAKE2b 19000 11 libsodium/AVX2 +======== ============ ======= ================ + +Many kernels are configured with SHA256 as built-in and not as a module. +The accelerated versions are however provided by the modules and must be loaded +explicitly (**modprobe sha256**) before mounting the filesystem to make use of +them. You can check in */sys/fs/btrfs/FSID/checksum* which one is used. If you +see *sha256-generic*, then you may want to unmount and mount the filesystem +again, changing that on a mounted filesystem is not possible. +Check the file */proc/crypto*, when the implementation is built-in, you'd find + +.. code-block:: none + + name : sha256 + driver : sha256-generic + module : kernel + priority : 100 + ... + +while accelerated implementation is e.g. + +.. code-block:: none + + name : sha256 + driver : sha256-avx2 + module : sha256_ssse3 + priority : 170 + ... + + diff --git a/Documentation/ch-compression.rst b/Documentation/ch-compression.rst new file mode 100644 index 00000000..10c343e4 --- /dev/null +++ b/Documentation/ch-compression.rst @@ -0,0 +1,153 @@ +Btrfs supports transparent file compression. There are three algorithms +available: ZLIB, LZO and ZSTD (since v4.14), with various levels. +The compression happens on the level of file extents and the algorithm is +selected by file property, mount option or by a defrag command. +You can have a single btrfs mount point that has some files that are +uncompressed, some that are compressed with LZO, some with ZLIB, for instance +(though you may not want it that way, it is supported). + +Once the compression is set, all newly written data will be compressed, ie. +existing data are untouched. Data are split into smaller chunks (128KiB) before +compression to make random rewrites possible without a high performance hit. Due +to the increased number of extents the metadata consumption is higher. The +chunks are compressed in parallel. + +The algorithms can be characterized as follows regarding the speed/ratio +trade-offs: + +ZLIB + * slower, higher compression ratio + * levels: 1 to 9, mapped directly, default level is 3 + * good backward compatibility +LZO + * faster compression and decompression than zlib, worse compression ratio, designed to be fast + * no levels + * good backward compatibility +ZSTD + * compression comparable to zlib with higher compression/decompression speeds and different ratio + * levels: 1 to 15 + * since 4.14, levels since 5.1 + +The differences depend on the actual data set and cannot be expressed by a +single number or recommendation. Higher levels consume more CPU time and may +not bring a significant improvement, lower levels are close to real time. + +How to enable compression +------------------------- + +Typically the compression can be enabled on the whole filesystem, specified for +the mount point. 
Note that the compression mount options are shared among all
+mounts of the same filesystem, either bind mounts or subvolume mounts.
+Please refer to section *MOUNT OPTIONS*.
+
+.. code-block:: shell
+
+   $ mount -o compress=zstd /dev/sdx /mnt
+
+This will enable the ``zstd`` algorithm on the default level (which is 3).
+The level can be specified manually too, like ``zstd:3``. Higher levels compress
+better at the cost of time. This in turn may cause increased write latency; low
+levels are suitable for real-time compression and on a reasonably fast CPU don't
+cause performance drops.
+
+.. code-block:: shell
+
+   $ btrfs filesystem defrag -czstd file
+
+The command above will start defragmentation of the whole *file* and apply
+the compression, regardless of the mount option. (Note: specifying a level is
+not yet implemented.) The compression algorithm is not persistent and applies
+only to the defragmentation command, for any other writes other compression
+settings apply.
+
+Persistent settings on a per-file basis can be set in two ways:
+
+.. code-block:: shell
+
+   $ chattr +c file
+   $ btrfs property set file compression zstd
+
+The first command uses the legacy interface of file attributes inherited from
+the ext2 filesystem and is not flexible, so by default the *zlib* compression is
+set. The second command sets a property on the file with the given algorithm.
+(Note: setting the level that way is not yet implemented.)
+
+Compression levels
+------------------
+
+The level support of ZLIB has been added in v4.14, LZO does not support levels
+(the kernel implementation provides only one), ZSTD level support has been added
+in v5.1.
+
+There are 9 levels of ZLIB supported (1 to 9), mapping 1:1 from the mount option
+to the algorithm defined level. The default is level 3, which provides a
+reasonably good compression ratio and is still reasonably fast. The difference
+in compression gain of levels 7, 8 and 9 is comparable but the higher levels
+take longer.
+
+The ZSTD support includes levels 1 to 15, a subset of the full range of what
+ZSTD provides. Levels 1-3 are real-time, 4-8 slower with improved compression
+and 9-15 try even harder though the resulting size may not be significantly
+improved.
+
+Level 0 always maps to the default. The compression level does not affect
+compatibility.
+
+Incompressible data
+-------------------
+
+Files with already compressed data, or with data that won't compress well given
+the CPU and memory constraints of the kernel implementations, are handled using
+simple decision logic. If the first portion of data being compressed is not
+smaller than the original, the compression of the file is disabled -- unless the
+filesystem is mounted with *compress-force*. In that case compression will
+always be attempted on the file only to be later discarded. This is not optimal
+and subject to optimizations and further development.
+
+If a file is identified as incompressible, a flag is set (*NOCOMPRESS*) and it's
+sticky. On that file compression won't be performed unless forced. The flag
+can also be set by **chattr +m** (since e2fsprogs 1.46.2) or by properties with
+value *no* or *none*. An empty value will reset it to the default that's currently
+applicable on the mounted filesystem.
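+
+For illustration, the property interface mentioned above can also be used to
+inspect or reset the per-file setting (a brief sketch; *file* is a placeholder
+name):
+
+.. code-block:: shell
+
+   # show the per-file compression algorithm, empty output means none is set
+   $ btrfs property get file compression
+
+   # request that the file is not compressed, same effect as the *no*/*none* value
+   $ btrfs property set file compression none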
+
+There are two ways to detect incompressible data:
+
+* actual compression attempt - data are compressed, if the result is not smaller,
+  it's discarded, so this depends on the algorithm and level
+* pre-compression heuristics - a quick statistical evaluation of the data is
+  performed and based on the result either compression is performed or skipped,
+  the NOCOMPRESS bit is not set just by the heuristic, only if the compression
+  algorithm does not make an improvement
+
+.. code-block:: shell
+
+   $ lsattr file
+   ---------------------m file
+
+Forcing compression is not recommended, the heuristics are supposed to decide
+that and compression algorithms internally detect incompressible data too.
+
+Pre-compression heuristics
+--------------------------
+
+The heuristics aim to do a few quick statistical tests on the data to be
+compressed in order to avoid a probably costly compression that would turn out
+to be inefficient. Compression algorithms could have internal detection of
+incompressible data too but this leads to more overhead as the compression is
+done in another thread and has to write the data anyway. The heuristic is
+read-only and can utilize cached memory.
+
+The tests performed are based on the following: data sampling, long repeated
+pattern detection, byte frequency, Shannon entropy.
+
+Compatibility
+-------------
+
+Compression is done using the COW mechanism so it's incompatible with
+*nodatacow*. Direct IO works on compressed files but will fall back to buffered
+writes, which leads to recompression. Currently *nodatasum* and compression
+don't work together.
+
+The compression algorithms have been added over time so the version
+compatibility should also be considered, together with other tools that may
+access the compressed data like bootloaders.
diff --git a/Documentation/ch-convert-intro.rst b/Documentation/ch-convert-intro.rst
new file mode 100644
index 00000000..b3fdd162
--- /dev/null
+++ b/Documentation/ch-convert-intro.rst
@@ -0,0 +1,97 @@
+The **btrfs-convert** tool can be used to convert an existing source filesystem
+image to a btrfs filesystem in-place. The original filesystem image is
+accessible in a subvolume named like *ext2_saved* as file *image*.
+
+Supported filesystems:
+
+* ext2, ext3, ext4 -- original feature, always built in
+
+* reiserfs -- since version 4.13, optionally built, requires libreiserfscore 3.6.27
+
+* ntfs -- external tool https://github.com/maharmstone/ntfs2btrfs
+
+The source filesystems supported by a given binary are listed at the end
+of help (option *--help*).
+
+.. warning::
+   If you are going to perform rollback to the original filesystem, you
+   should not execute **btrfs balance** command on the converted filesystem. This
+   will change the extent layout and make **btrfs-convert** unable to rollback.
+
+The conversion utilizes free space of the original filesystem. The exact
+estimate of the required space cannot be foretold. The final btrfs metadata
+might occupy several gigabytes on a hundreds-gigabyte filesystem.
+
+If the ability to rollback is no longer important, then it is recommended to
+perform a few more steps to transition the btrfs filesystem to a more compact
+layout. This is because the conversion inherits the original data blocks'
+fragmentation, and also because the metadata blocks are bound to the original
+free space layout.
+
+Due to different constraints, it is only possible to convert filesystems that
+have a supported data block size (ie. the same that would be valid for
+**mkfs.btrfs**). This is typically the system page size (4KiB on x86_64
+machines).
+
+**BEFORE YOU START**
+
+The source filesystem must be clean, eg. no journal to replay or no repairs
+needed. The respective **fsck** utility must be run on the source filesystem
+prior to conversion. Please refer to the manual pages in case you encounter
+problems.
+
+For ext2/3/4:
+
+.. code-block:: bash
+
+   # e2fsck -fvy /dev/sdx
+
+For reiserfs:
+
+.. code-block:: bash
+
+   # reiserfsck -fy /dev/sdx
+
+Skipping that step could lead to incorrect results on the target filesystem,
+but it may work.
+
+**REMOVE THE ORIGINAL FILESYSTEM METADATA**
+
+By removing the subvolume named like *ext2_saved* or *reiserfs_saved*, all
+metadata of the original filesystem will be removed:
+
+.. code-block:: bash
+
+   # btrfs subvolume delete /mnt/ext2_saved
+
+At this point it is not possible to do a rollback. The filesystem is usable but
+may be impacted by the fragmentation inherited from the original filesystem.
+
+**MAKE FILE DATA MORE CONTIGUOUS**
+
+An optional but recommended step is to run defragmentation on the entire
+filesystem. This will attempt to make file extents more contiguous.
+
+.. code-block:: bash
+
+   # btrfs filesystem defrag -v -r -f -t 32M /mnt/btrfs
+
+Verbose recursive defragmentation (*-v*, *-r*), flush data per-file (*-f*) with
+target extent size 32MiB (*-t*).
+
+**ATTEMPT TO MAKE BTRFS METADATA MORE COMPACT**
+
+Optional but recommended step.
+
+The metadata block groups after conversion may be smaller than the default size
+(256MiB or 1GiB). Running a balance will attempt to merge the block groups.
+This depends on the free space layout (and fragmentation) and may fail due to
+lack of enough work space. This is a soft error leaving the filesystem usable
+but the block group layout may remain unchanged.
+
+Note that the balance operation takes a lot of time, please see also
+``btrfs-balance(8)``.
+
+.. code-block:: bash
+
+   # btrfs balance start -m /mnt/btrfs
+
diff --git a/Documentation/ch-quota-intro.rst b/Documentation/ch-quota-intro.rst
new file mode 100644
index 00000000..abd71606
--- /dev/null
+++ b/Documentation/ch-quota-intro.rst
@@ -0,0 +1,198 @@
+The concept of quota has a long-standing tradition in the Unix world. Ever
+since computers have allowed multiple users to work simultaneously in one
+filesystem, there has been the need to prevent one user from using up the
+entire space. Every user should get his fair share of the available resources.
+
+In case of files, the solution is quite straightforward. Each file has an
+*owner* recorded along with it, and it has a size. Traditional quota just
+restricts the total size of all files that are owned by a user. The concept is
+quite flexible: if a user hits his quota limit, the administrator can raise it
+on the fly.
+
+On the other hand, the traditional approach has only a poor solution to
+restrict directories.
+At installation time, the hard disk can be partitioned so that every directory
+(eg. /usr, /var, ...) that needs a limit gets its own partition. The obvious
+problem is that those limits cannot be changed without a reinstallation. The
+btrfs subvolume feature builds a bridge. Subvolumes correspond in many ways to
+partitions, as every subvolume looks like its own filesystem. With subvolume
+quota, it is now possible to restrict each subvolume like a partition, but keep
+the flexibility of quota. The space for each subvolume can be expanded or
+restricted on the fly.
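+
+As a brief illustration (the mount point and subvolume path below are only
+examples), enabling quota accounting and limiting one subvolume could look
+like this, see ``btrfs-qgroup(8)`` for the command details:
+
+.. code-block:: shell
+
+   # turn on quota accounting for the whole filesystem
+   $ btrfs quota enable /mnt
+
+   # limit the referenced space of one subvolume to 1GiB
+   $ btrfs qgroup limit 1G /mnt/home/alice
+
+   # show the current usage and limits
+   $ btrfs qgroup show -re /mnt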
+ +As subvolumes are the basis for snapshots, interesting questions arise as to +how to account used space in the presence of snapshots. If you have a file +shared between a subvolume and a snapshot, whom to account the file to? The +creator? Both? What if the file gets modified in the snapshot, should only +these changes be accounted to it? But wait, both the snapshot and the subvolume +belong to the same user home. I just want to limit the total space used by +both! But somebody else might not want to charge the snapshots to the users. + +Btrfs subvolume quota solves these problems by introducing groups of subvolumes +and let the user put limits on them. It is even possible to have groups of +groups. In the following, we refer to them as *qgroups*. + +Each qgroup primarily tracks two numbers, the amount of total referenced +space and the amount of exclusively referenced space. + +referenced + space is the amount of data that can be reached from any of the + subvolumes contained in the qgroup, while +exclusive + is the amount of data where all references to this data can be reached + from within this qgroup. + +SUBVOLUME QUOTA GROUPS +^^^^^^^^^^^^^^^^^^^^^^ + +The basic notion of the Subvolume Quota feature is the quota group, short +qgroup. Qgroups are notated as *level/id*, eg. the qgroup 3/2 is a qgroup of +level 3. For level 0, the leading '0/' can be omitted. +Qgroups of level 0 get created automatically when a subvolume/snapshot gets +created. The ID of the qgroup corresponds to the ID of the subvolume, so 0/5 +is the qgroup for the root subvolume. +For the ``btrfs qgroup`` command, the path to the subvolume can also be used +instead of *0/ID*. For all higher levels, the ID can be chosen freely. + +Each qgroup can contain a set of lower level qgroups, thus creating a hierarchy +of qgroups. Figure 1 shows an example qgroup tree. + +.. code-block:: none + + +---+ + |2/1| + +---+ + / \ + +---+/ \+---+ + |1/1| |1/2| + +---+ +---+ + / \ / \ + +---+/ \+---+/ \+---+ + qgroups |0/1| |0/2| |0/3| + +-+-+ +---+ +---+ + | / \ / \ + | / \ / \ + | / \ / \ + extents 1 2 3 4 + + Figure1: Sample qgroup hierarchy + +At the bottom, some extents are depicted showing which qgroups reference which +extents. It is important to understand the notion of *referenced* vs +*exclusive*. In the example, qgroup 0/2 references extents 2 and 3, while 1/2 +references extents 2-4, 2/1 references all extents. + +On the other hand, extent 1 is exclusive to 0/1, extent 2 is exclusive to 0/2, +while extent 3 is neither exclusive to 0/2 nor to 0/3. But because both +references can be reached from 1/2, extent 3 is exclusive to 1/2. All extents +are exclusive to 2/1. + +So exclusive does not mean there is no other way to reach the extent, but it +does mean that if you delete all subvolumes contained in a qgroup, the extent +will get deleted. + +Exclusive of a qgroup conveys the useful information how much space will be +freed in case all subvolumes of the qgroup get deleted. + +All data extents are accounted this way. Metadata that belongs to a specific +subvolume (i.e. its filesystem tree) is also accounted. Checksums and extent +allocation information are not accounted. + +In turn, the referenced count of a qgroup can be limited. All writes beyond +this limit will lead to a 'Quota Exceeded' error. + +INHERITANCE +^^^^^^^^^^^ + +Things get a bit more complicated when new subvolumes or snapshots are created. +The case of (empty) subvolumes is still quite easy. 
If a subvolume should be part of a qgroup, it has to be added to the qgroup
+at creation time. To add it at a later time, it would be necessary to at least
+rescan the full subvolume for a proper accounting.
+
+Creation of a snapshot is the hard case. Obviously, the snapshot will
+reference the exact amount of space as its source, and both source and
+destination now have an exclusive count of 0 (the filesystem nodesize to be
+precise, as the roots of the trees are not shared). But what about qgroups of
+higher levels? If the qgroup contains both the source and the destination,
+nothing changes. If the qgroup contains only the source, it might lose some
+exclusive.
+
+But how much? The tempting answer is, subtract all exclusive of the source from
+the qgroup, but that is wrong, or at least not enough. There could have been
+an extent that is referenced from the source and another subvolume from that
+qgroup. This extent would have been exclusive to the qgroup, but not to the
+source subvolume. With the creation of the snapshot, the qgroup would also
+lose this extent from its exclusive set.
+
+So how can this problem be solved? In the instant the snapshot gets created, we
+already have to know the correct exclusive count. We need to have a second
+qgroup that contains all the subvolumes as the first qgroup, except the
+subvolume we want to snapshot. The moment we create the snapshot, the
+exclusive count from the second qgroup needs to be copied to the first qgroup,
+as it represents the correct value. The second qgroup is called a tracking
+qgroup. It is only there in case a snapshot is needed.
+
+USE CASES
+^^^^^^^^^
+
+Below are some usecases that are not meant to be exhaustive. You can find your
+own way to integrate qgroups.
+
+SINGLE-USER MACHINE
+"""""""""""""""""""
+
+``Replacement for partitions``
+
+The simplest use case is to use qgroups as a simple replacement for partitions.
+Btrfs takes the disk as a whole, and /, /usr, /var, etc. are created as
+subvolumes. As each subvolume gets its own qgroup automatically, they can
+simply be restricted. No hierarchy is needed for that.
+
+``Track usage of snapshots``
+
+When a snapshot is taken, a qgroup for it will automatically be created with
+the correct values. 'Referenced' will show how much is in it, possibly shared
+with other subvolumes. 'Exclusive' will be the amount of space that gets freed
+when the subvolume is deleted.
+
+MULTI-USER MACHINE
+""""""""""""""""""
+
+``Restricting homes``
+
+When you have several users on a machine, with home directories probably under
+/home, you might want to restrict /home as a whole, while restricting every
+user to an individual limit as well. This is easily accomplished by creating a
+qgroup for /home, eg. 1/1, and assigning all user subvolumes to it.
+Restricting this qgroup will limit /home, while every user subvolume can get
+its own (lower) limit.
+
+``Accounting snapshots to the user``
+
+Let's say the user is allowed to create snapshots via some mechanism. It would
+only be fair to account space used by the snapshots to the user. This does not
+mean the user doubles his usage as soon as he takes a snapshot. Of course,
+files that are present in his home and the snapshot should only be accounted
+once. This can be accomplished by creating a qgroup for each user, say
+'1/UID'. The user home and all snapshots are assigned to this qgroup.
+Limiting it will extend the limit to all snapshots, counting files only once.
+To limit /home as a whole, a higher level group 2/1 replacing 1/1 from the
+previous example is needed, with all user qgroups assigned to it.
+
+``Do not account snapshots``
+
+On the other hand, when the snapshots get created automatically, the user has
+no chance to control them, so the space used by them should not be accounted to
+him. This is already the case when creating snapshots in the example from
+the previous section.
+
+``Snapshots for backup purposes``
+
+This scenario is a mixture of the previous two. The user can create snapshots,
+but some snapshots for backup purposes are being created by the system. The
+user's snapshots should be accounted to the user, not the system. The solution
+is similar to the one from section 'Accounting snapshots to the user', but do
+not assign system snapshots to the user's qgroup.
+
diff --git a/Documentation/ch-scrub-intro.rst b/Documentation/ch-scrub-intro.rst
new file mode 100644
index 00000000..796d0a24
--- /dev/null
+++ b/Documentation/ch-scrub-intro.rst
@@ -0,0 +1,28 @@
+Scrub is a pass over all filesystem data and metadata that verifies the
+checksums. If a valid copy is available (replicated block group profiles) then
+the damaged one is repaired. All copies of the replicated profiles are validated.
+
+.. note::
+   Scrub is not a filesystem checker (fsck) and does not verify nor repair
+   structural damage in the filesystem. It really only checks checksums of data
+   and tree blocks, it doesn't ensure the content of tree blocks is valid and
+   consistent. There's some validation performed when metadata blocks are read
+   from disk but it's not extensive and cannot substitute a full *btrfs check*
+   run.
+
+The user is supposed to run it manually or via a periodic system service. The
+recommended period is a month but it could be less. The estimated device
+bandwidth utilization is about 80% on an idle filesystem. The IO priority class
+is by default *idle* so background scrub should not significantly interfere with
+normal filesystem operation. The IO scheduler set for the device(s) might not
+support the priority classes though.
+
+The scrubbing status is recorded in */var/lib/btrfs/* in textual files named
+*scrub.status.UUID* for a filesystem identified by the given UUID. (Progress
+state is communicated through a named pipe in file *scrub.progress.UUID* in the
+same directory.) The status file is updated every 5 seconds. A resumed scrub
+will continue from the last saved position.
+
+Scrub can be started only on a mounted filesystem, though it's possible to
+scrub only a selected device. See **btrfs scrub start** for more.
diff --git a/Documentation/ch-seeding-device.rst b/Documentation/ch-seeding-device.rst
new file mode 100644
index 00000000..93136c2f
--- /dev/null
+++ b/Documentation/ch-seeding-device.rst
@@ -0,0 +1,78 @@
+The COW mechanism and multiple devices under one hood enable an interesting
+concept, called a seeding device: extending a read-only filesystem on a single
+device with another device that captures all writes. For example, imagine an
+immutable golden image of an operating system enhanced with another device
+that allows using the data from the golden image together with normal writable
+operation. This idea originated on CD-ROMs carrying a base OS, allowing them
+to be used for live systems, but that has since become obsolete. There are
+technologies providing similar functionality, like *unionmount*, *overlayfs*
+or *qcow2* image snapshots.
+
+The seeding device starts as a normal filesystem, once the contents are ready,
+**btrfstune -S 1** is used to flag it as a seeding device. Mounting such a
+device will not allow any writes, except adding a new device by **btrfs device
+add**. Then the filesystem can be remounted as read-write.
+
+Given that the filesystem on the seeding device is always recognized as
+read-only, it can be used to seed multiple filesystems at the same time. The
+UUID that is normally attached to a device is automatically changed to a random
+UUID on each mount.
+
+Once the seeding device is mounted, it needs the writable device. After adding
+it, something like **mount -o remount,rw /path** makes the filesystem at
+*/path* ready for use. The simplest usecase is to throw away all changes by
+unmounting the filesystem when convenient.
+
+Alternatively, deleting the seeding device from the filesystem can turn it into
+a normal filesystem, provided that the writable device can also contain all the
+data from the seeding device.
+
+The seeding device flag can be cleared again by **btrfstune -f -S 0**, eg.
+allowing the seeding device to be updated with newer data, but please note that
+this will invalidate all existing filesystems that use this particular seeding
+device. This works for some usecases, not for others, and the forcing flag to
+the command is mandatory to avoid accidental mistakes.
+
+An example of how to create and use a seeding device:
+
+.. code-block:: bash

+   # mkfs.btrfs /dev/sda
+   # mount /dev/sda /mnt/mnt1
+   # ... fill mnt1 with data
+   # umount /mnt/mnt1
+   # btrfstune -S 1 /dev/sda
+   # mount /dev/sda /mnt/mnt1
+   # btrfs device add /dev/sdb /mnt/mnt1
+   # mount -o remount,rw /mnt/mnt1
+   # ... /mnt/mnt1 is now writable
+
+Now */mnt/mnt1* can be used normally. The device */dev/sda* can be mounted
+again with another writable device:
+
+.. code-block:: bash
+
+   # mount /dev/sda /mnt/mnt2
+   # btrfs device add /dev/sdc /mnt/mnt2
+   # mount -o remount,rw /mnt/mnt2
+   # ... /mnt/mnt2 is now writable
+
+The writable device (*/dev/sdb*) can be decoupled from the seeding device and
+used independently:
+
+.. code-block:: bash
+
+   # btrfs device delete /dev/sda /mnt/mnt1
+
+As the contents originated in the seeding device, it's possible to turn
+*/dev/sdb* into a seeding device again and repeat the whole process.
+
+A few things to note:
+
+* it's recommended to use only a single device for the seeding device, it works
+  for multiple devices but the *single* profile must be used in order to make
+  the seeding device deletion work
+* block group profiles *single* and *dup* support the usecases above
+* the label is copied from the seeding device and can be changed by **btrfs filesystem label**
+* each new mount of the seeding device gets a new random UUID
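+
+For example, the label of the new writable filesystem can be changed once it
+has been decoupled (the mount point and label below are placeholders):
+
+.. code-block:: bash
+
+   # btrfs filesystem label /mnt/mnt1 writable-copy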