btrfs-progs: docs: update some chapters

Signed-off-by: David Sterba <dsterba@suse.com>
David Sterba 2022-01-05 00:43:47 +01:00
parent b28f7bd9bb
commit df91bfd5d5
5 changed files with 69 additions and 13 deletions

@@ -10,7 +10,7 @@ There are two main deduplication types:

 * **in-band** *(sometimes also called on-line)* -- all newly written data are
   considered for deduplication before writing
-* **out-of-band** *(sometimes alco called offline)* -- data for deduplication
+* **out-of-band** *(sometimes also called offline)* -- data for deduplication
   have to be actively looked for and deduplicated by the user application

 Both have their pros and cons. BTRFS implements **only out-of-band** type.
@@ -37,8 +37,41 @@ be up-to-date, maintained and widely used.
      - No
      - Yes

-Legend:
-
-- *File based*: the tool takes a list of files and deduplicates blocks only from that set
-- *Block based*: the tool enumerates blocks and looks for duplicates
-- *Incremental*: repeated runs of the tool utilizes information gathered from previous runs
+File based deduplication
+------------------------
+
+The tool takes a list of files and tries to find duplicates among data only
+from that files. This is suitable eg. for files that originated from the same
+base image, source of a reflinked file. Optionally the tools could track a
+database of hashes and allow to deduplicate blocks from more files, or use that
+for repeated runs and update the database incrementally.
+
+Block based deduplication
+-------------------------
+
+The tool typically scans the filesystem and builds a database of file block
+hashes, then finds candidate files and deduplicates the ranges. The hash
+database is kept as an ordinary file and can be scaled according to the needs.
+As the file changes, the hash database may get out of sync and the scan has to
+be done repeatedly.
+
+Safety of block comparison
+--------------------------
+
+The deduplication inside the filesystem is implemented as an ``ioctl`` that takes
+a source file, destination file and the range. The blocks from both files are
+compared for exact match before merging to the same range (ie. there's no
+hash based comparison). Pages representing the extents in memory are locked
+prior to deduplication and prevent concurrent modification by buffered writes
+or mmaped writes.
+
+Limitations, compatibility
+--------------------------
+
+Files that are subject do deduplication must have the same status regarding
+COW, ie. both regular COW files with checksums, or both NOCOW, or files that
+are COW but don't have checksums (NODATASUM attribute is set).
+
+If the deduplication is in progress on any file in the filesystem, the *send*
+operation cannot be started as it relies on the extent layout being unchanged.
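
The candidate search by block hashes and the exact comparison described in the added sections can be illustrated in user space. What follows is a minimal Python sketch, not the kernel implementation and not the code of any existing deduplication tool. The function names ``build_hash_index``, ``ranges_identical`` and ``find_duplicates``, the 4096 byte block size and the choice of SHA-256 are assumptions made for this example. Hashes are used only to find candidates, a range is reported only after a byte-for-byte comparison, and no extent sharing is actually performed.

```python
import hashlib
from collections import defaultdict

BLOCK_SIZE = 4096  # assumed block size for this sketch


def build_hash_index(paths, block_size=BLOCK_SIZE):
    """Map block digest -> list of (path, offset) for every full block."""
    index = defaultdict(list)
    for path in paths:
        with open(path, "rb") as f:
            offset = 0
            while True:
                block = f.read(block_size)
                if len(block) < block_size:
                    break  # a trailing partial block is ignored here
                index[hashlib.sha256(block).digest()].append((path, offset))
                offset += block_size
    return index


def ranges_identical(src, src_off, dst, dst_off, length):
    """Compare two file ranges byte for byte (no hash involved)."""
    with open(src, "rb") as a, open(dst, "rb") as b:
        a.seek(src_off)
        b.seek(dst_off)
        data_a = a.read(length)
        data_b = b.read(length)
    # Both ranges must be complete and contain exactly the same bytes.
    return len(data_a) == length and data_a == data_b


def find_duplicates(paths, block_size=BLOCK_SIZE):
    """Return (src, src_off, dst, dst_off) tuples confirmed as identical."""
    confirmed = []
    for locations in build_hash_index(paths, block_size).values():
        src, src_off = locations[0]
        for dst, dst_off in locations[1:]:
            # Equal hashes only nominate a candidate, the bytes decide.
            if ranges_identical(src, src_off, dst, dst_off, block_size):
                confirmed.append((src, src_off, dst, dst_off))
    return confirmed
```

A real tool would hand each confirmed range to the filesystem instead of collecting it in a list. The in-memory index stands in for the hash database kept in an ordinary file, and it goes stale in the same way once the scanned files change.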

@@ -20,3 +20,11 @@ and takes care of synchronization. Once a filesystem sync or flush is started
 devices. This however reduces the chances to find optimal layout as the writes
 happen together with other data and the result depends on the remaining free
 space layout and fragmentation.
+
+.. warning::
+   Defragmentation does not preserve extent sharing, eg. files created by **cp
+   --reflink** or existing on multiple snapshots. Due to that the data space
+   consumption may increase.
+
+Defragmentation can be started together with compression on the given range,
+and takes precedence over per-file compression property or mount options.

@@ -10,3 +10,18 @@ If the filesystem has been created with different data and metadata profiles,
 namely with different level of integrity, this also affects the inlined files.
 It can be completely disabled by mounting with ``max_inline=0``. The upper
 limit is either the size of b-tree node or the page size of the host.
+
+An inline file can be identified by enumerating the extents, eg. by the tool
+``filefrag``:
+
+.. code-block:: bash
+
+   $ filefrag -v inlinefile
+   Filesystem type is: 9123683e
+   File size of inlinefile is 463 (1 block of 4096 bytes)
+    ext:     logical_offset:        physical_offset: length:   expected: flags:
+      0:        0..    4095:          0..      4095:   4096:             last,not_aligned,inline,eof
+
+In the above example, the file is not compressed, otherwise it would have the
+*encoded* flag. The inline files have no limitations and behave like regular
+files with respect to copying, renaming, reflink, truncate etc.
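
The check for the *inline* and *encoded* flags can also be scripted. The following is a minimal Python sketch that parses the text printed by ``filefrag -v``. The helper names ``extent_flags``, ``is_inline`` and ``is_compressed`` are hypothetical, and the parser assumes the column layout of the example output in this hunk, with the flags as the last colon-separated field. Other ``filefrag`` versions may print a different layout.

```python
import re

# Extent rows start with the extent index followed by a colon, e.g. "  0: ...".
_EXTENT_LINE = re.compile(r"^\s*\d+:")


def extent_flags(filefrag_output):
    """Return one set of flag names per extent row of `filefrag -v` output."""
    result = []
    for line in filefrag_output.splitlines():
        if not _EXTENT_LINE.match(line):
            continue  # skip the header and summary lines
        flags_field = line.rsplit(":", 1)[1]  # assumed: flags come last
        result.append({f.strip() for f in flags_field.split(",") if f.strip()})
    return result


def is_inline(filefrag_output):
    """True if any extent carries the 'inline' flag."""
    return any("inline" in flags for flags in extent_flags(filefrag_output))


def is_compressed(filefrag_output):
    """True if any extent carries the 'encoded' flag."""
    return any("encoded" in flags for flags in extent_flags(filefrag_output))


# Sample text copied from the example output in this hunk.
SAMPLE = """\
Filesystem type is: 9123683e
File size of inlinefile is 463 (1 block of 4096 bytes)
 ext: logical_offset: physical_offset: length: expected: flags:
 0: 0.. 4095: 0.. 4095: 4096: last,not_aligned,inline,eof
"""

if __name__ == "__main__":
    print("inline:", is_inline(SAMPLE))
    print("compressed:", is_compressed(SAMPLE))
```

The text for a real file could be obtained by running ``filefrag -v`` through ``subprocess.run`` and passing its standard output to these helpers.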

@@ -1,9 +1,9 @@
 Data and metadata are checksummed by default, the checksum is calculated before
-write and verifed after reading the blocks. There are several checksum
-algorithms supported. The default and backward compatible is *crc32c*. Since
-kernel 5.5 there are three more with different characteristics and trade-offs
-regarding speed and strength. The following list may help you to decide which
-one to select.
+write and verifed after reading the blocks from devices. There are several
+checksum algorithms supported. The default and backward compatible is *crc32c*.
+Since kernel 5.5 there are three more with different characteristics and
+trade-offs regarding speed and strength. The following list may help you to
+decide which one to select.

 CRC32C (32bit digest)
    default, best backward compatibility, very fast, modern CPUs have

@@ -23,9 +23,9 @@ Requirements, limitations
   but this is namely for testing
 * the super block is handled in a special way and is at different locations
   than on a non-zoned filesystem:
   * primary: 0B (and the next two zones)
   * secondary: 512GiB (and the next two zones)
   * tertiary: 4TiB (4096GiB, and the next two zones)

 Incompatible features
 ^^^^^^^^^^^^^^^^^^^^^