btrfs-progs: docs: update mount options

Enhance the text, update for 4.14, sync with existing wiki page.

Signed-off-by: David Sterba <dsterba@suse.com>
David Sterba 2017-11-07 19:03:44 +01:00
parent 3daa286951
commit 94856dd547
1 changed files with 84 additions and 33 deletions

@ -83,21 +83,22 @@ supposed to make it to the permanent storage.
(since: 3.0, default: off) (since: 3.0, default: off)
+ +
These debugging options control the behavior of the integrity checking These debugging options control the behavior of the integrity checking
module (the BTRFS_FS_CHECK_INTEGRITY config option required). + module (the BTRFS_FS_CHECK_INTEGRITY config option required). The main goal is
to verify that all blocks from a given transaction period are properly linked.
+ +
`check_int` enables the integrity checker module, which examines all 'check_int' enables the integrity checker module, which examines all
block write requests to ensure on-disk consistency, at a large block write requests to ensure on-disk consistency, at a large
memory and CPU cost. + memory and CPU cost.
+ +
`check_int_data` includes extent data in the integrity checks, and 'check_int_data' includes extent data in the integrity checks, and
implies the check_int option. + implies the 'check_int' option.
+ +
`check_int_print_mask` takes a bitmask of BTRFSIC_PRINT_MASK_* values 'check_int_print_mask' takes a bitmask of BTRFSIC_PRINT_MASK_* values
as defined in 'fs/btrfs/check-integrity.c', to control the integrity as defined in 'fs/btrfs/check-integrity.c', to control the integrity
checker module behavior. + checker module behavior.
+ +
See comments at the top of 'fs/btrfs/check-integrity.c' See comments at the top of 'fs/btrfs/check-integrity.c'
for more info. for more information.
*clear_cache*:: *clear_cache*::
Force clearing and rebuilding of the disk space cache if something Force clearing and rebuilding of the disk space cache if something
@ -106,10 +107,11 @@ has gone wrong. See also: 'space_cache'.
*commit='seconds'*:: *commit='seconds'*::
(since: 3.12, default: 30) (since: 3.12, default: 30)
+ +
Set the interval of periodic commit. Higher Set the interval of periodic transaction commit when data are synchronized
values defer data being synced to permanent storage with obvious to permanent storage. Higher interval values lead to larger amount of unwritten
consequences when the system crashes. The upper bound is not forced, data, which has obvious consequences when the system crashes.
but a warning is printed if it's more than 300 seconds (5 minutes). The upper bound is not forced, but a warning is printed if it's more than 300
seconds (5 minutes). Use with care.
*compress*:: *compress*::
*compress='type'*:: *compress='type'*::
@ -141,6 +143,10 @@ Enable data copy-on-write for newly created files.
under 'nodatacow' are also set the NOCOW file attribute (see `chattr`(1)). under 'nodatacow' are also set the NOCOW file attribute (see `chattr`(1)).
+ +
NOTE: If 'nodatacow' or 'nodatasum' are enabled, compression is disabled. NOTE: If 'nodatacow' or 'nodatasum' are enabled, compression is disabled.
+
Updates in-place improve performance for workloads that do frequent overwrites,
at the cost of potential partial writes, in case the write is interrupted
(system crash, device failure).
*datasum*:: *datasum*::
*nodatasum*:: *nodatasum*::
@ -152,13 +158,31 @@ under 'nodatasum' inherit the "no checksums" property, however there's no
corresponding file attribute (see `chattr`(1)). corresponding file attribute (see `chattr`(1)).
+ +
NOTE: If 'nodatacow' or 'nodatasum' are enabled, compression is disabled. NOTE: If 'nodatacow' or 'nodatasum' are enabled, compression is disabled.
+
There is a slight performance gain when checksums are turned off, because the
corresponding metadata blocks holding the checksums do not need to be updated.
The cost of checksumming the blocks in memory is much lower than the cost of IO;
modern CPUs feature hardware support for the checksumming algorithm.
*degraded*:: *degraded*::
(default: off) (default: off)
+ +
Allow mounts with less devices than the raid profile constraints Allow mounts with less devices than the RAID profile constraints
require. A read-write mount (or remount) may fail with too many devices require. A read-write mount (or remount) may fail when there are too many devices
missing, for example if a stripe member is completely missing from RAID0. missing, for example if a stripe member is completely missing from RAID0.
+
Since 4.14, the constraint checks have been improved and are verified on the
chunk level, not on the device level. This allows degraded mounts of
filesystems with mixed RAID profiles for data and metadata, even if the
device number constraints would not be satisfied for some of the profiles.
+
Example: metadata -- raid1, data -- single, devices -- /dev/sda, /dev/sdb
+
Suppose the data are completely stored on 'sda', then missing 'sdb' will not
prevent the mount, even though 1 missing device would normally prevent any
filesystem with a 'single' profile from mounting. If some of the data chunks
are stored on 'sdb', then the constraint of single/data is not satisfied and
the filesystem cannot be mounted.
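+
The chunk-level check described above can be illustrated with a minimal Python
sketch. It is not the kernel implementation: the function name, the chunk
representation and the per-profile tolerance table are assumptions made for
this example only.
+
```python
# Hypothetical model of a chunk-level degraded-mount check.
# Assumed tolerance table: how many devices a chunk of each profile can lose.
MAX_MISSING = {
    "single": 0, "dup": 0, "raid0": 0,
    "raid1": 1, "raid10": 1, "raid5": 1, "raid6": 2,
}


def can_mount_degraded(chunks, missing):
    """Return True if every chunk tolerates the set of missing devices.

    chunks  -- iterable of (profile, [device names holding the chunk])
    missing -- set of device names that are absent
    """
    for profile, devices in chunks:
        lost = sum(1 for dev in devices if dev in missing)
        if lost > MAX_MISSING[profile]:
            return False
    return True


# metadata: raid1 on sda+sdb, data: single, all data chunks on sda
chunks_ok = [("raid1", ["sda", "sdb"]), ("single", ["sda"])]
# same layout, but one data chunk lives on sdb
chunks_bad = chunks_ok + [("single", ["sdb"])]

print(can_mount_degraded(chunks_ok, {"sdb"}))   # True
print(can_mount_degraded(chunks_bad, {"sdb"}))  # False
```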
*device='devicepath'*:: *device='devicepath'*::
Specify a path to a device that will be scanned for BTRFS filesystem during Specify a path to a device that will be scanned for BTRFS filesystem during
@ -174,14 +198,14 @@ system at that point.
*nodiscard*:: *nodiscard*::
(default: off) (default: off)
+ +
Enable discarding of freed file blocks using TRIM operation. This is useful Enable discarding of freed file blocks using the TRIM operation. This is useful
for SSD devices, thinly provisioned LUNs or virtual machine images where the for SSD devices, thinly provisioned LUNs or virtual machine images where the
backing device understands the operation. Depending on support of the backing device understands the operation. Depending on support of the
underlying device, the operation may severely hurt performance in case the TRIM underlying device, the operation may severely hurt performance in case the TRIM
operation is synchronous (eg. with SATA devices up to revision 3.0). operation is synchronous (eg. with SATA devices up to revision 3.0).
+ +
If discarding is not necessary to be done at the block freeing time, there's If discarding is not necessary to be done at the block freeing time, there's
`fstrim` tool that lets the filesystem discard all free blocks in a batch, `fstrim`(8) tool that lets the filesystem discard all free blocks in a batch,
possibly not much interfering with other operations. Also, the device may possibly not much interfering with other operations. Also, the device may
ignore the TRIM command if the range is too small, so running the batch discard ignore the TRIM command if the range is too small, so running the batch discard
can actually discard the blocks. can actually discard the blocks.
@ -215,7 +239,7 @@ This option forces any data dirtied by a write in a prior transaction to commit
as part of the current commit, effectively a full filesystem sync. as part of the current commit, effectively a full filesystem sync.
+ +
This makes the committed state a fully consistent view of the file system from This makes the committed state a fully consistent view of the file system from
the application's perspective (i.e., it includes all completed file system the application's perspective (i.e. it includes all completed file system
operations). This was previously the behavior only when a snapshot was operations). This was previously the behavior only when a snapshot was
created. created.
+ +
@ -245,6 +269,14 @@ the option.
+ +
NOTE: Defaults to off due to a potential overflow problem when the free space NOTE: Defaults to off due to a potential overflow problem when the free space
checksums don't fit inside a single page. checksums don't fit inside a single page.
+
Don't use this option unless you really need it. The inode number limit
on a 64bit system is 2^64^, which is practically enough for the whole filesystem
lifetime. Due to the implementation of the Linux VFS layer, the inode numbers on
32bit systems are only 32 bits wide. This lowers the limit significantly and
makes it possible to reach it. In such a case, this mount option will help.
Alternatively, files with high inode numbers can be copied to a new subvolume
which will effectively start the inode numbers from the beginning again.
*logreplay*:: *logreplay*::
*nologreplay*:: *nologreplay*::
@ -258,7 +290,7 @@ disable that behaviour, mount also with 'nologreplay'.
*max_inline='bytes'*:: *max_inline='bytes'*::
(default: min(2048, page size) ) (default: min(2048, page size) )
+ +
Specify the maximum amount of space, in bytes, that can be inlined in Specify the maximum amount of space, that can be inlined in
a metadata B-tree leaf. The value is specified in bytes, optionally a metadata B-tree leaf. The value is specified in bytes, optionally
with a K suffix (case insensitive). In practice, this value with a K suffix (case insensitive). In practice, this value
is limited by the filesystem block size (named 'sectorsize' at mkfs time), is limited by the filesystem block size (named 'sectorsize' at mkfs time),
@ -319,8 +351,8 @@ the space cache consumes some resources, including a small amount of disk
space. space.
+ +
There are two implementations of the free space cache. The original There are two implementations of the free space cache. The original
implementation, 'v1', is the safe default. The 'v1' space cache can be disabled one, referred to as 'v1', is the safe default. The 'v1' space cache can be
at mount time with 'nospace_cache' without clearing. disabled at mount time with 'nospace_cache' without clearing.
+ +
On very large filesystems (many terabytes) and certain workloads, the On very large filesystems (many terabytes) and certain workloads, the
performance of the 'v1' space cache may degrade drastically. The 'v2' performance of the 'v1' space cache may degrade drastically. The 'v2'
@ -329,12 +361,12 @@ this issue. Once enabled, the 'v2' space cache will always be used and cannot
be disabled unless it is cleared. Use 'clear_cache,space_cache=v1' or be disabled unless it is cleared. Use 'clear_cache,space_cache=v1' or
'clear_cache,nospace_cache' to do so. If 'v2' is enabled, kernels without 'v2' 'clear_cache,nospace_cache' to do so. If 'v2' is enabled, kernels without 'v2'
support will only be able to mount the filesystem in read-only mode. The support will only be able to mount the filesystem in read-only mode. The
`btrfs(8)` command currently only has read-only support for 'v2'. A read-write `btrfs`(8) command currently only has read-only support for 'v2'. A read-write
command may be run on a 'v2' filesystem by clearing the cache, running the command may be run on a 'v2' filesystem by clearing the cache, running the
command, and then remounting with 'space_cache=v2'. command, and then remounting with 'space_cache=v2'.
+ +
If a version is not explicitly specified, the default implementation will be If a version is not explicitly specified, the default implementation will be
chosen, which is 'v1' as of 4.9. chosen, which is 'v1'.
*ssd*:: *ssd*::
*ssd_spread*:: *ssd_spread*::
@ -342,10 +374,22 @@ chosen, which is 'v1' as of 4.9.
(default: SSD autodetected) (default: SSD autodetected)
+ +
Options to control SSD allocation schemes. By default, BTRFS will Options to control SSD allocation schemes. By default, BTRFS will
enable or disable SSD allocation heuristics depending on whether a enable or disable SSD optimizations depending on status of a device with
rotational or non-rotational device is in use (contents of respect to rotational or non-rotational type. This is determined by the
'/sys/block/DEV/queue/rotational'). If it is, the 'ssd' option is turned on. contents of '/sys/block/DEV/queue/rotational'. If it is 0, the 'ssd' option is
The option 'nossd' will disable the autodetection. turned on. The option 'nossd' will disable the autodetection.
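+
The detection input can be inspected from userspace. The following Python
sketch reads the sysfs attribute named above; the helper name and the
`sysfs_root` parameter are assumptions for illustration, and the sketch relies
on the usual sysfs convention that the file contains 0 for a non-rotational
device and 1 for a rotational one. It only reports the flag and does not
reproduce the kernel's mount-time logic.
+
```python
from pathlib import Path


def is_non_rotational(dev, sysfs_root="/sys/block"):
    """Return True if the block device reports itself as non-rotational.

    dev        -- device name as listed under /sys/block, e.g. "sda"
    sysfs_root -- overridable so the helper can be tried on a fake tree
    """
    flag = Path(sysfs_root) / dev / "queue" / "rotational"
    # Assumed convention: "0" means non-rotational (SSD-like), "1" rotational.
    return flag.read_text().strip() == "0"


# Example call (commented out, requires a Linux system with that device):
# print(is_non_rotational("sda"))
```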
+
The optimizations make use of the absence of the seek penalty that's inherent
for the rotational devices. The blocks can be typically written faster and
are not offloaded to separate threads.
+
NOTE: Since 4.14, the block layout optimizations have been dropped. This used
to help with first generations of SSD devices. Their FTL (flash translation
layer) was not effective and the optimization was supposed to improve the wear
by better aligning blocks. This is no longer true with modern SSD devices and
the optimization had no real benefit. Furthermore it caused increased
fragmentation. The layout tuning has been kept intact for the option
'ssd_spread'.
+ +
The 'ssd_spread' mount option attempts to allocate into bigger and aligned The 'ssd_spread' mount option attempts to allocate into bigger and aligned
chunks of unused space, and may perform better on low-end SSDs. 'ssd_spread' chunks of unused space, and may perform better on low-end SSDs. 'ssd_spread'
@ -354,25 +398,26 @@ will disable all SSD options.
*subvol='path'*:: *subvol='path'*::
Mount subvolume from 'path' rather than the toplevel subvolume. The Mount subvolume from 'path' rather than the toplevel subvolume. The
'path' is absolute (ie. starts at the toplevel subvolume). 'path' is always treated as relative to the toplevel subvolume.
This mount option overrides the default subvolume set for the given filesystem. This mount option overrides the default subvolume set for the given filesystem.
*subvolid='subvolid'*:: *subvolid='subvolid'*::
Mount subvolume specified by a 'subvolid' number rather than the toplevel Mount subvolume specified by a 'subvolid' number rather than the toplevel
subvolume. You can use *btrfs subvolume list* to see subvolume ID numbers. subvolume. You can use *btrfs subvolume list* or *btrfs subvolume show* to see
subvolume ID numbers.
This mount option overrides the default subvolume set for the given filesystem. This mount option overrides the default subvolume set for the given filesystem.
+ +
NOTE: if both 'subvolid' and 'subvol' are specified, they must point at the NOTE: if both 'subvolid' and 'subvol' are specified, they must point at the
same subvolume, otherwise mount will fail. same subvolume, otherwise the mount will fail.
*thread_pool='number'*:: *thread_pool='number'*::
(default: min(NRCPUS + 2, 8) ) (default: min(NRCPUS + 2, 8) )
+ +
The number of worker threads to allocate. NRCPUS is number of on-line CPUs The number of worker threads to start. NRCPUS is number of on-line CPUs
detected at the time of mount. Small number leads to less parallelism in detected at the time of mount. Small number leads to less parallelism in
processing data and metadata, higher numbers could lead to a performance hit processing data and metadata, higher numbers could lead to a performance hit
due to increased locking contention, cache-line bouncing or costly data due to increased locking contention, process scheduling, cache-line bouncing or
transfers between local CPU memories. costly data transfers between local CPU memories.
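+
The default stated above, min(NRCPUS + 2, 8), can be worked through with a
short Python sketch. The function name is hypothetical, and `os.cpu_count()`
is used only as a userspace approximation of the number of on-line CPUs the
kernel sees at mount time.
+
```python
import os


def default_thread_pool(nr_cpus=None):
    """Compute min(NRCPUS + 2, 8) for a given or detected CPU count."""
    if nr_cpus is None:
        # Approximation: logical CPUs visible to Python, at least 1.
        nr_cpus = os.cpu_count() or 1
    return min(nr_cpus + 2, 8)


print(default_thread_pool(2))   # 4
print(default_thread_pool(16))  # 8
```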
*treelog*:: *treelog*::
*notreelog*:: *notreelog*::
@ -384,13 +429,14 @@ are flushed at sync and transaction commit. If the system crashes between two
such syncs, the pending tree log operations are replayed during mount. such syncs, the pending tree log operations are replayed during mount.
+ +
WARNING: currently, the tree log is replayed even with a read-only mount! To WARNING: currently, the tree log is replayed even with a read-only mount! To
disable that behaviour, mount also with 'nologreplay'. disable that behaviour, also mount with 'nologreplay'.
+ +
The tree log could contain new files/directories, these would not exist on The tree log could contain new files/directories, these would not exist on
a mounted filesystem if the log is not replayed. a mounted filesystem if the log is not replayed.
*usebackuproot*:: *usebackuproot*::
*nousebackuproot*:: *nousebackuproot*::
(since: 4.6, default: off)
+ +
Enable autorecovery attempts if a bad tree root is found at mount time. Enable autorecovery attempts if a bad tree root is found at mount time.
Currently this scans a backup list of several previous tree roots and tries to Currently this scans a backup list of several previous tree roots and tries to
@ -403,6 +449,11 @@ NOTE: This option has replaced 'recovery'.
+ +
Allow subvolumes to be deleted by their respective owner. Otherwise, only the Allow subvolumes to be deleted by their respective owner. Otherwise, only the
root user can do that. root user can do that.
+
NOTE: historically, any user could create a snapshot even without being the
owner of the source subvolume; subvolume deletion has been restricted for that
reason. Subvolume creation has since been restricted, but this mount option is
still required. This is a usability issue and will be addressed in the future.
DEPRECATED MOUNT OPTIONS DEPRECATED MOUNT OPTIONS
~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~