From cf0fce7ed408b8955d43e1a7d3d15f9488160e35 Mon Sep 17 00:00:00 2001 From: David Sterba Date: Fri, 8 Jan 2016 19:33:02 +0100 Subject: [PATCH] btrfs-progs: docs, enhance the mount option manual page Signed-off-by: David Sterba --- Documentation/btrfs-mount.asciidoc | 292 +++++++++++++++++++---------- 1 file changed, 197 insertions(+), 95 deletions(-) diff --git a/Documentation/btrfs-mount.asciidoc b/Documentation/btrfs-mount.asciidoc index d52d4c42..8be70e33 100644 --- a/Documentation/btrfs-mount.asciidoc +++ b/Documentation/btrfs-mount.asciidoc @@ -3,30 +3,42 @@ btrfs-mount(5) NAME ---- -btrfs-mount - mount options and supported file attributes for the btrfs filesystem +btrfs-mount - topics about the BTRFS filesystem (mount options, supported file attributes and other) DESCRIPTION ----------- -This document describes mount options specific to the btrfs filesystem. -Other generic mount options are available,and are described in the -`mount`(8) manpage. +This document describes topics related to BTRFS that are not specific to the +tools. MOUNT OPTIONS ------------- + +This section describes mount options specific to BTRFS. For the generic mount +options please refer to `mount`(8) manpage. + *alloc_start='bytes'*:: +(default: 1M, minimum: 1M) ++ Debugging option to force all block allocations above a certain byte threshold on each block device. The value is specified in -bytes, optionally with a K, M, or G suffix, case insensitive. -Default is 1MB. +bytes, optionally with a K, M, or G suffix (case insensitive). ++ +This option was used for testing and has no practical use, it's slated to be +removed in the future. *autodefrag*:: *noautodefrag*:: -(since: 3.0, default: off) + -Disable/enable auto defragmentation. -Auto defragmentation detects small random writes into files and queue -them up for the defrag process. Works best for small files; +(since: 3.0, default: off) ++ +Enable automatic file defragmentation. 
+When enabled, small random writes into files (in a range of tens of kilobytes, +currently it's 64K) are detected and queued up for the defragmentation process. Not well suited for large database workloads. + +The read latency may increase due to reading the adjacent blocks that make up the +range for defragmentation, successive write will merge the blocks in the new +location. ++ WARNING: Defragmenting with Linux kernel versions < 3.9 or ≥ 3.14-rc2 as well as with Linux stable kernel versions ≥ 3.10.31, ≥ 3.12.12 or ≥ 3.13.4 will break up the ref-links of CoW data (for example files @@ -37,7 +49,8 @@ broken up ref-links. *check_int*:: *check_int_data*:: *check_int_print_mask='value'*:: -(since: 3.0, default: off) + +(since: 3.0, default: off) ++ These debugging options control the behavior of the integrity checking module (the BTRFS_FS_CHECK_INTEGRITY config option required). + + @@ -56,7 +69,8 @@ See comments at the top of 'fs/btrfs/check-integrity.c' for more info. *commit='seconds'*:: -(since: 3.12, default: 30) + +(since: 3.12, default: 30) ++ Set the interval of periodic commit. Higher values defer data being synced to permanent storage with obvious consequences when the system crashes. The upper bound is not forced, @@ -66,7 +80,8 @@ but a warning is printed if it's more than 300 seconds (5 minutes). *compress='type'*:: *compress-force*:: *compress-force='type'*:: -(default: off) + +(default: off) ++ Control BTRFS file data compression. Type may be specified as 'zlib', 'lzo' or 'no' (for no compression, used for remounting). If no type is specified, 'zlib' is used. If compress-force is specified, @@ -75,37 +90,51 @@ all files will be compressed, whether or not they compress well. NOTE: If compression is enabled, 'nodatacow' and 'nodatasum' are disabled. *degraded*:: -(default: off) + -Allow mounts to continue with missing devices. A read-write mount may -fail with too many devices missing, for example if a stripe member -is completely missing. 
+(default: off) ++ +Allow mounts with fewer devices than the raid profile constraints +require. A read-write mount (or remount) may fail with too many devices +missing, for example if a stripe member is completely missing from RAID0. *device='devicepath'*:: -Specify a device during mount so that ioctls on the control device -can be avoided. Especially useful when trying to mount a multi-device -setup as root. May be specified multiple times for multiple devices. +Specify a path to a device that will be scanned for a BTRFS filesystem during +mount. This is usually done automatically by a device manager (like udev) or +using the *btrfs device scan* command (eg. run from the initial ramdisk). In +cases where this is not possible the 'device' mount option can help. ++ +NOTE: booting eg. a RAID1 system may fail even if all filesystem's 'device' +paths are provided as the actual device nodes may not be discovered by the +system at that point. *discard*:: *nodiscard*:: -(default: off) + -Disable/enable discard mount option. -Discard issues frequent commands to let the block device reclaim space -freed by the filesystem. -This is useful for SSD devices, thinly provisioned -LUNs and virtual machine images, but may have a significant -performance impact. (The fstrim command is also available to -initiate batch trims from userspace). +(default: off) ++ +Enable discarding of freed file blocks using the TRIM operation. This is useful +for SSD devices, thinly provisioned LUNs or virtual machine images where the +backing device understands the operation. Depending on support of the +underlying device, the operation may severely hurt performance in case the TRIM +operation is synchronous (eg. with SATA devices up to revision 3.0). ++ +If discarding does not need to happen at block freeing time, the +*fstrim* tool lets the filesystem discard all free blocks in a batch, +possibly interfering less with other operations.
*enospc_debug*:: -(default: off) + -Disable/enable debugging option to be more verbose in some ENOSPC conditions. +(default: off) ++ +Enable verbose output for some ENOSPC conditions. It's safe to use but can +be noisy if the system reaches a near-full state. *fatal_errors='action'*:: -(since: 3.4, default: bug) + -Action to take when encountering a fatal error. + +(since: 3.4, default: bug) ++ +Action to take when encountering a fatal error. ++ *bug*:::: 'BUG()' on a fatal error, the system will stay in the crashed state and may be -still partially usable, but reboot is required for full operation + +still partially usable, but reboot is required for full operation ++ *panic*:::: 'panic()' on a fatal error, depending on other system configuration, this may be followed by a reboot. Please refer to the documentation of kernel boot @@ -113,82 +142,144 @@ parameters, eg. 'panic', 'oops' or 'crashkernel'. *flushoncommit*:: *noflushoncommit*:: -(default: on) + -The `flushoncommit` mount option forces any data dirtied by a write in a -prior transaction to commit as part of the current commit. This makes -the committed state a fully consistent view of the file system from the -application's perspective (i.e., it includes all completed file system -operations). This was previously the behavior only when a snapshot is -created. +(default: on) ++ +This option forces any data dirtied by a write in a prior transaction to commit +as part of the current commit. This makes the committed state a fully +consistent view of the file system from the application's perspective (i.e., it +includes all completed file system operations). This was previously the +behavior only when a snapshot was created. ++ +Disabling flushing may improve performance but is not crash-safe. *inode_cache*:: *noinode_cache*:: -(since: 3.0, default: off) + -Enable free inode number caching. Defaults to off due to an overflow -problem when the free space crcs don't fit inside a single page.
+(since: 3.0, default: off) ++ +Enable free inode number caching. Not recommended to use unless files on your +filesystem get assigned inode numbers that are approaching 2^64^. Normally, new +files in each subvolume get inode numbers assigned incrementally (plus one from the last +time) and are not reused. The mount option turns on caching of the existing +inode numbers and reuse of inode numbers of deleted files. ++ +This option may slow down your system at first run, or after mounting without +the option. ++ +NOTE: Defaults to off due to a potential overflow problem when the free space +checksums don't fit inside a single page. *max_inline='bytes'*:: (default: min(8192, page size) ) ++ Specify the maximum amount of space, in bytes, that can be inlined in a metadata B-tree leaf. The value is specified in bytes, optionally -with a K, M, or G suffix, case insensitive. In practice, this value -is limited by the root sector size, with some space unavailable due -to leaf headers. For a 4k sectorsize, max inline data is ~3900 bytes. +with a K suffix (case insensitive). In practice, this value +is limited by the filesystem block size (named 'sectorsize' at mkfs time), +and memory page size of the system. In case of sectorsize limit, there's +some space unavailable due to leaf headers. For example, with a 4k sectorsize, max +inline data is ~3900 bytes. ++ +Inlining can be completely turned off by specifying 0. This will increase data +block slack if file sizes are much smaller than block size but will reduce +metadata consumption in return. *metadata_ratio='value'*:: -Specify that 1 metadata chunk should be allocated after every -'value' data chunks. Off by default. +(default: 0, internal logic) ++ +Specifies that 1 metadata chunk should be allocated after every 'value' data +chunks. Default behaviour depends on internal logic, some percent of unused +metadata space is attempted to be maintained but is not always possible if +there's no space left for chunk allocation.
The option could be useful to +override the internal logic in favor of the metadata allocation if the expected +workload is supposed to be metadata intense (snapshots, reflinks, xattrs, +inlined files). *acl*:: *noacl*:: -(default: on) + +(default: on) ++ Enable/disable support for Posix Access Control Lists (ACLs). See the `acl`(5) manual page for more information about ACLs. *barrier*:: *nobarrier*:: -(default: on) + -ensure that certain IOs make it through the device cache and are on -persistent storage. If disabled on a device with a volatile -(non-battery-backed) write-back cache, nobarrier option will lead to -filesystem corruption on a system crash or power loss. +(default: on) ++ +Ensure that all IO write operations make it through the device cache and are stored +permanently when the filesystem is at its consistency checkpoint. This +typically means that a flush command is sent to the device that will +synchronize all pending data and ordinary metadata blocks, then writes the +superblock and issues another flush. ++ +The write flushes incur a slight performance hit and also prevent the IO block +scheduler from reordering requests in a more effective way. Disabling barriers gets +rid of that penalty but will most certainly lead to a corrupted filesystem in +case of a crash or power loss. The ordinary metadata blocks could still be +unwritten at the time the new superblock is stored permanently, expecting that +the block pointers to metadata were stored permanently before. ++ +On a device with a volatile battery-backed write-back cache, the 'nobarrier' +option will not lead to filesystem corruption as the pending blocks are +supposed to make it to the permanent storage. *datacow*:: *nodatacow*:: -(default: on) + -Enable/disable data copy-on-write for newly created files. -Nodatacow implies nodatasum, and disables all compression. +(default: on) ++ +Enable data copy-on-write for newly created files. +'Nodatacow' implies 'nodatasum', and disables 'compression'.
All files created +under 'nodatacow' also get the NOCOW file attribute set (see `chattr`(1)). *datasum*:: *nodatasum*:: -(default: on) + -Enable/disable data checksumming for newly created files. -Datasum implies datacow. +(default: on) ++ +Enable data checksumming for newly created files. +'Datasum' implies 'datacow', ie. the normal mode of operation. All files created +under 'nodatasum' inherit the "no checksums" property, however there's no +corresponding file attribute (see `chattr`(1)). *treelog*:: *notreelog*:: -(default: on) + -Enable/disable the tree logging used for fsync and O_SYNC writes. +(default: on) ++ +Enable the tree logging used for 'fsync' and 'O_SYNC' writes. The tree log +stores changes without the need of a full filesystem sync. The log operations +are flushed at sync and transaction commit. If the system crashes between two +such syncs, the pending tree log operations are replayed during mount. ++ +WARNING: currently, the tree log is replayed even with a read-only mount! ++ +The tree log could contain new files/directories, these would not exist on +a mounted filesystem if the log is not replayed. *recovery*:: -(since: 3.2, default: off) + +(since: 3.2, default: off) ++ Enable autorecovery attempts if a bad tree root is found at mount time. -Currently this scans a list of several previous tree roots and tries to -use the first readable. +Currently this scans a backup list of several previous tree roots and tries to +use the first readable. This can be used with read-only mounts as well. *rescan_uuid_tree*:: -(since: 3.12, default: off) + +(since: 3.12, default: off) ++ Force check and rebuild procedure of the UUID tree. This should not normally be needed. *skip_balance*:: -(since: 3.3, default: off) + +(since: 3.3, default: off) ++ Skip automatic resume of interrupted balance operation after mount. -May be resumed with "btrfs balance resume." 
+May be resumed with *btrfs balance resume* or the paused state can be removed +by *btrfs balance cancel*. *nospace_cache*:: -(since: 3.2) + -Disable freespace cache loading without clearing the cache. +(since: 3.2) ++ +Disable freespace cache loading without clearing the cache and the free space +cache will not be used during the mount. This affects performance as searching +for new free blocks could take longer. On the other hand, managing the space +cache consumes some resources. *clear_cache*:: Force clearing and rebuilding of the disk space cache if something @@ -197,38 +288,47 @@ has gone wrong. *ssd*:: *nossd*:: *ssd_spread*:: -Options to control ssd allocation schemes. By default, BTRFS will -enable or disable ssd allocation heuristics depending on whether a -rotational or nonrotational disk is in use. The ssd and nossd options -can override this autodetection. + -The ssd_spread mount option attempts to allocate into big chunks -of unused space, and may perform better on low-end ssds. ssd_spread -implies ssd, enabling all other ssd heuristics as well. +(default: SSD autodetected) ++ +Options to control SSD allocation schemes. By default, BTRFS will +enable or disable SSD allocation heuristics depending on whether a +rotational or nonrotational disk is in use. The 'ssd' and 'nossd' options +can override this autodetection. ++ +The 'ssd_spread' mount option attempts to allocate into bigger and aligned +chunks of unused space, and may perform better on low-end SSDs. 'ssd_spread' +implies 'ssd', enabling all other SSD heuristics as well. *subvol='path'*:: -Mount subvolume at 'path' rather than the root subvolume. The -'path' is relative to the top level subvolume. +Mount subvolume from 'path' rather than the toplevel subvolume. The +'path' is absolute (ie. starts at the toplevel subvolume). +This mount option overrides the default subvolume set for the given filesystem. 
-*subvolid='ID'*:: -Mount subvolume specified by an ID number rather than the root subvolume. -This allows mounting of subvolumes which are not in the root of the mounted -filesystem. -You can use "btrfs subvolume list" to see subvolume ID numbers. +*subvolid='subvolid'*:: +Mount subvolume specified by a 'subvolid' number rather than the toplevel +subvolume. You can use *btrfs subvolume list* to see subvolume ID numbers. +This mount option overrides the default subvolume set for the given filesystem. *subvolrootid='objectid'*:: -(deprecated) + -Mount subvolume specified by 'objectid' rather than the root subvolume. -This allows mounting of subvolumes which are not in the root of the mounted -filesystem. -You can use "btrfs subvolume show" to see the object ID for a subvolume. +(irrelevant since: 3.2, formally deprecated since: 3.10) ++ +A workaround option from times (pre 3.2) when it was not possible to mount a +subvolume that did not reside directly under the toplevel subvolume. *thread_pool='number'*:: -The number of worker threads to allocate. The default number is equal -to the number of CPUs + 2, or 8, whichever is smaller. +(default: min(NRCPUS + 2, 8) ) ++ +The number of worker threads to allocate. NRCPUS is the number of on-line CPUs +detected at the time of mount. A small number leads to less parallelism in +processing data and metadata, higher numbers could lead to a performance hit due to +increased locking contention, cache-line bouncing or costly data transfers +between local CPU memories. *user_subvol_rm_allowed*:: -(default: off) + -Allow subvolumes to be deleted by a non-root user. Use with caution. +(default: off) ++ +Allow subvolumes to be deleted by their respective owner. Otherwise, only the +root user can do that. FILE ATTRIBUTES --------------- @@ -258,7 +358,9 @@ For descriptions of these attribute flags, please refer to the SEE ALSO -------- +`acl`(5), +`btrfs`(8), `chattr`(1), +`fstrim`(8), `mkfs.btrfs`(8), -`mount`(8), -`btrfs`(8) +`mount`(8)
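Illustration only, not part of the patch: the numeric defaults quoted in the manual page text (thread_pool: min(NRCPUS + 2, 8), max_inline: min(8192, page size), and the commit interval warning above 300 seconds) can be sketched as below. The function names are hypothetical and do not exist in btrfs-progs or the kernel; the sketch only restates the documented formulas and does not query an actual filesystem.

```python
import os


def default_thread_pool(nr_cpus: int) -> int:
    """Documented default for thread_pool: min(NRCPUS + 2, 8)."""
    return min(nr_cpus + 2, 8)


def default_max_inline(page_size: int) -> int:
    """Documented default for max_inline: min(8192, page size)."""
    return min(8192, page_size)


def commit_interval_warns(seconds: int) -> bool:
    """The manual page says a warning is printed above 300 seconds."""
    return seconds > 300


if __name__ == "__main__":
    # Assumption: SC_NPROCESSORS_ONLN approximates the "on-line CPUs
    # detected at the time of mount" mentioned in the text.
    nr_cpus = os.sysconf("SC_NPROCESSORS_ONLN")
    page_size = os.sysconf("SC_PAGE_SIZE")
    print("thread_pool default:", default_thread_pool(nr_cpus))
    print("max_inline default:", default_max_inline(page_size))
    print("commit=600 warns:", commit_interval_warns(600))
```

For example, a machine with 2 on-line CPUs and 4K pages would get 4 worker threads and a 4096 byte inline limit by these formulas, while a 16 CPU machine with 64K pages would be capped at 8 threads and 8192 bytes.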