mirror of
https://github.com/kdave/btrfs-progs
synced 2025-01-12 08:49:33 +00:00
9e0dd182b4
Signed-off-by: David Sterba <dsterba@suse.com>
mkfs.btrfs(8)
=============

NAME
----
mkfs.btrfs - create a btrfs filesystem

SYNOPSIS
--------
*mkfs.btrfs* [options] <device> [<device>...]

DESCRIPTION
-----------
*mkfs.btrfs* is used to create a btrfs filesystem on a single device or on
multiple devices. <device> is typically a block device but can be a
file-backed image as well. Multiple devices are grouped by the UUID of the
filesystem.

Before mounting such a filesystem, the kernel module must know all the devices,
either via a preceding execution of *btrfs device scan* or via the *device*
mount option. See section *MULTIPLE DEVICES* for more details.

OPTIONS
-------
*-A|--alloc-start <offset>*::
(An option to help debug the chunk allocator.)
Specify the (physical) offset from the start of the device at which allocations
start. The default value is zero.

*-b|--byte-count <size>*::
Specify the size of the filesystem. If this option is not used,
mkfs.btrfs uses the entire device space for the filesystem.

*-d|--data <profile>*::
Specify the profile for the data block groups. Valid values are 'raid0',
'raid1', 'raid5', 'raid6', 'raid10', 'single' or 'dup' (case does not matter).
+
See 'DUP PROFILES ON A SINGLE DEVICE' for more details.

*-m|--metadata <profile>*::
Specify the profile for the metadata block groups.
Valid values are 'raid0', 'raid1', 'raid5', 'raid6', 'raid10', 'single' or
'dup' (case does not matter).
+
A single-device filesystem will default to 'DUP', unless an SSD is detected, in
which case it will default to 'single'. The detection is based on the value of
`/sys/block/DEV/queue/rotational`, where 'DEV' is the short name of the device.
+
Note that the rotational status can be arbitrarily set by the underlying block
device driver and may not reflect the true status (network block devices,
memory-backed SCSI devices etc). Use the options '--data/--metadata' to set the
profiles explicitly and avoid confusion.
+
See 'DUP PROFILES ON A SINGLE DEVICE' for more details.

*-M|--mixed*::
Normally the data and metadata block groups are isolated. The 'mixed' mode
removes the isolation and stores both types in the same block group type.
This helps to utilize the free space regardless of the purpose and is suitable
for small devices. With separate block groups, free space reserved for one
block group type is not available for allocation to the other type, which can
lead to an ENOSPC state.
+
The mixed mode is recommended for filesystems smaller than 1GiB. The soft
recommendation is to use it for filesystems smaller than 5GiB. The mixed
mode may lead to degraded performance on larger filesystems, but is otherwise
usable, even on multiple devices.
+
The 'nodesize' and 'sectorsize' must be equal, and the block group types must
match.
+
NOTE: versions up to 4.2.x forced the mixed mode for devices smaller than 1GiB.
This has been removed in 4.3+ as it caused some usability issues.

*-l|--leafsize <size>*::
Alias for --nodesize. Deprecated.

*-n|--nodesize <size>*::
Specify the nodesize, the tree block size in which btrfs stores metadata. The
default value is 16KiB (16384) or the page size, whichever is bigger. Must be a
multiple of the sectorsize, but not larger than 64KiB (65536). Leafsize always
equals nodesize and the options are aliases.
+
A smaller node size increases fragmentation but leads to taller b-trees, which
in turn leads to lower locking contention. A larger node size gives better
packing and less fragmentation at the cost of more expensive memory operations
while updating the metadata blocks.
+
NOTE: versions up to 3.11 set the nodesize to 4k.

*-s|--sectorsize <size>*::
Specify the sectorsize, the minimum data block allocation unit.
+
The default value is the page size and is autodetected. If the sectorsize
differs from the page size, the created filesystem may not be mountable by the
kernel. Therefore it is not recommended to use this option unless you are going
to mount it on a system with the appropriate page size.

*-L|--label <string>*::
Specify a label for the filesystem. The 'string' should be less than 256
bytes and must not contain newline characters.

*-K|--nodiscard*::
Do not perform a whole-device TRIM operation on devices that are capable of it.

*-r|--rootdir <rootdir>*::
Populate the toplevel subvolume with files from 'rootdir'. This does not
require root permissions and does not mount the filesystem.

*-O|--features <feature1>[,<feature2>...]*::
A list of filesystem features turned on at mkfs time. Not all features are
supported by old kernels. To disable a feature, prefix it with '^'.
+
See section *FILESYSTEM FEATURES* for more details. To see all features
that mkfs.btrfs supports, run:
+
+mkfs.btrfs -O list-all+

*-f|--force*::
Forcibly overwrite the block devices when an existing filesystem is detected.
By default, mkfs.btrfs will utilize 'libblkid' to check for any known
filesystem on the devices. Alternatively you can use the `wipefs` utility
to clear the devices.

*-q|--quiet*::
Print only error or warning messages. The output of the options --features and
--help is unaffected.

*-U|--uuid <UUID>*::
Create the filesystem with the given 'UUID'. The UUID must not exist on any
filesystem currently present.

*-V|--version*::
Print the *mkfs.btrfs* version and exit.

*--help*::
Print help.

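The defaults and limits described for '--metadata' and '--nodesize' can be
restated as a few shell helpers. This is only a sketch and not part of
btrfs-progs: the function names and the optional sysfs directory argument are
hypothetical, and the logic merely mirrors the rules given in the option
descriptions without calling mkfs.btrfs.

```shell
# Hypothetical helpers; they restate rules from the OPTIONS section and
# do not call mkfs.btrfs.

# Print the metadata profile a single-device filesystem would default to.
# $1: short device name (e.g. sda); $2: sysfs block dir (default /sys/block).
default_metadata_profile() {
    rot_file="${2:-/sys/block}/$1/queue/rotational"
    [ -r "$rot_file" ] || { echo "cannot read $rot_file" >&2; return 1; }
    if [ "$(cat "$rot_file")" = "0" ]; then
        echo single     # non-rotational: treated as an SSD
    else
        echo dup        # rotational device
    fi
}

# Succeed if nodesize ($1) is a multiple of sectorsize ($2) and not
# larger than 64KiB. Both values are plain byte counts.
nodesize_ok() {
    [ "$2" -gt 0 ] && [ "$1" -gt 0 ] && [ "$1" -le 65536 ] &&
        [ $(( $1 % $2 )) -eq 0 ]
}

# Print the default nodesize: 16KiB or the page size, whichever is bigger.
default_nodesize() {
    page=$(getconf PAGESIZE)
    if [ "$page" -gt 16384 ]; then echo "$page"; else echo 16384; fi
}
```

Since the rotational flag may not reflect the real device type, such a check
is only a hint; passing '--data' and '--metadata' explicitly remains the
reliable way to get the intended profiles.
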
SIZE UNITS
----------
The default unit is 'byte'. All size parameters accept suffixes in the 1024
base. The recognized suffixes are: 'k', 'm', 'g', 't', 'p', 'e', both uppercase
and lowercase.

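To illustrate the suffix rules, the following sketch converts a size argument
to bytes. The function name 'to_bytes' is hypothetical and the code is not
taken from mkfs.btrfs; it only assumes the base-1024 suffixes listed above and
plain decimal integers without leading zeros.

```shell
# Convert a size such as "16k" or "1G" to bytes, using base-1024 suffixes.
to_bytes() {
    size=$1
    num=${size%[kKmMgGtTpPeE]}      # numeric part without the suffix
    suffix=${size#"$num"}           # optional one-letter suffix
    case $num in
        ''|*[!0-9]*) echo "invalid size: $size" >&2; return 1 ;;
    esac
    case $suffix in
        '')  shift_bits=0 ;;        # no suffix: the unit is bytes
        k|K) shift_bits=10 ;;
        m|M) shift_bits=20 ;;
        g|G) shift_bits=30 ;;
        t|T) shift_bits=40 ;;
        p|P) shift_bits=50 ;;
        e|E) shift_bits=60 ;;
    esac
    echo $(( num << shift_bits ))
}

to_bytes 16k    # prints 16384
to_bytes 1G     # prints 1073741824
```
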
MULTIPLE DEVICES
----------------

Before mounting a multiple device filesystem, the kernel module must know the
association of the block devices that are attached to the filesystem UUID.

There is typically no action needed from the user. On a system that utilizes a
udev-like daemon, any new block device is automatically registered. The rules
call *btrfs device scan*.

The same command can be used to trigger the device scanning if the btrfs kernel
module is reloaded (naturally all previous information about the device
registration is lost).

Another possibility is to use the mount option *device* to specify the list of
devices to scan at the time of mount.

 # mount -o device=/dev/sdb,device=/dev/sdc /dev/sda /mnt

NOTE: This only triggers scanning; if the devices do not exist in the system,
mount will fail anyway. This can happen on systems without initramfs/initrd and
with a root partition created with RAID1/10/5/6 profiles. The mount action can
happen before all block devices are discovered. The waiting is usually done on
the initramfs/initrd systems.

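For filesystems with many member devices, the 'device=' option string from the
mount example can be generated rather than typed by hand. A minimal sketch
follows; the helper name 'device_mount_opts' is hypothetical and the device
paths are the ones used in the example above.

```shell
# Build the value for "mount -o" from a list of member devices.
device_mount_opts() {
    opts=""
    for dev in "$@"; do
        # Prepend a comma only when opts already holds an entry.
        opts="${opts:+$opts,}device=$dev"
    done
    echo "$opts"
}

device_mount_opts /dev/sdb /dev/sdc   # prints device=/dev/sdb,device=/dev/sdc

# Possible use (needs root and an existing multi-device filesystem):
#   mount -o "$(device_mount_opts /dev/sdb /dev/sdc)" /dev/sda /mnt
```
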
FILESYSTEM FEATURES
-------------------

Features that can be enabled during creation time. See also `btrfs`(5) section
'FILESYSTEM FEATURES'.

*mixed-bg*::
(kernel support since 2.6.37)
+
mixed data and metadata block groups, also set by option '--mixed'

*extref*::
(default since btrfs-progs 3.12, kernel support since 3.7)
+
increased hardlink limit per file in a directory to 65536, older kernels
supported a varying number of hardlinks depending on the sum of all file name
sizes that can be stored into one metadata block

*raid56*::
(kernel support since 3.9)
+
extended format for RAID5/6, also enabled if raid5 or raid6 block groups
are selected

*skinny-metadata*::
(default since btrfs-progs 3.18, kernel support since 3.10)
+
reduced-size metadata for extent references, saves a few percent of metadata

*no-holes*::
(kernel support since 3.14)
+
improved representation of file extents where holes are not explicitly
stored as an extent, saves a few percent of metadata if sparse files are used

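Because not all features are supported by old kernels, it can help to compare
the kernel versions listed above with the kernel that will mount the
filesystem. The sketch below is illustrative only: both function names are
hypothetical, the version table is copied from this section, and the
comparison assumes a 'sort' that understands '-V' (version sort), as GNU
coreutils does.

```shell
# Print the first kernel version that supports the given feature,
# according to the list in this section.
feature_min_kernel() {
    case $1 in
        mixed-bg)        echo 2.6.37 ;;
        extref)          echo 3.7 ;;
        raid56)          echo 3.9 ;;
        skinny-metadata) echo 3.10 ;;
        no-holes)        echo 3.14 ;;
        *) echo "unknown feature: $1" >&2; return 1 ;;
    esac
}

# Succeed if kernel version $2 (default: the running kernel) is at least
# the version required by feature $1.
kernel_supports() {
    need=$(feature_min_kernel "$1") || return 1
    have=${2:-$(uname -r)}
    have=${have%%-*}                # drop a suffix such as "-generic"
    lowest=$(printf '%s\n%s\n' "$need" "$have" | sort -V | head -n 1)
    [ "$lowest" = "$need" ]
}
```

A feature that the target kernel does not support can then be disabled at
mkfs time by prefixing it with '^' in the '--features' list.
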
BLOCK GROUPS, CHUNKS, RAID
--------------------------

The high-level organizational units of a filesystem are block groups of three types:
data, metadata and system.

*DATA*::
store data blocks and nothing else

*METADATA*::
store internal metadata in b-trees, can store file data if they fit into the
inline limit

*SYSTEM*::
store structures that describe the mapping between the physical devices and the
linear logical space representing the filesystem

Other terms commonly used:

*block group*::
*chunk*::
a logical range of space of a given profile, stores data, metadata or both;
sometimes the terms are used interchangeably
+
A typical size of a metadata block group is 256MiB (filesystem smaller than
50GiB) or 1GiB (larger than 50GiB), for data it's 1GiB. The system block group
size is a few megabytes.

*RAID*::
a block group profile type that utilizes RAID-like features on multiple
devices: striping, mirroring, parity

*profile*::
when used in connection with block groups refers to the allocation strategy
and constraints, see the section 'PROFILES' for more details

PROFILES
--------

There are the following block group types available:

[ cols="^,^,^,^,^",width="60%" ]
|=============================================================
.2+^.<h| Profile   3+^.^h| Redundancy           .2+^.<h| Min/max devices
      ^.^h| Copies   ^.^h| Parity     ^.<h| Striping
| single  | 1            |                |            | 1/any
| DUP     | 2 / 1 device |                |            | 1/any ^(see note 1)^
| RAID0   |              |                | 1 to N     | 2/any
| RAID1   | 2            |                |            | 2/any
| RAID10  | 2            |                | 1 to N     | 4/any
| RAID5   | 1            | 1              | 2 to N - 1 | 2/any ^(see note 2)^
| RAID6   | 1            | 2              | 3 to N - 2 | 3/any ^(see note 3)^
|=============================================================

WARNING: It's not recommended to build btrfs with RAID0/1/10/5/6 profiles on
partitions from the same device. Neither redundancy nor performance will be
improved.

'Note 1:' DUP may exist on more than 1 device if it starts on a single device and
another one is added. Since version 4.5.1, *mkfs.btrfs* will let you create DUP
on multiple devices.

'Note 2:' It's not recommended to use 2 devices with RAID5. In that case,
parity stripe will contain the same data as the data stripe, making RAID5
degraded to RAID1 with more overhead.

'Note 3:' It's also not recommended to use 3 devices with RAID6, unless you
want to get effectively 3 copies in a RAID1-like manner (but not exactly that).
N-copies RAID1 is not implemented.

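The minimum device counts from the table, together with notes 2 and 3, can be
checked before running mkfs. The following sketch is not part of btrfs-progs;
the function names 'min_devices' and 'check_profile' are hypothetical and the
numbers are copied from the table above.

```shell
# Print the minimum number of devices for a profile (case-insensitive).
min_devices() {
    profile=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case $profile in
        single|dup)        echo 1 ;;
        raid0|raid1|raid5) echo 2 ;;
        raid6)             echo 3 ;;
        raid10)            echo 4 ;;
        *) echo "unknown profile: $1" >&2; return 1 ;;
    esac
}

# Succeed if profile $1 can be created on $2 devices. Print a warning for
# the combinations that notes 2 and 3 advise against.
check_profile() {
    min=$(min_devices "$1") || return 1
    if [ "$2" -lt "$min" ]; then
        echo "$1 needs at least $min device(s), got $2" >&2
        return 1
    fi
    case $(printf '%s' "$1" | tr '[:upper:]' '[:lower:]'):$2 in
        raid5:2|raid6:3)
            echo "warning: $1 on $2 devices is not recommended" >&2 ;;
    esac
    return 0
}
```
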
DUP PROFILES ON A SINGLE DEVICE
-------------------------------

The mkfs utility will let the user create a filesystem with profiles that write
the logical blocks to 2 physical locations. Whether there are really 2
physical copies highly depends on the underlying device type.

For example, an SSD drive can remap the blocks internally to a single copy, thus
deduplicating them. This negates the purpose of increased redundancy and just
wastes filesystem space without the expected level of redundancy.

The duplicated data/metadata may still be useful to statistically improve the
chances on a device that might perform some internal optimizations. The actual
details are not usually disclosed by vendors. For example we could expect that
not all blocks get deduplicated. This will provide a non-zero probability of
recovery compared to a zero chance if the single profile is used. The user
should make the tradeoff decision. The deduplication in SSDs is thought to be
widely available so the reason behind the mkfs default is to not give a false
sense of redundancy.

As another example, the widely used USB flash or SD cards use a translation
layer between the logical and physical view of the device. The data lifetime
may be affected by frequent plugging. The memory cells could get damaged,
hopefully not destroying both copies of particular data in case of DUP.

The wear levelling techniques can also lead to reduced redundancy, even if the
device does not do any deduplication. The controllers may put data written in
a short timespan into the same physical storage unit (cell, block etc). In case
this unit dies, both copies are lost. BTRFS does not add any artificial delay
between metadata writes.

The traditional rotational hard drives usually fail at the sector level.

In any case, a device that starts to misbehave and repairs from the DUP copy
should be replaced! *DUP is not backup*.

KNOWN ISSUES
------------

**SMALL FILESYSTEMS AND LARGE NODESIZE**

The combination of small filesystem size and large nodesize is not recommended
in general and can lead to various ENOSPC-related issues during mount time or runtime.

Since mixed block group creation is optional, we allow small
filesystem instances with differing values for 'sectorsize' and 'nodesize'
to be created, which could end up in the following situation:

 # mkfs.btrfs -f -n 65536 /dev/loop0
 btrfs-progs v3.19-rc2-405-g976307c
 See http://btrfs.wiki.kernel.org for more information.

 Performing full device TRIM (512.00MiB) ...
 Label:              (null)
 UUID:               49fab72e-0c8b-466b-a3ca-d1bfe56475f0
 Node size:          65536
 Sector size:        4096
 Filesystem size:    512.00MiB
 Block group profiles:
   Data:             single            8.00MiB
   Metadata:         DUP              40.00MiB
   System:           DUP              12.00MiB
 SSD detected:       no
 Incompat features:  extref, skinny-metadata
 Number of devices:  1
 Devices:
   ID        SIZE  PATH
    1   512.00MiB  /dev/loop0

 # mount /dev/loop0 /mnt/
 mount: mount /dev/loop0 on /mnt failed: No space left on device

The ENOSPC occurs during the creation of the UUID tree. This is caused
by large metadata blocks and a space reservation strategy that allocates more
than can fit into the filesystem.

AVAILABILITY
------------
*mkfs.btrfs* is part of btrfs-progs.
Please refer to the btrfs wiki http://btrfs.wiki.kernel.org for
further details.

SEE ALSO
--------
`btrfs`(5),
`btrfs`(8),
`wipefs`(8)