From 922797e15590b836e377d6dc47b828356cafc2a9 Mon Sep 17 00:00:00 2001
From: David Sterba
Date: Thu, 4 Mar 2021 13:47:26 +0100
Subject: [PATCH] btrfs-progs: docs: add section about raid56

Used sources:
- wiki
- IRC discussions
- https://lore.kernel.org/linux-btrfs/20200627032414.GX10769@hungrycats.org

Signed-off-by: David Sterba
---
 Documentation/btrfs-man5.asciidoc | 46 +++++++++++++++++++++++++++++++
 Documentation/mkfs.btrfs.asciidoc |  3 +-
 2 files changed, 47 insertions(+), 2 deletions(-)

diff --git a/Documentation/btrfs-man5.asciidoc b/Documentation/btrfs-man5.asciidoc
index af4df8a0..3942fbea 100644
--- a/Documentation/btrfs-man5.asciidoc
+++ b/Documentation/btrfs-man5.asciidoc
@@ -20,6 +20,7 @@ tools. Currently covers:
 . control device
 . filesystems with multiple block group profiles
 . seeding device
+. raid56 status and recommended practices
 
 
 MOUNT OPTIONS
@@ -1089,6 +1090,51 @@ A few things to note:
 * each new mount of the seeding device gets a new random UUID
 
 
+RAID56 STATUS AND RECOMMENDED PRACTICES
+---------------------------------------
+
+The RAID56 feature provides striping and parity over several devices, the same
+as traditional RAID5/6. There are implementation and design deficiencies that
+make it unreliable in some corner cases, so the feature **should not be used
+in production, only for evaluation or testing**. The power failure safety for
+metadata with RAID56 is not 100%.
+
+Metadata
+~~~~~~~~
+
+Do not use 'raid5' or 'raid6' for metadata. Use 'raid1' or 'raid1c3'
+respectively (matching the number of device failures tolerated).
+
+The substitute profiles provide the same guarantees against the loss of 1 or 2
+devices, and in some respects can be an improvement. Recovering from one
+missing device only needs to access the remaining 1st or 2nd copy, which in
+general may be stored on other devices due to the way RAID1 works on btrfs,
+unlike a striped profile (similar to 'raid0') that would need all devices all
+the time.
+
+The space allocation pattern and consumption are different (eg. on N devices):
+for 'raid5' as an example, a 1GiB chunk is reserved on each device, while with
+'raid1' each 1GiB chunk is stored on 2 devices. The consumption for each 1GiB
+of used metadata is then 'N * 1GiB' for 'raid5' vs '2 * 1GiB' for 'raid1'.
+Using 'raid1' is also more convenient for balancing/converting to other
+profiles due to the lower requirement on the available chunk space.
+
+Missing/incomplete support
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+When RAID56 coexists with other raid profiles on the same filesystem, the space
+reporting is inaccurate, eg. in 'df', 'btrfs filesystem df' or 'btrfs filesystem
+usage'. When there's only one profile per block group type (eg. raid5 for data),
+the reporting is accurate.
+
+When scrub is started on a RAID56 filesystem, it runs on all devices at once,
+which degrades performance. The workaround is to start it on each device
+separately. As a result, the device stats may not match the actual state and
+some errors might get reported multiple times.
+
+The 'write hole' problem: a power failure during a partial stripe update may
+leave parity inconsistent with data, risking corruption on later reconstruction.
+
 
 SEE ALSO
 --------
 `acl`(5),
diff --git a/Documentation/mkfs.btrfs.asciidoc b/Documentation/mkfs.btrfs.asciidoc
index 8c7ba9e9..90a248db 100644
--- a/Documentation/mkfs.btrfs.asciidoc
+++ b/Documentation/mkfs.btrfs.asciidoc
@@ -205,8 +205,7 @@ root partition created with RAID1/10/5/6 profiles. The mount action can happen
 before all block devices are discovered. The waiting is usually done on the
 initramfs/initrd systems.
 
-As of kernel 4.14, RAID5/6 is still considered experimental and shouldn't be
-employed for production use.
+RAID5/6 has known problems and should not be used in production.
 
 FILESYSTEM FEATURES
 -------------------
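Editor's note (not part of the patch): the metadata recommendation in the added section can be illustrated with a short command sketch. The device paths and mount point below are hypothetical placeholders, and the 'raid1c3' profile needs a kernel and btrfs-progs new enough to support it.

```shell
# Hypothetical devices: raid5 data with raid1 metadata,
# or raid6 data with raid1c3 metadata, as the section recommends.
mkfs.btrfs -d raid5 -m raid1 /dev/sdb /dev/sdc /dev/sdd
mkfs.btrfs -d raid6 -m raid1c3 /dev/sdb /dev/sdc /dev/sdd /dev/sde

# Convert existing raid5/raid6 metadata of an already mounted
# filesystem (hypothetical mount point /mnt) to raid1:
btrfs balance start -mconvert=raid1 /mnt
```

The conversion needs enough unallocated space for the new raid1 chunks, which is part of why the section notes raid1 has a lower chunk space requirement.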
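Editor's note (not part of the patch): the space consumption comparison in the added section can be checked with trivial shell arithmetic; N=4 is an arbitrary example device count.

```shell
# Raw space consumed per 1GiB metadata chunk (example with N=4 devices):
N=4
raid5_raw=$((N * 1))   # raid5 reserves a 1GiB chunk on every device
raid1_raw=$((2 * 1))   # raid1 stores each 1GiB chunk on exactly 2 devices
echo "raid5: ${raid5_raw}GiB raw, raid1: ${raid1_raw}GiB raw"
```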
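Editor's note (not part of the patch): the per-device scrub workaround described in the added section could look like the following sketch; the device paths are hypothetical, and 'btrfs scrub start -B' (foreground mode) makes the devices scrub one after another instead of all at once.

```shell
# Scrub each device of a RAID56 filesystem separately (hypothetical paths):
for dev in /dev/sdb /dev/sdc /dev/sdd; do
    btrfs scrub start -B "$dev"   # -B: wait until this device's scrub finishes
done
```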