Commit Graph

8 Commits

Author SHA1 Message Date
Qu Wenruo
5a24fb191d btrfs-progs: docs: add a warning when converting to a profile with lower redundancy
[BUG]
There is a bug report that after a device is deleted through sysfs
(/sys/block/<dev>/device/delete), the kernel module still tries to read
and write the device.

Normally it's fine as long as all chunks can tolerate that removed
device (e.g. all RAID1).

The problem arises when one tries to lower the redundancy by
converting to another profile:

  # mkfs.btrfs -f -m raid1 -d raid1 /dev/sdd /dev/sde
  # mount /dev/sdd /mnt
  # echo 1 > /sys/block/sde/device/delete
  # btrfs balance start --force -mdup -dsingle /mnt

This will lead to the filesystem being remounted RO, with the following error messages:

  sd 6:0:0:0: [sde] Synchronizing SCSI cache
  ata7.00: Entering standby power mode
  btrfs: attempt to access beyond end of device
  sde: rw=6145, sector=21696, nr_sectors = 32 limit=0
  btrfs: attempt to access beyond end of device
  sde: rw=6145, sector=21728, nr_sectors = 32 limit=0
  btrfs: attempt to access beyond end of device
  sde: rw=6145, sector=21760, nr_sectors = 32 limit=0
  BTRFS error (device sdd): bdev /dev/sde errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
  BTRFS error (device sdd): bdev /dev/sde errs: wr 2, rd 0, flush 0, corrupt 0, gen 0
  BTRFS error (device sdd): bdev /dev/sde errs: wr 3, rd 0, flush 0, corrupt 0, gen 0
  BTRFS error (device sdd): bdev /dev/sde errs: wr 3, rd 0, flush 1, corrupt 0, gen 0
  btrfs: attempt to access beyond end of device
  sde: rw=145409, sector=128, nr_sectors = 8 limit=0
  BTRFS warning (device sdd): lost super block write due to IO error on /dev/sde (-5)
  BTRFS error (device sdd): bdev /dev/sde errs: wr 4, rd 0, flush 1, corrupt 0, gen 0
  btrfs: attempt to access beyond end of device
  sde: rw=14337, sector=131072, nr_sectors = 8 limit=0
  BTRFS warning (device sdd): lost super block write due to IO error on /dev/sde (-5)
  BTRFS error (device sdd): bdev /dev/sde errs: wr 5, rd 0, flush 1, corrupt 0, gen 0
  BTRFS error (device sdd): error writing primary super block to device 2
  BTRFS info (device sdd): balance: start -dconvert=single -mconvert=dup -sconvert=dup
  BTRFS info (device sdd): relocating block group 1372585984 flags data|raid1
  BTRFS error (device sdd): bdev /dev/sde errs: wr 5, rd 0, flush 2, corrupt 0, gen 0
  BTRFS warning (device sdd): chunk 2446327808 missing 1 devices, max tolerance is 0 for writable mount
  BTRFS: error (device sdd) in write_all_supers:4044: errno=-5 IO failure (errors while submitting device barriers.)
  BTRFS info (device sdd state E): forced readonly
  BTRFS warning (device sdd state E): Skipping commit of aborted transaction.
  BTRFS error (device sdd state EA): Transaction aborted (error -5)
  BTRFS: error (device sdd state EA) in cleanup_transaction:2017: errno=-5 IO failure
  BTRFS info (device sdd state EA): balance: ended with status: -5

[CAUSE]
Btrfs doesn't have any runtime device error handling; it fully relies on
the extra copies provided by the profile.

For the sysfs block device removal, normally there is a device shutdown
callback to the running fs, but unfortunately btrfs doesn't support this
callback yet.

Thus even with that device removed, btrfs will still access that
removed device (both read and write, even if they will fail).

Normally, on a full RAID1 btrfs, reading and writing the fs still works
as usual.  The proper action is to replace the
removed/missing/failing device with a new one using `btrfs replace`.
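
As a rough sketch of that recovery path, assume the removed device had
devid 2, the replacement disk is /dev/sdf and the filesystem is mounted at
/mnt. All three values are placeholders, not taken from the report, and the
helper name is hypothetical. The helper only prints the commands so they can
be reviewed before being run as root. In btrfs-progs the replace operation is
invoked as `btrfs replace start`.

```shell
# Hypothetical dry-run helper: prints the replace workflow instead of running it.
# Arguments: devid of the removed device, new device, mount point.
print_replace_plan() {
    devid="$1"
    new_dev="$2"
    mnt="$3"
    # List the devices of the filesystem together with their devids.
    echo "btrfs filesystem show $mnt"
    # Replace by devid; -r reads from the source device only if no other
    # good mirror exists.
    echo "btrfs replace start -r $devid $new_dev $mnt"
    # Check the progress of the running replace.
    echo "btrfs replace status $mnt"
}

# Placeholder values, assumed for illustration only.
print_replace_plan 2 /dev/sdf /mnt
```

Referring to the source by devid rather than by path is the usual choice when
the device node is already gone.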

But during the convert, btrfs will allocate new metadata chunks on
the removed device (where all writes are lost).

And since the new metadata profile is DUP, which cannot tolerate any
missing device for that metadata chunk, this finally triggers the
last-resort protection at transaction commit time, which flips the
filesystem to RO before any real data loss occurs.
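
The "max tolerance is 0" part of the log above can be illustrated with a
small lookup of how many missing devices each profile tolerates. This is only
a sketch: the function names are hypothetical, and the table mirrors the
per-profile tolerated failures rather than querying a real filesystem.

```shell
# Number of missing devices each block group profile tolerates.
max_tolerated_failures() {
    case "$1" in
        single|dup|raid0) echo 0 ;;
        raid1|raid10|raid5) echo 1 ;;
        raid1c3|raid6) echo 2 ;;
        raid1c4) echo 3 ;;
        *) echo "unknown profile: $1" >&2; return 1 ;;
    esac
}

# Exit status 0 when converting from profile $1 to profile $2 lowers the
# tolerance, 1 when it does not, 2 when a profile name is unknown.
lowers_redundancy() {
    from="$(max_tolerated_failures "$1")" || return 2
    to="$(max_tolerated_failures "$2")" || return 2
    [ "$to" -lt "$from" ]
}

if lowers_redundancy raid1 dup; then
    echo "raid1 -> dup lowers the tolerance for missing devices"
fi
```

In the reproducer, both the metadata conversion (raid1 to dup) and the data
conversion (raid1 to single) drop the tolerance from 1 to 0.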

[DOC ENHANCEMENT]
Add a warning to the `convert` filter about the danger of converting
to a lower redundancy profile when there is a known failing/removed
device.

Also mention the proper way to handle such a failing/missing device.
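
One way to act on such a warning is to look at the per-device error counters
before starting a convert. The sketch below is an assumption about a possible
workflow, not part of the documentation change: the helper name is
hypothetical and the sample text only imitates the format printed by
`btrfs device stats`.

```shell
# Hypothetical pre-check: reads `btrfs device stats` output on stdin and
# fails (non-zero exit status) if any error counter is non-zero.
stats_are_clean() {
    awk 'BEGIN { bad = 0 } $2 != 0 { bad = 1 } END { exit bad }'
}

# Illustrative sample in the format of `btrfs device stats` (not real output).
sample='[/dev/sdd].write_io_errs    0
[/dev/sde].write_io_errs    5
[/dev/sde].flush_io_errs    2'

if printf '%s\n' "$sample" | stats_are_clean; then
    echo "no device errors recorded"
else
    echo "device errors recorded, replace the device before converting"
fi
```

On a live system the input would come from `btrfs device stats /mnt`, and
`btrfs device stats --check /mnt` reports the same condition through its exit
status. Clean counters are not proof of a healthy device, because errors are
only counted once IO has actually hit the removed device.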

The root fix is to introduce failing/removed device detection for
btrfs, but that will be a pretty big feature and will take quite some
time before it lands upstream.

Link: https://lore.kernel.org/linux-btrfs/2cb1d81e-12a8-4fb1-b3fc-e7e83d31e059@siddall.name/
Reported-by: Jeff Siddall <news@siddall.name>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2025-02-12 22:09:40 +01:00
David Sterba
a0137082de btrfs-progs: docs: formatting updates
- use :file: and :command:
- simplify manual page references
- add more web links
- typo fixes
- more cross-references

Signed-off-by: David Sterba <dsterba@suse.com>
2023-07-26 14:59:10 +02:00
Eideen
1e18750288 btrfs-progs: docs: add balance filter examples
Add more examples and explanations of how the filters can be used.

Pull-request: #486
Author: Eideen
Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-09 12:44:03 +02:00
David Sterba
d8172c2fbc btrfs-progs: docs: fixups, references
Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-01 20:50:04 +02:00
David Sterba
1afe51d22d btrfs-progs: docs: use command role for programs or command lines
Replace **bold** or ``quoted`` with :command:`line ...` that is supposed
to be used verbatim.

Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-27 01:48:47 +02:00
David Sterba
c0360b4735 btrfs-progs: docs: fix typos
Namely change eg. to e.g. and ie. to i.e.
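
A substitution of this kind could be scripted roughly as follows. This is an
illustrative sketch only, not the method used for the commit, and the helper
name is hypothetical.

```shell
# Rewrite "eg." and "ie." only when they are not preceded by a letter or
# digit, so words that merely end in those letters are left alone.
fix_abbrev() {
    sed -E 's/(^|[^[:alnum:]])eg\./\1e.g./g; s/(^|[^[:alnum:]])ie\./\1i.e./g'
}

printf '%s\n' 'a profile (eg. raid1), ie. two copies' | fix_abbrev
```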

Signed-off-by: David Sterba <dsterba@suse.com>
2022-12-07 21:00:25 +01:00
Sidong Yang
2d039cc815 btrfs-progs: docs: add cross reference for manualpages
The RST format provides a cross-reference function that lets users
navigate between manual pages with a click. This patch was generated by a
macro that replaces the old references with the doc role in RST format.

Issue: #495
Signed-off-by: Sidong Yang <realwakka@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-11 09:08:08 +02:00
David Sterba
208aed2ed4 btrfs-progs: docs: add more chapters (part 3)
All main pages have some content and many typos have been fixed.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-12-17 15:35:10 +01:00