Userspace utilities to manage btrfs filesystems
Filipe Manana 6f4a51886b btrfs-progs: receive: fix silent data loss after fall back from encoded write
When attempting an encoded write, if it fails for some specific reason
like -EINVAL (when an offset is not sector size aligned) or -ENOSPC, we
then fall back to decompressing the data and writing it using regular
buffered IO. This logic however is not correct. One of the reasons is
that it assumes the encoded offset is smaller than the unencoded file
length and that they can be compared, but one is an offset and the other
is a length, not an end offset, so they can't be compared to get correct
results. This bad logic will often result in not copying all data, or even
no data at all, resulting in silent data loss. This is easily seen
with the following reproducer:

   $ cat test.sh
   #!/bin/bash

   DEV=/dev/sdj
   MNT=/mnt/sdj

   umount $DEV &> /dev/null
   mkfs.btrfs -f $DEV > /dev/null
   mount -o compress $DEV $MNT

   # File foo has a size of 33K, not aligned to the sector size.
   xfs_io -f -c "pwrite -S 0xab 0 33K" $MNT/foo

   xfs_io -f -c "pwrite -S 0xcd 0 64K" $MNT/bar

   # Now clone the first 32K of file bar into foo at offset 0.
   xfs_io -c "reflink $MNT/bar 0 0 32K" $MNT/foo

   # Snapshot the default subvolume and create a full send stream (v2).
   btrfs subvolume snapshot -r $MNT $MNT/snap

   btrfs send --compressed-data -f /tmp/test.send $MNT/snap

   echo -e "\nFile bar in the original filesystem:"
   od -A d -t x1 $MNT/snap/bar

   umount $MNT
   mkfs.btrfs -f $DEV > /dev/null
   mount $DEV $MNT

   echo -e "\nReceiving stream in a new filesystem..."
   btrfs receive -f /tmp/test.send $MNT

   echo -e "\nFile bar in the new filesystem:"
   od -A d -t x1 $MNT/snap/bar

   umount $MNT

Running the test without this patch:

   $ ./test.sh
   (...)
   File bar in the original filesystem:
   0000000 cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd
   *
   0065536

   Receiving stream in a new filesystem...
   At subvol snap

   File bar in the new filesystem:
   0000000 cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd
   *
   0033792

We end up with file bar having less data, and a smaller size, than in the
original filesystem.

This happens because when processing file bar, send issues the following
commands:

   clone bar - source=foo source offset=0 offset=0 length=32768
   write bar - offset=32768 length=1024
   encoded_write bar - offset=33792, len=4096, unencoded_offset=33792, unencoded_file_len=31744, unencoded_len=65536, compression=1, encryption=0

The first 32K are cloned from file foo, as that file range is shared
between the files.

Then there's a regular write operation for the file range [32K, 33K),
since file foo has different data from bar for that file range.

Finally for the remainder of file bar, the send side issues an encoded
write since the extent is compressed in the source filesystem, for the
file offset 33792 (33K), remaining 31K of data. The receiver will try the
encoded write, but that fails with -EINVAL since the offset 33K is not
sector size aligned, so it will fall back to decompressing the data and
writing it using regular buffered writes. However that results in doing
no writes in decompress_and_write() because 'pos' is initialized to the
value of 33K (unencoded_offset) and unencoded_file_len is 31K, so the
while loop has no iterations.
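
The zero-iteration behaviour can be modelled with a few lines of C. This is
only a sketch under stated assumptions: old_fallback_bytes() is a
hypothetical name, it does no I/O, and it assumes every write completes in
full. It is not the btrfs-progs source, it only mirrors the loop condition
described above.

```c
#include <stdint.h>

/*
 * Hypothetical model of the old fallback loop (not the real
 * decompress_and_write()). It returns how many bytes the loop would hand
 * to the write call, assuming each write completes in full.
 */
static uint64_t old_fallback_bytes(uint64_t unencoded_offset,
                                   uint64_t unencoded_file_len)
{
    uint64_t pos = unencoded_offset;      /* an offset ... */
    uint64_t written = 0;

    while (pos < unencoded_file_len) {    /* ... compared against a length */
        uint64_t w = unencoded_file_len - pos;

        pos += w;
        written += w;
    }
    return written;
}
```

With the values from the stream above (unencoded_offset=33792,
unencoded_file_len=31744) the model writes nothing. With a non-zero offset
smaller than the length it writes only part of the data, which corresponds
to the "not copying all data" case.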

Another case where we can fall back to decompression plus regular buffered
writes is when the destination filesystem has a sector size larger than
the sector size of the source filesystem (for example when the source
filesystem is on x86_64 with a 4K sector size and the destination
filesystem is on PowerPC with a 64K sector size). In that scenario encoded
write attempts will fail with -EINVAL due to offsets not being aligned
with the sector size of the destination filesystem, and the receive will
attempt the fallback of decompressing the buffer and writing the
decompressed data using regular buffered IO.
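
The alignment arithmetic behind both cases can be sketched as follows.
is_sector_aligned() is a hypothetical helper used for illustration only.
The actual validation of encoded writes is done by the kernel and checks
more than this single condition.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if 'offset' is a multiple of 'sectorsize'.
 * The caller must pass a non-zero sector size.
 */
static bool is_sector_aligned(uint64_t offset, uint32_t sectorsize)
{
    return offset % sectorsize == 0;
}
```

Offset 33792 (33K) is not aligned to a 4K sector size, and an offset such
as 4096, which is fine on a 4K source filesystem, is not aligned to a 64K
sector size on the destination.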

Fix this by tracking the number of written bytes instead, and incrementing
it, along with the unencoded offset, after each write.
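
A minimal sketch of that approach, assuming an in-memory buffer in place of
the destination file. chunked_write() is a hypothetical stand-in for
pwrite() that may write fewer bytes than requested, and fallback_write() is
an illustrative name, not the function from the patch.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for pwrite(): copies at most 'max_chunk' bytes
 * into the in-memory "file" and returns the number of bytes copied.
 * 'max_chunk' must be greater than zero.
 */
static uint64_t chunked_write(char *file, uint64_t file_offset,
                              const char *src, uint64_t len,
                              uint64_t max_chunk)
{
    uint64_t n = len < max_chunk ? len : max_chunk;

    memcpy(file + file_offset, src, n);
    return n;
}

/*
 * Sketch of the fixed loop: progress is tracked as a byte count, and both
 * the file offset and the offset into the decompressed buffer advance
 * after every (possibly short) write.
 */
static uint64_t fallback_write(char *file, uint64_t offset,
                               const char *unencoded_data,
                               uint64_t unencoded_offset,
                               uint64_t unencoded_file_len,
                               uint64_t max_chunk)
{
    uint64_t bytes_written = 0;

    while (bytes_written < unencoded_file_len) {
        uint64_t w = chunked_write(file, offset,
                                   unencoded_data + unencoded_offset,
                                   unencoded_file_len - bytes_written,
                                   max_chunk);

        bytes_written += w;
        offset += w;
        unencoded_offset += w;
    }
    return bytes_written;
}
```

Because the loop condition compares a byte count against a length, it still
copies everything when unencoded_offset is larger than unencoded_file_len,
as in the reproducer.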

Fixes: d20e759fc9 ("btrfs-progs: receive: encoded_write fallback to explicit decode and write")
Reviewed-by: Boris Burkov <boris@bur.io>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-11-24 17:29:12 +01:00
Documentation btrfs-progs: filesystem: new subcommand mkswapfile 2022-11-08 11:30:21 +01:00
check btrfs-progs: unify naming of qgroup subvolid helpers 2022-10-26 09:36:44 +02:00
ci btrfs-progs: ci: fix image updater script 2022-10-11 09:06:13 +02:00
cmds btrfs-progs: receive: fix silent data loss after fall back from encoded write 2022-11-24 17:29:12 +01:00
common btrfs-progs: receive: fix parsing of attributes field from the fileattr command 2022-11-24 17:29:11 +01:00
convert btrfs-progs: mkfs: fix a stack over-flow when features string are too long 2022-10-11 09:08:12 +02:00
crypto btrfs-progs: use error helper for messages in non-kernel code 2022-10-11 09:08:07 +02:00
image btrfs-progs: warn when an experimental functionality is used 2022-10-20 16:39:11 +02:00
kernel-lib btrfs-progs: properly handle degraded raid56 reads 2022-11-24 17:29:12 +01:00
kernel-shared btrfs-progs: properly handle degraded raid56 reads 2022-11-24 17:29:12 +01:00
libbtrfs btrfs-progs: unify naming of qgroup subvolid helpers 2022-10-26 09:36:44 +02:00
libbtrfsutil libbtrfsutil: update include lists 2022-10-11 09:08:07 +02:00
m4 btrfs-progs: build: add m4 macros for builtin detection 2022-08-16 15:18:12 +02:00
mkfs btrfs-progs: warn when an experimental functionality is used 2022-10-20 16:39:11 +02:00
tests btrfs-progs: tests: add test case for degraded raid5 2022-11-24 17:29:12 +01:00
.editorconfig btrfs-progs: add basic .editorconfig 2020-08-31 17:01:02 +02:00
.gitignore btrfs-progs: remove asciidoc generated files from .gitignore 2022-03-09 15:37:25 +01:00
64-btrfs-dm.rules btrfs-progs: udev: add rules for dm devices 2016-06-01 14:56:56 +02:00
64-btrfs-zoned.rules btrfs-progs: add udev rule to use mq-deadline on zoned btrfs 2022-02-01 18:41:43 +01:00
CHANGES btrfs-progs: update CHANGES for 6.0.1 2022-11-04 20:23:22 +01:00
COPYING
INSTALL btrfs-progs: docs: update documentation site references in manual pages 2022-10-11 09:08:12 +02:00
Makefile btrfs-progs: build: redirect dependency files files to .deps 2022-10-11 09:08:11 +02:00
Makefile.extrawarn btrfs-progs: build: disable -Waddress-of-packed-member by default 2019-06-14 15:09:53 +02:00
Makefile.inc.in btrfs-progs: build: rename compression support variables 2022-08-16 15:18:11 +02:00
README.md btrfs-progs: docs: update documentation site references in manual pages 2022-10-11 09:08:12 +02:00
VERSION Btrfs progs v6.0.1 2022-11-04 20:31:03 +01:00
autogen.sh btrfs-progs: build: simplify version tracking 2018-01-31 15:14:01 +01:00
btrfs-completion btrfs-progs: completion: add recently added commands 2022-11-09 18:57:35 +01:00
btrfs-corrupt-block.c btrfs-progs: use template for transaction start error messages 2022-10-11 09:08:10 +02:00
btrfs-crc.c btrfs-progs: move crc32c implementation to crypto/ 2019-11-18 19:20:02 +01:00
btrfs-debugfs btrfs-progs: port btrfs-debugfs to python3 2020-07-02 22:24:33 +02:00
btrfs-find-root.c btrfs-progs: map-logical: use message helpers for error messages 2022-10-11 09:08:07 +02:00
btrfs-fragments.c btrfs-progs: reorder includes in standalone tools 2022-10-11 09:06:12 +02:00
btrfs-map-logical.c btrfs-progs: kernel-lib: remove radix-tree 2022-10-11 09:08:07 +02:00
btrfs-sb-mod.c btrfs-progs: reorder includes in standalone tools 2022-10-11 09:06:12 +02:00
btrfs-select-super.c btrfs-progs: remove unnecessary casts for u64 2022-10-11 09:08:09 +02:00
btrfs.c btrfs-progs: common: update include lists, part 1 2022-10-11 09:08:07 +02:00
btrfstune.c btrfs-progs: warn when an experimental functionality is used 2022-10-20 16:39:11 +02:00
configure.ac btrfs-progs: receive: add support for fs-verity 2022-10-11 09:08:08 +02:00
fsck.btrfs
ioctl.h btrfs-progs: send: stream v2 ioctl flags 2022-06-07 13:59:33 +02:00
kerncompat.h btrfs-progs: qgroup: add path to show output 2022-10-25 21:12:24 +02:00
libbtrfs.sym btrfs-progs: delete commented exports from libbtrfs.sym 2022-05-12 20:04:39 +02:00
quick-test.c btrfs-progs: kernel-lib: remove radix-tree 2022-10-11 09:08:07 +02:00
show-blocks btrfs-progs: Remove btrfs-debug-tree command 2018-04-24 13:00:10 +02:00
version.h.in

README.md

Btrfs-progs

Userspace utilities to manage btrfs filesystems. License: GPLv2.

Btrfs is a copy on write (COW) filesystem for Linux aimed at implementing advanced features while focusing on fault tolerance, repair and easy administration.

This repository hosts the following utilities and also documentation:

See INSTALL for build instructions and tests/README.md for testing information.

Release cycle

The major version releases are time-based and follow the cycle of the Linux kernel releases. The cycle usually takes 2 months. Minor version releases may happen in the meantime if there are bug fixes or minor useful improvements queued.

The release tags are signed with the GPG key ID F2B4 1200 C54E FB30 380C 1756 C565 D5F9 D76D 583B, and release tarballs are hosted at kernel.org. See the file CHANGES or the changelogs on the wiki.

Reporting bugs

There are several ways to report a bug. Each has its own specifics and a different audience that can give feedback or work on a fix. The following list is sorted in order of preference:

  • GitHub issue tracker
  • the mailing list linux-btrfs@vger.kernel.org -- subscription is not required, but beware that the mail might get overlooked among other traffic
  • IRC (irc.libera.chat #btrfs) -- good for discussions, e.g. whether a bug is already known, but reports could miss developers' attention
  • bugzilla.kernel.org -- (requires registration), set the product to Filesystems and component Btrfs, please put 'btrfs-progs' into the subject so it's clear that it's not a kernel bug report

Development

Patch submissions, development and general discussions take place on the linux-btrfs@vger.kernel.org mailing list; subscription is not required to post.

GitHub pull requests will not be accepted directly; the preferred way is to send patches to the mailing list instead. You can link to a branch in any git repository if the mails do not make it to the mailing list or just for convenience (it makes testing easier).

The development model of btrfs-progs shares a lot with the kernel model. The GitHub way is different in some respects. We, the upstream community, expect that the patches meet some criteria (often lacking in GitHub contributions):

  • one logical change per patch: e.g. not mixing bugfixes, cleanups, features etc.; sometimes the boundary is not clear, and this will usually be pointed out during reviews
  • proper subject line: e.g. prefix with btrfs-progs: subpart, ... , descriptive yet not too long, see git log --oneline for some inspiration
  • proper changelog: changelogs are often missing or lack an explanation of why the change was made, how something is broken, what the user-visible effects of the bug or the fix are, how an improvement helps, or what the intended use case is
  • the Signed-off-by line: this documents who authored the change, you can read more about the Developer's Certificate of Origin (chapter 11)
    • if you are not used to the signed-off style, your contributions won't be rejected just because it's missing; the Author: tag will be added as a substitute in order to allow contributions without much bothering with formalities

Source code coding style and preferences follow the kernel coding style. You can find the editor settings in .editorconfig and use the EditorConfig plugin to let your editor use that, or update your editor settings manually.

Testing

The testing documentation can be found in tests/ and continuous integration/container images in ci/.

Documentation updates

Documentation fixes or updates do not need much explanation, so sticking to the code rules in the previous section is not necessary. GitHub pull requests are OK; patches can also be sent to me directly and are not required to go to the mailing list as well. Pointing out typos via IRC also works, although they might get accidentally lost in the noise.

Documents are written in RST and built by Sphinx.

Third-party sources

Build dependencies are listed in INSTALL. Implementation of checksum/hash functions is provided by copies of the respective sources to avoid adding dependencies that would make deployments in rescue or limited environments harder. The implementations are portable and neither optimized for speed nor accelerated. Optionally it's possible to use libgcrypt, libsodium or libkcapi implementations.

Some other code is borrowed from the kernel, e.g. the raid5 tables or data structure implementations.

References