Commit Graph

5491 Commits

Author SHA1 Message Date
Josef Bacik
6ee697fb25 btrfs-progs: mkfs: use blocks_nr to determine the super used bytes
We were setting the super block's used bytes to a static number.
However the number of blocks we have to write has the correct used size,
so just add up the total number of blocks we're allocating as we
determine their offsets.  This value will be used later which is why I'm
calculating it this way instead of doing the math to set the bytes_super
specifically.
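
A minimal sketch of the idea, not the actual mkfs code: the used bytes
are accumulated while offsets are handed out, instead of being a static
number.  The block names, NODESIZE and START_OFFSET below are
assumptions made for illustration only.

```python
NODESIZE = 16384            # assumed nodesize in bytes
START_OFFSET = 1024 * 1024  # hypothetical offset of the first tree block


def assign_offsets(block_names, nodesize=NODESIZE, start=START_OFFSET):
    """Give every initial block an offset and tally the bytes they use."""
    offsets = {}
    total_used = 0
    for index, name in enumerate(block_names):
        offsets[name] = start + index * nodesize
        total_used += nodesize  # one tree block per entry
    return offsets, total_used


# Hypothetical list of initial blocks; the super block is not part of it.
blocks = ["root", "extent", "chunk", "dev", "fs", "csum"]
offsets, bytes_used = assign_offsets(blocks)
```

Adding or removing an entry in the list changes bytes_used without any
further arithmetic.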

Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-09-03 15:32:16 +02:00
Josef Bacik
5a164401e9 btrfs-progs: mkfs: get rid of MKFS_SUPER_BLOCK
We use these blocks in order to keep track of which blocks need to be
added to the extent tree and where their roots need to be written.
However we skip MKFS_SUPER_BLOCK for all of these helpers, and we don't
actually need to keep track of the specific block we allocated because
it is always BTRFS_SUPER_INFO_OFFSET.  Remove this enum as we don't need
it.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-09-03 15:30:54 +02:00
Josef Bacik
5e63118215 btrfs-progs: mkfs: use an associative array for init blocks
Allow creating trees more flexibly, e.g. in no fixed order.  To handle
this we want to rework the initial mkfs step to take an array of the
blocks we want to create and use the array to keep track of which blocks
we need to create.  Use that for the current format and make it ready
for extent tree v2.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-09-03 15:26:35 +02:00
David Sterba
0534441e2e btrfs-progs: tests: add mkfs test for raid0/1 and raid10/2
Extend basic tests with the degenerate raid0 and raid10, coming in 5.15.
Mounting a freshly created filesystem works even on older kernels.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-27 15:40:53 +02:00
David Sterba
a177ef7dd4 btrfs-progs: mkfs: allow degenerate raid0/raid10
Kernel patch b2f78e88052bc0bee ("btrfs: allow degenerate raid0/raid10")
in 5.15 will allow mounting and converting to single device raid0 or two
device raid10.  Let mkfs create such filesystems.

"The motivation is to allow to preserve the profile type as long as it
 possible for some intermediate state (device removal, conversion), or
 when there are disks of different size, with raid0 the otherwise
 unusable space of the last device will be used too.  Similarly for
 raid10, though the two largest devices would need to be the same."

Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-27 15:40:53 +02:00
Li Zhang
b199123b33 btrfs-progs: build: fix detection of ext4 i_{a,c,m}time_extra
Running convert-tests.sh reported that the 019-ext4-copy-timestamps test
failed:

  ...
  mount -o loop -t ext4 btrfs-progs/tests/test.img btrfs-progs/tests/mnt
  ====== RUN CHECK touch btrfs-progs/tests/mnt/file
  ====== RUN CHECK stat btrfs-progs/tests/mnt/file
  File: 'btrfs-progs/tests/mnt/file'
  Size: 0           Blocks: 0          IO Block: 4096   regular empty file
  Device: 700h/1792d  Inode: 13          Links: 1
  Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
  Context: unconfined_u:object_r:unlabeled_t:s0
  Access: 2021-08-24 22:10:21.999209679 +0800
  Modify: 2021-08-24 22:10:21.999209679 +0800
  Change: 2021-08-24 22:10:21.999209679 +0800
  ...
  btrfs-progs/btrfs-convert btrfs-progs/tests/test.img
  ...
  ====== RUN CHECK mount -t btrfs -o loop btrfs-progs/tests/test.img btrfs-progs/tests/mnt
  ====== RUN CHECK stat btrfs-progs/tests/mnt/file
  File: 'btrfs-progs/tests/mnt/file'
  Size: 0           Blocks: 0          IO Block: 4096   regular empty file
  Device: 2ch/44d Inode: 267         Links: 1
  Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
  Context: unconfined_u:object_r:unlabeled_t:s0
  Access: 2021-08-24 22:10:21.000000000 +0800
  Modify: 2021-08-24 22:10:21.000000000 +0800
  Change: 2021-08-24 22:10:21.000000000 +0800
  ...
  atime on converted inode does not match
  test failed for case 019-ext4-copy-timestamps

The log shows that btrfs-convert did not preserve nanoseconds.
Looking at the source code, btrfs-convert supports nanoseconds only if
ext2_fs.h defines EXT4_EPOCH_MASK.  However, EXT4_EPOCH_MASK was
introduced in e2fsprogs v1.43, while some older versions, such as v1.40,
already support nanoseconds.  It seems that if struct ext2_inode_large
contains the i_atime_extra member, ext4 supports nanoseconds, so I
updated the logic that determines whether the current ext4 file system
supports nanosecond precision.  In addition, I imported some definitions
to encode and decode tv_nsec (copied from e2fsprogs source code).

Author: Li Zhang <zhanglikernel@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-26 20:34:36 +02:00
Qu Wenruo
325dba6432 btrfs-progs: check: output proper csum values for --check-data-csum
[BUG]
When running "btrfs check --check-data-csum" on fs with corrupted data,
the error message makes almost no sense:

  $ btrfs check --check-data-csum /dev/test/test
  Opening filesystem to check...
  Checking filesystem on /dev/test/test
  UUID: c31afe0a-55bc-4e7d-aba0-9dfa9ddf8090
  [1/7] checking root items
  [2/7] checking extents
  [3/7] checking free space cache
  [4/7] checking fs roots
  [5/7] checking csums against data
  mirror 1 bytenr 13631488 csum 19 expected csum 152 <<<
  ERROR: errors found in csum tree
  [6/7] checking root refs
  [7/7] checking quota groups skipped (not enabled on this FS)
  found 147456 bytes used, error(s) found
  total csum bytes: 16
  total tree bytes: 131072
  total fs tree bytes: 32768
  total extent tree bytes: 16384
  btree space waste bytes: 124799
  file data blocks allocated: 16384
   referenced 16384

[CAUSE]
We're just outputting the first byte, and in decimal, which is
completely different from what we do in kernel space and from what we do
for a metadata csum mismatch.

[FIX]
Use btrfs_format_csum() for btrfs-check to output csum.

Now the result looks much better:

  [5/7] checking csums against data
  mirror 1 bytenr 13631488 csum 0x13fec125 expected csum 0x98757625
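
The two outputs above are consistent with each other: 19 and 152 are
just the first bytes (0x13 and 0x98) of the full checksums printed in
decimal.  A small sketch of the difference, where format_csum() is only
a stand-in and not the real btrfs_format_csum():

```python
def format_csum(csum: bytes) -> str:
    """Print every checksum byte in hex, prefixed with 0x."""
    return "0x" + csum.hex()


found = bytes.fromhex("13fec125")
expected = bytes.fromhex("98757625")

# Old style: only the first byte, in decimal.
old_style = f"csum {found[0]} expected csum {expected[0]}"

# New style: the whole checksum, in hex.
new_style = f"csum {format_csum(found)} expected csum {format_csum(expected)}"
```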

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-26 14:27:40 +02:00
Qu Wenruo
773afad3e6 btrfs-progs: slightly enhance btrfs_format_csum()
- Change it to return void
  The old one always returned csum_size.

- Use snprintf()

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-26 14:27:01 +02:00
Qu Wenruo
991a598f53 btrfs-progs: move btrfs_format_csum() to common/utils.[ch]
Function btrfs_format_csum() is a special helper only used in
btrfs-progs.

Move it to common/utils.[ch] rather than leaving it in
kernel-shared/disk-io.c.

Since we're moving the code, also introduce a macro,
BTRFS_CSUM_STRING_LEN, to replace open-coded string length calculation.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-26 14:26:13 +02:00
Josef Bacik
4d57632e2f btrfs-progs: tests: add image with an invalid super bytes_used
This is used to validate the detection and correction code in both fsck
modes for an invalid bytes_used value in the super block.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-25 15:43:13 +02:00
Josef Bacik
e64af00bd1 btrfs-progs: check: detect and fix problems with super_bytes_used
We do not detect problems with our bytes_used counter in the super
block.  Thankfully the same method to fix block groups is used to re-set
the value in the super block, so simply add some extra code to validate
the bytes_used field and then piggy back on the repair code for block
groups.

Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-25 15:39:48 +02:00
Josef Bacik
04520da77b btrfs-progs: check: do not infinite loop on corrupt keys with lowmem mode
By enabling the lowmem checks properly I uncovered the case where test
fsck/007 will infinite loop at the detection stage.  This is because
when checking the inode item we will just btrfs_next_item(), and because
we ignore check tree block failures at read time we don't get an -EIO
from btrfs_next_leaf.  Generally what check usually does is validate the
leaves/nodes as we hit them, but in this case we're not doing that.  Fix
this by checking the leaf if we move to the next one and if it fails
bail.  This allows us to pass the fsck/007 test with lowmem.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-25 15:39:48 +02:00
Josef Bacik
4a1863d638 btrfs-progs: check: do not double add unaligned extent records
The repair cycle in the main check will drop all of our cache and loop
through again to make sure everything is still good to go.
Unfortunately we record our unaligned extent records on a per-root list
so they can be retrieved when we're checking the fs roots.  This isn't
straightforward to clean up, so instead simply check our current list of
unaligned extent records when we are adding a new one to make sure we're
not duplicating our efforts.  This makes us able to pass fsck/001 with
my super bytes_used fix applied.
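
Roughly, the check-before-add looks like the sketch below.  The record
fields and the helper name are hypothetical and only illustrate the
duplicate check, not the real per-root list handling.

```python
def add_unaligned_extent_rec(records, objectid, owner, bytenr, num_bytes):
    """Append a record unless an identical one is already on the list.

    Returns True if the record was added, False if it was a duplicate.
    """
    rec = (objectid, owner, bytenr, num_bytes)
    if rec in records:
        return False  # already recorded in a previous pass
    records.append(rec)
    return True


per_root_list = []
first_pass = add_unaligned_extent_rec(per_root_list, 257, 5, 4097, 8192)
# The repair cycle loops again and hits the same extent.
second_pass = add_unaligned_extent_rec(per_root_list, 257, 5, 4097, 8192)
```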

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-25 15:38:54 +02:00
Josef Bacik
8c3c13bb45 btrfs-progs: check blocks in btrfs_next_sibling_block
By enabling the lowmem checks properly I uncovered the case where test
fsck/007 will infinite loop at the detection stage.  This is because
when checking the inode item we will just btrfs_next_item(), and because
we ignore check tree block failures at read time we don't get an -EIO
from btrfs_next_leaf.

This occurs because we allow fsck to raw-read blocks even if they fail
basic sanity checks, because we want the opportunity to repair the
blocks.  However this means corrupt blocks are sitting in cache marked
as uptodate.  btrfs_search_slot() handles this by doing a check_block()
on every block we add to the path, so that anything that is doing a
search gets a proper -EIO.

btrfs_next_sibling_block() needs a similar check.  With this fix we now
return -EIO on btrfs_next_leaf() properly and we no longer infinite loop
on fsck/007 with lowmem.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-25 15:38:54 +02:00
Josef Bacik
71404c50e8 btrfs-progs: tests: fix running check mode lowmem tests
When I added the invalid super image I saw that the lowmem tests were
passing, despite not having the detection code yet.  Turns out this is
because we weren't using a run command helper which does the proper
expansion and adds the --mode=lowmem option.  Fix this to use the proper
handler, and now the lowmem test fails properly without my patch to add
this support to the lowmem mode.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-25 15:38:54 +02:00
Josef Bacik
a4190da45e btrfs-progs: check btrfs_super_used in lowmem check
We can already fix this problem with the block accounting code, we just
need to keep track of how much we should have used on the file system,
and then check it against the bytes_super.  The repair just piggy backs
on the block group used repair.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-25 15:38:54 +02:00
Josef Bacik
c3521f8a57 btrfs-progs: propagate extent item errors in lowmem mode
Test 044 was failing with lowmem because it was not bubbling up the
error to the user.  This is because we try to give repair the
opportunity to clear the error, however if repair isn't set we simply do
not add the temporary error to the main error return variable.  Fix this
by adding the tmp_err to err before moving on to the next item.

Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-25 15:38:54 +02:00
Josef Bacik
477775946d btrfs-progs: propagate fs root errors in lowmem mode
We have a check that will return an error only if ret < 0, but we return
the lowmem specific errors which are all > 0.  Fix this by simply
checking if (ret).  This allows test 010 to pass with lowmem properly.
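
To illustrate why the old condition missed these errors, here is a
sketch with a made-up positive error flag (LOWMEM_ERR is an assumption,
not a real constant):

```python
LOWMEM_ERR = 1 << 3  # hypothetical lowmem error bit, always > 0


def old_check(ret):
    """Old condition: only negative values count as errors."""
    return ret < 0


def new_check(ret):
    """New condition: any non-zero value counts as an error."""
    return ret != 0


missed_before = not old_check(LOWMEM_ERR)
caught_now = new_check(LOWMEM_ERR)
```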

Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-25 15:38:54 +02:00
David Sterba
28df4bbd74 btrfs-progs: corrupt-block: leave only long option for --block-group
Long options are always preferred and in case there's a long list of
single letter options it's the best practice to keep the options sane.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-25 15:38:54 +02:00
Josef Bacik
72a37dca27 btrfs-progs: tests: add image with a corrupt block group item
This image has a broken used field of a block group item to validate
fsck does the correct thing.

Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-25 15:38:54 +02:00
Josef Bacik
a832d36b59 btrfs-progs: check: detect and fix invalid used for block groups
The lowmem mode validates the used field of the block group item, but
the normal mode does not.  Fix this by keeping a running tally of what
we think the used value for the block group should be, and then if it
mismatches report an error and fix the problem if we have repair set.
We have to keep track of pending extents because we process leaves as we
see them, so it could be much later in the process that we find the
block group item to associate the extents with.
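
A rough sketch of the bookkeeping, assuming a simplified model where
extents and block group items arrive in arbitrary order.  The class and
its fields are hypothetical, not the structures used in check.

```python
class BlockGroupUsedTracker:
    """Tally extent bytes per block group, tolerating any arrival order."""

    def __init__(self):
        self.block_groups = {}  # start -> {"length", "recorded", "actual"}
        self.pending = []       # (bytenr, num_bytes) seen before their group

    def _find(self, bytenr):
        for start, bg in self.block_groups.items():
            if start <= bytenr < start + bg["length"]:
                return bg
        return None

    def add_extent(self, bytenr, num_bytes):
        bg = self._find(bytenr)
        if bg is None:
            # Block group item not seen yet, remember the extent for later.
            self.pending.append((bytenr, num_bytes))
        else:
            bg["actual"] += num_bytes

    def add_block_group(self, start, length, recorded_used):
        bg = {"length": length, "recorded": recorded_used, "actual": 0}
        self.block_groups[start] = bg
        remaining = []
        for bytenr, num_bytes in self.pending:
            if start <= bytenr < start + length:
                bg["actual"] += num_bytes  # claim the pending extent
            else:
                remaining.append((bytenr, num_bytes))
        self.pending = remaining

    def mismatches(self):
        """Return (start, recorded_used, actual_used) for bad block groups."""
        return [(start, bg["recorded"], bg["actual"])
                for start, bg in sorted(self.block_groups.items())
                if bg["recorded"] != bg["actual"]]
```

Each mismatch is what would be reported as an error, and rewritten
with the tallied value if repair is set.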

Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-25 15:38:54 +02:00
Josef Bacik
572a0e888a btrfs-progs: corrupt-block: add ability to corrupt block group items
While doing the extent tree v2 stuff I noticed that fsck doesn't detect
an invalid ->used value on the block group item in the normal mode.  To
build a test case for this I need the ability to corrupt block group
items.  This allows us to corrupt the various fields of a block group.

Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-25 15:38:54 +02:00
Qu Wenruo
3875c14a5a btrfs-progs: image: fix restored image size misalignment
[BUG]
There is a small device size misalignment between the super block device
size and the device extent size:

  total_bytes             10737418240 	<<<
  bytes_used              15097856
  dev_item.total_bytes    10737418240
  dev_item.bytes_used     1094713344

        item 0 key (DEV_ITEMS DEV_ITEM 1) itemoff 16185 itemsize 98
                devid 1 total_bytes 1095761920 bytes_used 1094713344
				    ^^^^^^^^^^

[CAUSE]
In fixup_device_size(), we only reset superblock device item size, which
will be overwritten in write_dev_supers() using btrfs_device::total_bytes.

And it doesn't touch btrfs_superblock::total_bytes either.

[FIX]
So fix the small mismatch by also resetting btrfs_device::total_bytes,
btrfs_device::bytes_used and btrfs_superblock::total_bytes.

Thankfully since commit 73dd4e3c87 ("btrfs-progs: image: Don't modify
the chunk and device tree if the source dump is single device") a single
device dump won't have this problem, but the fix is still worthwhile for
multi-device dumps.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-25 15:38:53 +02:00
Qu Wenruo
c4ff83cb5d btrfs-progs: image: reduce memory requirements for decompression
With the recent change to enlarge max_pending_size to 256M for data dump,
the decompress code requires quite a lot of memory, up to 256M * 4.

The reason is we're using the wrapped uncompress() function call, which
needs the buffer to be large enough to contain the decompressed data.

This patch reworks the decompression to use inflate(), which can
resume its decompression so that we can use a much smaller buffer size.

Use 512K as buffer size.

Now the memory consumption for restore is reduced to

 cluster data size + 512K * nr_running_threads

Instead of the original one:

 cluster data size + 1G * nr_running_threads
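
The same resumable pattern can be sketched with Python's zlib module,
where decompressobj() stands in for the C inflate() loop.  This is an
illustration of the technique under that assumption, not the
btrfs-image code; inflate_in_chunks() is a hypothetical helper.

```python
import zlib

CHUNK = 512 * 1024  # output buffer size, as in the patch description


def inflate_in_chunks(compressed, chunk_size=CHUNK):
    """Yield decompressed data in pieces of at most chunk_size bytes."""
    decomp = zlib.decompressobj()
    data = compressed
    while True:
        out = decomp.decompress(data, chunk_size)
        if out:
            yield out
        # Input that was not consumed because the output limit was hit.
        data = decomp.unconsumed_tail
        if decomp.eof:
            break
        if not data and not out:
            break  # truncated stream, nothing more can be produced


raw = bytes(3 * 1024 * 1024 + 123)
pieces = list(inflate_in_chunks(zlib.compress(raw)))
```

No piece is ever larger than the 512K buffer, while a one-shot
uncompress() would need room for the whole decompressed cluster, here
assumed to be 4 * 256M = 1G in the worst case.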

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-25 15:38:53 +02:00
Qu Wenruo
0cc8a7bd4a btrfs-progs: image: introduce -d option to dump data
This new experimental data dump feature will dump the whole image, not
only the existing tree blocks but also all its data extents(*).

This feature will rely on the new dump format (_DUmP_v1), as it needs
extra large extent size limit, and older btrfs-image dump can't handle
such large item/cluster size.

Since we're dumping all extents including data extents, for the restored
image there is no need to use any extra super block flags to inform the
kernel.  The kernel should just treat the restored image as any ordinary
btrfs.

This new feature will be hidden behind the experimental features, that's
to say, if --enable-experimental is not enabled, although we still have
the option, it will not do anything but output an error message.

*: The data extents will be dumped as is, that's to say, even for a
preallocated extent, its (meaningless) data will be read out and
dumped.  This behavior will cause extra space usage for the image, but
we can skip all the complex partially shared preallocated extent checks.

Issue: #394
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-25 15:38:53 +02:00
Qu Wenruo
6b03e0dc40 btrfs-progs: image: introduce framework for more dump versions
The original dump format only contains a magic member to verify the
format.  This means if we want to introduce a new on-disk format or
change a certain size limit, we can only introduce a new magic as the
version.

Introduce the framework to allow multiple magic numbers to co-exist for
further extensions.

Introduce the following members for each dump version.

- max_pending_size
  The threshold size of a cluster. It's not a hard limit but a soft
  one. One cluster can go larger than max_pending_size for one item, but
  next item would go to the next cluster.

- magic_cpu
  The magic number in CPU byte order.

- extra_sb_flags
  Whether the super block of the restored image needs extra super block
  flags like BTRFS_SUPER_FLAG_METADUMP_V2.
  For the incoming data dump feature, we don't need any extra super
  block flags.

This change also implies that all image dumps will use the same magic
for all clusters. No mixing is allowed, as we will use the first cluster
to determine the dump version.
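
A sketch of how such a per-version table could look.  The member names
follow the list above, but the magic numbers are placeholders and the
size for the old format is an assumption, so none of the values should
be read as the real ones.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DumpVersion:
    version: int
    max_pending_size: int  # soft threshold for one cluster, in bytes
    magic_cpu: int         # magic number in CPU byte order
    extra_sb_flags: bool   # restored super block needs extra flags


DUMP_VERSIONS = (
    # Placeholder magics; 256K for the old format is an assumption.
    DumpVersion(0, 256 * 1024, 0x1111111111111111, True),
    DumpVersion(1, 256 * 1024 * 1024, 0x2222222222222222, False),
)


def detect_version(first_cluster_magic):
    """Pick the dump version from the magic of the first cluster."""
    for candidate in DUMP_VERSIONS:
        if candidate.magic_cpu == first_cluster_magic:
            return candidate
    return None


def cluster_matches(version, cluster_magic):
    """Every later cluster must carry the same magic, no mixing."""
    return cluster_magic == version.magic_cpu
```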

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-25 15:38:53 +02:00
David Sterba
6ea4830f8f btrfs-progs: build: add configure time option to enable experimental features
Add --enable-experimental configure option that allows merging unstable
features or partially implemented features. This is supposed to help
features that need time to settle, tweak output or formatting and would
require constant rebases and would have limited exposure to users that
could provide feedback.

If this is enabled, the following may change without notice:

- the whole feature may disappear in the future
- new command names could change or relocate to other subcommands
- parameter names
- output formatting
- json output

Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-25 15:38:53 +02:00
Qu Wenruo
9df3619a9d btrfs-progs: require full nodesize alignment for subpage support
For the incoming extra page size support for subpage (sectorsize <
PAGE_SIZE) cases, the support for metadata will be a critical point.

Currently for subpage support, we require 64K page size, so that no
matter what the nodesize is, it will be contained inside one page.
And we will reject any tree block which crosses page boundary.

But for other page sizes, especially 16K page size, we must support
nodesize differently.

For nodesize < PAGE_SIZE, we will have the same requirement (tree blocks
can't cross page boundary).
While for nodesize >= PAGE_SIZE, we will require the tree blocks to be
page aligned.

To support such a feature, we will make btrfs-check report more
subpage related warnings for metadata.

This patch will report any tree block which is not nodesize aligned as a
warning.

Existing mkfs/convert already make sure all new tree blocks are
nodesize aligned, this is just for older converted filesystems.
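
As an illustration of why a single nodesize alignment check is enough
for both cases above (assuming power-of-two sizes), consider this
sketch.  The helper names are hypothetical and the warning text is not
the one printed by btrfs-check.

```python
K = 1024


def crosses_page_boundary(bytenr, nodesize, page_size):
    """True if the tree block touches more than one page."""
    first_page = bytenr // page_size
    last_page = (bytenr + nodesize - 1) // page_size
    return first_page != last_page


def subpage_warning(bytenr, nodesize):
    """Return a warning string for a tree block that is not aligned."""
    if bytenr % nodesize != 0:
        return f"tree block {bytenr} is not nodesize ({nodesize}) aligned"
    return None


# 16K nodesize on a 64K page: a block at 56K spills into the next page
# and is also not nodesize aligned, so the warning catches it.
spills = crosses_page_boundary(56 * K, 16 * K, 64 * K)
warned = subpage_warning(56 * K, 16 * K)
```

A nodesize aligned block never crosses a page when nodesize is smaller
than the page size, and is always page aligned otherwise.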

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-20 14:45:58 +02:00
Qu Wenruo
701c7d58db btrfs-progs: tests: don't check subpage related warnings for simple fsck tests
For fsck tests, we check the subpage warnings for each type 1 test, but
such type 1 tests are mostly read-only tests, and one of the tests will
trigger new subpage related warnings (fsck/018).

For subpage related warnings, what we really care about are write
operations, including mkfs, btrfs-convert and repair, not those
read-only tests.

So skip the subpage warning check for fsck type 1 tests to prevent false
alerts from the later, stricter subpage warnings.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-20 14:44:09 +02:00
Qu Wenruo
82e0720312 btrfs-progs: tests: also check subpage warning for check_image cases
There are two types of test cases:

- Type 1 (without test.sh)
- Type 2 (test.sh, mostly will override check_image())

For Type 2 tests, we check subpage related warnings of btrfs-check, but
don't check them for Type 1 test cases.

In fact, Type 1 test cases are more important, as they involve repair,
which can generate new tree blocks, and we want to make sure such new
tree blocks won't cause subpage related warnings.

This patch will add the extra check for Type 1 test cases.

And it will make sure the subpage related warnings are really from this
test case, to prevent false alerts.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-20 14:42:47 +02:00
David Sterba
dc29a5c51d btrfs-progs: convert: update default output
The messages printed by convert are incomplete regarding the source and
target filesystems and can be improved and unified with the style we
already have eg. in mkfs:

  $ btrfs-convert image
  btrfs-convert from btrfs-progs v5.13.1

  Source filesystem:
    Type:           ext2
    Label:
    Blocksize:      4096
    UUID:           b9bb96e0-7b2f-44a1-9670-1b6da27b26b0
  Target filesystem:
    Label:          NEWLABEL
    Blocksize:      4096
    Nodesize:       16384
    UUID:           c7dd7532-7e17-41b0-bf7c-12d695bbf6d0
    Checksum:       crc32c
    Features:       extref, skinny-metadata (default)
      Data csum:    yes
      Inline data:  yes
      Copy xattr:   yes
  Reported stats:
    Total space:      1073741824
    Free space:        805240832 (74.99%)
    Inode count:           65536
    Free inodes:           65525
    Block count:          262144
  Create initial btrfs filesystem
  Create ext2 image file
  Create btrfs metadata
  Copy inodes [o] [         0/        11]
  Set label to 'NEWLABEL'
  Conversion complete

  $ btrfs-convert -r image
  btrfs-convert from btrfs-progs v5.13.1

  Open filesystem for rollback:
    Label:
    UUID:            c7dd7532-7e17-41b0-bf7c-12d695bbf6d0
    Restoring from:  ext2_saved/image

Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-20 14:24:55 +02:00
David Sterba
33543eb2b1 btrfs-progs: remove stale command declarations
The declarations do not correspond to any command descriptors as they
have been moved to other command groups.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-20 14:24:55 +02:00
David Sterba
9f2bfd966e btrfs-progs: convert: rename context volume_name to label
The name was derived from ext2 but we use label elsewhere.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-20 14:24:55 +02:00
David Sterba
03a9dbf784 btrfs-progs: tests: add test for convert --uuid option
Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-20 14:24:55 +02:00
David Sterba
bcae45d9e6 btrfs-progs: convert: new option to copy or specify uuid
Add new option --uuid to convert with the following modes:

- 'copy' -- copy the UUID from the source filesystem
- 'new' -- (default) generate new UUID
- UUID -- a valid UUID that will be set on btrfs
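
The three modes can be summarized by a sketch like the following.
resolve_uuid() is a hypothetical helper for illustration, not the
parsing code in convert.

```python
import uuid


def resolve_uuid(option, source_uuid):
    """Map the --uuid argument to the UUID that will be set on btrfs."""
    if option == "new":
        return str(uuid.uuid4())      # default: generate a fresh UUID
    if option == "copy":
        return source_uuid            # reuse the source filesystem UUID
    # Anything else must be a valid UUID; uuid.UUID raises ValueError
    # for malformed input.
    return str(uuid.UUID(option))


source = "b9bb96e0-7b2f-44a1-9670-1b6da27b26b0"
copied = resolve_uuid("copy", source)
explicit = resolve_uuid("c7dd7532-7e17-41b0-bf7c-12d695bbf6d0", source)
generated = resolve_uuid("new", source)
```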

Based on patch from Florian

https://lore.kernel.org/linux-btrfs/1357486331-4615-2-git-send-email-falbrechtskirchinger@gmail.com/

and ported to contemporary codebase.

Issue: #391
Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-20 14:24:55 +02:00
Qu Wenruo
9d5d3de01c btrfs-progs: subvol delete: try to delete subvolume by id when its path can't be resolved
There is a recent report of ghost subvolumes, where such subvolumes have
no ROOT_REF/BACKREF and 0 root refs, but also no orphan item, thus the
kernel won't queue them for cleanup.

Such ghost subvolumes are just there to take up space, and there is no
way to delete them except by btrfs check, which will try to fix the
problem by adding an orphan item.

There is a kernel patch submitted to allow btrfs to detect such ghost
subvolumes and queue them for cleanup.

But btrfs-progs will not continue to call the ioctl if it can't find the
full subvolume path.

Thus this patch will loosen the restriction by allowing btrfs-progs to
continue to call the ioctl even if it can't grab the subvolume path.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-20 14:24:55 +02:00
Qu Wenruo
a138daac17 btrfs-progs: mkfs: set super_cache_generation to 0 if we're using free space tree
[HICCUP]
There is a bug report that mkfs.btrfs -R free-space-tree still makes
the kernel try to clean up the v1 space cache:

  # mkfs.btrfs -R free-space-tree -f /dev/test/scratch1
  # mount /dev/test/scratch1 /mnt/btrfs
  # dmesg | grep cleaning
  BTRFS info (device dm-6): cleaning free space cache v1

[CAUSE]
By default, mkfs.btrfs will set super cache generation to (u64)-1, which
will inform the kernel that the v1 space cache is invalid and needs to
be regenerated.

But for free space tree, the kernel will set super cache generation to
0, to indicate the v1 space cache is not in use.

This means, even if we enabled free space tree with all the RO
compatible bits and the new tree, as long as super cache generation is
not 0, the kernel still considers the fs to have some invalid v1 space
cache, and will try to remove it.

[FIX]
This is not a big deal, but to make "-R free-space-tree" really behave
as the kernel does, we also need to set super cache generation to 0.
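
The rule can be sketched as below.  Both helpers are simplified models
written for illustration, based only on the description above, and do
not mirror the real mkfs or kernel code paths.

```python
U64_MAX = (1 << 64) - 1  # (u64)-1


def initial_cache_generation(free_space_tree):
    """Value mkfs should write into the super block cache generation."""
    return 0 if free_space_tree else U64_MAX


def kernel_would_clean_v1_cache(cache_generation, free_space_tree):
    """Simplified model: with free space tree, non-zero means cleanup."""
    return free_space_tree and cache_generation != 0


before_fix = kernel_would_clean_v1_cache(U64_MAX, True)
after_fix = kernel_would_clean_v1_cache(initial_cache_generation(True), True)
```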

Reported-by: Chris Murphy <lists@colorremedies.com>
Link: https://lore.kernel.org/linux-btrfs/CAJCQCtSvgzyOnxtrqQZZirSycEHp+g0eDH5c+Kw9mW=PgxuXmw@mail.gmail.com/
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-20 14:24:55 +02:00
David Sterba
e4ac7d4f67 Btrfs progs v5.13.1
Signed-off-by: David Sterba <dsterba@suse.com>
2021-07-30 16:14:58 +02:00
David Sterba
612512a99c btrfs-progs: update CHANGES for 5.13.1
Signed-off-by: David Sterba <dsterba@suse.com>
2021-07-30 16:14:29 +02:00
David Sterba
2f867c21e2 btrfs-progs: mkfs: update message when creating zoned fs with non-single profiles
The defaults for rotational devices are to enable DUP for metadata, this
does not yet work on zoned devices and fails with messages like:

  Zoned: /dev/sda: host-managed device detected, setting zoned feature
  ERROR: cannot use RAID/DUP profile in zoned mode

The RAID/DUP support will be implemented in the future and we don't want
to change the defaults only to revert them back again. This makes it a
bit awkward for the user until this happens, so at least print a hint
that single/single must be set manually.

Link: https://lore.kernel.org/linux-btrfs/20210706091922.38650-1-johannes.thumshirn@wdc.com/
Reported-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-07-30 16:13:58 +02:00
Qu Wenruo
af62875784 btrfs-progs: tests: check nlinks for directories
Make sure btrfs check can detect such a problem.  Right now we have no
way to fix it.

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-07-30 15:52:57 +02:00
Qu Wenruo
478e98cd01 btrfs-progs: check/original: detect directory inode with nlinks >= 2
Linux VFS doesn't allow directories to have hard links, thus for btrfs
on-disk directory inode items, their nlinks should never go beyond 1.

Lowmem mode already has the check and will report it without problem.
Only original mode needs this update.

Reported-by: Pepperpoint <pepperpoint@mb.ardentcoding.com>
Link: https://lore.kernel.org/linux-btrfs/162648632340.7.1932907459648384384.10178178@mb.ardentcoding.com/
Reviewed-by: Su Yue <l@damenly.su>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-07-30 15:50:48 +02:00
Omar Sandoval
e9c2942f38 libbtrfsutil: fix race between subvolume iterator and deletion
Subvolume iteration has a window between when we get a root ref (with
BTRFS_IOC_TREE_SEARCH or BTRFS_IOC_GET_SUBVOL_ROOTREF) and when we look
up the path of the parent directory (with BTRFS_IOC_INO_LOOKUP{,_USER}).
If the subvolume is moved or deleted and its old parent directory is
deleted during that window, then BTRFS_IOC_INO_LOOKUP{,_USER} will fail
with ENOENT. The iteration will then fail with ENOENT as well.

We originally encountered this bug with an application that called
`btrfs subvolume show` (which iterates subvolumes to find snapshots) in
parallel with other threads creating and deleting subvolumes. It can be
reproduced almost instantly with the included test cases.

Subvolume iteration should be robust against concurrent modifications to
subvolumes. So, if a subvolume's parent directory no longer exists, just
skip the subvolume, as it must have been deleted or moved elsewhere.
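
In outline, the skip-on-ENOENT behaviour is something like this sketch.
The iterator and the lookup callback are hypothetical stand-ins for the
ioctl calls named above, not the libbtrfsutil implementation.

```python
import errno


def iterate_subvolumes(root_refs, lookup_parent_path):
    """Yield (subvol_id, path), skipping subvolumes whose parent is gone.

    root_refs is a list of (subvol_id, parent_dirid, name) tuples and
    lookup_parent_path raises OSError(ENOENT) for a deleted directory.
    """
    for subvol_id, dirid, name in root_refs:
        try:
            parent = lookup_parent_path(dirid)
        except OSError as err:
            if err.errno == errno.ENOENT:
                continue  # deleted or moved in the window, not an error
            raise
        yield subvol_id, (parent + "/" + name if parent else name)


def fake_lookup(dirid):
    """Pretend directory 300 was deleted during the iteration."""
    paths = {256: "", 257: "snapshots"}
    if dirid not in paths:
        raise OSError(errno.ENOENT, "No such file or directory")
    return paths[dirid]


refs = [(260, 256, "home"), (261, 300, "gone"), (262, 257, "daily")]
found = list(iterate_subvolumes(refs, fake_lookup))
```

Any other error from the lookup is still propagated to the caller.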

Reviewed-by: Neal Gompa <ngompa13@gmail.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-07-29 13:01:55 +02:00
David Sterba
1907cd64db btrfs-progs: ci: add script to do build test on musl
Run ci/ci-build-musl to verify build of current branch works in
environment with musl libc. Also works for a given branch name.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-07-26 13:45:35 +02:00
er888kh
f9979f9dd6 libbtrfsutil: fix typo in README example
Fix misplaced quote.

Pull-request: #387
Author: er888kh <45465346+er888kh@users.noreply.github.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-07-26 13:30:40 +02:00
Sidong Yang
d302bf5b34 btrfs-progs: cmds: fix build on musl when using NAME_MAX
Some code uses NAME_MAX but doesn't include the header that defines
it. This patch adds a line that includes linux/limits.h, which defines
NAME_MAX.

Issue: #386
Issue: #385
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Sidong Yang <realwakka@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-07-26 13:28:22 +02:00
David Sterba
6527771668 btrfs-progs: add nparity for raid1c34 definitions
The values of .ncopies were not explicitly set.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-07-23 00:59:27 +02:00
Qu Wenruo
07ecf878c1 btrfs-progs: check: batch v1 space cache inodes when clearing
Currently v1 space cache clearing deletes just one cache inode per
transaction, and then starts a new transaction to delete the next
inode.

This is far from efficient and can make the already slow v1 space cache
deletion even slower, as a large fs has tons of cache inodes to delete.

This patch will speed up the process by batching up to 16 inode
deletions into one transaction.

A quick benchmark of deleting 702 v1 space cache inodes would look like
this:

Unpatched:		4.898s
Patched:		0.087s

Which is obviously a big win.
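
For the benchmark above, batching turns 702 transactions into 44.  A
minimal sketch of the grouping, where a transaction is modelled as a
plain list and the helper name is hypothetical:

```python
def batch_deletions(inodes, batch_size=16):
    """Group inode deletions so each transaction handles up to batch_size."""
    transactions = []
    for start in range(0, len(inodes), batch_size):
        transactions.append(inodes[start:start + batch_size])
    return transactions


cache_inodes = list(range(702))
unpatched = batch_deletions(cache_inodes, batch_size=1)
patched = batch_deletions(cache_inodes)
```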

Reported-by: Joshua <joshua@mailmag.net>
Link: https://lore.kernel.org/linux-btrfs/0b4cf70fc883e28c97d893a3b2f81b11@mailmag.net/
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-07-22 16:26:05 +02:00
Qu Wenruo
42566a50ec btrfs-progs: docs: fix the out-of-date comment about free space tree support
Since v4.19, btrfs-progs has had full write support for the free space
tree; the out-of-date warning in btrfs(5) has already confused some end
users.

Update the content to avoid further confusion.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-07-22 16:00:10 +02:00
David Sterba
de4914dbfd Btrfs progs v5.13
Signed-off-by: David Sterba <dsterba@suse.com>
2021-07-13 14:31:38 +02:00