Commit Graph

53 Commits

Author SHA1 Message Date
Qu Wenruo
2f2f6bfe17 btrfs-progs: btrfstune: add the ability to convert to block group tree feature
The new '-b' option will be responsible for converting to block group
tree compat ro feature.

The workflow looks like this for new convert:

- Setting CHANGING_BG_TREE flag
  And initialize fs_info->last_converted_bg_bytenr value to (u64)-1.

  Any bg with bytenr >= last_converted_bg_bytenr will have its bg item
  update go to the new root (bg tree).

- Iterate each block group by their bytenr in descending order
  This involves:
  * Delete the old bg item from the old tree (extent tree)
  * Update last_converted_bg_bytenr to the bytenr of the bg
  * Add the new bg item into the new tree (bg tree)
  * If we have converted a bunch of bgs, commit current transaction

- Clear CHANGING_BG_TREE flag
  And set the new BLOCK_GROUP_TREE compat ro flag and commit.

And since we're doing the convert in multiple transactions, we also need
to resume from last interrupted convert.

In that case, we just grab the last unconverted bg, and start from it.

And to co-operate with the new kernel requirement for both no-holes and
free-space-tree features, the convert tool will check for the
free-space-tree feature. If it is not enabled, the tool will error out
with a message explaining how to continue (by mounting with
"-o space_cache=v2").

For missing no-holes feature, we just need to set the flag during
convert.
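
The routing rule and the descending walk described above can be sketched
roughly as follows. This is a toy model, not the btrfstune code: the struct
and function names are hypothetical, and only the last_converted_bg_bytenr
marker is taken from this message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the fs_info field named in the message. */
struct convert_state {
	uint64_t last_converted_bg_bytenr;
};

/* Initial state: (u64)-1, so no block group uses the new tree yet. */
static inline void convert_start(struct convert_state *st)
{
	st->last_converted_bg_bytenr = (uint64_t)-1;
}

/* A bg item update goes to the bg tree once bytenr >= the marker. */
static inline bool bg_uses_new_tree(const struct convert_state *st,
				    uint64_t bytenr)
{
	return bytenr >= st->last_converted_bg_bytenr;
}

/*
 * Walk block groups in descending bytenr order, converting at most @max
 * of them. bgs[] must be sorted ascending. Groups that already sit at or
 * above the marker are skipped, which is what makes a resume possible.
 * Returns the number of groups converted in this call.
 */
static inline size_t convert_some(struct convert_state *st,
				  const uint64_t *bgs, size_t nr, size_t max)
{
	size_t done = 0;

	for (size_t i = nr; i-- > 0 && done < max; ) {
		if (bg_uses_new_tree(st, bgs[i]))
			continue;	/* converted by an earlier run */
		st->last_converted_bg_bytenr = bgs[i];
		done++;
	}
	return done;
}
```

Calling convert_some() a second time after an interrupted run picks up at the
highest block group still below the marker, mirroring the resume case.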

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-09-12 18:25:32 +02:00
Qu Wenruo
1430b41427 btrfs-progs: separate block group tree from extent tree v2
Block group tree feature is completely a standalone feature, and it has
been over 5 years since it was initially introduced to solve the long
mount time.

I don't really want to waste another 5 years waiting for a feature which
may or may not work, but definitely not properly reviewed for its
preparation patches.

So this patch will separate the block group tree feature into a
standalone compat RO feature.

There is a catch, in mkfs create_block_group_tree(), current
tree-checker only accepts block group item with valid chunk_objectid,
but the existing code from extent-tree-v2 didn't properly initialize it.

This patch will also fix above mentioned problem so kernel can mount it
correctly.

Now mkfs/fsck should be able to handle the fs with block group tree.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-09-12 18:25:32 +02:00
Qu Wenruo
c5a21a7814 btrfs-progs: don't save block group root into super block
The extent tree v2 (thankfully not yet fully materialized) needs a
new root for storing all block group items.

My initial proposal years ago just added a new tree rootid, and load it
from tree root, just like what we did for quota/free space tree/uuid/extent
roots.

But the extent tree v2 patches introduced a completely new (and to me,
wasteful) way to store block group tree root into super block.

Currently there are only 3 trees stored in super blocks, and they all
have their valid reasons:

- Chunk root
  Needed for bootstrap.

- Tree root
  Really the entrance of all trees.

- Log root
  This is special as log root has to be updated out of existing
  transaction mechanism.

There is not even any reason to put block group root into super blocks,
the block group tree is updated at the same timing as old extent tree,
no need for extra bootstrap/out-of-transaction update.

So just move block group root from super block into tree root.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-09-12 15:31:27 +02:00
Boris Burkov
ba7b281049 btrfs-progs: add VERITY ro compat flag
This compat flag is missing, but is being checked by mount, and could
well be present legitimately.

Reviewed-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-08-16 15:18:11 +02:00
Qu Wenruo
963188943f btrfs-progs: make btrfs_super_block::log_root_transid deprecated
This is the same on-disk format update synchronized from the kernel
code.

Unlike kernel, there are two callers reading this member:

- btrfs inspect dump-super
  It's just printing the value, add a notice about deprecation.

- btrfs-find-root
  In that case, since we always got 0, the root search for log root
  should never find a perfect match.

  Use btrfs_super_generation() + 1 to provide a better result.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-08-16 15:18:11 +02:00
David Sterba
0f65bf66be btrfs-progs: libbtrfs: drop ifdef BTRFS_FLAT_INCLUDES where not necessary
Headers that are only exported and not used for build do not need the
BTRFS_FLAT_INCLUDES switch (between local and installed headers). Now
that there are local copies of the shared headers drop the respective
part from local headers.

Signed-off-by: David Sterba <dsterba@suse.com>
2022-06-06 15:48:52 +02:00
Sweet Tea Dorminy
c494724858 btrfs-progs: dump-tree: add print support for verity items
'btrfs inspect-internal dump-tree' doesn't currently know about the two
types of verity items and prints them as 'UNKNOWN.36' or 'UNKNOWN.37'.
So add them to the known item types.

Suggested-by: Boris Burkov <boris@bur.io>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-24 00:49:19 +01:00
Josef Bacik
e33738306c btrfs-progs: handle the per-block group global root id
We will now be using block_group->chunk_objectid to point at the global
root id for this particular block group.  For now we'll assign this
based on mod'ing the offset of the block group against the number of
global root id's and handle the block_group_item updating appropriately.
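
A minimal sketch of the assignment described above, assuming a plain modulo
of the block group offset. The helper name is hypothetical and the real code
may scale the offset first:

```c
#include <stdint.h>

/* Hypothetical helper: pick a global root id for a block group. */
static inline uint64_t bg_global_root_id(uint64_t bg_offset,
					 uint64_t nr_global_roots)
{
	/* Guard against a zero count so the modulo is always defined. */
	if (nr_global_roots == 0)
		return 0;
	return bg_offset % nr_global_roots;
}
```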

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-09 18:07:17 +01:00
Josef Bacik
27eaa3b514 btrfs-progs: set the number of global roots in the super block
In order to make sure the file system is consistent we need to record
the number of global roots we should have in the super block.  We could
infer this from the number of global roots we find, however this could
lead to interesting fuzzing problems, so add a source of truth to the
super block in order to make it easier to verify the file system is
consistent.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-09 18:07:14 +01:00
Josef Bacik
9ee6cc78a8 btrfs-progs: add support for loading the block group root
This adds the ability to load the block group root, as well as make sure
the various backup super block and super block updates are made
appropriately.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-09 18:06:51 +01:00
Josef Bacik
4f184cc911 btrfs-progs: make all of the item/key_ptr offset helpers take an eb
When we change the size of the btrfs_header we're going to need to
change how these helpers calculate where to find the start of items or
block ptrs.  To prepare for that make these helpers take the
extent_buffer as an argument so we can do the appropriate math based on
the version type.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-09 15:13:14 +01:00
Josef Bacik
aba83381a5 btrfs-progs: rework the btrfs_node accessors to match the item accessors
We are duplicating the offsetof(btrfs_node, key_ptr) logic everywhere,
instead use the helper to do this work for us, and make all the node
accessors use the helper.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-09 15:13:14 +01:00
Josef Bacik
5dc3964aaa btrfs-progs: remove the _nr from the item helpers
Now that all callers are using the _nr variations we can simply rename
these helpers to btrfs_item_##member/btrfs_set_item_##member and change
the actual item SETGET funcs to raw_item_##member/set_raw_item_##member
and then change all callers to drop the _nr part.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-09 15:13:13 +01:00
Josef Bacik
f3be0ff01a btrfs-progs: rename btrfs_item_end_nr to btrfs_item_end
All callers use the btrfs_item_end_nr() variation, simply drop
btrfs_item_end() and make btrfs_item_end_nr() use the _nr() variations
of the item get helpers.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-09 15:13:13 +01:00
Josef Bacik
49539423fa btrfs-progs: change btrfs_file_extent_inline_item_len to take a slot
This matches how the kernel does it, simply pass in the slot and fix up
btrfs_file_extent_inline_item_len to use the btrfs_item_nr() helper and
the correct define.  Fixup all the callers to use the slot now instead
of passing in the btrfs_item.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-09 15:13:13 +01:00
Josef Bacik
04ffea07e4 btrfs-progs: add btrfs_set_item_*_nr() helpers
We have a lot of the following patterns

	item = btrfs_item_nr(nr);
	btrfs_set_item_*(eb, item, val);

	btrfs_set_item_*(eb, btrfs_item_nr(nr), val);

in a lot of places in our code.  Instead add _nr variations of these
helpers and convert all of the users to this new helper.
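
To illustrate the shape of the change with toy types (the real
extent_buffer and btrfs_item accessors are generated by macros and read
from on-disk data; every name below is hypothetical):

```c
#include <stdint.h>

/* Toy stand-ins for btrfs_item and extent_buffer. */
struct toy_item {
	uint32_t offset;
	uint32_t size;
};

struct toy_eb {
	struct toy_item items[8];
};

/* Slot lookup, analogous to btrfs_item_nr(). */
static inline struct toy_item *toy_item_nr(struct toy_eb *eb, int nr)
{
	return &eb->items[nr];
}

/* Old style: the caller looks up the item first, then sets a member. */
static inline void toy_set_item_size(struct toy_item *item, uint32_t val)
{
	item->size = val;
}

/* New _nr style: the slot lookup is folded into the setter. */
static inline void toy_set_item_size_nr(struct toy_eb *eb, int nr,
					uint32_t val)
{
	toy_set_item_size(toy_item_nr(eb, nr), val);
}
```

With the _nr variant a caller writes one call, toy_set_item_size_nr(eb, nr,
val), instead of the two-step pattern shown above.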

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-09 15:13:13 +01:00
Josef Bacik
ed098523dc btrfs-progs: reduce usage of __BTRFS_LEAF_DATA_SIZE
This helper only takes the nodesize, but in the future it'll take a bool
to indicate if we're extent tree v2.  The remaining users are all where
we only have extent_buffer, but we should always have a valid
eb->fs_info in these cases, so add BUG_ON()'s for the !eb->fs_info case
and then convert these callers to use BTRFS_LEAF_DATA_SIZE which takes
the fs_info.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-09 15:13:13 +01:00
Josef Bacik
6cf2da1f6f btrfs-progs: store BTRFS_LEAF_DATA_SIZE in the fs_info
This is going to be a different value based on the incompat settings of
the file system, just store this in the fs_info instead of calculating
it every time.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-09 15:13:12 +01:00
David Sterba
1bb6fb896d btrfs-progs: btrfstune: experimental, new option to switch csums
This is still work in progress but can survive some stress testing.
There are still some sanity checks missing, do not use this on valuable
data. To enable this, configure must be run with the experimental
features enabled.

  $ mkfs.btrfs --csum crc32c /dev/sdx
  $ <mount, fill with data, unmount>

  $ btrfstune --csum sha256

Will change the checksum to sha256.

Implementation:

- set bit on superblock when the checksums are being changed (similar to
  the uuid rewrite)
- metadata checksums are overwritten in place
- data checksums:
  - the checksum tree is completely deleted and no checksums are
    verified
  - data blocks are enumerated and all checksums generated (same as
    check --init-csum-tree)

To make it usable, it should be restartable and track the current
progress somehow. Also the previous data checksums should be verified
any time they're available.

Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-08 18:10:03 +01:00
Josef Bacik
532bf58b5b btrfs-progs: sanity check global roots key.offset
For !extent tree v2 we should validate the key.offset == 0, and for
extent tree v2 we should validate that key.offset < nr_global_roots.  If
this fails we need to fail to load the global root so that the
appropriate action is taken.
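
The rule above can be written as a small predicate. This is only a sketch
with a hypothetical function name, not the loader code itself:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true if key.offset of a global root is acceptable:
 * exactly 0 without extent tree v2, below nr_global_roots with it.
 */
static inline bool global_root_offset_valid(bool extent_tree_v2,
					    uint64_t key_offset,
					    uint64_t nr_global_roots)
{
	if (!extent_tree_v2)
		return key_offset == 0;
	return key_offset < nr_global_roots;
}
```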

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-02-16 22:48:01 +01:00
Josef Bacik
5e8a779f5c btrfs-progs: add on disk pointers to global tree ids
We are going to start creating multiple sets of global trees, which at
the moment are the free space tree, csum tree, and extent tree.
Generally we will assign these at block group creation time, but Dave
would like to be able to have them per-subvolume at some point, so
reserve a slot for that as well.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-11-30 19:09:47 +01:00
Josef Bacik
ec0eaae673 btrfs-progs: add definitions for the block group tree
Add the on disk definitions for the block group tree.  This will be part
of the super block so we need to add the appropriate helpers to the
super block, as well as adding it to the backup roots.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-11-30 19:08:39 +01:00
Josef Bacik
3337b7993b btrfs-progs: common: allow users to select extent-tree-v2 option
We want to enable developers to test the extent tree v2 features as they
are added, add the ability to mkfs an extent tree v2 fs if we have
experimental enabled.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-11-30 19:07:34 +01:00
Josef Bacik
b057607325 btrfs-progs: track csum, extent, and free space trees in a rb tree
We are going to have multiples of these trees with extent tree v2, so
add a rb tree to track them based on their root key value.  This works
for both v1 and v2, so we can remove the direct pointers to these roots
in our fs_info.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-11-30 18:57:25 +01:00
Josef Bacik
0b23744de5 btrfs-progs: stop accessing ->free_space_root directly
We're going to have multiple free space roots in the future, so access
it via a helper in most cases.  We will address the remaining direct
accesses in future patches.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-11-30 18:57:19 +01:00
Josef Bacik
db2ab47823 btrfs-progs: stop accessing ->extent_root directly
When we switch to multiple global trees we'll need to access the
appropriate extent root depending on the block group or possibly root.
To handle this, use a helper in most places and then the actual root in
places where it is required.  We will whittle down the direct accessors
with future patches, but this does the bulk of the preparatory work.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-11-30 18:56:54 +01:00
Josef Bacik
639b1fc2e7 btrfs-progs: stop accessing ->csum_root directly
With extent tree v2 we will have per-block group checksums, so add a
helper to access the csum root and rename the fs_info csum_root to
_csum_root to catch all the places that are accessing it directly.
Convert everybody to use the helper except for internal things.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-11-22 21:45:37 +01:00
Josef Bacik
08b63c0fc5 btrfs-progs: stop passing root to csum related functions
We are going to need to start looking up the csum root based on the
bytenr with extent tree v2.  To that end stop passing the root to the
csum related functions so that can be done in the helper functions
themselves.

There's an unrelated deletion of a function prototype that no longer
exists.

Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-11-22 21:45:37 +01:00
Qu Wenruo
c4ff87c3d1 btrfs-progs: cache csum_size and csum_type in btrfs_fs_info
Just like kernel commit 22b6331d9617 ("btrfs: store precalculated
csum_size in fs_info"), we can cache csum_size and csum_type in
btrfs_fs_info.

Furthermore, there is already a 32 bits hole in btrfs_fs_info, and we
can fit csum_type and csum_size into the hole without increasing the size
of btrfs_fs_info.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-11-05 12:50:03 +01:00
Qu Wenruo
76f1a2ed57 btrfs-progs: unify size of btrfs_super_block and BTRFS_SUPER_INFO_SIZE
Just like kernel change, pad struct btrfs_super_block to 4096 bytes. As
ctree.h is part of public headers, use raw number for the superblock
offset.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-11-05 12:50:03 +01:00
Qu Wenruo
5bee5c99bf btrfs-progs: fix printf formats on 32bit x86
When compiling btrfs-progs on 32bit x86 using GCC 11.1.0, there are
several warnings:

  In file included from ./common/utils.h:30,
                   from check/main.c:36:
  check/main.c: In function 'run_next_block':
  ./common/messages.h:42:31: warning: format '%lu' expects argument of type 'long unsigned int', but argument 4 has type 'u32' {aka 'unsigned int'} [-Wformat=]
     42 |                 __btrfs_error((fmt), ##__VA_ARGS__);                    \
        |                               ^~~~~
  check/main.c:6496:33: note: in expansion of macro 'error'
   6496 |                                 error(
        |                                 ^~~~~

  In file included from ./common/utils.h:30,
                   from kernel-shared/volumes.c:32:
  kernel-shared/volumes.c: In function 'btrfs_check_chunk_valid':
  ./common/messages.h:42:31: warning: format '%lu' expects argument of type 'long unsigned int', but argument 4 has type 'u32' {aka 'unsigned int'} [-Wformat=]
     42 |                 __btrfs_error((fmt), ##__VA_ARGS__);                    \
        |                               ^~~~~
  kernel-shared/volumes.c:2052:17: note: in expansion of macro 'error'
   2052 |                 error("invalid chunk item size, have %u expect [%zu, %lu)",
        |                 ^~~~~

  image/main.c: In function 'search_for_chunk_blocks':
  ./common/messages.h:42:31: warning: format '%lu' expects argument of type 'long unsigned int', but argument 3 has type 'size_t' {aka 'unsigned int'} [-Wformat=]
     42 |                 __btrfs_error((fmt), ##__VA_ARGS__);                    \
        |                               ^~~~~
  image/main.c:2122:33: note: in expansion of macro 'error'
   2122 |                                 error(
        |                                 ^~~~~

There are two types of problems:

- __BTRFS_LEAF_DATA_SIZE()
  This macro has no type definition, making it behave differently on
  different arches.

  Fix this by following kernel to use inline function to make its return
  value fixed to u32.

- size_t related output
  For x86_64 %lu is OK but not for x86.

  Fix this by using %zu.
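
A standalone illustration of the two format fixes, assuming a hypothetical
helper and message text: a fixed-width 32-bit value is printed with its
matching conversion and a size_t with %zu, so the same source builds without
-Wformat warnings on both x86 and x86_64.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper; returns what snprintf() returns. */
static inline int format_sizes(char *buf, size_t buflen,
			       uint32_t item_size, size_t min_size)
{
	/* PRIu32 matches uint32_t, %zu matches size_t on every arch. */
	return snprintf(buf, buflen,
			"have %" PRIu32 " expect at least %zu",
			item_size, min_size);
}
```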

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-11-05 12:50:03 +01:00
Wang Yugui
b1d8f945c9 btrfs-progs: mask out all unwanted profiles in btrfs_group_profile_str
Commit ("btrfs-progs: switch btrfs_group_profile_str to use raid table")
introduced a regression that raid profile of GlobalReserve will be
printed as 'unknown'.

  $ btrfs filesystem df /mnt/test
  Data, single: total=5.02TiB, used=4.98TiB
  System, single: total=4.00MiB, used=624.00KiB
  Metadata, single: total=11.01GiB, used=6.94GiB
  GlobalReserve, unknown: total=512.00MiB, used=0.00B

Fix it by:

- take BTRFS_BLOCK_GROUP_RESERVED into account when masking the block
  group flags
- update the define of BTRFS_BLOCK_GROUP_RESERVED too so it's same as in
  kernel

Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Wang Yugui <wangyugui@e16-tech.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-11-05 12:50:03 +01:00
Qu Wenruo
8f81113021 btrfs-progs: check: fix a lowmem mode crash where fatal error is not properly handled
[BUG]
When a special image (derived from fsck/012) has its unused slots (slot
number >= nritems) with garbage, lowmem mode btrfs check can crash:

  (gdb) run check --mode=lowmem ~/downloads/good.img.restored
  Starting program: /home/adam/btrfs/btrfs-progs/btrfs check --mode=lowmem ~/downloads/good.img.restored
  ...
  ERROR: root 5 INODE[5044031582654955520] nlink(257228800) not equal to inode_refs(0)
  ERROR: root 5 INODE[5044031582654955520] nbytes 474624 not equal to extent_size 0

  Program received signal SIGSEGV, Segmentation fault.
  0x0000555555639b11 in btrfs_inode_size (eb=0x5555558a7540, s=0x642e6cd1) at ./kernel-shared/ctree.h:1703
  1703	BTRFS_SETGET_FUNCS(inode_size, struct btrfs_inode_item, size, 64);
  (gdb) bt
  #0  0x0000555555639b11 in btrfs_inode_size (eb=0x5555558a7540, s=0x642e6cd1) at ./kernel-shared/ctree.h:1703
  #1  0x0000555555641544 in check_inode_item (root=0x5555556c2290, path=0x7fffffffd960) at check/mode-lowmem.c:2628

[CAUSE]
At check_inode_item() we have path->slots[0] at 29, while the tree block
only has 26 items.

This happens for two reasons:

- btrfs_next_item() never reverts its slots
  Even if we failed to read next leaf.

- check_inode_item() doesn't inform the caller that a fatal error
  happened
  In check_inode_item(), if btrfs_next_item() failed, it goes to out
  label, which doesn't really set @err properly.

This means, when check_inode_item() fails at btrfs_next_item(), it will
increase path->slots[0], while it's already beyond current tree block
nritems.

When the slot increases further and the unused item slots contain
garbage, we get an invalid btrfs_item_ptr() result, which causes the
above segfault.

[FIX]
Fix the problems in two ways:

- Make btrfs_next_item() to revert its path->slots[0] on failure

- Properly detect fatal error from check_inode_item()

By this, we will no longer crash on the crafted image.
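
The first part of the fix can be modelled with a toy path that has a single
slot. All names here are hypothetical, and next_leaf_ret stands in for the
result of reading the next leaf:

```c
/* Toy path: only the leaf-level slot is modelled. */
struct toy_path {
	int slot;
};

/*
 * Advance to the next item. If the current leaf is exhausted and the
 * next leaf cannot be read (next_leaf_ret < 0), restore the original
 * slot so it never points past nritems, and pass the error up.
 */
static inline int toy_next_item(struct toy_path *path, int nritems,
				int next_leaf_ret)
{
	int orig_slot = path->slot;

	path->slot++;
	if (path->slot < nritems)
		return 0;		/* still inside the current leaf */

	if (next_leaf_ret < 0) {
		path->slot = orig_slot;	/* revert on failure */
		return next_leaf_ret;
	}
	path->slot = 0;			/* first item of the next leaf */
	return next_leaf_ret;
}
```

The caller still has to treat a negative return as fatal, which is the
second part of the fix.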

Reported-by: Wang Yugui <wangyugui@e16-tech.com>
Issue: #412
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-11-04 20:56:42 +01:00
David Sterba
785218efb1 btrfs-progs: remove direct calls to crc32c from ctree.h
Make the helpers using crc32c not inline so the crc32c.h can be removed
from the public headers exported by libbtrfs.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-08 20:46:35 +02:00
David Sterba
732d73dc1f btrfs-progs: remove btrfs_crc32c alias
There's an ancient macro btrfs_crc32c which is just wrapping crc32c and
not doing anything else, so we can use the crc helper directly.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-08 20:46:35 +02:00
David Sterba
979bda6fb5 btrfs-progs: libbtrfs: replace SZ_ constants and drop sizes.h
To drop sizes.h from exported headers, replace the few SZ_ constants
from the existing exported headers (ctree.h, send.h). It would be nice
to use them in the long run but right now it would prevent unexporting
the sizes.h file.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-08 20:46:35 +02:00
David Sterba
38356d456b btrfs-progs: libbtrfs: drop radix-tree.h from exported headers
The header is only included from ctree.h but not actually used, we can
drop it from the exported files.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-08 20:46:35 +02:00
Nikolay Borisov
39c6e0b79c btrfs-progs: add btrfs_uuid_tree_remove
It will be used to clear received data on RW snapshots that were
received. The function is copied from kernel sources.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-08 20:46:34 +02:00
Nikolay Borisov
97640a5b81 btrfs-progs: remove root argument from btrfs_truncate_item
This function lies in the kernel-shared directory and is supposed to be
close to 1:1 copy with its kernel counterpart, yet it takes one extra
argument - root. But this is now unused so simply remove it.

Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-08 20:46:34 +02:00
Nikolay Borisov
7c58b09548 btrfs-progs: remove root argument from btrfs_fixup_low_keys
It's not used, so just remove it.

Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-08 20:46:34 +02:00
Johannes Thumshirn
c22e9487a7 btrfs-progs: remove max_zone_append_size logic
max_zone_append_size is unused and can as well be removed just like we
did on the kernel side.

Keep one sanity check though, so we're not adding devices to a zoned FS
that aren't supporting zone append.

Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-06 16:49:07 +02:00
Qu Wenruo
60651ad9da btrfs-progs: introduce OPEN_CTREE_ALLOW_TRANSID_MISMATCH flag
[BUG]
There is a report that, btrfstune can even work while the fs has transid
mismatch problems.

  $ btrfstune -f -u /dev/sdb1
  Current fsid: b2b5ae8d-4c49-45f0-b42e-46fe7dcfcb07
  New fsid: b2b5ae8d-4c49-45f0-b42e-46fe7dcfcb07
  Set superblock flag CHANGING_FSID
  Change fsid in extents
  parent transid verify failed on 792854528 wanted 20103 found 20091
  parent transid verify failed on 792854528 wanted 20103 found 20091
  parent transid verify failed on 792854528 wanted 20103 found 20091
  Ignoring transid failure
  parent transid verify failed on 792870912 wanted 20103 found 20091
  parent transid verify failed on 792870912 wanted 20103 found 20091
  parent transid verify failed on 792870912 wanted 20103 found 20091
  Ignoring transid failure
  parent transid verify failed on 792887296 wanted 20103 found 20091
  parent transid verify failed on 792887296 wanted 20103 found 20091
  parent transid verify failed on 792887296 wanted 20103 found 20091
  Ignoring transid failure
  ERROR: child eb corrupted: parent bytenr=38010880 item=69 parent level=1 child level=1
  ERROR: failed to change UUID of metadata: -5
  ERROR: btrfstune failed

This leaves a corrupted fs even more corrupted, and due to the extra
CHANGING_FSID flag, btrfs check will not even try to run on it:

  Opening filesystem to check...
  ERROR: Filesystem UUID change in progress
  ERROR: cannot open file system

[CAUSE]
Unlike kernel, btrfs-progs has a less strict check on transid mismatch.

In read_tree_block() we will fall back to using the tree block even if
its transid mismatches, if we can't find any better copy.

However not all commands in btrfs-progs need this feature, only
btrfs-check (which may fix the problem) and btrfs-restore (it just tries
to ignore any problems) really utilize this feature.

[FIX]
Introduce a new open ctree flag, OPEN_CTREE_ALLOW_TRANSID_MISMATCH, to
be explicit about whether we really want to ignore transid error.

Currently only btrfs-check and btrfs-restore will utilize this new flag.

Also add btrfs-image to allow opening such fs with transid error.
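
A sketch of the opt-in check, with a made-up bit value and function name
(the real flag is defined next to the other OPEN_CTREE_* flags, and the real
check lives in the tree block read path):

```c
#include <stdint.h>

/* Hypothetical bit value, for illustration only. */
#define TOY_OPEN_CTREE_ALLOW_TRANSID_MISMATCH	(1U << 0)

/*
 * Returns 0 if the tree block may be used, -5 (-EIO) otherwise.
 * A mismatch is tolerated only when the caller opted in explicitly.
 */
static inline int toy_verify_transid(unsigned int open_flags,
				     uint64_t wanted, uint64_t found)
{
	if (wanted == found)
		return 0;
	if (open_flags & TOY_OPEN_CTREE_ALLOW_TRANSID_MISMATCH)
		return 0;	/* check/restore/image style callers */
	return -5;
}
```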

Link: https://www.reddit.com/r/btrfs/comments/pivpqk/failure_during_btrfstune_u/
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-09-20 12:17:29 +02:00
Qu Wenruo
9a11b1b792 btrfs-progs: backport btrfs_check_node() from kernel
The btrfs_check_node() has far less meaningful error messages compared to
its kernel counterpart, and it even lacks certain checks like level check.

Backport btrfs_check_node() to btrfs-progs to not only unify the code
but greatly improve the readability of the error messages.

Extra modification includes:

- No fs_info needed
  As we don't need to output fsid.

- Remove unlikely() macro

- Extra BTRFS_TREE_BLOCK_* error type

- Btrfs-progs specific error handling
  To record the corrupted tree blocks.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-09-07 14:20:41 +02:00
Qu Wenruo
1f8dfe681f btrfs-progs: use btrfs_key for btrfs_check_node() and btrfs_check_leaf()
In kernel space we hardly use btrfs_disk_key, unless for very lowlevel
code.

There is no need to intentionally use btrfs_disk_key in btrfs-progs
either.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-09-07 13:58:44 +02:00
David Sterba
7572839a74 btrfs-progs: add and use bit masks for RAID1 and RAID56 profiles
Many test conditions can be simplified in case they check all the
related profiles.
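
For example, with masks a test for "any RAID1-like profile" collapses into a
single bitwise AND. The macro names and bit positions below are illustrative
assumptions, not copied from ctree.h:

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative profile bits (assumed positions). */
#define BG_RAID1	(1ULL << 4)
#define BG_RAID5	(1ULL << 7)
#define BG_RAID6	(1ULL << 8)
#define BG_RAID1C3	(1ULL << 9)
#define BG_RAID1C4	(1ULL << 10)

/* One mask per profile family. */
#define BG_RAID1_MASK	(BG_RAID1 | BG_RAID1C3 | BG_RAID1C4)
#define BG_RAID56_MASK	(BG_RAID5 | BG_RAID6)

/* Before: flags & (RAID1 | RAID1C3 | RAID1C4) spelled out at each site. */
static inline bool bg_is_raid1_like(uint64_t flags)
{
	return (flags & BG_RAID1_MASK) != 0;
}

static inline bool bg_is_raid56(uint64_t flags)
{
	return (flags & BG_RAID56_MASK) != 0;
}
```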

Signed-off-by: David Sterba <dsterba@suse.com>
2021-09-06 16:36:18 +02:00
Josef Bacik
79e534def9 btrfs-progs: add the incompat flag for extent tree v2
I will have a lot of preparatory patches to reduce the review pain of
this large feature.  In order to enable that work define the incompat
flag.  Once all of the work lands to support the feature there will be a
patch to actually enable us to select it and manipulate file systems
with that incompat flag set.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-09-06 16:36:17 +02:00
Naohiro Aota
bfd34b7876 btrfs-progs: zoned: redirty clean extent buffers
Tree manipulating operations like merging nodes often release
once-allocated tree nodes. Btrfs cleans such nodes so that pages in the
node are not uselessly written out. On ZONED drives, however, such
optimization blocks the following IOs as the cancellation of the write
out of the freed blocks breaks the sequential write sequence expected by
the device.

Check if the next dirty extent buffer is contiguous with the previously
written one. If not, redirty the extent buffers between the previous one and the
next one, so that all dirty buffers are written sequentially.
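
The contiguity check boils down to comparing the end of the last written
buffer with the start of the next dirty one. A rough sketch, with a
hypothetical helper that reports how many bytes in between would need to be
redirtied:

```c
#include <stdint.h>

/*
 * Returns the size of the hole between the end of the previously
 * written buffer and the start of the next dirty buffer, or 0 if the
 * two are contiguous (or overlap).
 */
static inline uint64_t redirty_gap_bytes(uint64_t prev_written_end,
					 uint64_t next_dirty_start)
{
	if (next_dirty_start <= prev_written_end)
		return 0;
	return next_dirty_start - prev_written_end;
}
```

Dividing the result by the nodesize gives the number of clean buffers that
have to be written again to keep the zone's write sequence intact.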

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-06 16:41:45 +02:00
Naohiro Aota
f08410f078 btrfs-progs: zoned: load zone's allocation offset
A zoned filesystem must allocate blocks at the zones' write pointer. The
device's write pointer position can be mapped to a logical address
within a block group. To facilitate this, add an "alloc_offset" to the
block group to track the logical addresses of the write pointer.

This logical address is populated in btrfs_load_block_group_zone_info()
from the write pointers of corresponding zones.
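
For a block group backed by a single sequential zone, the mapping can be
sketched as the distance of the write pointer from the zone start. The
helper name and the byte units are assumptions:

```c
#include <stdint.h>

/*
 * Translate a zone's write pointer into an allocation offset relative
 * to the start of the block group. Both inputs are physical byte
 * addresses on the device. A write pointer below the zone start is
 * treated as an empty zone.
 */
static inline uint64_t zone_alloc_offset(uint64_t zone_start,
					 uint64_t write_pointer)
{
	if (write_pointer < zone_start)
		return 0;
	return write_pointer - zone_start;
}
```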

For now, zoned filesystems support only the single profile. Supporting
non-single profiles with zone append writing is not trivial. For example,
in the DUP profile, we send a zone append writing IO to two zones on a
device. The device replies with written LBAs for the IOs. If the offsets of the
returned addresses from the beginning of the zone are different, then it
results in different logical addresses.

We need fine-grained logical to physical mapping to support such
separated physical address issue. Since it should require additional
metadata type, disable non-single profiles for now.

This commit supports the case all the zones in a block group are
sequential. The next patch will handle the case having a conventional
zone.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-06 16:41:45 +02:00
Naohiro Aota
3c0f83e541 btrfs-progs: zoned: introduce max_zone_append_size
The zone append write command has a maximum IO size restriction it
accepts. This is because a zone append write command cannot be split, as
we ask the device to place the data into a specific target zone and the
device responds with the actual written location of the data.

Introduce max_zone_append_size to zone_info and fs_info to track the
value, so we can limit all I/O to a zoned block device that we want to
write using the zone append command to the device's limits.

Zone append command is mandatory for zoned btrfs. So, reject a device
with max_zone_append_size == 0.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-06 16:41:45 +02:00
Naohiro Aota
7e520022ff btrfs-progs: zoned: check and enable ZONED mode
Introduce function btrfs_check_zoned_mode() to check if ZONED flag is
enabled on the file system and if the file system consists of zoned
devices with equal zone size.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-06 16:41:45 +02:00