Commit Graph

55 Commits

Author SHA1 Message Date
Anand Jain 60771e72a4 btrfs-progs: drop argument devid from device_list_add
Drop the devid argument, it can be fetched from the disk_super argument.

Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-07-26 15:00:48 +02:00
Josef Bacik e150c843ea btrfs-progs: sync tree-checker.[ch] from kernel
This syncs tree-checker.c from the kernel.  The main modification was to
add a open ctree flag to skip the deeper leaf checks, and plumbing this
through tree-checker.c.  We need this for things like fsck or
btrfs-image that need to work with slightly corrupted file systems, and
these checks simply make us unable to look at the corrupted blocks.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-05-26 18:02:30 +02:00
Josef Bacik 180bd5edd1 btrfs-progs: change btrfs_check_chunk_valid to match the kernel version
In btrfs-progs we check the actual leaf pointers as well as the chunk
itself in btrfs_check_chunk_valid.  However in the kernel the leaf stuff
is handled separately as part of the read, and then we have the chunk
checker itself.  Change the btrfs-progs version to match the in-kernel
version temporarily so it makes syncing the in-kernel code easier.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-05-26 18:02:30 +02:00
Qu Wenruo 66dad3a8f5 btrfs-progs: fix a false alert on an uninitialized variable when BUG_ON() is involved
[FALSE ALERT]
Clang 15.0.7 gives the following false alert in get_dev_extent_len():

  kernel-shared/extent-tree.c:3328:2: warning: variable 'div' is used uninitialized whenever switch default is taken [-Wsometimes-uninitialized]
          default:
          ^~~~~~~
  kernel-shared/extent-tree.c:3332:24: note: uninitialized use occurs here
          return map->ce.size / div;
                                ^~~
  kernel-shared/extent-tree.c:3311:9: note: initialize the variable 'div' to silence this warning
          int div;
                 ^
                  = 0

And one in btrfs_stripe_length() too:

  kernel-shared/volumes.c:2781:2: warning: variable 'stripe_len' is used uninitialized whenever switch default is taken [-Wsometimes-uninitialized]
          default:
          ^~~~~~~
  kernel-shared/volumes.c:2785:9: note: uninitialized use occurs here
          return stripe_len;
                 ^~~~~~~~~~
  kernel-shared/volumes.c:2754:16: note: initialize the variable 'stripe_len' to silence this warning
          u64 stripe_len;
                        ^
                         = 0

[CAUSE]
Clang doesn't really understand what BUG_ON() means, thus in that
default case, we won't get uninitialized value but crash directly.

[FIX]
Silent the errors by assigning the default value properly using the
value of SINGLE profile.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-02-18 17:44:02 +01:00
David Sterba ccb2d4aa45 btrfs-progs: device-utils: rename btrfs_device_size
There's a group of helpers to read device size, the btrfs_device_size
should be one of them. Rename it and so minor cleanup.

Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-11 09:08:10 +02:00
David Sterba a827bb2db8 btrfs-progs: use template for transaction commit error messages
Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-11 09:08:10 +02:00
David Sterba 8fcafae04a btrfs-progs: use template for transaction start error messages
Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-11 09:08:10 +02:00
Qu Wenruo e47c34821f btrfs-progs: rescue: allow fix-device-size to shrink device item
If we found that the underlying block device size is smaller than
total_bytes in dev item, kernel will reject the mount, and there is no
progs tool to fix it.

Under most case it's just a small mismatch, and there is no dev extent
in the shrunk range.

In that case, we can let "btrfs rescue fix-device-size" to reset the
total_bytes in dev items to fix.

We add some extra checks, like to make sure there is no dev extent in
the shrunk device range, to make sure we won't lose data during the
device item shrink.

And also update the test case to verify the repaired fs can pass the
check.

Issue: #504
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-09-12 15:31:21 +02:00
Qu Wenruo 9bded24a46 btrfs-progs: do not use btrfs_commit_transaction() just to update super blocks
There are several call sites utilizing btrfs_commit_transaction() just
to update members in super blocks, without any metadata update.

This can be problematic for some simple call sites, like zero_log_tree()
or check_and_repair_super_num_devs().

If we have big problems preventing the fs to be mounted in the first
place, and need to clear the log or super block size, but by some other
problems in extent tree, we're unable to allocate new blocks.

Then we fall into a deadlock that, we need to mount (even
ro,rescue=all) to collect extra info, but btrfs-progs can not do any
super block updates.

Fix the problem by allowing the following super blocks only operations
to be done without using btrfs_commit_transaction():

- btrfs_fix_super_size()
- check_and_repair_super_num_devs()
- zero_log_tree().

There are some exceptions in btrfstune.c, related to the csum type
conversion and seed flags.

In those btrfstune cases, we in fact wants to proper error report in
btrfs_commit_transaction(), as those operations are not mount critical,
and any early error can be helpful to expose any problems in the fs.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-05-20 15:54:16 +02:00
Qu Wenruo 4e9e978783 btrfs-progs: allow read_data_from_disk() to rebuild RAID56 using P/Q
This new ability is added by:

- Allow btrfs_map_block() to return the chunk type
  This makes later work much easier

- Only reset stripe offset inside btrfs_map_block() when needed
  Currently if @raid_map is not NULL, btrfs_map_block() will consider
  this call is for WRITE and will reset stripe offset.

  This is no longer the case, as for RAID56 read with mirror_num 1/0,
  we will still call btrfs_map_block() with non-NULL raid_map.

  Add a small check to make sure we won't reset stripe offset for
  mirror 1/0 read.

- Add new helper read_raid56() to handle rebuild
  We will read the full stripe (including all data and P/Q stripes)
  do the rebuild, then only copy the refered part to the caller.

  There is a catch for RAID6, we have no way to exhaust all combination,
  so the current repair will assume the mirror = 0 data is corrupted,
  then try to find a missing device.

  But if no missing device can be found, it will assume P is corrupted.
  This is just a guess, and can to totally wrong, but we have no better
  idea.

Now btrfs-progs have full read ability for RAID56.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-04-25 19:08:30 +02:00
Qu Wenruo a99bece1cd btrfs-progs: remove extent_buffer::fd and extent_buffer::dev_bytes
Those two members are a shortcut for non-RAID56 profiles.

But we should not use such shortcut, and move all our logical address
read/write to the unified read_data_from_disk()/write_data_to_disk().

With previous refactors, now we're safe to remove them.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-04-25 19:08:30 +02:00
Qu Wenruo 2a93728391 btrfs-progs: use write_data_to_disk() to replace write_extent_to_disk()
Function write_extent_to_disk() is just writing the content of a tree
block to disk.

It can not handle RAID56, and its work is the same as
write_data_to_disk().

Thus we can replace write_extent_to_disk() with write_data_to_disk()
easily.

There is only one special call site in write_raid56_with_parity(), which
can easily be replace with btrfs_pwrite() directly.

This reduce the write entrance, and make later eb::fd removal easier.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-04-25 19:08:29 +02:00
Qu Wenruo f9659c7235 btrfs-progs: fix an error path which can lead to empty device list
[BUG]
With the incoming delayed chunk item insertion feature, there is a super
weird failure at mkfs/022:

  ====== RUN CHECK ./mkfs.btrfs -f --rootdir tmp.KnKpP5 -d dup -b 350M tests/test.img
  ...
  Checksum:           crc32c
  Number of devices:  0
  Devices:
     ID        SIZE  PATH

Note the "Number of devices: 0" line, this means our
fs_info->fs_devices->devices list is empty.

And since our rw device list is empty, we won't finish the mkfs with
proper superblock magic, and cause later btrfs check to fail.

[CAUSE]
Although the failure is only triggered by the incoming delayed chunk
item insertion feature, the bug itself is here for a while.

In btrfs_alloc_chunk(), we move rw devices to our @private_devs list
first, then in create_chunk(), we move it back to our rw devices list.

This dance is pretty dangerous, especially if btrfs_alloc_dev_extent()
failed inside create_chunk(), and current profile have multiple stripes
(including DUP), we will exit create_chunk() directly, without moving the
remaining devices in @private_devs list back to @dev_list.

Furthermore, btrfs_alloc_chunk() is expected to return -ENOSPC, as we
call btrfs_alloc_chunk() to pre-allocate chunks, and ignore the -ENOSPC
error if it's just a pre-allocation failure.

This existing error path can lead to the empty rw list seen above.

[FIX]
After create_chunk(), unconditionally move all devices in @private_devs
back to rw device list.

And add extra check to make sure our rw device list is never empty after
a chunk allocation attempt.

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-04-25 18:33:29 +02:00
Naohiro Aota fd4bab06a4 btrfs-progs: zoned: fix and simplify dev_extent_hole_check_zoned()
The previous patch revealed a bug in dev_extent_hole_check_zoned(). If the
given hole is OK to use as is, it should have just returned the hole. But
on the contrary, it shifts the hole start position by one zone. That
results in refusing any hole.

We don't use btrfs_ensure_empty_zones() in the btrfs-progs version of
dev_extent_hole_check_zoned() unlike the kernel side, because
btrfs_find_allocatable_zones() itself is doing the necessary checks. So, we
can just "return changed" if the "pos" is unchanged. That also makes the
loop and "changed" variable unnecessary.

So, fix and simplify the code in one shot.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-04-08 23:17:35 +02:00
Naohiro Aota 38670212dd btrfs-progs: fix ordering of hole_size setting and dev_extent_hole_check()
The hole_size is used by dev_extent_hole_check() to check the hole is OK as
a device extent. However, commit b031fe84fd ("btrfs-progs: zoned:
implement zoned chunk allocator") mis-ported the kernel code and placed
dev_extent_hole_check() before setting hole_check. That made the
dev_extent_hole_check() call here essentially pass through as we have
hole_size == 0 on mkfs time.

As a result, mkfs.btrfs creates data BG at 64 MB where the regular
superblock exists, when zone size is 16 MB.

Fix the ordering of hole_size setting and calling dev_extent_hole_check().

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-04-08 23:17:35 +02:00
Josef Bacik 5dc3964aaa btrfs-progs: remove the _nr from the item helpers
Now that all callers are using the _nr variations we can simply rename
these helpers to btrfs_item_##member/btrfs_set_item_##member and change
the actual item SETGET funcs to raw_item_##member/set_raw_item_##member
and then change all callers to drop the _nr part.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-09 15:13:13 +01:00
Josef Bacik db2ab47823 btrfs-progs: stop accessing ->extent_root directly
When we switch to multiple global trees we'll need to access the
appropriate extent root depending on the block group or possibly root.
To handle this, use a helper in most places and then the actual root in
places where it is required.  We will whittle down the direct accessors
with future patches, but this does the bulk of the preparatory work.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-11-30 18:56:54 +01:00
Qu Wenruo 636b2e6027 btrfs-progs: remove temporary buffer for super block
There are a lot of call sites where we use the following code snippet:

	u8 super_block_data[BTRFS_SUPER_INFO_SIZE];
	struct btrfs_super_block *sb;
	u64 ret;

	sb = (struct btrfs_super_block *)super_block_data;

The reason for this is, structure btrfs_super_block was smaller than
BTRFS_SUPER_INFO_SIZE.

Thus for anything with csum involved, we have to use a proper 4K buffer.

Since the recent unification of sizeof(struct btrfs_super_block), we no
longer need such workaround, and can use struct btrfs_super_block
directly to do any operation.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-11-05 12:50:03 +01:00
David Sterba 4882a4f5fa btrfs-porgs: add exception for upper case single profile name
For consistency with older versions switch the case of 'single' to be
lower case again even if it's inconsistent. This could be revisited in
the future.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-11-05 12:50:03 +01:00
Qu Wenruo b1c944657c btrfs-progs: make "btrfs filesystem df" command show upper case profile
[BUG]
Since commit dad03fac3b ("btrfs-progs: switch btrfs_group_profile_str
to use raid table"), fstests/btrfs/023 and btrfs/151 will always fail.

The failure of btrfs/151 explains the reason pretty well:

btrfs/151 1s ... - output mismatch
    --- tests/btrfs/151.out	2019-10-22 15:18:14.068965341 +0800
    +++ ~/xfstests-dev/results//btrfs/151.out.bad	2021-11-02 17:13:43.879999994 +0800
    @@ -1,2 +1,2 @@
     QA output created by 151
    -Data, RAID1
    +Data, raid1
    ...
    (Run 'diff -u ~/xfstests-dev/tests/btrfs/151.out ~/xfstests-dev/results//btrfs/151.out.bad'  to see the entire diff)

[CAUSE]
Commit dad03fac3b ("btrfs-progs: switch btrfs_group_profile_str to use
raid table") will use btrfs_raid_array[index].raid_name, which is all
lower case.

[FIX]
There is no need to bring such output format change.

So here we split the btrfs_raid_attr::raid_name[] into upper_name[] and
lower_name[], and make upper and lower case helpers for callers to use.

Now there are several types of callers referring to lower_name and
upper_name:

- parse_bg_profile()
  It uses strcasecmp(), either case would be fine.

- btrfs_group_profile_str()
  Originally it uses upper case for all profiles except "single".
  Now unified to upper case.

- sprint_profiles()
  It uses lower case.

- bg_flags_to_str()
  It uses upper case.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-11-05 12:50:03 +01:00
Qu Wenruo 5bee5c99bf btrfs-progs: fix printf formats on 32bit x86
When compiling btrfs-progs on 32bit x86 using GCC 11.1.0, there are
several warnings:

  In file included from ./common/utils.h:30,
                   from check/main.c:36:
  check/main.c: In function 'run_next_block':
  ./common/messages.h:42:31: warning: format '%lu' expects argument of type 'long unsigned int', but argument 4 has type 'u32' {aka 'unsigned int'} [-Wformat=]
     42 |                 __btrfs_error((fmt), ##__VA_ARGS__);                    \
        |                               ^~~~~
  check/main.c:6496:33: note: in expansion of macro 'error'
   6496 |                                 error(
        |                                 ^~~~~

  In file included from ./common/utils.h:30,
                   from kernel-shared/volumes.c:32:
  kernel-shared/volumes.c: In function 'btrfs_check_chunk_valid':
  ./common/messages.h:42:31: warning: format '%lu' expects argument of type 'long unsigned int', but argument 4 has type 'u32' {aka 'unsigned int'} [-Wformat=]
     42 |                 __btrfs_error((fmt), ##__VA_ARGS__);                    \
        |                               ^~~~~
  kernel-shared/volumes.c:2052:17: note: in expansion of macro 'error'
   2052 |                 error("invalid chunk item size, have %u expect [%zu, %lu)",
        |                 ^~~~~

  image/main.c: In function 'search_for_chunk_blocks':
  ./common/messages.h:42:31: warning: format '%lu' expects argument of type 'long unsigned int', but argument 3 has type 'size_t' {aka 'unsigned int'} [-Wformat=]
     42 |                 __btrfs_error((fmt), ##__VA_ARGS__);                    \
        |                               ^~~~~
  image/main.c:2122:33: note: in expansion of macro 'error'
   2122 |                                 error(
        |                                 ^~~~~

There are two types of problems:

- __BTRFS_LEAF_DATA_SIZE()
  This macro has no type definition, making it behaves differently on
  different arches.

  Fix this by following kernel to use inline function to make its return
  value fixed to u32.

- size_t related output
  For x86_64 %lu is OK but not for x86.

  Fix this by using %zu.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-11-05 12:50:03 +01:00
David Sterba bc28dc6bea btrfs-progs: introduce helper for striped profiles
There are several profiles like raid0, raid10, raid5 and raid6 that can
span as many devices as possible and need special handling for the
stripe calculations. Provide a helper to identify the profiles in a
simple way.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-20 18:59:24 +02:00
David Sterba a25a5cc2c0 btrfs-progs: use btrfs_bg_type_to_nparity in btrfs_stripe_length
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-20 18:59:24 +02:00
David Sterba 11fcdbc35e btrfs-progs: introduce helper to get allowed profiles for a given device number
Use the raid table helper to avoid hard coding profiles for the given
number of devices in test_num_disk_vs_raid.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-20 18:59:24 +02:00
David Sterba cb0b63cd90 btrfs-progs: use raid table value for sub_stripes in btrfs_check_chunk_valid
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-20 18:59:24 +02:00
David Sterba 4f662d74fd btrfs-progs: export raid table helper for sub_stripes
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-20 18:59:24 +02:00
David Sterba e3355b43b4 btrfs-progs: use raid table for min devs in btrfs_check_chunk_valid
Replace the hard coded values with the raid table reference.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-20 18:59:24 +02:00
David Sterba 3d05b20435 btrfs-progs: use btrfs_bg_type_to_nparity in chunk_bytes_by_type
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-20 18:59:23 +02:00
David Sterba f759e9bbad btrfs-progs: introduce a public helper for raid parity
There's a private helper for parity and there are many open coded
calculations of parity for the RAID56 profiles. The helper will be used
to remove that and use the raid table values.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-20 18:59:23 +02:00
David Sterba 80714610f3 btrfs-progs: use raid table for ncopies
There's opencoded value of raid table ncopies in
print_filesystem_usage_overall, add a helper and use it.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-20 18:59:23 +02:00
David Sterba b29f1603b0 btrfs-progs: use raid table for devs_min and replace local helper
Another duplication of the raid table, in this case missing the changes
to raid10 and raid0 minimum devices changed in a177ef7dd4
("btrfs-progs: mkfs: allow degenerate raid0/raid10").

Define and use a helper using the table value.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-20 18:59:23 +02:00
Naohiro Aota e9696b06f0 btrfs-progs: use direct-io for zoned device
We need to use direct-IO for zoned devices to preserve the write
ordering.  Instead of detecting if the device is zoned or not, we simply
use direct-IO for any kind of device (even if emulated zoned mode on a
regular device).

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-20 18:59:23 +02:00
Naohiro Aota 40ab7530df btrfs-progs: set eb::fs_info properly everywhere
Several extent_buffer initializations miss fs_info initialization. This
is OK before the following patch ("btrfs-progs: use direct-io for zoned
device") as eb->fs_info is not always necessary. But, after that patch,
we will use fs_info to determine it is zoned or not and that causes
segfault in such cases.

Properly set fs_info when initializing extent_buffers to fix the issue.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-08 20:47:04 +02:00
David Sterba 76ab1fa364 btrfs-progs: rename and move group_profile_max_safe_loss
The helper belongs to the others that translate bg flags to the raid
attr table member.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-09-07 16:38:56 +02:00
David Sterba c3ee6a8a09 btrfs-progs: unify GPL header comments
Add the GPL v2 header to files where it was missing and is not from an
external source, update to the most recent version with the address.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-09-07 13:58:44 +02:00
David Sterba 7572839a74 btrfs-progs: add and use bit masks for RAID1 and RAID56 profiles
Many test conditions can be simplified in case they check all the
related profiles.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-09-06 16:36:18 +02:00
David Sterba 7fe4396467 btrfs-progs: copy some raid_attr helpers from kernel
There are convenience helpers for the raid attr table, copy them from
kernel for further cleanups.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-09-06 16:36:17 +02:00
David Sterba a177ef7dd4 btrfs-progs: mkfs: allow degenerate raid0/raid10
Kernel patch b2f78e88052bc0bee ("btrfs: allow degenerate raid0/raid10")
in
5.15 will allow mounting and converting to single device raid0 or two
device raid10.  Let mkfs create such filesystem.

"The motivation is to allow to preserve the profile type as long as it
 possible for some intermediate state (device removal, conversion), or
 when there are disks of different size, with raid0 the otherwise
 unusable space of the last device will be used too.  Similarly for
 raid10, though the two largest devices would need to be the same."

Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-27 15:40:53 +02:00
David Sterba 6527771668 btrfs-progs: add nparity for raid1c34 definitions
The values of .ncopies was not explicitly set.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-07-23 00:59:27 +02:00
Naohiro Aota b031fe84fd btrfs-progs: zoned: implement zoned chunk allocator
Implement a zoned chunk and device extent allocator. One device zone
becomes a device extent so that a zone reset affects only this device
extent and does not change the state of blocks in the neighbor device
extents.

To implement the allocator, we need to extend the following functions for
a zoned filesystem:

  - init_alloc_chunk_ctl
  - dev_extent_search_start
  - dev_extent_hole_check
  - decide_stripe_size

Here, dev_extent_hole_check() is newly introduced to check the validity of
a hole found.

init_alloc_chunk_ctl_zoned() is mostly the same as regular one. It always
set the stripe_size to the zone size and aligns the parameters to the zone
size.

dev_extent_search_start() only aligns the start offset to zone boundaries.
We don't care about the first 1MB like in regular filesystem because we
anyway reserve the first two zones for superblock logging.

dev_extent_hole_check_zoned() checks if zones in given hole are either
conventional or empty sequential zones. Also, it skips zones reserved for
superblock logging.

With the change to the hole, the new hole may now contain pending extents.
So, in this case, loop again to check that.

Finally, decide_stripe_size_zoned() should shrink the number of devices
instead of stripe size because we need to honor stripe_size == zone_size.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-06 16:41:45 +02:00
Naohiro Aota 384840b9c0 btrfs-progs: zoned: get zone information of zoned block devices
Get the zone information (number of zones and zone size) from all the
devices, if the volume contains a zoned block device. To avoid costly
run-time zone report commands to test the device zones type during block
allocation, it also records all the zone status (zone type, write
pointer position, etc.).

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-06 16:41:45 +02:00
Naohiro Aota acdd22ab68 btrfs-progs: provide fs_info from btrfs_device
Likewise in the kernel code, provide fs_info access from struct
btrfs_device. This will help to unify the code between the kernel and
the userland.

Since fs_info can be NULL at the time of btrfs_add_to_fsid(), let's use
btrfs_open_devices() to set fs_info to the devices.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-06 16:41:45 +02:00
Naohiro Aota cf67267d33 btrfs-progs: rename calc_size to stripe_size
alloc_chunk_ctl::calc_size is actually the stripe_size in the kernel
side code.  Let's rename it to clarify what the "calc" is.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-06 16:41:45 +02:00
Naohiro Aota a968606632 btrfs-progs: simplify arguments of chunk_bytes_by_type()
Chunk_bytes_by_type() takes type, calc_size, and ctl as arguments. But
the first two can be obtained from the ctl. Let's drop these arguments
for simplicity.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-06 16:41:45 +02:00
Naohiro Aota 3428487d90 btrfs-progs: drop alloc_chunk_ctl::stripe_len
Since commit b9444efb66 ("btrfs-progs: don't pretend RAID56 has a
different stripe length"), alloc_chunk_ctl::stripe_len is always fixed
to BTRFS_STRIPE_LEN. Let's replace alloc_chunk_ctl::stripe_len with
BTRFS_STRIPE_LEN, like in the kernel code.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-06 16:41:44 +02:00
Naohiro Aota 2d3b31d604 btrfs-progs: use round_down for allocation calcs
Several calculations in the chunk allocation process use this pattern.

    x /= y;
    x *= y;

Replace this pattern with round_down().

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-06 16:41:44 +02:00
Naohiro Aota 907c5fd7a4 btrfs-progs: fix to use half the available space for DUP profile
In the DUP profile, we can use only half of the space available in a
device extent. Fix the calculation of calc_size for it.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-06 16:41:44 +02:00
Naohiro Aota 13a6cff8b6 btrfs-progs: rewrite btrfs_alloc_data_chunk() using create_chunk()
btrfs_alloc_data_chunk() and create_chunk() have the most part in common.
Let's rewrite btrfs_alloc_data_chunk() using create_chunk().

There are two differences between btrfs_alloc_data_chunk() and
create_chunk(). create_chunk() uses find_next_chunk() to decide the
logical address of the chunk, and it uses btrfs_alloc_dev_extent() to
decide the physical address of a device extent. On the other hand,
btrfs_alloc_data_chunk() uses *start for both logical and physical
addresses.

To support the btrfs_alloc_data_chunk()'s use case, we use ctl->start
and ctl->dev_offset. If these values are set (non-zero), use the
specified values as the address. It is safe to use 0 to indicate the
value is not set here. Because both lower addresses of logical

  (0..BTRFS_FIRST_CHUNK_TREE_OBJECT_ID) and physical
  (0..BTRFS_BLOCK_RESERVED_1M_FOR_SUPER) are reserved.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-06 16:41:44 +02:00
Naohiro Aota 1e344dc8cf btrfs-progs: factor out create_chunk()
Factor out create_chunk() from btrfs_alloc_chunk(). This new function
creates a chunk.

There is no functional changes.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-06 16:41:44 +02:00
Naohiro Aota 0e3865206e btrfs-progs: factor out decide_stripe_size()
Factor out decide_stripe_size() from btrfs_alloc_chunk(). This new
function calculates the actual stripe size to allocate and decides the
size of a stripe (ctl->calc_size).

This commit has no functional changes.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-06 16:41:44 +02:00