Commit Graph

5923 Commits

Author SHA1 Message Date
David Sterba 1c70e888da btrfs-progs: docs: add subpage feature page
Introductory paragraph, status and progress needs to be added.

Signed-off-by: David Sterba <dsterba@suse.com>
2022-05-17 21:12:19 +02:00
David Sterba 65f67f5829 btrfs-progs: docs: copy more contents from wiki
- Tree-checker - about reporting problems
- Seeding-device - chained seeding devices
- RAID56 - write hole, fixed stripe width

Signed-off-by: David Sterba <dsterba@suse.com>
2022-05-17 21:12:19 +02:00
David Sterba f7456830d1 btrfs-progs: docs: merge storage model to hardware chapter
The storage model is the intro chapter for the hardware problems.

Signed-off-by: David Sterba <dsterba@suse.com>
2022-05-17 21:12:19 +02:00
David Sterba 4c5554d46d btrfs-progs: docs: separate chapter for hardware considerations
Make it more visible than just in section 5.

Signed-off-by: David Sterba <dsterba@suse.com>
2022-05-17 21:12:19 +02:00
David Sterba f1b4ef2f35 btrfs-progs: docs: move flexibility to Administration
Signed-off-by: David Sterba <dsterba@suse.com>
2022-05-17 21:12:19 +02:00
David Sterba 0539bbb66a btrfs-progs: docs: separate filesystem limits chapter
For section 5 and Administration.

Signed-off-by: David Sterba <dsterba@suse.com>
2022-05-17 21:12:19 +02:00
David Sterba 1a431b0837 btrfs-progs: docs: document paused balance
Signed-off-by: David Sterba <dsterba@suse.com>
2022-05-17 21:12:19 +02:00
David Sterba 908a46085c btrfs-progs: docs: separate bootloaders chapter
Used in manual page section 5 and Administration overview.

Signed-off-by: David Sterba <dsterba@suse.com>
2022-05-17 21:12:19 +02:00
David Sterba 751be36f86 btrfs-progs: delete commented exports from libbtrfs.sym
Keep only existing exports, the commented out functions have been hidden
in 56e9963474 ("btrfs-progs: libbtrfs: hide unused symbols, same
version") in 5.14 and no problems have been reported.

Signed-off-by: David Sterba <dsterba@suse.com>
2022-05-12 20:04:39 +02:00
David Sterba b6e15650b2 btrfs-progs: docs: add note about ifdef EXPERIMENTAL
Signed-off-by: David Sterba <dsterba@suse.com>
2022-05-12 14:02:51 +02:00
David Sterba a5d88cbdee btrfs-progs: INSTALL: drop reference to libattr
The header sys/attr.h should be used in all cases and libattr is not
used in the build anywhere.

Signed-off-by: David Sterba <dsterba@suse.com>
2022-05-12 13:59:24 +02:00
David Sterba d619bb3192 btrfs-progs: docs: link INSTALL to docs
The INSTALL format renders fine as RST, add it to the main page.

Signed-off-by: David Sterba <dsterba@suse.com>
2022-05-12 13:56:03 +02:00
David Sterba 836db3d0b8 btrfs-progs: INSTALL: update dependencies for docs build
No asciidoc since 5.17, we're using sphinx.

Signed-off-by: David Sterba <dsterba@suse.com>
2022-05-12 13:47:50 +02:00
David Sterba 8e79d40e8f btrfs-progs: kernel-lib: sync include/rtree.h
Minor fixups to comments.

Signed-off-by: David Sterba <dsterba@suse.com>
2022-05-12 13:40:09 +02:00
David Sterba d3585221fc btrfs-progs: kernel-lib: sync include/list.h
Sync list.h, add hlist definitions, update poison pointer, add stub
defitions for smp_* annotations.

Signed-off-by: David Sterba <dsterba@suse.com>
2022-05-12 13:30:54 +02:00
David Sterba 5ad2aacd24 btrfs-progs: kernel-lib: sync include/overflow.h
Sync current version with improved checks.

Signed-off-by: David Sterba <dsterba@suse.com>
2022-05-12 13:13:31 +02:00
David Sterba e49441a953 btrfs-progs: kernel-lib: sync lib/rbtree.c
Changes: comments, WRITE_ONCE, typos
Not included: RCU helpers

Signed-off-by: David Sterba <dsterba@suse.com>
2022-05-12 13:07:15 +02:00
David Sterba 68d04374f7 btrfs-progs: kernel-lib: add rb_root_cached helpers
Copy inline helpers for the cached variant of the rbtree, not used yet.
Rename 'new' for C++ compatibility.

Signed-off-by: David Sterba <dsterba@suse.com>
2022-05-12 12:53:38 +02:00
David Sterba 2b603d9819 btrfs-progs: kernel-lib: add simplified READ_ONCE and WRITE_ONCE
For easier source synchronization with kernel, add the _ONCE wrappers,
but only the simplified version as we don't do any lock-less
algorithms or use the semantics in userspace.

Signed-off-by: David Sterba <dsterba@suse.com>
2022-05-12 12:43:27 +02:00
David Sterba c1b24d742f btrfs-progs: kernel-lib: add rbtree_types.h from linux
In order to use rb_root_cached we need to sync with kernel sources. Copy
the file from linux.git/include/linux/rbtree_types.h and update so it's
C++ protected for inclusion to libbtrfs and remove duplicate
definitions.

Signed-off-by: David Sterba <dsterba@suse.com>
2022-05-12 12:32:18 +02:00
David Sterba df77a231bb btrfs-progs: docs: move glossary to overview sections
The glossary is reasonably complete so make it more visible in the main
section.

Signed-off-by: David Sterba <dsterba@suse.com>
2022-05-10 15:50:10 +02:00
David Sterba f1178950d3 btrfs-progs: btrfstune: fix build-time detection of experimental features
Qu noticed that the full checksums are still printed even if the
experimental build is not enabled. This is caused by wrong use of #ifdef
(as the macro is always defined), this must be "#if".

Fixes: 1bb6fb896d ("btrfs-progs: btrfstune: experimental, new option to switch csums")
Reported-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-05-10 15:42:13 +02:00
Qu Wenruo 50a5dfde6d btrfs-progs: print-tree: print the checksum of header without tailing zeros
For the default CRC32C checksum, print-tree now prints tons of
unnecessary padding zeros:

  btrfs-progs v5.17
  chunk tree
  leaf 22036480 items 7 free space 15430 generation 6 owner CHUNK_TREE
  leaf 22036480 flags 0x1(WRITTEN) backref revision 1
  checksum stored 0ac1b9fa00000000000000000000000000000000000000000000000000000000
  checksum calced 0ac1b9fa00000000000000000000000000000000000000000000000000000000
  fs uuid 3d95b7e3-3ab6-4927-af56-c58aa634342e

This is caused by commit 1bb6fb896d ("btrfs-progs: btrfstune:
experimental, new option to switch csums"), and it looks like most
distros just enable EXPERIMENTAL features by default.
(Which is a good thing to provide much better coverage).

So here we just limit the csum print to the utilized csum size.

Now the output looks like:

  btrfs-progs v5.17
  chunk tree
  leaf 22036480 items 4 free space 15781 generation 6 owner CHUNK_TREE
  leaf 22036480 flags 0x1(WRITTEN) backref revision 1
  checksum stored 676b812f
  checksum calced 676b812f
  fs uuid d11f8799-b6dc-415d-b1ed-cebe6da5f0b7

Fixes: 1bb6fb896d ("btrfs-progs: btrfstune: experimental, new option to switch csums")
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-05-10 13:44:37 +02:00
David Sterba e278e0755f btrfs-progs: make device add and paused balance work together
Kernel commit efc0e69c2fea ("btrfs: introduce exclusive operation
BALANCE_PAUSED state") allows to start a device add when there's a
paused balance, eg. to let the balance finish when there's not enough
chunk space. Add the support for that, though this needs an updated
kernel to export the 'balance paused' in sysfs.

Signed-off-by: David Sterba <dsterba@suse.com>
2022-05-03 22:48:14 +02:00
Qu Wenruo 851ef59b2c btrfs-progs: remove the unused btrfs_fs_info::seeding member
This member is not used by anyone, just remove it.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-04-29 22:13:22 +02:00
David Sterba 50c71aedfa btrfs-progs: unify CHANGES indentation
Indent the main list, fix spacing for nested lists.

Signed-off-by: David Sterba <dsterba@suse.com>
2022-04-27 19:51:05 +02:00
David Sterba 2f83d94013 btrfs-progs: reformat CHANGES for RST
Add headings to versions, reorder so the minor releases are under the
major so it's properly nested, keep the last version expanded.

Signed-off-by: David Sterba <dsterba@suse.com>
2022-04-27 19:50:59 +02:00
David Sterba 477b73ed32
Btrfs progs v5.17
Signed-off-by: David Sterba <dsterba@suse.com>
2022-04-26 19:05:20 +02:00
David Sterba 50545bbcfa btrfs-progs: update CHANGES for 5.17
Signed-off-by: David Sterba <dsterba@suse.com>
2022-04-26 19:04:43 +02:00
Qu Wenruo 007c799ca8 btrfs-progs: mkfs: use sectorsize as nodesize fallback for mixed profiles
[BUG]
When running btrfs/011 with subpage case, even with RAID56 support, it
still fails with the following error:

 QA output created by 011
 *** test btrfs replace
 mkfs failed
 (see /home/adam/xfstests-dev/results//btrfs/011.full for details)

The full log shows:

  ---------workout "-m single -d single -M" 1 no 64-----------
  ERROR: illegal nodesize 65536 (not equal to 4096 for mixed block group)
  mkfs failed

This is a critical error, making test case to be aborted, without
checking the rest profiles.

[CAUSE]
Mkfs.btrfs always uses the maximum value between sectorsize and page
size for its mixed profile nodesize.

For subpage case, it means we always go PAGE_SIZE, no matter whatever
the sectorsize is passed in.

[FIX]
Just get rid of the direct PAGE_SIZE usage when determining nodesize for
mixed profiles.
And use sectorsize directly (either passed in by the user, or
determined from page size).

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-04-26 01:14:48 +02:00
Qu Wenruo 2d6acbaee4 btrfs-progs: tests/fsck: add test case for data csum check on raid5
Previously 'btrfs check --check-data-csum' will report tons of false
alerts for RAID56.

Add a test case to make sure with the new RAID56 rebuild ability, there
should be no false alerts.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-04-26 01:14:48 +02:00
Qu Wenruo 4e9e978783 btrfs-progs: allow read_data_from_disk() to rebuild RAID56 using P/Q
This new ability is added by:

- Allow btrfs_map_block() to return the chunk type
  This makes later work much easier

- Only reset stripe offset inside btrfs_map_block() when needed
  Currently if @raid_map is not NULL, btrfs_map_block() will consider
  this call is for WRITE and will reset stripe offset.

  This is no longer the case, as for RAID56 read with mirror_num 1/0,
  we will still call btrfs_map_block() with non-NULL raid_map.

  Add a small check to make sure we won't reset stripe offset for
  mirror 1/0 read.

- Add new helper read_raid56() to handle rebuild
  We will read the full stripe (including all data and P/Q stripes)
  do the rebuild, then only copy the refered part to the caller.

  There is a catch for RAID6, we have no way to exhaust all combination,
  so the current repair will assume the mirror = 0 data is corrupted,
  then try to find a missing device.

  But if no missing device can be found, it will assume P is corrupted.
  This is just a guess, and can to totally wrong, but we have no better
  idea.

Now btrfs-progs have full read ability for RAID56.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-04-25 19:08:30 +02:00
Qu Wenruo a99bece1cd btrfs-progs: remove extent_buffer::fd and extent_buffer::dev_bytes
Those two members are a shortcut for non-RAID56 profiles.

But we should not use such shortcut, and move all our logical address
read/write to the unified read_data_from_disk()/write_data_to_disk().

With previous refactors, now we're safe to remove them.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-04-25 19:08:30 +02:00
Qu Wenruo 3ff9d35257 btrfs-progs: use read_data_from_disk() to replace read_extent_from_disk() and replace read_extent_data()
The function read_extent_from_disk() is only a wrapper to read tree
block.

And read_extent_data() is just a while loop to eliminate short read
caused by stripe boundary.

In fact, a lot of call sites of read_extent_data() are either reading
metadata (thus no possible short read) or doing extra loop by
themselves.

This patch will replace those two functions with read_data_from_disk(),
making it the only entrance for data/metadata read.
And update read_data_from_disk() to return the read bytes, so caller can
do a simple while loop.

For the few callers of read_extent_data(), open-code a small while loop
for them.

This will allow later RAID56 read repair using P/Q much easier.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-04-25 19:08:30 +02:00
Qu Wenruo 2a93728391 btrfs-progs: use write_data_to_disk() to replace write_extent_to_disk()
Function write_extent_to_disk() is just writing the content of a tree
block to disk.

It can not handle RAID56, and its work is the same as
write_data_to_disk().

Thus we can replace write_extent_to_disk() with write_data_to_disk()
easily.

There is only one special call site in write_raid56_with_parity(), which
can easily be replace with btrfs_pwrite() directly.

This reduce the write entrance, and make later eb::fd removal easier.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-04-25 19:08:29 +02:00
Qu Wenruo 3816a861d0 btrfs-progs: don't use write_extent_to_disk() directly
There are two call sites using write_extent_to_disk() directly:

- debug_corrupt_block() in btrfs-corrupt-block.c
- corrupt_keys() in btrfs-corrupt-block.c

The problem of write_extent_to_disk() is, it can only handle plain
profiles (All profiles except P/Q stripes of RAID56).

Calling it directly can corrupted RAID56 P/Q, and in the future we're
going to remove eb::fd/eb::dev_bytes, so remove such call sites with
write_and_map_eb().

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-04-25 19:07:09 +02:00
Qu Wenruo 01c25d73f1 btrfs-progs: extract metadata restore read code into its own helper
For metadata restore, our logical address is mapped to a single device
with logical address 1:1 mapped to device physical address.

Move this part of code into a helper, this will make later extent buffer
read path refactoer much easier.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-04-25 19:07:09 +02:00
Qu Wenruo 7a0c4b5dc1 btrfs-progs: remove the unnecessary BTRFS_SUPER_INFO_OFFSET path for tree block read
We used to use read_whole_eb() to read super block, but it's no longer
the case (so long that I can not even find out which commit did the
conversion).

Thus there is no need for read_whole_eb() to handle super block read
anymore.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-04-25 19:07:08 +02:00
Qu Wenruo 013d80648c btrfs-progs: tests: check warning for seed and sprouted filesystems
Previously we had a bug that btrfs check would report false warning for
a sprouted filesystem.

So this patch will add a new test case to make sure neither seed nor
and sprouted filesystem will cause such false warning.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-04-25 19:06:06 +02:00
Qu Wenruo 0dc8b8b6a4 btrfs-progs: check: fix wrong total bytes check for seed device
[BUG]
The following script can lead to false positive from btrfs check:

  mkfs.btrfs -f $dev1
  mount $dev1 $mnt
  btrfstune -S1 $dev1
  mount $dev1 $mnt
  btrfs dev add -f $dev2 $mnt
  umount $mnt

  # Now dev1 is seed, and dev2 is the rw fs.
  btrfs check $dev2
  ...
  [2/7] checking extents
  WARNING: minor unaligned/mismatch device size detected
  WARNING: recommended to use 'btrfs rescue fix-device-size' to fix it
  ...

This false positive only happens on $dev2, $dev1 is completely fine.

[CAUSE]
The warning is from is_super_size_valid(), in that function we verify
the super block total bytes (@super_bytes) is correct against the total
device bytes (@total_bytes).

However the when calculating @total_bytes, we only use devices in
current fs_devices, which only contains RW devices.

Thus all bytes from seed device are not taken into consideration, and
trigger the false positive.

[FIX]
Fix it by also iterating seed devices.

Since we're here, also output @total_bytes and @super_bytes when
outputting the warning message, to allow end users have a better idea on
what's going wrong.

Reviewed-by: Su Yue <l@damenly.su>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-04-25 18:59:44 +02:00
Mark Harmstone ef194732d5 btrfs-progs: check: add check for overlong xattr names
While working on my Windows driver, I found that it was inadvertently
allowing users to create xattrs with names longer than 255 bytes, which
wasn't being picked up by btrfs-check.

If the Linux driver encounters a file with an invalid xattr like this,
it makes the whole directory it's in inaccessible. If it's the root
directory, it'll refuse to mount the filesystem entirely.

Pull-request: #456
Signed-off-by: Mark Harmstone <mark@harmstone.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-04-25 18:52:49 +02:00
Qu Wenruo f9659c7235 btrfs-progs: fix an error path which can lead to empty device list
[BUG]
With the incoming delayed chunk item insertion feature, there is a super
weird failure at mkfs/022:

  ====== RUN CHECK ./mkfs.btrfs -f --rootdir tmp.KnKpP5 -d dup -b 350M tests/test.img
  ...
  Checksum:           crc32c
  Number of devices:  0
  Devices:
     ID        SIZE  PATH

Note the "Number of devices: 0" line, this means our
fs_info->fs_devices->devices list is empty.

And since our rw device list is empty, we won't finish the mkfs with
proper superblock magic, and cause later btrfs check to fail.

[CAUSE]
Although the failure is only triggered by the incoming delayed chunk
item insertion feature, the bug itself is here for a while.

In btrfs_alloc_chunk(), we move rw devices to our @private_devs list
first, then in create_chunk(), we move it back to our rw devices list.

This dance is pretty dangerous, especially if btrfs_alloc_dev_extent()
failed inside create_chunk(), and current profile have multiple stripes
(including DUP), we will exit create_chunk() directly, without moving the
remaining devices in @private_devs list back to @dev_list.

Furthermore, btrfs_alloc_chunk() is expected to return -ENOSPC, as we
call btrfs_alloc_chunk() to pre-allocate chunks, and ignore the -ENOSPC
error if it's just a pre-allocation failure.

This existing error path can lead to the empty rw list seen above.

[FIX]
After create_chunk(), unconditionally move all devices in @private_devs
back to rw device list.

And add extra check to make sure our rw device list is never empty after
a chunk allocation attempt.

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-04-25 18:33:29 +02:00
Qu Wenruo 4a940ab2c0 btrfs-progs: fix a memory leak when starting a transaction on fs with error
Function btrfs_start_transaction() will allocate the memory
unconditionally, but if the fs has an aborted transaction we don't free
the allocated memory but return error directly.

Fix it by only allocate the new memory after all the checks.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-04-25 18:32:17 +02:00
Qu Wenruo bfe6402026 btrfs-progs: make sure "btrfstune -S1" will reject fs with dirty log
The new test case will have a image file which has dirty log
(btrfs-image supports dumping log tree).

So we can easily check if "btrfstune -S" will reject fs with dirty log.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-04-25 18:30:28 +02:00
Qu Wenruo cb3ad87baf btrfs-progs: do not allow setting seed flag on fs with dirty log
[BUG]
The following sequence operation can lead to a seed fs rejected by
kernel:

 # Generate a fs with dirty log
 mkfs.btrfs -f $file
 mount $dev $mnt
 xfs_io -f -c "pwrite 0 16k" -c fsync $mnt/file
 cp $file $file.backup
 umount $mnt
 mv $file.backup $file

 # now $file has dirty log, set seed flag on it
 btrfstune -S1 $file

 # mount will fail
 mount $file $mnt

The mount failure with the following dmesg:

[  980.363667] loop0: detected capacity change from 0 to 262144
[  980.371177] BTRFS info (device loop0): flagging fs with big metadata feature
[  980.372229] BTRFS info (device loop0): using free space tree
[  980.372639] BTRFS info (device loop0): has skinny extents
[  980.375075] BTRFS info (device loop0): start tree-log replay
[  980.375513] BTRFS warning (device loop0): log replay required on RO media
[  980.381652] BTRFS error (device loop0): open_ctree failed

[CAUSE]
Although btrfs will replay its dirty log even with RO mount, but kernel
will treat seed device as RO device, and dirty log can not be replayed
on RO device.

This rejection is already the better end, just imagine if we don't treat
seed device as RO, and replayed the dirty log.
The filesystem relying on the seed device will be completely screwed up.

[FIX]
Just add extra check on log tree in btrfstune to reject setting seed
flag on filesystems with dirty log.

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-04-25 18:30:28 +02:00
Li Zhang 7781d1a2da btrfs-progs: props: don't translate value of compression=none
Currently, if user specifies value 'no' or 'none' on the command line,
it gets translated to an empty value that is passed to kernel. There was
a change in kernel 5.14 done by commit 5548c8c6f55b ("btrfs: props:
change how empty value is interpreted") that changes the behaviour
in that case.

The empty value is supposed to mean 'the default value' for any
property. For compression there is a need to distinguish resetting the
value and also setting the NOCOMPRESS property. The translation to empty
value makes that impossible.

The explanation and behaviour copied from the kernel patch:

    Old behaviour:

      $ lsattr file
      ---------------------- file
      # the NOCOMPRESS bit is set
      $ btrfs prop set file compression ''
      $ lsattr file
      ---------------------m file

    This is equivalent to 'btrfs prop set file compression no' in current
    btrfs-progs as the 'no' or 'none' values are translated to an empty
    string.

    This is where the new behaviour is different: empty string drops the
    compression flag (-c) and nocompress (-m):

      $ lsattr file
      ---------------------- file
      # No change
      $ btrfs prop set file compression ''
      $ lsattr file
      ---------------------- file
      $ btrfs prop set file compression lzo
      $ lsattr file
      --------c------------- file
      $ btrfs prop get file compression
      compression=lzo
      $ btrfs prop set file compression ''
      # Reset to the initial state
      $ lsattr file
      ---------------------- file
      # Set NOCOMPRESS bit
      $ btrfs prop set file compression no
      $ lsattr file
      ---------------------m file

    This obviously brings problems with backward compatibility, so this
    patch should not be backported without making sure the updated
    btrfs-progs are also used and that scripts have been updated to use the
    new semantics.

    Summary:

    - old kernel:
      no, none, "" - set NOCOMPRESS bit
    - new kernel:
      no, none - set NOCOMPRESS bit
      "" - drop all compression flags, ie. COMPRESS and NOCOMPRESS

Signed-off-by: Li Zhang <zhanglikernel@gmail.com>
[ update changelog ]
Signed-off-by: David Sterba <dsterba@suse.com>
2022-04-25 18:30:28 +02:00
Naohiro Aota fd4bab06a4 btrfs-progs: zoned: fix and simplify dev_extent_hole_check_zoned()
The previous patch revealed a bug in dev_extent_hole_check_zoned(). If the
given hole is OK to use as is, it should have just returned the hole. But
on the contrary, it shifts the hole start position by one zone. That
results in refusing any hole.

We don't use btrfs_ensure_empty_zones() in the btrfs-progs version of
dev_extent_hole_check_zoned() unlike the kernel side, because
btrfs_find_allocatable_zones() itself is doing the necessary checks. So, we
can just "return changed" if the "pos" is unchanged. That also makes the
loop and "changed" variable unnecessary.

So, fix and simplify the code in one shot.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-04-08 23:17:35 +02:00
Naohiro Aota 38670212dd btrfs-progs: fix ordering of hole_size setting and dev_extent_hole_check()
The hole_size is used by dev_extent_hole_check() to check the hole is OK as
a device extent. However, commit b031fe84fd ("btrfs-progs: zoned:
implement zoned chunk allocator") mis-ported the kernel code and placed
dev_extent_hole_check() before setting hole_check. That made the
dev_extent_hole_check() call here essentially pass through as we have
hole_size == 0 on mkfs time.

As a result, mkfs.btrfs creates data BG at 64 MB where the regular
superblock exists, when zone size is 16 MB.

Fix the ordering of hole_size setting and calling dev_extent_hole_check().

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-04-08 23:17:35 +02:00
Naohiro Aota e273f9ebbd btrfs-progs: zoned: fix initial system BG location
Currently, we create the initial system block group in the zone 2. That
will create the BG at 64MB when the zone size is 32 MB, which collides with
the regular superblock location. It results in mount failure with:

  BTRFS info (device nullb0): zoned mode enabled with zone size 33554432
  BTRFS error (device nullb0): zoned: block group 67108864 must not contain super block
  BTRFS error (device nullb0): failed to read block groups: -117
  BTRFS error (device nullb0): open_ctree failed

Fix that by calculating the proper location of the initial system BG. It
avoids using zones reserved for zoned superblock logging and the zones
where a regular superblock resides.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-04-08 23:17:35 +02:00
Naohiro Aota 32c43d0c68 btrfs-progs: zoned: export sb_zone_number() and related constants
Move sb_zone_number() and related constants from zoned.c to the
corresponding header for later use.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-04-08 23:17:35 +02:00