Commit Graph

17 Commits

Author SHA1 Message Date
David Sterba d739e3b73a btrfs-progs: kernel-shared: use kmalloc and kfree
All the code in kernel-shared should use the proper memory allocation
helpers.

Signed-off-by: David Sterba <dsterba@suse.com>
2023-11-03 18:04:37 +01:00
Qu Wenruo e79f18a4a7 btrfs-progs: introduce a basic metadata free space reservation check
Unlike kernel, in btrfs-progs btrfs_start_transaction() never checks if
there is enough metadata space.

This can lead to very dangerous situation where there is no metadata
space left at all, deadlocking future tree operations.

This patch introduces a very basic version of metadata/system free space
check by:

- Check if there is enough metadata/system space left
  If there is enough, go as usual.

- If there is not enough space left, try allocating a new chunk

- Recheck if the new space can meet our demand
  If not, return ERR_PTR(-ENOSPC).
  Otherwise, allocate a new trans handle to the caller.

This is possible thanks to the simplified transaction model in
btrfs-progs:

- We don't allow joining a transaction
  This means we don't need to handle complex cases like data ordered
  extents, which need to reserve space first, then join the current
  transaction and use the reserved blocks.

- We don't allow multiple transaction handles for one transaction
  Since btrfs-progs is single threaded, we always start a transaction
  and then commit it.

However there is a feature that must be an exception for the new
metadata/system free space check:

- btrfs check --init-extent-tree
  As all the meta/system free space check is based on the space info,
  which is loaded from block group items.
  Thus when rebuilding extent tree, we can no longer have an accurate
  view, thus we have to disable the feature for the whole execution if
  we're rebuilding the extent tree.

For now, there is no regression exposed during the self tests, but I
really hope this can be an extra safety net to prevent causing ENOSPC
deadlock in btrfs-progs.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-17 19:33:59 +02:00
Josef Bacik 2a34fd3892 btrfs-progs: cleanup dirty buffers on transaction abort
When adding the extent buffer leak detection I started getting failures
on some of the fuzz tests.  This is because we don't clean up dirty
buffers for aborted transactions, we just leave them dirty and thus we
leak them.  Fix this up by making btrfs_commit_transaction() on an
aborted transaction properly cleanup the dirty buffers that exist in the
system.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-03 01:11:57 +02:00
David Sterba 21aa6777b2 btrfs-progs: clean up includes, using include-what-you-use
Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-03 01:11:57 +02:00
Josef Bacik 0fe11183af btrfs-progs: update btrfs_cow_block to match the in-kernel definition
btrfs_cow_block takes the lockdep nesting enum in the kernel.  Update
the definition to match the kernel version to make syncing ctree.c into
btrfs-progs more straightforward.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-03 01:11:56 +02:00
Josef Bacik f94ad0c516 btrfs-progs: pass btrfs_trans_handle through btrfs_clear_buffer_dirty
This is the calling convention in the kernel because we track dirty
blocks per transaction instead of globally in the fs_info.  Simply
mirror what we do in the kernel to make it easier to sync ctree.c
locally.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-03 01:11:55 +02:00
Josef Bacik db7157d912 btrfs-progs: clear root dirty when we update the root
We don't currently use the bit to track whether or not the root is
dirty, but when we sync ctree.c it uses this bit to determine if we
should add the root to the dirty list.  Clear this bit when we update
the root so that the dirty tracking works properly when we sync ctree.c.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-08-28 17:24:25 +02:00
David Sterba 2a6810615b btrfs-progs: partial sync of transaction.h from kernel
Add structures and prototypes. Leave out undefined types (time64_t,
block_group_rsv) for now. Kernel 6.4-rc1.

Signed-off-by: David Sterba <dsterba@suse.com>
2023-05-26 18:02:32 +02:00
Josef Bacik 656938b665 btrfs-progs: rename clear_extent_buffer_dirty to btrfs_clear_buffer_dirty
This is a mirror of the change I've done in the kernel, but in progs
it's even more simply because clean_tree_block was just a wrapper around
clear_extent_buffer_dirty.  Change this to btrfs_clear_buffer_dirty, and
then update all the callers to use this helper instead of
clean_tree_block and clear_extent_buffer_dirty.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-05-26 18:02:29 +02:00
Josef Bacik 4a9a8f2a8a btrfs-progs: sync extent-io-tree.[ch] and misc.h from the kernel
This is a bit larger than the previous syncs, because we use
extent_io_tree's everywhere.  There's a lot of stuff added to
kerncompat.h, and then I went through and cleaned up all the API
changes, which were

- extent_io_tree_init takes an fs_info and an owner now.
- extent_io_tree_cleanup is now extent_io_tree_release.
- set_extent_dirty takes a gfpmask.
- clear_extent_dirty takes a cached_state.
- find_first_extent_bit takes a cached_state.

The diffstat looks insane for this, but keep in mind extent-io-tree.c
and extent-io-tree.h are ~2000 loc just by themselves.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-05-26 18:02:29 +02:00
Josef Bacik ccee633f3a btrfs-progs: move dirty eb tracking to it's own io_tree
btrfs-progs has a cache tree embedded in the extent_io_tree in order to
track extent buffers.  We use the extent_io_tree part to track dirty,
and the cache tree to keep the extent buffers in.  When we sync
extent-io-tree.[ch] we'll lose this ability, so separate out the dirty
tracking into its own extent_io_tree.  Subsequent patches will adjust
the extent buffer lookup so it doesn't use the custom extent_io_tree
thing.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-11-28 18:57:43 +01:00
Josef Bacik af30cf2e3e btrfs-progs: make the find extent buffer helpers take fs_info
This is a cleanup patch to make syncing the btrfs kernel code into
btrfs-progs easier.  In btrfs-progs we have an extra cache in the
extent_io_tree that's exclusively used for the extent buffer tracking.
In order to untangle this dependency start passing around the fs_info to
search for extent_buffers, and then have the helpers use the appropriate
structure to find the extent buffer.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-11-28 18:57:43 +01:00
Qu Wenruo 08bb354a1c btrfs-progs: properly handle write error when writing back tree blocks
[BUG]
If we emulate a write error during commit transaction, by setting the
block device read-only, then we can easily have the following crash
using "btrfs check --clear-space-cache v2":

  Opening filesystem to check...
  Checking filesystem on /dev/test/scratch1
  UUID: 5945915b-37f1-4bfa-9f64-684b318b8f73
  Clear free space cache v2
  Error writing to device 1
  kernel-shared/transaction.c:156: __commit_transaction: BUG_ON `ret` triggered, value 1
  ./btrfs(+0x570c9)[0x562ec894f0c9]
  ./btrfs(+0x57167)[0x562ec894f167]
  ./btrfs(__commit_transaction+0x13b)[0x562ec894f7f2]
  ./btrfs(btrfs_commit_transaction+0x214)[0x562ec894fa64]
  ./btrfs(btrfs_clear_free_space_tree+0x177)[0x562ec8941ae6]
  ./btrfs(+0xc8958)[0x562ec89c0958]
  ./btrfs(+0xc9d53)[0x562ec89c1d53]
  ./btrfs(+0x17ec7)[0x562ec890fec7]
  ./btrfs(main+0x12f)[0x562ec8910908]
  /usr/lib/libc.so.6(+0x232d0)[0x7ff917ee82d0]
  /usr/lib/libc.so.6(__libc_start_main+0x8a)[0x7ff917ee838a]
  ./btrfs(_start+0x25)[0x562ec890fdc5]
  Aborted (core dumped)

[CAUSE]
The call trace has shown it's a BUG_ON(), and it's from
__commit_transaction(), which is writing tree blocks back.

[FIX]
The fix is pretty simple, just return error.

In fact we even have an error value check in btrfs_commit_transaction()
just after __commit_transaction() call (although not catching the return
value from it).

And since we're here, also call btrfs_abort_transaction() to prevent
newer transactions from being started.

Now we won't have a full crash:

  Opening filesystem to check...
  Checking filesystem on /dev/test/scratch1
  UUID: 5945915b-37f1-4bfa-9f64-684b318b8f73
  Clear free space cache v2
  Error writing to device 1
  ERROR: failed to write bytenr 30425088 length 16384: Operation not permitted
  ERROR: failed to write tree block 30425088: Operation not permitted
  ERROR: failed to clear free space cache v2: -1
  extent buffer leak: start 30720000 len 16384

Reported-by: Christoph Anton Mitterer <calestyo@scientia.org>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-11 09:08:08 +02:00
Qu Wenruo 4a940ab2c0 btrfs-progs: fix a memory leak when starting a transaction on fs with error
Function btrfs_start_transaction() will allocate the memory
unconditionally, but if the fs has an aborted transaction we don't free
the allocated memory but return error directly.

Fix it by only allocate the new memory after all the checks.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-04-25 18:32:17 +02:00
Josef Bacik 9ee6cc78a8 btrfs-progs: add support for loading the block group root
This adds the ability to load the block group root, as well as make sure
the various backup super block and super block updates are made
appropriately.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-09 18:06:51 +01:00
Naohiro Aota bfd34b7876 btrfs-progs: zoned: redirty clean extent buffers
Tree manipulating operations like merging nodes often release
once-allocated tree nodes. Btrfs cleans such nodes so that pages in the
node are not uselessly written out. On ZONED drives, however, such
optimization blocks the following IOs as the cancellation of the write
out of the freed blocks breaks the sequential write sequence expected by
the device.

Check if next dirty extent buffer is continuous to a previously written
one. If not, it redirty extent buffers between the previous one and the
next one, so that all dirty buffers are written sequentially.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-06 16:41:45 +02:00
David Sterba 6069bc52a9 btrfs-progs: move transaction.c to kernel-shared/
Signed-off-by: David Sterba <dsterba@suse.com>
2020-08-31 17:01:06 +02:00