257 Commits

Author SHA1 Message Date
Qu Wenruo
516f1e963a btrfs-progs: block-group: rename write_one_cache_group()
The name of this function contains the word "cache", which is left from
the era where btrfs_block_group is called btrfs_block_group_cache.

Now this "cache" doesn't match anything, and we have better namings for
functions like read/insert/remove_block_group_item().

Rename it to update_block_group_item().

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-05-11 21:31:48 +02:00
Qu Wenruo
24557bb4f9 btrfs-progs: block-group: refactor how we insert a block group item
Currently the block group item insert is pretty straightforward: fill
the block group item structure and insert it into the extent tree.

However the incoming skinny block group feature is going to change this,
so this patch will refactor such insert into a new function,
insert_block_group_item(), to make the incoming feature easier to add.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-05-11 21:27:45 +02:00
Qu Wenruo
772ba86e5e btrfs-progs: rename btrfs_remove_block_group() and free_block_group_item()
To sync with the refactored kernel code.  Also since we're here, sync
the function parameters with kernel too.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-05-11 21:27:30 +02:00
Qu Wenruo
b2c8f806c4 btrfs-progs: block-group: refactor how we read one block group item
Structure btrfs_block_group has the following members which are
currently read from on-disk block group item and key:

- length - from item key
- used
- flags - from block group item

However for incoming skinny block group tree, we are going to read those
members from different sources.

This patch will refactor such read by:

- Refactor length/used/flags initialization into one function
  The new function, fill_one_block_group() will handle the
  initialization of such members.

- Use btrfs_block_group::length to replace key::offset
  Since skinny block group item would have a different meaning for its
  key offset.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-05-11 21:24:00 +02:00
Qu Wenruo
ccad599701 btrfs-progs: rename btrfs_block_group_cache to btrfs_block_group
To keep the same naming across kernel and btrfs-progs.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-05-11 20:50:00 +02:00
Qu Wenruo
5bc44891c9 btrfs-progs: kill block_group_cache::key
This would sync the code between kernel and btrfs-progs, and save at
least 1 byte for each btrfs_block_group_cache.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-05-11 20:49:50 +02:00
Qu Wenruo
877f512c55 btrfs-progs: sync block group item accessors from kernel
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-05-11 20:49:46 +02:00
Qu Wenruo
2a1823875c btrfs-progs: don't abuse READA_* for extent tree search
For extent tree search, we only search for two things: either
EXTENT_ITEM/METADATA_ITEM (inlined) or SHARED_BLOCK_REF/SHARED_DATA_REF
(keyed).

Except in certain situations like cache_block_group(), we never read tree
blocks in a forward or backward sequence.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-05-11 20:46:22 +02:00
Adam Borowski
3d379b1341 btrfs-progs: lots of typo fixes (codespell)
Signed-off-by: Adam Borowski <kilobyte@angband.pl>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-31 18:37:38 +02:00
Qu Wenruo
b70eb104df btrfs-progs: extent-tree: Fix wrong post order rb tree cleanup for block groups
[BUG]
Valgrind reports a memory leak for fsck/012; even after the image is
repaired, the memory leak can still be reproduced.
  ==107060== HEAP SUMMARY:
  ==107060==     in use at exit: 176 bytes in 1 blocks
  ==107060==   total heap usage: 10,647 allocs, 10,646 frees, 3,000,654 bytes allocated
  ==107060==
  ==107060== 176 bytes in 1 blocks are definitely lost in loss record 1 of 1
  ==107060==    at 0x483BB65: calloc (vg_replace_malloc.c:762)
  ==107060==    by 0x1BD953: read_one_block_group (extent-tree.c:2661)
  ==107060==    by 0x1BDBD8: btrfs_read_block_groups (extent-tree.c:2719)
  ==107060==    by 0x1B3A2C: btrfs_setup_all_roots (disk-io.c:1024)
  ==107060==    by 0x1B44CA: __open_ctree_fd (disk-io.c:1299)
  ==107060==    by 0x1B46C6: open_ctree_fs_info (disk-io.c:1345)
  ==107060==    by 0x16952E: cmd_check (main.c:10154)
  ==107060==    by 0x11CDC6: cmd_execute (commands.h:125)
  ==107060==    by 0x11D712: main (btrfs.c:386)
  ==107060==
  ==107060== LEAK SUMMARY:
  ==107060==    definitely lost: 176 bytes in 1 blocks
  ==107060==    indirectly lost: 0 bytes in 0 blocks
  ==107060==      possibly lost: 0 bytes in 0 blocks
  ==107060==    still reachable: 0 bytes in 0 blocks
  ==107060==         suppressed: 0 bytes in 0 blocks

[CAUSE]
In btrfs_free_block_groups(), we use
rbtree_postorder_for_each_entry_safe() to iterate over all block group
caches.

However, since we're already doing post order iteration, we shouldn't
call rb_erase() during that iteration, as it would re-balance the tree
and break the post order iteration.

This wrong rb_erase() call leads to the above memory leak.

[FIX]
Kill that wrong rb_erase() call.
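
A minimal sketch of the idea, using a hypothetical plain binary tree rather
than the btrfs-progs rb-tree API (struct node, new_node() and
free_tree_postorder() are illustrative names only). In a post order walk both
children are freed before their parent, so no node has to be unlinked or
rebalanced along the way:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a block group linked into a binary tree. */
struct node {
	struct node *left;
	struct node *right;
};

/*
 * Free every node in post order: children first, then the parent.
 * Nothing is unlinked or rebalanced on the way, so the traversal never
 * loses track of a subtree. Returns the number of nodes freed.
 */
static size_t free_tree_postorder(struct node *n)
{
	size_t freed;

	if (!n)
		return 0;
	freed = free_tree_postorder(n->left);
	freed += free_tree_postorder(n->right);
	free(n);
	return freed + 1;
}

/* Allocate a node with the given children; asserts the allocation worked. */
static struct node *new_node(struct node *left, struct node *right)
{
	struct node *n = calloc(1, sizeof(*n));

	assert(n);
	n->left = left;
	n->right = right;
	return n;
}
```

Erasing with rebalancing in the middle of such a walk can move nodes that the
iterator still expects to visit, which is how an entry ends up never freed.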

Fixes: b1bd3cd93f ("btrfs-progs: reform block groups caches structure")
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-31 18:37:37 +02:00
Qu Wenruo
3972c27db6 btrfs-progs: check/original: Fix uninitialized return value from btrfs_write_dirty_block_groups()
[BUG]
Valgrind reports the following error for fsck/007, which is only
repairable for original mode:
  ==97599== Conditional jump or move depends on uninitialised value(s)
  ==97599==    at 0x1D4A42: btrfs_commit_transaction (transaction.c:207)
  ==97599==    by 0x16475C: check_extent_refs (main.c:8097)
  ==97599==    by 0x166199: check_chunks_and_extents (main.c:8786)
  ==97599==    by 0x166441: do_check_chunks_and_extents (main.c:8842)
  ==97599==    by 0x169D13: cmd_check (main.c:10324)
  ==97599==    by 0x11CDC6: cmd_execute (commands.h:125)
  ==97599==    by 0x11D712: main (btrfs.c:386)
  ==97599==

[CAUSE]
If btrfs_write_dirty_block_groups() gets called with no block group
dirtied (no dirty extents created), its return value is
uninitialized, as the stack variable @ret is not initialized at all.

[FIX]
Initialize @ret to 0 for btrfs_write_dirty_block_groups(): if there are
no dirty block groups, we do nothing and shouldn't fail.
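
A minimal sketch of the pattern, with hypothetical names (write_one() and
write_dirty_items() are not btrfs-progs functions). The loop body may never
run, so the return variable needs a defined value before the loop:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-item writer: returns 0 on success, negative on error. */
static int write_one(int item)
{
	return item < 0 ? -1 : 0;
}

/*
 * Write all dirty items. With ret initialized to 0, an empty list is
 * reported as success instead of returning whatever was on the stack.
 */
static int write_dirty_items(const int *items, size_t count)
{
	int ret = 0;
	size_t i;

	for (i = 0; i < count; i++) {
		ret = write_one(items[i]);
		if (ret < 0)
			break;	/* stop at the first failure and report it */
	}
	return ret;
}
```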

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-31 18:37:37 +02:00
David Sterba
58bcd4260f btrfs-progs: move free-space-tree.[ch] to kernel-shared/
The files are very close to the kernel versions, so keep them in the
shared directory.

Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-31 18:37:34 +02:00
Su Yue
fac618e0eb btrfs-progs: cleanups after block group cache refactoring
btrfs_fs_info::block_group_cache and the bit BLOCK_GROUP_DIRTY are not
used anymore, and neither is block_group_state_bits().  Remove them.

Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Su Yue <Damenly_Su@gmx.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-03 19:58:54 +01:00
Su Yue
b1bd3cd93f btrfs-progs: reform block groups caches structure
This commit organises block groups cache in
btrfs_fs_info::block_group_cache_tree. And any dirty block groups are
linked in transaction_handle::dirty_bgs.

To keep bisection coherent, it does an almost in-place replacement:
1. Replace the old block group lookup functions with the new functions
introduced in former commits.
2. set_extent_bits(..., BLOCK_GROUP_DIRTY) calls are replaced by linking
the block group cache into trans::dirty_bgs. Checking and clearing bits
are transformed too.
3. set_extent_bits(..., bit | EXTENT_LOCKED) calls are replaced by
the new btrfs_add_block_group_cache(), which inserts caches into
btrfs_fs_info::block_group_cache_tree directly. Other operations are
converted to tree operations.

Signed-off-by: Su Yue <Damenly_Su@gmx.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-03 19:58:54 +01:00
Su Yue
162d891e4a btrfs-progs: pass @trans to functions working with dirty block groups
We are going to touch dirty_bgs in transaction directly, so every call
chain should pass @trans to the leaf functions.

Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Su Yue <Damenly_Su@gmx.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-03 19:58:54 +01:00
Su Yue
a0edc6859e btrfs-progs: block-group: add dirty_bgs list related members
The old style uses extent bit BLOCK_GROUP_DIRTY to mark dirty block
groups in extent cache. To replace it, add btrfs_trans_handle::dirty_bgs
and btrfs_block_group_cache::dirty_list.

Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Su Yue <Damenly_Su@gmx.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-03 19:58:54 +01:00
Su Yue
1d489df8f8 btrfs-progs: factor out inserting new block group
The new function btrfs_add_block_group_cache() abstracts the old
set_extent_bits and set_state_private operations.

Rename the rb tree version to btrfs_add_block_group_cache_kernel().

Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Su Yue <Damenly_Su@gmx.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-03 19:58:54 +01:00
Su Yue
e74d5ce039 btrfs-progs: rename parameter for block group search mode
Change @contains to @next of block_group_cache_tree_search().  Now, the
function will try to search the block group containing the @bytenr. If
not found, return NULL if @next is zero. Otherwise it will return the next
block group.
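
The described semantics can be sketched over a sorted array instead of the
real rb-tree (struct bg and bg_search() are hypothetical names used only for
illustration):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical simplified block group: covers [start, start + length). */
struct bg {
	uint64_t start;
	uint64_t length;
};

/*
 * Search a start-sorted, non-overlapping array for the block group
 * containing bytenr. If none contains it, return NULL when next is zero,
 * otherwise return the first block group starting after bytenr (or NULL
 * if there is none).
 */
static const struct bg *bg_search(const struct bg *bgs, size_t count,
				  uint64_t bytenr, int next)
{
	size_t i;

	for (i = 0; i < count; i++) {
		if (bytenr >= bgs[i].start &&
		    bytenr < bgs[i].start + bgs[i].length)
			return &bgs[i];
		if (bgs[i].start > bytenr)
			return next ? &bgs[i] : NULL;
	}
	return NULL;
}
```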

The mode of search used in kernel has the parameter updated.

Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Su Yue <Damenly_Su@gmx.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-03 19:58:54 +01:00
Su Yue
ff49668b71 btrfs-progs: port block group cache tree insertion and lookup functions
Simple copy and paste, removing lock operations that are useless in
progs.  The new lookup functions are temporarily named with suffix _kernel.

Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Su Yue <Damenly_Su@gmx.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-03 19:58:54 +01:00
Su Yue
764c8dea72 btrfs-progs: handle error if btrfs_write_one_block_group() failed
Just break loop and return the error code if failed.  Functions in the
call chain are able to handle it.

Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Su Yue <Damenly_Su@gmx.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-03 19:58:53 +01:00
Qu Wenruo
e073b6e14f btrfs-progs: fix superblock range exclusion check
[BUG]
For certain btrfs images, a BUG_ON() can be triggered at open_ctree()
time:

  Opening filesystem to check...
  extent_io.c:158: insert_state: BUG_ON `end < start` triggered, value 1
  btrfs(+0x2de57)[0x560c4d7cfe57]
  btrfs(+0x2e210)[0x560c4d7d0210]
  btrfs(set_extent_bits+0x254)[0x560c4d7d0854]
  btrfs(exclude_super_stripes+0xbf)[0x560c4d7c65ff]
  btrfs(btrfs_read_block_groups+0x29d)[0x560c4d7c698d]
  btrfs(btrfs_setup_all_roots+0x3f3)[0x560c4d7c0b23]
  btrfs(+0x1ef53)[0x560c4d7c0f53]
  btrfs(open_ctree_fs_info+0x90)[0x560c4d7c11a0]
  btrfs(+0x6d3f9)[0x560c4d80f3f9]
  btrfs(main+0x94)[0x560c4d7b60c4]
  /usr/lib/libc.so.6(__libc_start_main+0xf3)[0x7fd189773ee3]
  btrfs(_start+0x2e)[0x560c4d7b635e]

[CAUSE]
This is caused by passing @len == 0 to add_excluded_extent(), which
means one reverse mapped range is just outside the block group range,
which normally indicates an off-by-one error.

[FIX]
Fix the boundary check on the reverse mapped range against the block
group range.  If a reverse mapped super block starts at the end of the
block group, it doesn't overlap it, so we don't need to handle that case.
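
A minimal sketch of such a boundary check on half-open ranges
(clip_to_block_group() is a hypothetical helper, not the function changed by
this commit). A range starting exactly at the end of the block group is
treated as non-overlapping, so a zero-length range is never passed on:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Clip the range [start, start + len) to the block group
 * [bg_start, bg_start + bg_len) and return the clipped length.
 * Ranges are half-open, so a range that begins exactly at the end of the
 * block group does not overlap it and yields 0.
 */
static uint64_t clip_to_block_group(uint64_t start, uint64_t len,
				    uint64_t bg_start, uint64_t bg_len)
{
	uint64_t end = start + len;
	uint64_t bg_end = bg_start + bg_len;

	if (start >= bg_end || end <= bg_start)
		return 0;	/* no overlap: nothing to exclude */
	if (start < bg_start)
		start = bg_start;
	if (end > bg_end)
		end = bg_end;
	return end - start;
}
```

A caller would skip the exclusion entirely when the returned length is 0.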

Issue: #210
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-09 14:27:10 +01:00
Qu Wenruo
77b2af0a9a btrfs-progs: do proper error handling in btrfs_chunk_readonly()
[BUG]
For a fuzzed image, both modes of `btrfs check` trigger BUG_ON():

  Opening filesystem to check...
  volumes.c:1795: btrfs_chunk_readonly: BUG_ON `!ce` triggered, value 1
  btrfs(+0x2f712)[0x557beff3b712]
  btrfs(+0x32059)[0x557beff3e059]
  btrfs(btrfs_read_block_groups+0x282)[0x557beff30972]
  btrfs(btrfs_setup_all_roots+0x3f3)[0x557beff2ab23]
  btrfs(+0x1ef53)[0x557beff2af53]
  btrfs(open_ctree_fs_info+0x90)[0x557beff2b1a0]
  btrfs(+0x6d3f9)[0x557beff793f9]
  btrfs(main+0x94)[0x557beff200c4]
  /usr/lib/libc.so.6(__libc_start_main+0xf3)[0x7f623ac97ee3]
  btrfs(_start+0x2e)[0x557beff2035e]

[CAUSE]
The fuzzed image has a bad extent tree:

        item 0 key (288230376165343232 BLOCK_GROUP_ITEM 8388608) itemoff 16259 itemsize 24
                block group used 0 chunk_objectid 256 flags DATA

There is no corresponding chunk for the block group.

Then we hit the BUG_ON(), which expects a chunk mapping for
btrfs_chunk_readonly().

[FIX]
Replace that BUG_ON() with proper error handling, and make
btrfs_read_block_groups() handle the -ENOENT error from
read_one_block_group() and continue.

So one corrupted block group item won't screw up the remaining block
group items.
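
The skip-and-continue behaviour can be sketched as follows (read_one() and
read_all() are hypothetical stand-ins, not the actual btrfs-progs functions):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical reader: -ENOENT means the item has no matching chunk. */
static int read_one(int has_chunk)
{
	return has_chunk ? 0 : -ENOENT;
}

/*
 * Read every item, skipping ones that fail with -ENOENT so that a single
 * corrupted item does not stop the rest from loading. Any other error
 * aborts the loop. *loaded receives the number of items read successfully.
 */
static int read_all(const int *has_chunk, size_t count, size_t *loaded)
{
	size_t i;

	*loaded = 0;
	for (i = 0; i < count; i++) {
		int ret = read_one(has_chunk[i]);

		if (ret == -ENOENT)
			continue;
		if (ret < 0)
			return ret;
		(*loaded)++;
	}
	return 0;
}
```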

Issue: #209
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-09 14:27:10 +01:00
Su Yue
97fc76c0ac btrfs-progs: add comments of block group lookup functions
The progs side function btrfs_lookup_first_block_group() calls
find_first_extent_bit() to find block group which contains bytenr
or after the bytenr. This behavior differs from kernel code, so
add the comments.

Add comments for btrfs_lookup_block_group() too; this one works
like the kernel side.

Signed-off-by: Su Yue <Damenly_Su@gmx.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-22 19:09:51 +01:00
David Sterba
1f5094bb5c btrfs-progs: add support for raid1c3 and raid1c4
Add support for 3- and 4-copy variants of RAID1. This adds resiliency
against 2 or 3 devices, respectively, being lost or damaged.

$ ./mkfs.btrfs -m raid1c4 -d raid1c3 /dev/sd[abcd]

Label:              (null)
UUID:               f1f988ab-6750-4bc2-957b-98a4ebe98631
Node size:          16384
Sector size:        4096
Filesystem size:    8.00GiB
Block group profiles:
  Data:             RAID1C3         273.06MiB
  Metadata:         RAID1C4         204.75MiB
  System:           RAID1C4           8.00MiB
SSD detected:       no
Incompat features:  extref, skinny-metadata, raid1c34
Number of devices:  4
Devices:
   ID        SIZE  PATH
    1     2.00GiB  /dev/sda
    2     2.00GiB  /dev/sdb
    3     2.00GiB  /dev/sdc
    4     2.00GiB  /dev/sdd

Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-22 19:09:50 +01:00
Qu Wenruo
989a99b5f8 btrfs-progs: Replace btrfs_block_group_cache::item with dedicated members
We access btrfs_block_group_cache::item mostly for @used and @flags.

@flags is already a dedicated member in btrfs_block_group_cache, only
@used doesn't have a dedicated member.

This patch will remove btrfs_block_group_cache::item and add
btrfs_block_group_cache::used.

It's the btrfs-progs equivalent of the following kernel patches:
btrfs: move block_group_item::used to block group
btrfs: move block_group_item::flags to block group
btrfs: remove embedded block_group_cache::item

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18 19:21:09 +01:00
Qu Wenruo
e33a73b754 btrfs-progs: Refactor btrfs_read_block_groups()
This patch does the following refactor:
- Refactor parameter from @root to @fs_info

- Refactor the large loop body into another function
  Now we have a helper function, read_one_block_group(), to handle
  block group cache and space info related routine.

- Refactor the return value
  Even though we have code handling ret > 0 from find_first_block_group(),
  it never works, because when there are no more block groups,
  find_first_block_group() just returns -ENOENT rather than 1.

  This is super confusing; it's almost a miracle it works at all.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18 19:21:08 +01:00
Qu Wenruo
e46281d6fb btrfs-progs: Refactor excluded extent functions to use fs_info
The following functions are just using @root to reach fs_info:
- exclude_super_stripes
- free_excluded_extents
- add_excluded_extent

Refactor them to use fs_info directly.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18 19:21:08 +01:00
Johannes Thumshirn
c04bcdcacc btrfs-progs: move crc32c implementation to crypto/
With the introduction of xxhash64 to btrfs-progs we created a crypto/
directory for all the hashes used in btrfs (although no
cryptographically secure hash is there yet).

Move the crc32c implementation from kernel-lib/ to crypto/ as well so we
have all hashes consolidated.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18 19:20:02 +01:00
Rosen Penev
5d72055066 btrfs-progs: Fix printf formats
Discovered with cppcheck. Fix signed/unsigned int mismatches, sizeof and
long formats.

Pull-request: #197
Signed-off-by: Rosen Penev <rosenp@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-10-14 17:31:05 +02:00
David Sterba
94fced6353 btrfs-progs: build: drop kernel-lib from -I and update paths
Include the files by full path to avoid any confusion in case of
potentially duplicate names.

Signed-off-by: David Sterba <dsterba@suse.com>
2019-07-03 20:49:04 +02:00
David Sterba
c07960c8be btrfs-progs: move utils.[ch] to common/
Update include paths and remove some duplicates.

Signed-off-by: David Sterba <dsterba@suse.com>
2019-07-03 20:49:04 +02:00
Nikolay Borisov
4e34bdb868 btrfs-progs: Remove old commented code
This piece of code has been commented since 2009, given the number of
changes that have happened it's unlikely it could be made to work or is
needed at all. Just delete it.

The code was disabled in commit 95d3f20b51 ("Mixed back reference
(FORWARD ROLLING FORMAT CHANGE)") that changed the format significantly
and we don't need the compatibility code anymore.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-07-03 13:31:16 +02:00
Nikolay Borisov
56c31f13d6 btrfs-progs: Remove redundant if
'pin' is always true in __free_extent so there is no point in checking
it. Just remove the if and unindent the code.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-07-03 13:31:16 +02:00
Qu Wenruo
c31edf610c btrfs-progs: Fix false ENOSPC alert by tracking used space correctly
[BUG]
There is a bug report of unexpected ENOSPC from btrfs-convert, issue #123.

After some debugging, even when we have enough unallocated space, we
still hit ENOSPC at btrfs_reserve_extent().

[CAUSE]
Btrfs-progs relies on chunk preallocator to make enough space for
data/metadata.

However after the introduction of delayed-ref, it's no longer reliable
to rely on btrfs_space_info::bytes_used and
btrfs_space_info::bytes_pinned to calculate used metadata space.

For a running transaction with a lot of allocated tree blocks,
btrfs_space_info::bytes_used stays its original value, and will only be
updated when running delayed ref.

This makes btrfs-progs chunk preallocator completely useless. And for
btrfs-convert/mkfs.btrfs --rootdir, if we're going to have enough
metadata to fill a metadata block group in one transaction, we will hit
ENOSPC no matter whether we have enough unallocated space.

[FIX]
This patch will introduce btrfs_space_info::bytes_reserved to track how
much space we have reserved but not yet committed to the extent tree.

To support this change, this commit also introduces the following
modification:

- More comment on btrfs_space_info::bytes_*
  To make code a little easier to read

- Export update_space_info() to preallocate empty data/metadata space
  info for mkfs.
  For mkfs, we only have a temporary fs image with SYSTEM chunk only.
  Export update_space_info() so that we can preallocate empty
  data/metadata space info before we start a transaction.

- Proper btrfs_space_info::bytes_reserved update
  The timing is the same as the kernel's (except we don't need to update
  bytes_reserved for data extents)
  * Increase bytes_reserved when call alloc_reserved_tree_block()
  * Decrease bytes_reserved when running delayed refs
    With the help of head->must_insert_reserved to determine whether we
    need to decrease.
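
A simplified sketch of this accounting (struct space and its helpers are
hypothetical and much smaller than btrfs_space_info). Bytes are counted as
reserved at allocation time and move to used only when the delayed ref runs,
so the free-space estimate is already correct in between:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified space accounting for one block group type. */
struct space {
	uint64_t total;
	uint64_t used;		/* committed to the extent tree */
	uint64_t reserved;	/* allocated, not yet committed */
};

/* Space still available for new allocations. */
static uint64_t space_free(const struct space *s)
{
	return s->total - s->used - s->reserved;
}

/* Allocation time: count the bytes as reserved right away. */
static void space_reserve(struct space *s, uint64_t bytes)
{
	assert(bytes <= space_free(s));
	s->reserved += bytes;
}

/* Delayed ref run: the bytes move from reserved to used. */
static void space_commit(struct space *s, uint64_t bytes)
{
	assert(bytes <= s->reserved);
	s->reserved -= bytes;
	s->used += bytes;
}
```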

Issue: #123
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-07-03 13:31:14 +02:00
Qu Wenruo
be08ddeb52 btrfs-progs: Do metadata preallocation for fs trees and csum tree
In GitHub issues, one user reports an unexpected ENOSPC error when enabling
datasum during convert.  After some investigation, it looks like
during ext2_saved/image creation, we could create a large file extent
whose size can be 128M (max data extent size).

In that case, its csum block will be at least 128K. In certain cases
we need to allocate extra metadata chunks to fulfill such a space
requirement.
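
The 128K figure follows from one checksum per sector. A small worked sketch,
assuming 4 KiB sectors and 4-byte crc32c checksums (csum_bytes() is a
hypothetical helper):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes of checksum needed for data_bytes of data, with one checksum of
 * csum_size bytes per sector. Assumes data_bytes is a multiple of
 * sector_size.
 */
static uint64_t csum_bytes(uint64_t data_bytes, uint32_t sector_size,
			   uint32_t csum_size)
{
	return data_bytes / sector_size * csum_size;
}
```

For a 128 MiB extent this is 32768 sectors times 4 bytes, i.e. 128 KiB of
checksum data before any tree overhead.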

However we only do metadata prealloc if we're reserving extents for fs
trees.  (we use btrfs_root::ref_cows to determine whether we should do
metadata prealloc, and that member is only set for fs trees).

There is no explanation of why we only do metadata prealloc for file
trees, but my educated guess is that it could be related to avoiding
nested extent/chunk tree modification.

At least extent reservation for the csum tree shouldn't be a problem with
metadata block group preallocation.

So add a new condition for metadata preallocation to avoid the unexpected
ENOSPC problem.

Issue: #123
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-06-05 20:27:31 +02:00
Qu Wenruo
922a631d50 btrfs-progs: Avoid nested chunk allocation call
There is an indirect recursion which can reach the extent reservation:

btrfs_reserve_extent()             <--|
|- do_chunk_alloc()                   |
   |- btrfs_alloc_chunk()             |
      |- btrfs_insert_item()          |
         |- btrfs_reserve_extent() <--|

Currently, we're using root->ref_cows to determine whether we should do
chunk prealloc to avoid such loop.

But that's still a hidden trap. Instead of solving it using some hidden
tricks, this patch will make chunk/block group allocation exclusive.

Now if do_chunk_alloc() decides to allocate a chunk, it will set a flag in
the transaction handle so that a new call of do_chunk_alloc() will refuse
to allocate a new chunk until the current chunk allocation finishes.

The chunks get over-allocated by 2M so there's enough space in case the
recursive call asks for a different type of blockgroup.
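
The guard can be sketched like this (struct trans and both functions are
simplified, hypothetical stand-ins for the real call chain):

```c
#include <assert.h>

/* Hypothetical transaction handle with an "allocation in progress" flag. */
struct trans {
	int allocating_chunk;
	int chunks_allocated;
};

static void reserve_extent(struct trans *t, int need_chunk);

/*
 * Allocate a chunk unless an allocation is already running in this
 * transaction. The nested reserve_extent() call models the item insertion
 * that re-enters the reservation path.
 */
static void do_chunk_alloc(struct trans *t)
{
	if (t->allocating_chunk)
		return;		/* refuse to nest */
	t->allocating_chunk = 1;
	t->chunks_allocated++;
	reserve_extent(t, 1);	/* re-entry: must not allocate again */
	t->allocating_chunk = 0;
}

static void reserve_extent(struct trans *t, int need_chunk)
{
	if (need_chunk)
		do_chunk_alloc(t);
}
```

Because the flag lives in the transaction handle, the nested call returns
immediately and exactly one chunk is allocated per outer request.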

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-06-05 20:27:31 +02:00
Qu Wenruo
085445e793 btrfs-progs: Cleanup BTRFS_COMPAT_EXTENT_TREE_V0
BTRFS_COMPAT_EXTENT_TREE_V0 was introduced for a short time in the kernel,
and that was over 10 years ago.

Nowadays there should be no user of that feature, and the kernel removed
this support in June 2018. There is no need for btrfs-progs to support
it.

This patch will remove EXTENT_TREE_V0 related code and replace those
BUG_ON()s with more graceful error messages.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-06-05 18:00:07 +02:00
Qu Wenruo
c98155f07a btrfs-progs: Output extent tree leaf if we failed to find a backref
There is a bug report of a BUG_ON() which is caused by __free_extent()
failing to look up a backref extent:
  Failed to find [1429288337408, 168, 16384]
  btrfs unable to find ref byte nr 1429288583168 parent 0 root 2 owner 0 offset 0
  convert/source-ext2.c:834: ext2_copy_inodes: BUG_ON ret triggered, value -5
  ./btrfs-convert[0x410941]
  ./btrfs-convert(main+0x1fdc)[0x40d3b8]
  /lib64/libc.so.6(__libc_start_main+0xf3)[0x7f93bb7d2f33]
  ./btrfs-convert(_start+0x2e)[0x40a96e]

It's still unclear how this bug can be triggered, but adding such debug
output will provide more info for us to debug.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-05-27 16:11:43 +02:00
Qu Wenruo
5672a69639 btrfs-progs: Handle error properly in btrfs_commit_transaction()
[BUG]
When running fuzz-tests/003 and fuzz-tests/009, btrfs-progs will crash
due to BUG_ON().

[CAUSE]
We abused BUG_ON() in btrfs_commit_transaction(), which is one of the
most error prone functions for fuzzed images.

Currently, to clean up the aborted transaction, we only need to clean up
the only per-transaction data: delayed refs.

This patch will introduce a new function, btrfs_destroy_delayed_refs(),
to clean up delayed refs when we fail to commit a transaction.

With that function, we will gently destroy per-trans delayed ref, and
remove the BUG_ON()s in btrfs_commit_transaction().

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-05-13 15:54:47 +02:00
Qu Wenruo
45e58a1acf btrfs-progs: Refactor btrfs_finish_extent_commit()
This patch will refactor btrfs_finish_extent_commit():

- Make it return void
  There is no failure pattern for btrfs_finish_extent_commit(), thus it
  always returns 0. And the caller doesn't care about the return value.
  So no need to return int.

- Remove @root and @unpin parameters

  @root is only used to extract fs_info, which can be extracted from
  transaction handler already.
  @unpin is always fs_info->pinned_extents.
  All these parameters can be extracted from @trans, no need to pass
  them.

The function signature now matches the kernel counterpart.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-05-13 15:52:46 +02:00
Qu Wenruo
27a5b9ddc3 btrfs-progs: Remove the dead branch in btrfs_run_delayed_refs()
cleanup_ref_head() will only return 0 or 1, no way to return a negative
value.  So remove the dead branch.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-05-13 15:52:24 +02:00
Qu Wenruo
4ab95eb8b0 Revert "btrfs-progs: Do metadata preallocation as long as we're not modifying extent tree"
Commit 7a12d8470e ("btrfs-progs: Do metadata preallocation as long as
we're not modifying extent tree") tries to fix #123, however due to the
fact that chunk tree also has root->ref_cows set, we will call
do_chunk_alloc() until call stack explodes.

So revert that offending patch until we have a much better comment on
root->ref_cows and find a better solution to this problem.

Signed-off-by: Qu Wenruo <wqu@suse.com>
2019-04-16 09:04:43 +08:00
Qu Wenruo
7a12d8470e btrfs-progs: Do metadata preallocation as long as we're not modifying extent tree
In GitHub issues, one user reports an unexpected ENOSPC error when enabling
datasum during convert.  After some investigation, it looks like
during ext2_saved/image creation, we could create a large file extent
whose size can be 128M (max data extent size).

In that case, its csum block will be at least 128K. In certain cases
we need to allocate extra metadata chunks to fulfill such a space
requirement.

However we only do metadata prealloc if we're reserving extents for fs
trees.  (we use btrfs_root::ref_cows to determine whether we should do
metadata prealloc, and that member is only set for fs trees).

There is no explanation of why we only do metadata prealloc for file
trees, but at least from my investigation, it could be related to avoiding
nested extent tree modification.

At least extent reservation for the csum tree shouldn't be a problem with
metadata block group preallocation.

So change the metadata block group preallocation check from
"root->ref_cows" to "root->root_key.objectid !=
BTRFS_EXTENT_TREE_OBJECTID", and add some comment for it.

Issue: #123
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-03-05 15:33:54 +01:00
Rosen Penev
01e35d9f53 btrfs-progs: treewide: Fix missing declarations
Found using -Wmissing-prototypes in GCC.  This should improve LTO
behavior.

Note that set_free_space_tree_thresholds is an unused function. Adding
inline seems to remove the unused function warning.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Rosen Penev <rosenp@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2018-11-13 13:32:41 +01:00
Nikolay Borisov
cb4af7021c btrfs-progs: Hook FST code in extent (de)alloc
For now this doesn't change the functionality since FST code is not yet
enabled via the compat bits. But this will be needed when it's enabled
so that the FST is correctly modified during repair operations that
allocate/deallocate extents.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2018-10-25 16:11:39 +02:00
Nikolay Borisov
43f14d06fd btrfs-progs: Merge alloc_reserved_tree_block2 and alloc_reserved_tree_block
Now that delayed refs have been wired up, let's merge the two functions. In
the process also remove one BUG_ON since alloc_reserved_tree_block's
callers can handle errors. No functional changes.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2018-10-23 14:48:42 +02:00
Nikolay Borisov
8bbb72cfc5 btrfs-progs: Remove __free_extent2, now unused
Now that delayed refs have been all wired up clean up the __free_extent2
adapter function since it's no longer needed. No functional changes.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2018-10-23 14:48:41 +02:00
Nikolay Borisov
6de2debdb0 btrfs-progs: Remove old delayed refs infrastructure
Given that the new delayed refs infrastructure is implemented and wired
up, there is no point in keeping the old code. So just remove it.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2018-10-23 14:48:41 +02:00
Nikolay Borisov
909357e867 btrfs-progs: Wire up delayed refs
This commit enables the delayed refs infrastructure. This entails doing
the following:

1. Replacing existing calls of btrfs_extent_post_op (which is the
   equivalent of delayed refs) with the proper btrfs_run_delayed_refs.
   As well as eliminating open-coded calls to finish_current_insert and
   del_pending_extents which execute the delayed ops.

2. Wiring up the addition of delayed refs when freeing extents
   (btrfs_free_extent) and when adding new extents (alloc_tree_block).

3. Adding calls to btrfs_run_delayed refs in the transaction commit
   path alongside comments why every call is needed, since it's not
   always obvious (those call sites were derived empirically by running
   and debugging existing tests)

4. Correctly flagging the transaction in which we are reinitialising
   the extent tree.

5. Moving the btrfs_write_dirty_block_groups() call into the transaction
   commit path, since block groups should be written to disk after the
   last delayed refs have been run.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2018-10-23 14:48:41 +02:00
Nikolay Borisov
d8a5e756be btrfs-progs: Make btrfs_write_dirty_block_groups take only trans argument
The root argument is used only to get a reference to the fs_info; this
can be obtained from the transaction handle being passed, so use that.
This is in preparation for moving this function into the main transaction
commit routine. No functional changes.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2018-10-23 14:48:41 +02:00