Now that delayed refs have been all wired up clean up the __free_extent2
adapter function since it's no longer needed. No functional changes.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Given that the new delayed refs infrastructure is implemented and wired
up, there is no point in keeping the old code. So just remove it.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This commit enables the delayed refs infrastructures. This entails doing
the following:
1. Replacing existing calls of btrfs_extent_post_op (which is the
equivalent of delayed refs) with the proper btrfs_run_delayed_refs.
As well as eliminating open-coded calls to finish_current_insert and
del_pending_extents which execute the delayed ops.
2. Wiring up the addition of delayed refs when freeing extents
(btrfs_free_extent) and when adding new extents (alloc_tree_block).
3. Adding calls to btrfs_run_delayed refs in the transaction commit
path alongside comments why every call is needed, since it's not
always obvious (those call sites were derived empirically by running
and debugging existing tests)
4. Correctly flagging the transaction in which we are reinitialising
the extent tree.
5. Moving btrfs_write_dirty_block_groups to
btrfs_write_dirty_block_groups since blockgroups should be written to
disk after the last delayed refs have been run.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The root argument is used only to get a reference to the fs_info, this
can be achieved with the transaction handle being passed so use that.
This is in preparation for moving this function in the main transaction
commit routine. No functional changes.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This commit pulls those portions of the kernel implementation of
delayed refs which are necessary to have them working in user-space.
I've done the following modifications:
1. Replaced all kmem_cache_alloc calls to kmalloc.
2. Removed all locking-related code, since we are single threaded in
userspace.
3. Removed code which deals with data refs - delayed refs in user space
are going to be used only for cowonly trees.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This is a simple adapter function to convert the delayed-refs structures
to the current arguments of alloc_reserved_tree_block.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This is a simple adapter to convert the arguments delayed ref arguments
to the existing arguments of __free_extent.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When running test fuzz/003, we could hit the following BUG_ON:
====== RUN MAYFAIL btrfs check --init-csum-tree tests//fuzz-tests/images/bko-155621-bad-block-group-offset.raw.restored
Unable to find block group for 0
Unable to find block group for 0
Unable to find block group for 0
extent-tree.c:2657: alloc_tree_block: BUG_ON `ret` triggered, value -28
failed (ignored, ret=134): btrfs check --init-csum-tree tests/fuzz-tests/images/bko-155621-bad-block-group-offset.raw.restored
mayfail: returned code 134 (SIGABRT), not ignored
test failed for case 003-multi-check-unmounted
Just remove that BUG_ON() and allow us to exit gracefully, the caller
handles the errors.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Currently some instances of btrfs_free_extent are called with the
last parameter ("offset") being set to 1. This makes no sense, since
offset is used for data extents. I suspect this is a left-over from
95d3f20b51e9 ("Mixed back reference (FORWARD ROLLING FORMAT CHANGE)")
since this commit changed the signature of the function from :
-int btrfs_free_extent(struct btrfs_trans_handle *trans, struct btrfs_root
- *root, u64 bytenr, u64 num_bytes, u64 parent,
- u64 root_objectid, u64 ref_generation,
- u64 owner_objectid, int pin);
to
+int btrfs_free_extent(struct btrfs_trans_handle *trans,
+ struct btrfs_root *root,
+ u64 bytenr, u64 num_bytes, u64 parent,
+ u64 root_objectid, u64 owner, u64 offset);
I.e the last parameter was "pin" and not offset. So these are just
leftovers with no semantic meaning. Fix this by passing 0.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This is not really needed, since we can reference the fs_info from the
passed transaction. This is in preparation for delayed-refs support.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This argument is no longer used in this function so remove it.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
They are not really needed, what free_extent_hook wants is really a
pointer to fs_info so give it to it directly. This is in preparation
of delayed refs code.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This is in preparation of delayed refs code.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Instead of updating this during update_block_group, move the updating
code at the places where we free/allocate a block. This resembles the
current state of the kernel code. This is in prep for delayed refs.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
It's not needed, since we can obtain a reference to fs_info from the
passed transaction handle. This is needed by delayed refs code.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This argument is used to obtain a reference to fs_info, which can
already be done from the passed trans handle, so use that instead.
This is in preparation for delayed refs support.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Presently btrfs-progs haven't pulled the enum defining the symbolic
names of read ahead constants. This commit adds the enum and
simultaneously converts all usages to respective symbolic name.
No functional change, just making the code human readable.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
It's not needed since we can acquire a reference to the fs_info from
the transaction handle already passed.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
It's used only to get a reference to fs_info, which can be obtained from
the transaction handle.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
That function really wants an fs_info and not a root. Accidentally,
this also makes the kernel/user space signatures to be coherent.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This is no longer used by the callees of that function so remove it.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This function actually uses only the extent_buffer arg but takes 3
arguments. Furthermore, it's current interface doesn't even mirror
the kernel counterpart. Just remove the extra arguments.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This function needs btrfs_fs_info and not a root. So make it directly
take btrfs_fs_info,
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Just reference it directly from trans->fs_info.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This function always operates on the extent root which can be
referenced from trans->fs_info. Do that to simplify function's
signature.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
It's always set to extent_root and the function already takes a
transaction handle where fs_info could be referenced and in turn
the extent_tree.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Originally commit 2681e00f00 ("btrfs-progs: check for matchingi
free space in cache") added the account_super_bytes function to prevent
false negative when running btrfs check. Turns out this function is
really copied exclude_super_stripes, excluding the calls to
exclude_super_stripes. Later commit e4797df6a9 ("btrfs-progs: check
the free space tree in btrfsck") introduced proper version of
exclude_super_stripes. Instead of duplicating the function, just remove
account_super_bytes and use exclude_super_stripes instead of the former.
This also has the benefit of bringing the userspace code a bit closer
to the kernel counterpart.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This parameter was introduced with the original implementation of the
function but has never really been used, so just remove it.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Just like kernel cleanup made by David, btrfs_print_leaf() and
btrfs_print_tree() doesn't need btrfs_root parameter at all.
With previous patches as preparation, now we can remove the btrfs_root
parameter.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Although skinny_metadata's type is int, its value just can be 0/1. And
if condition be true only when skinny_metadata equals 1, so in if's
executive part, set skinny_metadata to 1 is redundancy. Remove it.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Gu Jinxiang <gujx@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
@chunk_objectid of btrfs_make_block_group() function is always fixed to
BTRFS_FIRST_FREE_OBJECTID, so there is no need to pass it as parameter
explicitly.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
btrfs_reserve_extent() uses int @data to determine if we're allocating
data extent, while reuse the parameter later to pass it as profile
(data/meta/sys).
It's a little confusing, this patch will follow kernel parameter to use
bool @is_data to replace it.
And in btrfs_reserve_extent(), use dedicated u64 @profile.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Some parameter of trans is not used indeed.
Let's remove them.
Signed-off-by: Gu Jinxiang <gujx@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When passing directory larger than block device using --rootdir
parameter, we get the following backtrace:
------
extent-tree.c:2693: btrfs_reserve_extent: BUG_ON `ret` triggered, value -28
./mkfs.btrfs(+0x1a05d)[0x557939e6b05d]
./mkfs.btrfs(btrfs_reserve_extent+0xb5a)[0x557939e710c8]
./mkfs.btrfs(+0xb0b6)[0x557939e5c0b6]
./mkfs.btrfs(main+0x15d5)[0x557939e5de04]
/usr/lib/libc.so.6(__libc_start_main+0xea)[0x7f83b101af6a]
./mkfs.btrfs(_start+0x2a)[0x557939e5af5a]
------
Nothing special, just BUG_ON() abusing from ancient code.
Fix them by using correct return.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
free_block_group_cache() calls clear_extent_bits() with wrong end, which
is one byte larger than the correct range.
This will cause the next adjacent cache state to be split. And due to
the split, private pointer (which points to block group cache) will be
reset to NULL.
This is very hard to detect as this function only gets called in
cleanup_temp_chunks() which is just before mkfs finishes. This bug only
gets exposed when reworking --rootdir option.
Signed-off-by: Qu Wenruo <quwenruo.btrfs@gmx.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The warning can pop up frequently on a fuzzed image, the message seems
to be enough. Add a more fitting error code too.
Signed-off-by: David Sterba <dsterba@suse.com>
As btrfs_update_block_group fails when the block group is not found in
cache, we can exit btrfs_free_block_group, not much to rollback. The
caller will also exit in turn.
Signed-off-by: David Sterba <dsterba@suse.com>
Metadata blocks are always nodesize. When reading the
superblock::sys_array, the actual size of data is fixed to 4k and
smaller than nodesize, but otherwise everything works as before.
Signed-off-by: David Sterba <dsterba@suse.com>
If the found %ins is crossing a stripe len, ie. BTRFS_STRIPE_LEN, we'd
search again with a stripe-aligned %search_start. The current code
calculates %search_start by adding a wrong offset, in order to fix it, the
start position of the block group should be taken, otherwise, it'll end up
with looking at the same block group forever.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
4 functions are involved in this refactor: btrfs_make_block_group()
btrfs_make_block_groups(), btrfs_alloc_chunk, btrfs_alloc_data_chunk().
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Just to keep the 1st paramter the same as kernel.
We can also save a few lines since the parameter is shorter now.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When a 0 sized block group item is found, set_extent_bits() will not
really set any bits.
While set_state_private() still inserts allocated block group cache into
block group extent_io_tree.
So at close_ctree() time, we won't free the private block group cache
stored since we can't find any bit set for the 0 sized block group.
To fix it, at btrfs_read_block_groups() we skip any 0 sized block group,
so such leak won't happen.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>