We use this in ctree.c in the kernel, so sync this helper into
btrfs-progs to make sync'ing ctree.c easier.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This is the calling convention in the kernel because we track dirty
blocks per transaction instead of globally in the fs_info. Simply
mirror what we do in the kernel to make it easier to sync ctree.c
locally.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The following functions accept a buffer for write, which can be marked
as const:
- btrfs_pwrite()
- write_data_to_disk()
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This exists in the kernel as a wrapper for readahead_tree_block, and is
used extensively in ctree.c in the kernel. Sync this helper so that we
can easily sync ctree.c
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
In the kernel we only take a bytenr for this as the extent buffer cache
is indexed on bytenr. Since we're passing in the btrfs_fs_info we can
simply use the ->nodesize for the blocksize, and drop the blocksize
argument completely. This brings us into parity with the kernel, which
will allow the syncing of ctree.c.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This does exactly what free_extent_buffer_nocache does, but we call
btrfs_free_extent_buffer_stale in the kernel code, so add this extra
helper. Once the kernel code is synced we can get rid of the old
helper.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This is a mirror of the change I've done in the kernel, but in progs
it's even more simply because clean_tree_block was just a wrapper around
clear_extent_buffer_dirty. Change this to btrfs_clear_buffer_dirty, and
then update all the callers to use this helper instead of
clean_tree_block and clear_extent_buffer_dirty.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The following are some extent buffer helpers we have in the kernel but
not in btrfs-progs. Sync these in to make syncing ctree.c easier.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
These helpers are all take const struct extent_buffer in the kernel, do
the same in btrfs-progs in order to enable us to more easily sync
ctree.c.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This is a bit larger than the previous syncs, because we use
extent_io_tree's everywhere. There's a lot of stuff added to
kerncompat.h, and then I went through and cleaned up all the API
changes, which were
- extent_io_tree_init takes an fs_info and an owner now.
- extent_io_tree_cleanup is now extent_io_tree_release.
- set_extent_dirty takes a gfpmask.
- clear_extent_dirty takes a cached_state.
- find_first_extent_bit takes a cached_state.
The diffstat looks insane for this, but keep in mind extent-io-tree.c
and extent-io-tree.h are ~2000 loc just by themselves.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This is what we do in the kernel, and while we're syncing individual
files we're going to have state where some callers are using a const,
but progs isn't. So adjust write_extent_buffer to take a const eb in
order to make this less painful.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We have been overloading the extent_state flags for use on the extent
buffers as well. When we sync extent-io-tree.[ch] this will become
impossible, so rename these flags to EXTENT_BUFFER_* and use those
definitions instead of the extent_state definitions.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We used to store random private things into extent_states, but we
haven't done this for a while and there are no users of this code,
simply delete it.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We have some extra features in the btrfs-progs copy of the
extent_io_tree that don't exist in the kernel. In order to make syncing
easier simply move this functionality into btrfs_fs_info, that way we
can sync in the new extent_io_tree code and not have to worry about
breaking anything.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This is a cleanup patch to make syncing the btrfs kernel code into
btrfs-progs easier. In btrfs-progs we have an extra cache in the
extent_io_tree that's exclusively used for the extent buffer tracking.
In order to untangle this dependency start passing around the fs_info to
search for extent_buffers, and then have the helpers use the appropriate
structure to find the extent buffer.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The function was introduced by commit a5ce5d2198 ("btrfs-progs:
extent-cache: actually cache extent buffers") but never got utilized.
Thus we can just remove it.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
Shinichiro reported that "mkfs.btrfs -m DUP" is doing repeated write
into the device.
For non-zoned device this is not a big deal, but for zoned device this
is critical, as zoned device doesn't support overwrite at all.
[CAUSE]
The problem is related to write_and_map_eb() call, since commit
2a93728391 ("btrfs-progs: use write_data_to_disk() to replace
write_extent_to_disk()"), we call write_data_to_disk() for metadata
write back.
But the problem is, write_data_to_disk() will call btrfs_map_block()
with rw = WRITE.
By that btrfs_map_block() will always return all stripes, while in
write_data_to_disk() we also iterate through each mirror of the range.
This results above repeated writeback.
[FIX]
Fix this problem by completely remove @mirror argument
from write_data_to_disk().
With extra comments to explicitly show that function will write to
all mirrors.
Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Fixes: 2a93728391 ("btrfs-progs: use write_data_to_disk() to replace write_extent_to_disk()")
Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Headers that are only exported and not used for build do not need the
BTRFS_FLAT_INCLUDES switch (between local and installed headers). Now
that there are local copies of the shared headers drop the respective
part from local headers.
Signed-off-by: David Sterba <dsterba@suse.com>
Those two members are a shortcut for non-RAID56 profiles.
But we should not use such shortcut, and move all our logical address
read/write to the unified read_data_from_disk()/write_data_to_disk().
With previous refactors, now we're safe to remove them.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The function read_extent_from_disk() is only a wrapper to read tree
block.
And read_extent_data() is just a while loop to eliminate short read
caused by stripe boundary.
In fact, a lot of call sites of read_extent_data() are either reading
metadata (thus no possible short read) or doing extra loop by
themselves.
This patch will replace those two functions with read_data_from_disk(),
making it the only entrance for data/metadata read.
And update read_data_from_disk() to return the read bytes, so caller can
do a simple while loop.
For the few callers of read_extent_data(), open-code a small while loop
for them.
This will allow later RAID56 read repair using P/Q much easier.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Function write_extent_to_disk() is just writing the content of a tree
block to disk.
It can not handle RAID56, and its work is the same as
write_data_to_disk().
Thus we can replace write_extent_to_disk() with write_data_to_disk()
easily.
There is only one special call site in write_raid56_with_parity(), which
can easily be replace with btrfs_pwrite() directly.
This reduce the write entrance, and make later eb::fd removal easier.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>