This new ability is added by:
- Allow btrfs_map_block() to return the chunk type
This makes later work much easier
- Only reset stripe offset inside btrfs_map_block() when needed
Currently if @raid_map is not NULL, btrfs_map_block() will consider
this call is for WRITE and will reset stripe offset.
This is no longer the case, as for RAID56 read with mirror_num 1/0,
we will still call btrfs_map_block() with non-NULL raid_map.
Add a small check to make sure we won't reset stripe offset for
mirror 1/0 read.
- Add new helper read_raid56() to handle rebuild
We will read the full stripe (including all data and P/Q stripes)
do the rebuild, then only copy the refered part to the caller.
There is a catch for RAID6, we have no way to exhaust all combination,
so the current repair will assume the mirror = 0 data is corrupted,
then try to find a missing device.
But if no missing device can be found, it will assume P is corrupted.
This is just a guess, and can to totally wrong, but we have no better
idea.
Now btrfs-progs have full read ability for RAID56.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Those two members are a shortcut for non-RAID56 profiles.
But we should not use such shortcut, and move all our logical address
read/write to the unified read_data_from_disk()/write_data_to_disk().
With previous refactors, now we're safe to remove them.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The function read_extent_from_disk() is only a wrapper to read tree
block.
And read_extent_data() is just a while loop to eliminate short read
caused by stripe boundary.
In fact, a lot of call sites of read_extent_data() are either reading
metadata (thus no possible short read) or doing extra loop by
themselves.
This patch will replace those two functions with read_data_from_disk(),
making it the only entrance for data/metadata read.
And update read_data_from_disk() to return the read bytes, so caller can
do a simple while loop.
For the few callers of read_extent_data(), open-code a small while loop
for them.
This will allow later RAID56 read repair using P/Q much easier.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Function write_extent_to_disk() is just writing the content of a tree
block to disk.
It can not handle RAID56, and its work is the same as
write_data_to_disk().
Thus we can replace write_extent_to_disk() with write_data_to_disk()
easily.
There is only one special call site in write_raid56_with_parity(), which
can easily be replace with btrfs_pwrite() directly.
This reduce the write entrance, and make later eb::fd removal easier.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Wrap pread with btrfs_pread as well.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Wrap pwrite with btrfs_pwrite(). It simply calls pwrite() on non-zoned
btrfs (opened without O_DIRECT). On zoned mode (opened with O_DIRECT),
it allocates an aligned bounce buffer, copies the contents and uses it
for direct-IO writing.
Writes in device_zero_blocks() and btrfs_wipe_existing_sb() are a little
tricky. We don't have fs_info on our hands, so use zinfo to determine it
is a zoned device or not.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Add the GPL v2 header to files where it was missing and is not from an
external source, update to the most recent version with the address.
Signed-off-by: David Sterba <dsterba@suse.com>