btrfs-progs

mirror of https://github.com/kdave/btrfs-progs synced 2025-04-28 14:08:00 +00:00

Author	SHA1	Message	Date
David Sterba	c2be0e2ce0	btrfs-progs: use template for out of memory error messages Signed-off-by: David Sterba <dsterba@suse.com>	2022-10-11 09:08:09 +02:00
Qu Wenruo	08bb354a1c	btrfs-progs: properly handle write error when writing back tree blocks [BUG] If we emulate a write error during commit transaction, by setting the block device read-only, then we can easily have the following crash using "btrfs check --clear-space-cache v2": Opening filesystem to check... Checking filesystem on /dev/test/scratch1 UUID: 5945915b-37f1-4bfa-9f64-684b318b8f73 Clear free space cache v2 Error writing to device 1 kernel-shared/transaction.c:156: __commit_transaction: BUG_ON `ret` triggered, value 1 ./btrfs(+0x570c9)[0x562ec894f0c9] ./btrfs(+0x57167)[0x562ec894f167] ./btrfs(__commit_transaction+0x13b)[0x562ec894f7f2] ./btrfs(btrfs_commit_transaction+0x214)[0x562ec894fa64] ./btrfs(btrfs_clear_free_space_tree+0x177)[0x562ec8941ae6] ./btrfs(+0xc8958)[0x562ec89c0958] ./btrfs(+0xc9d53)[0x562ec89c1d53] ./btrfs(+0x17ec7)[0x562ec890fec7] ./btrfs(main+0x12f)[0x562ec8910908] /usr/lib/libc.so.6(+0x232d0)[0x7ff917ee82d0] /usr/lib/libc.so.6(__libc_start_main+0x8a)[0x7ff917ee838a] ./btrfs(_start+0x25)[0x562ec890fdc5] Aborted (core dumped) [CAUSE] The call trace has shown it's a BUG_ON(), and it's from __commit_transaction(), which is writing tree blocks back. [FIX] The fix is pretty simple, just return error. In fact we even have an error value check in btrfs_commit_transaction() just after __commit_transaction() call (although not catching the return value from it). And since we're here, also call btrfs_abort_transaction() to prevent newer transactions from being started. Now we won't have a full crash: Opening filesystem to check... Checking filesystem on /dev/test/scratch1 UUID: 5945915b-37f1-4bfa-9f64-684b318b8f73 Clear free space cache v2 Error writing to device 1 ERROR: failed to write bytenr 30425088 length 16384: Operation not permitted ERROR: failed to write tree block 30425088: Operation not permitted ERROR: failed to clear free space cache v2: -1 extent buffer leak: start 30720000 len 16384 Reported-by: Christoph Anton Mitterer <calestyo@scientia.org> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-10-11 09:08:08 +02:00
Qu Wenruo	75800c2fee	btrfs-progs: remove duplicated leaked extent buffer report [BUG] When transaction is aborted halfway, we can have extent buffer leaked, and in that case, the same leaked extent buffer can be reported for multiple times: ERROR: failed to clear free space cache v2: -1 extent buffer leak: start 30441472 len 16384 WARNING: dirty eb leak (aborted trans): start 30441472 len 16384 extent buffer leak: start 30720000 len 16384 extent buffer leak: start 30425088 len 16384 extent buffer leak: start 30425088 len 16384 << Duplicated WARNING: dirty eb leak (aborted trans): start 30425088 len 16384 Note that 30425088 line is reported twice (not accounting the "dirty eb leak" line). [CAUSE] When we detected a leaked eb, we call free_extent_buffer_nocache(), but free_extent_buffer_nocache() can only remove the eb when its reduced refs is 0. If the eb has refs 2, it will need two free_extent_buffer_nocache() calls to remove it from the cache. [FIX] Just reset the eb->refs to 1 so that free_extent_buffer_nocache() can remove it from cache for sure. Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-10-11 09:08:08 +02:00
Qu Wenruo	811ae819e3	btrfs-progs: remove unused function extent_io_tree_init_cache_max() The function was introduced by commit `a5ce5d2198` ("btrfs-progs: extent-cache: actually cache extent buffers") but never got utilized. Thus we can just remove it. Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-10-11 09:08:08 +02:00
Qu Wenruo	2060120201	btrfs-progs: fix a BUG_ON() condition for write_data_to_disk() The BUG_ON() condition in write_data_to_disk() is no longer correct. Now write_raid56_with_parity() will return the bytes written of last stripe. Thus a success writeback can trigger the BUG_ON(ret). Fix the condition to (ret < 0). Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-08-16 15:18:12 +02:00
Qu Wenruo	fc6925bfd3	btrfs-progs: avoid repeated data write for metadata [BUG] Shinichiro reported that "mkfs.btrfs -m DUP" is doing repeated write into the device. For non-zoned device this is not a big deal, but for zoned device this is critical, as zoned device doesn't support overwrite at all. [CAUSE] The problem is related to write_and_map_eb() call, since commit `2a93728391` ("btrfs-progs: use write_data_to_disk() to replace write_extent_to_disk()"), we call write_data_to_disk() for metadata write back. But the problem is, write_data_to_disk() will call btrfs_map_block() with rw = WRITE. By that btrfs_map_block() will always return all stripes, while in write_data_to_disk() we also iterate through each mirror of the range. This results above repeated writeback. [FIX] Fix this problem by completely remove @mirror argument from write_data_to_disk(). With extra comments to explicitly show that function will write to all mirrors. Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> Fixes: `2a93728391` ("btrfs-progs: use write_data_to_disk() to replace write_extent_to_disk()") Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-08-16 15:18:12 +02:00
Qu Wenruo	4e9e978783	btrfs-progs: allow read_data_from_disk() to rebuild RAID56 using P/Q This new ability is added by: - Allow btrfs_map_block() to return the chunk type This makes later work much easier - Only reset stripe offset inside btrfs_map_block() when needed Currently if @raid_map is not NULL, btrfs_map_block() will consider this call is for WRITE and will reset stripe offset. This is no longer the case, as for RAID56 read with mirror_num 1/0, we will still call btrfs_map_block() with non-NULL raid_map. Add a small check to make sure we won't reset stripe offset for mirror 1/0 read. - Add new helper read_raid56() to handle rebuild We will read the full stripe (including all data and P/Q stripes) do the rebuild, then only copy the refered part to the caller. There is a catch for RAID6, we have no way to exhaust all combination, so the current repair will assume the mirror = 0 data is corrupted, then try to find a missing device. But if no missing device can be found, it will assume P is corrupted. This is just a guess, and can to totally wrong, but we have no better idea. Now btrfs-progs have full read ability for RAID56. Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-04-25 19:08:30 +02:00
Qu Wenruo	a99bece1cd	btrfs-progs: remove extent_buffer::fd and extent_buffer::dev_bytes Those two members are a shortcut for non-RAID56 profiles. But we should not use such shortcut, and move all our logical address read/write to the unified read_data_from_disk()/write_data_to_disk(). With previous refactors, now we're safe to remove them. Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-04-25 19:08:30 +02:00
Qu Wenruo	3ff9d35257	btrfs-progs: use read_data_from_disk() to replace read_extent_from_disk() and replace read_extent_data() The function read_extent_from_disk() is only a wrapper to read tree block. And read_extent_data() is just a while loop to eliminate short read caused by stripe boundary. In fact, a lot of call sites of read_extent_data() are either reading metadata (thus no possible short read) or doing extra loop by themselves. This patch will replace those two functions with read_data_from_disk(), making it the only entrance for data/metadata read. And update read_data_from_disk() to return the read bytes, so caller can do a simple while loop. For the few callers of read_extent_data(), open-code a small while loop for them. This will allow later RAID56 read repair using P/Q much easier. Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-04-25 19:08:30 +02:00
Qu Wenruo	2a93728391	btrfs-progs: use write_data_to_disk() to replace write_extent_to_disk() Function write_extent_to_disk() is just writing the content of a tree block to disk. It can not handle RAID56, and its work is the same as write_data_to_disk(). Thus we can replace write_extent_to_disk() with write_data_to_disk() easily. There is only one special call site in write_raid56_with_parity(), which can easily be replace with btrfs_pwrite() directly. This reduce the write entrance, and make later eb::fd removal easier. Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-04-25 19:08:29 +02:00
Naohiro Aota	ae0dfb246d	btrfs-progs: introduce btrfs_pread wrapper for pread Wrap pread with btrfs_pread as well. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-10-20 18:59:23 +02:00
Naohiro Aota	c821e5545f	btrfs-progs: introduce btrfs_pwrite wrapper for pwrite Wrap pwrite with btrfs_pwrite(). It simply calls pwrite() on non-zoned btrfs (opened without O_DIRECT). On zoned mode (opened with O_DIRECT), it allocates an aligned bounce buffer, copies the contents and uses it for direct-IO writing. Writes in device_zero_blocks() and btrfs_wipe_existing_sb() are a little tricky. We don't have fs_info on our hands, so use zinfo to determine it is a zoned device or not. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-10-20 18:59:23 +02:00
David Sterba	c3ee6a8a09	btrfs-progs: unify GPL header comments Add the GPL v2 header to files where it was missing and is not from an external source, update to the most recent version with the address. Signed-off-by: David Sterba <dsterba@suse.com>	2021-09-07 13:58:44 +02:00
David Sterba	b1f374dd1d	btrfs-progs: switch %Lu to %llu format The %Lu format is not standard and we use %llu everywhere else, so switch the remaining cases. Signed-off-by: David Sterba <dsterba@suse.com>	2021-06-19 22:07:49 +02:00
David Sterba	0144bcb713	btrfs-progs: move volumes.c to kernel-shared/ Signed-off-by: David Sterba <dsterba@suse.com>	2020-08-31 17:01:06 +02:00
David Sterba	abb670f883	btrfs-progs: move ctree.c to kernel-shared/ Signed-off-by: David Sterba <dsterba@suse.com>	2020-08-31 17:01:05 +02:00
David Sterba	4e49bd703d	btrfs-progs: move extent_io.c to kernel-shared/ Signed-off-by: David Sterba <dsterba@suse.com>	2020-08-31 17:01:04 +02:00

17 Commits