btrfs-progs

Commit Graph

Author	SHA1	Message	Date
David Sterba	844caf8639	btrfs-progs: fix double free on error in read_raid56() Reported by 'gcc -fanalyzer': kernel-shared/extent_io.c: In function ‘read_raid56’: ./include/kerncompat.h:393:18: warning: dereference of NULL ‘pointers’ [CWE-476] [-Wanalyzer-null-dereference] After allocation of the pointers array fails it's dereferenced in the exit block. We can return immediately instead. Signed-off-by: David Sterba <dsterba@suse.com>	2024-04-18 19:16:15 +02:00
David Sterba	d739e3b73a	btrfs-progs: kernel-shared: use kmalloc and kfree All the code in kernel-shared should use the proper memory allocation helpers. Signed-off-by: David Sterba <dsterba@suse.com>	2023-11-03 18:04:37 +01:00
David Sterba	21aa6777b2	btrfs-progs: clean up includes, using include-what-you-use Signed-off-by: David Sterba <dsterba@suse.com>	2023-10-03 01:11:57 +02:00
Josef Bacik	cb269a492e	btrfs-progs: sync memcpy_extent_buffer from the kernel We use this in ctree.c in the kernel, so sync this helper into btrfs-progs to make sync'ing ctree.c easier. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>	2023-10-03 01:11:55 +02:00
Josef Bacik	f94ad0c516	btrfs-progs: pass btrfs_trans_handle through btrfs_clear_buffer_dirty This is the calling convention in the kernel because we track dirty blocks per transaction instead of globally in the fs_info. Simply mirror what we do in the kernel to make it easier to sync ctree.c locally. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>	2023-10-03 01:11:55 +02:00
David Sterba	22cf63d8ee	btrfs-progs: kernel-shared: add helper write_extent_buffer_chunk_tree_uuid Sync the helper write_extent_buffer_chunk_tree_uuid from kernel. Signed-off-by: David Sterba <dsterba@suse.com>	2023-06-27 23:40:56 +02:00
David Sterba	339de9b2d7	btrfs-progs: kernel-shared: use write_extent_buffer_fsid where possible We already have the helper but don't use it everywhere. Signed-off-by: David Sterba <dsterba@suse.com>	2023-06-27 16:29:58 +02:00
David Sterba	f094e35d63	btrfs-progs: kernel-shared: use copy_extent_buffer_full where possible We already have the helper for full extent buffer copy but don't use it everywhere. Signed-off-by: David Sterba <dsterba@suse.com>	2023-06-27 16:29:54 +02:00
Qu Wenruo	3ce08b2ff6	btrfs-progs: constify the buffer pointer for write functions The following functions accept a buffer for write, which can be marked as const: - btrfs_pwrite() - write_data_to_disk() Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2023-05-26 18:02:31 +02:00
Josef Bacik	f8da33abc5	btrfs-progs: add btrfs_readahead_node_child helper This exists in the kernel as a wrapper for readahead_tree_block, and is used extensively in ctree.c in the kernel. Sync this helper so that we can easily sync ctree.c Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>	2023-05-26 18:02:30 +02:00
Josef Bacik	5d84ee58e9	btrfs-progs: update arguments of find_extent_buffer In the kernel we only take a bytenr for this as the extent buffer cache is indexed on bytenr. Since we're passing in the btrfs_fs_info we can simply use the ->nodesize for the blocksize, and drop the blocksize argument completely. This brings us into parity with the kernel, which will allow the syncing of ctree.c. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>	2023-05-26 18:02:30 +02:00
Josef Bacik	f37cad074f	btrfs-progs: add a free_extent_buffer_stale helper This does exactly what free_extent_buffer_nocache does, but we call btrfs_free_extent_buffer_stale in the kernel code, so add this extra helper. Once the kernel code is synced we can get rid of the old helper. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>	2023-05-26 18:02:30 +02:00
Josef Bacik	656938b665	btrfs-progs: rename clear_extent_buffer_dirty to btrfs_clear_buffer_dirty This is a mirror of the change I've done in the kernel, but in progs it's even simpler because clean_tree_block was just a wrapper around clear_extent_buffer_dirty. Change this to btrfs_clear_buffer_dirty, and then update all the callers to use this helper instead of clean_tree_block and clear_extent_buffer_dirty. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>	2023-05-26 18:02:29 +02:00
Josef Bacik	216f442e5e	btrfs-progs: add some missing extent buffer helpers The following are some extent buffer helpers we have in the kernel but not in btrfs-progs. Sync these in to make syncing ctree.c easier. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>	2023-05-26 18:02:29 +02:00
Josef Bacik	c6b160c4e4	btrfs-progs: constify the extent buffer helpers These helpers all take const struct extent_buffer in the kernel; do the same in btrfs-progs in order to enable us to more easily sync ctree.c. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>	2023-05-26 18:02:29 +02:00
Josef Bacik	4a9a8f2a8a	btrfs-progs: sync extent-io-tree.[ch] and misc.h from the kernel This is a bit larger than the previous syncs, because we use extent_io_tree's everywhere. There's a lot of stuff added to kerncompat.h, and then I went through and cleaned up all the API changes, which were - extent_io_tree_init takes an fs_info and an owner now. - extent_io_tree_cleanup is now extent_io_tree_release. - set_extent_dirty takes a gfpmask. - clear_extent_dirty takes a cached_state. - find_first_extent_bit takes a cached_state. The diffstat looks insane for this, but keep in mind extent-io-tree.c and extent-io-tree.h are ~2000 loc just by themselves. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>	2023-05-26 18:02:29 +02:00
Josef Bacik	228aa34f10	btrfs-progs: sync messages.[ch] from the kernel These are the printk helpers from the kernel. There were a few modifications, the hi-lights are - We do not have fs_info::fs_state, so that needed to be removed. - We do not have discard.h sync'ed yet, so that dependency was dropped. - Anything related to struct super_block was commented out. - The transaction abort had to be modified to fit with the current btrfs-progs code. - Added a btrfs_no_printk() helper to common/messages.* so that the print statements still worked. - The 32bit limit checkers are not needed so are behind __KERNEL__ Additionally there were kerncompat.h changes that needed to be made to handle the dependencies properly. Those are easier to spot. Any function that needed to be modified has a MODIFIED tag in the comment section with a list of things that were changed. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>	2023-05-26 18:02:28 +02:00
Josef Bacik	e380421ff2	btrfs-progs: make write_extent_buffer take a const eb This is what we do in the kernel, and while we're syncing individual files we're going to have state where some callers are using a const, but progs isn't. So adjust write_extent_buffer to take a const eb in order to make this less painful. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-11-30 19:14:29 +01:00
Josef Bacik	fac1fae3ef	btrfs-progs: rename extent buffer flags to EXTENT_BUFFER_* We have been overloading the extent_state flags for use on the extent buffers as well. When we sync extent-io-tree.[ch] this will become impossible, so rename these flags to EXTENT_BUFFER_* and use those definitions instead of the extent_state definitions. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-11-28 18:57:44 +01:00
Josef Bacik	83cc5a5489	btrfs-progs: delete state_private code We used to store random private things into extent_states, but we haven't done this for a while and there are no users of this code, simply delete it. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-11-28 18:57:44 +01:00
Josef Bacik	20d88c17e7	btrfs-progs: move extent cache code directly into btrfs_fs_info We have some extra features in the btrfs-progs copy of the extent_io_tree that don't exist in the kernel. In order to make syncing easier simply move this functionality into btrfs_fs_info, that way we can sync in the new extent_io_tree code and not have to worry about breaking anything. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-11-28 18:57:44 +01:00
Josef Bacik	ccee633f3a	btrfs-progs: move dirty eb tracking to it's own io_tree btrfs-progs has a cache tree embedded in the extent_io_tree in order to track extent buffers. We use the extent_io_tree part to track dirty, and the cache tree to keep the extent buffers in. When we sync extent-io-tree.[ch] we'll lose this ability, so separate out the dirty tracking into its own extent_io_tree. Subsequent patches will adjust the extent buffer lookup so it doesn't use the custom extent_io_tree thing. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-11-28 18:57:43 +01:00
Josef Bacik	af30cf2e3e	btrfs-progs: make the find extent buffer helpers take fs_info This is a cleanup patch to make syncing the btrfs kernel code into btrfs-progs easier. In btrfs-progs we have an extra cache in the extent_io_tree that's exclusively used for the extent buffer tracking. In order to untangle this dependency start passing around the fs_info to search for extent_buffers, and then have the helpers use the appropriate structure to find the extent buffer. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-11-28 18:57:43 +01:00
Qu Wenruo	2aa4085bf7	btrfs-progs: properly handle degraded raid56 reads [BUG] For a degraded RAID5, btrfs check will fail to even read the chunk root: # mkfs.btrfs -f -m raid5 -d raid5 $dev1 $dev2 $dev3 # wipefs -fa $dev1 # btrfs check $dev2 Opening filesystem to check... warning, device 1 is missing bad tree block 22036480, bytenr mismatch, want=22036480, have=0 ERROR: cannot read chunk root ERROR: cannot open file system [CAUSE] Although read_tree_block() function from btrfs-progs is properly iterating the mirrors (mirror 1 is reading from the disk directly, mirror 2 will be rebuilt from parity), the raid56 recovery path is not handling the read error correctly. The existing code will try to read the full stripe, but any read failure (including missing device) will immediately cause an error: for (i = 0; i < num_stripes; i++) { ret = btrfs_pread(multi->stripes[i].dev->fd, pointers[i], BTRFS_STRIPE_LEN, multi->stripes[i].physical, fs_info->zoned); if (ret < BTRFS_STRIPE_LEN) { ret = -EIO; goto out; } } [FIX] To make failed_a/failed_b calculation much easier, and properly handle too many missing devices, here this patch will introduce a new bitmap based solution. The new @failed_stripe_bitmap will represent all the failed stripes. So the initial read will mark all the missing devices in the @failed_stripe_bitmap, and later operations will all operate on that bitmap. Only before we call raid56_recov(), we convert the bitmap to the old failed_a/failed_b interface and continue. Now btrfs check can handle above case properly: # btrfs check $dev2 Opening filesystem to check... warning, device 1 is missing Checking filesystem on /dev/test/scratch2 UUID: 8b2e1cb4-f35b-4856-9b11-262d39d8458b [1/7] checking root items [2/7] checking extents [3/7] checking free space tree [4/7] checking fs roots [5/7] checking only csums items (without verifying data) [6/7] checking root refs [7/7] checking quota groups skipped (not enabled on this FS) found 147456 bytes used, no error found total csum bytes: 0 total tree bytes: 147456 total fs tree bytes: 32768 total extent tree bytes: 16384 btree space waste bytes: 139871 file data blocks allocated: 0 referenced 0 Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-11-24 17:29:12 +01:00
David Sterba	c2be0e2ce0	btrfs-progs: use template for out of memory error messages Signed-off-by: David Sterba <dsterba@suse.com>	2022-10-11 09:08:09 +02:00
Qu Wenruo	08bb354a1c	btrfs-progs: properly handle write error when writing back tree blocks [BUG] If we emulate a write error during commit transaction, by setting the block device read-only, then we can easily have the following crash using "btrfs check --clear-space-cache v2": Opening filesystem to check... Checking filesystem on /dev/test/scratch1 UUID: 5945915b-37f1-4bfa-9f64-684b318b8f73 Clear free space cache v2 Error writing to device 1 kernel-shared/transaction.c:156: __commit_transaction: BUG_ON `ret` triggered, value 1 ./btrfs(+0x570c9)[0x562ec894f0c9] ./btrfs(+0x57167)[0x562ec894f167] ./btrfs(__commit_transaction+0x13b)[0x562ec894f7f2] ./btrfs(btrfs_commit_transaction+0x214)[0x562ec894fa64] ./btrfs(btrfs_clear_free_space_tree+0x177)[0x562ec8941ae6] ./btrfs(+0xc8958)[0x562ec89c0958] ./btrfs(+0xc9d53)[0x562ec89c1d53] ./btrfs(+0x17ec7)[0x562ec890fec7] ./btrfs(main+0x12f)[0x562ec8910908] /usr/lib/libc.so.6(+0x232d0)[0x7ff917ee82d0] /usr/lib/libc.so.6(__libc_start_main+0x8a)[0x7ff917ee838a] ./btrfs(_start+0x25)[0x562ec890fdc5] Aborted (core dumped) [CAUSE] The call trace has shown it's a BUG_ON(), and it's from __commit_transaction(), which is writing tree blocks back. [FIX] The fix is pretty simple, just return error. In fact we even have an error value check in btrfs_commit_transaction() just after __commit_transaction() call (although not catching the return value from it). And since we're here, also call btrfs_abort_transaction() to prevent newer transactions from being started. Now we won't have a full crash: Opening filesystem to check... Checking filesystem on /dev/test/scratch1 UUID: 5945915b-37f1-4bfa-9f64-684b318b8f73 Clear free space cache v2 Error writing to device 1 ERROR: failed to write bytenr 30425088 length 16384: Operation not permitted ERROR: failed to write tree block 30425088: Operation not permitted ERROR: failed to clear free space cache v2: -1 extent buffer leak: start 30720000 len 16384 Reported-by: Christoph Anton Mitterer <calestyo@scientia.org> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-10-11 09:08:08 +02:00
Qu Wenruo	75800c2fee	btrfs-progs: remove duplicated leaked extent buffer report [BUG] When a transaction is aborted halfway, we can have extent buffers leaked, and in that case, the same leaked extent buffer can be reported multiple times: ERROR: failed to clear free space cache v2: -1 extent buffer leak: start 30441472 len 16384 WARNING: dirty eb leak (aborted trans): start 30441472 len 16384 extent buffer leak: start 30720000 len 16384 extent buffer leak: start 30425088 len 16384 extent buffer leak: start 30425088 len 16384 << Duplicated WARNING: dirty eb leak (aborted trans): start 30425088 len 16384 Note that the 30425088 line is reported twice (not counting the "dirty eb leak" line). [CAUSE] When we detect a leaked eb, we call free_extent_buffer_nocache(), but free_extent_buffer_nocache() can only remove the eb when its reduced refs is 0. If the eb has refs 2, it will need two free_extent_buffer_nocache() calls to remove it from the cache. [FIX] Just reset the eb->refs to 1 so that free_extent_buffer_nocache() can remove it from cache for sure. Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-10-11 09:08:08 +02:00
Qu Wenruo	811ae819e3	btrfs-progs: remove unused function extent_io_tree_init_cache_max() The function was introduced by commit `a5ce5d2198` ("btrfs-progs: extent-cache: actually cache extent buffers") but never got utilized. Thus we can just remove it. Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-10-11 09:08:08 +02:00
Qu Wenruo	2060120201	btrfs-progs: fix a BUG_ON() condition for write_data_to_disk() The BUG_ON() condition in write_data_to_disk() is no longer correct. Now write_raid56_with_parity() will return the number of bytes written for the last stripe. Thus a successful writeback can trigger the BUG_ON(ret). Fix the condition to (ret < 0). Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-08-16 15:18:12 +02:00
Qu Wenruo	fc6925bfd3	btrfs-progs: avoid repeated data write for metadata [BUG] Shinichiro reported that "mkfs.btrfs -m DUP" is doing repeated writes to the device. For a non-zoned device this is not a big deal, but for a zoned device this is critical, as a zoned device doesn't support overwrite at all. [CAUSE] The problem is related to the write_and_map_eb() call: since commit `2a93728391` ("btrfs-progs: use write_data_to_disk() to replace write_extent_to_disk()"), we call write_data_to_disk() for metadata write back. But the problem is, write_data_to_disk() will call btrfs_map_block() with rw = WRITE. By that btrfs_map_block() will always return all stripes, while in write_data_to_disk() we also iterate through each mirror of the range. This results in the repeated writeback above. [FIX] Fix this problem by completely removing the @mirror argument from write_data_to_disk(), with extra comments to explicitly show that the function will write to all mirrors. Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> Fixes: `2a93728391` ("btrfs-progs: use write_data_to_disk() to replace write_extent_to_disk()") Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-08-16 15:18:12 +02:00
Qu Wenruo	4e9e978783	btrfs-progs: allow read_data_from_disk() to rebuild RAID56 using P/Q This new ability is added by: - Allow btrfs_map_block() to return the chunk type This makes later work much easier - Only reset stripe offset inside btrfs_map_block() when needed Currently if @raid_map is not NULL, btrfs_map_block() will consider this call is for WRITE and will reset stripe offset. This is no longer the case, as for RAID56 read with mirror_num 1/0, we will still call btrfs_map_block() with non-NULL raid_map. Add a small check to make sure we won't reset stripe offset for mirror 1/0 read. - Add new helper read_raid56() to handle rebuild We will read the full stripe (including all data and P/Q stripes), do the rebuild, then only copy the referenced part to the caller. There is a catch for RAID6, we have no way to exhaust all combinations, so the current repair will assume the mirror = 0 data is corrupted, then try to find a missing device. But if no missing device can be found, it will assume P is corrupted. This is just a guess, and can be totally wrong, but we have no better idea. Now btrfs-progs has full read ability for RAID56. Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-04-25 19:08:30 +02:00
Qu Wenruo	a99bece1cd	btrfs-progs: remove extent_buffer::fd and extent_buffer::dev_bytes Those two members are a shortcut for non-RAID56 profiles. But we should not use such shortcut, and move all our logical address read/write to the unified read_data_from_disk()/write_data_to_disk(). With previous refactors, now we're safe to remove them. Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-04-25 19:08:30 +02:00
Qu Wenruo	3ff9d35257	btrfs-progs: use read_data_from_disk() to replace read_extent_from_disk() and replace read_extent_data() The function read_extent_from_disk() is only a wrapper to read tree block. And read_extent_data() is just a while loop to eliminate short read caused by stripe boundary. In fact, a lot of call sites of read_extent_data() are either reading metadata (thus no possible short read) or doing extra loop by themselves. This patch will replace those two functions with read_data_from_disk(), making it the only entrance for data/metadata read. And update read_data_from_disk() to return the read bytes, so caller can do a simple while loop. For the few callers of read_extent_data(), open-code a small while loop for them. This will make later RAID56 read repair using P/Q much easier. Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-04-25 19:08:30 +02:00
Qu Wenruo	2a93728391	btrfs-progs: use write_data_to_disk() to replace write_extent_to_disk() Function write_extent_to_disk() is just writing the content of a tree block to disk. It can not handle RAID56, and its work is the same as write_data_to_disk(). Thus we can replace write_extent_to_disk() with write_data_to_disk() easily. There is only one special call site in write_raid56_with_parity(), which can easily be replaced with btrfs_pwrite() directly. This reduces the write entrances, and makes later eb::fd removal easier. Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-04-25 19:08:29 +02:00
Naohiro Aota	ae0dfb246d	btrfs-progs: introduce btrfs_pread wrapper for pread Wrap pread with btrfs_pread as well. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-10-20 18:59:23 +02:00
Naohiro Aota	c821e5545f	btrfs-progs: introduce btrfs_pwrite wrapper for pwrite Wrap pwrite with btrfs_pwrite(). It simply calls pwrite() on non-zoned btrfs (opened without O_DIRECT). On zoned mode (opened with O_DIRECT), it allocates an aligned bounce buffer, copies the contents and uses it for direct-IO writing. Writes in device_zero_blocks() and btrfs_wipe_existing_sb() are a little tricky. We don't have fs_info on our hands, so use zinfo to determine whether it is a zoned device. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-10-20 18:59:23 +02:00
David Sterba	c3ee6a8a09	btrfs-progs: unify GPL header comments Add the GPL v2 header to files where it was missing and is not from an external source, update to the most recent version with the address. Signed-off-by: David Sterba <dsterba@suse.com>	2021-09-07 13:58:44 +02:00
David Sterba	b1f374dd1d	btrfs-progs: switch %Lu to %llu format The %Lu format is not standard and we use %llu everywhere else, so switch the remaining cases. Signed-off-by: David Sterba <dsterba@suse.com>	2021-06-19 22:07:49 +02:00
David Sterba	0144bcb713	btrfs-progs: move volumes.c to kernel-shared/ Signed-off-by: David Sterba <dsterba@suse.com>	2020-08-31 17:01:06 +02:00
David Sterba	abb670f883	btrfs-progs: move ctree.c to kernel-shared/ Signed-off-by: David Sterba <dsterba@suse.com>	2020-08-31 17:01:05 +02:00
David Sterba	4e49bd703d	btrfs-progs: move extent_io.c to kernel-shared/ Signed-off-by: David Sterba <dsterba@suse.com>	2020-08-31 17:01:04 +02:00
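
Commit 2aa4085bf7 above describes tracking failed stripes in a bitmap and converting it to the older failed_a/failed_b pair only right before recovery. The following is a minimal sketch of that conversion idea, not the btrfs-progs code: the function name failed_bitmap_to_pair, its parameters, and the max_tolerated limit (1 for RAID5, 2 for RAID6) are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

/*
 * Hypothetical helper (not from btrfs-progs): scan a bitmap of failed
 * stripes and fill in the first two failed indexes, -1 meaning "none".
 * Returns the number of failed stripes, or -EIO if more stripes failed
 * than the profile can tolerate.
 */
int failed_bitmap_to_pair(unsigned long bitmap, int num_stripes,
                          int max_tolerated, int *failed_a, int *failed_b)
{
    int found = 0;

    /* The sketch only supports as many stripes as bits in the bitmap. */
    assert(num_stripes >= 0 &&
           (unsigned)num_stripes <= sizeof(bitmap) * CHAR_BIT);

    *failed_a = -1;
    *failed_b = -1;

    for (int i = 0; i < num_stripes; i++) {
        if (!(bitmap & (1UL << i)))
            continue;
        if (found == 0)
            *failed_a = i;
        else if (found == 1)
            *failed_b = i;
        found++;
    }

    /* Too many missing or unreadable stripes: rebuild is impossible. */
    if (found > max_tolerated)
        return -EIO;
    return found;
}
```

With a three-stripe RAID5 layout and only stripe 0 unreadable, failed_bitmap_to_pair(0x1, 3, 1, &a, &b) would leave a at 0 and b at -1. Two failures under the same limit return -EIO, which corresponds to the "too many missing devices" case the commit message mentions.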

41 Commits
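
Commit 3ff9d35257 above notes that once the read helper returns the number of bytes read, callers can handle short reads at stripe boundaries with a simple while loop. A rough sketch of such a loop follows; it does not use the real read_data_from_disk() signature. The read_fn type, read_full(), struct mem_source, and stripe_bounded_reader() are all hypothetical names, and the in-memory reader only imitates a read that never crosses a stripe boundary.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical reader: returns bytes read (may be short) or -errno. */
typedef ssize_t (*read_fn)(void *ctx, char *buf,
                           unsigned long long logical, size_t len);

/* Keep calling the reader until len bytes have arrived or it fails. */
int read_full(read_fn reader, void *ctx, char *buf,
              unsigned long long logical, size_t len)
{
    size_t done = 0;

    while (done < len) {
        ssize_t ret = reader(ctx, buf + done, logical + done, len - done);

        if (ret < 0)
            return (int)ret;    /* propagate the reader's error */
        if (ret == 0)
            return -EIO;        /* no progress: avoid looping forever */
        done += (size_t)ret;
    }
    return 0;
}

/* Demo backing store: a flat buffer split into fixed-size "stripes". */
struct mem_source {
    const char *data;
    size_t size;
    size_t stripe_len;
};

/* Demo reader that stops at each stripe boundary, forcing short reads. */
ssize_t stripe_bounded_reader(void *ctx, char *buf,
                              unsigned long long logical, size_t len)
{
    const struct mem_source *src = ctx;
    size_t to_boundary, avail, n = len;

    if (logical >= src->size)
        return 0;
    to_boundary = src->stripe_len - (size_t)(logical % src->stripe_len);
    avail = src->size - (size_t)logical;
    if (n > to_boundary)
        n = to_boundary;
    if (n > avail)
        n = avail;
    memcpy(buf, src->data + logical, n);
    return (ssize_t)n;
}
```

Treating a zero-byte return as -EIO is a design choice of this sketch, so that a reader which makes no progress cannot spin the loop indefinitely.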