btrfs-progs

mirror of https://github.com/kdave/btrfs-progs synced 2025-04-11 03:31:17 +00:00

Author	SHA1	Message	Date
Qu Wenruo	eacdd1606c	btrfs-progs: print-tree: fix chunk/block group flags output [BUG] Commit ("btrfs-progs: use raid table for profile names in print-tree.c") introduced one bug in block group and chunk flags output and changed the behavior: item 1 key (FIRST_CHUNK_TREE CHUNK_ITEM 13631488) itemoff 16105 itemsize 80 length 8388608 owner 2 stripe_len 65536 type SINGLE ... item 2 key (FIRST_CHUNK_TREE CHUNK_ITEM 22020096) itemoff 15993 itemsize 112 length 8388608 owner 2 stripe_len 65536 type DUP ... item 3 key (FIRST_CHUNK_TREE CHUNK_ITEM 30408704) itemoff 15881 itemsize 112 length 268435456 owner 2 stripe_len 65536 type DUP ... Note that the flag string only contains the profile (SINGLE/DUP/etc...), not the type (DATA/METADATA/SYSTEM). And we have a new "SINGLE" string, even though that profile has no extra bit to indicate it. [CAUSE] The "SINGLE" part is caused by the raid array, which has a name for the SINGLE profile even though it doesn't have a corresponding bit. The missing type string is caused by a code bug: strcpy(buf, name); while (*tmp) { *tmp = toupper(*tmp); tmp++; } strcpy(ret, buf); The last strcpy() call overwrites the existing string in @ret. [FIX] - Enhance string handling using strn()/snprintf() - Add extra "UNKNOWN.0x%llx" output for unknown profiles - Call proper strncat() to merge type and profile - Add extra handling for "SINGLE" to keep the old output Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-10-20 18:59:24 +02:00
Qu Wenruo	f4c712e024	btrfs-progs: rename data parameter to profile in extent allocation path In function btrfs_reserve_extent(), we call find_free_extent() passing "u64 profile" into "int data". This is definitely a width reduction, but when looking further into the code, it's more serious than that: the "int data" parameter does not really indicate whether it's a data extent, but is actually a block group profile (with block group type). This is not only a width reduction, but also confusing. Thankfully, so far we don't have any BLOCK_GROUP bits beyond 32 bits, so the width reduction is not causing a big problem. This patch renames the "int data" parameter to a more proper one, "u64 profile", in all involved call paths. Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-10-20 18:59:24 +02:00
David Sterba	bc28dc6bea	btrfs-progs: introduce helper for striped profiles There are several profiles like raid0, raid10, raid5 and raid6 that can span as many devices as possible and need special handling for the stripe calculations. Provide a helper to identify the profiles in a simple way. Signed-off-by: David Sterba <dsterba@suse.com>	2021-10-20 18:59:24 +02:00
David Sterba	a25a5cc2c0	btrfs-progs: use btrfs_bg_type_to_nparity in btrfs_stripe_length Signed-off-by: David Sterba <dsterba@suse.com>	2021-10-20 18:59:24 +02:00
David Sterba	11fcdbc35e	btrfs-progs: introduce helper to get allowed profiles for a given device number Use the raid table helper to avoid hard coding profiles for the given number of devices in test_num_disk_vs_raid. Signed-off-by: David Sterba <dsterba@suse.com>	2021-10-20 18:59:24 +02:00
David Sterba	cb0b63cd90	btrfs-progs: use raid table value for sub_stripes in btrfs_check_chunk_valid Signed-off-by: David Sterba <dsterba@suse.com>	2021-10-20 18:59:24 +02:00
David Sterba	4f662d74fd	btrfs-progs: export raid table helper for sub_stripes Signed-off-by: David Sterba <dsterba@suse.com>	2021-10-20 18:59:24 +02:00
David Sterba	833ce53872	btrfs-progs: use btrfs_bg_type_to_nparity in calc_stripe_length Signed-off-by: David Sterba <dsterba@suse.com>	2021-10-20 18:59:24 +02:00
David Sterba	e3355b43b4	btrfs-progs: use raid table for min devs in btrfs_check_chunk_valid Replace the hard coded values with the raid table reference. Signed-off-by: David Sterba <dsterba@suse.com>	2021-10-20 18:59:24 +02:00
David Sterba	3d05b20435	btrfs-progs: use btrfs_bg_type_to_nparity in chunk_bytes_by_type Signed-off-by: David Sterba <dsterba@suse.com>	2021-10-20 18:59:23 +02:00
David Sterba	359710d4dd	btrfs-progs: use btrfs_bg_type_to_nparity in get_dev_extent_len Stripe calculation with hard coded parity, use the helper. Signed-off-by: David Sterba <dsterba@suse.com>	2021-10-20 18:59:23 +02:00
David Sterba	f759e9bbad	btrfs-progs: introduce a public helper for raid parity There's a private helper for parity and there are many open coded calculations of parity for the RAID56 profiles. The helper will be used to remove that and use the raid table values. Signed-off-by: David Sterba <dsterba@suse.com>	2021-10-20 18:59:23 +02:00
David Sterba	447bf2fb37	btrfs-progs: zoned: factor out supported profiles to a helper The enumeration could get out of date, like fixed in previous commit. Create a helper that will hide the implementation details. Signed-off-by: David Sterba <dsterba@suse.com>	2021-10-20 18:59:23 +02:00
David Sterba	15eb03ca1d	btrfs-progs: use raid table for profile names in print-tree.c Pick the names from the raid table and do the uppercase conversion. Signed-off-by: David Sterba <dsterba@suse.com>	2021-10-20 18:59:23 +02:00
David Sterba	80714610f3	btrfs-progs: use raid table for ncopies There's opencoded value of raid table ncopies in print_filesystem_usage_overall, add a helper and use it. Signed-off-by: David Sterba <dsterba@suse.com>	2021-10-20 18:59:23 +02:00
David Sterba	b29f1603b0	btrfs-progs: use raid table for devs_min and replace local helper Another duplication of the raid table, in this case missing the changes to raid10 and raid0 minimum devices changed in `a177ef7dd4` ("btrfs-progs: mkfs: allow degenerate raid0/raid10"). Define and use a helper using the table value. Signed-off-by: David Sterba <dsterba@suse.com>	2021-10-20 18:59:23 +02:00
Naohiro Aota	e9696b06f0	btrfs-progs: use direct-io for zoned device We need to use direct-IO for zoned devices to preserve the write ordering. Instead of detecting if the device is zoned or not, we simply use direct-IO for any kind of device (even if emulated zoned mode on a regular device). Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-10-20 18:59:23 +02:00
Naohiro Aota	4a8d85f730	btrfs-progs: temporarily set zoned flag for initial tree reading Functions to read data/metadata, e.g. read_extent_from_disk(), now depend on the fs_info->zoned flag to determine if they do direct-IO or not. The flag (and zone_size) is not known before reading the chunk tree and it is set to 0 during the initial chunk tree setup process. That will cause btrfs_pread() to fail because it does not align the buffer. Use fcntl() to find out whether the file descriptor is opened with O_DIRECT, and if it is, set the zoned flag to 1 temporarily for this initial process. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-10-20 18:59:23 +02:00
Naohiro Aota	ae0dfb246d	btrfs-progs: introduce btrfs_pread wrapper for pread Wrap pread with btrfs_pread as well. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-10-20 18:59:23 +02:00
Naohiro Aota	c821e5545f	btrfs-progs: introduce btrfs_pwrite wrapper for pwrite Wrap pwrite with btrfs_pwrite(). It simply calls pwrite() on non-zoned btrfs (opened without O_DIRECT). In zoned mode (opened with O_DIRECT), it allocates an aligned bounce buffer, copies the contents and uses it for direct-IO writing. Writes in device_zero_blocks() and btrfs_wipe_existing_sb() are a little tricky. We don't have fs_info on our hands, so use zinfo to determine whether it is a zoned device or not. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-10-20 18:59:23 +02:00
Naohiro Aota	40ab7530df	btrfs-progs: set eb::fs_info properly everywhere Several extent_buffer initializations miss fs_info initialization. This is OK before the following patch ("btrfs-progs: use direct-io for zoned device") as eb->fs_info is not always necessary. But, after that patch, we will use fs_info to determine whether it is zoned or not, and that causes a segfault in such cases. Properly set fs_info when initializing extent_buffers to fix the issue. Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-10-08 20:47:04 +02:00
David Sterba	8bb13015bd	btrfs-progs: don't include btrfs-list.h unless necessary We don't need to include this besides btrfs-list.c itself and subvolume.c that does use the btrfs_list_* API. Signed-off-by: David Sterba <dsterba@suse.com>	2021-10-08 20:47:03 +02:00
Naohiro Aota	585ac14d1a	btrfs-progs: use btrfs_device_size() instead of device_get_partition_size_fd() device_get_partition_size_fd() fails if we pass a regular file. This can happen when trying to create an emulated zoned filesystem on a regular file. Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-10-08 20:46:35 +02:00
David Sterba	785218efb1	btrfs-progs: remove direct calls to crc32c from ctree.h Make the helpers using crc32c not inline so the crc32c.h can be removed from the public headers exported by libbtrfs. Signed-off-by: David Sterba <dsterba@suse.com>	2021-10-08 20:46:35 +02:00
David Sterba	732d73dc1f	btrfs-progs: remove btrfs_crc32c alias There's an ancient macro btrfs_crc32c which is just wrapping crc32c and not doing anything else, so we can use the crc helper directly. Signed-off-by: David Sterba <dsterba@suse.com>	2021-10-08 20:46:35 +02:00
David Sterba	979bda6fb5	btrfs-progs: libbtrfs: replace SZ_ constants and drop sizes.h To drop sizes.h from exported headers, replace the few SZ_ constants from the existing exported headers (ctree.h, send.h). It would be nice to use them in the long run but right now it would prevent unexporting the sizes.h file. Signed-off-by: David Sterba <dsterba@suse.com>	2021-10-08 20:46:35 +02:00
David Sterba	38356d456b	btrfs-progs: libbtrfs: drop radix-tree.h from exported headers The header is only included from ctree.h but not actually used, we can drop it from the exported files. Signed-off-by: David Sterba <dsterba@suse.com>	2021-10-08 20:46:35 +02:00
Nikolay Borisov	39c6e0b79c	btrfs-progs: add btrfs_uuid_tree_remove It will be used to clear received data on RW snapshots that were received. The function is copied from kernel sources. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-10-08 20:46:34 +02:00
Nikolay Borisov	97640a5b81	btrfs-progs: remove root argument from btrfs_truncate_item This function lies in the kernel-shared directory and is supposed to be a close to 1:1 copy of its kernel counterpart, yet it takes one extra argument - root. But this is now unused, so simply remove it. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-10-08 20:46:34 +02:00
Nikolay Borisov	c3584b4fc0	btrfs-progs: remove fs_info argument from leaf_data_end The function already takes an extent_buffer which has a reference to the owning filesystem's fs_info. This also brings the function in line with the kernel's signature. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-10-08 20:46:34 +02:00
Nikolay Borisov	7c58b09548	btrfs-progs: remove root argument from btrfs_fixup_low_keys It's not used, so just remove it. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-10-08 20:46:34 +02:00
David Sterba	d0ea2b2af4	btrfs-progs: zoned: also exclude raid1c3 and raid1c4 from supported profiles The enumeration of profiles not available for zoned mode in btrfs_load_block_group_zone_info was lacking the 3 and 4 copy raid1, add them. Signed-off-by: David Sterba <dsterba@suse.com>	2021-10-07 18:39:58 +02:00
David Sterba	27207d651a	btrfs-progs: dump-tree: print complete root_item The output of root_item in the 'inspect dump-tree' command lacks some items and some of them are printed conditionally. As the dump utility is for debugging, it's better to print all the items, with names matching the structure members and order. Some values will inevitably be all zeros like uuids or various timestamps, but that's a minor issue and affecting only a few trees. Example: item 0 key (EXTENT_TREE ROOT_ITEM 0) itemoff 15844 itemsize 439 generation 5 root_dirid 0 bytenr 30523392 byte_limit 0 bytes_used 16384 last_snapshot 0 flags 0x0(none) refs 1 drop_progress key (0 UNKNOWN.0 0) drop_level 0 level 0 generation_v2 5 uuid 00000000-0000-0000-0000-000000000000 parent_uuid 00000000-0000-0000-0000-000000000000 received_uuid 00000000-0000-0000-0000-000000000000 ctransid 0 otransid 0 stransid 0 rtransid 0 ctime 0.0 (1970-01-01 01:00:00) otime 0.0 (1970-01-01 01:00:00) stime 0.0 (1970-01-01 01:00:00) rtime 0.0 (1970-01-01 01:00:00) item 3 key (FS_TREE ROOT_ITEM 0) itemoff 14949 itemsize 439 generation 4 root_dirid 256 bytenr 30408704 byte_limit 0 bytes_used 16384 last_snapshot 0 flags 0x0(none) refs 1 drop_progress key (0 UNKNOWN.0 0) drop_level 0 level 0 generation_v2 4 uuid ec4669b6-6d21-46ab-857e-d60cafde45b3 parent_uuid 00000000-0000-0000-0000-000000000000 received_uuid 00000000-0000-0000-0000-000000000000 ctransid 0 otransid 0 stransid 0 rtransid 0 ctime 1633021823.0 (2021-09-30 19:10:23) otime 1633021823.0 (2021-09-30 19:10:23) stime 0.0 (1970-01-01 01:00:00) rtime 0.0 (1970-01-01 01:00:00) Signed-off-by: David Sterba <dsterba@suse.com>	2021-10-07 18:39:44 +02:00
Josef Bacik	dd8e7477f7	btrfs-progs: remove data extents from the free space tree Dave reported a failure of mkfs-test 009 with the free space tree enabled by default. This is because 009 pre-populates the file system with a given directory, and for some reason our data allocation path isn't the same as in the kernel. Fix this by making sure when we allocate a data extent we remove the space from the free space tree, and with this our mkfs tests now pass. Issue: #410 Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-10-06 16:49:52 +02:00
Naohiro Aota	85e102f212	btrfs-progs: properly format btrfs_header in btrfs_create_root() Enabling quota in zoned mode hits the following assertion: $ mkfs.btrfs -f -d single -m single -R quota /dev/nullb0 btrfs-progs v5.11 See http://btrfs.wiki.kernel.org for more information. Zoned: /dev/nullb0: host-managed device detected, setting zoned feature Resetting device zones /dev/nullb0 (1600 zones) ... bad tree block 25395200, bytenr mismatch, want=25395200, have=0 kernel-shared/disk-io.c:549: write_tree_block: BUG_ON `1` triggered, value 1 ./mkfs.btrfs(+0x26aaa)[0x564d1a7ccaaa] ./mkfs.btrfs(write_tree_block+0xb8)[0x564d1a7cee29] ./mkfs.btrfs(__commit_transaction+0x91)[0x564d1a7e3740] ./mkfs.btrfs(btrfs_commit_transaction+0x135)[0x564d1a7e39aa] ./mkfs.btrfs(main+0x1fe9)[0x564d1a7b442a] /lib64/libc.so.6(__libc_start_main+0xcd)[0x7f36377d37fd] ./mkfs.btrfs(_start+0x2a)[0x564d1a7b1fda] zsh: IOT instruction sudo ./mkfs.btrfs -f -d single -m single -R quota /dev/nullb0 The issue occurs because btrfs_create_root() is not formatting the root node properly. This is fine in regular mode, because it fortunately reuses a once freed buffer. As the previous tree node allocation kindly formatted the header, it will see the proper bytenr and pass the checks. However, we never reuse a once freed buffer on a zoned filesystem. As a result, we have zero-filled bytenr, FSID, and chunk-tree UUID, hitting the asserts in check_tree_block(). Reported-by: Johannes Thumshirn <Johannes.Thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-10-06 16:49:11 +02:00
Johannes Thumshirn	c22e9487a7	btrfs-progs: remove max_zone_append_size logic max_zone_append_size is unused and can as well be removed just like we did on the kernel side. Keep one sanity check though, so we're not adding devices to a zoned FS that aren't supporting zone append. Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-10-06 16:49:07 +02:00
Naohiro Aota	53ec59ead0	btrfs-progs: do not zone reset on emulated zoned mode We cannot zone reset a regular file with emulated zones. So, mkfs.btrfs on such a file causes the following error. ERROR: zoned: failed to reset device '/home/naota/tmp/btrfs.img' zones: Inappropriate ioctl for device Introduce btrfs_zoned_device_info->emulated to distinguish whether the zones are emulated or not, and use it to decide whether a zone reset is needed. Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-10-06 16:48:56 +02:00
Qu Wenruo	60651ad9da	btrfs-progs: introduce OPEN_CTREE_ALLOW_TRANSID_MISMATCH flag [BUG] There is a report that, btrfstune can even work while the fs has transid mismatch problems. $ btrfstune -f -u /dev/sdb1 Current fsid: b2b5ae8d-4c49-45f0-b42e-46fe7dcfcb07 New fsid: b2b5ae8d-4c49-45f0-b42e-46fe7dcfcb07 Set superblock flag CHANGING_FSID Change fsid in extents parent transid verify failed on 792854528 wanted 20103 found 20091 parent transid verify failed on 792854528 wanted 20103 found 20091 parent transid verify failed on 792854528 wanted 20103 found 20091 Ignoring transid failure parent transid verify failed on 792870912 wanted 20103 found 20091 parent transid verify failed on 792870912 wanted 20103 found 20091 parent transid verify failed on 792870912 wanted 20103 found 20091 Ignoring transid failure parent transid verify failed on 792887296 wanted 20103 found 20091 parent transid verify failed on 792887296 wanted 20103 found 20091 parent transid verify failed on 792887296 wanted 20103 found 20091 Ignoring transid failure ERROR: child eb corrupted: parent bytenr=38010880 item=69 parent level=1 child level=1 ERROR: failed to change UUID of metadata: -5 ERROR: btrfstune failed This leaves a corrupted fs even more corrupted, and due to the extra CHANGING_FSID flag, btrfs check will not even try to run on it: Opening filesystem to check... ERROR: Filesystem UUID change in progress ERROR: cannot open file system [CAUSE] Unlike kernel, btrfs-progs has a less strict check on transid mismatch. In read_tree_block() we will fall back to use the tree block even its transid mismatch if we can't find any better copy. However not all commands in btrfs-progs needs this feature, only btrfs-check (which may fix the problem) and btrfs-restore (it just tries to ignore any problems) really utilize this feature. [FIX] Introduce a new open ctree flag, OPEN_CTREE_ALLOW_TRANSID_MISMATCH, to be explicit about whether we really want to ignore transid error. 
Currently only btrfs-check and btrfs-restore will utilize this new flag. Also add btrfs-image to allow opening such fs with transid error. Link: https://www.reddit.com/r/btrfs/comments/pivpqk/failure_during_btrfstune_u/ Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-09-20 12:17:29 +02:00
David Sterba	96a5cf0719	btrfs-progs: handle EINVAL when reading zone size on older kernels A combination of new progs and old kernel may lead to problems with detecting zone size by ioctl. This was fixed by #376, but the fix is still incomplete because old kernels may return EINVAL for an unsupported ioctl. This should be ENOTTY but hasn't been like that until kernel 5.11. As we always pass valid arguments to the ioctl, EINVAL cannot mean an invalid argument here, so we can handle EINVAL the same way as ENOTTY. Issue: #399 Signed-off-by: David Sterba <dsterba@suse.com>	2021-09-20 11:31:09 +02:00
David Sterba	ee17bcec33	btrfs-progs: remove stale declaration from send.h We don't use this header for kernel compilation so the guarded declaration is pointless. Signed-off-by: David Sterba <dsterba@suse.com>	2021-09-07 19:27:59 +02:00
David Sterba	e86425242f	btrfs-progs: move send.h to kernel-shared/ The header contains the protocol definitions and is almost exactly the same as the kernel version, move it to the proper directory. Signed-off-by: David Sterba <dsterba@suse.com>	2021-09-07 19:26:46 +02:00
David Sterba	76ab1fa364	btrfs-progs: rename and move group_profile_max_safe_loss The helper belongs to the others that translate bg flags to the raid attr table member. Signed-off-by: David Sterba <dsterba@suse.com>	2021-09-07 16:38:56 +02:00
Qu Wenruo	9a11b1b792	btrfs-progs: backport btrfs_check_node() from kernel The btrfs_check_node() has far less meaningful error messages than its kernel counterpart, and it even lacks certain checks like the level check. Backport btrfs_check_node() to btrfs-progs to not only unify the code but greatly improve the readability of the error messages. Extra modifications include: - No fs_info needed As we don't need to output fsid. - Remove unlikely() macro - Extra BTRFS_TREE_BLOCK_* error type - Btrfs-progs specific error handling To record the corrupted tree blocks. Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-09-07 14:20:41 +02:00
Qu Wenruo	8f8cafa2ce	btrfs-progs: backport btrfs_check_leaf() from kernel Currently btrfs_check_leaf() provides almost meaningless messages for things like invalid item offset: incorrect offsets 8492 3707786077 The kernel tree-checker does a much better job, so it's wise to backport btrfs_check_leaf() from kernel. There are some modifications needed: - New generic_err() helper - Remove unlikely() macro - Remove empty essential tree check Mkfs still needs to create empty essential trees. - Using BTRFS_TREE_BLOCK_* return value Original mode check still relies on them to do certain repair. - No need for btrfs_fs_info We no longer need fsid output, thus no need for btrfs_fs_info. - No item contents check - Still using the fail: label for btrfs-progs specific error handling The new output looks like: corrupt leaf: root=2 block=72164753408 slot=109, unexpected item end, have 3707786077 expect 8492 Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-09-07 14:19:54 +02:00
Qu Wenruo	1f8dfe681f	btrfs-progs: use btrfs_key for btrfs_check_node() and btrfs_check_leaf() In kernel space we hardly use btrfs_disk_key, unless for very lowlevel code. There is no need to intentionally use btrfs_disk_key in btrfs-progs either. Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-09-07 13:58:44 +02:00
David Sterba	c3ee6a8a09	btrfs-progs: unify GPL header comments Add the GPL v2 header to files where it was missing and is not from an external source, update to the most recent version with the address. Signed-off-by: David Sterba <dsterba@suse.com>	2021-09-07 13:58:44 +02:00
David Sterba	7572839a74	btrfs-progs: add and use bit masks for RAID1 and RAID56 profiles Many test conditions can be simplified in case they check all the related profiles. Signed-off-by: David Sterba <dsterba@suse.com>	2021-09-06 16:36:18 +02:00
David Sterba	7fe4396467	btrfs-progs: copy some raid_attr helpers from kernel There are convenience helpers for the raid attr table, copy them from kernel for further cleanups. Signed-off-by: David Sterba <dsterba@suse.com>	2021-09-06 16:36:17 +02:00
Josef Bacik	79e534def9	btrfs-progs: add the incompat flag for extent tree v2 I will have a lot of preparatory patches to reduce the review pain of this large feature. In order to enable that work define the incompat flag. Once all of the work lands to support the feature there will be a patch to actually enable us to select it and manipulate file systems with that incompat flag set. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-09-06 16:36:17 +02:00
Josef Bacik	826e466028	btrfs-progs: add add_block_group_free_space helper This exists in the kernel free-space-tree.c but not in progs. We need it to generate the free space items for new block groups, which is needed when we start creating the free space tree in make_btrfs(). Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-09-06 16:36:17 +02:00
Josef Bacik	3d870a491f	btrfs-progs: make sure track_dirty and ref_cows is set properly Adding support for the per-block group roots means we will be reading the roots directly in different places. Make sure we set ->track_dirty and ->ref_cows properly in the helper so we don't have to do this everywhere. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-09-03 15:33:53 +02:00
David Sterba	a177ef7dd4	btrfs-progs: mkfs: allow degenerate raid0/raid10 Kernel patch b2f78e88052bc0bee ("btrfs: allow degenerate raid0/raid10") in 5.15 will allow mounting and converting to single device raid0 or two device raid10. Let mkfs create such filesystem. "The motivation is to allow to preserve the profile type as long as it possible for some intermediate state (device removal, conversion), or when there are disks of different size, with raid0 the otherwise unusable space of the last device will be used too. Similarly for raid10, though the two largest devices would need to be the same." Signed-off-by: David Sterba <dsterba@suse.com>	2021-08-27 15:40:53 +02:00
Qu Wenruo	991a598f53	btrfs-progs: move btrfs_format_csum() to common/utils.[ch] Function btrfs_format_csum() is a special helper only used in btrfs-progs. Move it to common/utils.[ch] other than leaving it in kernel-shared/disk-io.c. Since we're moving the code, also introduce a macro, BTRFS_CSUM_STRING_LEN, to replace open-coded string length calculation. Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-08-26 14:26:13 +02:00
Josef Bacik	8c3c13bb45	btrfs-progs: check blocks in btrfs_next_sibling_block By enabling the lowmem checks properly I uncovered the case where test fsck/007 will infinite loop at the detection stage. This is because when checking the inode item we will just btrfs_next_item(), and because we ignore check tree block failures at read time we don't get an -EIO from btrfs_next_leaf. This occurs because we allow fsck to raw-read blocks even if they fail basic sanity checks, because we want the opportunity to repair the blocks. However this means corrupt blocks are sitting in cache marked as uptodate. btrfs_search_slot() handles this by doing a check_block() on every block we add to the path, so that anything that is doing a search gets a proper -EIO. btrfs_next_sibling_block() needs a similar check. With this fix we now return -EIO on btrfs_next_leaf() properly and we no longer infinite loop on fsck/007 with lowmem. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-08-25 15:38:54 +02:00
Qu Wenruo	a138daac17	btrfs-progs: mkfs: set super_cache_generation to 0 if we're using free space tree [HICCUP] There is a bug report that mkfs.btrfs -R free-space-tree still makes the kernel try to clean up the v1 space cache: # mkfs.btrfs -R free-space-tree -f /dev/test/scratch1 # mount /dev/test/scratch1 /mnt/btrfs # dmesg | grep cleaning BTRFS info (device dm-6): cleaning free space cache v1 [CAUSE] By default, mkfs.btrfs will set super cache generation to (u64)-1, which informs the kernel that the v1 space cache is invalid and needs to be regenerated. But for the free space tree, the kernel will set super cache generation to 0, to indicate that the v1 space cache is not in use. This means that even if we enabled the free space tree with all the RO compatible bits and the new tree, as long as super cache generation is not 0, the kernel still considers the fs to have some invalid v1 space cache, and will try to remove it. [FIX] This is not a big deal, but to make "-R free-space-tree" really work like the kernel, we also need to set super cache generation to 0. Reported-by: Chris Murphy <lists@colorremedies.com> Link: https://lore.kernel.org/linux-btrfs/CAJCQCtSvgzyOnxtrqQZZirSycEHp+g0eDH5c+Kw9mW=PgxuXmw@mail.gmail.com/ Reviewed-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-08-20 14:24:55 +02:00
David Sterba	6527771668	btrfs-progs: add nparity for raid1c34 definitions The values of .ncopies was not explicitly set. Signed-off-by: David Sterba <dsterba@suse.com>	2021-07-23 00:59:27 +02:00
Qu Wenruo	07ecf878c1	btrfs-progs: check: batch v1 space cache inodes when clearing Currently v1 space cache clearing will delete just one cache inode per transaction, and then start a new transaction to delete the next inode. This is far from efficient and can make the already slow v1 space cache deletion even slower, as a large fs has tons of cache inodes to delete. This patch speeds up the process by batching up to 16 inode deletions into one transaction. A quick benchmark of deleting 702 v1 space cache inodes would look like this: Unpatched: 4.898s Patched: 0.087s Which is obviously a big win. Reported-by: Joshua <joshua@mailmag.net> Link: https://lore.kernel.org/linux-btrfs/0b4cf70fc883e28c97d893a3b2f81b11@mailmag.net/ Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-07-22 16:26:05 +02:00
Sidong Yang	94f3b75c00	btrfs-progs: zoned: fix memory leak in btrfs_sb_io() In btrfs_sb_io(), blk_zone_report is used for getting information about zones. But it is not freed if the code takes the usual path. This patch frees the variable just after it is used. Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: Sidong Yang <realwakka@gmail.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-07-02 17:27:53 +02:00
David Sterba	1dc6f33c28	btrfs-progs: zoned: use fixed width type when reading zone size The ioctl BLKGETZONESZ expects 32bit integer, declare the target variable as such. Signed-off-by: David Sterba <dsterba@suse.com>	2021-07-02 17:27:53 +02:00
David Sterba	b1f374dd1d	btrfs-progs: switch %Lu to %llu format The %Lu format is not standard and we use %llu everywhere else, so switch the remaining cases. Signed-off-by: David Sterba <dsterba@suse.com>	2021-06-19 22:07:49 +02:00
David Sterba	9f6c055e38	btrfs-progs: dump-tree: add options to dump checksums Add new options to dumps checksums in node headers and in the checksum items: $ btrfs inspect dump-tree --csum-headers image root tree leaf 471515136 items 19 free space 12186 generation 15 owner ROOT_TREE leaf 471515136 flags 0x1(WRITTEN) backref revision 1 csum 0x756b2d54 fs uuid df0348df-5773-47dd-81e9-a18221461239 For nodes/leaves it's appended on the 2nd line of the header. Checksum items are stored in leaves as EXTENT_CSUM key type, with offset value as the logical offset starting. As the array would be hard to parse or match, each offset value is printed with the checksum. For crc32c it's 4 values on a line, for xxhash it's 2 and for the long 256bit checksums it's one checksum per line. $ btrfs inspect dump-tree --csum-items image leaf 5423104 items 1 free space 30 generation 6 owner CSUM_TREE leaf 5423104 flags 0x1(WRITTEN) backref revision 1 fs uuid bd7c981e-16ff-4081-a734-3ef5d50cafc1 chunk uuid 13f4c76c-7845-4984-88ed-f01b52e05cf8 item 0 key (EXTENT_CSUM EXTENT_CSUM 22020096) itemoff 55 itemsize 16228 range start 22020096 end 38637568 length 16617472 [22020096] 0x8941f998 [22024192] 0x8941f998 [22028288] 0x8941f998 [22032384] 0x8941f998 [22036480] 0x8941f998 [22040576] 0x8941f998 [22044672] 0x8941f998 [22048768] 0x8941f998 ... $ btrfs inspect dump-tree --csum-items image leaf 5718016 items 1 free space 7746 generation 6 owner CSUM_TREE leaf 5718016 flags 0x1(WRITTEN) backref revision 1 fs uuid f453a5b4-8b4a-4fbf-90a2-2925e4fe2335 chunk uuid eb1da63b-248b-44c2-82da-71b2564bf50e item 0 key (EXTENT_CSUM EXTENT_CSUM 52387840) itemoff 7771 itemsize 8512 range start 52387840 end 53477376 length 1089536 [52387840] 0x686ede9288c391e7e05026e56f2f91bfd879987a040ea98445dabc76f55b8e5f [52391936] 0x686ede9288c391e7e05026e56f2f91bfd879987a040ea98445dabc76f55b8e5f ... The options are not on by default, the header checksum is not important for the structures. 
Data checksums can be quite big so that would make the dump long and without any actual data to match against. Signed-off-by: David Sterba <dsterba@suse.com>	2021-06-19 22:07:49 +02:00
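The per-line layout described in the dump-tree commit above can be sketched roughly as follows. This is only an illustration: `csums_per_line()` and `csum_logical_offset()` are hypothetical names, not btrfs-progs functions, and the sketch assumes one checksum per sector, which matches the 4096-byte steps in the sample output.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not the btrfs-progs function): number of
 * "[offset] 0xchecksum" pairs per output line for a checksum of
 * csum_size bytes.
 */
static int csums_per_line(size_t csum_size)
{
	assert(csum_size > 0);
	if (csum_size <= 4)	/* crc32c: 4 bytes */
		return 4;
	if (csum_size <= 8)	/* xxhash: 8 bytes */
		return 2;
	return 1;		/* 256-bit checksums: 32 bytes */
}

/*
 * Logical offset covered by the i-th checksum in an EXTENT_CSUM item,
 * assuming one checksum per sectorsize bytes starting at the key offset.
 */
static unsigned long long csum_logical_offset(unsigned long long key_offset,
					      unsigned int sectorsize, size_t i)
{
	return key_offset + (unsigned long long)i * sectorsize;
}
```

With the first sample item (key offset 22020096, 4096-byte sectors), the fourth checksum maps to offset 22032384, the last value on the first printed line.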
David Sterba	72d710637c	btrfs-progs: print-tree: convert mode to bitmask Replace follow and traverse by one parameter that takes bits to affect the behaviour. This allows to extend btrfs_print_tree output with more modes from one place. Signed-off-by: David Sterba <dsterba@suse.com>	2021-06-09 20:31:49 +02:00
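As a rough sketch of the bitmask idea in the print-tree commit above: the flag names and values below are hypothetical, and the sketch assumes the old `traverse` parameter chose between depth-first and breadth-first order.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag names and values, for illustration only. */
#define PT_MODE_FOLLOW	(1U << 0)	/* descend into child nodes */
#define PT_MODE_DFS	(1U << 1)	/* depth-first traversal */
#define PT_MODE_BFS	(1U << 2)	/* breadth-first traversal */

/* Fold the old (follow, traverse) pair into a single bitmask. */
static unsigned int make_print_mode(bool follow, bool bfs)
{
	unsigned int mode = bfs ? PT_MODE_BFS : PT_MODE_DFS;

	if (follow)
		mode |= PT_MODE_FOLLOW;
	/* exactly one traversal order must be selected */
	assert(!(mode & PT_MODE_DFS) != !(mode & PT_MODE_BFS));
	return mode;
}
```

A new output mode then only needs another bit, rather than another function parameter at every call site.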
David Sterba	6134973527	btrfs-progs: zoned: make it work without kernel support There's a report that a system with 4.19 kernel fails boot because device scan exits with error. This is because zoned support is compiled in btrfs-progs but not in kernel. To make new progs and old kernels work, do a fallback when the zoned ioctl is not available, as if it were a non-zoned device. There is no other option, but this is safe at least for the device scan that would not error out. Any unaligned writes to a zoned device will fail as expected. Issue: #376 Signed-off-by: David Sterba <dsterba@suse.com>	2021-06-07 17:38:46 +02:00
Su Yue	80a86f1b47	btrfs-progs: do not BUG_ON if btrfs_add_to_fsid succeeded to write superblock Commit `8ef9313cf2` ("btrfs-progs: zoned: implement log-structured superblock") changed the superblock write to BTRFS_SUPER_INFO_SIZE bytes. Previously, the number of bytes written was sectorsize. This makes mkfs.btrfs fail on my 16k pagesize kvm: $ /usr/bin/mkfs.btrfs -s 16k -f -mraid0 /dev/vdb2 /dev/vdb3 btrfs-progs v5.12 See http://btrfs.wiki.kernel.org for more information. ERROR: superblock magic doesn't match ERROR: superblock magic doesn't match common/device-scan.c:195: btrfs_add_to_fsid: BUG_ON `ret != sectorsize` triggered, value 1 /usr/bin/mkfs.btrfs(btrfs_add_to_fsid+0x274)[0xaaab4fe8a5fc] /usr/bin/mkfs.btrfs(main+0x1188)[0xaaab4fe4dc8c] /usr/lib/libc.so.6(__libc_start_main+0xe8)[0xffff7223c538] /usr/bin/mkfs.btrfs(+0xc558)[0xaaab4fe4c558] [1] 225842 abort (core dumped) /usr/bin/mkfs.btrfs -s 16k -f -mraid0 /dev/vdb2 /dev/vdb3 btrfs_add_to_fsid() now always calls sbwrite() to write BTRFS_SUPER_INFO_SIZE bytes to the device, so change the condition of the BUG_ON(). Also add comments for sbread() and sbwrite(). Signed-off-by: Su Yue <l@damenly.su> Signed-off-by: David Sterba <dsterba@suse.com>	2021-05-12 16:00:14 +02:00
David Sterba	6c53222add	btrfs-progs: delete bogus zero checksum check The check condition (csum_result == 0) does not make sense anymore as it's not the buffer and not the crc32c result as it used to be. The message does not bring any value and looks like it's some debugging aid from the old times (added in 2008 as `bb7055ec21` ("Add some extra debugging around file data checksum failures")). Signed-off-by: David Sterba <dsterba@suse.com>	2021-05-08 00:58:51 +02:00
David Sterba	c19ac510a7	btrfs-progs: move repair.[ch] to common/ Move the file to common as it's used by several parts, while still keeping the name 'repair' although the only thing it does is adding a corrupted extent. Signed-off-by: David Sterba <dsterba@suse.com>	2021-05-06 16:41:47 +02:00
David Sterba	b19a603d62	btrfs-progs: remove unnecessary linux/*.h includes Decrease dependency on system headers, remove where they're not needed or became stale after code moved. The path-utils.h encapsulate path operations so include linux/limits.h here, that's where PATH_MAX is defined. Signed-off-by: David Sterba <dsterba@suse.com>	2021-05-06 16:41:47 +02:00
David Sterba	aa56bf3a31	btrfs-progs: zoned: replace raw ioctl with a helper for device size Signed-off-by: David Sterba <dsterba@suse.com>	2021-05-06 16:41:46 +02:00
David Sterba	c7b5f884e0	btrfs-progs: add prefix to zero_blocks This is a public helper for devices, add the prefix to make it clear. Signed-off-by: David Sterba <dsterba@suse.com>	2021-05-06 16:41:46 +02:00
David Sterba	2b5d4f2e6f	btrfs-progs: add prefix to discard_blocks This is a helper for devices, make it clear in the function name. Signed-off-by: David Sterba <dsterba@suse.com>	2021-05-06 16:41:46 +02:00
David Sterba	bc6864967b	btrfs-progs: add prefix to exported queue_param As this is a public helper, add a prefix that makes it clear what is the queue related to. Signed-off-by: David Sterba <dsterba@suse.com>	2021-05-06 16:41:46 +02:00
David Sterba	38254c4934	btrfs-progs: kerncompat: add const_ilog2 The newly added zoned mode constants can utilize the const ilog2 version. Copy it from kernel include/linux/log2.h. Signed-off-by: David Sterba <dsterba@suse.com>	2021-05-06 16:41:46 +02:00
Naohiro Aota	8c2dfa6387	btrfs-progs: zoned: wipe temporary superblocks in superblock log zone mkfs.btrfs uses a temporary superblock during the initialization process. The temporary superblock uses BTRFS_MAGIC_TEMPORARY as its magic which is different from a regular superblock. As a result, libblkid, which only supports the usual magic, cannot recognize the volume as btrfs. So, let's wipe the temporary magic before writing out the usual superblock. Technically, we can add the temporary magic to the libblkid's table. But, it will result in recognizing a half-baked filesystem as btrfs, which is not ideal. Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-05-06 16:41:46 +02:00
Naohiro Aota	8bbb0c5744	btrfs-progs: zoned: support zero out on zoned block device If we zero out a region in a sequential write required zone, we cannot write to the region until we reset the zone. Thus, we must prohibit zeroing out to a sequential write required zone. zero_dev_clamped() is modified to take the zone information and it calls zero_zone_blocks() if the device is host managed to avoid writing to sequential write required zones. Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-05-06 16:41:46 +02:00
Naohiro Aota	58ec593892	btrfs-progs: zoned: support resetting zoned device All zones of zoned block devices should be reset before writing. Support this by introducing PREP_DEVICE_ZONED. btrfs_reset_all_zones() walks all the zones on a device and resets a zone if it is a sequential write required zone, or discards the zone range otherwise. Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-05-06 16:41:46 +02:00
Naohiro Aota	bfdb3ae237	btrfs-progs: zoned: reset zone of freed block group When freeing a chunk, we can/should reset the underlying device zones for the chunk. Introduce btrfs_reset_chunk_zones() and reset the zones. Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-05-06 16:41:45 +02:00
Naohiro Aota	bfd34b7876	btrfs-progs: zoned: redirty clean extent buffers Tree manipulating operations like merging nodes often release once-allocated tree nodes. Btrfs cleans such nodes so that pages in the node are not uselessly written out. On ZONED drives, however, such optimization blocks the following IOs as the cancellation of the write out of the freed blocks breaks the sequential write sequence expected by the device. Check if the next dirty extent buffer is contiguous with the previously written one. If not, redirty the extent buffers between the previous one and the next one, so that all dirty buffers are written sequentially. Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-05-06 16:41:45 +02:00
Naohiro Aota	feff533e34	btrfs-progs: zoned: calculate allocation offset for conventional zones Conventional zones do not have a write pointer, so we cannot use it to determine the allocation offset for sequential allocation if a block group contains a conventional zone. But instead, we can consider the end of the highest addressed extent in the block group for the allocation offset. For new block group, we cannot calculate the allocation offset by consulting the extent tree, because it can cause deadlock by taking extent buffer lock after chunk mutex, which is already taken in btrfs_make_block_group(). Since it is a new block group anyways, we can simply set the allocation offset to 0. Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-05-06 16:41:45 +02:00
Naohiro Aota	50ae9f62c7	btrfs-progs: zoned: implement sequential extent allocation Implement a sequential extent allocator for zoned filesystems. This allocator only needs to check if there is enough space in the block group after the allocation pointer to satisfy the extent allocation request. Since the allocator is really simple, we implement it directly in find_search_start(). Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-05-06 16:41:45 +02:00
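The check described in the sequential allocator commit above can be sketched as below. The struct and function names are hypothetical and the block group is reduced to three fields. Advancing the allocation pointer inside the helper is a simplification for the example, not a claim about where `find_search_start()` updates it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified block group: only what this sketch needs. */
struct bg_sketch {
	uint64_t start;		/* logical start address */
	uint64_t length;	/* size in bytes */
	uint64_t alloc_offset;	/* allocation pointer, relative to start */
};

/*
 * Return 0 and store the logical address in *ret if num_bytes fit after
 * the allocation pointer, or -1 if the block group has no room left.
 */
static int alloc_sequential(struct bg_sketch *bg, uint64_t num_bytes,
			    uint64_t *ret)
{
	assert(bg->alloc_offset <= bg->length);
	if (bg->length - bg->alloc_offset < num_bytes)
		return -1;
	*ret = bg->start + bg->alloc_offset;
	bg->alloc_offset += num_bytes;	/* simplification, see lead-in */
	return 0;
}
```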
Naohiro Aota	f08410f078	btrfs-progs: zoned: load zone's allocation offset A zoned filesystem must allocate blocks at the zones' write pointer. The device's write pointer position can be mapped to a logical address within a block group. To facilitate this, add an "alloc_offset" to the block group to track the logical addresses of the write pointer. This logical address is populated in btrfs_load_block_group_zone_info() from the write pointers of corresponding zones. For now, zoned filesystems support only the single profile. Supporting non-single profile with zone append writing is not trivial. For example, in the DUP profile, we send a zone append writing IO to two zones on a device. The device replies with written LBAs for the IOs. If the offsets of the returned addresses from the beginning of the zone are different, then it results in different logical addresses. We need fine-grained logical to physical mapping to support such separated physical address issue. Since it should require additional metadata type, disable non-single profiles for now. This commit supports the case all the zones in a block group are sequential. The next patch will handle the case having a conventional zone. Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-05-06 16:41:45 +02:00
Naohiro Aota	b031fe84fd	btrfs-progs: zoned: implement zoned chunk allocator Implement a zoned chunk and device extent allocator. One device zone becomes a device extent so that a zone reset affects only this device extent and does not change the state of blocks in the neighbor device extents. To implement the allocator, we need to extend the following functions for a zoned filesystem: - init_alloc_chunk_ctl - dev_extent_search_start - dev_extent_hole_check - decide_stripe_size Here, dev_extent_hole_check() is newly introduced to check the validity of a hole found. init_alloc_chunk_ctl_zoned() is mostly the same as regular one. It always set the stripe_size to the zone size and aligns the parameters to the zone size. dev_extent_search_start() only aligns the start offset to zone boundaries. We don't care about the first 1MB like in regular filesystem because we anyway reserve the first two zones for superblock logging. dev_extent_hole_check_zoned() checks if zones in given hole are either conventional or empty sequential zones. Also, it skips zones reserved for superblock logging. With the change to the hole, the new hole may now contain pending extents. So, in this case, loop again to check that. Finally, decide_stripe_size_zoned() should shrink the number of devices instead of stripe size because we need to honor stripe_size == zone_size. Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-05-06 16:41:45 +02:00
Naohiro Aota	8ef9313cf2	btrfs-progs: zoned: implement log-structured superblock Superblock (and its copies) is the only data structure in btrfs which has a fixed location on a device. Since we cannot overwrite in a sequential write required zone, we cannot place superblock in the zone. One easy solution is limiting superblock and copies to be placed only in conventional zones. However, this method has two downsides: one is reduced number of superblock copies. The location of the second copy of superblock is 256GB, which is in a sequential write required zone on typical devices in the market today. So, the number of superblock and copies is limited to be two. Second downside is that we cannot support devices which have no conventional zones at all. To solve these two problems, we employ superblock log writing. It uses two adjacent zones as a circular buffer to write updated superblocks. Once the first zone is filled up, start writing into the second one. Then, when both zones are filled up and before starting to write to the first zone again, reset the first zone. We can determine the position of the latest superblock by reading write pointer information from a device. One corner case is when both zones are full. For this situation, we read out the last superblock of each zone, and compare them to determine which zone is older. The following zones are reserved as the circular buffer on ZONED btrfs. - primary superblock: offset 0B (and the following zone) - first copy: offset 512G (and the following zone) - Second copy: offset 4T (4096G, and the following zone) If these reserved zones are conventional, superblock is written fixed at the start of the zone without logging. Currently, superblock reading/writing is done by pread/pwrite. This commit replace the call sites with sbread/sbwrite to wrap the functions. 
For zoned btrfs, btrfs_sb_io, which is called from sbread/sbwrite, reverses the IO position back to a mirror number, maps the mirror number into the superblock logging position, and does the IO. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-05-06 16:41:45 +02:00
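The two-zone circular log from the superblock commit above can be sketched like this. The struct, its fields and the function names are hypothetical; offsets are plain bytes, and inconsistent zone states are not detected. When both zones are full, the sketch picks the zone whose last superblock has the lower generation as the older one, which the caller would have to reset before writing.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified view of one superblock log zone. */
struct sb_zone_sketch {
	uint64_t start;		/* byte offset of the zone on the device */
	uint64_t len;		/* zone size in bytes */
	uint64_t wp;		/* write pointer, start <= wp <= start + len */
	uint64_t last_gen;	/* generation of the last superblock in the zone */
};

static int sb_zone_full(const struct sb_zone_sketch *z)
{
	return z->wp == z->start + z->len;
}

/*
 * Byte offset where the next superblock would be written. When both
 * zones are full, the older zone (lower generation) is chosen and the
 * caller must reset it first.
 */
static uint64_t sb_log_next_write(const struct sb_zone_sketch z[2])
{
	assert(z[0].wp >= z[0].start && z[1].wp >= z[1].start);
	if (sb_zone_full(&z[0]) && sb_zone_full(&z[1]))
		return z[0].last_gen > z[1].last_gen ? z[1].start : z[0].start;
	if (sb_zone_full(&z[0]))
		return z[1].wp;
	return z[0].wp;
}
```

The latest superblock then sits directly before the returned position, or at the end of the newer zone in the both-full case.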
Naohiro Aota	49d5ce4d0f	btrfs-progs: zoned: allow zoned filesystems on non-zoned block devices Run a zoned filesystem on non-zoned devices. This is done by "slicing up" the block device into fixed-sized chunks and emulate a conventional zone on each of them. The emulated zone size is determined from the size of device extent. This is mainly aimed at testing of zoned filesystems, i.e. the zoned chunk allocator, on regular block devices. Currently, we always use EMULATED_ZONE_SIZE (256MiB) for the emulated zone size. In the future, this will be customized by mkfs option. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-05-06 16:41:45 +02:00
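A minimal sketch of the "slicing up" in the emulated-zone commit above, assuming zones are laid out back to back from offset 0. `EMULATED_ZONE_SIZE` and its 256MiB value come from the commit message; the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define EMULATED_ZONE_SIZE	(256ULL * 1024 * 1024)	/* 256MiB */

/* Index of the emulated zone that contains a byte offset. */
static uint64_t emulated_zone_index(uint64_t offset, uint64_t zone_size)
{
	assert(zone_size > 0);
	return offset / zone_size;
}

/* Byte offset at which that emulated zone starts. */
static uint64_t emulated_zone_start(uint64_t offset, uint64_t zone_size)
{
	return emulated_zone_index(offset, zone_size) * zone_size;
}
```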
Naohiro Aota	707f0716e0	btrfs-progs: zoned: disallow mixed-bg in ZONED mode Placing both data and metadata in a block group is impossible in ZONED mode. For data, we can allocate a space for it and write it immediately after the allocation. For metadata, however, we cannot do that, because the logical addresses are recorded in other metadata buffers to build up the trees. As a result, a data buffer can be placed after a metadata buffer, which is not written yet. Writing out the data buffer will break the sequential write rule. Check and disallow MIXED_BG with ZONED mode. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-05-06 16:41:45 +02:00
Naohiro Aota	3c0f83e541	btrfs-progs: zoned: introduce max_zone_append_size The zone append write command has a maximum IO size restriction it accepts. This is because a zone append write command cannot be split, as we ask the device to place the data into a specific target zone and the device responds with the actual written location of the data. Introduce max_zone_append_size to zone_info and fs_info to track the value, so we can limit all I/O to a zoned block device that we want to write using the zone append command to the device's limits. Zone append command is mandatory for zoned btrfs. So, reject a device with max_zone_append_size == 0. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-05-06 16:41:45 +02:00
Naohiro Aota	7e520022ff	btrfs-progs: zoned: check and enable ZONED mode Introduce function btrfs_check_zoned_mode() to check if ZONED flag is enabled on the file system and if the file system consists of zoned devices with equal zone size. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-05-06 16:41:45 +02:00
Naohiro Aota	384840b9c0	btrfs-progs: zoned: get zone information of zoned block devices Get the zone information (number of zones and zone size) from all the devices, if the volume contains a zoned block device. To avoid costly run-time zone report commands to test the device zones type during block allocation, it also records all the zone status (zone type, write pointer position, etc.). Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-05-06 16:41:45 +02:00
Naohiro Aota	242c8328bc	btrfs-progs: zoned: add new ZONED feature flag With the zoned feature enabled, a zoned block device-aware btrfs allocates block groups aligned to the device zones and always written in sequential zones at the zone write pointer position. It also supports "emulated" zoned mode on a non-zoned device. In the emulated mode, btrfs emulates conventional zones by slicing the device into fixed-size zones. We don't support conversion from the ext4 volume with the zoned feature because we can't be sure all the converted block groups are aligned to zone boundaries. Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-05-06 16:41:45 +02:00
Naohiro Aota	acdd22ab68	btrfs-progs: provide fs_info from btrfs_device Likewise in the kernel code, provide fs_info access from struct btrfs_device. This will help to unify the code between the kernel and the userland. Since fs_info can be NULL at the time of btrfs_add_to_fsid(), let's use btrfs_open_devices() to set fs_info to the devices. Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-05-06 16:41:45 +02:00
Naohiro Aota	cf67267d33	btrfs-progs: rename calc_size to stripe_size alloc_chunk_ctl::calc_size is actually the stripe_size in the kernel side code. Let's rename it to clarify what the "calc" is. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-05-06 16:41:45 +02:00
Naohiro Aota	a968606632	btrfs-progs: simplify arguments of chunk_bytes_by_type() chunk_bytes_by_type() takes type, calc_size, and ctl as arguments. But the first two can be obtained from the ctl. Let's drop these arguments for simplicity. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-05-06 16:41:45 +02:00
Naohiro Aota	3428487d90	btrfs-progs: drop alloc_chunk_ctl::stripe_len Since commit `b9444efb66` ("btrfs-progs: don't pretend RAID56 has a different stripe length"), alloc_chunk_ctl::stripe_len is always fixed to BTRFS_STRIPE_LEN. Let's replace alloc_chunk_ctl::stripe_len with BTRFS_STRIPE_LEN, like in the kernel code. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-05-06 16:41:44 +02:00
Naohiro Aota	2d3b31d604	btrfs-progs: use round_down for allocation calcs Several calculations in the chunk allocation process use this pattern. x /= y; x *= y; Replace this pattern with round_down(). Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-05-06 16:41:44 +02:00
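The two forms mentioned in the round_down commit above can be compared with a small sketch. The function names are hypothetical. The second helper assumes `round_down()` is the kernel-style mask-based macro, which is only valid when the alignment is a power of two, while the divide-and-multiply pattern works for any non-zero value.

```c
#include <assert.h>
#include <stdint.h>

/* Old pattern: x /= y; x *= y; works for any non-zero y. */
static uint64_t round_down_div(uint64_t x, uint64_t y)
{
	assert(y != 0);
	x /= y;
	x *= y;
	return x;
}

/* Mask-based variant: y must be a power of two. */
static uint64_t round_down_pow2(uint64_t x, uint64_t y)
{
	assert(y != 0 && (y & (y - 1)) == 0);
	return x & ~(y - 1);
}
```

For a 64KiB alignment both give the same result, for example 70000 rounds down to 65536.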
Naohiro Aota	907c5fd7a4	btrfs-progs: fix to use half the available space for DUP profile In the DUP profile, we can use only half of the space available in a device extent. Fix the calculation of calc_size for it. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-05-06 16:41:44 +02:00
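The DUP calculation above amounts to dividing the free space by the number of stripes placed on the same device. A minimal sketch, with a hypothetical helper name and parameters, not the btrfs-progs code:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: largest stripe that fits in a free device extent
 * when dev_stripes stripes must be placed on the same device (2 for DUP,
 * 1 for profiles with one stripe per device).
 */
static uint64_t max_stripe_in_hole(uint64_t hole_size, unsigned int dev_stripes)
{
	assert(dev_stripes > 0);
	return hole_size / dev_stripes;
}
```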
Naohiro Aota	13a6cff8b6	btrfs-progs: rewrite btrfs_alloc_data_chunk() using create_chunk() btrfs_alloc_data_chunk() and create_chunk() have the most part in common. Let's rewrite btrfs_alloc_data_chunk() using create_chunk(). There are two differences between btrfs_alloc_data_chunk() and create_chunk(). create_chunk() uses find_next_chunk() to decide the logical address of the chunk, and it uses btrfs_alloc_dev_extent() to decide the physical address of a device extent. On the other hand, btrfs_alloc_data_chunk() uses *start for both logical and physical addresses. To support the btrfs_alloc_data_chunk()'s use case, we use ctl->start and ctl->dev_offset. If these values are set (non-zero), use the specified values as the address. It is safe to use 0 to indicate the value is not set here. Because both lower addresses of logical (0..BTRFS_FIRST_CHUNK_TREE_OBJECT_ID) and physical (0..BTRFS_BLOCK_RESERVED_1M_FOR_SUPER) are reserved. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-05-06 16:41:44 +02:00
Naohiro Aota	1e344dc8cf	btrfs-progs: factor out create_chunk() Factor out create_chunk() from btrfs_alloc_chunk(). This new function creates a chunk. There are no functional changes. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-05-06 16:41:44 +02:00
Naohiro Aota	0e3865206e	btrfs-progs: factor out decide_stripe_size() Factor out decide_stripe_size() from btrfs_alloc_chunk(). This new function calculates the actual stripe size to allocate and decides the size of a stripe (ctl->calc_size). This commit has no functional changes. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-05-06 16:41:44 +02:00
Naohiro Aota	3acac33e8c	btrfs-progs: consolidate parameter initialization of regular allocator Move parameter initialization code for regular allocator to init_alloc_chunk_ctl_policy_regular(). This will help adding another allocator in the future. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-05-06 16:41:44 +02:00
Naohiro Aota	605ffad6f0	btrfs-progs: convert type of alloc_chunk_ctl::type Convert alloc_chunk_ctl::type to take the original type in btrfs_alloc_chunk(). This will help refactoring in the following commits. Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-05-06 16:41:44 +02:00
Naohiro Aota	ff65b83306	btrfs-progs: refactor find_free_dev_extent_start() Factor out the function dev_extent_search_start() from find_free_dev_extent_start() to decide the starting position of a device extent search. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-05-06 16:41:44 +02:00
Naohiro Aota	1da9fede64	btrfs-progs: introduce chunk allocation policy Introduce chunk allocation policy for btrfs. This policy controls how chunks and device extents are allocated from devices. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-05-06 16:41:44 +02:00
Nikolay Borisov	0595309541	btrfs-progs: fix null pointer deref in balance_level In case the right buffer is emptied it's first set to NULL and subsequently it's dereferenced to get its size to pass to root_sub_used. This naturally leads to a NULL pointer dereference. The correct thing to do is to pass the stashed right->len in "blocksize". Issue: #296 Pull-request: #360 Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-04-19 18:58:26 +02:00
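The shape of the balance_level fix above, reading the length before the pointer is cleared, can be sketched with a hypothetical stand-in for the extent buffer. This is not the actual `balance_level()` code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an extent buffer. */
struct eb_sketch {
	uint32_t len;
};

/*
 * Clear the caller's pointer but return the buffer size, read before
 * the pointer becomes NULL. Reading (*right)->len after the assignment
 * would be the NULL dereference described in the commit.
 */
static uint32_t release_and_get_size(struct eb_sketch **right)
{
	uint32_t blocksize;

	assert(right != NULL && *right != NULL);
	blocksize = (*right)->len;	/* stash first */
	*right = NULL;
	return blocksize;
}
```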
Johannes Thumshirn	94b60b67a9	btrfs-progs: pass in fs_info to btrfs_csum_data For passing authentication keys to the checksumming functions we need a container for the key. Pass in a btrfs_fs_info to btrfs_csum_data() so we can use the fs_info as a container for the authentication key. Note this is not always possible for all callers of btrfs_csum_data() so we're just passing in NULL for now. Functions calling btrfs_csum_data() with a NULL fs_info argument are currently not supported in the context of an authenticated file system. Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: David Sterba <dsterba@suse.com>	2021-03-24 22:20:19 +01:00
David Sterba	6bb6d1215d	btrfs-progs: factor open_ctree parameters to a structure Extending open_ctree with more parameters would be difficult, we'll need to add more so factor out the parameters to a structure for easier extension. Signed-off-by: David Sterba <dsterba@suse.com>	2021-03-24 22:20:19 +01:00
Dāvis Mosāns	a2ccf96d76	btrfs-progs: fix checksum output for "checksum verify failed" Currently only single checksum byte is printed. Fix it so that the whole checksum is printed, in the order as the bytes are stored in the buffer. This matches what kernel does, though it might not correspond to the cases of CRC32C and XXHASH as if they were stored in integer variable and printed in the native format. For consistency we need to print the same format. Signed-off-by: Dāvis Mosāns <davispuh@gmail.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-03-15 14:59:40 +01:00
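Printing every checksum byte in storage order, as the commit above describes, might look like the sketch below. `format_csum()` is a hypothetical helper, not the btrfs-progs function, and the `0x` prefix is an assumption borrowed from the dump-tree output quoted elsewhere in this log.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write "0x" followed by two hex digits per byte,
 * in the order the bytes are stored in the buffer. Returns 0 on
 * success, -1 if the output buffer is too small.
 */
static int format_csum(char *out, size_t out_size,
		       const unsigned char *csum, size_t csum_size)
{
	size_t i;

	assert(out != NULL && csum != NULL);
	if (out_size < 2 + 2 * csum_size + 1)
		return -1;
	memcpy(out, "0x", 2);
	for (i = 0; i < csum_size; i++)
		snprintf(out + 2 + 2 * i, 3, "%02x", (unsigned int)csum[i]);
	out[2 + 2 * csum_size] = '\0';
	return 0;
}
```

For the bytes 98 f9 41 89 this yields `0x98f94189`, whereas the same bytes read as a 32-bit integer on a little-endian host would print as `0x8941f998`. That is the mismatch with the native integer format that the commit message mentions.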
Marek Behún	08fb534eb7	btrfs-progs: do not fail when offset of a ROOT_ITEM is not -1 When the btrfs_read_fs_root() function is searching a ROOT_ITEM with location key offset other than -1, it currently fails via BUG_ON. The offset can have other value than -1, though. This can happen for example if a subvolume is renamed: $ btrfs subvolume create X && sync Create subvolume './X' $ btrfs inspect-internal dump-tree /dev/root | grep -B 2 'name: X$ location key (270 ROOT_ITEM 18446744073709551615) type DIR transid 283 data_len 0 name_len 1 name: X $ mv X Y && sync $ btrfs inspect-internal dump-tree /dev/root | grep -B 2 'name: Y$ location key (270 ROOT_ITEM 0) type DIR transid 285 data_len 0 name_len 1 name: Y As can be seen the offset changed from -1ULL to 0. Do not fail in this case. Signed-off-by: Marek Behún <marek.behun@nic.cz> CC: Qu Wenruo <wqu@suse.com> CC: Tom Rini <trini@konsulko.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-02-19 15:24:42 +01:00
Su Yue	be6710f89d	btrfs-progs: print bytenr of child eb if mismatched level found in read_node_slot If btrfs check reported like ERROR: child eb corrupted: parent bytenr=178081 item=246 parent level=1 child level=2 It's hard to find which eb is corrupted without bytenr in dump tree information: node 178081 level 1 items 424 free 69 generation 44495 owner EXTENT_TREE fs uuid 7d9dbe1b-dea6-4141-807b-026325123ad8 chunk uuid 97a3e3aa-7105-4101-aaf7-50204a240e69 key (16613126144 EXTENT_ITEM 4096) block 177939087360 gen 44433 key (16632803328 EXTENT_ITEM 4096) block 177939120128 gen 44433 key (16654548992 EXTENT_ITEM 8192) block 177970380800 gen 44336 key (16697884672 EXTENT_ITEM 8192) block 177970397184 gen 44336 key (16714223616 EXTENT_ITEM 16384) block 177970413568 gen 44336 key (16721760256 EXTENT_ITEM 16384) block 177943855104 gen 44436 key (16857755648 EXTENT_ITEM 4096) block 177857544192 gen 44416 ... For easier lookup, print bytenr of child eb if its level is not equal to parent's level - 1 in read_node_slot(). Signed-off-by: Su Yue <l@damenly.su> Signed-off-by: David Sterba <dsterba@suse.com>	2021-01-13 22:33:10 +01:00
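The level check in the read_node_slot commit above can be sketched as follows. The helper name is hypothetical, and the message layout is an assumption that extends the error line quoted in the commit with a `child bytenr` field. The exact wording in btrfs-progs may differ.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: a child must sit exactly one level below its
 * parent. On mismatch, fill msg with an error line that includes the
 * child's bytenr and return -1; otherwise leave msg empty and return 0.
 */
static int check_child_level(char *msg, size_t size,
			     unsigned long long parent_bytenr, int slot,
			     int parent_level,
			     unsigned long long child_bytenr, int child_level)
{
	assert(msg != NULL && size > 0);
	msg[0] = '\0';
	if (child_level == parent_level - 1)
		return 0;
	snprintf(msg, size,
		 "child eb corrupted: parent bytenr=%llu item=%d parent level=%d child bytenr=%llu child level=%d",
		 parent_bytenr, slot, parent_level, child_bytenr, child_level);
	return -1;
}
```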
Josef Bacik	9cc9c9ab32	btrfs-progs: print the eb flags for nodes as well While debugging a corruption problem I realized we don't spit out the flags for nodes, which is needed when debugging relocation problems so we know which nodes are the RELOC root items and which are the actual fs tree's items. Fix this by unifying the header printing helper so both leaves and nodes get the same information printed out. node 41070940160 level 1 items 34 free space 87 generation 7709536 owner ROOT_TREE node 41070940160 flags 0x1(WRITTEN) backref revision 1 Same for leaves: leaf 41070944256 items 12 free space 515 generation 7709536 owner ROOT_TREE leaf 41070944256 flags 0x1(WRITTEN) backref revision 1 Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>	2020-12-16 17:08:53 +01:00
Boris Burkov	92d92e99b7	btrfs-progs: mkfs: support free space tree as -R option Add a runtime feature (-R) flag for the free space tree. A filesystem that is mkfs'd with -R free-space-tree then mounted with no options has the same contents as one mkfs'd without the option, then mounted with '-o space_cache=v2'. The only tricky thing is in exactly how to call the tree creation code. Using btrfs_create_free_space_tree as is did not quite work, because an extra reference to the eb (root->commit_root) is leaked, which mkfs complains about with a warning. I opted to follow how the uuid tree is created by adding it to the dirty roots list for cleanup by commit_tree_roots in commit_transaction. As a result, btrfs_create_free_space_tree no longer exactly matches the version in the kernel sources. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Boris Burkov <boris@bur.io> Signed-off-by: David Sterba <dsterba@suse.com>	2020-09-08 22:06:04 +02:00
Marcos Paulo de Souza	c655b5e4b1	btrfs-progs: make btrfs_lookup_dir_index in parity with kernel code This function exists in kernel side but using the _item suffix, and objectid argument is placed before the name argument. Change the function to reflect the kernel version. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2020-08-31 17:09:49 +02:00
David Sterba	0144bcb713	btrfs-progs: move volumes.c to kernel-shared/ Signed-off-by: David Sterba <dsterba@suse.com>	2020-08-31 17:01:06 +02:00
David Sterba	6069bc52a9	btrfs-progs: move transaction.c to kernel-shared/ Signed-off-by: David Sterba <dsterba@suse.com>	2020-08-31 17:01:06 +02:00
David Sterba	978f300c21	btrfs-progs: move inode.c to kernel-shared/ Signed-off-by: David Sterba <dsterba@suse.com>	2020-08-31 17:01:05 +02:00
David Sterba	abb670f883	btrfs-progs: move ctree.c to kernel-shared/ Signed-off-by: David Sterba <dsterba@suse.com>	2020-08-31 17:01:05 +02:00
David Sterba	c03619b864	btrfs-progs: move file.c to kernel-shared/ Signed-off-by: David Sterba <dsterba@suse.com>	2020-08-31 17:01:05 +02:00
David Sterba	772f0da6df	btrfs-progs: move disk-io.c to kernel-shared/ Signed-off-by: David Sterba <dsterba@suse.com>	2020-08-31 17:01:05 +02:00
David Sterba	cf529f36ad	btrfs-progs: move print-tree.c to kernel-shared/ Signed-off-by: David Sterba <dsterba@suse.com>	2020-08-31 17:01:05 +02:00
David Sterba	7dd4abc3c5	btrfs-progs: move extent-tree.c to kernel-shared/ Signed-off-by: David Sterba <dsterba@suse.com>	2020-08-31 17:01:04 +02:00
David Sterba	4e49bd703d	btrfs-progs: move extent_io.c to kernel-shared/ Signed-off-by: David Sterba <dsterba@suse.com>	2020-08-31 17:01:04 +02:00
David Sterba	da90f38ad9	btrfs-progs: move free-space-tree.c to kernel-shared/ Signed-off-by: David Sterba <dsterba@suse.com>	2020-08-31 17:01:04 +02:00
David Sterba	f7fe7de64c	btrfs-progs: move uuid-tree.c to kernel-shared/ Signed-off-by: David Sterba <dsterba@suse.com>	2020-06-09 22:19:09 +02:00
David Sterba	a656166d11	btrfs-progs: move root-tree.c to kernel-shared/ Signed-off-by: David Sterba <dsterba@suse.com>	2020-06-09 22:19:09 +02:00
David Sterba	fdf058ac0f	btrfs-progs: move inode-item.c to kernel-shared/ Signed-off-by: David Sterba <dsterba@suse.com>	2020-06-09 22:19:09 +02:00
David Sterba	a067ecef0a	btrfs-progs: move file-item.c to kernel-shared/ Signed-off-by: David Sterba <dsterba@suse.com>	2020-06-09 22:19:09 +02:00
David Sterba	925bf01f5d	btrfs-progs: move dir-item.c to kernel-shared/ Signed-off-by: David Sterba <dsterba@suse.com>	2020-06-09 22:19:08 +02:00
Qu Wenruo	ccad599701	btrfs-progs: rename btrfs_block_group_cache to btrfs_block_group To keep the same naming across kernel and btrfs-progs. Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2020-05-11 20:50:00 +02:00
Qu Wenruo	5bc44891c9	btrfs-progs: kill block_group_cache::key This would sync the code between kernel and btrfs-progs, and save at least 1 byte for each btrfs_block_group_cache. Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2020-05-11 20:49:50 +02:00
David Sterba	8c85b34420	btrfs-progs: move backref.[ch] to kernel-shared/ The files are very close to kernel versions, for that keep them in the shared directory. Signed-off-by: David Sterba <dsterba@suse.com>	2020-05-04 20:48:35 +02:00
David Sterba	58bcd4260f	btrfs-progs: move free-space-tree.[ch] to kernel-shared/ The files are very close to kernel versions, for that keep them in the shared directory. Signed-off-by: David Sterba <dsterba@suse.com>	2020-03-31 18:37:34 +02:00
David Sterba	019489a143	btrfs-progs: move delayed-ref.[ch] to kernel-shared/ The files are very close to kernel versions, for that keep them in the shared directory. Signed-off-by: David Sterba <dsterba@suse.com>	2020-03-31 18:37:34 +02:00
David Sterba	94fced6353	btrfs-progs: build: drop kernel-lib from -I and update paths Include the files by full path to avoid any confusion in case of potentially duplicate names. Signed-off-by: David Sterba <dsterba@suse.com>	2019-07-03 20:49:04 +02:00
David Sterba	440e8b9830	btrfs-progs: shared: rename ulist_fini to ulist_release Sync the file with kernel version. Signed-off-by: David Sterba <dsterba@suse.com>	2019-01-16 21:16:14 +01:00
David Sterba	2fefed2601	btrfs-progs: shared: cleanup includes in ulist.c Signed-off-by: David Sterba <dsterba@suse.com>	2017-03-08 13:00:47 +01:00
David Sterba	390db8a346	btrfs-progs: shared: remove debug code from ulist Sync with kernel sources, we don't define CONFIG_BTRFS_DEBUG in userspace anyway. Signed-off-by: David Sterba <dsterba@suse.com>	2017-03-08 13:00:47 +01:00
David Sterba	4049542d4f	btrfs-progs: shared: copy ulist_del from kernel Signed-off-by: David Sterba <dsterba@suse.com>	2017-03-08 13:00:47 +01:00
David Sterba	886a8565e0	btrfs-progs: move ulist.[ch] to kernel-shared The implementation of ulist_* is same for kernel and userspace, without dependencies, so we can keep it separately for code sync. Signed-off-by: David Sterba <dsterba@suse.com>	2017-03-08 13:00:47 +01:00

236 Commits