Commit Graph

417 Commits

Author SHA1 Message Date
Qu Wenruo
99dc37bcfe btrfs-progs: cross-port btrfs_uuid_tree_add() from kernel
The modification is minimal:

- Replace WARN_ON() with UASSERT()

- Remove the @trans parameter for btrfs_extend_item() and
  btrfs_mark_buffer_dirty()
  As progs version doesn't need a transaction handler.

- Remove the btrfs_uuid_tree_add() in mkfs/main.c

Signed-off-by: Qu Wenruo <wqu@suse.com>
2024-07-30 20:02:42 +02:00
Qu Wenruo
8efa8092aa btrfs-progs: move uuid-tree definitions to kernel-shared/uuid-tree.h
Currently we already have a kernel-shared/uuid-tree.c, which is mostly
shared with kernel.

Kernel also has a uuid-tree.h, but we are still using ctree.h for the
header.

Move all the uuid-tree related definitions to kernel-shared/uuid-tree.h,
making future code sync easier.

Signed-off-by: Qu Wenruo <wqu@suse.com>
2024-07-30 20:01:59 +02:00
Mark Harmstone
32ab0e6328 btrfs-progs: set transid in btrfs_insert_dir_item
btrfs_insert_dir_item wasn't setting the transid field in
btrfs_dir_item. Set it to the current transaction ID rather than writing
uninitialized memory to disk.

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Mark Harmstone <maharmstone@fb.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
2024-07-30 20:01:38 +02:00
Yaroslav Halchenko
16a7cbca91 btrfs-progs: run codespell throughout fixing typos automagically
Spell checking can now run in automated mode.

=== Do not change lines below ===
{
 "chain": [],
 "cmd": "codespell -w",
 "exit": 0,
 "extra_inputs": [],
 "inputs": [],
 "outputs": [],
 "pwd": "."
}
^^^ Do not change lines above ^^^

Author: Yaroslav Halchenko <debian@onerussian.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-07-30 19:56:08 +02:00
Qu Wenruo
9ad15a7301 btrfs-progs: use btrfs_link_subvolume() to replace btrfs_mksubvol()
The function btrfs_mksubvol() is very different between btrfs-progs and
kernel, the former version is really just linking a subvolume to another
directory inode, but the kernel version is really to make a completely
new subvolume.

Instead of same-named function, introduce btrfs_link_subvolume() and use
it to replace the old btrfs_mksubvol().

This is done by:

- Introduce btrfs_link_subvolume()
  Which does extra checks before doing any modification:
  * Make sure the target inode is a directory
  * Make sure no filename conflict

  Then do the linkage:
  * Add the dir_item/dir_index into the parent inode
  * Add the forward and backward root refs into tree root

- Introduce link_image_subvolume() helper
  Currently btrfs_mksubvol() has a dedicated convert filename retry
  behavior, which is unnecessary and should be done by the convert code.

  Now move the filename retry behavior into the helper.

- Remove btrfs_mksubvol()
  Since there is only one caller utilizing btrfs_mksubvol(), and it's
  now gone, we can remove the old btrfs_mksubvol().

Signed-off-by: Qu Wenruo <wqu@suse.com>
2024-07-30 19:54:50 +02:00
Qu Wenruo
9b74d80919 btrfs-progs: remove fs_info parameter from btrfs_create_tree()
The @fs_info parameter can be easily extracted from @trans, and kernel
has already remove the parameter.

Signed-off-by: Qu Wenruo <wqu@suse.com>
2024-07-30 19:54:00 +02:00
David Sterba
ef73193623 btrfs-progs: dump-tree: escape special characters in paths or xattrs
Filenames can contain a newline (or other funny characters), this makes
the dump-tree output confusing, same for xattr names or values that can
binary data.  Encode the special characters in the C-style ('\e' ->
"\e", or \NNN if there's no single letter representation). This is based
on the isprint() as it's espected either on a terminal or in a dump
file.

Issue: #350
Issue: #407
Signed-off-by: David Sterba <dsterba@suse.com>
2024-07-30 19:53:33 +02:00
Johannes Thumshirn
7c549b5f7c btrfs-progs: remove raid stripe encoding
Remove the not needed encoding and reserved fields in struct
raid_stripe_extent.

This saves 8 bytes per stripe extent.

Note: this is a format change and previously created filesystems with
raid-stripe-tree will not be accessible. Similar patch is needed in
kernel.

Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-06-24 19:40:18 +02:00
David Sterba
4db925911c btrfs-progs: use strncpy_null everywhere
Use the safe version of strncpy that makes sure the string is
terminated.

To be noted:

- the conversion in scrub path handling was skipped
- sizes of device paths in some ioctl related structures is
  BTRFS_DEVICE_PATH_NAME_MAX + 1

Recently gcc 13.3 started to detect problems with our use of strncpy
potentially lacking the null terminator, warnings like:

cmds/inspect.c: In function ‘cmd_inspect_logical_resolve’:
cmds/inspect.c:294:33: warning: ‘__builtin_strncpy’ specified bound 4096 equals destination size [-Wstringop-truncation]
  294 |                                 strncpy(mount_path, mounted, PATH_MAX);
      |                                 ^

Signed-off-by: David Sterba <dsterba@suse.com>
2024-06-24 19:18:48 +02:00
Qu Wenruo
5dc737c42c btrfs-progs: print-tree: handle all supported flags
Although we already have a pretty good array defined for all
super/compat_ro/incompat flags, we still rely on a manually defined mask
to do the printing.

This can lead to easy de-sync between the definition and the flags.

Change it to automatically iterate through the array to calculate the
flags, and add the remaining super flags.

Pull-request: #810
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-06-24 19:17:53 +02:00
Qu Wenruo
2f8a6ee294 btrfs-progs: fix the conflicting super block flags
[BUG]
There is a bug report that a canceled checksum conversion (still
experimental feature) resulted in unexpected super flags:

csum_type		0 (crc32c)
csum_size		4
csum			0x14973811 [match]
bytenr			65536
flags			0x1000000001
			( WRITTEN |
			  CHANGING_FSID_V2 )
magic			_BHRfS_M [match]

While for a filesystem under checksum conversion it should have either
CHANGING_DATA_CSUM or CHANGING_META_CSUM.

[CAUSE]
It turns out that, due to btrfs-progs keeps its own extra flags inside
its own ctree.h headers, not the shared uapi headers, we have
conflicting super flags:

kernel-shared/uapi/btrfs_tree.h:#define BTRFS_SUPER_FLAG_METADUMP_V2	(1ULL << 34)
kernel-shared/uapi/btrfs_tree.h:#define BTRFS_SUPER_FLAG_CHANGING_FSID	(1ULL << 35)
kernel-shared/uapi/btrfs_tree.h:#define BTRFS_SUPER_FLAG_CHANGING_FSID_V2 (1ULL << 36)
kernel-shared/ctree.h:#define BTRFS_SUPER_FLAG_CHANGING_DATA_CSUM	(1ULL << 36)
kernel-shared/ctree.h:#define BTRFS_SUPER_FLAG_CHANGING_META_CSUM	(1ULL << 37)

Note that CHANGING_FSID_V2 is conflicting with CHANGING_DATA_CSUM.

[FIX]
Cross port the proper updated uapi headers into btrfs-progs, and remove
the definition from ctree.h.

This would change the value for CHANGING_DATA_CSUM and
CHANGING_META_CSUM, but considering they are experimental features, and
kernel would reject them anyway, the damage is not that huge and we can
accept such change before exposing it to end users.

Pull-request: #810
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-06-24 19:17:49 +02:00
Qu Wenruo
0eeb12aef5 btrfs-progs: error out immediately if an unknown backref type is found
There is a bug report that for fuzzed image
bko-155621-bad-block-group-offset.raw, "btrfs check --mode=lowmem
--repair" would lead to an endless loop.

Unlike original mode, lowmem mode relies on the backref walk to properly
go through each root, but unfortunately inside __add_inline_refs() we
doesn't handle unknown backref types correctly, causing it never moving
forward thus deadloop.

Fix it by erroring out to prevent an endless loop.

Issue: #788
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-06-05 19:48:04 +02:00
Qu Wenruo
cef75dde63 btrfs-progs: print-tree: do sanity checks for dir items
There is a bug report that with UBSAN enabled, fuzz/006 test case
crashes.

It turns out that the image bko-154021-invalid-drop-level.raw has
invalid dir items, that the name/data len is beyond the item.

And if we try to read beyond the eb boundary, UBSAN got triggered.

Normally in kernel tree-checker would reject such metadata in the first
place, but in btrfs-progs we can not be that strict or we cannot do a
lot of repair.

So here just enhance print_dir_item() to do extra sanity checks for
data/name len before reading the contents.

Issue: #805
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-06-05 19:46:41 +02:00
Qu Wenruo
6ad89f67a9 btrfs-progs: print-tree: add support for dev-replace item
This is inspired by a recent bug that csum change doesn't detect
finished dev-replace.

At the time of that csum change patch, there is no print-tree to
show the content of btrfs_dev_replace_item thus contributes to the bug.

Add the new output for btrfs_dev_replace_item, and the example looks
like this:

	item 1 key (0 DEV_REPLACE 0) itemoff 16171 itemsize 72
		src devid -1 cursor left 1179648000 cursor right 1179648000 mode ALWAYS
		state FINISHED write errors 0 uncorrectable read errors 0
		start time 1717282771 (2024-06-02 08:29:31)
		stop time 1717282771 (2024-06-02 08:29:31)

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-06-03 21:41:27 +02:00
Naohiro Aota
edd80fbde3 btrfs-progs: support byte length for zone resetting
Even with "mkfs.btrfs -b", mkfs.btrfs resets all the zones on the device.
Limit the reset target within the specified length.

Also, we need to check that there is no active zone outside of the FS
range. Having an active zone outside FS reduces the number of zones btrfs
can write simultaneously. Technically, we can still scan all the device
zones and keep active zones outside FS intact and try to live with the
limited active zones. But, that will make btrfs operations harder.

It is generally bad idea to use "-b" on a non-test usage on a device with
active zone limit in the first place. You really need to take care that FS
and outside the FS goes over the limit. That means you'll never be able to
use zones outside the FS anyway.

So, until there is a strong request for that, I don't think it's worthwhile
to do so.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-06-03 21:26:39 +02:00
Qu Wenruo
cae94956d9 btrfs-progs: dump-tree: support simple quota mode status flags
[BUG]
For simple quota mode btrfs, dump tree does not show the extra flags
correctly:

 # mkfs.btrfs -f -O squota $dev
 # btrfs inspect dump-tree -t quota $dev | grep QGROUP_STATUS -A1
	item 0 key (0 QGROUP_STATUS 0) itemoff 16243 itemsize 40
		version 1 generation 10 flags ON scan 0 enable_gen 7

Note just ON is shown, but squota has one extra bit set for it.

[CAUSE]
Just no support for the new flag.

[FIX]
Add the new flag support, also to be consistent with other flags string
output, add output for extra unknown flags.

With a hand crafted image, the output with unknown flags looks like
this:
	item 0 key (0 QGROUP_STATUS 0) itemoff 16243 itemsize 40
		version 1 generation 10 flags ON|SIMPLE_MODE|UNKNOWN(0xf00) scan 0 enable_gen 7

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-05-10 15:20:16 +02:00
David Sterba
7f396f5ced btrfs-progs: reorder key initializations
Use the objectid, type, offset natural order as it's more readable and
we're used to read keys like that.

Signed-off-by: David Sterba <dsterba@suse.com>
2024-04-30 21:49:15 +02:00
David Sterba
12bdb72d50 btrfs-progs: print-tree: fix deref before check in btrfs_print_tree()
Reported by 'gcc -fanalyzer':
kernel-shared/print-tree.c:1745:12: warning: check of ‘eb’ for NULL after already dereferencing it [-Wanalyzer-deref-before-check]

The fs_info is initialized before we check 'eb' but we always get a
valid one so no need to validate it.

Signed-off-by: David Sterba <dsterba@suse.com>
2024-04-18 19:16:15 +02:00
David Sterba
844caf8639 btrfs-progs: fix double free on error in read_raid56()
Reported by 'gcc -fanalyzer':
kernel-shared/extent_io.c: In function ‘read_raid56’:
./include/kerncompat.h:393:18: warning: dereference of NULL ‘pointers’ [CWE-476] [-Wanalyzer-null-dereference]

After allocation of the pointers array fails it's dereferenced in the
exit block. We can return immediately instead.

Signed-off-by: David Sterba <dsterba@suse.com>
2024-04-18 19:16:15 +02:00
Boris Burkov
682f676eb3 btrfs-progs: enable send v3 correctly (use EXPERIMENTAL instead of CONFIG_BTRFS_DEBUG)
The send v3 protocol is enabled in kernel by a different config option
than in btrfs-progs to actually work. Now v3 can be tested when
configured and built with --enable-experimental.

Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-18 23:19:52 +01:00
David Sterba
2edd439617 btrfs-progs: use unsigned types for bit shifts
Bit shifts should be done on unsigned type as a matter of good practice
to avoid any problems with bit overflowing to the sign bit.

Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-12 22:05:09 +01:00
David Sterba
1c551e22cf btrfs-progs: make all parameters of rb_tree search/insert const
Tree comparators never change parameters, make them all const and also
change the rb-tree prototypes.

Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-12 21:43:54 +01:00
David Sterba
a80d717db2 btrfs-progs: minor source sync with kernel 6.8
Sync a few more file on the source level with kernel 6.8.

- type cleanups
- defines and enums
- comments
- parameter updates
- error handling

Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-12 21:22:56 +01:00
David Sterba
bec6bc8eee btrfs-progs: minor source sync with kernel 6.8-rc3
Sync a few more file on the source level with kernel 6.8-rc3, no
functional changes.

Signed-off-by: David Sterba <dsterba@suse.com>
2024-02-08 09:30:16 +01:00
Qu Wenruo
e54514aaea btrfs-progs: fix stray fd close in open_ctree_fs_info()
[BUG]
Although commit b2a1be83b8 ("btrfs-progs: mkfs: keep file descriptors
open during whole time") is making sure we're only closing the writeable
fds after the fs is properly created, there is still a missing fd not
following the requirement.

And this explains the issue why sometimes after mkfs.btrfs, lsblk still
doesn't give a valid uuid.

Shown by the strace output (the command is "mkfs.btrfs -f
/dev/test/scratch1"):

  openat(AT_FDCWD, "/dev/test/scratch1", O_RDWR) = 5 <<< Writeable open
  fadvise64(5, 0, 0, POSIX_FADV_DONTNEED) = 0
  sysinfo({uptime=2529, loads=[8704, 6272, 2496], totalram=4104548352, freeram=3376611328, sharedram=9211904, bufferram=43016192, totalswap=3221221376, freeswap=3221221376, procs=190, totalhigh=0, freehigh=0, mem_unit=1}) = 0
  lseek(5, 0, SEEK_END)                   = 10737418240
  lseek(5, 0, SEEK_SET)                   = 0
  ......
  close(5)                                = 0 <<< Closed now
  pwrite64(6, "O\250\22\261\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 16384, 1163264) = 16384
  pwrite64(6, "\201\316\272\342\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 16384, 1179648) = 16384
  pwrite64(6, "K}S\t\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 16384, 1196032) = 16384
  pwrite64(6, "\207j$\265\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 16384, 1212416) = 16384
  pwrite64(6, "q\267;\336\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 16384, 5242880) = 16384
  fsync(6) <<< But we're still writing into the disk.

[CAUSE]
After more digging, it turns out we have a very obvious escape in
open_ctree_fs_info():

	open_ctree_fs_info()
	|- fp = open(oca->filename, flags);
	|- info = __open_ctree_fd();
	|- close(fp);

As later we only do IO using the device fd, this close() seems fine.

But the truth is, for mkfs usage, this fs_info is a temporary one, with
a special magic number for the disk.  And since mkfs is doing writeable
operations, this close() would immediately trigger udev scan.

And since at this stage, the fs is not yet fully created, udev can race
with mkfs, and may get the invalid temporary superblock.

[FIX]
Introduce a new btrfs_fs_info member, initial_fd, for
open_ctree_fs_info() to record the fd.

And on close_ctree(), if we find fs_info::initial_fd is a valid fd, then
close it.

By this, we make sure all writeable fds are only closed after we have
written valid super blocks into the disk.

Issue: #734
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-02-08 08:30:37 +01:00
Qu Wenruo
389c959d6d btrfs-progs: implement arg_strtou64_with_suffix() with a new helper
This patch introduces a new parser helper, parse_u64_with_suffix(),
which has a better error handling, following all the parse_*()
helpers to return non-zero value for errors.

This new helper is going to replace parse_size_from_string(), which
would directly call exit(1) to stop the whole program.

Furthermore most callers of parse_size_from_string() are expecting
exit(1) for error, so that they can skip the error handling.

For those call sites, introduce a wrapper, arg_strtou64_with_suffix(),
to do that.  The only disadvantage is a little less detailed error
report for why the parse failed, but for most cases the generic error
string should be enough.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-01-18 02:14:23 +01:00
Qu Wenruo
94ace90508 btrfs-progs: tree-checker: dump the tree block when hitting an error
Unlike kernel where tree-checker would provide enough info so later we
can use "btrfs inspect dump-tree" to catch the offending tree block, in
progs we may not even have a btrfs to start "btrfs inspect dump-tree".
E.g during btrfs-convert.

To make later debuging easier, let's call btrfs_print_tree() for every
error we hit inside tree-checker.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-01-12 16:34:44 +01:00
Boris Burkov
d7c6baf82f btrfs-progs: make OWNER_REF_KEY type value smallest among inline refs
Companion patch to progs for the same change in the kernel. Inline refs
are expected to have non-decreasing type value but owner ref violated
this and got away with it via special parsing. Fix the inconsistency
while it is still experimental.

Link: https://lore.kernel.org/linux-btrfs/20231103134547.GA3548732@perftesting/T/#mca2c0e21ecb7a0da616dd09980b9f008c3c00f63
Signed-off-by: Boris Burkov <boris@bur.io>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-11-09 15:24:46 +01:00
Sergei Trofimovich
8a687cf954 kernel-shared: uapi: fix BTRFS_IOC_SCAN_DEV defiintion
Without the change `BTRFS_IOC_SCAN_DEV` aliased with `BTRFS_IOC_FORGET_DEV`.
It's a regression introduced in fcd9142b6 "btrfs-progs: docs: formatting,
fixups, updates".

It manifests as a sudden device disappearance when device is scanned:

    machine # [    4.095032] Btrfs loaded, crc32c=crc32c-intel, zoned=no, fsverity=no
    machine # ERROR: device scan failed on '/dev/vdb': No such file or directory
    machine # ERROR: device scan failed on '/dev/vdc': No such file or directory
    (finished: must succeed: mkfs.btrfs -d raid0 /dev/vdb /dev/vdc, in 10.31 seconds)

Issue: #704
Pull-request: #706
Reported-by: Atemu <atemu.main@gmail.com>
Bug: https://github.com/NixOS/nixpkgs/issues/265668
Author: Sergei Trofimovich <slyich@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-11-05 23:10:28 +01:00
David Sterba
fcd9142b67 btrfs-progs: docs: formatting, fixups, updates
- update Status page
- new features in 6.7
- more ioctls
- CSS fix to wrap long lines in tables

[ci skip]

Signed-off-by: David Sterba <dsterba@suse.com>
2023-11-03 18:04:37 +01:00
David Sterba
d739e3b73a btrfs-progs: kernel-shared: use kmalloc and kfree
All the code in kernel-shared should use the proper memory allocation
helpers.

Signed-off-by: David Sterba <dsterba@suse.com>
2023-11-03 18:04:37 +01:00
David Sterba
b4f43d72ff btrfs-progs: mkfs: support parametric zone size
In experimental build, read global '--param zone-size=SIZE' and use it
as emulated zone size.  This is for testing only, will be promoted to a
proper option in the future.

Signed-off-by: David Sterba <dsterba@suse.com>
2023-11-03 18:04:37 +01:00
Qu Wenruo
ad8a831a74 btrfs-progs: dump-tree: output the sequence number for inline references
Commit 6cf11f3e38 ("btrfs-progs: check: check order of inline extent
refs") fixes a problem that btrfs check never properly verify the
sequence of inline references.

It's not obvious because by default kernel handles EXTENT_DATA_REF_KEY
using its own hash, resulting some seemingly out-of-order result:

	item 0 key (13631488 EXTENT_ITEM 4096) itemoff 16143 itemsize 140
		refs 4 gen 7 flags DATA
		extent data backref root FS_TREE objectid 258 offset 0 count 1
		extent data backref root FS_TREE objectid 257 offset 0 count 1
		extent data backref root FS_TREE objectid 260 offset 0 count 1
		extent data backref root FS_TREE objectid 259 offset 0 count 1

By a quick glance, no one can see the above inline backref items are in
any order.

To make such sequence more obvious, let dump-tree to output a new prefix
to indicate the type and the internal sequence number:

For above case, the new output would look like this:

        item 0 key (13631488 EXTENT_ITEM 4096) itemoff 16143 itemsize 140
                refs 4 gen 7 flags DATA
                (178 0xdfb591fbbf5f519) extent data backref root FS_TREE objectid 258 offset 0 count 1
                (178 0xdfb591fa80d95ea) extent data backref root FS_TREE objectid 257 offset 0 count 1
                (178 0xdfb591f9c0534ff) extent data backref root FS_TREE objectid 260 offset 0 count 1
                (178 0xdfb591f49f9f8e7) extent data backref root FS_TREE objectid 259 offset 0 count 1

Although still not that obvious, it should show the inline data backrefs
has descending sequence number.

For the type part, it's anti-instinctive in ascending order, which is
not that easy to produce.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-23 15:52:12 +02:00
Naohiro Aota
8816a65fec btrfs-progs: zoned: check SB zone existence properly
Currently, write_dev_supers() compares the superblock location vs the size
of the device to check if it can write the superblock. This is not correct
for a zoned device, whose superblock location is different than a regular
device.

Introduce check_sb_location() to check if the superblock zone exists for
the zoned case.

Running btrfs check can fail on a certain zoned device setup (e.g,
zone size = 128MB, device size = 16GB).

From generic/330:

  yes | btrfs check --repair --force /dev/nullb1
  [1/7] checking root items
  Fixed 0 roots.
  [2/7] checking extents
  ERROR: zoned: failed to read zone info of 4096 and 4097: Invalid argument
  ERROR: failed to write super block for devid 1: write error: Input/output error
  failed to write new super block err -5
  failed to repair damaged filesystem, aborting

This happens because write_dev_supers() is comparing the original
superblock location vs the device size to check if it can write out a
superblock copy or not.

For the above example, since the first copy location (64MB) < device size
(16GB), it tries to write out the copy. But, the copy must be written into
zone 4096 (512G / zone size (128M) = 4096), which is out of the device.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-21 15:51:06 +02:00
Naohiro Aota
58148d5209 btrfs-progs: zoned: introduce sb_bytenr_to_sb_zone()
Introduce sb_bytenr_to_sb_zone(), which converts the original superblock
location to the zone number of superblock log writing.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-17 19:34:00 +02:00
David Sterba
b421fdff95 btrfs-progs: move raid-stripe-tree and squota build out of experimental
The kernel patches for RST and squota are queued for 6.7, we need to be
able to test the features so it's not necessary to hide the mkfs support
under experimental build. The kernel may still need debug build to
enable mount.

Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-17 19:33:59 +02:00
Qu Wenruo
e79f18a4a7 btrfs-progs: introduce a basic metadata free space reservation check
Unlike kernel, in btrfs-progs btrfs_start_transaction() never checks if
there is enough metadata space.

This can lead to very dangerous situation where there is no metadata
space left at all, deadlocking future tree operations.

This patch introduces a very basic version of metadata/system free space
check by:

- Check if there is enough metadata/system space left
  If there is enough, go as usual.

- If there is not enough space left, try allocating a new chunk

- Recheck if the new space can meet our demand
  If not, return ERR_PTR(-ENOSPC).
  Otherwise, allocate a new trans handle to the caller.

This is possible thanks to the simplified transaction model in
btrfs-progs:

- We don't allow joining a transaction
  This means we don't need to handle complex cases like data ordered
  extents, which need to reserve space first, then join the current
  transaction and use the reserved blocks.

- We don't allow multiple transaction handles for one transaction
  Since btrfs-progs is single threaded, we always start a transaction
  and then commit it.

However there is a feature that must be an exception for the new
metadata/system free space check:

- btrfs check --init-extent-tree
  As all the meta/system free space check is based on the space info,
  which is loaded from block group items.
  Thus when rebuilding extent tree, we can no longer have an accurate
  view, thus we have to disable the feature for the whole execution if
  we're rebuilding the extent tree.

For now, there is no regression exposed during the self tests, but I
really hope this can be an extra safety net to prevent causing ENOSPC
deadlock in btrfs-progs.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-17 19:33:59 +02:00
Qu Wenruo
930c6362d1 btrfs-progs: fix all variable shadowing
There are quite some variable shadowing in btrfs-progs, most of them are
just reusing some common names like tmp.
And those are quite safe and the shadowed one are even different type.

But there are some exceptions:

- @end in traverse_tree_blocks()
  There is already an @end with the same type, but a different meaning
  (the end of the current extent buffer passed in).
  Just rename it to @child_end.

- @start in generate_new_data_csums_range()
  Just rename it to @csum_start.

- @size of fixup_chunk_tree_block()
  This one is particularly bad, we declare a local @size and initialize
  it to -1, then before we really utilize the variable @size, we
  immediately reset it to 0, then pass it to logical_to_physical().
  Then there is a location to check if @size is -1, which will always be
  true.

  According to the code in logical_to_physical(), @size would be clamped
  down by its original value, thus our local @size will always be 0.

  This patch would rename the local @size to @found_size, and only set
  it to -1.
  The call site is only to pass something as logical_to_physical()
  requires a non-NULL pointer.
  We don't really need to bother the returned value.

- duplicated @ref declaration in run_delayed_tree_ref()
- duplicated @super_flags in change_meta_csums()
  Just delete the duplicated one.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-10 19:16:29 +02:00
Qu Wenruo
c788977878 btrfs-progs: pull in the full max/min/clamp implementation from kernel
The current implementation would introduce variable shadowing due to
both max() and min() are using the same __x and __y.

This may not be a big deal, but since kernel is already handling it
properly using __UNIQUE_ID() macro, and has more checks, we can
cross-port the kernel version to btrfs-progs.

There are some dependency needed, they are all small enough thus can be
put into the helper.

- __PASTE()
- __UNIQUE_ID()
- BUILD_BUG_ON_ZERO()
- __is_constexpr()

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-10 19:16:29 +02:00
Johannes Thumshirn
dfc866bfef btrfs-progs: remove stride length from on-disk format
The stride length has been removed from kernel code, remove it here as
well.

Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-06 17:26:56 +02:00
Johannes Thumshirn
4acadd1d42 btrfs-progs: remove stride length from tree dump
The length has been removed from kernel, remove it here as well.

Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-06 17:26:54 +02:00
Josef Bacik
2a34fd3892 btrfs-progs: cleanup dirty buffers on transaction abort
When adding the extent buffer leak detection I started getting failures
on some of the fuzz tests.  This is because we don't clean up dirty
buffers for aborted transactions, we just leave them dirty and thus we
leak them.  Fix this up by making btrfs_commit_transaction() on an
aborted transaction properly cleanup the dirty buffers that exist in the
system.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-03 01:11:57 +02:00
David Sterba
21aa6777b2 btrfs-progs: clean up includes, using include-what-you-use
Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-03 01:11:57 +02:00
David Sterba
d4cf2a3b4c btrfs-progs: kernel-shared: sync delayed-refs.[ch]
Update parts of struct btrfs_delayed_ref_head and updated where used,
add more prototypes. More still needs to be synced.

Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-03 01:11:57 +02:00
Josef Bacik
47dc6bd8a9 btrfs-progs: update btrfs_split_item to match the in-kernel definition
In the kernel new_key is const, update the definition in btrfs-progs to
match the in-kernel definition.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-03 01:11:57 +02:00
Josef Bacik
692b68110e btrfs-progs: inline btrfs_name_hash and btrfs_extref_hash
This is the opposite of what we do in the kernel, however in the kernel
we put the helpers in dir-item.h and inode-item.h respectively.  Those
do not exist in btrfs-progs right now, so instead of doing all that work
right now simply inline them in ctree.h to make it easier to sync
ctree.c from the kernel.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-03 01:11:57 +02:00
Josef Bacik
f4e16e0238 btrfs-progs: update read_tree_block to take a btrfs_parent_tree_check
In the kernel we've added a control struct to handle the different
checks we want to do on extent buffers when we read them.  Update our
copy of read_tree_block to take this as an argument, then update all of
the callers to use the new structure.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-03 01:11:57 +02:00
Josef Bacik
a38570f9d6 btrfs-progs: use btrfs_tree_parent_check for btrfs_read_extent_buffer
In the kernel we have a control structure call btrfs_tree_parent_check
to pass around the various sanity checks we have for extent buffers.
Add this to btrfs_tree_parent_check and then update the callers.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-03 01:11:57 +02:00
Josef Bacik
909bde13f4 btrfs-progs: update btrfs_leaf_free_space to match the kernel
In the kernel we have const struct extent_buffer instead of struct
extent_buffer, update this to make it more straightforward to sync
ctree.c.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-03 01:11:56 +02:00
Josef Bacik
fbaff3b121 btrfs-progs: update btrfs_insert_item to match the kernel
This has const struct btrfs_key instead of just struct btrfs_key, update
this to make it more straightforward to sync ctree.c.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-03 01:11:56 +02:00