This involves the following error cases:
- Unable to find the original item
Return -EAGAIN and release the path (which is not done in the original
code)
- Error from split_leaf()
Remove the BUG_ON() and handle the error.
The most common error is ENOSPC.
- Error from kmalloc()
Just handle the error and return -ENOMEM.
Issue: #312
Signed-off-by: Qu Wenruo <wqu@suse.com>
The current print-tree can not handle unsupported inode flags, e.g.
created by Synology's out-of-tree btrfs implementation.
The existing one just checks all the supported flags, and if no flag
hits, it will output "none" no matter if there is any unsupported one.
Fix this by implementing sprint_readable_flag(), which uses the same
handling as print_readable_flag().
For inode flags, add one extra step to output "none" if no flag is set
at all.
Signed-off-by: Qu Wenruo <wqu@suse.com>
This includes:
- Remove the "__" prefix
The "__" prefix is no longer recommended, and there is no function named
print_readable_flag() in the first place.
- Move the supported flags calculation into print_readable_flag()
Since all callers are doing the same work before calling the function.
Signed-off-by: Qu Wenruo <wqu@suse.com>
[BUG]
Sometimes test case btrfs/012 fails randomly, with the failure to read a
symlink:
QA output created by 012
Checking converted btrfs against the original one:
-OK
+readlink: Structure needs cleaning
Checking saved ext2 image against the original one:
OK
Furthermore, this will trigger a kernel error message:
BTRFS critical (device dm-2): regular/prealloc extent found for non-regular inode 133081
[CAUSE]
For that specific inode 133081, the tree dump looks like this:
item 127 key (133081 INODE_ITEM 0) itemoff 40984 itemsize 160
generation 1 transid 1 size 4095 nbytes 4096
block group 0 mode 120777 links 1 uid 0 gid 0 rdev 0
sequence 0 flags 0x0(none)
item 128 key (133081 INODE_REF 133080) itemoff 40972 itemsize 12
index 2 namelen 2 name: l3
item 129 key (133081 EXTENT_DATA 0) itemoff 40919 itemsize 53
generation 4 type 1 (regular)
extent data disk byte 2147483648 nr 38080512
extent data offset 37974016 nr 4096 ram 38080512
extent compression 0 (none)
Note that the symlink inode size is 4095, the maximum size (PATH_MAX
minus the terminating NUL).
But nbytes is 4096, exactly matching the sector size of the filesystem.
This results in the creation of a regular extent, but btrfs does not
accept a symlink with a regular/preallocated extent, so the kernel
rejects the read and the readlink call fails.
The root cause is in the convert code, where for symlinks we always
create a data extent with its size + 1, causing the above problem.
I guess the original code was meant to handle the terminating NUL, but
in btrfs we never need to store the terminating NUL for inline extents
or file names.
Thus this pitfall in btrfs-convert leads to the above invalid data
extent and fails the test case.
[FIX]
- Fix the ext2 and reiserfs symbolic link creation code
To remove the terminating NUL.
- Add extra checks for the size of a symbolic link
Btrfs has extra limits on the size of a symbolic link, as btrfs must
store symbolic link targets as inlined extents.
This means for 4K node sized btrfs, the size limit is smaller than the
usual PATH_MAX - 1 (only around 4000 bytes instead of 4095).
So for certain nodesize, some filesystems can not be converted to
btrfs.
(this should be rare, because the default nodesize is 16K already)
- Split the symbolic link and inline data extent size checks
For symbolic links the real limit is PATH_MAX - 1 (removing the
terminating NUL), but for inline data extents the limit is
sectorsize - 1, which can be different from 4096 - 1 (e.g. 64K sector
size).
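
A minimal sketch of how the two split checks could look. The helper
names, the choice of -ENAMETOOLONG and the @max_inline parameter (the
largest inline extent the current nodesize allows) are assumptions made
for illustration, not the actual convert code:

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

#ifndef PATH_MAX
#define PATH_MAX 4096
#endif

/*
 * Hypothetical helper: @target_len excludes the terminating NUL,
 * @max_inline is the largest inline extent the nodesize allows.
 */
static int check_symlink_size(uint64_t target_len, uint64_t max_inline)
{
	if (target_len > PATH_MAX - 1)
		return -ENAMETOOLONG;
	/* Symlink targets must be inlined, so the nodesize limit applies too. */
	if (target_len > max_inline)
		return -ENAMETOOLONG;
	return 0;
}

/* Hypothetical helper: file data is inlined only if smaller than one sector. */
static bool can_inline_data(uint64_t size, uint32_t sectorsize,
			    uint64_t max_inline)
{
	return size < sectorsize && size <= max_inline;
}
```

With a 4K sector size a 4095 byte target passes both checks on a large
nodesize, while a 4096 byte payload is rejected by either of them.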
Pull-request: #884
Signed-off-by: Qu Wenruo <wqu@suse.com>
Sync up with kernel and fix warnings reported by -Wcast-qual. eg.
Most of the change is due to extent_buffer::data, which is a direct
struct member, unlike in kernel where it's an array of pages. The
const qualifier cannot be used the same way so it's dropped in affected
helpers.
Signed-off-by: David Sterba <dsterba@suse.com>
The modification is minimal:
- Replace WARN_ON() with UASSERT()
- Remove the @trans parameter for btrfs_extend_item() and
btrfs_mark_buffer_dirty()
As the progs version doesn't need a transaction handle.
- Remove the btrfs_uuid_tree_add() in mkfs/main.c
Signed-off-by: Qu Wenruo <wqu@suse.com>
Currently we already have a kernel-shared/uuid-tree.c, which is mostly
shared with kernel.
Kernel also has a uuid-tree.h, but we are still using ctree.h for the
header.
Move all the uuid-tree related definitions to kernel-shared/uuid-tree.h,
making future code sync easier.
Signed-off-by: Qu Wenruo <wqu@suse.com>
btrfs_insert_dir_item wasn't setting the transid field in
btrfs_dir_item. Set it to the current transaction ID rather than writing
uninitialized memory to disk.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Mark Harmstone <maharmstone@fb.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
The function btrfs_mksubvol() is very different between btrfs-progs and
kernel: the progs version just links a subvolume to another directory
inode, while the kernel version creates a completely new subvolume.
Instead of keeping a same-named function, introduce
btrfs_link_subvolume() and use it to replace the old btrfs_mksubvol().
This is done by:
- Introduce btrfs_link_subvolume()
Which does extra checks before doing any modification:
* Make sure the target inode is a directory
* Make sure no filename conflict
Then do the linkage:
* Add the dir_item/dir_index into the parent inode
* Add the forward and backward root refs into tree root
- Introduce link_image_subvolume() helper
Currently btrfs_mksubvol() has a dedicated convert filename retry
behavior, which is unnecessary and should be done by the convert code.
Now move the filename retry behavior into the helper.
- Remove btrfs_mksubvol()
Since there is only one caller utilizing btrfs_mksubvol(), and it's
now gone, we can remove the old btrfs_mksubvol().
Signed-off-by: Qu Wenruo <wqu@suse.com>
Filenames can contain a newline (or other funny characters), which makes
the dump-tree output confusing. The same goes for xattr names or values
that can contain binary data. Encode the special characters in the
C-style ('\e' -> "\e", or \NNN if there's no single letter
representation). This is based on isprint(), as the output is expected
either on a terminal or in a dump file.
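
A minimal sketch of such an encoder, writing into a caller-provided
buffer. The function name, the buffer-based interface and the decision
to escape the backslash itself are assumptions for illustration; '\e' is
written as octal 033 because it is a compiler extension:

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: escape @len bytes of @in into @out (size @outsz >= 1). */
static size_t escape_bytes(const unsigned char *in, size_t len,
			   char *out, size_t outsz)
{
	size_t pos = 0;

	for (size_t i = 0; i < len; i++) {
		unsigned char c = in[i];
		char tmp[5];
		const char *rep;
		size_t n;

		switch (c) {
		case '\a': rep = "\\a"; break;
		case '\b': rep = "\\b"; break;
		case 033:  rep = "\\e"; break;
		case '\f': rep = "\\f"; break;
		case '\n': rep = "\\n"; break;
		case '\r': rep = "\\r"; break;
		case '\t': rep = "\\t"; break;
		case '\v': rep = "\\v"; break;
		case '\\': rep = "\\\\"; break;
		default:
			if (isprint(c)) {
				tmp[0] = (char)c;
				tmp[1] = '\0';
			} else {
				/* No single letter form, use \NNN octal. */
				snprintf(tmp, sizeof(tmp), "\\%03o", (unsigned int)c);
			}
			rep = tmp;
		}
		n = strlen(rep);
		if (pos + n >= outsz)
			break;	/* Stop before overflowing the output. */
		memcpy(out + pos, rep, n);
		pos += n;
	}
	out[pos] = '\0';
	return pos;
}
```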
Issue: #350
Issue: #407
Signed-off-by: David Sterba <dsterba@suse.com>
Remove the unneeded encoding and reserved fields in struct
raid_stripe_extent.
This saves 8 bytes per stripe extent.
Note: this is a format change and previously created filesystems with
raid-stripe-tree will not be accessible. Similar patch is needed in
kernel.
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Use the safe version of strncpy that makes sure the string is
terminated.
To be noted:
- the conversion in scrub path handling was skipped
- sizes of device paths in some ioctl related structures are
BTRFS_DEVICE_PATH_NAME_MAX + 1
Recently gcc 13.3 started to detect problems with our use of strncpy
potentially lacking the null terminator, warnings like:
cmds/inspect.c: In function ‘cmd_inspect_logical_resolve’:
cmds/inspect.c:294:33: warning: ‘__builtin_strncpy’ specified bound 4096 equals destination size [-Wstringop-truncation]
294 | strncpy(mount_path, mounted, PATH_MAX);
| ^
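
For illustration, a terminating copy could be sketched like this. The
name copy_terminated() is hypothetical and not the helper used in the
tree; @n is the full size of the destination buffer:

```c
#include <string.h>

/* Hypothetical helper: copy at most @n - 1 bytes and always terminate. */
static char *copy_terminated(char *dest, const char *src, size_t n)
{
	if (n == 0)
		return dest;
	strncpy(dest, src, n - 1);
	dest[n - 1] = '\0';
	return dest;
}
```

Passing sizeof(mount_path) as @n then keeps the bound equal to the
destination size without risking an unterminated string.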
Signed-off-by: David Sterba <dsterba@suse.com>
Although we already have a pretty good array defined for all
super/compat_ro/incompat flags, we still rely on a manually defined mask
to do the printing.
This can lead to easy de-sync between the definition and the flags.
Change it to automatically iterate through the array to calculate the
flags, and add the remaining super flags.
Pull-request: #810
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
There is a bug report that a canceled checksum conversion (still
experimental feature) resulted in unexpected super flags:
csum_type 0 (crc32c)
csum_size 4
csum 0x14973811 [match]
bytenr 65536
flags 0x1000000001
( WRITTEN |
CHANGING_FSID_V2 )
magic _BHRfS_M [match]
While for a filesystem under checksum conversion it should have either
CHANGING_DATA_CSUM or CHANGING_META_CSUM.
[CAUSE]
It turns out that, because btrfs-progs keeps its own extra flags inside
its own ctree.h header, not the shared uapi headers, we have
conflicting super flags:
kernel-shared/uapi/btrfs_tree.h:#define BTRFS_SUPER_FLAG_METADUMP_V2 (1ULL << 34)
kernel-shared/uapi/btrfs_tree.h:#define BTRFS_SUPER_FLAG_CHANGING_FSID (1ULL << 35)
kernel-shared/uapi/btrfs_tree.h:#define BTRFS_SUPER_FLAG_CHANGING_FSID_V2 (1ULL << 36)
kernel-shared/ctree.h:#define BTRFS_SUPER_FLAG_CHANGING_DATA_CSUM (1ULL << 36)
kernel-shared/ctree.h:#define BTRFS_SUPER_FLAG_CHANGING_META_CSUM (1ULL << 37)
Note that CHANGING_FSID_V2 is conflicting with CHANGING_DATA_CSUM.
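
Such an overlap can be caught mechanically. A small sketch, using the
bit values quoted above and a hypothetical first_conflict() helper that
returns the index of the first definition reusing an already taken bit:

```c
#include <stddef.h>
#include <stdint.h>

struct flag_def {
	const char *name;
	uint64_t bit;
};

/* The definitions from both headers, values as listed above. */
static const struct flag_def old_super_flags[] = {
	{ "METADUMP_V2",        1ULL << 34 },
	{ "CHANGING_FSID",      1ULL << 35 },
	{ "CHANGING_FSID_V2",   1ULL << 36 },
	{ "CHANGING_DATA_CSUM", 1ULL << 36 },
	{ "CHANGING_META_CSUM", 1ULL << 37 },
};

/* Hypothetical helper: index of the first overlapping entry, or -1. */
static int first_conflict(const struct flag_def *defs, size_t nr)
{
	uint64_t seen = 0;

	for (size_t i = 0; i < nr; i++) {
		if (seen & defs[i].bit)
			return (int)i;
		seen |= defs[i].bit;
	}
	return -1;
}
```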
[FIX]
Cross port the proper updated uapi headers into btrfs-progs, and remove
the definition from ctree.h.
This would change the value for CHANGING_DATA_CSUM and
CHANGING_META_CSUM, but considering they are experimental features, and
kernel would reject them anyway, the damage is not that huge and we can
accept such change before exposing it to end users.
Pull-request: #810
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
There is a bug report that for fuzzed image
bko-155621-bad-block-group-offset.raw, "btrfs check --mode=lowmem
--repair" would lead to an endless loop.
Unlike original mode, lowmem mode relies on the backref walk to properly
go through each root, but unfortunately __add_inline_refs() doesn't
handle unknown backref types correctly: it never moves forward,
resulting in an endless loop.
Fix it by erroring out to prevent an endless loop.
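
The pattern can be sketched on a simplified record walk. The record
types, their sizes and the use of -EINVAL are made up for the example
and do not match the on-disk inline ref format:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record types, the first byte of each record is its type. */
enum { REF_TREE_SKETCH = 1, REF_DATA_SKETCH = 2 };

static size_t ref_size_sketch(uint8_t type)
{
	switch (type) {
	case REF_TREE_SKETCH: return 9;
	case REF_DATA_SKETCH: return 29;
	default:              return 0;	/* Unknown type. */
	}
}

static int walk_inline_refs(const uint8_t *buf, size_t len, unsigned int *count)
{
	size_t pos = 0;

	*count = 0;
	while (pos < len) {
		size_t size = ref_size_sketch(buf[pos]);

		/*
		 * An unknown type gives no size to advance by, so looping
		 * on would never terminate. Error out instead.
		 */
		if (size == 0 || size > len - pos)
			return -EINVAL;
		pos += size;
		(*count)++;
	}
	return 0;
}
```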
Issue: #788
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
There is a bug report that with UBSAN enabled, fuzz/006 test case
crashes.
It turns out that the image bko-154021-invalid-drop-level.raw has
invalid dir items, where the name/data len goes beyond the item.
If we try to read beyond the eb boundary, UBSAN is triggered.
Normally in kernel tree-checker would reject such metadata in the first
place, but in btrfs-progs we can not be that strict or we cannot do a
lot of repair.
So here just enhance print_dir_item() to do extra sanity checks for
data/name len before reading the contents.
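
A rough sketch of such a check. The helper and its parameters are
hypothetical: @cur is the offset of the dir item inside the item and
@header_size is the size of the fixed dir item header:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: does header + name + data stay inside the item? */
static bool dir_item_fits(uint32_t item_size, uint32_t cur,
			  uint32_t header_size, uint16_t name_len,
			  uint16_t data_len)
{
	/* Sum in 64 bits so that fuzzed lengths cannot wrap around. */
	uint64_t end = (uint64_t)cur + header_size + name_len + data_len;

	return end <= item_size;
}
```

Only when the check passes would the name and data be read from the
extent buffer; otherwise the printer can report the bad lengths and stop.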
Issue: #805
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This is inspired by a recent bug that csum change doesn't detect
finished dev-replace.
At the time of that csum change patch, print-tree could not show the
content of btrfs_dev_replace_item, which contributed to the bug.
Add the new output for btrfs_dev_replace_item, and the example looks
like this:
item 1 key (0 DEV_REPLACE 0) itemoff 16171 itemsize 72
src devid -1 cursor left 1179648000 cursor right 1179648000 mode ALWAYS
state FINISHED write errors 0 uncorrectable read errors 0
start time 1717282771 (2024-06-02 08:29:31)
stop time 1717282771 (2024-06-02 08:29:31)
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Even with "mkfs.btrfs -b", mkfs.btrfs resets all the zones on the device.
Limit the reset target within the specified length.
Also, we need to check that there is no active zone outside of the FS
range. Having an active zone outside FS reduces the number of zones btrfs
can write simultaneously. Technically, we can still scan all the device
zones and keep active zones outside FS intact and try to live with the
limited active zones. But, that will make btrfs operations harder.
It is generally a bad idea to use "-b" outside of testing on a device with
an active zone limit in the first place. You really need to take care that
the FS and the area outside the FS together do not go over the limit. That
means you'll never be able to use zones outside the FS anyway.
So, until there is a strong request for that, I don't think it's worthwhile
to do so.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
For simple quota mode btrfs, dump tree does not show the extra flags
correctly:
# mkfs.btrfs -f -O squota $dev
# btrfs inspect dump-tree -t quota $dev | grep QGROUP_STATUS -A1
item 0 key (0 QGROUP_STATUS 0) itemoff 16243 itemsize 40
version 1 generation 10 flags ON scan 0 enable_gen 7
Note just ON is shown, but squota has one extra bit set for it.
[CAUSE]
Just no support for the new flag.
[FIX]
Add the new flag support, also to be consistent with other flags string
output, add output for extra unknown flags.
With a hand crafted image, the output with unknown flags looks like
this:
item 0 key (0 QGROUP_STATUS 0) itemoff 16243 itemsize 40
version 1 generation 10 flags ON|SIMPLE_MODE|UNKNOWN(0xf00) scan 0 enable_gen 7
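
The formatting can be sketched as below. The bit positions in the table
are assumptions for the example, and format_flags() is a hypothetical
name, not the print-tree function:

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct flag_name {
	uint64_t bit;
	const char *name;
};

/* Assumed bit assignments, for illustration only. */
static const struct flag_name qgroup_flags_sketch[] = {
	{ 1ULL << 0, "ON" },
	{ 1ULL << 1, "RESCAN" },
	{ 1ULL << 2, "INCONSISTENT" },
	{ 1ULL << 3, "SIMPLE_MODE" },
};

/* Append @s to @out, separated by '|' if @out is not empty. */
static void append_flag(char *out, size_t outsz, const char *s)
{
	size_t used = strlen(out);

	if (used + 1 < outsz)
		snprintf(out + used, outsz - used, "%s%s", used ? "|" : "", s);
}

/* Print known flags by name, then any leftover bits as UNKNOWN(0x...). */
static void format_flags(uint64_t flags, char *out, size_t outsz)
{
	char unknown[32];
	size_t nr = sizeof(qgroup_flags_sketch) / sizeof(qgroup_flags_sketch[0]);

	out[0] = '\0';
	for (size_t i = 0; i < nr; i++) {
		if (flags & qgroup_flags_sketch[i].bit)
			append_flag(out, outsz, qgroup_flags_sketch[i].name);
		flags &= ~qgroup_flags_sketch[i].bit;
	}
	if (flags) {
		snprintf(unknown, sizeof(unknown), "UNKNOWN(0x%llx)",
			 (unsigned long long)flags);
		append_flag(out, outsz, unknown);
	}
}
```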
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Use the objectid, type, offset natural order as it's more readable and
we're used to reading keys like that.
Signed-off-by: David Sterba <dsterba@suse.com>
Reported by 'gcc -fanalyzer':
kernel-shared/print-tree.c:1745:12: warning: check of ‘eb’ for NULL after already dereferencing it [-Wanalyzer-deref-before-check]
The fs_info is initialized before we check 'eb' but we always get a
valid one so no need to validate it.
Signed-off-by: David Sterba <dsterba@suse.com>
Reported by 'gcc -fanalyzer':
kernel-shared/extent_io.c: In function ‘read_raid56’:
./include/kerncompat.h:393:18: warning: dereference of NULL ‘pointers’ [CWE-476] [-Wanalyzer-null-dereference]
After allocation of the pointers array fails it's dereferenced in the
exit block. We can return immediately instead.
Signed-off-by: David Sterba <dsterba@suse.com>
The send v3 protocol is enabled in kernel by a different config option
than in btrfs-progs to actually work. Now v3 can be tested when
configured and built with --enable-experimental.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Signed-off-by: David Sterba <dsterba@suse.com>
Bit shifts should be done on unsigned type as a matter of good practice
to avoid any problems with bit overflowing to the sign bit.
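
As a small illustration (the macro names are made up for the example):
with a 32-bit int, "1 << 31" shifts into the sign bit, which is
undefined behavior in C, while the unsigned variants are well defined:

```c
#include <stdint.h>

/* Hypothetical macros: shift an unsigned one so the top bit is safe to set. */
#define BIT_U32_SKETCH(nr)	(1U << (nr))
#define BIT_U64_SKETCH(nr)	(1ULL << (nr))
```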
Signed-off-by: David Sterba <dsterba@suse.com>
Sync a few more files on the source level with kernel 6.8.
- type cleanups
- defines and enums
- comments
- parameter updates
- error handling
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
Although commit b2a1be83b8 ("btrfs-progs: mkfs: keep file descriptors
open during whole time") is making sure we're only closing the writeable
fds after the fs is properly created, there is still a missing fd not
following the requirement.
And this explains why sometimes after mkfs.btrfs, lsblk still doesn't
give a valid uuid.
Shown by the strace output (the command is "mkfs.btrfs -f
/dev/test/scratch1"):
openat(AT_FDCWD, "/dev/test/scratch1", O_RDWR) = 5 <<< Writeable open
fadvise64(5, 0, 0, POSIX_FADV_DONTNEED) = 0
sysinfo({uptime=2529, loads=[8704, 6272, 2496], totalram=4104548352, freeram=3376611328, sharedram=9211904, bufferram=43016192, totalswap=3221221376, freeswap=3221221376, procs=190, totalhigh=0, freehigh=0, mem_unit=1}) = 0
lseek(5, 0, SEEK_END) = 10737418240
lseek(5, 0, SEEK_SET) = 0
......
close(5) = 0 <<< Closed now
pwrite64(6, "O\250\22\261\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 16384, 1163264) = 16384
pwrite64(6, "\201\316\272\342\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 16384, 1179648) = 16384
pwrite64(6, "K}S\t\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 16384, 1196032) = 16384
pwrite64(6, "\207j$\265\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 16384, 1212416) = 16384
pwrite64(6, "q\267;\336\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 16384, 5242880) = 16384
fsync(6) <<< But we're still writing into the disk.
[CAUSE]
After more digging, it turns out we have a very obvious escape in
open_ctree_fs_info():
open_ctree_fs_info()
|- fp = open(oca->filename, flags);
|- info = __open_ctree_fd();
|- close(fp);
As later we only do IO using the device fd, this close() seems fine.
But the truth is, for mkfs usage, this fs_info is a temporary one, with
a special magic number for the disk. And since mkfs is doing writeable
operations, this close() would immediately trigger udev scan.
And since at this stage, the fs is not yet fully created, udev can race
with mkfs, and may get the invalid temporary superblock.
[FIX]
Introduce a new btrfs_fs_info member, initial_fd, for
open_ctree_fs_info() to record the fd.
And on close_ctree(), if we find fs_info::initial_fd is a valid fd, then
close it.
By this, we make sure all writeable fds are only closed after we have
written valid super blocks into the disk.
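
The idea can be sketched with a stripped down structure. Everything
except the initial_fd member name is hypothetical here, the real
open_ctree_fs_info() and close_ctree() do much more:

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Stand-in for btrfs_fs_info, holding only the member relevant here. */
struct fs_info_sketch {
	int initial_fd;
};

/* Open the device and keep the fd instead of closing it right away. */
static struct fs_info_sketch *open_fs_sketch(const char *path, int flags)
{
	struct fs_info_sketch *info = calloc(1, sizeof(*info));

	if (!info)
		return NULL;
	info->initial_fd = open(path, flags);
	if (info->initial_fd < 0) {
		free(info);
		return NULL;
	}
	return info;
}

/* Close the recorded fd only at teardown, after all writes are done. */
static int close_fs_sketch(struct fs_info_sketch *info)
{
	int ret = 0;

	if (info->initial_fd >= 0)
		ret = close(info->initial_fd);
	free(info);
	return ret;
}
```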
Issue: #734
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This patch introduces a new parser helper, parse_u64_with_suffix(),
which has better error handling, following all the other parse_*()
helpers in returning a non-zero value on error.
This new helper is going to replace parse_size_from_string(), which
would directly call exit(1) to stop the whole program.
Furthermore most callers of parse_size_from_string() are expecting
exit(1) for error, so that they can skip the error handling.
For those call sites, introduce a wrapper, arg_strtou64_with_suffix(),
to do that. The only disadvantage is a little less detailed error
report for why the parse failed, but for most cases the generic error
string should be enough.
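
A sketch of a parser following that convention. The signature, the
accepted suffixes (k/m/g/t/p/e as powers of 1024, case insensitive) and
the specific error codes are assumptions, not the actual helper:

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical parser: returns 0 and sets @result, or a negative errno. */
static int parse_u64_suffix_sketch(const char *str, uint64_t *result)
{
	unsigned long long val;
	char *end;
	int shift;

	while (isspace((unsigned char)*str))
		str++;
	/* strtoull() silently accepts a leading minus, reject it up front. */
	if (*str == '-' || *str == '\0')
		return -EINVAL;

	errno = 0;
	val = strtoull(str, &end, 10);
	if (end == str)
		return -EINVAL;
	if (errno == ERANGE)
		return -ERANGE;

	switch (tolower((unsigned char)*end)) {
	case 'e': shift = 60; break;
	case 'p': shift = 50; break;
	case 't': shift = 40; break;
	case 'g': shift = 30; break;
	case 'm': shift = 20; break;
	case 'k': shift = 10; break;
	case '\0': shift = 0; break;
	default:
		return -EINVAL;
	}
	/* Nothing may follow the suffix. */
	if (*end != '\0' && end[1] != '\0')
		return -EINVAL;
	if (shift && val > (UINT64_MAX >> shift))
		return -ERANGE;

	*result = (uint64_t)val << shift;
	return 0;
}
```

A wrapper in the style of arg_strtou64_with_suffix() would call this,
print a generic error and exit(1) on a non-zero return.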
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Unlike kernel where tree-checker would provide enough info so later we
can use "btrfs inspect dump-tree" to catch the offending tree block, in
progs we may not even have a btrfs to start "btrfs inspect dump-tree".
E.g. during btrfs-convert.
To make later debugging easier, let's call btrfs_print_tree() for every
error we hit inside tree-checker.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Without the change `BTRFS_IOC_SCAN_DEV` aliased with `BTRFS_IOC_FORGET_DEV`.
It's a regression introduced in fcd9142b6 "btrfs-progs: docs: formatting,
fixups, updates".
It manifests as a sudden device disappearance when device is scanned:
machine # [ 4.095032] Btrfs loaded, crc32c=crc32c-intel, zoned=no, fsverity=no
machine # ERROR: device scan failed on '/dev/vdb': No such file or directory
machine # ERROR: device scan failed on '/dev/vdc': No such file or directory
(finished: must succeed: mkfs.btrfs -d raid0 /dev/vdb /dev/vdc, in 10.31 seconds)
Issue: #704
Pull-request: #706
Reported-by: Atemu <atemu.main@gmail.com>
Bug: https://github.com/NixOS/nixpkgs/issues/265668
Author: Sergei Trofimovich <slyich@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
- update Status page
- new features in 6.7
- more ioctls
- CSS fix to wrap long lines in tables
[ci skip]
Signed-off-by: David Sterba <dsterba@suse.com>
In experimental build, read global '--param zone-size=SIZE' and use it
as emulated zone size. This is for testing only, will be promoted to a
proper option in the future.
Signed-off-by: David Sterba <dsterba@suse.com>
Commit 6cf11f3e38 ("btrfs-progs: check: check order of inline extent
refs") fixes a problem that btrfs check never properly verified the
sequence of inline references.
It's not obvious because by default kernel handles EXTENT_DATA_REF_KEY
using its own hash, resulting in some seemingly out-of-order output:
item 0 key (13631488 EXTENT_ITEM 4096) itemoff 16143 itemsize 140
refs 4 gen 7 flags DATA
extent data backref root FS_TREE objectid 258 offset 0 count 1
extent data backref root FS_TREE objectid 257 offset 0 count 1
extent data backref root FS_TREE objectid 260 offset 0 count 1
extent data backref root FS_TREE objectid 259 offset 0 count 1
At a quick glance, no one can tell that the above inline backref items
are in any order.
To make such sequence more obvious, let dump-tree output a new prefix
to indicate the type and the internal sequence number.
For the above case, the new output would look like this:
item 0 key (13631488 EXTENT_ITEM 4096) itemoff 16143 itemsize 140
refs 4 gen 7 flags DATA
(178 0xdfb591fbbf5f519) extent data backref root FS_TREE objectid 258 offset 0 count 1
(178 0xdfb591fa80d95ea) extent data backref root FS_TREE objectid 257 offset 0 count 1
(178 0xdfb591f9c0534ff) extent data backref root FS_TREE objectid 260 offset 0 count 1
(178 0xdfb591f49f9f8e7) extent data backref root FS_TREE objectid 259 offset 0 count 1
Although still not that obvious, it should show that the inline data
backrefs have descending sequence numbers.
For the type part, it's counter-intuitively in ascending order, which is
not that easy to produce.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Currently, write_dev_supers() compares the superblock location vs the size
of the device to check if it can write the superblock. This is not correct
for a zoned device, whose superblock location is different than a regular
device.
Introduce check_sb_location() to check if the superblock zone exists for
the zoned case.
Running btrfs check can fail on a certain zoned device setup (e.g,
zone size = 128MB, device size = 16GB).
From generic/330:
yes | btrfs check --repair --force /dev/nullb1
[1/7] checking root items
Fixed 0 roots.
[2/7] checking extents
ERROR: zoned: failed to read zone info of 4096 and 4097: Invalid argument
ERROR: failed to write super block for devid 1: write error: Input/output error
failed to write new super block err -5
failed to repair damaged filesystem, aborting
This happens because write_dev_supers() is comparing the original
superblock location vs the device size to check if it can write out a
superblock copy or not.
For the above example, since the first copy location (64MB) < device size
(16GB), it tries to write out the copy. But, the copy must be written into
zone 4096 (512G / zone size (128M) = 4096), which is out of the device.
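
The arithmetic can be sketched as follows. The helper names are
hypothetical, and the requirement of two consecutive zones is inferred
from the "zone info of 4096 and 4097" error message above:

```c
#include <stdbool.h>
#include <stdint.h>

/* Zone number that holds the superblock log starting at @log_start. */
static uint64_t sb_log_zone_sketch(uint64_t log_start, uint64_t zone_size)
{
	return log_start / zone_size;
}

/* Hypothetical check: do both superblock log zones exist on the device? */
static bool sb_log_zones_exist(uint64_t log_start, uint64_t zone_size,
			       uint64_t device_size)
{
	uint64_t nr_zones = device_size / zone_size;

	return sb_log_zone_sketch(log_start, zone_size) + 1 < nr_zones;
}
```

For the example above, 512G / 128M gives zone 4096 while a 16G device
only has 128 zones, so the copy has to be skipped.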
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Introduce sb_bytenr_to_sb_zone(), which converts the original superblock
location to the zone number of superblock log writing.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The kernel patches for RST and squota are queued for 6.7, we need to be
able to test the features so it's not necessary to hide the mkfs support
under experimental build. The kernel may still need debug build to
enable mount.
Signed-off-by: David Sterba <dsterba@suse.com>
Unlike kernel, in btrfs-progs btrfs_start_transaction() never checks if
there is enough metadata space.
This can lead to a very dangerous situation where there is no metadata
space left at all, deadlocking future tree operations.
This patch introduces a very basic version of metadata/system free space
check by:
- Check if there is enough metadata/system space left
If there is enough, go as usual.
- If there is not enough space left, try allocating a new chunk
- Recheck if the new space can meet our demand
If not, return ERR_PTR(-ENOSPC).
Otherwise, allocate a new trans handle to the caller.
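
The three steps can be sketched on a toy space model. The structure,
the fixed chunk size and the function names are all hypothetical and
only mirror the order of the checks, not the real space info handling:

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define CHUNK_SIZE_SKETCH	(256ULL << 20)

/* Toy stand-in for a metadata/system space info. */
struct space_sketch {
	uint64_t total;		/* Bytes in allocated chunks. */
	uint64_t used;		/* Bytes already used in those chunks. */
	uint64_t unallocated;	/* Device bytes not yet in any chunk. */
};

static bool has_room(const struct space_sketch *s, uint64_t need)
{
	return s->total - s->used >= need;
}

static int try_alloc_chunk(struct space_sketch *s)
{
	if (s->unallocated < CHUNK_SIZE_SKETCH)
		return -ENOSPC;
	s->unallocated -= CHUNK_SIZE_SKETCH;
	s->total += CHUNK_SIZE_SKETCH;
	return 0;
}

/* Check, try to allocate a chunk, then recheck before handing out a handle. */
static int reserve_for_transaction(struct space_sketch *s, uint64_t need)
{
	if (has_room(s, need))
		return 0;
	/* A failed allocation is caught by the recheck below. */
	(void)try_alloc_chunk(s);
	if (!has_room(s, need))
		return -ENOSPC;
	return 0;
}
```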
This is possible thanks to the simplified transaction model in
btrfs-progs:
- We don't allow joining a transaction
This means we don't need to handle complex cases like data ordered
extents, which need to reserve space first, then join the current
transaction and use the reserved blocks.
- We don't allow multiple transaction handles for one transaction
Since btrfs-progs is single threaded, we always start a transaction
and then commit it.
However there is a feature that must be an exception for the new
metadata/system free space check:
- btrfs check --init-extent-tree
As all the meta/system free space check is based on the space info,
which is loaded from block group items.
Thus when rebuilding extent tree, we can no longer have an accurate
view, thus we have to disable the feature for the whole execution if
we're rebuilding the extent tree.
For now, there is no regression exposed during the self tests, but I
really hope this can be an extra safety net to prevent causing ENOSPC
deadlock in btrfs-progs.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
There is quite some variable shadowing in btrfs-progs, most of it just
reusing common names like tmp.
Those are quite safe, and the shadowed ones even have different types.
But there are some exceptions:
- @end in traverse_tree_blocks()
There is already an @end with the same type, but a different meaning
(the end of the current extent buffer passed in).
Just rename it to @child_end.
- @start in generate_new_data_csums_range()
Just rename it to @csum_start.
- @size of fixup_chunk_tree_block()
This one is particularly bad, we declare a local @size and initialize
it to -1, then before we really utilize the variable @size, we
immediately reset it to 0, then pass it to logical_to_physical().
Then there is a location to check if @size is -1, which will always be
true.
According to the code in logical_to_physical(), @size would be clamped
down by its original value, thus our local @size will always be 0.
This patch would rename the local @size to @found_size, and only set
it to -1.
The call site is only to pass something as logical_to_physical()
requires a non-NULL pointer.
We don't really need to bother the returned value.
- duplicated @ref declaration in run_delayed_tree_ref()
- duplicated @super_flags in change_meta_csums()
Just delete the duplicated one.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The current implementation introduces variable shadowing because both
max() and min() use the same __x and __y.
This may not be a big deal, but since kernel is already handling it
properly using __UNIQUE_ID() macro, and has more checks, we can
cross-port the kernel version to btrfs-progs.
There are some dependencies needed, they are all small enough and thus
can be put into the helper.
- __PASTE()
- __UNIQUE_ID()
- BUILD_BUG_ON_ZERO()
- __is_constexpr()
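
To illustrate the unique name idea only, here is a much simplified
sketch. It relies on GCC/Clang extensions (statement expressions,
__typeof__, __COUNTER__), uses made-up SK_/sk_ names and leaves out the
type and constant expression checks of the kernel version:

```c
/* Two-level paste so that __COUNTER__ is expanded before pasting. */
#define SK_PASTE_(a, b)		a##b
#define SK_PASTE(a, b)		SK_PASTE_(a, b)
#define SK_UNIQUE_ID(prefix)	SK_PASTE(SK_PASTE(sk_uniq_, prefix), __COUNTER__)

/* Evaluate each argument once into a uniquely named temporary. */
#define SK_CMP_ONCE(x, y, ux, uy, op) ({	\
	__typeof__(x) ux = (x);			\
	__typeof__(y) uy = (y);			\
	ux op uy ? ux : uy; })

#define sk_min(x, y)	SK_CMP_ONCE(x, y, SK_UNIQUE_ID(x_), SK_UNIQUE_ID(y_), <)
#define sk_max(x, y)	SK_CMP_ONCE(x, y, SK_UNIQUE_ID(x_), SK_UNIQUE_ID(y_), >)
```

Because every expansion gets fresh temporaries, a nested use such as
sk_max(sk_min(a, b), c) no longer declares the same name twice.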
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The stride length has been removed from kernel code, remove it here as
well.
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The length has been removed from kernel, remove it here as well.
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When adding the extent buffer leak detection I started getting failures
on some of the fuzz tests. This is because we don't clean up dirty
buffers for aborted transactions, we just leave them dirty and thus we
leak them. Fix this up by making btrfs_commit_transaction() on an
aborted transaction properly clean up the dirty buffers that exist in the
system.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>