Some tests don't use the /tmp temporary files and store it locally in
the test directory. To support NFS this needs to be created by a few
commands. To avoid accidental breakage add a convenience helper.
Signed-off-by: David Sterba <dsterba@suse.com>
The preferred order:
- system headers
- standard headers
- libraries
- kernel library
- kernel shared
- common headers
- other tools
- own headers
Signed-off-by: David Sterba <dsterba@suse.com>
All files include the <btrfsutil.h> which could be confused with the
system-wide installation. Drop the -I path from build and use full path
for any libbtrfsutil headers.
Signed-off-by: David Sterba <dsterba@suse.com>
The preferred order:
- system headers
- standard headers
- libraries
- kernel library
- kernel shared
- common headers
- other tools
- own headers
Signed-off-by: David Sterba <dsterba@suse.com>
The size reported as Unallocated in the table was different that the one
in the listing, calculated differently. The values should reflect the
unallocated area available for the filesystem - not necessarily the
total size of the device. If there's such slack space it's reported
separately.
The values in the table mean:
- Unallocated: block device size - slack - allocated
- Total: block device size - slack
- Slack: block device size - filesystem
The new columns make the table wider but the values are deemed to be
important by users and for filesystems with normal profiles it fits
under reasonable line width. During balance or with multiple profiles it
can get wider but this should not be a serious problem.
Example output:
Overall:
Device size: 13.00GiB
Device allocated: 536.00MiB
Device unallocated: 12.48GiB
Device missing: 0.00B
Device slack: 1.00GiB
Used: 2.31MiB
Free (estimated): 12.48GiB (min: 6.24GiB)
Free (statfs, df): 12.48GiB
Data ratio: 1.00
Metadata ratio: 2.00
Global reserve: 3.50MiB (used: 0.00B)
Multiple profiles: no
Data Metadata System
Id Path single DUP DUP Unallocated Total Slack
-- ---------- ------- --------- -------- ----------- -------- -------
1 /dev/loop0 8.00MiB 512.00MiB 16.00MiB 2.48GiB 3.00GiB 1.00GiB
2 /dev/loop1 - - - 10.00GiB 10.00GiB -
-- ---------- ------- --------- -------- ----------- -------- -------
Total 8.00MiB 256.00MiB 8.00MiB 12.48GiB 13.00GiB 1.00GiB
Used 2.00MiB 144.00KiB 16.00KiB
Issue: #508
Pull-request: #509 (partial fix)
Signed-off-by: David Sterba <dsterba@suse.com>
The stream dump escapes the path on which the operation is done but
there are a some that use another path that's the target. A file with
eg. a newline then does not format properly on one line as expected.
Extend the printing helpers to skip printing the newline and then print
the escaped path manually.
Issue: #510
Signed-off-by: David Sterba <dsterba@suse.com>
The preferred order:
- system headers
- standard headers
- libraries
- kernel library
- kernel shared
- common headers
- other tools
- own headers
Signed-off-by: David Sterba <dsterba@suse.com>
The preferred order:
- system headers
- standard headers
- libraries
- kernel library
- kernel shared
- common headers
- other tools
- own headers
Signed-off-by: David Sterba <dsterba@suse.com>
The preferred order:
- system headers
- standard headers
- libraries
- kernel library
- kernel shared
- common headers
- other tools
- own headers
Signed-off-by: David Sterba <dsterba@suse.com>
The preferred order:
- system headers
- standard headers
- libraries
- kernel library
- kernel shared
- common headers
- other tools
- own headers
Signed-off-by: David Sterba <dsterba@suse.com>
The preferred order:
- system headers
- standard headers
- libraries
- kernel library
- kernel shared
- common headers
- other tools
- own headers
Signed-off-by: David Sterba <dsterba@suse.com>
The source dir points to the argv data, we should make a copy to be sure
it won't change due to further processing.
Signed-off-by: David Sterba <dsterba@suse.com>
The helper parse_label is used only once and is trivial. Open code it in
the argument parsing, also to make the exit() is more visible.
Signed-off-by: David Sterba <dsterba@suse.com>
There's a helper to parse profile name and exits on error. As this is a
trivial helper we can open code it and adapt the error message to be
more specific what failed.
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
If using btrfs-corrupt-block to corrupt the generation of a tree
block (in my example, it's csum root), it will cause csum mismatch other
than the expected transid mismatch:
# ./btrfs-corrupt-block --metadata-block 30474240 -f generation \
/dev/test/scratch1
# btrfs check /dev/test/scratch1
Opening filesystem to check...
checksum verify failed on 30474240 wanted 0xb3e8059a found 0xb4a4b45c
checksum verify failed on 30474240 wanted 0xb3e8059a found 0xb4a4b45c
checksum verify failed on 30474240 wanted 0xb3e8059a found 0xb4a4b45c
Csum didn't match
ERROR: could not setup csum tree
ERROR: cannot open file system
[CAUSE]
Inside the switch branch BTRFS_METADATA_BLOCK_GENERATION in
corrupt_metadata_block(), we just set the generation and trigger
write_and_map_eb().
However write_and_map_eb() doesn't re-generate the checksum by itself,
thus we make the victim tree block to have a stale checksum.
[FIX]
Just call csum_tree_block_size() before write_and_map_eb().
Now the corrupted fs have the expected corruption pattern now:
# btrfs check /dev/test/scratch1
Opening filesystem to check...
parent transid verify failed on 30474240 wanted 7 found 11814770867473404344
parent transid verify failed on 30474240 wanted 7 found 11814770867473404344
parent transid verify failed on 30474240 wanted 7 found 11814770867473404344
Ignoring transid failure
...
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The function csum_tree_block() is not really utilized by anyone, all
current callers just use csum_tree_block_size().
Furthermore there is a stale definition in common/utils.h which is using
the old "struct btrfs_root" as the first argument, while we have already
migrated to "struct btrfs_fs_info".
So just unexport csum_tree_block() and remove the stale definition.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Some compilers warn about potentially unused variable, however the value
validity is guarded by have_prev so this can't happen and it's probably
insufficient analysis on the compiler side. Let's initialize the
prev_key to zeros that would also work as the condition.
In file included from /usr/include/stdio.h:894,
from ./kerncompat.h:27,
from ./kernel-lib/list.h:23,
from ./kernel-shared/ctree.h:24,
from kernel-shared/free-space-tree.c:19:
In function ‘fprintf’,
inlined from ‘load_free_space_extents’ at kernel-shared/free-space-tree.c:1446:5,
inlined from ‘load_free_space_tree’ at kernel-shared/free-space-tree.c:1577:9:
/usr/include/bits/stdio2.h:105:10: warning: ‘prev_key.objectid’ may be used uninitialized [-Wmaybe-uninitialized]
105 | return __fprintf_chk (__stream, __USE_FORTIFY_LEVEL - 1, __fmt,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
106 | __va_arg_pack ());
| ~~~~~~~~~~~~~~~~~
kernel-shared/free-space-tree.c: In function ‘load_free_space_tree’:
kernel-shared/free-space-tree.c:1398:31: note: ‘prev_key.objectid’ was declared here
1398 | struct btrfs_key key, prev_key;
Signed-off-by: David Sterba <dsterba@suse.com>
The header dependency rules generated as .o.d files are sometimes stale
and fail the build. Add a rule to clean them if needed, otherwise
they're also cleaned by 'make clean'.
Signed-off-by: David Sterba <dsterba@suse.com>
To be able to run the test suite on NFS the temporary files need to be
writeable for all, root due to send and owner due to the way it's
created.
Signed-off-by: David Sterba <dsterba@suse.com>
The raw number of the features in the list of 'mkfs.btrfs -O list-all'
and for -R is not that useful, it's an implementation detail or can be
put to documentation.
Now looks like:
Filesystem features available:
mixed-bg - mixed data and metadata block groups (compat=2.6.37, safe=2.6.37)
extref - increased hardlink limit per file to 65536 (compat=3.7, safe=3.12, default=3.12)
raid56 - raid56 extended format (compat=3.9)
...
Signed-off-by: David Sterba <dsterba@suse.com>
I noticed a segfault of 'btrfs receive'.
$ gdb
#0 process_clone (path=0x23829d0 "after.s1.txt", offset=0, len=2097152, clone_uuid=<optimized out>,
clone_ctransid=<optimized out>, clone_path=0x2382920 "after.s1.txt", clone_offset=0, user=0x7ffe21985ba0)
at cmds/receive.c:793
793 free(si->path);
(gdb) p si
$1 = (struct subvol_info *) 0xfffffffffffffffe
'si' was an error pointer value. Add the check to make sure we don't
pass such pointer to free().
Signed-off-by: Wang Yugui <wangyugui@e16-tech.com>
Signed-off-by: David Sterba <dsterba@suse.com>
To reduce the test matrix and to follow the kernel behavior, make sure
for block-group-tree feature, we have no-holes and free-space-tree
features enabled.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The new '-b' option will be responsible for converting to block group
tree compat ro feature.
The workflow looks like this for new convert:
- Setting CHANGING_BG_TREE flag
And initialize fs_info->last_converted_bg_bytenr value to (u64)-1.
Any bg with bytenr >= last_converted_bg_bytenr will have its bg item
update go to the new root (bg tree).
- Iterate each block group by their bytenr in descending order
This involves:
* Delete the old bg item from the old tree (extent tree)
* Update last_converted_bg_bytenr to the bytenr of the bg
* Add the new bg item into the new tree (bg tree)
* If we have converted a bunch of bgs, commit current transaction
- Clear CHANGING_BG_TREE flag
And set the new BLOCK_GROUP_TREE compat ro flag and commit.
And since we're doing the convert in multiple transactions, we also need
to resume from last interrupted convert.
In that case, we just grab the last unconverted bg, and start from it.
And to co-operate with the new kernel requirement for both no-holes and
free-space-tree features, the convert tool will check for
free-space-tree feature. If not enabled, will error out with an error
message to how to continue (by mounting with "-o space_cache=v2").
For missing no-holes feature, we just need to set the flag during
convert.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Block group tree feature is completely a standalone feature, and it has
been over 5 years before the initial introduction to solve the long
mount time.
I don't really want to waste another 5 years waiting for a feature which
may or may not work, but definitely not properly reviewed for its
preparation patches.
So this patch will separate the block group tree feature into a
standalone compat RO feature.
There is a catch, in mkfs create_block_group_tree(), current
tree-checker only accepts block group item with valid chunk_objectid,
but the existing code from extent-tree-v2 didn't properly initialize it.
This patch will also fix above mentioned problem so kernel can mount it
correctly.
Now mkfs/fsck should be able to handle the fs with block group tree.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The extent tree v2 (thankfully not yet fully materialized) needs a
new root for storing all block group items.
My initial proposal years ago just added a new tree rootid, and load it
from tree root, just like what we did for quota/free space tree/uuid/extent
roots.
But the extent tree v2 patches introduced a completely new (and to me,
wasteful) way to store block group tree root into super block.
Currently there are only 3 trees stored in super blocks, and they all
have their valid reasons:
- Chunk root
Needed for bootstrap.
- Tree root
Really the entrance of all trees.
- Log root
This is special as log root has to be updated out of existing
transaction mechanism.
There is not even any reason to put block group root into super blocks,
the block group tree is updated at the same timing as old extent tree,
no need for extra bootstrap/out-of-transaction update.
So just move block group root from super block into tree root.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
In mkfs_btrfs(), we have a btrfs_mkfs_block array to store how many tree
blocks we need to reserve for the initial btrfs image.
Currently we have two very similar arrays, extent_tree_v1_blocks and
extent_tree_v2_blocks.
The only difference is just v2 has an extra block for block group tree.
This patch will add two helpers, mkfs_blocks_add() and
mkfs_blocks_remove() to properly add/remove one block dynamically from
the array.
This allows 3 things:
- Merge extent_tree_v1_blocks and extent_tree_v2_blocks into one array
The new array will be the same as extent_tree_v1_blocks.
For extent-tree-v2, we just dynamically add MKFS_BLOCK_GROUP_TREE.
- Remove free space tree block on-demand
This only works for extent-tree-v1 case, as v2 has a hard requirement
on free space tree.
But this still make code much cleaner, not doing any special hacks.
- Allow future expansion without introduce new array
I strongly doubt why this is not properly done in extent-tree-v2
preparation patches.
We should not allow bad practice to sneak in just because it's some
preparation patches for a larger feature.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
In read_tree_block, extent buffer EXTENT_BAD_TRANSID flagged will
be added into fs_info->recow_ebs with an increment of its refs.
The corresponding free_extent_buffer should be called after we
fix transid error by cowing extent buffer then remove them from
fs_info->recow_ebs.
Otherwise, extent buffers will be leaked as fsck-tests/002 reports:
===================================================================
====== RUN CHECK /root/btrfs-progs/btrfs check --repair --force ./default_case.img.restored
parent transid verify failed on 29360128 wanted 9 found 755944791
parent transid verify failed on 29360128 wanted 9 found 755944791
parent transid verify failed on 29360128 wanted 9 found 755944791
Ignoring transid failure
[1/7] checking root items
Fixed 0 roots.
[2/7] checking extents
[3/7] checking free space cache
[4/7] checking fs roots
[5/7] checking only csums items (without verifying data)
[6/7] checking root refs
[7/7] checking quota groups skipped (not enabled on this FS)
extent buffer leak: start 29360128 len 4096
enabling repair mode
===================================================================
Fixes: c64485544b ("Btrfs-progs: keep track of transid failures and fix them if possible")
Signed-off-by: Su Yue <glass@fydeos.io>
Signed-off-by: David Sterba <dsterba@suse.com>
If we found that the underlying block device size is smaller than
total_bytes in dev item, kernel will reject the mount, and there is no
progs tool to fix it.
Under most case it's just a small mismatch, and there is no dev extent
in the shrunk range.
In that case, we can let "btrfs rescue fix-device-size" to reset the
total_bytes in dev items to fix.
We add some extra checks, like to make sure there is no dev extent in
the shrunk device range, to make sure we won't lose data during the
device item shrink.
And also update the test case to verify the repaired fs can pass the
check.
Issue: #504
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Create a filesystem on a file backed loop block device, then shrink the
file (and its loop block device), then make sure btrfs check can detect
such shrunk device.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
There is a bug report that, one btrfs got its underlying device shrunk
accidentally.
Fortunately the user has no data at the truncated range. However kernel
will reject such filesystem, while btrfs-check reports nothing wrong
with it.
This can be easily reproduced by:
# truncate -s 1G test.img
# mkfs.btrfs test.img
# truncate -s 996M test.img
# btrfs check test.img
Opening filesystem to check...
Checking filesystem on test.img
UUID: dbf0a16d-f158-4383-9025-29d7f4c43f17
[1/7] checking root items
[2/7] checking extents
[3/7] checking free space tree
[4/7] checking fs roots
[5/7] checking only csums items (without verifying data)
[6/7] checking root refs
[7/7] checking quota groups skipped (not enabled on this FS)
found 16527360 bytes used, no error found
^^^^^^^^^^^^^^
total csum bytes: 13836
total tree bytes: 2359296
total fs tree bytes: 2162688
total extent tree bytes: 65536
btree space waste bytes: 503569
file data blocks allocated: 14168064
referenced 14168064
[CAUSE]
Btrfs check really only checks the metadata cross references, not really
bothering if the underlying device has correct size. Thus we completely
ignored such size mismatch.
[FIX]
For both regular and lowmem mode, add extra check against the underlying
block device size.
If the block device size is smaller than its total_bytes, gives a error
message and error out.
Now the check looks like this for both modes:
...
[2/7] checking extents
ERROR: block device size is smaller than total_bytes in device item, has 1046478848 expect >= 1073741824
ERROR: errors found in extent allocation tree or chunk allocation
[3/7] checking free space tree
...
found 16527360 bytes used, error(s) found
Issue: #504
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The extent leaks are detected in debug builds but tests/scan-build.sh
does not look for them, so add the match expression.
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
Commit 06b6ad5e01 ("btrfs-progs: check: check for invalid free space
tree entries") makes btrfs check to report eb leakage even on newly
created btrfs:
# mkfs.btrfs -f test.img
# btrfs check test.img
Opening filesystem to check...
Checking filesystem on test.img
UUID: 13c26b6a-3b2c-49b3-94c7-80bcfa4e494b
[1/7] checking root items
[2/7] checking extents
[3/7] checking free space tree
[4/7] checking fs roots
[5/7] checking only csums items (without verifying data)
[6/7] checking root refs
[7/7] checking quota groups skipped (not enabled on this FS)
found 147456 bytes used, no error found
total csum bytes: 0
total tree bytes: 147456
total fs tree bytes: 32768
total extent tree bytes: 16384
btree space waste bytes: 140595
file data blocks allocated: 0
referenced 0
extent buffer leak: start 30572544 len 16384 <<< Extent buffer leakage
[CAUSE]
The patch in mailinglist uses a dynamically allocated path while the
committed one has been converted to on-stack path, which is preferred.
However, the cleanup was not done properly. We only release the path
inside the while loop, no at out label. This means, if we hit error or
even just exhausted free space tree as expected, we will leak the path
to free space tree root.
Thus leading to the above leak report.
[FIX]
Fix the bug by calling btrfs_release_path() at out: label too.
This should make the code behave the same as the patch submitted to the
mailing list.
Fixes: 06b6ad5e01 ("btrfs-progs: check: check for invalid free space tree entries")
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Compile warning:
./kerncompat.h:142: warning: "__bitwise__" redefined
#define __bitwise__
In file included from ./kerncompat.h:35,
from check/qgroup-verify.c:24:
/usr/include/linux/types.h:25: note: this is the location of the previous definition
#define __bitwise__ __bitwise
Because __bitwise__ is already defined in newer kernel-headers
(/usr/include/linux/types.h), so add #ifndef to avoid this warning.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Wang Yugui <wangyugui@e16-tech.com>
Signed-off-by: David Sterba <dsterba@suse.com>
While testing some changes to how we reclaim block groups I started
hitting failures with my TEST_DEV. This occurred because I had a bug
and failed to properly remove a block groups free space tree entries.
However this wasn't caught in testing when it happened because
btrfs check only checks that the free space cache for the existing block
groups is valid, it doesn't check for free space entries that don't have
a corresponding block group.
Fix this by checking for free space entries that don't have a
corresponding block group. Additionally add a test image to validate
this fix.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Function write_data_to_disk() can handle RAID56 writes without any
problem.
So just call write_data_to_disk() inside write_and_map_eb() instead of
manually doing the RAID56 write.
Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>