The variable @root is only set but not utilized, while we only utilize
@root1.
Replace @root1 with @root, then remove the @root1.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Allow repair_qgroups() to do silent repair, so it can acts as offline
qgroup rescan.
This provides the basis for later mkfs quota support.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The original qgroup-verify integrates qgroup classification into
report_qgroups(). This behavior makes silent qgroup repair (or offline
rescan) impossible.
To repair qgroup, we must call report_qgroups() to trigger bad qgroup
classification, which will output error message.
This patch moves bad qgroup classification from report_qgroups() to
qgroup_verify_all(). Now report_qgroups() is pretty lightweight, only
doing basic qgroup difference report thus change it type to void.
And since the functionality of qgroup_verify_all() changes, change
callers to handle the new return value correctly.
Signed-off-by: Qu Wenruo <wqu@suse.com>
[ removed some comments ]
Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
In fact qgroup-verify is just kind of offline qgroup rescan, and later
mkfs qgroup support will reuse it.
So qgroup-verify doesn't really need to rely the global variable @repair
to check if it should repair qgroups.
Instead check fs_info->readonly to do the repair.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Current kernel only supports qgroup version 1. Make qgroup-verify to
follow this standard.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Commit 078e9a1cc9 ("btrfs-progs: check: enhanced progress indicator")
introduced @qgroup_item_count for progress indicator.
However since we will later introduce silent qgroup rescan
functionality, the @qgroup_item_count pointer can be NULL.
So check if @qgroup_item_count is NULL before accessing it.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
There is a report about checksum item overlap, which makes newer btrfs
kernel to reject it due to tree-checker.
Now let btrfs-progs have the same ability to detect such problem.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
In check_chunk_item() we search extent tree for block group item.
Refactor this part into a separate function, find_block_group_item(), so
that later skinny-bg-tree feature can reuse it.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This would sync the code between kernel and btrfs-progs, and save at
least 1 byte for each btrfs_block_group_cache.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The whole maybe_repair_root_item() and repair_root_items() functions are
introduced to handle an ancient bug in v3.17.
However in certain extent tree corruption case, such early exit would
only exit the whole check process early on, preventing user to know
what's really wrong about the fs.
So this patch will allow the check to continue, since the ancient bug is
no long that common.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We can easily get the level from @eb parameter, thus the level is not
needed.
This is inspired by the work of Marek in U-boot.
Cc: Marek Behun <marek.behun@nic.cz>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
If we have memory allocation failure in add_cache_extent(), it will
simply exit with one error message.
That's definitely not proper, especially when all but one call sites
have handled the error.
This patch will return -ENOMEM for add_cache_extent(), and fix the only
call site which doesn't handle error from it.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
btrfs check can return strange return value for shell:
[Inferior 1 (process 48641) exited with code 0213]
^^^^
[CAUSE]
It's caused by the incorrect handling of qgroup error.
qgroup_report_ret can be -117 (-EUCLEAN), using that value with exit()
can cause overflow, causing return value not properly recognized.
[FIX]
Fix it by sanitize the return value to 0 or 1.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Reviewed-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Signed-off-by: Adam Borowski <kilobyte@angband.pl>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
Valgrind reports the following error for fsck/002 (which only supports
original mode):
==97088== Conditional jump or move depends on uninitialised value(s)
==97088== at 0x15BFF6: add_data_backref (main.c:4884)
==97088== by 0x16025C: run_next_block (main.c:6452)
==97088== by 0x165539: deal_root_from_list (main.c:8471)
==97088== by 0x166040: check_chunks_and_extents (main.c:8753)
==97088== by 0x166441: do_check_chunks_and_extents (main.c:8842)
==97088== by 0x169D13: cmd_check (main.c:10324)
==97088== by 0x11CDC6: cmd_execute (commands.h:125)
==97088== by 0x11D712: main (btrfs.c:386)
[CAUSE]
In alloc_data_backref(), only ref->node is set to 0.
While ref->disk_bytenr is not initialized at all.
And then in add_data_backref(), if @back is a newly allocated data
backref, we use the garbage from back->disk_bytenr to determine if we
should reset them.
[FIX]
Fix it by initialize the whole data_backref structure in
alloc_data_backref().
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
With valgrind, fsck/002 test with original mode would report the
following valgrind error:
==90600== Conditional jump or move depends on uninitialised value(s)
==90600== at 0x15C280: pick_next_pending (main.c:4949)
==90600== by 0x15F3CF: run_next_block (main.c:6175)
==90600== by 0x1655CC: deal_root_from_list (main.c:8486)
==90600== by 0x1660C7: check_chunks_and_extents (main.c:8762)
==90600== by 0x166439: do_check_chunks_and_extents (main.c:8842)
==90600== by 0x169D0B: cmd_check (main.c:10324)
==90600== by 0x11CDC6: cmd_execute (commands.h:125)
==90600== by 0x11D712: main (btrfs.c:386)
[CAUSE]
The problem happens like this:
deal_root_from_list(@list is empty)
|- stack @last is not initialized
|- while(!list_empty(list)) {} is skipped
|- run_next_block(&last);
|- pick_next_pending(*last);
|- node_start = last;
Since the stack @last is not initialized in deal_root_from_list(), the
final node_start = last assignment would just fetch the garbage from
stack.
[FIX]
Fix the problem by initializing @last to 0, as that's exactly what the
first while loop did.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
There are some reports on fsck/001 test segfault failure with lowmem mode.
While I failed to reproduce it, valgrind still catches it with the
following output:
Delete backref in extent [12845056 1048576]
ERROR: file extent [257, 0] has unaligned disk bytenr: 755944791, should be aligned to 4096
ERROR: file extent[257 0] root 5 owner 5 backref lost
Deleted root 5 item[257, 108, 0]
==29080== Conditional jump or move depends on uninitialised value(s)
==29080== at 0x1A81D7: btrfs_release_path (ctree.c:97)
==29080== by 0x192C33: repair_extent_data_item (mode-lowmem.c:3330)
==29080== by 0x1962FF: check_leaf_items (mode-lowmem.c:4696)
==29080== by 0x196ABF: walk_down_tree (mode-lowmem.c:4858)
==29080== by 0x197762: check_btrfs_root (mode-lowmem.c:5157)
==29080== by 0x198335: check_chunks_and_extents_lowmem (mode-lowmem.c:5450)
==29080== by 0x166414: do_check_chunks_and_extents (main.c:8829)
==29080== by 0x169CF7: cmd_check (main.c:10313)
==29080== by 0x11CDC6: cmd_execute (commands.h:125)
==29080== by 0x11D712: main (btrfs.c:386)
==29080==
[CAUSE]
In repair_extent_data_item() if we find unaligned file extent, we just
delete it and kick in hole punch procedure.
The problem is, file extent deletion is done before initializing @path.
And when the deletion is done without problem, we will goto out tag,
which will release @path, containing uninitialized values, and
triggering segfault.
[FIX]
Don't try to abort trans nor free path if we're going through file
extent deletion routine.
Fixes: 0617bde3bc ("btrfs-progs: lowmem: delete unaligned bytes extent data under repair")
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Since 1d5b2ad9 ("btrfs-progs: qgroup-verify: Don't treat qgroup
difference as error if the fs hasn't initialized a rescan") a new
message is being printed when the qgroups is incosistent and the rescan
hasn't being executed, so remove the later message send to stderr.
While in this function, simplify the check for a not executed rescan
since !counts.rescan_running and counts.rescan_running == 0 means the
same thing.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
If we don't find holes in our hole rb tree we'll just assume there's a
gap from 0 to the length of the file and print that out. But this
simply isn't correct, we could have a gap between the last extent and
the isize, or 0 and the start of the first extent. Fix the error
message to tell us exactly where the hole is.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Lowmem check had the opposite problem of normal check, it caught gaps
that started at 0, but would still fail with my fixes in place. This is
because lowmem check doesn't take into account the isize of the inode.
Address this by making sure we do not complain about gaps that are after
isize. This makes lowmem pass with my fixes applied, and still fail
without my fixes.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When writing my test for the i_size patches, I noticed that I was not
actually failing without my patches as I should have been. This is
because we only check if the inode record extent end is < isize, we
don't check if the inode record extent start is > 0. Add this check to
make sure we're catching holes that start at the beginning of the file.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We are going to touch dirty_bgs in transaction directly, so every call
chain should pass @trans to the leaf functions.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Su Yue <Damenly_Su@gmx.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Much like what we have done in lowmem mode, also detect and report
invalid extent generation in original mode.
Unlike lowmem mode, we have extent_record::generation, which is the
higher number of generations in EXTENT_ITEM, EXTENT_DATA or tree block
header, so there is no need to check generations in different locations.
For repair, we still need to depend on --init-extent-tree, as directly
modifying extent items can easily cause conflicts with delayed refs,
thus it should be avoided.
Reviewed-by: Su Yue <Damenly_Su@gmx.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Since older `btrfs check --init-extent-tree` could cause invalid
EXTENT_ITEM generation for data extents, add such check to lowmem mode
check.
Also add such generation check to file extents too.
For the repair part, I don't have any good idea yet. So affected user
may depend on --init-extent-tree again.
Reviewed-by: Su Yue <Damenly_Su@gmx.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
When using `btrfs check --init-extent-tree`, we will create incorrect
generation number for data extents in extent tree:
item 10 key (13631488 EXTENT_ITEM 1048576) itemoff 15828 itemsize 53
refs 1 gen 0 flags DATA
extent data backref root FS_TREE objectid 257 offset 0 count 1
[CAUSE]
Since data extent generation is not as obvious as tree blocks, which has
header containing its generations, so for data extents, its
extent_record::generation is not really updated, resulting such 0
generation.
[FIX]
To get generation of a data extent, there are two sources we can rely:
- EXTENT_ITEM
There is always a btrfs_extent_item::generation can be utilized.
Although this is not possible for --init-extent-tree use case.
- EXTENT_DATA
We have btrfs_file_extent_item::generation for regular and
preallocated data extents.
Since --init-extent-tree will go through subvolume trees, this would
be the main source for extent data generation.
Then we only need to make add_data_backref() to accept @gen parameter,
and pass it down to extent_record structure.
And for the final extent item generation update, here we add extra
fallback values, if we can't find FILE_EXTENT items.
In that case, we just fall back to current transid.
With this modification, recreated data EXTENT_ITEM now has correct
generation number:
item 10 key (13631488 EXTENT_ITEM 1048576) itemoff 15828 itemsize 53
refs 1 gen 6 flags DATA
extent data backref root FS_TREE objectid 257 offset 0 count 1
Reviewed-by: Su Yue <Damenly_Su@gmx.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
When using `btrfs check --init-extent-tree`, there is a pretty high
chance that the result fs can't pass tree-checker:
BTRFS critical (device dm-3): corrupt leaf: block=5390336 slot=149 extent bytenr=20115456 len=4096 invalid generation, have 16384 expect (0, 360]
BTRFS error (device dm-3): block=5390336 read time tree block corruption detected
BTRFS error (device dm-3): failed to read block groups: -5
BTRFS error (device dm-3): open_ctree failed
[CAUSE]
The result fs has a pretty screwed up EXTENT_ITEMs for data extents:
item 148 key (20111360 EXTENT_ITEM 4096) itemoff 8777 itemsize 53
refs 1 gen 0 flags DATA
extent data backref root FS_TREE objectid 841 offset 0 count 1
item 149 key (20115456 EXTENT_ITEM 4096) itemoff 8724 itemsize 53
refs 1 gen 16384 flags DATA
extent data backref root FS_TREE objectid 906 offset 0 count 1
Kernel tree-checker will accept 0 generation, but that 16384 generation
is definitely going to trigger the alarm.
Looking into the code, it's add_extent_rec_nolookup() allocating a new
extent_rec, but not copying all members from parameter @tmpl, resulting
generation not properly initialized.
[FIX]
Just copy tmpl->generation in add_extent_rec_nolookup(). And since all
call sites have set all members of @tmpl to 0 before
add_extent_rec_nolookup(), we shouldn't get garbage values.
For the 0 generation problem, it will be solved in another patch.
Issue: #225 (Not the initial report, but extent tree rebuild result)
Reviewed-by: Su Yue <Damenly_Su@gmx.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
For certain fuzzed image, `btrfs check` will fail with the following
call trace:
Checking filesystem on issue_213.raw
UUID: 99e50868-0bda-4d89-b0e4-7e8560312ef9
[1/7] checking root items
[2/7] checking extents
Program received signal SIGABRT, Aborted.
0x00007ffff7c88f25 in raise () from /usr/lib/libc.so.6
(gdb) bt
#0 0x00007ffff7c88f25 in raise () from /usr/lib/libc.so.6
#1 0x00007ffff7c72897 in abort () from /usr/lib/libc.so.6
#2 0x00005555555abc3e in run_next_block (...) at check/main.c:6398
#3 0x00005555555b0f36 in deal_root_from_list (...) at check/main.c:8408
#4 0x00005555555b1a3d in check_chunks_and_extents (fs_info=0x5555556a1e30) at check/main.c:8690
#5 0x00005555555b1e3e in do_check_chunks_and_extents (fs_info=0x5555556a1e30) a
#6 0x00005555555b5710 in cmd_check (cmd=0x555555696920 <cmd_struct_check>, argc
#7 0x0000555555568dc7 in cmd_execute (cmd=0x555555696920 <cmd_struct_check>, ar
#8 0x0000555555569713 in main (argc=2, argv=0x7fffffffde70) at btrfs.c:386
[CAUSE]
This fuzzed images has a corrupted EXTENT_DATA item in data reloc tree:
item 1 key (256 EXTENT_DATA 256) itemoff 16111 itemsize 12
generation 0 type 2 (prealloc)
prealloc data disk byte 16777216 nr 0
prealloc data offset 0 nr 0
There are several problems with the item:
- Bad item size
12 is too small.
- Bad key offset
offset of EXTENT_DATA type key represents file offset, which should
always be aligned to sector size (4K in this particular case).
[FIX]
Do extra item size and key offset check for original mode, and remove
the abort() call in run_next_block().
And to show off how robust lowmem mode is, lowmem can handle it without
any hiccup.
With this fix, original mode can detect the problem properly:
Checking filesystem on issue_213.raw
UUID: 99e50868-0bda-4d89-b0e4-7e8560312ef9
[1/7] checking root items
[2/7] checking extents
ERROR: invalid file extent item size, have 12 expect (21, 16283]
ERROR: errors found in extent allocation tree or chunk allocation
[3/7] checking free space cache
[4/7] checking fs roots
root 18446744073709551607 root dir 256 error
root 18446744073709551607 inode 256 errors 62, no orphan item, odd file extent, bad file extent
ERROR: errors found in fs roots
found 131072 bytes used, error(s) found
total csum bytes: 0
total tree bytes: 131072
total fs tree bytes: 32768
total extent tree bytes: 16384
btree space waste bytes: 124774
file data blocks allocated: 0
referenced 0
Issue: #213
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Su Yue <Damenly_Su@gmx.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
When compiling the devel branch with commit fb8f05e40b458
("btrfs-progs: check: Make repair_imode_common() handle inodes in
subvolume trees"), the following warning will be reported:
check/mode-common.c: In function ‘detect_imode’:
check/mode-common.c|1071 col 23| warning: ‘imode’ may be used uninitialized in this function [-Wmaybe-uninitialized]
1071 | *imode_ret = (imode | 0700);
| ~~~~~~~^~~~~~~
This only occurs for regular build. If compiled with D=1, the warning
just disappears.
[CAUSE]
Looks like a bug in gcc optimization.
The code will only set @imode_ret when @found is true.
And for every "found = true" assignment we have assigned @imode.
So this is just a false alert.
[FIX]
I hope I can fix the problem of GCC, but obviously I can't (at least for
now).
So let's assign an initial value 0 to @imode to suppress the false
alert.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The manual page of btrfsck clearly states 'btrfs check --repair' is a
dangerous operation.
Although this warning is in place users do not read the manual page
and/or are used to the behaviour of fsck utilities which repair the
filesystem, and thus potentially cause harm.
Similar to 'btrfs balance' without any filters, add a warning and a
countdown, so users can bail out before eventual corrupting the
filesystem more than it already is.
To override the timeout, let --force skip it and continue.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
We access btrfs_block_group_cache::item mostly for @used and @flags.
@flags is already a dedicated member in btrfs_block_group_cache, only
@used doesn't have a dedicated member.
This patch will remove btrfs_block_group_cache::item and add
btrfs_block_group_cache::used.
It's the btrfs-progs equivalent of the following kernel patches:
btrfs: move block_group_item::used to block group
btrfs: move block_group_item::flags to block group
btrfs: remove embedded block_group_cache::item
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The following functions are just using @root to reach fs_info:
- exclude_super_stripes
- free_excluded_extents
- add_excluded_extent
Refactor them to use fs_info directly.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
There are at least two bug reports of kernel tree-checker complaining
about invalid inode generation.
All offending inodes seem to be caused by old kernel around 2014, with
inode generation overflow.
So add such check and repair ability to lowmem mode check first.
This involves:
- Calculate the inode generation upper limit
Unlike the lowmem mode context, we don't have anyway to determine if
this inode belongs to log tree.
So we use super_generation + 1 as upper limit, just like what we did
in kernel tree checker.
- Check if the inode generation is larger than the upper limit
- Repair by resetting inode generation to current transaction
generation
The difference is, in original mode, we have a common trans handle for
all repair and reset path for each repair.
Reported-by: Charles Wright <charles.v.wright@gmail.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Tested-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>