Convert root->sectorsize/nodesize users in btrfs-corrupt-block.
This provides the basis for further refactoring incorrect btrfs_root
parameters into btrfs_fs_info parameters.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Since the block sizes are now cached in fs_info, there is no need to pass
these sizes to the btrfs_setup_root() function.
Also refactor all root->sector/node/stripesize users in disk-io.c.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
btrfs_fs_info
Just as in the kernel, since we will not support different
leaf/node/stripe sizes per tree, there is no need to store these block
sizes in btrfs_root.
This patch introduces these block size members into the btrfs_fs_info
structure, allowing us to convert such usage in later patches.
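For illustration only, a minimal sketch of the layout this describes. The
structures below are simplified stand-ins, not the real btrfs-progs
definitions, and the accessor name sketch_root_nodesize() is hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in: the real btrfs_fs_info has many more members. */
struct sketch_fs_info {
	uint32_t sectorsize;
	uint32_t nodesize;
	uint32_t stripesize;
};

/* Simplified stand-in: no per-root copies of the block sizes. */
struct sketch_root {
	struct sketch_fs_info *fs_info;
};

/* Hypothetical accessor: every root reads the size through fs_info. */
static uint32_t sketch_root_nodesize(const struct sketch_root *root)
{
	assert(root && root->fs_info);
	return root->fs_info->nodesize;
}
```

Because the sizes live in one place, two roots of the same filesystem can
never disagree about nodesize.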
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Leafsize has been deprecated for a long time, and the kernel has already
updated ctree.h to rename sb->leafsize to sb->__unused_leafsize.
This patch removes the normal users of leafsize:
1) Remove leafsize member from btrfs_root structure
Now only root->nodesize and root->sectorsize remain.
There is no longer a root->leafsize.
2) Remove @leafsize parameter from btrfs_setup_root() function
Since there is no root->leafsize, there is no need for the @leafsize parameter.
The remaining users of leafsize are:
1) btrfs inspect-internal dump-super
Reformat the "leafsize" output to "leafsize (deprecated)" and
use le32_to_cpu() to do the cast manually.
2) mkfs
We still need to set sb->__unused_leafsize to nodesize.
Do the manual cast too.
3) convert
Same as mkfs; these two superblock setups should be merged later.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
In btrfs_check_chunk_valid() we calculate the chunk item size using open
code; use the existing helper btrfs_chunk_item_size() instead.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The difference between btrfs_set_header_flags() and btrfs_set_header_flag()
is similar to that between "=" and "|=". When creating and initialising a new
extent buffer, convert uses the former, which clears header_rev by
accident.
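A minimal sketch of the difference, operating on a plain u64 instead of an
extent buffer. The function names, the shift of 56 and the revision value 1
are assumptions chosen for illustration, not the real helpers:

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_FLAG_WRITTEN      (1ULL << 0)
#define SKETCH_BACKREF_REV_SHIFT 56	/* assumed position of the revision */

/* Overwrites the whole field, like "=": any stored revision is lost. */
static void sketch_set_header_flags(uint64_t *field, uint64_t val)
{
	*field = val;
}

/* ORs one flag in, like "|=": the revision bits are preserved. */
static void sketch_set_header_flag(uint64_t *field, uint64_t flag)
{
	*field |= flag;
}

/* Extracts the revision stored in the highest 8 bits. */
static unsigned int sketch_backref_rev(uint64_t field)
{
	return (unsigned int)(field >> SKETCH_BACKREF_REV_SHIFT);
}
```

Starting from a field that already carries a revision, only the second
helper leaves that revision intact.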
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
With the current btrfs-convert, if we convert an ext4 filesystem without
data checksums, it neither sets the nodatasum flag in the inode item nor
creates csum items, so reading a file ends up with checksum errors.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This updates mkfs.btrfs's man page with the new limitation that nodesize must
be a power of 2 as well.
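As an illustration of the constraint (not the check used by mkfs.btrfs
itself), a power-of-2 test can be sketched as below; the function name is
hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if n is a power of 2, 0 otherwise (0 is not a power of 2). */
static int sketch_is_power_of_2(uint64_t n)
{
	/* A power of 2 has exactly one bit set, so n & (n - 1) clears it. */
	return n != 0 && (n & (n - 1)) == 0;
}
```

Under this test 16384 qualifies, while 12288 (a multiple of 4096 that is
not a power of 2) does not.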
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
As Qu mentioned in this thread
(https://www.spinics.net/lists/linux-btrfs/msg64469.html), compression
can cause a regular extent to co-exist with an inlined extent. This
coexistence makes things confusing. Since it is currently allowed and
can appear in a filesystem, fix btrfsck so that it does not emit a bunch
of error reports that would make users feel uneasy.
When checking a file extent, record the extent_end of the regular extent
to check if there is a gap between the regular extents. Normally there
is only one inlined extent, so the extent_end of inlined extent is
useless. However, if a regular extent can co-exist with an inlined
extent, the extent_end of the inlined extent also needs to be recorded.
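The idea can be sketched as follows. This is a simplified model, not the
btrfsck code: the structure and function names are hypothetical, and the
length of an inline extent is assumed to be already rounded up to the
sector size:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_file_extent {
	uint64_t offset;	/* file offset where the extent starts */
	uint64_t num_bytes;	/* bytes covered (assumed sector-aligned) */
	int is_inline;		/* informational only in this sketch */
};

/* Return 1 if the extents (sorted by offset) leave a gap, 0 otherwise. */
static int sketch_has_gap(const struct sketch_file_extent *ext, size_t nr)
{
	uint64_t extent_end = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		if (ext[i].offset != extent_end)
			return 1;
		/* Recorded for inline extents too, not only regular ones. */
		extent_end = ext[i].offset + ext[i].num_bytes;
	}
	return 0;
}
```

If extent_end were only updated for regular extents, an inline extent
followed by a regular one would be misreported as a gap.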
Reported-by: Marc MERLIN <marc@merlins.org>
Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Add a test case for a filesystem that has the NO_HOLES incompat flag set
but still contains a hole file extent.
Such a filesystem can be created by enabling the NO_HOLES feature on an
existing filesystem; lowmem mode would raise a false alert for it.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
[ minor adjustments ]
Signed-off-by: David Sterba <dsterba@suse.com>
Since the incompat feature NO_HOLES still allows us to have an explicit
hole file extent, current check is too strict and will cause false
alerts like:
root 5 EXTENT_DATA[257, 0] shouldn't be hole
Fix it by removing the strict file hole extent check.
Link: https://www.spinics.net/lists/linux-btrfs/msg66374.html
Reported-by: Henk Slager <eye1tm@gmail.com>
Tested-by: Henk Slager <eye1tm@gmail.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
With a larger file system (in this case it's 22TB), ext2fs_open() returns
the EXT2_ET_CANT_USE_LEGACY_BITMAPS error message with
ext2fs_read_block_bitmap().
To overcome this issue,
(a) we need to pass the EXT2_FLAG_64BITS flag to ext2fs_open.
(b) use 64-bit functions like ext2fs_get_block_bitmap_range2,
ext2fs_inode_data_blocks2, ext2fs_read_ext_attr2
(c) use 64-bit types for the btrfs_convert_context fields
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=194795
Signed-off-by: Lakshmipathi.G <lakshmipathi.g@giis.co.in>
Signed-off-by: David Sterba <dsterba@suse.com>
The u32 types in the convert context might not be enough for some very
large filesystems (20TB). Use 64-bit types to be safe.
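A small worked example of why u32 is too narrow, assuming a 4096-byte block
size and reading 20TB as 20 TiB (both are assumptions made only for this
illustration; the helper name is hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* Number of blocks needed to cover `bytes` with the given block size. */
static uint64_t sketch_block_count(uint64_t bytes, uint32_t blocksize)
{
	assert(blocksize != 0);
	return bytes / blocksize;
}
```

With these assumptions the block count is 20 * 2^28 = 5368709120, which is
larger than the u32 maximum of 4294967295. Stored in a u32 it would
silently wrap to 1073741824.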
Signed-off-by: Lakshmipathi.G <lakshmipathi.g@giis.co.in>
Signed-off-by: David Sterba <dsterba@suse.com>
"Bug 194961 - btrfs device stats --check <folder> does not work"
The long option --check is not recognized as it's missing from the
option table.
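A sketch of the kind of table involved, using getopt_long() from
<getopt.h>. The function name and the short option 'c' are assumptions
for illustration, not a copy of the btrfs-progs source:

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>

/* Returns 1 if -c or --check was given, 0 otherwise. */
static int sketch_parse_check(int argc, char **argv)
{
	/* Without the "check" entry, --check is rejected as unknown. */
	static const struct option long_options[] = {
		{ "check", no_argument, NULL, 'c' },
		{ NULL, 0, NULL, 0 }
	};
	int check = 0;
	int c;

	optind = 1;	/* restart scanning from the first argument */
	opterr = 0;	/* keep the sketch quiet on unknown options */
	while ((c = getopt_long(argc, argv, "c", long_options, NULL)) != -1) {
		if (c == 'c')
			check = 1;
	}
	return check;
}
```

The short option string alone is not enough: getopt_long() only recognizes
long names that appear in the table.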
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=194961
Reported-by: Tomas Thiemel <thiemel@centrum.cz>
Signed-off-by: Lakshmipathi.G <Lakshmipathi.G@giis.co.in>
Signed-off-by: David Sterba <dsterba@suse.com>
While the command interpreter may be able to disambiguate the meaning,
the reader is not helped by being forced to do so.
Pull request: #48
Signed-off-by: David Sterba <dsterba@suse.com>
User Kasijjuf points out that the VFS initialism is not explained anywhere.
While this could be fixed, the whole note about the inability to delete the
device by which the filesystem has been mounted is wrong.
Issue: #49
Signed-off-by: David Sterba <dsterba@suse.com>
While talking to another btrfs user on IRC today, it became clear that a
major point of confusion in the btrfs send manual is that it's not
telling the user soon enough that send/receive solely operates on
subvolume snapshots instead of the original (read/write) subvolumes.
So, change the first few lines to explicitly mention snapshots instead.
Technically, snapshots are also just subvolumes, but requiring this
level of detailed technical knowledge doesn't help the user who is just
trying things out.
Signed-off-by: Hans van Kranenburg <hans.van.kranenburg@mendix.com>
Signed-off-by: David Sterba <dsterba@suse.com>
A wiki user reported that there are formatting artifacts in the
'get' section:
in html rendered as "The -t <em><type></em> option can be..."
This is probably due to the nesting '' and <>. We don't need the <> in
the explanation, as this is only to describe the command line syntax.
Signed-off-by: David Sterba <dsterba@suse.com>
Test that we are able to create an image from a multiple devices fs, that
we are able to restore that image into a single device and finally that we
are able to mount it.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
[ added shell quotation and chmod a+w so testsuite on NFS works ]
Signed-off-by: David Sterba <dsterba@suse.com>
We correctly build an image from a multiple devices filesystem but when
restoring the image into a single device we were missing updating the
number of devices in the superblock to the value 1 (we already took care
of setting the number of stripes to 1 for each chunk item and setting
the device id for each chunk item to match the device id from the super
block).
This missing update of the number of devices makes it impossible to mount
the restored filesystem on recent kernels, more specifically since the
linux kernel commit 99e3ecfcb9f4ca35192d20a5bea158b81f600062
("Btrfs: add more validation checks for superblock"), that produce the
following message in the dmesg/syslog:
[21097.542047] BTRFS error (device sdi): super_num_devices 2 mismatch with num_devices 1 found here
[21097.543972] BTRFS error (device sdi): failed to read chunk tree: -22
[21097.720360] BTRFS error (device sdi): open_ctree failed
So fix this by updating the number of devices to 1 in the superblock.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Patches "btrfs-progs: tests: correctly receive clones to mounted subvol"
(8eaf63bc9a) and the followup are missing the last
unmount, which leads to failure of misc/020.
Signed-off-by: David Sterba <dsterba@suse.com>
Returning -ENODATA is only considered invalid on the first run of the
loop, where we would detect an entirely empty stream.
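The rule can be sketched like this. It is a simplified model of the loop,
not the receive code itself; the callback type and both function names are
hypothetical:

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical reader: 0 after one stream, -ENODATA when input is exhausted. */
typedef int (*sketch_read_stream_fn)(void *ctx);

static int sketch_receive_all(sketch_read_stream_fn read_one, void *ctx)
{
	int iterations = 0;
	int ret;

	for (;;) {
		ret = read_one(ctx);
		if (ret == -ENODATA) {
			/* Only an error if nothing at all was received. */
			return iterations == 0 ? -ENODATA : 0;
		}
		if (ret < 0)
			return ret;
		iterations++;
	}
}

/* Example reader: yields *remaining streams, then reports end of data. */
static int sketch_countdown_reader(void *ctx)
{
	int *remaining = ctx;

	if (*remaining == 0)
		return -ENODATA;
	(*remaining)--;
	return 0;
}
```

An input with no streams fails with -ENODATA, while an input whose streams
simply run out after at least one iteration ends cleanly.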
The enhanced test misc-tests/018-recv-end-of-stream now passes.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195597
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The btrfs header has a u64 member, flags, whose lowest 56 bits are for
header flags like WRITTEN and RELOC.
Its highest 8 bits are for the backref revision.
Manually checking btrfs_header_flags() is a pain, so add output of these
leaf flags and the backref revision to print-tree.
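A sketch of how such a field could be decoded for printing. The bit
positions of WRITTEN and RELOC and the exact output format are assumptions
for illustration, not the actual print-tree output:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_FLAG_WRITTEN (1ULL << 0)	/* assumed bit position */
#define SKETCH_FLAG_RELOC   (1ULL << 1)	/* assumed bit position */
#define SKETCH_REV_SHIFT    56
#define SKETCH_FLAGS_MASK   ((1ULL << SKETCH_REV_SHIFT) - 1)

/* Formats the low 56 flag bits and the high 8 revision bits into buf. */
static void sketch_format_header(uint64_t raw, char *buf, size_t len)
{
	uint64_t flags = raw & SKETCH_FLAGS_MASK;
	unsigned int rev = (unsigned int)(raw >> SKETCH_REV_SHIFT);
	int written = (flags & SKETCH_FLAG_WRITTEN) != 0;
	int reloc = (flags & SKETCH_FLAG_RELOC) != 0;

	snprintf(buf, len, "flags 0x%llx(%s%s%s) backref revision %u",
		 (unsigned long long)flags,
		 written ? "WRITTEN" : "",
		 (written && reloc) ? "|" : "",
		 reloc ? "RELOC" : "",
		 rev);
}
```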
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When reading out the name from a dir_item, a corrupted name_len can
lead to a read beyond the boundary of the item or even the extent buffer.
This happens when checking fuzzed image /tmp/bko-161811.raw, for both
lowmem mode and original mode.
Below is the example from lowmem mode.
ERROR: root 5 INODE REF[256 256] doesn't have related DIR_INDEX[256 216172782113783808] namelen 255 filename bar filetype 0
ERROR: root 5 INODE REF[256 256] doesn't have related DIR_ITEM[256 1306590535] namelen 255 filename bar filetype 0
WARNING: root 5 INODE[256] mode 0 shouldn't have DIR_INDEX[256 1167283096]
WARNING: root 5 DIR_ITEM[256 1167283096] name too long
==13013== Invalid read of size 1
==13013== at 0x4C31A38: memmove (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==13013== by 0x431518: read_extent_buffer (extent_io.c:863)
==13013== by 0x4752AB: check_dir_item (cmds-check.c:4627)
==13013== by 0x475E5C: check_inode_item (cmds-check.c:4911)
==13013== by 0x476200: check_fs_first_inode (cmds-check.c:5011)
==13013== by 0x476276: check_fs_root_v2 (cmds-check.c:5044)
==13013== by 0x4769FB: check_fs_roots_v2 (cmds-check.c:5242)
==13013== by 0x488B5B: cmd_check (cmds-check.c:13033)
==13013== by 0x40A8C5: main (btrfs.c:246)
==13013== Address 0x5c95b80 is 0 bytes after a block of size 4,224 alloc'd
==13013== at 0x4C2CF35: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==13013== by 0x4307E0: __alloc_extent_buffer (extent_io.c:538)
==13013== by 0x430C37: alloc_extent_buffer (extent_io.c:642)
==13013== by 0x413DFE: btrfs_find_create_tree_block (disk-io.c:193)
==13013== by 0x414370: read_tree_block_fs_info (disk-io.c:340)
==13013== by 0x40B5D5: read_tree_block (disk-io.h:125)
==13013== by 0x40CFD2: read_node_slot (ctree.c:652)
==13013== by 0x40E5EB: btrfs_search_slot (ctree.c:1172)
==13013== by 0x4761A8: check_fs_first_inode (cmds-check.c:5001)
==13013== by 0x476276: check_fs_root_v2 (cmds-check.c:5044)
==13013== by 0x4769FB: check_fs_roots_v2 (cmds-check.c:5242)
==13013== by 0x488B5B: cmd_check (cmds-check.c:13033)
Fix it by double checking the dir_item name_len against the item boundary
before trying to read out the name from the extent buffer, for both
original mode and lowmem mode.
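A minimal sketch of such a boundary check. The function name and parameter
layout are hypothetical, and the header size of 30 used in the usage note
is only an example value:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Returns 1 if a name of name_len bytes, stored right after a fixed-size
 * header that starts at offset `cur` inside the item, lies fully within
 * the item; returns 0 if reading it would cross the item boundary.
 */
static int sketch_name_fits(uint32_t item_size, uint32_t cur,
			    uint32_t hdr_size, uint32_t name_len)
{
	/* 64-bit sum so that a fuzzed name_len cannot wrap around. */
	uint64_t end = (uint64_t)cur + hdr_size + name_len;

	return end <= item_size;
}
```

For a 33-byte item holding a 30-byte header and the 3-byte name "bar", a
claimed name_len of 3 passes, while a corrupted name_len of 255 is rejected
before any read is attempted.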
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When reading out the name from an inode_ref, a corrupted name_len can
lead to a read beyond the boundary of the item or even the extent buffer.
This happens when checking fuzzed image /tmp/bko-161811.raw, for both
lowmem mode and original mode.
ERROR: root 5 INODE REF[256 256] doesn't have related DIR_INDEX[256 504403158265495680] namelen 0 filename filetype 0
ERROR: root 5 INODE REF[256 256] doesn't have related DIR_ITEM[256 4294967294] namelen 0 filename filetype 0
WARNING: root 5 INODE_REF[256 256] name too long
==13022== Invalid read of size 8
==13022== at 0x4C319BE: memmove (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==13022== by 0x431518: read_extent_buffer (extent_io.c:863)
==13022== by 0x474730: check_inode_ref (cmds-check.c:4307)
==13022== by 0x475D65: check_inode_item (cmds-check.c:4890)
==13022== by 0x476200: check_fs_first_inode (cmds-check.c:5011)
==13022== by 0x476276: check_fs_root_v2 (cmds-check.c:5044)
==13022== by 0x4769FB: check_fs_roots_v2 (cmds-check.c:5242)
==13022== by 0x488B5B: cmd_check (cmds-check.c:13033)
==13022== by 0x40A8C5: main (btrfs.c:246)
==13022== Address 0x5c96780 is 0 bytes after a block of size 4,224 alloc'd
==13022== at 0x4C2CF35: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==13022== by 0x4307E0: __alloc_extent_buffer (extent_io.c:538)
==13022== by 0x430C37: alloc_extent_buffer (extent_io.c:642)
==13022== by 0x413DFE: btrfs_find_create_tree_block (disk-io.c:193)
==13022== by 0x414370: read_tree_block_fs_info (disk-io.c:340)
==13022== by 0x40B5D5: read_tree_block (disk-io.h:125)
==13022== by 0x40CFD2: read_node_slot (ctree.c:652)
==13022== by 0x40E5EB: btrfs_search_slot (ctree.c:1172)
==13022== by 0x4761A8: check_fs_first_inode (cmds-check.c:5001)
==13022== by 0x476276: check_fs_root_v2 (cmds-check.c:5044)
==13022== by 0x4769FB: check_fs_roots_v2 (cmds-check.c:5242)
==13022== by 0x488B5B: cmd_check (cmds-check.c:13033)
Fix it by double checking the inode_ref name_len against the item boundary
before trying to read out the name from the extent buffer, for both
original mode and lowmem mode.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
fsck/004-no-dir-index makes valgrind complain about invalid reads.
==31890== Invalid read of size 1
==31890== at 0x453D09: repair_inode_backrefs (cmds-check.c:2690)
==31890== by 0x453D09: check_inode_recs (cmds-check.c:3330)
==31890== by 0x453D09: check_fs_root (cmds-check.c:4012)
==31890== by 0x45E788: check_fs_roots (cmds-check.c:4098)
==31890== by 0x45E788: cmd_check (cmds-check.c:13031)
==31890== by 0x40A88A: main (btrfs.c:246)
==31890== Address 0x5cb7b90 is 16 bytes inside a block of size 50 free'd
==31890== at 0x4C2C14B: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==31890== by 0x453D08: repair_inode_backrefs (cmds-check.c:2684)
==31890== by 0x453D08: check_inode_recs (cmds-check.c:3330)
==31890== by 0x453D08: check_fs_root (cmds-check.c:4012)
==31890== by 0x45E788: check_fs_roots (cmds-check.c:4098)
==31890== by 0x45E788: cmd_check (cmds-check.c:13031)
==31890== by 0x40A88A: main (btrfs.c:246)
==31890== Block was alloc'd at
==31890== at 0x4C2AF1F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==31890== by 0x45055C: get_inode_backref (cmds-check.c:1075)
==31890== by 0x45055C: add_inode_backref (cmds-check.c:1097)
==31890== by 0x45180C: process_dir_item (cmds-check.c:1525)
==31890== by 0x45180C: process_one_leaf (cmds-check.c:1838)
==31890== by 0x45180C: walk_down_tree (cmds-check.c:2134)
==31890== by 0x45180C: check_fs_root (cmds-check.c:3957)
==31890== by 0x45E788: check_fs_roots (cmds-check.c:4098)
==31890== by 0x45E788: cmd_check (cmds-check.c:13031)
==31890== by 0x40A88A: main (btrfs.c:246)
==31890==
==31890== Invalid read of size 8
==31890== at 0x452D66: repair_inode_backrefs (cmds-check.c:2731)
==31890== by 0x452D66: check_inode_recs (cmds-check.c:3330)
==31890== by 0x452D66: check_fs_root (cmds-check.c:4012)
==31890== by 0x45E788: check_fs_roots (cmds-check.c:4098)
==31890== by 0x45E788: cmd_check (cmds-check.c:13031)
==31890== by 0x40A88A: main (btrfs.c:246)
==31890== Address 0x5cb7b90 is 16 bytes inside a block of size 50 free'd
==31890== at 0x4C2C14B: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==31890== by 0x453D08: repair_inode_backrefs (cmds-check.c:2684)
==31890== by 0x453D08: check_inode_recs (cmds-check.c:3330)
==31890== by 0x453D08: check_fs_root (cmds-check.c:4012)
==31890== by 0x45E788: check_fs_roots (cmds-check.c:4098)
==31890== by 0x45E788: cmd_check (cmds-check.c:13031)
==31890== by 0x40A88A: main (btrfs.c:246)
==31890== Block was alloc'd at
==31890== at 0x4C2AF1F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==31890== by 0x45055C: get_inode_backref (cmds-check.c:1075)
==31890== by 0x45055C: add_inode_backref (cmds-check.c:1097)
==31890== by 0x45180C: process_dir_item (cmds-check.c:1525)
==31890== by 0x45180C: process_one_leaf (cmds-check.c:1838)
==31890== by 0x45180C: walk_down_tree (cmds-check.c:2134)
==31890== by 0x45180C: check_fs_root (cmds-check.c:3957)
==31890== by 0x45E788: check_fs_roots (cmds-check.c:4098)
==31890== by 0x45E788: cmd_check (cmds-check.c:13031)
==31890== by 0x40A88A: main (btrfs.c:246)
==31890==
While iterating over backrefs in repair_inode_backrefs, there are
several ways to repair one backref, depending on
backref->found_dir_item and backref->found_dir_index. Two of these
branches may free the backref, but the subsequent checks will still
access the freed memory.
Because these branches are independent, fix it by letting
repair_inode_backrefs skip to the next backref after the free.
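The pattern can be sketched on a plain singly linked list. The structure,
the branch conditions and the function name below are simplified
assumptions, not the real repair logic:

```c
#include <assert.h>
#include <stdlib.h>

struct sketch_backref {
	struct sketch_backref *next;
	int found_dir_item;
	int found_dir_index;
};

/* Returns the number of backrefs freed; *head is updated as needed. */
static int sketch_repair_backrefs(struct sketch_backref **head)
{
	struct sketch_backref **link = head;
	int freed = 0;

	while (*link) {
		struct sketch_backref *b = *link;

		if (b->found_dir_index && !b->found_dir_item) {
			/* Unlink and free, then move on: b is gone now. */
			*link = b->next;
			free(b);
			freed++;
			continue;
		}
		/* Independent later check: safe, b is still allocated. */
		if (b->found_dir_item && !b->found_dir_index)
			b->found_dir_index = 1;	/* pretend it was re-created */
		link = &b->next;
	}
	return freed;
}
```

Without the continue, the second check would read b->found_dir_item from
memory that has already been freed.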
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Since we memset tmpl, max_size==0. This does not seem consistent with nr = 1.
In check_extent_refs, we will call:
set_extent_dirty(root->fs_info->excluded_extents,
rec->start,
rec->start + rec->max_size - 1);
This ends up with BUG_ON(end < start) in insert_state.
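The arithmetic behind the failure, as a small sketch (the helper name is
hypothetical and the check mirrors only the end < start condition):

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if [start, start + max_size - 1] is a valid range, else 0. */
static int sketch_range_is_valid(uint64_t start, uint64_t max_size)
{
	/* With max_size == 0 this becomes start - 1, which is below start. */
	uint64_t end = start + max_size - 1;

	return end >= start;
}
```

For start 4096 and max_size 0 the computed end is 4095, so end < start
holds and the BUG_ON condition is met.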
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When this happens, we will trip a BUG_ON(end < start) in insert_state
because in check_extent_refs, we use this max_size expecting it's not zero:
set_extent_dirty(root->fs_info->excluded_extents,
rec->start,
rec->start + rec->max_size - 1);
See https://bugzilla.redhat.com/show_bug.cgi?id=1435567
for an example where this scenario occurs.
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Signed-off-by: David Sterba <dsterba@suse.com>
See https://bugzilla.redhat.com/show_bug.cgi?id=1435567 for an example
where the message occurs.
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
[ un-indent strings overflowing 80 cols ]
Signed-off-by: David Sterba <dsterba@suse.com>
Fuzzed image bko-156811-bad-parent-ref-qgroup-verify.raw causes qgroup
to report -ENOMEM.
But in fact the image is so heavily damaged that there is no valid root
item for the extent tree.
The normal extent tree key in the root tree should be (EXTENT_TREE ROOT_ITEM 0),
while in that fuzzed image we got (EXTENT_TREE EXTENT_DATA SOME_NUMBER).
The problem is that btrfs_find_last_root() only checks the objectid and
ignores the key type.
Fix it by doing an extra check on the key type.
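A sketch of the extra check on a simplified key structure. The numeric
values are assumed to match the usual btrfs constants, but the structure
and function name are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_EXTENT_TREE_OBJECTID 2ULL	/* assumed constant value */
#define SKETCH_EXTENT_DATA_KEY      108	/* assumed constant value */
#define SKETCH_ROOT_ITEM_KEY        132	/* assumed constant value */

struct sketch_key {
	uint64_t objectid;
	uint8_t type;
	uint64_t offset;
};

/* Returns 1 only if the key is a ROOT_ITEM for the wanted objectid. */
static int sketch_is_root_item_for(const struct sketch_key *found,
				   uint64_t objectid)
{
	return found->objectid == objectid &&
	       found->type == SKETCH_ROOT_ITEM_KEY;
}
```

A key such as (EXTENT_TREE EXTENT_DATA SOME_NUMBER) matches on objectid
alone but is rejected once the type is compared as well.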
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
[ edit changelog ]
Signed-off-by: David Sterba <dsterba@suse.com>
Fuzzed image bko-161821.raw causes btrfs check to hit a segmentation fault.
The function check_owner_ref attempts to access a nonexistent quota tree
when dealing with extent_item [4198400 4096] in the corrupted filesystem.
The function btrfs_new_fs_info always allocates memory for
fs_info->quota_root regardless of whether quota_tree exists or not.
Additionally, the function btrfs_read_fs_root will directly return
fs_info->quota_root if location->objectid == BTRFS_QUOTA_TREE_OBJECTID.
This patch does the following things:
1. Do extra check and return ENOENT if quota tree does not exist in the
function btrfs_read_fs_root.
2. Free useless fs_info->quota_root in the function btrfs_setup_all_roots
to reduce confusion.
3. free_extent_buffer even if check_child_node failed in the function
walk_down_tree.
Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
fsck/003-shift-offsets makes valgrind complain about memory leaks.
==5910==
==5910== HEAP SUMMARY:
==5910== in use at exit: 1,112 bytes in 11 blocks
==5910== total heap usage: 161 allocs, 150 frees, 164,800 bytes allocated
==5910==
==5910== 216 (72 direct, 144 indirect) bytes in 1 blocks are definitely lost in loss record 3 of 5
==5910== at 0x4C2AF1F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==5910== by 0x4815A3: add_root_item_to_list (cmds-check.c:9683)
==5910== by 0x481CE2: check_chunks_and_extents (cmds-check.c:9886)
==5910== by 0x48888B: cmd_check (cmds-check.c:12977)
==5910== by 0x40A8C5: main (btrfs.c:246)
==5910==
The check_chunks_and_extents() memory leaks are caused by not freeing
added root items of normal_trees and dropping_trees.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
tests/fssum.c:599:13: warning: In the GNU C Library, "major" is defined
by <sys/sysmacros.h>. For historical compatibility, it is
currently defined by <sys/types.h> as well, but we plan to
remove this soon. To use "major", include <sys/sysmacros.h>
directly. If you did not intend to use a system-defined macro
"major", you should undefine it after including <sys/types.h>.
sum_add_u64(&cs, major(st.st_rdev));
Signed-off-by: David Sterba <dsterba@suse.com>
When a 0 sized block group item is found, set_extent_bits() will not
really set any bits, while set_state_private() still inserts the allocated
block group cache into the block group extent_io_tree.
So at close_ctree() time, we won't free the private block group cache
stored, since we can't find any bit set for the 0 sized block group.
To fix it, skip any 0 sized block group in btrfs_read_block_groups(),
so such a leak won't happen.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>