Add BTRFS_EXTENDED_PROFILE_MASK to consider also the
BTRFS_AVAIL_ALLOC_BIT_SINGLE bit.
Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
Signed-off-by: David Sterba <dsterba@suse.com>
- complete the function btrfs_err_str adding some missing cases
- sync the enum btrfs_err_code (in libbtrfsutil/btrfs.h) with the
rest of the codes (user space and kernel space).
- add missing fields to btrfs_raid_array[] for raid1c[34]
Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
btrfs check can return a strange exit code to the shell:
[Inferior 1 (process 48641) exited with code 0213]
^^^^
[CAUSE]
It's caused by incorrect handling of the qgroup error.
qgroup_report_ret can be -117 (-EUCLEAN). Passing that value to exit()
truncates it to 8 bits, so the shell sees 139 (0213 octal) and the return
value is not properly recognized.
[FIX]
Fix it by sanitizing the return value to 0 or 1.
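The truncation can be illustrated with a small sketch. It assumes a POSIX
system, where only the low 8 bits of the exit() argument reach the parent.
Both helper names are hypothetical and are not taken from btrfs-progs.

```c
/* Hypothetical helper: the status a POSIX shell sees for a given exit()
 * argument, since only the low 8 bits are reported to the parent. */
static int status_seen_by_shell(int ret)
{
	return ret & 0xff;
}

/* Hypothetical helper: collapse any non-zero error into exit code 1. */
static int sanitize_exit_code(int ret)
{
	return !!ret;
}
```

With these helpers, -117 maps to 0213 (octal) when passed through
unchanged, and to 1 after sanitizing.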
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Reviewed-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Signed-off-by: Adam Borowski <kilobyte@angband.pl>
Signed-off-by: David Sterba <dsterba@suse.com>
Some scripts can still rely on this message, so make it available with
-vv, so -v stays sane.
Fixes: #127
Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
Valgrind reports a memory leak for fsck/012. Even after the image is
repaired, the memory leak can still be reproduced.
==107060== HEAP SUMMARY:
==107060== in use at exit: 176 bytes in 1 blocks
==107060== total heap usage: 10,647 allocs, 10,646 frees, 3,000,654 bytes allocated
==107060==
==107060== 176 bytes in 1 blocks are definitely lost in loss record 1 of 1
==107060== at 0x483BB65: calloc (vg_replace_malloc.c:762)
==107060== by 0x1BD953: read_one_block_group (extent-tree.c:2661)
==107060== by 0x1BDBD8: btrfs_read_block_groups (extent-tree.c:2719)
==107060== by 0x1B3A2C: btrfs_setup_all_roots (disk-io.c:1024)
==107060== by 0x1B44CA: __open_ctree_fd (disk-io.c:1299)
==107060== by 0x1B46C6: open_ctree_fs_info (disk-io.c:1345)
==107060== by 0x16952E: cmd_check (main.c:10154)
==107060== by 0x11CDC6: cmd_execute (commands.h:125)
==107060== by 0x11D712: main (btrfs.c:386)
==107060==
==107060== LEAK SUMMARY:
==107060== definitely lost: 176 bytes in 1 blocks
==107060== indirectly lost: 0 bytes in 0 blocks
==107060== possibly lost: 0 bytes in 0 blocks
==107060== still reachable: 0 bytes in 0 blocks
==107060== suppressed: 0 bytes in 0 blocks
[CAUSE]
In btrfs_free_block_groups(), we use
rbtree_postorder_for_each_entry_safe() to iterate over all block group
caches. However, since we're already doing a post-order iteration, we
shouldn't call rb_erase() during that iteration, as it would rebalance
the tree and break the post-order iteration.
This wrong rb_erase() call leads to the above memory leak.
[FIX]
Kill that wrong rb_erase() call.
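As an illustration only, a plain binary tree can stand in for the rbtree.
The sketch below frees every node in post order without unlinking or
rebalancing anything, which is the property the fix relies on. The
structure and function names are hypothetical.

```c
#include <stdlib.h>

/* Hypothetical minimal tree node; the real code uses struct rb_node. */
struct node {
	struct node *left;
	struct node *right;
};

/* Free the subtree in post order and return the number of nodes freed.
 * Children are released before their parent and nothing is unlinked or
 * rebalanced on the way, so the traversal order stays valid. */
static int free_tree_postorder(struct node *n)
{
	int freed;

	if (!n)
		return 0;
	freed = free_tree_postorder(n->left);
	freed += free_tree_postorder(n->right);
	free(n);
	return freed + 1;
}

/* Hypothetical constructor used only to build an example tree. */
static struct node *new_node(struct node *left, struct node *right)
{
	struct node *n = calloc(1, sizeof(*n));

	if (n) {
		n->left = left;
		n->right = right;
	}
	return n;
}
```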
Fixes: b1bd3cd93f ("btrfs-progs: reform block groups caches structure")
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
Valgrind reports the following error for fsck/012:
adding new tree backref on start 4206592 len 4096 parent 0 root 5
==100735== Syscall param pwrite64(buf) points to uninitialised byte(s)
==100735== at 0x49F303A: pwrite (in /usr/lib/libpthread-2.31.so)
==100735== by 0x1A5C85: write_extent_to_disk (extent_io.c:815)
==100735== by 0x1B2507: write_and_map_eb (disk-io.c:512)
==100735== by 0x1B26A7: write_tree_block (disk-io.c:545)
==100735== by 0x1D4822: __commit_transaction (transaction.c:148)
==100735== by 0x1D4AA2: btrfs_commit_transaction (transaction.c:213)
==100735== by 0x16360D: fixup_extent_refs (main.c:7662)
==100735== by 0x16449F: check_extent_refs (main.c:8033)
==100735== by 0x166199: check_chunks_and_extents (main.c:8786)
==100735== by 0x166441: do_check_chunks_and_extents (main.c:8842)
==100735== by 0x169D13: cmd_check (main.c:10324)
==100735== by 0x11CDC6: cmd_execute (commands.h:125)
==100735== Address 0x4e8aeb0 is 128 bytes inside a block of size 4,224 alloc'd
==100735== at 0x483BB65: calloc (vg_replace_malloc.c:762)
==100735== by 0x1A54C5: __alloc_extent_buffer (extent_io.c:609)
==100735== by 0x1A5AD1: alloc_extent_buffer (extent_io.c:752)
==100735== by 0x1B1A0A: btrfs_find_create_tree_block (disk-io.c:222)
==100735== by 0x1BD4A2: btrfs_alloc_free_block (extent-tree.c:2538)
==100735== by 0x1A8CE3: __btrfs_cow_block (ctree.c:322)
==100735== by 0x1A91C6: btrfs_cow_block (ctree.c:415)
==100735== by 0x1AB16C: btrfs_search_slot (ctree.c:1185)
==100735== by 0x160BBC: delete_extent_records (main.c:6652)
==100735== by 0x16343F: fixup_extent_refs (main.c:7629)
==100735== by 0x16449F: check_extent_refs (main.c:8033)
==100735== by 0x166199: check_chunks_and_extents (main.c:8786)
==100735==
[CAUSE]
For a newly allocated extent buffer, we don't initialize its content.
This is not a major concern.
For the above report, the reported range is inside the unused part of
the extent buffer, thus it won't cause any problem.
Regular btrfs_cow_block() will cover all the used ranges of an extent
buffer.
[FIX]
Still, since the kernel initializes the extent buffer with 0, it won't
hurt to do the extra initialization to make valgrind happy.
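A minimal sketch of the idea, not the actual patch: clear the data area
of a freshly allocated buffer so that unused bytes are defined when the
block is written out. The structure below is a hypothetical stand-in for
an extent buffer.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an extent buffer: a length plus data area. */
struct fake_extent_buffer {
	unsigned int len;
	unsigned char data[];
};

/* Allocate a buffer and zero its data area, so bytes that are never
 * filled in later are still defined when the block reaches pwrite(). */
static struct fake_extent_buffer *alloc_cleared_buffer(unsigned int len)
{
	struct fake_extent_buffer *eb = malloc(sizeof(*eb) + len);

	if (!eb)
		return NULL;
	eb->len = len;
	memset(eb->data, 0, len);
	return eb;
}
```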
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
Valgrind reports the following error for fsck/007, which is only
repairable for original mode:
==97599== Conditional jump or move depends on uninitialised value(s)
==97599== at 0x1D4A42: btrfs_commit_transaction (transaction.c:207)
==97599== by 0x16475C: check_extent_refs (main.c:8097)
==97599== by 0x166199: check_chunks_and_extents (main.c:8786)
==97599== by 0x166441: do_check_chunks_and_extents (main.c:8842)
==97599== by 0x169D13: cmd_check (main.c:10324)
==97599== by 0x11CDC6: cmd_execute (commands.h:125)
==97599== by 0x11D712: main (btrfs.c:386)
==97599==
[CAUSE]
If btrfs_write_dirty_block_groups() gets called with no block group
dirtied (no dirty extents created), its return value is uninitialized,
as the stack variable @ret is not initialized at all.
[FIX]
Initialize @ret to 0 for btrfs_write_dirty_block_groups(): if there are
no dirty block groups, we do nothing and shouldn't fail.
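The pattern can be sketched as follows. The function is a hypothetical
stand-in that takes precomputed per-group results instead of writing
anything; it only shows why @ret needs a defined starting value when the
loop body never runs.

```c
/* Hypothetical stand-in: "write" every dirty group and stop at the first
 * error. With nr_dirty == 0 the loop body never runs, so ret must start
 * at 0 for the function to report success. */
static int write_dirty_groups(const int *results, int nr_dirty)
{
	int ret = 0;
	int i;

	for (i = 0; i < nr_dirty; i++) {
		ret = results[i];
		if (ret)
			break;
	}
	return ret;
}
```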
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
Valgrind reports the following error for fsck/002 (which only supports
original mode):
==97088== Conditional jump or move depends on uninitialised value(s)
==97088== at 0x15BFF6: add_data_backref (main.c:4884)
==97088== by 0x16025C: run_next_block (main.c:6452)
==97088== by 0x165539: deal_root_from_list (main.c:8471)
==97088== by 0x166040: check_chunks_and_extents (main.c:8753)
==97088== by 0x166441: do_check_chunks_and_extents (main.c:8842)
==97088== by 0x169D13: cmd_check (main.c:10324)
==97088== by 0x11CDC6: cmd_execute (commands.h:125)
==97088== by 0x11D712: main (btrfs.c:386)
[CAUSE]
In alloc_data_backref(), only ref->node is set to 0, while
ref->disk_bytenr is not initialized at all.
And then in add_data_backref(), if @back is a newly allocated data
backref, we use the garbage from back->disk_bytenr to determine if we
should reset them.
[FIX]
Fix it by initializing the whole data_backref structure in
alloc_data_backref().
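A rough sketch of whole-structure initialization, using a simplified and
hypothetical record rather than the real data_backref layout: calloc()
zeroes every member, so later comparisons never read garbage.

```c
#include <stdlib.h>

/* Hypothetical, simplified backref record; the real structure differs. */
struct fake_data_backref {
	unsigned long long disk_bytenr;
	unsigned long long bytes;
	unsigned int found_ref;
};

/* calloc() zeroes the whole structure, not just selected members. */
static struct fake_data_backref *alloc_backref(void)
{
	return calloc(1, sizeof(struct fake_data_backref));
}
```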
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
With valgrind, fsck/002 test with original mode would report the
following valgrind error:
==90600== Conditional jump or move depends on uninitialised value(s)
==90600== at 0x15C280: pick_next_pending (main.c:4949)
==90600== by 0x15F3CF: run_next_block (main.c:6175)
==90600== by 0x1655CC: deal_root_from_list (main.c:8486)
==90600== by 0x1660C7: check_chunks_and_extents (main.c:8762)
==90600== by 0x166439: do_check_chunks_and_extents (main.c:8842)
==90600== by 0x169D0B: cmd_check (main.c:10324)
==90600== by 0x11CDC6: cmd_execute (commands.h:125)
==90600== by 0x11D712: main (btrfs.c:386)
[CAUSE]
The problem happens like this:
deal_root_from_list(@list is empty)
|- stack @last is not initialized
|- while(!list_empty(list)) {} is skipped
|- run_next_block(&last);
   |- pick_next_pending(*last);
      |- node_start = last;
Since the stack @last is not initialized in deal_root_from_list(), the
final node_start = last assignment would just fetch the garbage from
stack.
[FIX]
Fix the problem by initializing @last to 0, as that's exactly what the
first while loop did.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
With INSTRUMENT=valgrind set, some fsck tests will fail, e.g. fsck/013:
====== RUN CHECK mount -t btrfs -o loop /home/adam/btrfs/btrfs-progs/tests//test.img /home/adam/btrfs/btrfs-progs/tests//mnt
==114106==
==114106== Warning: Can't execute setuid/setgid/setcap executable: /usr/bin/mount
==114106== Possible workaround: remove --trace-children=yes, if in effect
==114106==
valgrind: /usr/bin/mount: Permission denied
failed: mount -t btrfs -o loop /home/adam/btrfs/btrfs-progs/tests//test.img /home/adam/btrfs/btrfs-progs/tests//mnt
test failed for case 013-extent-tree-rebuild
[CAUSE]
Just as stated by valgrind itself, it can't handle programs with
setuid/setgid/setcap.
Thankfully in our case it's mount and we don't really care about it at
all.
[FIX]
Although we could use a complex skip pattern to skip mount in valgrind,
we don't really want to run valgrind on the mount or sudo commands anyway.
So here we do an extra check for whether we're running the mount command,
and if that's the case, just skip the $INSTRUMENT command.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
There are some reports of the fsck/001 test segfaulting in lowmem mode.
While I failed to reproduce it, valgrind still catches it with the
following output:
Delete backref in extent [12845056 1048576]
ERROR: file extent [257, 0] has unaligned disk bytenr: 755944791, should be aligned to 4096
ERROR: file extent[257 0] root 5 owner 5 backref lost
Deleted root 5 item[257, 108, 0]
==29080== Conditional jump or move depends on uninitialised value(s)
==29080== at 0x1A81D7: btrfs_release_path (ctree.c:97)
==29080== by 0x192C33: repair_extent_data_item (mode-lowmem.c:3330)
==29080== by 0x1962FF: check_leaf_items (mode-lowmem.c:4696)
==29080== by 0x196ABF: walk_down_tree (mode-lowmem.c:4858)
==29080== by 0x197762: check_btrfs_root (mode-lowmem.c:5157)
==29080== by 0x198335: check_chunks_and_extents_lowmem (mode-lowmem.c:5450)
==29080== by 0x166414: do_check_chunks_and_extents (main.c:8829)
==29080== by 0x169CF7: cmd_check (main.c:10313)
==29080== by 0x11CDC6: cmd_execute (commands.h:125)
==29080== by 0x11D712: main (btrfs.c:386)
==29080==
[CAUSE]
In repair_extent_data_item(), if we find an unaligned file extent, we
just delete it and kick in the hole punch procedure.
The problem is, the file extent deletion is done before initializing
@path. When the deletion succeeds, we goto the out label, which releases
@path, containing uninitialized values, and triggers the segfault.
[FIX]
Don't try to abort the transaction or free the path if we're going
through the file extent deletion routine.
Fixes: 0617bde3bc ("btrfs-progs: lowmem: delete unaligned bytes extent data under repair")
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Cleanups done separately:
* use the default test image, loop devices not needed for the test
* trim TEST_MNT from all paths
* send output is created inside the test filesystem
Signed-off-by: David Sterba <dsterba@suse.com>
This test case is the reproducer for the previous fix.
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When we process a clone request, we look up the source subvolume by
UUID, even if the source is the subvolume that we're currently
receiving. Usually, this is fine. However, if for some reason we
previously received the same subvolume, then this will use paths
relative to the previously received subvolume instead of the current
one. This is incorrect, since the send stream may use temporary names
for the clone source. This can be reproduced as follows:
    btrfs subvolume create subvol
    dd if=/dev/urandom of=subvol/foo bs=1M count=1
    cp --reflink subvol/foo subvol/bar
    mkdir subvol/dir
    mv subvol/foo subvol/dir/
    btrfs property set subvol ro true
    btrfs send -f send.data subvol
    mkdir first second
    btrfs receive -f send.data first
    btrfs receive -f send.data second
The second receive results in this error:
ERROR: cannot open first/subvol/o259-7-0/foo: No such file or directory
Fix it by always cloning from the current subvolume if its UUID matches.
This has the nice side effect of avoiding unnecessary UUID tree lookups
in that case.
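The selection logic might look roughly like the sketch below. All names
are hypothetical and the lookup callback stands in for the UUID tree
search; the 16-byte UUID size is assumed. It only shows that a matching
UUID short-circuits the lookup.

```c
#include <string.h>

#define FAKE_UUID_SIZE 16	/* assumed UUID length for this sketch */

/* Hypothetical subvolume descriptor. */
struct fake_subvol {
	unsigned char uuid[FAKE_UUID_SIZE];
	const char *path;
};

/* Stand-in for a UUID tree lookup. */
typedef const struct fake_subvol *(*lookup_fn)(const unsigned char *uuid);

/* Clone from the subvolume being received if its UUID matches the clone
 * source; only fall back to the lookup otherwise. */
static const struct fake_subvol *pick_clone_source(const struct fake_subvol *cur,
						   const unsigned char *clone_uuid,
						   lookup_fn lookup)
{
	if (memcmp(cur->uuid, clone_uuid, FAKE_UUID_SIZE) == 0)
		return cur;
	return lookup(clone_uuid);
}

/* Example lookup that counts its calls and finds nothing. */
static int lookup_calls;

static const struct fake_subvol *counting_lookup(const unsigned char *uuid)
{
	(void)uuid;
	lookup_calls++;
	return NULL;
}
```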
Fixes: f1c24cd80d ("Btrfs-progs: add btrfs send/receive commands")
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The checks for a subvolume being modified after it was received have
been commented out since they were added back in commit f1c24cd80d
("Btrfs-progs: add btrfs send/receive commands"). Let's just get rid of
the noise.
If they had ever been in place, it would never have been possible
to do an incremental send and run dedupe against the parent
snapshot.
That particular use case used to cause send, the kernel side, to fail
(initially with a BUG_ON() and later with -EIO returned to user
space), see commit b4f9a1a87a48 ("Btrfs: fix incremental send failure
after deduplication").
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
[ add Filipe's note ]
Signed-off-by: David Sterba <dsterba@suse.com>
Since 1d5b2ad9 ("btrfs-progs: qgroup-verify: Don't treat qgroup
difference as error if the fs hasn't initialized a rescan") a new
message is printed when the qgroups are inconsistent and the rescan
hasn't been executed, so remove the later message sent to stderr.
While in this function, simplify the check for a rescan that was not
executed, since !counts.rescan_running and counts.rescan_running == 0
mean the same thing.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Running the lowmem mode for check needs some setup and is not usually
tested, so add a new target that sets up the variables.
Signed-off-by: David Sterba <dsterba@suse.com>
Skip the test 013-subvolume-delete-by-id if the first valid attempt to
use the ioctl fails with 'Inappropriate ioctl for device'.
Signed-off-by: David Sterba <dsterba@suse.com>
Add a variant of mayfail helper that will duplicate the output to
results log and also provides it to the caller for processing. Can be
used for catching unsupported functionality or other special cases.
Signed-off-by: David Sterba <dsterba@suse.com>
xxhash's state and results are always in little endian, but in progs,
after the hash was calculated, it was copied to the final buffer via
memcpy, meaning it'd be parsed as a big endian number on big endian
machines.
This is incompatible with the kernel implementation of xxhash which
results in erroneous "checksum didn't match" errors on mount.
Fix it by using put_unaligned_le64 which always ensures the resulting
checksum will be copied in little endian format as the kernel expects
it.
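For illustration, an endian-independent 64-bit store can be written byte
by byte. This is a sketch in the spirit of put_unaligned_le64(), not its
actual implementation; store_le64 is a hypothetical name.

```c
#include <stdint.h>

/* Store a 64-bit value least significant byte first, regardless of the
 * host byte order and of the alignment of the destination. */
static void store_le64(uint64_t val, unsigned char *out)
{
	int i;

	for (i = 0; i < 8; i++)
		out[i] = (unsigned char)(val >> (8 * i));
}
```

Unlike memcpy() of the native representation, the resulting byte sequence
is the same on little endian and big endian hosts.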
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=206835
Fixes: f070ece2e9 ("btrfs-progs: add xxhash64 to mkfs")
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
If we don't find holes in our hole rb tree we'll just assume there's a
gap from 0 to the length of the file and print that out. But this
simply isn't correct: we could have a gap between the last extent and
the isize, or between 0 and the start of the first extent. Fix the error
message to tell us exactly where the hole is.
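A simplified sketch of locating the exact hole, assuming the extents are
sorted and non-overlapping and ignoring sector size rounding. The
structure and function names are hypothetical, not the check code itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical file extent: byte range [start, start + len). */
struct fake_extent {
	uint64_t start;
	uint64_t len;
};

/* Find the first gap in [0, isize) that is not covered by the sorted,
 * non-overlapping extents. Returns 1 and fills in the hole, or 0 if the
 * file is fully covered up to isize. */
static int find_first_hole(const struct fake_extent *ext, size_t nr,
			   uint64_t isize, uint64_t *hole_start,
			   uint64_t *hole_len)
{
	uint64_t pos = 0;
	size_t i;

	for (i = 0; i < nr && pos < isize; i++) {
		if (ext[i].start > pos) {
			/* Gap before this extent, clamped to isize. */
			uint64_t end = ext[i].start < isize ? ext[i].start : isize;

			*hole_start = pos;
			*hole_len = end - pos;
			return 1;
		}
		if (ext[i].start + ext[i].len > pos)
			pos = ext[i].start + ext[i].len;
	}
	if (pos < isize) {
		/* Gap between the last extent and isize. */
		*hole_start = pos;
		*hole_len = isize - pos;
		return 1;
	}
	return 0;
}
```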
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Lowmem check had the opposite problem of normal check: it caught gaps
that started at 0, but would still fail with my fixes in place. This is
because lowmem check doesn't take into account the isize of the inode.
Address this by making sure we do not complain about gaps that are after
isize. This makes lowmem pass with my fixes applied, and still fail
without my fixes.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When writing my test for the i_size patches, I noticed that I was not
actually failing without my patches as I should have been. This is
because we only check if the inode record extent end is < isize, we
don't check if the inode record extent start is > 0. Add this check to
make sure we're catching holes that start at the beginning of the file.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
On aarch64 with pagesize 64k, btrfs-convert of ext4 is successful,
but it won't mount because we don't yet support subpage blocksize, i.e.
when page size and sectorsize don't match.
BTRFS error (device vda): sectorsize 4096 not supported yet, only support 65536
So in this case during convert provide a warning but let the conversion
proceed.
Example:
WARNING: Blocksize 4096 is not equal to the pagesize 65536,
converted filesystem won't mount on this system.
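The check could be sketched like this. The function names are
hypothetical; sysconf(_SC_PAGESIZE) is the standard way to query the page
size at run time, and the message text follows the example above.

```c
#include <stdio.h>
#include <unistd.h>

/* Page size of the running system. */
static long host_pagesize(void)
{
	return sysconf(_SC_PAGESIZE);
}

/* Warn, but do not fail, when the block size differs from the page size.
 * Returns 1 if the warning was printed, 0 otherwise. */
static int warn_if_unmountable(long blocksize, long pagesize)
{
	if (blocksize == pagesize)
		return 0;
	fprintf(stderr,
		"WARNING: Blocksize %ld is not equal to the pagesize %ld,\n"
		"         converted filesystem won't mount on this system.\n",
		blocksize, pagesize);
	return 1;
}
```

A caller would pass the block size of the source filesystem and
host_pagesize(), then continue with the conversion either way.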
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The old code of copy_one_extent() is a mess:
- The main loop is implemented using goto
- @mirror_num is reset to 1 for each loop
- @mirror_num check against @num_copies is wrong for decompression error
This patch will fix this mess by:
- Use read_extent_data()
  read_extent_data() has all the good wrapping of btrfs_map_block()
  and length check.
  This removes a lot of complexity.
- Add extra file extent offset check
  To prevent underflow for memory allocation
- Do proper mirror_num check for decompression error
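The retry structure described by the list above can be sketched as a
plain loop over the mirrors plus a guard against the size underflow. This
is an illustration under assumptions, not the rewritten
copy_one_extent(): the callback type and all names are hypothetical.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical reader: returns 0 on success or a negative errno. */
typedef int (*read_fn)(void *buf, uint64_t len, int mirror_num);

/* Try mirrors 1..num_copies in order and stop at the first good read.
 * mirror_num is only advanced, never reset inside the loop. */
static int read_with_mirrors(read_fn do_read, void *buf, uint64_t len,
			     int num_copies)
{
	int ret = -EIO;	/* reported if there is no copy to try */
	int mirror_num;

	for (mirror_num = 1; mirror_num <= num_copies; mirror_num++) {
		ret = do_read(buf, len, mirror_num);
		if (ret == 0)
			break;
	}
	return ret;
}

/* Reject offsets beyond the extent so that (size - offset) cannot
 * underflow before it is used as an allocation size. */
static int extent_offset_valid(uint64_t offset, uint64_t size)
{
	return offset <= size;
}

/* Example reader for which only the first mirror is bad. */
static int fail_first_mirror(void *buf, uint64_t len, int mirror_num)
{
	(void)buf;
	(void)len;
	return mirror_num == 1 ? -EIO : 0;
}
```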
Issue: #221
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
That's a simple tool to calculate direntry hash that's used as part of
key in the b-trees, based on crc32c. There's also btrfs-crc that does
the same and has some additional features, so we can remove hasher.c.
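For reference, a bitwise crc32c is short enough to sketch here. The seed
of ~1 for directory entry names is an assumption based on the kernel's
btrfs_name_hash() helper; the function names below are hypothetical and
the table-less loop favors clarity over speed.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32C (Castagnoli, reflected polynomial 0x82F63B78) with no
 * implicit inversion before or after: the caller supplies the seed. */
static uint32_t crc32c_raw(uint32_t crc, const void *data, size_t len)
{
	const unsigned char *p = data;
	int bit;

	while (len--) {
		crc ^= *p++;
		for (bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1)));
	}
	return crc;
}

/* Assumption: names are hashed with the seed ~1, as the kernel's
 * btrfs_name_hash() does. */
static uint32_t name_hash(const char *name, size_t len)
{
	return crc32c_raw((uint32_t)~1, name, len);
}
```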
Signed-off-by: David Sterba <dsterba@suse.com>
Move the check of dmsetup to check_dm_target_support, and adapt the only
two places checking if dmsetup is present in the system. Now we skip the
tests if dmsetup isn't available, instead of marking the test as failed.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This way we ensure the linear target is available and skip the test.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
If dm-thin or dm-linear are not supported, let's skip the test
altogether instead of throwing an error.
Issue: #192
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This function will be used later to test if dm-thin is supported.
Suggested-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Previously, no filenames/xattrs would be printed with --nofilename, but
to keep the format of dump, print a placeholder instead of all names.
This is:
* directory entries (files, directories, subvolumes)
* default subvolume
* extended attributes (name, value)
* hardlink names if stored inside another item
Note that lengths are not hidden because they can be calculated from the
item size anyway.
Signed-off-by: David Sterba <dsterba@suse.com>
On the mailing list, it's pretty common for a developer to ask the
reporter for dump tree output. It's better to protect those kind
reporters by hiding the filenames if the reporter wants.
This option will skip @name/@data output for the following items:
- DIR_INDEX
- DIR_ITEM
- INODE_REF
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
According to the documentation, btrfs qgroup remove takes the same
options as qgroup assign, i.e., --rescan and --no-rescan. However,
currently no options are accepted. Activate option handling also for
qgroup remove, so that automatic rescan can be disabled by the user.
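Option handling of this kind is commonly done with getopt_long(). The
sketch below is not the btrfs-progs code: the function name is
hypothetical, and defaulting to a rescan is an assumption about the
intended behaviour.

```c
#include <getopt.h>
#include <stddef.h>

/* Returns 1 if a rescan should be scheduled, 0 if --no-rescan was given,
 * and -1 on an unknown option. Rescan by default (assumption). */
static int parse_rescan_option(int argc, char **argv)
{
	enum { OPT_RESCAN = 256, OPT_NO_RESCAN };
	static const struct option long_options[] = {
		{ "rescan", no_argument, NULL, OPT_RESCAN },
		{ "no-rescan", no_argument, NULL, OPT_NO_RESCAN },
		{ NULL, 0, NULL, 0 }
	};
	int rescan = 1;
	int c;

	optind = 1;	/* restart scanning at the first argument */
	while ((c = getopt_long(argc, argv, "", long_options, NULL)) != -1) {
		switch (c) {
		case OPT_RESCAN:
			rescan = 1;
			break;
		case OPT_NO_RESCAN:
			rescan = 0;
			break;
		default:
			return -1;
		}
	}
	return rescan;
}
```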
Signed-off-by: Michael Lass <bevan@bi-co.net>
Signed-off-by: David Sterba <dsterba@suse.com>
One reload_btrfs call is missing, add it.
Fixes: 0de2e22ad2 ("btrfs-progs: tests: Add tests for changing fsid feature")
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Su Yue <Damenly_Su@gmx.com>
Signed-off-by: David Sterba <dsterba@suse.com>
LOGICAL_INO v1 ignored the reserved fields, so they could be filled
with random stack garbage and have no effect. LOGICAL_INO_V2 requires
all unused reserved bits to be set to zero, and returns EINVAL if they
are not, to guard against future kernel versions which may interpret
non-zero bit values.
Sometimes when 'btrfs ins log' runs, the stack garbage is zeros, so the
-o (ignore offsets) option for logical-resolve works. Sometimes the
stack garbage is something else, and 'btrfs ins log -o' fails with
invalid argument. This depends mostly on compiler version and build
environment details, so a binary typically either always works or never
works.
Fix by initializing logical-resolve's argument structure with a C99
compound literal zero.
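The initialization technique can be sketched with a look-alike structure.
The layout below is hypothetical and only mimics an ioctl argument block
with reserved fields; the point is that members not named in a C99
compound literal are zero-initialized.

```c
#include <stdint.h>

/* Hypothetical argument block with reserved fields. */
struct fake_logical_ino_args {
	uint64_t logical;
	uint64_t size;
	uint64_t reserved[3];
	uint64_t flags;
	uint64_t inodes;
};

/* Members not named in the compound literal, including reserved[] and
 * flags, are zero-initialized, so no stack garbage reaches the kernel. */
static struct fake_logical_ino_args make_args(uint64_t logical, uint64_t size)
{
	return (struct fake_logical_ino_args){
		.logical = logical,
		.size = size,
	};
}
```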
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Signed-off-by: David Sterba <dsterba@suse.com>