Rename the function so its name falls in line with the rest of btrfs
nomenclature. So when we change the uuid of the header it's in fact of
the buffer header. Also remove the root argument since it's used solely
to take a reference to fs_info which can be done via the mandatory eb
argument. No functional changes.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This function already takes an extent buffer that contains a reference
to the fs_info. Use that and reduce argument count. No functional
changes.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Originally print-tree uses depth first search to print trees.
This works fine until we reach 3 level trees (root node level is 2).
For tall trees whose root level is 2 or higher, DFS will mix nodes and
leaves like the following example:
Level 2 A
/ \
Level 1 B C
/ | | \
Level 0 D E F G
DFS will cause the following output sequence:
A B D E C F G
Which in term of node/leave is:
N N L L N L L
Such mixed node/leave result is sometimes hard for human to read.
This patch will introduce 2 new options, --bfs and --dfs.
--dfs: keeps the original output sequence, and to keep things
compatible it's the default behavior.
--bfs: uses breadth-first search, and will cause better output
sequence like:
A B C D E F G
Which in term of node/leave is:
N N N L L L L
Much better sorted and may become default in the future.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Introduce a new function, bfs_print_children(), to do breadth-first
search to print a node.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Introduce a new function, btrfs_next_sibling_tree_block(), which could
find any sibling tree blocks at path->lowest_level, unlike level 0
limited btrfs_next_leaf().
Since this function is more generic than btrfs_next_leaf(), also make
btrfs_next_leaf() a wrapper of btrfs_next_sibling_tree_block(), to keep
the interface the same as kernel.
This would provide the basis for later breadth-first search print-tree.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
As the @root parameter is only used to get @fs_info, use fs_info
directly instead.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Factor out main logic of btrfs_util_subvolume_info_fd(). This is a
preparation work to relax the root privilege of this function. No
functional changes.
Signed-off-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Copy and add 3 definitions of new unprivileged ioctls:
* BTRFS_IOC_GET_SUBVOL_INFO
* BTRFS_IOC_GET_SUBVOL_ROOTREF
* BTRFS_IOC_INO_LOOKUP_USER
from kernel definitions. They will be used to implement the version
of "btrfs subvolume list/show" that will not require root privileges.
Signed-off-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Some information is obsolete and updated as follows:
- add missing explanations of some options
- remove outdated explanation of "subvolrootid" mount option
- reorder/group options of "subvolume list" so it matches the help
message
- add explanation of different meanings of parent in "parent ID/UUID"
- fix indentation/spelling
- add missing comma
Signed-off-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Similar to the changes where strerror(errno) was converted, continue
with the remaining cases where the argument was stored in another
variable.
The savings in object size are about 4500 bytes:
$ size btrfs.old btrfs.new
text data bss dec hex filename
805055 24248 19748 849051 cf49b btrfs.old
804527 24248 19748 848523 cf28b btrfs.new
Signed-off-by: David Sterba <dsterba@suse.com>
At restore time, btrfs-image will try to fixup dev extents, this will
trigger a transaction.
It's normally OK to start a new transaction, but for an image with log
tree, a new transaction will increment generation, and log tree generation
will not match super block generation + 1.
There is no good way to solve it yet. Just output a warning for now.
Signed-off-by: Qu Wenruo <wqu@suse.com>
[ update error message ]
Signed-off-by: David Sterba <dsterba@suse.com>
Make the checks in check_file_extent a bit more explicit. First we check
for unknown type and fail accordingly. Then we check for inline extent
and handle it in the newly introduced check_file_extent_inline. Finally
if none of the above checks triggered then we must have a regular or
prealloc extents.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Instead of having another top-level if which checks for
'extent_num_bytes != item_inline_len' only if we are !compressed, just
move the 'if' inside the 'else' branch of the first top-level if, since
it has already checked for !compressed or not. No functional changes.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Since the inline extent code can be largely self-sufficient, factor
it out from check_file_extent. No functional changes.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When using gcc8 to compile btrfs-progs, it complains as below:
ctree.c: In function 'btrfs_search_slot_for_read':
ctree.c:1249:45: warning: passing argument 3 of 'btrfs_search_slot'
discards 'const' qualifier from pointer target type
[-Wdiscarded-qualifiers]
ret = btrfs_search_slot(NULL, root, key, p, 0, 0);
Change btrfs_search_slot prototype with 'const' qualifier for argument 3.
Also fix similar problems as above change.
Signed-off-by: Su Yanjun <suyj.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Total of three conditions are tested. One for short name, one with
name length 255, the last one with more than 255.
This case should pass after commit
'btrfs-progs: change filename limit to 255 when creating subvolume'.
Signed-off-by: Su Yanjun <suyj.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Modify the file name length limit to meet the Linux naming convention.
In addition, the file name length is always bigger than 0, no need to
compare with 0 again.
Issue: #145
Signed-off-by: Su Yanjun <suyj.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When convert failed, the error messsage would look like:
create btrfs filesystem:
blocksize: 4096
nodesize: 16384
features: extref, skinny-metadata (default)
creating ext2 image file
ERROR: failed to create ext2_saved/image: -1
WARNING: an error occurred during conversion, filesystem is partially
created but not finalized and not mountable
We can only know something wrong happened during "ext2_saved/image" file
creation, but unable to know what exactly went wrong.
This patch will add the following error messages for create_image() and
its callee:
1) Sanity test error
2) Csum calculation error
3) Free ino number allocation error
4) Inode creation error
5) Inode mode change error
6) Inode link error
With all these error messages, we should be pretty easy to locate the
error without extra debugging.
Reported-by: Serhat Sevki Dincer <jfcgauss@gmail.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When pread64() returns value smaller than expected, it normally means
EIO, so just return -EIO to replace the intermediate number. So when IO
fails, we should be able to get more meaningful error number of than
EPERM.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Simple test case which preps a filesystem, then corrupts the FST and
finally repairs it. Tests both extent based and bitmap based FSTs.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Now that all the prerequisite code for proper support of free space
tree repair is in, it's time to wire it in. This is achieved by first
hooking the freespace tree to the __free_extent/alloc_reserved_tree_block
functions. And then introducing a wrapper function to contains the
existing check_space_cache and the newly introduced repair code.
Finally, it's important to note that FST repair code first clears the
existing FST in case of any problem found and rebuilds it from scratch.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The RO_FREE_SPACE_TREE(_VALID) flags are required in order to be able
to open an FST filesystem in repair mode. Add them to
BTRFS_FEATURE_COMPAT_RO_SUPP.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
For now this doesn't change the functionality since FST code is not yet
enabled via the compat bits. But this will be needed when it's enabled
so that the FST is correctly modified during repair operations that
allocate/deallocate extents.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
To help implement free space tree checker in user space some kernel
function are necessary, namely iterating/deleting/adding freespace
items, some internal search functions. Functions to populate a block
group based on the extent tree. The code is largely copy/paste from
the kernel with locking eliminated (i.e free_space_lock). It supports
reading/writing of both bitmap and extent based FST trees.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This commit introduces explicit little endian bit operations. The only
difference with the existing bitops implementation is that bswap(32|64)
is called when the _le versions are invoked on a big-endian machine.
This is in preparation for adding free space tree conversion support.
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Replace existing find_*_bit functions with kernel equivalent. This
reduces duplication, simplifies the code (we really have one worker
function _find_next_bit) and is quite likely faster. No functional
changes.
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Those functions are in preparation for adding the freespace tree repair
code since it needs to be able to deal with bitmap based FSTs. This
patch adds extent_buffer_bitmap_set and extent_buffer_bitmap_clear
functions. Since in userspace we don't have to deal with page mappings
their implementation is vastly simplified by simply setting each bit in
the passed range.
Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
For completeness sake add code to btrfs_read_fs_root so that it can
handle the freespace tree.
Reviewed-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Now that delayed refs have been wired let's merge the two function. In
the process also remove one BUG_ON since alloc_reserved_tree_block's
callers can handle errors. No functional changes.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Now that delayed refs have been all wired up clean up the __free_extent2
adapter function since it's no longer needed. No functional changes.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Given that the new delayed refs infrastructure is implemented and wired
up, there is no point in keeping the old code. So just remove it.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This commit enables the delayed refs infrastructures. This entails doing
the following:
1. Replacing existing calls of btrfs_extent_post_op (which is the
equivalent of delayed refs) with the proper btrfs_run_delayed_refs.
As well as eliminating open-coded calls to finish_current_insert and
del_pending_extents which execute the delayed ops.
2. Wiring up the addition of delayed refs when freeing extents
(btrfs_free_extent) and when adding new extents (alloc_tree_block).
3. Adding calls to btrfs_run_delayed refs in the transaction commit
path alongside comments why every call is needed, since it's not
always obvious (those call sites were derived empirically by running
and debugging existing tests)
4. Correctly flagging the transaction in which we are reinitialising
the extent tree.
5. Moving btrfs_write_dirty_block_groups to
btrfs_write_dirty_block_groups since blockgroups should be written to
disk after the last delayed refs have been run.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The root argument is used only to get a reference to the fs_info, this
can be achieved with the transaction handle being passed so use that.
This is in preparation for moving this function in the main transaction
commit routine. No functional changes.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This commit pulls those portions of the kernel implementation of
delayed refs which are necessary to have them working in user-space.
I've done the following modifications:
1. Replaced all kmem_cache_alloc calls to kmalloc.
2. Removed all locking-related code, since we are single threaded in
userspace.
3. Removed code which deals with data refs - delayed refs in user space
are going to be used only for cowonly trees.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This is a simple adapter function to convert the delayed-refs structures
to the current arguments of alloc_reserved_tree_block.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This is a simple adapter to convert the arguments delayed ref arguments
to the existing arguments of __free_extent.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
BTRFS_IOC_QGROUP_ASSIGN ioctl could return >0 if qgroup is marked
inconsistent after successful relationship assignment/removal.
We leak the return value as the final return value of btrfs command.
But according to the man page, return value other than 0 means failure.
Fix this by resetting the return value to 0 for --no-rescan case.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
During fuzz/007 we hit the following error:
====== RUN MAYFAIL btrfs rescue super-recover -y -v tests/fuzz-tests/images/bko-200409.raw.restored.scratch
ERROR: tree_root block unaligned: 33554431
ERROR: superblock checksum matches but it has invalid members
ERROR: tree_root block unaligned: 33554431
ERROR: superblock checksum matches but it has invalid members
ERROR: tree_root block unaligned: 33554431
ERROR: superblock checksum matches but it has invalid members
ERROR: failed to add chunk map start=12582912 len=8454144: -17 (File exists)
Couldn't read chunk tree
failed (ignored, ret=139): btrfs rescue super-recover -y -v tests/fuzz-tests/images/bko-200409.raw.restored.scratch
mayfail: returned code 139 (SEGFAULT), not ignored
test failed for case 007-simple-super-recover
[CAUSE]
In __open_ctree_fd(), if we have valid @open_ctree_flags and
btrfs_scan_fs_devices() succeeds without problems, no matter what
happens we will call btrfs_close_devices(), thus free all related
devices.
In super-recover, before we call open_ctree(), we have called
btrfs_scan_fs_devices() already, so btrfs_scan_fs_devices() should not
fail in open_ctree(), fs_devices will always be freed in open_ctree() or
close_ctree().
[FIX]
So in super-recover.c, we should not call btrfs_close_devices(), or we
will find fs_devices->list freed, and trigger segfault when exiting.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Another BUG_ON() during fuzz/003:
====== RUN MAYFAIL btrfs check --repair tests/fuzz-tests/images/bko-199833-reloc-recovery-crash.raw.restored
[1/7] checking root items
Fixed 0 roots.
[2/7] checking extents
ctree.c:1650: leaf_space_used: Warning: assertion `data_len < 0` failed, value 1
bad key ordering 18 19
bad block 29409280
ERROR: errors found in extent allocation tree or chunk allocation
WARNING: minor unaligned/mismatch device size detected
WARNING: recommended to use 'btrfs rescue fix-device-size' to fix it
[3/7] checking free space cache
[4/7] checking fs roots
ctree.c:1650: leaf_space_used: Warning: assertion `data_len < 0` failed, value 1
bad key ordering 18 19
root 18446744073709551608 missing its root dir, recreating
Unable to find block group for 0
Unable to find block group for 0
Unable to find block group for 0
volumes.c:564: btrfs_alloc_dev_extent: BUG_ON `ret` triggered, value -28
failed (ignored, ret=134): btrfs check --repair tests/fuzz-tests/images/bko-199833-reloc-recovery-crash.raw.restored
mayfail: returned code 134 (SIGABRT), not ignored
test failed for case 003-multi-check-unmounted
However the culprit function btrfs_alloc_dev_extent() has proper error
handling label err:, just using that label would solve the problem easily.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
An infinite loop can be triggered during fuzz/003:
====== RUN MAYFAIL btrfs check --repair tests/fuzz-tests/images/bko-199833-reloc-recovery-crash.raw.restored
[1/7] checking root items
Fixed 0 roots.
[2/7] checking extents
ctree.c:1650: leaf_space_used: Warning: assertion `data_len < 0` failed, value 1
bad key ordering 18 19
ctree.c:1650: leaf_space_used: Warning: assertion `data_len < 0` failed, value 1
bad key ordering 18 19
ctree.c:1650: leaf_space_used: Warning: assertion `data_len < 0` failed, value 1
bad key ordering 18 19
[CAUSE]
In try_to_fix_bad_block() it's possible that btrfs_find_all_roots()
finds no root referring to that tree block, thus we can't do any repair.
However in that case, we still return 0 since the last caller assigning
@ret is btrfs_find_all_roots(), and the ulist while loop doesn't get run
at all.
And since try_to_fix_bad_block() returns 0, check_block() in
check/main.c will return -EAGAIN to re-check the tree block.
This leads to the infinite loop.
[FIX]
Change the default return value from 0 to -EIO in
try_to_fix_bad_block(), so if there is no tree referring to the bad tree
block, it won't cause infinite loop anymore.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Another BUG_ON() during fuzz/003:
====== RUN MAYFAIL btrfs check --init-csum-tree tests/fuzz-tests/images/bko-161821.raw.restored
[1/7] checking root items
Fixed 0 roots.
[2/7] checking extents
parent transid verify failed on 4198400 wanted 14 found 1114126
parent transid verify failed on 4198400 wanted 14 found 1114126
Ignoring transid failure
owner ref check failed [4198400 4096]
repair deleting extent record: key [4198400,169,0]
adding new tree backref on start 4198400 len 4096 parent 0 root 5
Repaired extent references for 4198400
ref mismatch on [4222976 4096] extent item 1, found 0
backref 4222976 root 7 not referenced back 0x5617f8ecf780
incorrect global backref count on 4222976 found 1 wanted 0
backpointer mismatch on [4222976 4096]
owner ref check failed [4222976 4096]
repair deleting extent record: key [4222976,169,0]
Repaired extent references for 4222976
[3/7] checking free space cache
[4/7] checking fs roots
parent transid verify failed on 4198400 wanted 14 found 1114126
Ignoring transid failure
Wrong generation of child node/leaf, wanted: 1114126, have: 14
root 5 missing its root dir, recreating
parent transid verify failed on 4198400 wanted 14 found 1114126
Ignoring transid failure
ERROR: child eb corrupted: parent bytenr=4222976 item=0 parent level=1 child level=2
ERROR: errors found in fs roots
extent buffer leak: start 4222976 len 4096
extent_io.c:611: free_extent_buffer_internal: BUG_ON `eb->flags & EXTENT_DIRTY` triggered, value 1
failed (ignored, ret=134): btrfs check --init-csum-tree tests/fuzz-tests/images/bko-161821.raw.restored
mayfail: returned code 134 (SIGABRT), not ignored
test failed for case 003-multi-check-unmounted
Since we're shifting to use btrfs_abort_transaction() in btrfs-progs,
it will be more and more common to see dirty leaked eb. Instead of
BUG_ON(), we only need to report it as a warning.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When running test fuzz/003, we could hit the following BUG_ON:
====== RUN MAYFAIL btrfs check --init-csum-tree tests//fuzz-tests/images/bko-155621-bad-block-group-offset.raw.restored
Unable to find block group for 0
Unable to find block group for 0
Unable to find block group for 0
extent-tree.c:2657: alloc_tree_block: BUG_ON `ret` triggered, value -28
failed (ignored, ret=134): btrfs check --init-csum-tree tests/fuzz-tests/images/bko-155621-bad-block-group-offset.raw.restored
mayfail: returned code 134 (SIGABRT), not ignored
test failed for case 003-multi-check-unmounted
Just remove that BUG_ON() and allow us to exit gracefully, the caller
handles the errors.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
- group entries that belong together
- add / prefix for files that are at fixed location
- remove obsolte build targets
- remove automake support scripts
- add missing targets (.static)
Signed-off-by: David Sterba <dsterba@suse.com>
There's a regular manual page that matches the file glob mask *.8 so we
have to be more careful and remove only the known intermediate files.
Signed-off-by: David Sterba <dsterba@suse.com>
The manual pages are not compressed anymore and we can remove gzip from
build dependencies and build steps.
Signed-off-by: David Sterba <dsterba@suse.com>