Function find_possible_backrefs() is used to locate the file extents
referring to an data extent.
For data extent backref, its btrfs_extent_data_ref structure has
the following members:
- root
Which root refers to this data extent
- objectid
Which inode refers to this data extent
- offset
Search *hint*.
Its value is @file_offset - @extent_offset.
While for @file_offset, it's directly recorded in (INO EXTENT_DATA
FILE_OFFSET) key.
So when searching the file extents refers to this data extent, we can't
use btrfs_extent_data_ref::offset as search key::offset.
We must search from file offset 0, and iterate all file extents until we
hit a file extent matches the data backref.
Thankfully such time consuming behavior is not triggered frequently,
it only gets called for repair, so it shouldn't affect normal check
routine.
Signed-off-by: Su Yanjun <suyj.fnst@cn.fujitsu.com>
[Update commit message]
Signed-off-by: Qu Wenruo <wqu@suse.com>
Commit 0ddf63c09f ("btrfs-progs: Record orphan data extent ref to
corresponding root.") introduces the ability to record a file extent
even all other related info is lost (data backref, inode item).
However this patch only records such info without doing any proper
repair, further more, it could even record invalid file extents, and the
report part only happens after all check is done.
Since we will later introduce proper file extent repair functionality,
we could revert that patch.
Signed-off-by: Su Yanjun <suyj.fnst@cn.fujitsu.com>
[Update commit message, solve merge conflicts]
Signed-off-by: Qu Wenruo <wqu@suse.com>
Commit ad03f840f0 ("btrfs-progs: Add repair and report function for
orphan file extent.") will record and try to repair orphan file extents
by:
- Removing the orphan file extent item if no extent backref can be found
Or
- Re-insert a file extent using data backref
Especially the later case is far from ideal, as normally extent tree is
more fragile and corruption prone.
Use any data from extent tree to try to repair could easily lead to
further corruption.
So here we revert commit ad03f840f0 ("btrfs-progs: Add repair and report
function for orphan file extent.") to cleanup the space for later proper
repair in original mode.
Signed-off-by: Su Yanjun <suyj.fnst@cn.fujitsu.com>
[Update commit message, solve conflicts with DIR_ITEM hash mismatch patchset]
Signed-off-by: Qu Wenruo <wqu@suse.com>
If found a extent data item has unaligned part, lowmem repair
just deletes it.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
The function can delete items in trees besides extent tree.
Rename and move it for further use.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
[Update comment, solve merge conflicts]
Signed-off-by: Qu Wenruo <wqu@suse.com>
Add support to check unaligned disk_bytenr for extent_data.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Previously, @err are assigned immediately after check but before
repair.
repair_extent_item()'s return value also confuses the caller. If
error has been repaired and returns 0, check_extent_item() will try
to continue check the nonexistent and cause flase alerts.
Here make repair_extent_item()'s return codes only represents status
of the extent item, error bits are handled in caller of the repair
function.
Change of @err after repair.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
[Solve conflicts with DIR_ITEM hash mismatch patchset]
Signed-off-by: Qu Wenruo <wqu@suse.com>
For files, lowmem repair will try to check nbytes and isize,
but isize check depends nbytes.
Once bytes has been repaired, then isize should be checked and
repaired.
So move nbytes check before isize check. Also set nbytes to
extent_size once repaired successfully.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Since repair will do CoW, the outer path may be invalid.
This patch will add an argument, @path, to punch_extent_hole().
When punch_extent_hole() returns, path will still point to the same key.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
[Update comment and commit message]
Signed-off-by: Qu Wenruo <wqu@suse.com>
The 'end' parameter of check_file_extent tracks the ending offset of the
last checked extent. This is used to detect gaps between adjacent extents.
Currently such gaps are wrongly detected since for regular extents only
the size of the extent is added to the 'end' parameter. This results in
wrongly considering all extents of a file as having gaps between them
when only 2 of them really have a gap as seen in the example below.
Solution:
The extent_end variable should set to the sum of the offset and the
extent_num_bytes of the file extent.
Example:
Suppose that lowmem check the following file extent of inode 257.
item 6 key (257 EXTENT_DATA 0) itemoff 15813 itemsize 53
generation 6 type 1 (regular)
extent data disk byte 13631488 nr 4096
extent data offset 0 nr 4096 ram 4096
extent compression 0 (none)
item 7 key (257 EXTENT_DATA 8192) itemoff 15760 itemsize 53
generation 6 type 1 (regular)
extent data disk byte 13631488 nr 4096
extent data offset 0 nr 4096 ram 4096
extent compression 0 (none)
item 8 key (257 EXTENT_DATA 12288) itemoff 15707 itemsize 53
generation 6 type 1 (regular)
extent data disk byte 13631488 nr 4096
extent data offset 0 nr 4096 ram 4096
extent compression 0 (none)
For inode 257, check_inode_item set extent_end to 0, then call
check_file_extent to check item {6,7,8}.
item 6)
offset(0) == extent_end(0)
extent_end = extent_end(0) + extent_num_bytes(4096)
item 7)
offset(8192) != extent_end(4096)
extent_end = extent_end(4096) + extent_num_bytes(4096)
^^^
The old extent_end should replace by offset(8192).
item 8)
offset(12288) != extent_end(8192)
^^^
But there is no gap between item {7,8}.
Fixes: d88da10ddd ("btrfs-progs: check: introduce function to check file extent")
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
[Move this patch as the 1st patch, since it's an independent fix]
Signed-off-by: Qu Wenruo <wqu@suse.com>
get_argv0_buf is used only in two functions in help.c which also has
direct access to the static argv0_buf. All other references to this
buffer are direct. So remove the function. No functional changes.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Reuse extent-cache facility to record multiple bytenr so --block can be
specified multiple times.
Despite that, add a sector size alignment check before we try to print a
tree block. Please note that, nodesize alignment check is not suitable
here as meta chunk start bytenr could be unaligned to nodesize.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
In github issues, one user reports unexpected ENOSPC error if enabling
datasum druing convert. After some investigation, it looks like that
during ext2_saved/image creation, we could create large file extent
whose size can be 128M (max data extent size).
In that case, its csum block will be at least 128K. Under certain case
we need to allocate extra metadata chunks to fulfill such space
requirement.
However we only do metadata prealloc if we're reserving extents for fs
trees. (we use btrfs_root::ref_cows to determine whether we should do
metadata prealloc, and that member is only set for fs trees).
There is no explaination on why we only do metadata prealloc for file
trees, but at least from my investigation, it could be related to avoid
nested extent tree modication.
At least extent reservation for csum tree shouldn't be a problem with
metadata block group preallocation.
So change the metadata block group preallocation check from
"root->ref_cow" to "root->root_key.objectid !=
BTRFS_EXTENT_TREE_OBJECTID", and add some comment for it.
Issue: #123
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Defragging an executable conflicts both way with it being run, resulting in
ETXTBSY. This either makes defrag fail or prevents the program from being
executed.
Kernels 4.19-rc1 and later allow defragging files you could have possibly
opened rw, even if the passed descriptor is ro (commit 616d374efa23
"btrfs: allow defrag on a file opened read-only that has rw
permissions").
Signed-off-by: Adam Borowski <kilobyte@angband.pl>
Signed-off-by: David Sterba <dsterba@suse.com>
The code fails if the third section is missing (like "4.18") or is followed
by anything but "." or "-". This happens for example if we're not exactly
at a tag and CONFIG_LOCALVERSION_AUTO=n (which results in "4.18.5+").
Signed-off-by: Adam Borowski <kilobyte@angband.pl>
Signed-off-by: David Sterba <dsterba@suse.com>
Provide an option in `btrfs receive` to suppress the informational
messages when writing files.
Signed-off-by: Steven Davies <btrfs@steev.me.uk>
Signed-off-by: David Sterba <dsterba@suse.com>
A lot of editor/IDE related config files are dotfiles, like .vimrc
or .clang_complete.
Instead of adding gitignore entry for each editor/IDE, just ignore all
dotfiles.
Files tracked by git are not ignored and can be updated as usual.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Print a better message when there are unknown problems while parsing the
sort string. This currently is -ENOMEM and -1 on uknown format. This
will be changed.
Signed-off-by: David Sterba <dsterba@suse.com>
Since commit 208ba29007 "btrfs-progs: qgroup assign: can't handle
options", the unknown options will not be reported for qgroup
assign/remove as this was mistakenly removed from the caller.
Signed-off-by: David Sterba <dsterba@suse.com>
The error values returned from the command are ad-hoc and don't have
much meaning, the error message tells the user what's wrong. Use only
the common values 0 for ok and 1 other error.
Signed-off-by: David Sterba <dsterba@suse.com>
Let the specific error be the last message printed and don't dump the
whole usage that's not quite helpful in this context.
Signed-off-by: David Sterba <dsterba@suse.com>
The error message about the unsatisfied argument count is scrolled away
by the full usage string dump. This is not considered a good usability
practice.
This commit switches all direct usage -> return patterns, where the
argument check has no other constraint, eg. dependency on an option.
Signed-off-by: David Sterba <dsterba@suse.com>
Update handling of unknown option in all commands. This will not print
only the unknown option and short pointer to help. Dumping the whole
help was a bad idea that stuck for too long.
Signed-off-by: David Sterba <dsterba@suse.com>
Currently any unrecognized option does not print very usable message and
only dumps the whole help. Other common utilities (eg. from the
util-linux suite) print a short message and point to help. And we're
going to do the same.
Example:
$ btrfs device add --unknown device path
btrfs device add: unrecognized option '--unknown'
Try 'btrfs device add --help' for more information
Signed-off-by: David Sterba <dsterba@suse.com>
GCC 8.2.1 will report the following error:
check/main.c: In function 'try_repair_inode':
check/main.c:2606:5: warning: 'ret' may be used uninitialized in this function [-Wmaybe-uninitialized]
if (!ret) {
^
check/main.c:2584:6: note: 'ret' was declared here
int ret;
^~~
The offending code is in repair_mismatch_dir_hash():
int ret;
printf(
"Deleting bad dir items with invalid hash for root %llu ino %llu\n",
root->root_key.objectid, rec->ino);
while (!list_empty(&rec->mismatch_dir_hash)) {
/* do some repair */
}
if (!ret) { <<< Here
/* do some fix */
}
The truth is, to enter try_repair_inode(), we must have
I_ERR_MISMATCH_DIR_HASH bit set for rec->errors.
And just after we set I_ERR_MISMATCH_DIR_HASH, we call
add_mismatch_dir_hash() and handled its error correctly.
So it's impossible to to skip the while loop.
Fix it by initializing @ret to -EUCLEAN, so even we hit some impossible
case, repair_mismatch_dir_hash() won't falsely consider the mismatch
hash fixed.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Replace start fails to report the appropriate error if balance is already
running, as below:
$ btrfs rep start -B -f /dev/sdb /dev/sde /btrfs
ERROR: ioctl(DEV_REPLACE_START) on '/btrfs' returns error: <illegal result value>
Translate the positive values, the exclusive operation is reported as
BTRFS_ERROR_DEV_EXCL_RUN_IN_PROGRESS when balance is running.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Userspace understands the ioctl BTRFS_IOC_DEV_REPLACE command status
using the struct btrfs_ioctl_dev_replace_args::result, and so userspace
initializes this to BTRFS_IOCTL_DEV_REPLACE_RESULT_NO_RESULT, so exclude
this value in checking for the error.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Mkfs tends to create pretty large metadata chunk compared to kernel:
Node size: 16384
Sector size: 4096
Filesystem size: 10.00GiB
Block group profiles:
Data: single 8.00MiB
Metadata: DUP 1.00GiB
System: DUP 8.00MiB
While kernel only tends to create 256MiB metadata chunk:
/* for larger filesystems, use larger metadata chunks */
if (fs_devices->total_rw_bytes > 50ULL * SZ_1G)
max_stripe_size = SZ_1G;
else
max_stripe_size = SZ_256M;
This won't cause problems in real world, but it's still better to make
the behavior unified.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Unlike kernel, btrfs-progs doesn't (yet) support devices grow/shrink,
the port only needs to handle open_ctree() time initialization (at
read_one_dev()), and btrfs_add_device() used for mkfs.
This provide the basis for incoming unification of chunk allocator
behavior.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This patch adds option --forget to 'device scan'
$ btrfs device scan --forget [dev...]
to unregister the given device from kernel module. The device cannot be
part of a mounted filesystem, this will be reported as an error.
If no argument is given it will unregister all stale (device which are
not mounted) from the kernel.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
[ update changelog ]
Signed-off-by: David Sterba <dsterba@suse.com>
This reverts commit cb8abddb20.
There are several reports in IRC that this patch breaks in some
send/receive environments. There are no exact steps to reproduce, only
approximate descroptions. Until a proper reproducer is known, the patch
is temporarily reverted due to the user-visible impact.
Issue: #162
Signed-off-by: David Sterba <dsterba@suse.com>
Since this row is longer than the rest it's enough to use a single tab
character to delimig the value from caption. Without this patch
output looked like:
nodesize 16384
leafsize (deprecated) 16384
With it:
nodesize 16384
leafsize (deprecated) 16384
No functional changes just making the output a bit neater.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Use the Xenial dist for CI so now we have only 2 years old base version,
instead of 4 years old. Kernel is 4.15, gcc is 5.4, clang 7.0.
Signed-off-by: David Sterba <dsterba@suse.com>
The autodetection of conversion filesystems support will skip the tests
if the library is not available for some reason. This now happens for
reiserfs in the CI environment. We need to know about such case so
request the support.
Signed-off-by: David Sterba <dsterba@suse.com>