The image has 2 problems mixed:
1) Too small super total_bytes
The super total_bytes was manually modified to create this problem.
2) Unaligned dev item total_bytes
This was created with a v4.12 kernel by adding a 128M + 2K device and
then removing the original device.
This leaves an image with unaligned dev item total_bytes.
Signed-off-by: Qu Wenruo <quwenruo.btrfs@gmx.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Along with the rescue functionality just introduced, also introduce check
and repair for these problems.
Unlike normal check functions, some of the checks are optional: even if
the image fails an optional check, the kernel can still run fine
(but may emit noisy kernel warnings).
So some checks, mainly those for alignment, will not cause btrfs check to
fail; they only output a warning with instructions on how to fix it.
For repair, it just calls the same repair function in rescue, and is
included in 'btrfs check --repair'.
But 'btrfs rescue' is still the preferred method, since it can be used
independently of all the 'check' passes if we know exactly which
problem to fix.
Signed-off-by: Qu Wenruo <quwenruo.btrfs@gmx.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Introduce new subcommand 'fix-device-size' to the rescue group, to fix
device size alignment-related problems.
Especially for people unable to mount their fs with super::total_bytes
mismatch, this tool will fix the problems and let the mount continue.
Reported-by: Asif Youssuff <yoasif@gmail.com>
Reported-by: Rich Rauenzahn <rrauenza@gmail.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Recent kernels (starting from v4.6) refuse to mount if the super block
total_bytes is smaller than the total size of all devices.
This leaves end users unable to do anything with their otherwise quite
healthy fs.
To fix this problem, introduce a repair function that fixes it on an
unmounted filesystem.
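
As a rough illustration of the condition described above (not the actual
btrfs-progs code), the check could be sketched as follows. The helper names
sum_dev_total_bytes() and super_size_too_small() are hypothetical, and the
sketch assumes the value to compare against is simply the sum of the
per-device sizes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: add up total_bytes of every device in the fs. */
static uint64_t sum_dev_total_bytes(const uint64_t *dev_sizes, size_t ndevs)
{
	uint64_t total = 0;
	size_t i;

	for (i = 0; i < ndevs; i++)
		total += dev_sizes[i];
	return total;
}

/* Return 1 if the super block total_bytes is smaller than all devices. */
static int super_size_too_small(uint64_t super_total,
				const uint64_t *dev_sizes, size_t ndevs)
{
	return super_total < sum_dev_total_bytes(dev_sizes, ndevs);
}
```

Under that assumption, a repair would write the value returned by
sum_dev_total_bytes() back into the super block.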
Reported-by: Asif Youssuff <yoasif@gmail.com>
Reported-by: Rich Rauenzahn <rrauenza@gmail.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Recent kernels introduced an alignment check for dev items, however older
kernels don't align the device size when adding a new device or shrinking
an existing device.
This causes a noisy kernel warning every time any DEV_ITEM gets updated.
Introduce a function to fix the device size on an unmounted filesystem.
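
A minimal sketch of the alignment involved, assuming the alignment unit is
the sector size (4096 is used as an example value) and that it is a power
of two. The helper names are hypothetical and are not taken from the patch.

```c
#include <stdint.h>

/* Round size down to a multiple of align (align must be a power of two). */
static uint64_t round_down_size(uint64_t size, uint64_t align)
{
	return size & ~(align - 1);
}

/* Return 1 if size is already a multiple of align. */
static int size_is_aligned(uint64_t size, uint64_t align)
{
	return (size & (align - 1)) == 0;
}
```

For example, a 128M + 2K device size is not aligned to 4096 and rounds down
to exactly 128M.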
Reported-by: Asif Youssuff <yoasif@gmail.com>
Reported-by: Rich Rauenzahn <rrauenza@gmail.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This patch updates help/document of "btrfs device remove" in two points:
1. Add explanation of 'missing' for 'device remove'. This is only
written in wikipage currently.
(https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices)
2. Add an example of device removal to the man page. This is because
the explanation of "remove" says "See the example section below", but
there is currently no example of removal.
Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
Reviewed-by: Satoru Takeuchi <satoru.takeuchi@gmail.com>
[ move "" from the macro to help strings ]
Signed-off-by: David Sterba <dsterba@suse.com>
State that the 'delete' is the alias of 'remove' as the man page says.
Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
Reviewed-by: Satoru Takeuchi <satoru.takeuchi@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The underscore types are for ioctl structures and should not be used for
regular code that does not need them.
Signed-off-by: David Sterba <dsterba@suse.com>
Currently "fi usage" (and "dev usage") cannot run for the filesystem
using the seed device.
This is because FS_INFO ioctl returns the number of devices excluding
seeds, but load_device_info() tries to access valid device from devid 0
to max_id, and results in accessing seeds too (thus causing mismatching
number of devices).
Since only the size of non-seed devices matters, fix this by just
skipping seed device by checking device's fsid and comparing it to the
fsid obtained by FS_INFO ioctl.
Anand Jain:
%fi_args.num_devices provides number of devices excluding the seed device.
So when looping through the device list for a given fsid, determine if the
given device is a seed device by reading its superblock and then skip it
if it's a seed device. Reading of the superblock is done by the function
dev_to_fsid(), which can fail if the user is not root or if the device has
media errors. So skip the seed check altogether if we fail to read
the device superblock and thus the fsid.
With this, we are now able to view the btrfs fi usage when the device is
bad.
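
The skipping logic described above could be sketched roughly like this. The
struct dev_entry layout and the count_non_seed() helper are hypothetical;
only the 16-byte fsid size matches btrfs.

```c
#include <stddef.h>
#include <string.h>

#define FSID_SIZE 16

/* Hypothetical per-device record used only for this sketch. */
struct dev_entry {
	unsigned long long devid;
	int fsid_known;			/* 0 if the superblock was unreadable */
	unsigned char fsid[FSID_SIZE];
};

/*
 * Count the devices that belong to the filesystem itself.  A device whose
 * fsid differs from the filesystem fsid is treated as a seed and skipped.
 * If the fsid could not be read, the seed check is skipped altogether and
 * the device is counted.
 */
static size_t count_non_seed(const struct dev_entry *devs, size_t ndevs,
			     const unsigned char *fs_fsid)
{
	size_t count = 0;
	size_t i;

	for (i = 0; i < ndevs; i++) {
		if (devs[i].fsid_known &&
		    memcmp(devs[i].fsid, fs_fsid, FSID_SIZE) != 0)
			continue;
		count++;
	}
	return count;
}
```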
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Move dev_to_fsid() from cmds-filesystem.c to cmds-fi-usage.c in order to
call it from both "fi show" and "fi usage".
Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
It seems to be a typo: in the (ret > 0) branch after check_mounted(),
zero-log sets the return value but doesn't return.
Fix it by adding back the missing return.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We must not free the string until we are done calling strtok.
If the string is freed too early, the second and subsequent
sort items are ignored.
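
A minimal sketch of the safe pattern, assuming a comma-separated list. The
function name count_sort_items() is hypothetical and only counts the items
instead of parsing them.

```c
#include <stdlib.h>
#include <string.h>

/* Count the comma-separated items in opt_arg without writing to it. */
static int count_sort_items(const char *opt_arg)
{
	size_t len = strlen(opt_arg) + 1;
	char *copy = malloc(len);
	char *p;
	int count = 0;

	if (!copy)
		return -1;
	memcpy(copy, opt_arg, len);

	/* strtok() keeps pointing into copy between calls. */
	for (p = strtok(copy, ","); p; p = strtok(NULL, ","))
		count++;

	/* Free the copy only after the last strtok() call. */
	free(copy);
	return count;
}
```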
Fixes: 9fcdf8f894 ("btrfs-progs: don't write to optarg in btrfs_qgroup_parse_sort_string")
Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
test_minimum_size() function is only called to check if provided device
is large enough.
However the minimal device size only needs to be calculated once, and
can be reused everywhere.
Refactor that function to make later modification easier.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
For rollback, we only need to open the fs to check whether it meets the
conditions for rollback. Opening it RW makes rollback fail on
btrfs with v2 space cache.
In fact, we don't even start a transaction during rollback.
So open the fs RO for rollback, to avoid the v2 space cache problem.
Reported-by: Gu Jinxiang <gujx@cn.fujitsu.com>
Reviewed-by: Gu JinXiang <gujx@cn.fujitsu.com>
Tested-by: Gu JinXiang <gujx@cn.fujitsu.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The --rootdir option starts a transaction to fill the fs. However, if
something goes wrong, from ENOSPC to lack of permission, we don't commit
the transaction, and the uncommitted transaction triggers a BUG_ON:
------
extent buffer leak: start 29392896 len 16384
extent_io.c:579: free_extent_buffer: BUG_ON `eb->flags & EXTENT_DIRTY` triggered, value 1
------
The root fix is to introduce btrfs_abort_transaction() in btrfs-progs,
however in this particular case we can work around it by force
committing the transaction.
During mkfs, the magic of btrfs is set to an invalid one, so without
setting fs_info->finalize_on_close() the fs can never be mounted.
So even if we force commit a wrong transaction, we won't make things
worse.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
On mkfs failure, especially --rootdir errors like EPERM/ENOSPC, the out
branch overwrites the return value, causing a wrong status code.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Since we're calling btrfs_search_slot() the return value can be
positive. However we just pass that return value out, causing undefined
return value.
This can cause mkfs to return 1, which indicates something wrong.
Fix it.
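
One way to picture the problem is a three-way search result where a negative
value is an error, 0 is an exact match and a positive value means "not
found". The sketch below is only an illustration of collapsing that result
before returning it to a caller; search_ret_to_status() and its
missing_is_error parameter are hypothetical and not part of the patch.

```c
#include <errno.h>

/*
 * Collapse a three-way search result into a plain status code:
 *   ret < 0  error, passed through unchanged
 *   ret == 0 exact match, success
 *   ret > 0  key not found; success or -ENOENT depending on the caller
 */
static int search_ret_to_status(int ret, int missing_is_error)
{
	if (ret > 0)
		return missing_is_error ? -ENOENT : 0;
	return ret;
}
```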
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When passing a directory larger than the block device using the --rootdir
parameter, we get the following backtrace:
------
extent-tree.c:2693: btrfs_reserve_extent: BUG_ON `ret` triggered, value -28
./mkfs.btrfs(+0x1a05d)[0x557939e6b05d]
./mkfs.btrfs(btrfs_reserve_extent+0xb5a)[0x557939e710c8]
./mkfs.btrfs(+0xb0b6)[0x557939e5c0b6]
./mkfs.btrfs(main+0x15d5)[0x557939e5de04]
/usr/lib/libc.so.6(__libc_start_main+0xea)[0x7f83b101af6a]
./mkfs.btrfs(_start+0x2a)[0x557939e5af5a]
------
Nothing special, just BUG_ON() abuse in ancient code.
Fix it by returning the error properly.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
There is a warning in btrfs-filesystem(8) saying that running 'defrag'
in Linux will almost certainly break ref-links, with much data potentially
being physically duplicated.
However, many users tend to read man pages *after* trying to run things
at their own risk and may miss this important information. This commit
adds a brief copy of this warning to the command's built-in help message,
where it has a good chance of being spotted before the user is stuck with
a crowded filesystem.
Pull-request: #73
Signed-off-by: Pavel Kretov <firegurafiku@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Add new test to check functionality of subvol get/set-default.
Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
[ fix style issues, add missing SUDO_HELPER ]
Signed-off-by: David Sterba <dsterba@suse.com>
Some people were asking why disabling compression via properties is not
done with "none" instead. As this is purely a userspace conversion to the ""
that the kernel accepts, let's add "none" as well for convenience.
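
The userspace conversion could look roughly like the sketch below. The
function name normalize_compression_value() is hypothetical. It maps both
"no" and "none" to the empty string and passes everything else through.

```c
#include <string.h>

/*
 * Map the user-facing values "no" and "none" to the empty string that
 * disables compression; any other value is returned unchanged.
 */
static const char *normalize_compression_value(const char *value)
{
	if (strcmp(value, "no") == 0 || strcmp(value, "none") == 0)
		return "";
	return value;
}
```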
Signed-off-by: David Sterba <dsterba@suse.com>
It's messy to use "" to disable compression. Introduce the new value "no"
which can also be used for this purpose.
Signed-off-by: Satoru Takeuchi <satoru.takeuchi@gmail.com>
[ coding style fixes ]
Signed-off-by: David Sterba <dsterba@suse.com>
This patch changes "subvol set-default" to also accept the subvolume path
for convenience.
If there are two args, they are assumed to be the subvol id and the path to
the fs (the same as the current behavior), and if there is only one arg, it
is assumed to be the path to the subvolume.
The subvol id is resolved by test_issubvolume() + lookup_path_rootid().
The empty subvol (ino == 2) gets an error from test_issubvolume(), which
checks whether the inode number is 256 or not.
Issue: #35
Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
[ update documentation, use the new multi-line command scheme ]
Signed-off-by: David Sterba <dsterba@suse.com>
The help string for some commands could be split into more lines for
clarity, e.g. as is now done in the receive command. The 'btrfs help'
listing should indent all the lines properly, similar to the
command-specific help with "usage:".
The syntax of the first help string line is to separate all command
usage schemas by "\n".
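
A minimal sketch of printing such a "\n"-separated usage string with every
line indented. The helper print_indented() and the example strings are
hypothetical; the real help code is structured differently.

```c
#include <stdio.h>
#include <string.h>

/*
 * Write each "\n"-separated line of usage to out, prefixed with indent.
 * Returns the number of lines written.
 */
static int print_indented(FILE *out, const char *usage, const char *indent)
{
	int lines = 0;

	while (*usage) {
		size_t len = strcspn(usage, "\n");

		fprintf(out, "%s%.*s\n", indent, (int)len, usage);
		lines++;
		usage += len;
		if (*usage == '\n')
			usage++;
	}
	return lines;
}
```

Calling print_indented(stdout, "cmd <a>\ncmd <b> <c>", "    ") writes two
lines, each indented by four spaces.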
Signed-off-by: David Sterba <dsterba@suse.com>
If one of the devices of a btrfs filesystem was pulled out and replaced
with a new one, the two devices have the same uuid.
If the old device gets reconnected, 'btrfs filesystem show' will show the
stale one instead of the new one. The kernel side of btrfs already has a
fix to exclude the stale one, so this could confuse users, as people may
monitor btrfs by running that command.
This does the same thing as what the kernel side has done.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
[ format string adjustments ]
Signed-off-by: David Sterba <dsterba@suse.com>
This test case guards against a crash in lowmem check mode.
The type field of an extent_inline_ref in an extent item is corrupted.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Lowmem check does not skip invalid type in extent_inline_ref and then
calls btrfs_extent_inline_ref_size(type) which causes a crash.
Error:
$ btrfs check --mode=lowmem /tmp/data_small
Checking filesystem on /tmp/data_small
UUID: ee205d69-8724-4aa2-a4f5-bc8558a62169
checking extents
ERROR: extent[20971520 16384] backref type mismatch, missing bit: 2
ERROR: extent[20971520 16384] backref generation mismatch,
wanted: 7, have: 0
ERROR: extent[20971520 16384] is referred by other roots than 3
ctree.h:1754: btrfs_extent_inline_ref_size: BUG_ON `1` triggered,
value 1
btrfs(+0x543db)[0x55fabc2ab3db]
btrfs(+0x587f7)[0x55fabc2af7f7]
btrfs(+0x5fa44)[0x55fabc2b6a44]
btrfs(cmd_check+0x194a)[0x55fabc2bd717]
btrfs(main+0x88)[0x55fabc2682e0]
/usr/lib/libc.so.6(__libc_start_main+0xea)[0x7f021c3824ca]
btrfs(_start+0x2a)[0x55fabc267e7a]
[1] 5188 abort (core dumped) btrfs check --mode=lowmem /tmp/data_small
Fix it by introducing check_extent_inline_ref() to check the type.
If the checker returns a non-zero value, we should not try to check the
corrupted extent item anymore.
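
As an illustration of such a type check (not the actual
check_extent_inline_ref() implementation), a validator could accept only the
known inline backref key types and reject everything else. The function name
inline_ref_type_valid() is hypothetical; the key values are those of the
btrfs on-disk format.

```c
#include <errno.h>

/* Backref key types from the btrfs on-disk format. */
#define BTRFS_TREE_BLOCK_REF_KEY	176
#define BTRFS_EXTENT_DATA_REF_KEY	178
#define BTRFS_SHARED_BLOCK_REF_KEY	182
#define BTRFS_SHARED_DATA_REF_KEY	184

/* Return 0 for a known inline ref type, -EINVAL otherwise. */
static int inline_ref_type_valid(int type)
{
	switch (type) {
	case BTRFS_TREE_BLOCK_REF_KEY:
	case BTRFS_EXTENT_DATA_REF_KEY:
	case BTRFS_SHARED_BLOCK_REF_KEY:
	case BTRFS_SHARED_DATA_REF_KEY:
		return 0;
	default:
		/* Corrupted type: the caller must stop parsing this item. */
		return -EINVAL;
	}
}
```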
Suggested-by: Qu Wenruo <quwenruo.btrfs@gmx.com>
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Return value of repair_root_items():
<0 on error
=0 does nothing
>0 if repair is enabled, N roots are repaired;
else N roots are corrupted.
In the repair mode, there should be no error if the return value is
bigger than 0. This fixes the test fsck/006 again.
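
The return value convention above could be interpreted by a caller roughly
as follows. This is a sketch under the stated convention;
root_items_status() is a hypothetical helper, and the choice of 1 for
"corruption found" is an assumption.

```c
/*
 * Interpret the return value of a repair_root_items()-style function:
 *   ret < 0              fatal error, passed through
 *   ret == 0             nothing to do, success
 *   ret > 0 with repair  ret roots were repaired, success
 *   ret > 0 no repair    ret roots are corrupted, report an error (1)
 */
static int root_items_status(int ret, int repair)
{
	if (ret < 0)
		return ret;
	if (ret > 0 && !repair)
		return 1;
	return 0;
}
```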
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The comment above repair_root_items() says:
"This must be run before any other repair code - not doing it so,
makes other repair code delete or modify backrefs in the extent tree
for example, which will result in an inconsistent fs after repairing
the root items."
However, the rule was broken by commit 1f728b1a51 ("Btrfs-progs,
fsck: move root items repair after root rebuilding").
That commit intended to fix the failure of test-fsck/013, so it moved
repair_root_items() after check_extents_and_chunks().
The correct way is to skip calling repair_root_items() when
init_extent_tree is non-zero.
Now put repair_root_items() before do_check_chunks_and_extents() and
do not call repair_root_items() if init_extent_tree is set.
Then test-fsck/013 works well.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
In original check mode (without option --repair), check_extent_refs()
always returns 0.
Add a variable @err to record status while checking extents. At the end
of check_extent_refs(), let it return -EIO if @err is non-zero.
The test fsck/006-bad-root-items will fail after this patch and is fixed by
the following patches.
Example:
$ btrfs check bad-extent-inline-ref-type.raw
Checking filesystem on bad-extent-inline-ref-type.raw
UUID: 1942d6fe-617b-4499-9982-cc8ffae5447f
checking extents
corrupt extent record: key 29360128 169 16384
ref mismatch on [29360128 16384] extent item 0, found 1
Backref 29360128 parent 5 root 5 not found in extent tree
backpointer mismatch on [29360128 16384]
bad extent [29360128, 29376512), type mismatch with chunk
checking free space cache
checking fs roots
checking csums
checking root refs
found 114688 bytes used, no error found
total csum bytes: 0
total tree bytes: 114688
total fs tree bytes: 32768
total extent tree bytes: 16384
btree space waste bytes: 109471
file data blocks allocated: 0
referenced 0
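
The error accumulation described above can be sketched as follows. The
record type and the function check_all_records() are hypothetical stand-ins
for the real extent records and check_extent_refs().

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an extent record; bad != 0 means mismatch. */
struct extent_rec_sketch {
	int bad;
};

/*
 * Walk all records, remember in err whether any of them was bad, and
 * report -EIO at the end instead of unconditionally returning 0.
 */
static int check_all_records(const struct extent_rec_sketch *recs, size_t n)
{
	int err = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (recs[i].bad)
			err = 1;	/* keep going to report every problem */
	}
	return err ? -EIO : 0;
}
```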
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
[ add note about the failing test, rename variable to err ]
Signed-off-by: David Sterba <dsterba@suse.com>
Add a macro named BG_ACCOUNT_ERROR meaning that block group used size
does not equal the total.
After extent-tree repair, BG_ACCOUNT_ERROR should be fixed up.
Clear the bits at the end of check_chunks_and_extents_v2().
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The only thing repair_extent_data_item() does is add the backref of the
tree_block, just like the original mode does.
It first searches for the corresponding extent item.
1. If the extent item exists but the backref is missing, add one backref to
the extent.
2. If nothing is found, add an extent item and add one backref.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The only thing repair_tree_block_ref() does is add the backref of the
tree_block, just like the original repair does.
It first searches for the corresponding extent item, then:
1. If the extent item exists but the backref is missing, add one backref to
the extent.
2. If nothing is found, add an extent item and add one backref.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Because this patchset concentrates on repair of the extent tree,
repair_chunk_item() now only inserts the missing chunk group item into the
extent tree.
Some things are left as TODO, for example the dev_item fix.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Introduce delete_extent_tree_item() and repair_extent_item() to do
only deletion.
While checking the extent tree, just delete the wrong item. For an extent
item, free the wrong backref. Otherwise, delete the item. So the remaining
items in the extent tree should be correct.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This patch is a preparation for extent-tree repair in lowmem mode.
In the lowmem mode, checking tree blocks of various trees is recursive.
But during repair, adding or deleting item(s) may modify upper nodes,
which makes the repair complicated and dangerous.
Before this patch:
One problem of lowmem check is that it only checks the lowest node's
backref in check_tree_block_ref.
This ensures that checked tree blocks are valid and avoids traversing
all trees, for performance reasons.
However, it has one shortcoming: it cannot detect a backref mistake
if an extent has owner == offset but lacks the other backref(s).
In check, correctness is more important than speed.
If errors cannot be detected, repair is impossible.
Changes in the patch:
check_chunks_and_extents now has to check *ALL* trees, so lowmem check
will behave like the original mode.
Changing the way of traversal to be the same as for the fs tree, which
calls walk_down_tree_v2() and walk_up_tree_v2(), makes further repair
easier.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
[ heavy coding style fixes ]
Signed-off-by: David Sterba <dsterba@suse.com>
Since lowmem mode can repair certain corruptions (mostly in the fs tree),
add a beacon file to fsck test cases to allow some of them to be
tested in lowmem mode.
With this patch, the fsck option override checks for the beacon file
".lowmem_repairable" in the same directory as the test image, and if the
beacon exists, it also runs lowmem mode repair on the
image.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>