When btrfs-progs walks down the tree, it does not check whether the child
node/leaf is valid.
In fact, there are corrupted images whose csums are all valid but whose
parent node points to an invalid leaf.
In my case, the parent node in the fs tree points to an invalid leaf (gen 11),
whose generation (15) and first key (EXTENT_TREE ROOT_ITEM 0) are
completely invalid and will cause a BUG_ON in process_inode_item().
Unfortunately, we are unable to fix this when it happens.
So we can only output a meaningful error message and avoid the insane
node/leaf, which is still much better than the original BUG_ON().
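A minimal sketch of the kind of check described above, assuming the parent
pointer records the expected generation and first key of its child. All
demo_* names are hypothetical stand-ins, not the btrfs-progs structures, and
the real patch may check different fields.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, simplified stand-ins for the on-disk structures. */
struct demo_key {
	uint64_t objectid;
	uint8_t type;
	uint64_t offset;
};

/* What the parent node records about one child. */
struct demo_ptr {
	struct demo_key key;
	uint64_t generation;
};

/* What we actually read back for that child. */
struct demo_block {
	uint64_t generation;
	struct demo_key first_key;
};

int demo_key_equal(const struct demo_key *a, const struct demo_key *b)
{
	return a->objectid == b->objectid && a->type == b->type &&
	       a->offset == b->offset;
}

/* Return 0 if the child matches its parent pointer, -1 otherwise. */
int demo_check_child(const struct demo_ptr *ptr,
		     const struct demo_block *child)
{
	if (child->generation != ptr->generation) {
		fprintf(stderr, "child generation mismatch: want %llu, have %llu\n",
			(unsigned long long)ptr->generation,
			(unsigned long long)child->generation);
		return -1;
	}
	if (!demo_key_equal(&ptr->key, &child->first_key)) {
		fprintf(stderr, "child first key does not match parent pointer\n");
		return -1;
	}
	return 0;
}
```

A caller walking down the tree would skip the child and report an error when
demo_check_child() returns -1, instead of descending into it.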
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
It is highly obnoxious to have to go put in a testdev when all you really want
is to run the quick image tests. Make this part optional so if we don't have a
testdev specified we just don't run that particular test. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
If scrub is neither cancelled nor finished, the recorded status will prevent
scrub from starting again even though it's not running. There's a force option to
run it anyway, but this is just a bandaid and the true status of scrub
should be detected automatically. The force option should not be
necessary anymore.
The test introduced in 9681f82853 checks only the status file,
not kernel status of scrub.
Signed-off-by: David Sterba <dsterba@suse.cz>
We should kill free_some_buffers() to stop reclaiming extent buffers or
we will hit a problem described below.
As of commit 53ee1bccf9, we are not
counting a reference for tree->lru anymore. However free_some_buffers()
is still left and is reclaiming extent buffers whose @refs == 1. This
causes extent buffers to be reclaimed unintentionally. Thus the following
steps could happen:
1. A buffer at address A is reclaimed by free_some_buffers()
(address A is also free()ed)
2. Some code calls alloc_extent_buffer()
3. Address A is assigned to the newly allocated buffer
4. You see the buffer pointed to by A suddenly change its content
This problem is also pointed out here and it has a reproducer:
https://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg36703.html
This commit drops free_some_buffers() and related variables, and also
modifies extent_io_tree_cleanup() to catch non-freed buffers properly.
Signed-off-by: Naohiro Aota <naota@elisp.net>
Signed-off-by: David Sterba <dsterba@suse.cz>
We have --init-csum-tree, which just empties the csum tree. I'm not sure why we
would ever need this, but we definitely need to be able to rebuild the csum tree
in some cases. This patch adds the ability to completely rebuild the csum tree
by reading all of the data and adding csum entries for it. This patch doesn't
pay attention to NODATASUM inodes; it'll happily add csums for everything.
Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
With the changes as in the previous patch, now scan_for_btrfs()
is an unused function. So delete it.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
The libblkid scan method, which was introduced later, also
scans the devices listed in /proc/partitions. So we don't have to
scan them explicitly.
Remove the scan method BTRFS_SCAN_PROC.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
super-recover collects btrfs device information using the existing
function scan_one_devices().
The problem is that fs_devices is freed twice, in close_ctree() and
free_recover_superblock(), on the super correction path.
Fix this problem by checking whether the fs_devices memory
has been freed before we free it.
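One common way to guard against this kind of double free is to clear the
pointer after freeing and check it before the next free. The sketch below
only illustrates that pattern; the demo_* types are hypothetical and the
actual patch may track the freed state differently.

```c
#include <stdlib.h>

/* Hypothetical stand-ins; the real structures live in btrfs-progs. */
struct demo_fs_devices {
	int num_devices;
};

struct demo_recover {
	struct demo_fs_devices *fs_devices;
};

/* Free the device list at most once and remember that it is gone. */
void demo_release_devices(struct demo_recover *rc)
{
	if (!rc->fs_devices)
		return;		/* already released on another path */
	free(rc->fs_devices);
	rc->fs_devices = NULL;
}
```

Calling demo_release_devices() from two cleanup paths is then harmless,
because the second call sees NULL and returns early.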
Cc: Eric Sandeen <sandeen@redhat.com>
Cc: Chris Murphy <lists@colorremedies.com>
Acked-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Wang Shilong <wangshilong1991@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
(I am unable to reproduce the issue, tried to go back with progs versions
but still the same. So as of now this code remains untested, suggest to
wait till we have a reproducible test case).
Here is a test case which says it all:
mkfs.xfs -f $DEV
mkfs.btrfs -f $DEV
mount $DEV $MNT
mount: /dev/vdiskc: more filesystems detected. This should not happen,
use -t <type> to explicitly specify the filesystem type or
use wipefs(8) to clean up the device.
mount: you must specify the filesystem type
With this patch btrfs_prepare_device() also wipes the old FS, if any.
btrfs_prepare_device() is called after we have verified that
the user has provided the -f option.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
The sector size for the space cache was hardcoded to 4k and used to
calculate bitmap sizes. In the kernel, BITS_PER_BITMAP is derived from
PAGE_CACHE_SIZE, which is not available to userspace; userspace also has
to deal with filesystems of varying sectorsize.
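For illustration, the arithmetic looks roughly like this when the bitmap size
is derived from the sectorsize instead of a hardcoded 4k. The demo_* helpers
are hypothetical, and the assumption that one bitmap occupies exactly one
sector is mine, not something the patch text states.

```c
#include <stdint.h>

#define DEMO_BITS_PER_BYTE 8

/* Number of bits in one free-space bitmap, assuming it fills one sector. */
uint64_t demo_bits_per_bitmap(uint32_t sectorsize)
{
	return (uint64_t)sectorsize * DEMO_BITS_PER_BYTE;
}

/* Bytes of disk space one bitmap can describe, at one bit per `unit` bytes. */
uint64_t demo_bytes_per_bitmap(uint32_t sectorsize, uint32_t unit)
{
	return demo_bits_per_bitmap(sectorsize) * unit;
}
```

With a 4096-byte sector this gives 32768 bits per bitmap, and with a 4096-byte
unit one bitmap covers 128MiB; a 16k sectorsize changes both numbers, which is
why the hardcoded value was a problem.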
Signed-off-by: David Sterba <dsterba@suse.cz>
When we have non-inlined extent references, we were failing to find the
corresponding extent item for an existing csum item in the csum tree.
Reproducer:
mkfs.btrfs -f /dev/sdd
mount /dev/sdd /mnt
xfs_io -f -c "falloc 780366 135302" /mnt/foo
xfs_io -c "falloc 327680 151552" /mnt/foo
xfs_io -c "pwrite -S 0xff -b 131072 0 131072" /mnt/foo
sync
for i in `seq 1 40`; do btrfs subvolume snapshot /mnt /mnt/snap$i ; done
umount /mnt
btrfs check /dev/sdd
The check command exited with status 1 and the following output:
Checking filesystem on /dev/sdd
UUID: 2416ab5f-9d71-457e-bb13-a27d4f6b399a
checking extents
checking free space cache
checking fs roots
checking csums
There are no extents for csum range 12980224-12984320
Csum exists for 12980224-12984320 but there is no extent record
found 1388544 bytes used err is 1
total csum bytes: 132
total tree bytes: 704512
total fs tree bytes: 573440
total extent tree bytes: 16384
btree space waste bytes: 564479
file data blocks allocated: 19341312
referenced 14606336
Btrfs v3.14.1-94-g80597e7
After this change it no longer erroneously reports a missing extent for the
csum item and exits with a status of 0.
Also added missing btrfs_prev_leaf() return value checks, as we were ignoring
errors and non-existence of left siblings completely.
Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
After a system crash or balance ENOSPC errors,
there may still be some reloc roots left.
The way we store reloc root is different from fs root:
reloc root's root key(BTRFS_RELOC_TREE_OBJECTID, ROOT_ITEM, objectid)
fs root's root key(objectid, ROOT_ITEM, -1)
reloc data's root key(BTRFS_DATA_RELOC_TREE_OBJECTID, ROOT_ITEM, 0)
So this patch uses the right key to search for the corresponding root node, and
avoids using the normal fs root cache for reloc roots.
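The three key layouts listed above can be sketched as a small helper that
builds the search key for each kind of root. Everything named demo_* or DEMO_*
is hypothetical, and the numeric constants are placeholders I believe match
the on-disk format but have not verified against the headers.

```c
#include <stdint.h>

/* Placeholder values; check the real headers before relying on them. */
#define DEMO_ROOT_ITEM_KEY		132
#define DEMO_RELOC_TREE_OBJECTID	((uint64_t)-8)
#define DEMO_DATA_RELOC_TREE_OBJECTID	((uint64_t)-9)

struct demo_key {
	uint64_t objectid;
	uint8_t type;
	uint64_t offset;
};

enum demo_root_kind {
	DEMO_FS_ROOT,
	DEMO_RELOC_ROOT,
	DEMO_DATA_RELOC_ROOT,
};

/* Build the root tree search key for the given kind of root. */
struct demo_key demo_root_search_key(enum demo_root_kind kind,
				     uint64_t objectid)
{
	struct demo_key key = { 0, DEMO_ROOT_ITEM_KEY, 0 };

	switch (kind) {
	case DEMO_RELOC_ROOT:
		/* (RELOC_TREE_OBJECTID, ROOT_ITEM, objectid) */
		key.objectid = DEMO_RELOC_TREE_OBJECTID;
		key.offset = objectid;
		break;
	case DEMO_DATA_RELOC_ROOT:
		/* (DATA_RELOC_TREE_OBJECTID, ROOT_ITEM, 0) */
		key.objectid = DEMO_DATA_RELOC_TREE_OBJECTID;
		key.offset = 0;
		break;
	case DEMO_FS_ROOT:
	default:
		/* (objectid, ROOT_ITEM, -1) */
		key.objectid = objectid;
		key.offset = (uint64_t)-1;
		break;
	}
	return key;
}
```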
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
If btrfsck fails to repair, we hit something like the following:
Check tree block failed, want=29442048, have=0
Check tree block failed, want=29442048, have=0
Check tree block failed, want=29442048, have=0
Check tree block failed, want=29442048, have=0
Check tree block failed, want=29442048, have=0
read block failed check_tree_block
found 98304 bytes used err is 1
total csum bytes: 0
total tree bytes: 0
total fs tree bytes: 0
total extent tree bytes: 0
btree space waste bytes: 0
file data blocks allocated: 0
referenced 0
Btrfs v3.14.2-rc2-63-g3944f15
btrfs: transaction.h:38: btrfs_start_transaction: Assertion `!(root->commit_root)' failed.
Aborted (core dumped)
This is because in repair mode we start a transaction, and if we error out,
we don't finish this transaction. So close_ctree() will try
to start and commit a transaction, which triggers the assertion failure above.
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Currently btrfsck hits an assertion failure when some tree searches fail.
It is true that a filesystem may have some corrupted metadata blocks
that btrfsck cannot deal with. But users really
don't want a BUG_ON() here. Instead, just return errors to the caller.
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Repair mode will commit a transaction, after which we can
no longer load the log tree.
Give a warning to users; if they really want to
continue, we will clear out the log tree.
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
This not only gives some speedup but also avoids looping forever
on a really broken filesystem.
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
After the previous 2 patches, nothing uses
whole-dev-tree scanning, so remove the code which
implemented that functionality.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
If we didn't find what we are looking for in /proc/partitions,
we're not going to find it by scanning every node under /dev, either.
But that's just what btrfs_scan_for_fsid() does.
Remove that fallback; at that point btrfs_scan_for_fsid() just calls
scan_for_btrfs(), so remove the wrapper & call it directly.
Side note: so, these paths always use /proc/partitions, not libblkid.
Userspace-initiated scans default to libblkid. I presume this is
part of the design, and intentional? Anyway, not changing it now!
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
We can scan for btrfs devices in a few ways. By default
libblkid is used for "device scan" and "filesystem show";
with the -m option only mounted filesystems are scanned,
and with -d we physically read every system device.
But there's no reason for the complexity of a descent through
/dev; /proc/partitions has every device known to the kernel, so
just use that when -d is specified.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Enhance the 'subvolume' subcommand to wait until a given list of
subvolumes or all currently scheduled for deletion are cleaned
completely from the filesystem.
Signed-off-by: David Sterba <dsterba@suse.cz>
recover_prepare() in chunk-recover.c allocates a buffer of only
sizeof(struct btrfs_super_block) bytes. This causes a glibc malloc error
after the superblock csum is calculated.
Use BTRFS_SUPER_INFO_SIZE to fix the bug.
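A rough sketch of why the allocation size matters, assuming the checksum is
computed over the whole on-disk superblock area rather than just the struct.
The demo_* names, the sizes and the toy checksum are all illustrative
stand-ins, not the btrfs-progs code.

```c
#include <stdlib.h>

#define DEMO_SUPER_INFO_SIZE	4096	/* assumed on-disk superblock size */
#define DEMO_CSUM_SIZE		32	/* assumed size of the leading csum field */

/* Hypothetical in-memory struct, smaller than the on-disk area. */
struct demo_super_block {
	unsigned char csum[DEMO_CSUM_SIZE];
	unsigned char fsid[16];
	/* ... further fields omitted ... */
};

/* Toy checksum over everything after the csum field, up to the full size. */
unsigned int demo_super_csum(const unsigned char *buf)
{
	unsigned int sum = 0;
	size_t i;

	for (i = DEMO_CSUM_SIZE; i < DEMO_SUPER_INFO_SIZE; i++)
		sum += buf[i];
	return sum;
}

/* Allocate the full on-disk size, not sizeof(struct demo_super_block). */
struct demo_super_block *demo_alloc_super(void)
{
	return calloc(1, DEMO_SUPER_INFO_SIZE);
}
```

Had demo_alloc_super() used sizeof(struct demo_super_block), the loop in
demo_super_csum() would read past the end of the allocation.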
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
There's an off-by-one error in btrfs_check_leaf: we should be going to
nritems - 1, not nritems - 2. We were missing problems with items in the very
last slot.
Thanks,
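To make the bound concrete, here is a small sketch of a pairwise ordering
check over an array of keys. It is not the btrfs_check_leaf() code;
demo_check_order() is a hypothetical helper that uses plain integers as keys.

```c
#include <stdint.h>

/*
 * Return the index of the first slot whose key is not strictly smaller
 * than the next one, or -1 if all keys are in order.
 */
int demo_check_order(const uint64_t *keys, int nritems)
{
	int i;

	/* Pair (i, i + 1) exists for every i in [0, nritems - 1). */
	for (i = 0; i < nritems - 1; i++) {
		if (keys[i] >= keys[i + 1])
			return i;
	}
	return -1;
}
```

With keys { 1, 2, 3, 3 } the bad pair is (2, 3); a loop that stops at
nritems - 2 never compares it and reports the leaf as fine.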
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
[BUG]
Some fsfuzzed btrfs images cause btrfsck to segfault.
[REPRODUCER]
Run btrfsck on an image with a corrupted csum tree block.
[REASON]
The check_csums() function calls btrfs_search_slot() on csum_tree but doesn't
check whether the csum_tree contains a valid extent_buffer, which causes
the segfault.
[FIX]
Check the csum_root->node before any search.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
A user reported a WARN_ON() when trying to run btrfsck --repair on his fs with
bad key ordering. This was because the root that was broken wasn't part of the
transaction yet. We do this open coded thing in a few other places in fsck, so
just make it a helper function and make sure all the places that need to call it
do call it. With this patch he was able to run repair without it dying.
Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
We use the read extent buffer infrastructure to read the super block when we are
creating a btrfs-image. This works out fine most of the time except when the fs
has been balanced, then it fails to map the super block. So we could fix
btrfs-image to read in the super in a special way, but that's more code. So
instead just check in the eb reading code if we are reading the super and then
don't bother mapping the block, just read the actual offset. This fixed some
poor guy who was trying to btrfs-image his fs that had been balanced. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Asserting is no fun, we may be able to recover from this error in certain cases
(like btrfs-image and btrfsck). Just do what the kernel does and spit out an
error and return that there is only 1 copy. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Currently these macros just tie to assert(), which gives us line number and such
but no backtrace so no actual context. This patch adds support for spitting out
a backtrace so we can see how we got to the given assert. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
[backtrace_symbols_fd]
Signed-off-by: Naohiro Aota <naota@elisp.net>
[minor fixups]
Signed-off-by: David Sterba <dsterba@suse.cz>
Add human readable incompat flags output for btrfs-show-super,
so users no longer need to decode the hex flags by hand.
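The decoding can be sketched as a table lookup over the flag bits. This is
only an illustration: demo_describe_incompat() is hypothetical, the output
format is made up, and the bit assignments are what I believe the kernel
headers define, so treat them as assumptions.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct demo_flag_name {
	uint64_t bit;
	const char *name;
};

/* Assumed bit assignments; verify against the real headers. */
static const struct demo_flag_name demo_incompat_names[] = {
	{ 1ULL << 0, "MIXED_BACKREF" },
	{ 1ULL << 1, "DEFAULT_SUBVOL" },
	{ 1ULL << 2, "MIXED_GROUPS" },
	{ 1ULL << 3, "COMPRESS_LZO" },
	{ 1ULL << 5, "BIG_METADATA" },
	{ 1ULL << 6, "EXTENDED_IREF" },
	{ 1ULL << 7, "RAID56" },
	{ 1ULL << 8, "SKINNY_METADATA" },
};

/* Write "A | B | ..." into buf and return buf; stops early if buf is full. */
char *demo_describe_incompat(uint64_t flags, char *buf, size_t len)
{
	size_t n = sizeof(demo_incompat_names) / sizeof(demo_incompat_names[0]);
	size_t used = 0;
	size_t i;
	int first = 1;
	int w;

	if (len == 0)
		return buf;
	buf[0] = '\0';
	for (i = 0; i < n; i++) {
		if (!(flags & demo_incompat_names[i].bit))
			continue;
		w = snprintf(buf + used, len - used, "%s%s",
			     first ? "" : " | ", demo_incompat_names[i].name);
		if (w < 0 || (size_t)w >= len - used)
			return buf;	/* output truncated */
		used += (size_t)w;
		first = 0;
		flags &= ~demo_incompat_names[i].bit;
	}
	/* Anything left over is a bit this table does not know about. */
	if (flags)
		snprintf(buf + used, len - used, "%sunknown(0x%llx)",
			 first ? "" : " | ", (unsigned long long)flags);
	return buf;
}
```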
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
The new option -f forces dangerous changes,
e.g. clearing the seeding flag.
Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
[more text from 3db4c0a3d3 changelog]
Signed-off-by: David Sterba <dsterba@suse.cz>
Bug-Debian: http://bugs.debian.org/539433
Bug-Debian: http://bugs.debian.org/583768
Authors:
Luca Bruno <lucab@debian.org>
Alexander Kurtz <kurtz.alex@googlemail.com>
Daniel Baumann <daniel.baumann@progress-technologies.net>
Signed-off-by: Dimitri John Ledkov <xnox@debian.org>
Signed-off-by: David Sterba <dsterba@suse.cz>
There are many trivial typos in Documentation/*.txt.
All of these use "exist status" to mean "exit status"
by mistake. I guess someone first made this mistake
and it has spread by copy-and-paste :-D
Signed-off-by: Naohiro Aota <naota@elisp.net>
Signed-off-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
man 8 btrfs-property refers to `setattr(8)` which does not actually exist.
It should refer to `chattr(1)` instead.
Signed-off-by: Naohiro Aota <naota@elisp.net>
Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Before this patch, you could see the following after executing restore:
# :too few arguments
The tool name "btrfs restore" is missing.
The @set_argv0() function is introduced by:
commit a184abc70f
btrfs-progs: move the check_argc_* functions into utils.c
...
Also add a new function "set_argv0" to set the correct tool name:
*btrfs-image*: too few arguments
But @set_argv0() only applies to the independent tools with
the name pattern btrfs-***.
Since restore is now a subcommand under "btrfs",
there is no need to use @set_argv0() before check_argc_* to
fix up the tool name printed before "too few arguments".
Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
The original find_mount_root() uses the first matching mount point and
returns it.
That was OK until the following commit, which also checks the fstype:
de22c28ef3 btrfs-progs: Check fstype in find_mount_root()
With the fstype check, we should check the last match, not only the first
one.
Otherwise the following mounts will not pass find_mount_root():
/dev/sdc on /mnt/test type ext4 (rw,relatime,data=ordered)
/dev/sdb on /mnt/test type btrfs (rw,relatime,space_cache)
This patch will use the last match to do the fstype check.
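The idea can be sketched over an in-memory mount table: the longest matching
mount point wins, and among equally long matches the later entry wins, since
it is mounted on top of the earlier one. The demo_* names are hypothetical
and the real find_mount_root() reads the system mount table instead.

```c
#include <string.h>

struct demo_mount {
	const char *dev;
	const char *dir;
	const char *fstype;
};

/* Does `path` live at or below the mount point `dir`? */
int demo_dir_contains(const char *dir, const char *path)
{
	size_t len = strlen(dir);

	if (strncmp(path, dir, len) != 0)
		return 0;
	if (len > 0 && dir[len - 1] == '/')
		return 1;	/* e.g. the root mount "/" */
	return path[len] == '\0' || path[len] == '/';
}

/* Index of the mount visible at `path`, or -1 if nothing matches. */
int demo_find_mount(const struct demo_mount *mnts, int n, const char *path)
{
	size_t best_len = 0;
	int best = -1;
	int i;

	for (i = 0; i < n; i++) {
		size_t len = strlen(mnts[i].dir);

		if (!demo_dir_contains(mnts[i].dir, path))
			continue;
		/* ">=" so that a later entry replaces an earlier equal one. */
		if (best < 0 || len >= best_len) {
			best = i;
			best_len = len;
		}
	}
	return best;
}

/* The fstype check is done on the last match only. */
int demo_is_btrfs_mount(const struct demo_mount *mnts, int n, const char *path)
{
	int idx = demo_find_mount(mnts, n, path);

	return idx >= 0 && strcmp(mnts[idx].fstype, "btrfs") == 0;
}
```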
Reported-by: Remco Hosman <remco@yerf-it.nl>
Signed-off-by: Remco Hosman <remco@yerf-it.nl>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Reviewed-by: Omar Sandoval <osandov@osandov.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
When entering the next level node, the @next_leaf in restore forgets to
start at the first slot. Just reset it to the first one.
Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Steps to reproduce:
# mkfs.btrfs -f <dev>
# mount -o compress-force=lzo <dev> <mnt>
# for ((i=0;i<4000;i++)); do
echo -n 'A' >> <mnt>/inline_data
done
# umount <mnt>
# valgrind --tool=memcheck --leak-check=full \
btrfs restore <dev> <dest_dir>
output:
==32118== Invalid read of size 1
==32118== at 0x4A0A4E4: memcpy@@GLIBC_2.14
==32118== by 0x43DC91: read_extent_buffer
==32118== by 0x421401: search_dir (cmds-restore.c:240)
==32118== by 0x422CBB: cmd_restore (cmds-restore.c:1317)
==32118== by 0x404709: main (btrfs.c:248)
==32118== Address 0x4c4f4ac is not stack'd, malloc'd or...
This happens because, when dealing with an inline extent, read_extent_buffer()
reads a length of @ram_bytes, which is the length of the uncompressed
data. What we actually want here is the length of the inline item.
So in the compressed case, use the length of the inline item.
Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
David sent a quick patch that removed a BUG_ON(). I took a peek and
found that the function was already leaking an eb ref and only returned
0. So this fixes the leak and makes the function void and fixes up the
callers.
Accidentally-motivated-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Zach Brown <zab@zabbo.net>
Signed-off-by: David Sterba <dsterba@suse.cz>
When corrupting the extent tree, corrupt-block iterates over each child
node/leaf of a node.
However, when a node's child is a leaf, btrfs_corrupt_extent_leaf() may
delete some items in the leaf, which may cause the number of children of the
parent node to decrease.
Before this patch, corrupt-block read nritems only *ONCE*
and iterated 'nritems' times.
When btrfs_corrupt_extent_leaf() deletes enough items to decrease the
nritems in btrfs_header, the last few iterations access
non-existent nodes, which causes a use-after-delete bug like
the following:
deleting extent record: key 40714240 168 16384
Couldn't map the block 3459802452797161472
btrfs-corrupt-block: volumes.c:1137: btrfs_num_copies: Assertion
`!(!ce)' failed.
Aborted
This patch updates nritems in each iteration to avoid the bug.
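The pattern can be shown with a toy node whose child count shrinks during the
walk. demo_node, demo_visit() and demo_walk() are hypothetical; the only point
is that the loop condition re-reads nritems instead of using a cached copy.

```c
/* Hypothetical node whose child count can shrink while we iterate. */
struct demo_node {
	int nritems;
	int children[8];
};

/*
 * Stand-in for the corruption step: visiting an even-numbered child
 * drops the last child of the node, if there is one beyond this slot.
 */
void demo_visit(struct demo_node *node, int slot)
{
	if (node->children[slot] % 2 == 0 && node->nritems > slot + 1)
		node->nritems--;
}

/* Walk all children that still exist; returns how many were visited. */
int demo_walk(struct demo_node *node)
{
	int visited = 0;
	int i;

	/* Re-read node->nritems on every iteration instead of caching it. */
	for (i = 0; i < node->nritems; i++) {
		demo_visit(node, i);
		visited++;
	}
	return visited;
}
```

Caching nritems before the loop would make the walk touch slots that no longer
hold valid children once demo_visit() has shrunk the node.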
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
A memory problem reported by valgrind as follows:
=== Syscall param pwrite64(buf) points to uninitialised byte(s)
When running:
# valgrind --leak-check=yes btrfs restore /dev/sda9 /mnt/backup
The output buffer is allocated with malloc, but the output data is
shorter than the buffer, so valgrind reports
uninitialised byte(s).
We can use calloc to replace malloc and make this WARNING go away.
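A minimal sketch of the calloc variant, assuming the whole buffer is later
written out even though only part of it holds data. demo_fill_outbuf() is a
hypothetical helper, not the restore code.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Return a zero-filled buffer of buf_len bytes with data copied to its
 * start, or NULL on error. The caller frees the result.
 */
unsigned char *demo_fill_outbuf(const void *data, size_t data_len,
				size_t buf_len)
{
	unsigned char *buf;

	if (data_len > buf_len)
		return NULL;
	/* calloc zeroes the tail that the data does not cover. */
	buf = calloc(1, buf_len);
	if (!buf)
		return NULL;
	memcpy(buf, data, data_len);
	return buf;
}
```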
Reported-by: Marc Dietrich <marvin24@gmx.de>
Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
When using send/receive, it is useful to be able to match up source
subvols on the send side (as, say, for -p or -c clone sources) with their
corresponding copies on the receive side. This patch adds a -R option to
btrfs sub list to show the received subvolume UUID on the receive side,
allowing the user to perform that matching correctly.
Signed-off-by: Hugo Mills <hugo@carfax.org.uk>
Signed-off-by: David Sterba <dsterba@suse.cz>
Fix (at least one user-visible) typos: it's its, not it's.
Signed-off-by: Holger Hoffstätte <holger.hoffstaette@googlemail.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
This commit improves the static-only building of btrfs-progs, and adds
support for installing the static only tools:
- It now ensures that all programs are built statically, not only a
  small subset of them, by defining 'progs_static' from the existing
  'progs' variable.
- It changes the order of libraries in the btrfs-%.static rule so
  that -lpthread (part of STATIC_LIBS) appears *after* the '$($(subst
  -,_,$(subst .static,,$@)-libs))' logic, which brings in
  -lcom_err. This is needed because libcom_err.a uses the semaphore
  functions, which are available in the pthread library.
- Adds the necessary rules to generate the btrfsck.static link and
  btrfstune.static binary.
- Adds an 'install-static' target to install the static
  binaries. Note that they are renamed to not carry a '.static'
  suffix.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
This commit adds support for a make variable named
"DISABLE_DOCUMENTATION", which allows disabling the build of the
documentation. This is useful in contexts where the tools needed to
build the documentation are not necessarily available.
Signed-off-by: Gustavo Zacarias <gustavo@zacarias.com.ar>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
The restore tool should only print info of the restoring process
in verbose mode with -v option specified.
Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
commit 46de1a6ec3 changed the
parameters of btrfs_read_and_process_send_stream(). This breaks
snapper compilation. We can include version defines usable for the C
preprocessor.
Version 0.1.0: API up to and including 46de1a6ec3 (3.14.x)
Version 0.1.1: 909131939f (changed in 3.16)
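A consumer such as snapper could then select the right call signature at
compile time. The sketch below assumes a major/minor/patchlevel encoding and
uses DEMO_* names throughout; the real macro names and values come from the
installed btrfs-progs headers and may differ.

```c
/* Assumed shape of the version defines, here for library version 0.1.1. */
#define DEMO_LIB_MAJOR		0
#define DEMO_LIB_MINOR		1
#define DEMO_LIB_PATCHLEVEL	1
#define DEMO_LIB_VERSION	(DEMO_LIB_MAJOR * 10000 + \
				 DEMO_LIB_MINOR * 100 + \
				 DEMO_LIB_PATCHLEVEL)

/* Pick the API variant with the preprocessor. */
#if DEMO_LIB_VERSION >= 101
#define DEMO_HAVE_NEW_SEND_STREAM_API 1
#else
#define DEMO_HAVE_NEW_SEND_STREAM_API 0
#endif

/* Report which variant this translation unit was built against. */
int demo_uses_new_api(void)
{
	return DEMO_HAVE_NEW_SEND_STREAM_API;
}
```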
Signed-off-by: Arvin Schnell <aschnell@suse.de>
Signed-off-by: David Sterba <dsterba@suse.cz>
Kernels >= 3.15 export the global block reserve as a space info, which
'btrfs fi df' presents but labels as 'unknown' instead of a meaningful
string.
Signed-off-by: David Sterba <dsterba@suse.cz>