In some cases the extent tree can just be so gone there is no point in trying to
figure out how to put it back together. So add a --init-extent-tree mode which
will zero out the extent tree and then re-add extents for all of the blocks we
find. This will also undo any balance that was going on at the time of the
crash, this is needed because the reloc tree seems to confuse fsck at the
moment. With this patch I can put back together a users file system that was
completely gone. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
The tree log bug I introduced could create inconsistent file extent entries in
the file system tree and in some worst cases even create multiple extent entries
for the same entry. To fix this we need to do a few things
1) Keep track of extent items that overlap and then pick the one that covers the
largest area and delete the rest of the items.
2) Keep track of file extent items that land in extent items but don't match
disk_bytenr/disk_num_bytes exactly. Once we find these we need to figure out
who is the right ref and then fix all of the other refs to agree.
Each of these cases require a complete rescan of all of the extents, so
unfortunately if you hit this particular problem the fsck is going to take quite
a while since it will likely rescan all the trees 2 or 3 times. With this patch
the broken file system a user sent me is fixed and a broken file system that was
created by my reproducer is also fixed. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
An unfortunate side effect to my fsync bug means that anybody who didn't hit the
BUG_ON() during tree log replay would have ended up with a corrupted file
system. Currently our fsck does not catch this because it just looks for
bytenrs for backrefs, it doesn't look at the num_bytes at all. So this patch
makes us keep track of how big the backrefs are, since their disk_num_bytes
_have_ to match the number of bytes for the actual extent item. With this patch
fsck now finds problems with a file system it previously thought was ok.
Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
btrfsck was blowing up when checking the free space cache when we ran xfstests
with -l 64k. That is because I was init'ing the free space ctl to whatever the
leafsize was, which isn't right for data block groups. With this patch btrfsck
no longer complains. This also fixes a tiny little typo in free-space-cache.c I
noticed while figuring this problem out. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
All we need for restore to work is the chunk root, the tree root and the fs root
we want to restore from. So to do this we need to make a few adjustments
1) Make open_ctree_fs_info fail completely if it can't read the chunk tree.
There is no sense in continuing if we can't read the chunk tree since we won't
be able to translate logical to physical blocks.
2) Use open_ctree_fs_info in restore, and if we didn't load a tree root or
fs root go ahead and try to set those up manually ourselves.
This is related to work I did last year on restore, but it uses the
open_ctree_fs_info instead of my open coded open_ctree. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
I was running fsync() tests and noticed that occasionally I was getting a bunch
of errors from fsck complaining about csums not having corresponding extents.
Thankfully after a few days of debugging this it turned out to be a bug with
fsck. The csums were for an extent that started at the same offset as a block
group, and were offset within the extent. So the search put us out at the block
group item and we just walked forward from there, never finding the actual
extent. This is because the block group item key is higher than the extent item
key, so it comes first. In order to fix this we need to check and see if we
landed on a block group item and take another step backwards to make sure we end
up at the extent item. With this patch my reproducer no longer finds csums that
don't have matching extent records. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Looking at a recent user problem I noticed there are weird cases we could
possibly be leaving csums in place for an extent we've free'd. I don't think
this can happen unless the extent tree is also corrupt, but just in case I'm
adding sanity checks to btrfsck. This way we will catch this if it happens
normally since xfstests runs btrfsck between each run. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
I made open_ctree fail if the chunk tree couldn't be open, which means that fsck
now segfaults if it can't open the chunk tree. So fix fsck to check the fs_info
we get back from open_ctree_fsinfo to make sure it's valid and exit if it's not
instead of segfaulting. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
In trying to track down a weird tree log problem I wanted to make sure that the
free space cache was actually valid, which we currently have no way of doing.
So this patch adds a bunch of support for the free space cache code and then a
checker to fsck. Basically we go through and if we can actually load the free
space cache then we will walk the extent tree and verify that the free space
cache exactly matches what is in the extent tree. Hopefully this will always be
correct, the only time it wouldn't is if the extent tree is corrupt or we have
some sort of awful bug in the free space cache. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
This fixes up the progs to properly deal with skinny metadata. This adds the -x
option to mkfs and btrfstune for enabling the skinny metadata option. This also
makes changes to fsck so it can properly deal with the skinny metadata entries.
Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Allocate fs_info::super_copy dynamically of full BTRFS_SUPER_INFO_SIZE
and use it directly for saving superblock to disk.
This fixes incorrect superblock checksum after mkfs.
Signed-off-by: David Sterba <dsterba@suse.cz>
Segmentation fault occurred in the following command.
# btrfs check /dev/sdc7
No valid Btrfs found on /dev/sdc7
Segmentation fault (core dumped)
Fix it.
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
This patch includes the functionality of btrfs, it's
found as "btrfs check".
Signed-off-by: Ian Kumlien <pomac@demius.net>
Signed-off-by: David Sterba <dsterba@suse.cz>