If checksum root is corrupted, fsck will get segmentation. This
is because if we fail to load checksum root, root's node is NULL which
cause NULL pointer deferences later.
To fix this problem, we just did something like extent tree rebuilding.
Allocate a new one and clear uptodate flag. We will do sanity check
before fsck going on.
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
If btrfs tree root is corrupted, fsck will hit the following segmentation.
enabling repair mode
Check tree block failed, want=29376512, have=0
Check tree block failed, want=29376512, have=0
Check tree block failed, want=29376512, have=0
Check tree block failed, want=29376512, have=0
Check tree block failed, want=29376512, have=0
read block failed check_tree_block
Couldn't read tree root
Checking filesystem on /dev/sda9
UUID: 0e1a754d-04a5-4256-ae79-0f769751803e
Critical roots corrupted, unable to fsck the FS
Segmentation fault (core dumped)
In btrfs_setup_all_roots(), we could tolerate some trees(extent tree, csum tree)
corrupted, and we have did careful check inside that function, it will
return NULL if critial roots corrupt(for example tree root).
The problem is that we check @OPEN_CTREE_PARTIAL flag again after
calling btrfs_setup_all_roots() which will successfully return
@fs_info though critial roots corrupted.
Fix this problem by removing @OPEN_CTREE_PARTIAL flag check outsize
btrfs_setup_all_roots().
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
The libblkid scan method which was introduced later, will also
scan devices under /proc/partitions. So we don't have to do
the explicit scan of the same.
Remove the scan method BTRFS_SCAN_PROC.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
If we didn't find what we are looking for in /proc/partitions,
we're not going to find it by scanning every node under /dev, either.
But that's just what btrfs_scan_for_fsid() does.
Remove that fallback; at that point btrfs_scan_for_fsid() just calls
scan_for_btrfs(), so remove the wrapper & call it directly.
Side note: so, these paths always use /proc/partitions, not libblkid.
Userspace-intiated scans default to libblkid. I presume this is
part of the design, and intentional? Anyway, not changing it now!
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
We use the read extent buffer infrastructure to read the super block when we are
creating a btrfs-image. This works out fine most of the time except when the fs
has been balanced, then it fails to map the super block. So we could fix
btrfs-image to read in the super in a special way, but thats more code. So
instead just check in the eb reading code if we are reading the super and then
don't bother mapping the block, just read the actual offset. This fixed some
poor guy who was trying to btrfs-image his fs that had been balanced. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
David sent a quick patch that removed a BUG_ON(). I took a peek and
found that the function was already leaking an eb ref and only returned
0. So this fixes the leak and makes the function void and fixes up the
callers.
Accidentally-motivated-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Zach Brown <zab@zabbo.net>
Signed-off-by: David Sterba <dsterba@suse.cz>
Btrfs-progs superblock checksum check is somewhat too restricted for
super-recover, since current btrfs-progs will only read the 1st
superblock and if you need super-recover the 1st superblock is
possibly already damaged.
The fix is introducing super_recover parameter for
btrfs_read_dev_super() and callers to allow scan backup superblocks if
needed.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
When encountering a corrupted fs root node, fsck hit following message:
Check tree block failed, want=29360128, have=0
Check tree block failed, want=29360128, have=0
Check tree block failed, want=29360128, have=0
Check tree block failed, want=29360128, have=0
Check tree block failed, want=29360128, have=0
read block failed check_tree_block
Checking filesystem on /dev/sda9
UUID: 0d295d80-bae2-45f2-a106-120dbfd0e173
checking extents
Segmentation fault (core dumped)
This is because in btrfs_setup_all_roots(), we check
btrfs_read_fs_root() return value by verifing whether it is
NULL pointer, this is wrong since btrfs_read_fs_root() return
PTR_ERR(ret), fix it.
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
This patch adds functionality (in qgroup-verify.c) to compute bytecounts in
subvolume quota groups. The original groups are read in and stored in memory
so that after we compute our own bytecounts, we can compare them with those
on disk. A print function is provided to do this comparison and show the
results on the console.
A 'qgroup check' pass is added to btrfsck. If any subvolume quota groups
differ from what we compute, the differences for them are printed. We also
provide an option '--qgroup-report' which will run only the quota check code
and print a report on all quota groups. Other than making it possible to
verify that our qgroup changes work correctly, this mode can also be used in
xfstests for automated checking after qgroup tests.
This patch does not address the following:
- compressed counts are identical to non compressed, because kernel doesn't
make the distinction yet. Adding the code to verify compressed counts
shouldn't be hard at all though once kernel can do this.
- It is only concerned with subvolume quota groups (like most of
btrfs-progs).
Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Signed-off-by: David Sterba <dsterba@suse.cz>
Fix double free of memory if btrfs_open_devices fails:
*** Error in `btrfs': double free or corruption (fasttop): 0x000000000066e020 ***
Crash happened because when open failed on device inside
btrfs_open_devices it freed all memory by calling btrfs_close_devices but
inside disk-io.c we call btrfs_close_again it again.
Signed-off-by: Rakesh Pandit <rakesh@tuxera.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
When a disk containing btrfs is overwritten with other FS, ext4
for example it doesn't overwrite 2nd and 3rd copy of the btrfs SB.
And btrfs_read_dev_super() would look for backup SB when primary
SB isn't found. This causes the problem as in the reproducer below.
In kernel we avoid this by _not_ reading backup SB implicitly,
this patch would port the same to btrfs-progs.
reproducer:
mkfs.btrfs /dev/sde
mkfs.ext4 /dev/sde
mount /dev/sde /ext4
btrfs-convert /dev/sde (is successful (bug))
with this patch
::
btrfs-convert /dev/sde
/dev/sde is mounted
Signed-off-by: Anand Jain <Anand.Jain@oracle.com>
Signed-off-by: Chris Mason <clm@fb.com>
If we are cycling through all of the mirrors trying to find the best one we need
to make sure we set best_mirror to an actual mirror number and not 0. Otherwise
we could end up reading a mirror that wasn't the best and make everybody sad.
Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
Currently, as of 8cae1840af when running
btrfs-convert I get a bus error.
The problem is that struct btrfs_key has __attribute__ ((__packed__))
so it is not aligned. Then, a pointer to it's objectid field is taken,
cast to a void*, then eventually cast back to a u64* and
dereferenced. The problem is that the dereferenced u64* is not
necessarily aligned (ie, not necessarily a valid u64*), resulting in
undefined behavior.
This patch adds a local u64 variable which would of course be properly
aligned and then uses a pointer to that.
I did not modify the call from btrfs_fs_roots_compare_roots as that
uses struct btrfs_root which is a regular struct and would thus have
it's members correctly aligned to begin with.
After patching this I realized Liu Bo had already written a similar
patch, but I think mine is cleaner, so I'm sending it anyway.
Signed-off-by: Ivan Jager <aij+@mrph.org>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
this patch will make btrfsck operations to open disk in exclusive mode,
so that mount will fail when btrfsck is running
Signed-off-by: Anand Jain <Anand.Jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
The following steps could trigger btrfs segfault:
mkfs -t btrfs -m raid5 -d raid5 /dev/loop{0..3}
losetup -d /dev/loop2
btrfs check /dev/loop0
The reason is that read_tree_block() returns NULL and
add_root_to_pending() dereferences it without checking it first.
Also replace a BUG_ON with proper error checking.
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
Internally, btrfs_header_chunk_tree_uuid() calculates an unsigned
long, but casts it to a pointer, while all callers cast it to unsigned
long again.
From btrfs commit b308bc2f05a86e728bd035e21a4974acd05f4d1e
Signed-off-by: Ross Kirk <ross.kirk@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
Unfortunately you can't run --init-extent-tree if you can't actually read the
extent root. Fix this by allowing partial starts with no extent root and then
have fsck only check to see if the extent root is uptodate _after_ the check to
see if we are init'ing the extent tree. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
So I needed to add a flag to not try to read block groups when doing
--init-extent-tree since we could hang there, but that meant adding a whole
other 0/1 type flag to open_ctree_fs_info. So instead I've converted it all
over to using a flags setting and added the flag that I needed. This has been
tested with xfstests and make test. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
In some cases the tree root is so hosed we can't get anything useful out of it.
So add the -b option to btrfsck to make us look for the most recent backup tree
root to use for repair. Then we can hopefully get ourselves into a working
state. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
mkfs -r wasn't creating chunks properly, making it very difficult to
allocate space for anything except tiny filesystems.
This changes it around to use more of the generic infrastructure, and
to do actual logical->physical block number translation.
It also allocates space to the files in smaller extents (max 1MB), which
keeps the allocator from trying to allocate an extent bigger than a
single chunk.
It doesn't quite support multi-device mkfs -r yet, but is much closer.
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
A user was reporting an issue with bad transid errors on his blocks. The thing
is that btrfs-progs will ignore transid failures for things like restore and
fsck so we can do a best effort to fix a users file system. So fsck can put
together a coherent view of the file system with stale blocks. So if everything
else is ok in the mind of fsck then we can recow these blocks to fix the
generation and the user can get their file system back. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Internally, btrfs_header_fsid() calculates an unsigned long, but casts
it to a pointer, while all callers cast it to unsigned long again.
Committed to btrfs as fba6aa75654394fccf2530041e9451414c28084f
Fix line length issues and match changes to kernelspace
Signed-off-by: Ross Kirk <ross.kirk@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Remove unused parameter, 'eb'. Unused since introduction in
7777e63b42
Signed-off-by: Ross Kirk <ross.kirk@gmail.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Until now if one of device's first superblock is corrupt,btrfs will
fail to mount. Luckily, btrfs have at least two superblocks for
every disk.
In theory, if silent corrupting happens when we are writting superblocks
into disk, we must hold at least one good superblock.
One side effect is that user must gurantee that the disk must be
a btrfs disk. Otherwise, this tool may destroy other fs.(This is also
reason why btrfs only use first superblock in every disk to mount)
This little program will try to correct bad superblocks from
good superblocks with max generation.
There will be five kinds of return values:
0: all supers are valid, no need to recover
1: usage or syntax error
2: recover all bad superblocks successfully
3: fail to recover bad superblocks
4: abort to recover bad superblocks
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
If some fatal superblocks are damaged, running ioctl will return failure,
in this case, we should avoid run ioctl.
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
As a result of a successful call to btrfs_read_sys_array(), the 'ret'
variable is already set to 0. Hence the function would return 0 even
if the call to read_tree_block() fails.
Signed-off-by: chandan <chandan@linux.vnet.ibm.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
In files copied from the kernel, mark many functions as static,
and remove any resulting dead code.
Some functions are left unmarked if they aren't static in the
kernel tree.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Port of commit b3b4aa7 to userspace.
parameter tree root it's not used since commit
5f39d397dfbe140a14edecd4e73c34ce23c4f9ee ("Btrfs: Create extent_buffer
interface for large blocksizes")
This gets userspace a tad closer to kernelspace by removing
this unused parameter that was all over the codebase...
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
For most time, In open_ctree_*(), we use the first superblock
(BTRFS_SUPER_INFO_OFFSET). However, for btrfs-convert, we don't,
we should pass the correct sb_bytenr to btrfs_scan_fs_devices() rather
than always use BTRFS_SUPER_INFO_OFFSET.This patch fix the following
regression:
mkfs.ext2 <dev>
btrfs-convert <dev>
warning, device 1 is missing
Check tree block failed, want=2670592, have=0
read block failed check_tree_block
Couldn't read chunk root
Segmentation fault (core dumped)
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
btrfs_scan_for_fsid uses only one argument run_ioctl out of 3
so remove the rest two of them
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Some codes still use the cpu_to_lexx instead of the
BTRFS_SETGET_STACK_FUNCS declared in ctree.h.
Also added some BTRFS_SETGET_STACK_FUNCS for btrfs_header and
btrfs_super.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
This adds a 'btrfs-image -m' option, which let us restore an image that
is built from a btrfs of multiple disks onto several disks altogether.
This aims to address the following case,
$ mkfs.btrfs -m raid0 sda sdb
$ btrfs-image sda image.file
$ btrfs-image -r image.file sdc
---------
so we can only restore metadata onto sdc, and another thing is we can
only mount sdc with degraded mode as we don't provide informations of
another disk. And, it's built as RAID0 and we have only one disk,
so after mount sdc we'll get into readonly mode.
This is just annoying for people(like me) who're trying to restore image
but turn to find they cannot make it work.
So this'll make your life easier, just tap
$ btrfs-image -m image.file sdc sdd
---------
then you get everything about metadata done, the same offset with that of
the originals(of course, you need offer enough disk size, at least the disk
size of the original disks).
Besides, this also works with raid5 and raid6 metadata image.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Otherwise we will access illegal addresses while searching on fs_uuid list.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Add chunk-recover program to check or rebuild chunk tree when the system
chunk array or chunk tree is broken.
Due to the importance of the system chunk array and chunk tree, if one of
them is broken, the whole btrfs will be broken even other data are OK.
But we have some hint(fsid, checksum...) to salvage the old metadata.
So this function will first scan the whole file system and collect the
needed data(chunk/block group/dev extent), and check for the references
between them. If the references are OK, the chunk tree can be rebuilt and
luckily the file system will be mountable.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Because the fs/file roots are not extents, so it is better to use rb-tree
to manage them. Fix it.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
In fact, the code of many rb-tree insert/search/delete functions is similar,
so we can abstract them, and implement common functions for rb-tree, and then
simplify them.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Some commands(such as btrfs-convert) access the devices again after we close
the ctree, so it is better that we don't free the devices objects when the ctree
is closed, or we need re-allocate the memory for the devices. We needn't worry
the memory leak problem, because all the memory will be freed after the taskes
die.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
As we know, the file descriptor 0 is a special number, so we shouldn't
use it to initialize the file descriptor of the devices, or we might
close this special file descriptor by mistake when we close the devices.
"-1" is a better choice.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
The tree log bug I introduced could create inconsistent file extent entries in
the file system tree and in some worst cases even create multiple extent entries
for the same entry. To fix this we need to do a few things
1) Keep track of extent items that overlap and then pick the one that covers the
largest area and delete the rest of the items.
2) Keep track of file extent items that land in extent items but don't match
disk_bytenr/disk_num_bytes exactly. Once we find these we need to figure out
who is the right ref and then fix all of the other refs to agree.
Each of these cases require a complete rescan of all of the extents, so
unfortunately if you hit this particular problem the fsck is going to take quite
a while since it will likely rescan all the trees 2 or 3 times. With this patch
the broken file system a user sent me is fixed and a broken file system that was
created by my reproducer is also fixed. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Only the first byte of the wanted csum is printed:
checksum verify failed on 65536 found DA97CF61 wanted 6B
checksum verify failed on 65536 found DA97CF61 wanted 6BC3870D
Also add leading zeros to the format.
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
All we need for restore to work is the chunk root, the tree root and the fs root
we want to restore from. So to do this we need to make a few adjustments
1) Make open_ctree_fs_info fail completely if it can't read the chunk tree.
There is no sense in continuing if we can't read the chunk tree since we won't
be able to translate logical to physical blocks.
2) Use open_ctree_fs_info in restore, and if we didn't load a tree root or
fs root go ahead and try to set those up manually ourselves.
This is related to work I did last year on restore, but it uses the
open_ctree_fs_info instead of my open coded open_ctree. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
I've been working on btrfs-image and I kept seeing these leaks pop up on
valgrind so I'm just fixing them. We don't properly cleanup the device cache,
the chunk tree mapping cache, or the space infos on close. With this patch
valgrind doesn't complain about any memory leaks running btrfs-image. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
We just free the log root after we set it up when we open a ctree in the tools.
This isn't nice, it makes double free's and leaks eb's, makes segfaults with
btrfs-image. So fix this to be correct, and fix the cleanup if the buffer is
not uptodate. With this fix I no longer segfault trying to do btrfs-image on a
file system with a log tree. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Allocate fs_info::super_copy dynamically of full BTRFS_SUPER_INFO_SIZE
and use it directly for saving superblock to disk.
This fixes incorrect superblock checksum after mkfs.
Signed-off-by: David Sterba <dsterba@suse.cz>
It seems highly unlikely that posix_fadvise could fail,
and even if it does, it was only advisory. Still, if
it does, we could issue a notice to the user.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Free the memory allocated to "multi" before the error
exit in read_whole_eb(). Set it to NULL after we free
it in the loop to avoid any potential double-free.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Instead of doing a BUG_ON() if we fail to find the last fs root just return
an error so the callers can deal with it how they like. Also we need to
actually return an error if we can't find the latest root so that the error
handling works. With this btrfsck was able to deal with a file system that
was missing a root item but still had extents that referred back to the
root. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
__setup_root() was present in find-root.c as well
as disk-io.c. No need for the cut and paste, just
use the one in disk-io.c
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Zach Brown <zab@redhat.com>
struct btrfs_super is about 3.5k but a few writing paths were writing it
out as the full 4k BTRFS_SUPER_INFO_SIZE, leaking a few hundred bytes
after the super_block onto disk. In practice this meant the memory
after super_copy in struct btrfs_fs_info and whatever came after it in
the heap.
Signed-off-by: Zach Brown <zab@redhat.com>
Errors cow-ing the root block are silently being dropped. This is
just a step towards error handling because both the caller and calee
assert on errors.
Signed-off-by: Zach Brown <zab@redhat.com>
The super block magic is a le64 whose value looks like an unterminated
string in memory. The lack of null termination leads to clumsy use of
string functions and causes static analysis tools to warn that the
string will be unterminated.
So let's just treat it as the le64 that it is. Endian wrappers are used
on the constant so that they're compiled into run-time constants.
Signed-off-by: Zach Brown <zab@redhat.com>
David Woodhouse originally contributed this code, and Chris Mason
changed it around to reflect the current design goals for raid56.
The original code expected all metadata and data writes to be full
stripes. This meant metadata block size == stripe size, and had a few
other restrictions.
This version allows metadata blocks smaller than the stripe size. It
implements both raid5 and raid6, although it does not have code to
rebuild from parity if one of the drives is missing or incorrect.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
This will effectively delete all of your crcs, but at least you'll
be able to mount the FS with nodatasum.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
fsck needs to be able to open a damaged FS, which means open_ctree needs
to be able to return a damaged FS.
This adds a new open_ctree_fs_info which can be used to open any and all
roots that are valid. btrfs-debug-tree is changed to use it.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
When building on ppc64 I hit a number of warnings in printf:
btrfs-map-logical.c:69: error: format ‘%Lu’ expects type ‘long long
unsigned int’, but argument 4 has type ‘u64’
Fix them.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Hugo Mills <hugo@carfax.org.uk>
gcc-4.6 (as shipped in Fedora) turns on -Wunused-but-set-variable by
default, which breaks the build when combined with -Wall, e.g.:
debug-tree.c: In function ‘print_extent_leaf’:
debug-tree.c:45:13: error: variable ‘last_len’ set but not used [-Werror=unused-but-set-variable]
debug-tree.c:44:13: error: variable ‘last’ set but not used [-Werror=unused-but-set-variable]
debug-tree.c:41:21: error: variable ‘item’ set but not used [-Werror=unused-but-set-variable]
cc1: all warnings being treated as errors
This patch fixes the errors by removing the unused variables.
Signed-off-by: Chris Ball <cjb@laptop.org>
Signed-off-by: Hugo Mills <hugo@carfax.org.uk>
Btrfs stores multiple copies of the superblock, and for common power-failure
crashes where barriers were not in use, one of the super copies is often
valid while the first copy is not.
This adds a btrfs-select-super -s N /dev/xxx command, which can
overwrite all the super blocks with a copy that you have already
determined is valid with btrfsck -s
Signed-off-by: Chris Mason <chris.mason@oracle.com>
When a device is missing, the btrfs tools need to be able to read alternate
copies from the remaining devices. This creates placeholder devices
that always return -EIO so the tools can limp along.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
After the roots are closed, root is freed. Yet close_ctree continues
to use it. It works generally because no new memory is allocated in
the interim, but with glibc malloc perturbing enabled, it crashes
every time. This is because root->fs_info points to garbage.
This patch uses the already-cached fs_info variable for the rest of
the accesses and fixes the crash.
This issue was reported at:
https://bugzilla.novell.com/show_bug.cgi?id=603620
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
This commit introduces a new kind of back reference for btrfs metadata.
Once a filesystem has been mounted with this commit, IT WILL NO LONGER
BE MOUNTABLE BY OLDER KERNELS.
The new back ref provides information about pointer's key, level and in which
tree the pointer lives. This information allow us to find the pointer by
searching the tree. The shortcoming of the new back ref is that it only works
for pointers in tree blocks referenced by their owner trees.
This is mostly a problem for snapshots, where resolving one of these fuzzy back
references would be O(number_of_snapshots) and quite slow. The solution used
here is to use the fuzzy back references in the common case where a given tree
block is only referenced by one root, and use the full back references when
multiple roots have a reference
This patch updates the ext3 to btrfs converter for the new
disk format. This mainly involves changing the convert's
data relocation and free space management code. This patch
also ports some functions from kernel module to btrfs-progs.
Thank you,
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
This patch updates btrfs-progs for superblock duplication.
Note: I didn't make this patch as complete as the one for
kernel since updating the converter requires changing the
code again. Thank you,
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
This is the btrfs-progs version of the patch to add the ability to have
different csum algorithims. Note I didn't change the image maker since it
seemed a bit more complicated than just changing some stuff around so I will let
Yan take care of that.
Everything else was converted and for now a mkfs just
sets the type to be BTRFS_CSUM_TYPE_CRC32.
Signed-off-by: Josef Bacik <jbacik@redhat.com>
This patch adds btrfs image tool. The image tool is
a debugging tool that creates/restores btrfs metadump
image.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
This patch does the following:
1) Update device management code to match the kernel code.
2) Allocator fixes.
3) Add a program called btrfstune to set/clear the SEEDING
super block flags.
The root node generation number code made commit_tree_root look like the
kernel code. It forces a cow of the tree of tree roots even when
the FS hasn't changed.
This causes errors during fsck and other readonly operations. This adds
a check to see if commit_tree_root is going to trigger writes to the
tree of tree roots, and bails if none are pending.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
This patch adds transaction IDs to root tree pointers.
Transaction IDs in tree pointers are compared with the
generation numbers in block headers when reading root
blocks of trees. This can detect some types of IO errors.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
The main changes in this patch are adding chunk handing and data relocation
ability. In the last step of conversion, the converter relocates data in system
chunk and move chunk tree into system chunk. In the rollback process, the
converter remove chunk tree from system chunk and copy data back.
Regards
YZ
---
Block headers now store the chunk tree uuid
Chunk items records the device uuid for each stripes
Device extent items record better back refs to the chunk tree
Block groups record better back refs to the chunk tree
The chunk tree format has also changed. The objectid of BTRFS_CHUNK_ITEM_KEY
used to be the logical offset of the chunk. Now it is a chunk tree id,
with the logical offset being stored in the offset field of the key.
This allows a single chunk tree to record multiple logical address spaces,
upping the number of bytes indexed by a chunk tree from 2^64 to
2^128.
The mkfs code bootstraps the filesystem on a single device. Once
the raid block groups are setup, it needs to recow all of the blocks so
that each tree is properly allocated.
We get lots of warnings of the flavor:
utils.c:441: warning: format '%Lu' expects type 'long long unsigned int' but argument 2 has type 'u64'
And thanks to -Werror, the build fails. Clean up these printfs
by properly casting the arg to the format specified.
Signed-off-by: Alex Chiang <achiang@hp.com>
This patch adds rollback support for the converter, the converter can
roll back a conversion if the image file haven't been modified. In
addition, I rearrange some codes in convert.c and add a few comments.