Port of commit b3b4aa7 to userspace.
parameter tree root it's not used since commit
5f39d397dfbe140a14edecd4e73c34ce23c4f9ee ("Btrfs: Create extent_buffer
interface for large blocksizes")
This gets userspace a tad closer to kernelspace by removing
this unused parameter that was all over the codebase...
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
For most time, In open_ctree_*(), we use the first superblock
(BTRFS_SUPER_INFO_OFFSET). However, for btrfs-convert, we don't,
we should pass the correct sb_bytenr to btrfs_scan_fs_devices() rather
than always use BTRFS_SUPER_INFO_OFFSET.This patch fix the following
regression:
mkfs.ext2 <dev>
btrfs-convert <dev>
warning, device 1 is missing
Check tree block failed, want=2670592, have=0
read block failed check_tree_block
Couldn't read chunk root
Segmentation fault (core dumped)
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
btrfs_scan_for_fsid uses only one argument run_ioctl out of 3
so remove the rest two of them
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Some codes still use the cpu_to_lexx instead of the
BTRFS_SETGET_STACK_FUNCS declared in ctree.h.
Also added some BTRFS_SETGET_STACK_FUNCS for btrfs_header and
btrfs_super.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
This adds a 'btrfs-image -m' option, which let us restore an image that
is built from a btrfs of multiple disks onto several disks altogether.
This aims to address the following case,
$ mkfs.btrfs -m raid0 sda sdb
$ btrfs-image sda image.file
$ btrfs-image -r image.file sdc
---------
so we can only restore metadata onto sdc, and another thing is we can
only mount sdc with degraded mode as we don't provide informations of
another disk. And, it's built as RAID0 and we have only one disk,
so after mount sdc we'll get into readonly mode.
This is just annoying for people(like me) who're trying to restore image
but turn to find they cannot make it work.
So this'll make your life easier, just tap
$ btrfs-image -m image.file sdc sdd
---------
then you get everything about metadata done, the same offset with that of
the originals(of course, you need offer enough disk size, at least the disk
size of the original disks).
Besides, this also works with raid5 and raid6 metadata image.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Otherwise we will access illegal addresses while searching on fs_uuid list.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Add chunk-recover program to check or rebuild chunk tree when the system
chunk array or chunk tree is broken.
Due to the importance of the system chunk array and chunk tree, if one of
them is broken, the whole btrfs will be broken even other data are OK.
But we have some hint(fsid, checksum...) to salvage the old metadata.
So this function will first scan the whole file system and collect the
needed data(chunk/block group/dev extent), and check for the references
between them. If the references are OK, the chunk tree can be rebuilt and
luckily the file system will be mountable.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Because the fs/file roots are not extents, so it is better to use rb-tree
to manage them. Fix it.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
In fact, the code of many rb-tree insert/search/delete functions is similar,
so we can abstract them, and implement common functions for rb-tree, and then
simplify them.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Some commands(such as btrfs-convert) access the devices again after we close
the ctree, so it is better that we don't free the devices objects when the ctree
is closed, or we need re-allocate the memory for the devices. We needn't worry
the memory leak problem, because all the memory will be freed after the taskes
die.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
As we know, the file descriptor 0 is a special number, so we shouldn't
use it to initialize the file descriptor of the devices, or we might
close this special file descriptor by mistake when we close the devices.
"-1" is a better choice.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
The tree log bug I introduced could create inconsistent file extent entries in
the file system tree and in some worst cases even create multiple extent entries
for the same entry. To fix this we need to do a few things
1) Keep track of extent items that overlap and then pick the one that covers the
largest area and delete the rest of the items.
2) Keep track of file extent items that land in extent items but don't match
disk_bytenr/disk_num_bytes exactly. Once we find these we need to figure out
who is the right ref and then fix all of the other refs to agree.
Each of these cases require a complete rescan of all of the extents, so
unfortunately if you hit this particular problem the fsck is going to take quite
a while since it will likely rescan all the trees 2 or 3 times. With this patch
the broken file system a user sent me is fixed and a broken file system that was
created by my reproducer is also fixed. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Only the first byte of the wanted csum is printed:
checksum verify failed on 65536 found DA97CF61 wanted 6B
checksum verify failed on 65536 found DA97CF61 wanted 6BC3870D
Also add leading zeros to the format.
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
All we need for restore to work is the chunk root, the tree root and the fs root
we want to restore from. So to do this we need to make a few adjustments
1) Make open_ctree_fs_info fail completely if it can't read the chunk tree.
There is no sense in continuing if we can't read the chunk tree since we won't
be able to translate logical to physical blocks.
2) Use open_ctree_fs_info in restore, and if we didn't load a tree root or
fs root go ahead and try to set those up manually ourselves.
This is related to work I did last year on restore, but it uses the
open_ctree_fs_info instead of my open coded open_ctree. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
I've been working on btrfs-image and I kept seeing these leaks pop up on
valgrind so I'm just fixing them. We don't properly cleanup the device cache,
the chunk tree mapping cache, or the space infos on close. With this patch
valgrind doesn't complain about any memory leaks running btrfs-image. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
We just free the log root after we set it up when we open a ctree in the tools.
This isn't nice, it makes double free's and leaks eb's, makes segfaults with
btrfs-image. So fix this to be correct, and fix the cleanup if the buffer is
not uptodate. With this fix I no longer segfault trying to do btrfs-image on a
file system with a log tree. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Allocate fs_info::super_copy dynamically of full BTRFS_SUPER_INFO_SIZE
and use it directly for saving superblock to disk.
This fixes incorrect superblock checksum after mkfs.
Signed-off-by: David Sterba <dsterba@suse.cz>
It seems highly unlikely that posix_fadvise could fail,
and even if it does, it was only advisory. Still, if
it does, we could issue a notice to the user.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Free the memory allocated to "multi" before the error
exit in read_whole_eb(). Set it to NULL after we free
it in the loop to avoid any potential double-free.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Instead of doing a BUG_ON() if we fail to find the last fs root just return
an error so the callers can deal with it how they like. Also we need to
actually return an error if we can't find the latest root so that the error
handling works. With this btrfsck was able to deal with a file system that
was missing a root item but still had extents that referred back to the
root. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
__setup_root() was present in find-root.c as well
as disk-io.c. No need for the cut and paste, just
use the one in disk-io.c
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Zach Brown <zab@redhat.com>
struct btrfs_super is about 3.5k but a few writing paths were writing it
out as the full 4k BTRFS_SUPER_INFO_SIZE, leaking a few hundred bytes
after the super_block onto disk. In practice this meant the memory
after super_copy in struct btrfs_fs_info and whatever came after it in
the heap.
Signed-off-by: Zach Brown <zab@redhat.com>
Errors cow-ing the root block are silently being dropped. This is
just a step towards error handling because both the caller and calee
assert on errors.
Signed-off-by: Zach Brown <zab@redhat.com>
The super block magic is a le64 whose value looks like an unterminated
string in memory. The lack of null termination leads to clumsy use of
string functions and causes static analysis tools to warn that the
string will be unterminated.
So let's just treat it as the le64 that it is. Endian wrappers are used
on the constant so that they're compiled into run-time constants.
Signed-off-by: Zach Brown <zab@redhat.com>
David Woodhouse originally contributed this code, and Chris Mason
changed it around to reflect the current design goals for raid56.
The original code expected all metadata and data writes to be full
stripes. This meant metadata block size == stripe size, and had a few
other restrictions.
This version allows metadata blocks smaller than the stripe size. It
implements both raid5 and raid6, although it does not have code to
rebuild from parity if one of the drives is missing or incorrect.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
This will effectively delete all of your crcs, but at least you'll
be able to mount the FS with nodatasum.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
fsck needs to be able to open a damaged FS, which means open_ctree needs
to be able to return a damaged FS.
This adds a new open_ctree_fs_info which can be used to open any and all
roots that are valid. btrfs-debug-tree is changed to use it.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
When building on ppc64 I hit a number of warnings in printf:
btrfs-map-logical.c:69: error: format ‘%Lu’ expects type ‘long long
unsigned int’, but argument 4 has type ‘u64’
Fix them.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Hugo Mills <hugo@carfax.org.uk>
gcc-4.6 (as shipped in Fedora) turns on -Wunused-but-set-variable by
default, which breaks the build when combined with -Wall, e.g.:
debug-tree.c: In function ‘print_extent_leaf’:
debug-tree.c:45:13: error: variable ‘last_len’ set but not used [-Werror=unused-but-set-variable]
debug-tree.c:44:13: error: variable ‘last’ set but not used [-Werror=unused-but-set-variable]
debug-tree.c:41:21: error: variable ‘item’ set but not used [-Werror=unused-but-set-variable]
cc1: all warnings being treated as errors
This patch fixes the errors by removing the unused variables.
Signed-off-by: Chris Ball <cjb@laptop.org>
Signed-off-by: Hugo Mills <hugo@carfax.org.uk>
Btrfs stores multiple copies of the superblock, and for common power-failure
crashes where barriers were not in use, one of the super copies is often
valid while the first copy is not.
This adds a btrfs-select-super -s N /dev/xxx command, which can
overwrite all the super blocks with a copy that you have already
determined is valid with btrfsck -s
Signed-off-by: Chris Mason <chris.mason@oracle.com>
When a device is missing, the btrfs tools need to be able to read alternate
copies from the remaining devices. This creates placeholder devices
that always return -EIO so the tools can limp along.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
After the roots are closed, root is freed. Yet close_ctree continues
to use it. It works generally because no new memory is allocated in
the interim, but with glibc malloc perturbing enabled, it crashes
every time. This is because root->fs_info points to garbage.
This patch uses the already-cached fs_info variable for the rest of
the accesses and fixes the crash.
This issue was reported at:
https://bugzilla.novell.com/show_bug.cgi?id=603620
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
This commit introduces a new kind of back reference for btrfs metadata.
Once a filesystem has been mounted with this commit, IT WILL NO LONGER
BE MOUNTABLE BY OLDER KERNELS.
The new back ref provides information about pointer's key, level and in which
tree the pointer lives. This information allow us to find the pointer by
searching the tree. The shortcoming of the new back ref is that it only works
for pointers in tree blocks referenced by their owner trees.
This is mostly a problem for snapshots, where resolving one of these fuzzy back
references would be O(number_of_snapshots) and quite slow. The solution used
here is to use the fuzzy back references in the common case where a given tree
block is only referenced by one root, and use the full back references when
multiple roots have a reference
This patch updates the ext3 to btrfs converter for the new
disk format. This mainly involves changing the convert's
data relocation and free space management code. This patch
also ports some functions from kernel module to btrfs-progs.
Thank you,
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>