recover_prepare() in chunk-recover.c alloc memory which only contains
sizeof(struct btrfs_super_block). This will cause glibc malloc error
after superblock csum is calculated.
Use BTRFS_SUPER_INFO_SIZE to fix the bug.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Btrfs-progs superblock checksum check is somewhat too restricted for
super-recover, since current btrfs-progs will only read the 1st
superblock and if you need super-recover the 1st superblock is
possibly already damaged.
The fix is introducing super_recover parameter for
btrfs_read_dev_super() and callers to allow scan backup superblocks if
needed.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
The chunk-recover.c/BTRFS_NUM_MIRRORS in the userspace means
the same thing as ctree.h/BTRFS_MAX_MIRRORS in the kernelspace,
so to stay consistent with the kernelspace, just make this movement
in the userspace:
chunk-recover.c/BTRFS_NUM_MIRRORS
===>
ctree.h/BTRFS_MAX_MIRRORS
This provides convenience for future use.
Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
When run chunk-recover on a health btrfs(data profile raid0, with
plenty of data), the program has a chance to abort on the number
of mirrors of an extent.
According to the kernel code, the max mirror number of an extent
is 3 not 2:
ctree.h: BTRFS_MAX_MIRRORS 3
chunk-recover.c : BTRFS_NUM_MIRRORS 2
just change BTRFS_NUM_MIRRORS to 3, and everything goes well.
Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
When deal with the p & q stripes for data profile raid6, chunk-recover
forgets to insert them into the chunk record. Just insert them back
freely.
Also wrap the insert procedure into a new function, fill_chunk_up.
Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
The 'num_unordered' will be recounted after 'goto out',
just remove it.
Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
If the csum of one stripe is not able to judge the order of two
device extents, the stripe may happen to belong to the device extent
that is already kicked out as ordered.
Take this condition into consideration, don't report failure and
give more tries with the stripes following.
Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
When count the number of unordered device extents in chunk-recover,
the counter should be reinitialized to be used.
Also, introduce a new function for the counting job.
Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Free memory if open call fails. Prevent pthread_cancel on threads
which have already finished successfully. If all calls to
pthread_create and pthread_join are successful, we mistakenly call
pthread_cancel because cancel_from and cancel_to are both zero.
Make POSIX.1-2001 happy by supplying a non-NULL second argument to
pthread_setcanceltype.
Signed-off-by: Rakesh Pandit <rakesh@tuxera.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Fix corporate name for copyright.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
a clean up patch, the BTRFS_STRIPE_LEN is been duplicated across
btrfs-progs, the kernel defines it in volume.h so do the same
for progs.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
Originally, multi devices are scanned one by one;
Now, one thread is used per device to scan.
Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
Decide the raid0/5/6 data stripes' order using checksums.
For one chunk, fetch each 64k logical stripe
1. search its checksum in the csum tree
2. read the physical stripe data on each device
3. calc the data checksums
4. if one checksum matches the value from the csum tree,
then the logical stripe resides in that device,
the stripe order index can be calculated.
5. if more than one checksums match,
then use the successive csum in the tree to compare again.
6. if equal stripes are encountered, just fetch next stripe.
7. if some devices' order are still not decided, then they
can not be recovered.
Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
If no chunks need to be recovered, skip the recover works,
meanwhile the user won't be annoyed by the "ask_user".
Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
When reading block groups we will searching it's corresponding chunk, however, at this
time, some chunks has not been built(data chunks raid0/raid10/raid56), don't bug_on here,
we will try to rebuild these chunks later.
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
When allocating chunk root node, we should use nodesize rather than sectorsize,
this will casue regression when making other nodesize choice.(for example 16k size now)
Reported-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
Internally, btrfs_header_chunk_tree_uuid() calculates an unsigned
long, but casts it to a pointer, while all callers cast it to unsigned
long again.
From btrfs commit b308bc2f05a86e728bd035e21a4974acd05f4d1e
Signed-off-by: Ross Kirk <ross.kirk@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
Internally, btrfs_header_fsid() calculates an unsigned long, but casts
it to a pointer, while all callers cast it to unsigned long again.
Committed to btrfs as fba6aa75654394fccf2530041e9451414c28084f
Fix line length issues and match changes to kernelspace
Signed-off-by: Ross Kirk <ross.kirk@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Remove unused parameter, 'eb'. Unused since introduction in
7777e63b42
Signed-off-by: Ross Kirk <ross.kirk@gmail.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
If some fatal superblocks are damaged, running ioctl will return failure,
in this case, we should avoid run ioctl.
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
The command has been moved and we should rename the files accordingly,
so the entry point is now in cmds-rescue.c and the core functionality
in it's own file.
Return codes of btrfs_recover_chunk_tree have been simplified not to
require a define and another file for defintion.
CC: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>