Commit Graph

44 Commits

Author SHA1 Message Date
Qu Wenruo ffb1b847bd btrfs-progs: Fix wrong tree block alignment for unalianged block group
Commit 854437ca(btrfs-progs: extent-tree: avoid allocating tree block
that crosses stripe boundary) introduces check for logical bytenr not
crossing stripe boundary.

However that check is not completely correct.
It only checks if the logical bytenr and length agaist absolute logical
offset.
That's to say, it only check if a tree block lies in 64K logical stripe.

But in fact, it's possible a block group starts at bytenr unaligned with
64K, just like the following case.

Then btrfsck will give false alert.

0       32K       64K       96K        128K         160K ...
        |--------------- Block group A ---------------------
	|<-----TB 32K------>|
        |/Scrub stripe unit/|
|    WRONG UNIT   |

In that case, TB(tree block) at bytenr 32K in fact fits into the kernel
scrub stripe unit.
But doesn't fit into the pure logical 64K stripe.

Fix check_crossing_stripes() to compare bytenr to block group start, not
to absolute logical bytenr.

Reported-by: Jussi Kansanen <jussi.kansanen@gmail.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2016-10-25 14:31:06 +02:00
Qu Wenruo 2f242115d1 btrfs-progs: Do extra chunk check before processing chunk item
Current we only do chunk validation check at mount time.

It's good for most case, but for fuzzed or manually crafted images, we
can insert a CHUNK_ITEM key into root tree.

Since mount time check will only check chunk tree, it will not check
CHUNK_ITEM in root tree.

Even with previous key type check against leaf owner, it is still
possible to modify the leaf owner to by-pass it.

So we still need to check chunk validation before processing it.

Reported-by: Lukas Lueg <lukas.lueg@gmail.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2016-09-05 10:04:16 +02:00
David Sterba 059832da5f btrfs-progs: make superblock reading/scanning api more generic
We'll add more modes that affect scanning.

Signed-off-by: David Sterba <dsterba@suse.com>
2016-08-24 14:36:58 +02:00
Qu Wenruo f07c814971 btrfs-progs: Introduce function to create convert data chunks
Introduce new function, make_convert_data_chunks(), to build up data
chunks for convert.

It will call a modified version of btrfs_alloc_data_chunk() to force
data chunks to covert all known ext* data.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2016-06-07 18:15:19 +02:00
David Sterba 40db5cd7ff btrfs-progs: extend balance args to take min/max usage filter
Add the overlapping usage and [usage_min, usage_max] members to the
balance args. The min/max values are interpreted iff the corresponding
flag BTRFS_BALANCE_ARGS_USAGE_RANGE is set.

The minimum boundary is inclusive, maximum is exclusive:
* usage_min <= chunk_usage < usage_max

Signed-off-by: David Sterba <dsterba@suse.com>
2016-01-12 15:01:04 +01:00
Gabríel Arthúr Pétursson 0826a8ddb9 btrfs-progs: balance: add stripes filter
Add new balance filter 'stripes=<range>' to process only chunks that are
spread accross given number of chunks.

The range minimum and maximum are inclusive.

Signed-off-by: Gabríel Arthúr Pétursson <gabriel@system.is>
[ reworked a bit to use the range helpers, dropped the single value
  for stripes ]
Signed-off-by: David Sterba <dsterba@suse.com>
2016-01-12 15:01:04 +01:00
David Sterba 45e9bf8098 btrfs-progs: extend balance args to take min/max limit filter
Add the overlapping limit and [limit_min, limit_max] members to the
balance args. The min/max values are interpreted iff the corresponding
flag BTRFS_BALANCE_ARGS_LIMIT_RANGE is set.

The minimum and maximum are inclusive.

Note that the values are only 32bit, but this should be enough for the
foreseeable future.

Signed-off-by: David Sterba <dsterba@suse.com>
2016-01-12 15:01:03 +01:00
Qu Wenruo 2143084229 btrfs-progs: find-root: Add support to search chunk root
Add support to search chunk root, as we only need to search tree roots
in system chunk, which should be very easy to add, just iterate in
system chunks.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
[ renamed to btrfs_next_bg_* ]
Signed-off-by: David Sterba <dsterba@suse.com>
2015-11-16 14:23:45 +01:00
David Sterba 73a894e51e btrfs-progs: fix cross stripe boundary check
Commit 854437ca3c ("btrfs-progs:
extent-tree: avoid allocating tree block that crosses stripe boundary")
does not work for 64k nodesize. Due to an off-by-one error, all queries
to check_crossing_stripes will return that all extents cross a stripe
and this will lead to a false ENOSPC. This crashes later

$ ./mkfs.btrfs -n 64k image

./mkfs.btrfs(btrfs_reserve_extent+0xb77)[0x417f38]
./mkfs.btrfs(btrfs_alloc_free_block+0x57)[0x417fe0]
./mkfs.btrfs(__btrfs_cow_block+0x163)[0x408eb7]
./mkfs.btrfs(btrfs_cow_block+0xd0)[0x4097c4]
./mkfs.btrfs(btrfs_search_slot+0x16f)[0x40be4d]
./mkfs.btrfs(btrfs_insert_empty_items+0xc0)[0x40d5f9]
./mkfs.btrfs(btrfs_insert_item+0x99)[0x40da5f]
./mkfs.btrfs(btrfs_make_block_group+0x4d)[0x41705c]
./mkfs.btrfs(main+0xeef)[0x434b56]

Holger Hoffstätte reports that this also fixes false positives in case
the nodesize is less than 64k. This happens when the node blocks end at
the stripe boundary.

Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2015-09-11 16:46:14 +02:00
Zhao Lei b0f760c91a btrfs-progs: Introduce btrfs_close_all_devices helper
If there is more than one fs_devices in fs_uuids list (like mkfs.btrfs
does), we need close them all before exit. Add a helper for that.

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2015-08-31 19:25:13 +02:00
Qu Wenruo 595c57d2f4 btrfs-progs: fsck: Check if a metadata tree block crossing stripe boundary
Kernel btrfs_map_block() function has a limitation that it can only
map BTRFS_STRIPE_LEN size.
That will cause scrub fails to scrub tree block which crosses strip
boundary, causing BUG_ON().

Normally, it's OK as metadata is always in metadata chunk and
BTRFS_STRIPE_LEN can always be divided by node/leaf size.
So without mixed block group, tree block won't cross stripe boundary.

But for mixed block group, especially for btrfs converted from ext4,
it's almost sure one or more tree blocks are not aligned with node size
and cross stripe boundary.
Causing bug with kernel scrub.

This patch will report the problem, although we don't have a good idea
how to fix it in user space until we add the ability to relocate tree
block in user space.

Also, kernel code should also be checked for such tree block alloc
problems.

Reported-by: Chris Murphy <lists@colorremedies.com>
Reported-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2015-08-31 19:25:10 +02:00
David Sterba 250a58f34d btrfs-progs: add missing includes to header files
Add includes that let the header files compile or add explicit include
of kerncompat if the uXX types are used.

Signed-off-by: David Sterba <dsterba@suse.cz>
2015-06-10 02:52:21 +02:00
David Sterba 07ce7005fc btrfs-progs: unify header file inclusion protections
There are missing ifdefs or defines with very generic names.

Signed-off-by: David Sterba <dsterba@suse.cz>
2015-01-21 17:49:26 +01:00
Gui Hecheng 2513077f2f btrfs-progs: fix device missing of btrfs fi show with seed devices
*Note*
this handles the problem under umounted state, the similar problem
under mounted state is already fixed by Anand.

Steps to reproduce:
        # mkfs.btrfs -f /dev/sda1
        # btrfstune -S 1 /dev/sda1
        # mount /dev/sda1 /mnt
        # btrfs dev add /dev/sda2 /mnt
        # umount /mnt                   <== (umounted)
        # btrfs fi show /dev/sda2
result:
        Label: none  uuid: XXXXXXXXXXXXXXXXXX
        Total devices 2 FS bytes used 368.00KiB
        devid    2 size 9.31GiB used 1.25GiB path /dev/sda2
        *** Some devices missing
        Btrfs v3.16-67-g69f54ea-dirty

It is because @btrfs_scan_lblkid() won't establish mappinig
between the seed and sprout devices. So seeding devices are missing.
We could use @open_ctree_* to detect all seed/sprout mappings
for each fs scanned after @btrfs_scan_lblkid().

sth worthes mention:
o If there are multi-level of seeds, all devices in them will be shown
  in the ascending order of @devid
o If device replace is execed on a sprout fs with a device in a seed fs,
  the replaced device still exist in the seed fs together with
  the replacing device in the sprout fs, so we only keep the latest device
  with the newest generation

Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
2014-10-10 10:52:41 +02:00
Qu Wenruo 23d7f6d9dc btrfs-progs: Allow btrfs_read_dev_super() to read all 3 super for super_recover.
Btrfs-progs superblock checksum check is somewhat too restricted for
super-recover, since current btrfs-progs will only read the 1st
superblock and if you need super-recover the 1st superblock is
possibly already damaged.

The fix is introducing super_recover parameter for
btrfs_read_dev_super() and callers to allow scan backup superblocks if
needed.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
2014-08-22 15:04:50 +02:00
David Sterba 266c81a910 btrfs-progs: balance filter: add limit of processed chunks
Add more control to the balance behaviour.

Usage filter may not be finegrained enough and can lead to moving too
many chunks at once. Another example use is in connection with
drange+devid or vrange filters that allow to work with a specific chunk
or even with a chunk on a given device.

The limit filter applies last, the value of 0 means no limiting.

CC: Ilya Dryomov <idryomov@gmail.com>
CC: Hugo Mills <hugo@carfax.org.uk>
Signed-off-by: David Sterba <dsterba@suse.cz>
2014-08-22 14:55:26 +02:00
Adam Buchbinder f6a290686e btrfs-progs: Fix a use-after-free in the volumes code.
When a struct btrfs_fs_devices was being torn down by
btrfs_close_devices(), there was an invalidated pointer in the global
list fs_uuids which still pointed to it; if a device was closed and
then reopened (which btrfs-convert does), freed memory would be
accessed.

This was found using ThreadSanitizer (pretty much doing what
AddressSanitizer would, but not exiting after the first failure).
To reproduce, build with -fsanitize=thread and run 'make test'.
Representative output is below.

This change makes the current tests TSan-clean.

WARNING: ThreadSanitizer: heap-use-after-free (pid=29161)
  Read of size 8 at 0x7d180000eee0 by main thread:
    #0 memcmp ??:0
    #1 find_fsid .../volumes.c:81
    #2 device_list_add .../volumes.c:95
    #3 btrfs_scan_one_device .../volumes.c:259
    #4 btrfs_scan_fs_devices .../disk-io.c:1002
    #5 __open_ctree_fd .../disk-io.c:1090
    #6 open_ctree_fd .../disk-io.c:1191
    #7 do_convert .../btrfs-convert.c:2317
    #8 main .../btrfs-convert.c:2745

  Previous write of size 8 at 0x7d180000eee0 by main thread:
    #0 free ??:0
    #1 btrfs_close_devices .../volumes.c:191
    #2 close_ctree .../disk-io.c:1401
    #3 do_convert .../btrfs-convert.c:2300
    #4 main .../btrfs-convert.c:2745

  Location is heap block of size 96 at 0x7d180000eee0 allocated by main thread:
    #0 calloc ??:0 (exe+0x00000002acc6)
    #1 device_list_add .../volumes.c:97
    #2 btrfs_scan_one_device .../volumes.c:259
    #3 btrfs_scan_fs_devices .../disk-io.c:1002
    #4 __open_ctree_fd .../disk-io.c:1090
    #5 open_ctree_fd .../disk-io.c:1191
    #6 do_convert .../btrfs-convert.c:2256
    #7 main .../btrfs-convert.c:2745

Signed-off-by: Adam Buchbinder <abuchbinder@google.com>
Reviewed-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
2014-08-22 14:39:34 +02:00
Anand Jain 5e5fd1b9ed btrfs-progs: don't replicate the stripe_len defines
a clean up patch, the BTRFS_STRIPE_LEN is been duplicated across
btrfs-progs, the kernel defines it in volume.h so do the same
for progs.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
2014-01-31 08:22:18 -08:00
Eric Sandeen 989ca65a11 btrfs-progs: mark static & remove unused from shared kernel code
In files copied from the kernel, mark many functions as static,
and remove any resulting dead code.

Some functions are left unmarked if they aren't static in the
kernel tree.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-09-03 19:40:53 +02:00
Chris Mason 0bae08fdab Merge branch 'liubo-image-restore'
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Conflicts:
	disk-io.c
	volumes.h
2013-07-03 14:24:43 -04:00
Liu Bo 095e21af45 Btrfs-progs: enhance btrfs-image to restore image onto multiple disks
This adds a 'btrfs-image -m' option, which let us restore an image that
is built from a btrfs of multiple disks onto several disks altogether.

This aims to address the following case,
$ mkfs.btrfs -m raid0 sda sdb
$ btrfs-image sda image.file
$ btrfs-image -r image.file sdc
---------
so we can only restore metadata onto sdc, and another thing is we can
only mount sdc with degraded mode as we don't provide informations of
another disk.  And, it's built as RAID0 and we have only one disk,
so after mount sdc we'll get into readonly mode.

This is just annoying for people(like me) who're trying to restore image
but turn to find they cannot make it work.

So this'll make your life easier, just tap
$ btrfs-image -m image.file sdc sdd
---------
then you get everything about metadata done, the same offset with that of
the originals(of course, you need offer enough disk size, at least the disk
size of the original disks).

Besides, this also works with raid5 and raid6 metadata image.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-07-03 14:16:10 -04:00
Miao Xie 3b9e6dd437 Btrfs-progs: Add chunk rebuild function for RAID1/SINGLE/DUP
Add chunk rebuild for RAID1/SINGLE/DUP to chunk-recover command.

Before this patch chunk-recover can only scan and reuse the old chunk
data to recover. With this patch, chunk-recover can use the reference
between chunk/block group/dev extent to rebuild the whole chunk tree
even when old chunks are not available.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-07-03 14:06:55 -04:00
Miao Xie 30d5c8a49f Btrfs-progs: Add chunk recover function - using old chunk items
Add chunk-recover program to check or rebuild chunk tree when the system
chunk array or chunk tree is broken.

Due to the importance of the system chunk array and chunk tree, if one of
them is broken, the whole btrfs will be broken even other data are OK.

But we have some hint(fsid, checksum...) to salvage the old metadata.
So this function will first scan the whole file system and collect the
needed data(chunk/block group/dev extent), and check for the references
between them. If the references are OK, the chunk tree can be rebuilt and
luckily the file system will be mountable.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-07-03 14:06:55 -04:00
Chris Mason 7b1c567c84 Merge branch 'for-chris' of git://repo.or.cz/btrfs-progs-unstable/devel into raid56
Conflicts:
	ctree.h

Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-02-06 12:42:24 -05:00
David Woodhouse 4d48b96b28 Add basic RAID[56] support
David Woodhouse originally contributed this code, and Chris Mason
changed it around to reflect the current design goals for raid56.

The original code expected all metadata and data writes to be full
stripes.  This meant metadata block size == stripe size, and had a few
other restrictions.

This version allows metadata blocks smaller than the stripe size.  It
implements both raid5 and raid6, although it does not have code to
rebuild from parity if one of the drives is missing or incorrect.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-02-01 14:22:07 -05:00
Stefan Behrens 7e08a9116d Btrfs-progs: add support for device replace procedure
This is the user mode part of the device replace patch series.

The command group "btrfs replace" is added with three commands:
- btrfs replace start srcdev|srcdevid targetdev [-Bfr] mount_point
- btrfs replace status mount_point [-1]
- btrfs replace cancel mount_point

Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
2013-01-31 13:47:26 +01:00
Chris Mason e22827e9bb btrfsck: add early code to handle corrupted block groups
This is mostly disabled, but it is step one in handling
corrupted block groups in the extent allocation tree.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
2012-02-22 10:59:55 -05:00
Ilya Dryomov 4f3a15d09a Btrfs-progs: add restriper headers
Add restriper headers and update print-tree.c

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2012-02-03 21:02:29 +02:00
Josef Bacik be826706b5 btrfs-progs: add a recovery utility to pull files from damanged filesystems
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-10-27 12:49:54 -04:00
Donggeun Kim 4e64e05c6b btrfs-progs: Add new feature to mkfs.btrfs to make file system image file from source directory
Changes from V1 to V2:
- support extended attributes
- move btrfs_alloc_data_chunk function to volumes.c
- fix an execution error when additional useless parameters are specified
- fix traverse_directory function so that the insertion functions for the common items are invoked in a single point

The extended attributes is implemented through llistxattr and getxattr function calls.

Thanks

Signed-off-by: Donggeun Kim <dg77.kim@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-10-25 09:18:31 -04:00
Yan Zheng 5ccd1715fa superblock duplication
This patch updates btrfs-progs for superblock duplication.
Note: I didn't make this patch as complete as the one for
kernel since updating the converter requires changing the
code again. Thank you,

Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
2008-12-05 12:21:31 -05:00
Yan Zheng 4d1d3a59d6 update btrfs-progs for seed device support
This patch does the following:

1) Update device management code to match the kernel code.

2) Allocator fixes.

3) Add a program called btrfstune to set/clear the SEEDING
   super block flags.
2008-11-18 10:40:06 -05:00
Chris Mason 8bfbb6b6f8 Update the Ext3 converter
The main changes in this patch are adding chunk handing and data relocation
ability. In the last step of conversion, the converter relocates data in system
chunk and move chunk tree into system chunk. In the rollback process, the
converter remove chunk tree from system chunk and copy data back.

Regards
YZ
---
2008-04-22 14:06:56 -04:00
Chris Mason 358564890a Add a command to show all of the btrfs filesystems on the box (btrfs-show) 2008-04-22 14:06:31 -04:00
Chris Mason 951fd7371c Add chunk uuids and update multi-device back references
Block headers now store the chunk tree uuid

Chunk items records the device uuid for each stripes

Device extent items record better back refs to the chunk tree

Block groups record better back refs to the chunk tree

The chunk tree format has also changed.  The objectid of BTRFS_CHUNK_ITEM_KEY
used to be the logical offset of the chunk.  Now it is a chunk tree id,
with the logical offset being stored in the offset field of the key.

This allows a single chunk tree to record multiple logical address spaces,
upping the number of bytes indexed by a chunk tree from 2^64 to
2^128.
2008-04-15 15:42:08 -04:00
Chris Mason d1b04c2112 Write all super blocks during commit 2008-04-10 16:22:00 -04:00
Chris Mason fd2d0af0bf Retry metadata reads in the face of checksum failures 2008-04-09 16:28:12 -04:00
Chris Mason 1b74adf90b Change btrfs_map_block to return a structure with mappings for all stripes 2008-04-09 16:28:12 -04:00
Chris Mason a6de0bd778 Add mirroring support across multiple drives 2008-04-03 16:35:48 -04:00
Chris Mason 0dcfa3b827 Walk all block devices looking for btrfs 2008-03-24 15:05:44 -04:00
Chris Mason 26afd0f31d ioctls to scan for btrfs filesystems 2008-03-24 15:04:49 -04:00
Chris Mason 1f3ba6a3f9 Btrfsck updates for multi-device filesystems 2008-03-24 15:04:37 -04:00
Chris Mason d12d4c7203 Dynamic chunk allocation 2008-03-24 15:03:58 -04:00
Chris Mason 510be29677 Add support for multiple devices per filesystem 2008-03-24 15:03:18 -04:00