Commit Graph

90 Commits

Author SHA1 Message Date
Qu Wenruo
60651ad9da btrfs-progs: introduce OPEN_CTREE_ALLOW_TRANSID_MISMATCH flag
[BUG]
There is a report that, btrfstune can even work while the fs has transid
mismatch problems.

  $ btrfstune -f -u /dev/sdb1
  Current fsid: b2b5ae8d-4c49-45f0-b42e-46fe7dcfcb07
  New fsid: b2b5ae8d-4c49-45f0-b42e-46fe7dcfcb07
  Set superblock flag CHANGING_FSID
  Change fsid in extents
  parent transid verify failed on 792854528 wanted 20103 found 20091
  parent transid verify failed on 792854528 wanted 20103 found 20091
  parent transid verify failed on 792854528 wanted 20103 found 20091
  Ignoring transid failure
  parent transid verify failed on 792870912 wanted 20103 found 20091
  parent transid verify failed on 792870912 wanted 20103 found 20091
  parent transid verify failed on 792870912 wanted 20103 found 20091
  Ignoring transid failure
  parent transid verify failed on 792887296 wanted 20103 found 20091
  parent transid verify failed on 792887296 wanted 20103 found 20091
  parent transid verify failed on 792887296 wanted 20103 found 20091
  Ignoring transid failure
  ERROR: child eb corrupted: parent bytenr=38010880 item=69 parent level=1 child level=1
  ERROR: failed to change UUID of metadata: -5
  ERROR: btrfstune failed

This leaves a corrupted fs even more corrupted, and due to the extra
CHANGING_FSID flag, btrfs check will not even try to run on it:

  Opening filesystem to check...
  ERROR: Filesystem UUID change in progress
  ERROR: cannot open file system

[CAUSE]
Unlike kernel, btrfs-progs has a less strict check on transid mismatch.

In read_tree_block() we will fall back to use the tree block even its
transid mismatch if we can't find any better copy.

However not all commands in btrfs-progs needs this feature, only
btrfs-check (which may fix the problem) and btrfs-restore (it just tries
to ignore any problems) really utilize this feature.

[FIX]
Introduce a new open ctree flag, OPEN_CTREE_ALLOW_TRANSID_MISMATCH, to
be explicit about whether we really want to ignore transid error.

Currently only btrfs-check and btrfs-restore will utilize this new flag.

Also add btrfs-image to allow opening such fs with transid error.

Link: https://www.reddit.com/r/btrfs/comments/pivpqk/failure_during_btrfstune_u/
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-09-20 12:17:29 +02:00
Qu Wenruo
3875c14a5a btrfs-progs: image: fix restored image size misalignment
[BUG]
There is a small device size misalignment between the super block device
size and the device extent size:

  total_bytes             10737418240 	<<<
  bytes_used              15097856
  dev_item.total_bytes    10737418240
  dev_item.bytes_used     1094713344

        item 0 key (DEV_ITEMS DEV_ITEM 1) itemoff 16185 itemsize 98
                devid 1 total_bytes 1095761920 bytes_used 1094713344
				    ^^^^^^^^^^

[CAUSE]
In fixup_device_size(), we only reset superblock device item size, which
will be overwritten in write_dev_supers() using btrfs_device::total_bytes.

And it doesn't touch btrfs_superblock::total_bytes either.

[FIX]
So fix the small mismatch by also resetting btrfs_device::total_bytes,
btrfs_device::bytes_used and btrfs_superblock::total_bytes.

Thankfully since commit 73dd4e3c87 ("btrfs-progs: image: Don't modify
the chunk and device tree if the source dump is single device") single
device dump won't have such problem, but it's still worth for
multi-device dump.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-25 15:38:53 +02:00
Qu Wenruo
c4ff83cb5d btrfs-progs: image: reduce memory requirements for decompression
With recent change to enlarge max_pending_size to 256M for data dump,
the decompress code requires quite a lot of memory, up to 256M * 4.

The reason is we're using wrapped uncompress() function call, which
needs the buffer to be large enough to contain the decompressed data.

This patch will re-work the decompress work to use inflate() which can
resume it decompression so that we can use a much smaller buffer size.

Use 512K as buffer size.

Now the memory consumption for restore is reduced to

 cluster data size + 512K * nr_running_threads

Instead of the original one:

 cluster data size + 1G * nr_running_threads

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-25 15:38:53 +02:00
Qu Wenruo
0cc8a7bd4a btrfs-progs: image: introduce -d option to dump data
This new experimental data dump feature will dump the whole image, not
only the existing tree blocks but also all its data extents(*).

This feature will rely on the new dump format (_DUmP_v1), as it needs
extra large extent size limit, and older btrfs-image dump can't handle
such large item/cluster size.

Since we're dumping all extents including data extents, for the restored
image there is no need to use any extra super block flags to inform
kernel.
Kernel should just treat the restored image as any ordinary btrfs.

This new feature will be hidden behind the experimental features, that's
to say, if --enable-experimental is not enabled, although we still have
the option, it will not do anything but output an error message.

*: The data extents will be dumped as is, that's to say, even for
preallocated extent, its (meaningless) data will be read out and
dumpped.
This behavior will cause extra space usage for the image, but we can
skip all the complex partially shared preallocated extent check.

Issue: #394
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-25 15:38:53 +02:00
Qu Wenruo
6b03e0dc40 btrfs-progs: image: introduce framework for more dump versions
The original dump format only contains a magic member to verify the
format, this means if we want to introduce new on-disk format or change
certain size limit, we can only introduce new magic as version.

Introduce the framework to allow multiple magic numbers to co-exist for
further extensions.

Introduce the following members for each dump version.

- max_pending_size
  The threshold size of a cluster. It's not a hard limit but a soft
  one. One cluster can go larger than max_pending_size for one item, but
  next item would go to the next cluster.

- magic_cpu
  The magic number in CPU byte order.

- extra_sb_flags
  If the super block of this restore needs extra super block flags like
  BTRFS_SUPER_FLAG_METADUMP_V2.
  For incoming data dump feature, we don't need any extra super block
  flags.

This change also implies that all image dumps will use the same magic
for all clusters. No mixing is allowed, as we will use the first cluster
to determine the dump version.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-25 15:38:53 +02:00
Qu Wenruo
e916d57466 btrfs-progs: image: enlarge output file if no tree modification is needed for restore
[BUG]
If restoring dumped image to a new file, under most cases kernel will
reject it since version 5.11:

 # mkfs.btrfs -f /dev/test/test
 # btrfs-image /dev/test/test /tmp/dump
 # btrfs-image -r /tmp/dump ~/test.img
 # mount ~/test.img /mnt/btrfs
 mount: /mnt/btrfs: wrong fs type, bad option, bad superblock on /dev/loop0, missing codepage or helper program, or other error.
 # dmesg -t | tail -n 7
 loop0: detected capacity change from 10592 to 0
 BTRFS info (device loop0): disk space caching is enabled
 BTRFS info (device loop0): has skinny extents
 BTRFS info (device loop0): flagging fs with big metadata feature
 BTRFS error (device loop0): device total_bytes should be at most 5423104 but found 10737418240
 BTRFS error (device loop0): failed to read chunk tree: -22
 BTRFS error (device loop0): open_ctree failed

[CAUSE]
When btrfs-image restores an image into a file, and the source image
contains only single device, then we don't need to modify the
chunk/device tree, as we can reuse the existing chunk/dev tree without
any problem.

This also means, for such restore, we also won't do any target file
enlarge. This behavior itself is fine, as at that time, kernel won't
check if the device is smaller than the device size recorded in device
tree.

But later kernel commit 3a160a933111 ("btrfs: drop never met disk total
bytes check in verify_one_dev_extent") introduces new check on device
size at mount time, rejecting any loop file which is smaller than the
original device size.

[FIX]
Do extra file enlarge for single device restore if the restored file is
smaller than the device size.

Reported-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Su Yue <l@damenly.su>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-06 16:41:47 +02:00
Qu Wenruo
9900c7a6bc btrfs-progs: image: remove the dead stat() call in metadump
In restore_metadump(), we call stat() but never use the result.  This
call site is left by some code refactoring, as the stat() call is now
moved into fixup_device_size().  We can safely remove the call.

Reviewed-by: Su Yue <l@damenly.su>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-06 16:41:47 +02:00
David Sterba
7fa07e2abb btrfs-progs: split open/close helpers from utils.c
There's a group of functions that are related to opening filesystem in
various modes, this can be moved to a separate file.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-06 16:41:47 +02:00
David Sterba
6bb6d1215d btrfs-progs: factor open_ctree parameters to a structure
Extending open_ctree with more parameters would be difficult, we'll need
to add more so factor out the parameters to a structure for easier
extension.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-03-24 22:20:19 +01:00
Josef Bacik
0c6e59e0b2 btrfs-progs: image: fix invalid size check for extent items
While trying to run down a corruption problem I needed to use
btrfs-image to generate known good states in between tests.  At some
point this started failing with

  either extent tree is corrupted or deprecated extent ref format

This is because the fs had an extent item that was large enough that it
no longer had inline extent references, they were all keyed extent
references.  The check is bogus, we can have extent items that are >=
the extent item size, not just > than the extent item size.  Fix the
check so that we can generate metadata dumps properly.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-12-16 17:08:53 +01:00
David Sterba
0144bcb713 btrfs-progs: move volumes.c to kernel-shared/
Signed-off-by: David Sterba <dsterba@suse.com>
2020-08-31 17:01:06 +02:00
David Sterba
6069bc52a9 btrfs-progs: move transaction.c to kernel-shared/
Signed-off-by: David Sterba <dsterba@suse.com>
2020-08-31 17:01:06 +02:00
David Sterba
abb670f883 btrfs-progs: move ctree.c to kernel-shared/
Signed-off-by: David Sterba <dsterba@suse.com>
2020-08-31 17:01:05 +02:00
David Sterba
772f0da6df btrfs-progs: move disk-io.c to kernel-shared/
Signed-off-by: David Sterba <dsterba@suse.com>
2020-08-31 17:01:05 +02:00
David Sterba
4e49bd703d btrfs-progs: move extent_io.c to kernel-shared/
Signed-off-by: David Sterba <dsterba@suse.com>
2020-08-31 17:01:04 +02:00
David Sterba
a4122790ac btrfs-progs: move extent-cache.c to common/
Signed-off-by: David Sterba <dsterba@suse.com>
2020-08-31 17:01:04 +02:00
Qu Wenruo
b602b22884 btrfs-progs: image: pin down log tree blocks before fixup
Although btrfs-image will dump log tree, we will modify the restored
image if it's not a multi-device restore.

In that case, since log tree blocks are not recorded in the extent tree,
extent allocator will try to re-use the tree blocks belonging to log
trees, this would lead to transid mismatch if btrfs-restore chooses to
do device/chunk fixup.

This patch will fix such problem by pinning down all the log trees
blocks before fixing up the device and chunk trees.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-05-29 12:42:49 +02:00
Qu Wenruo
73dd4e3c87 btrfs-progs: image: Don't modify the chunk and device tree if the source dump is single device
Btrfs-image -r doesn't always create exactly the same fs of the original
fs, this is because btrfs-image can create dump from multi-device, and
restore it to single device.

Thus we need some special modification to chunk and device tree. This
behavior is mostly to handle the old behavior where we always restore
the metadata into SINGLE profile no matter what.

However this is not needed if the source fs only has one device, in that
case, we can use all the metadata as is, no need to modify the fs at
all, resulting an exact copy of metadata.

This patch will do extra check when doing restore, to avoid modify the
restored fs if the source is also single device.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-05-29 12:42:49 +02:00
Qu Wenruo
ccad599701 btrfs-progs: rename btrfs_block_group_cache to btrfs_block_group
To keep the same naming across kernel and btrfs-progs.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-05-11 20:50:00 +02:00
Adam Borowski
3d379b1341 btrfs-progs: lots of typo fixes (codespell)
Signed-off-by: Adam Borowski <kilobyte@angband.pl>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-31 18:37:38 +02:00
Su Yue
b1bd3cd93f btrfs-progs: reform block groups caches structure
This commit organises block groups cache in
btrfs_fs_info::block_group_cache_tree. And any dirty block groups are
linked in transaction_handle::dirty_bgs.

To keep coherence of bisect, it does almost replace in place:
1. Replace the old btrfs group lookup functions with new functions
introduced in former commits.
2. set_extent_bits(..., BLOCK_GROUP_DIRYT) things are replaced by linking
the block group cache into trans::dirty_bgs. Checking and clearing bits
are transformed too.
3. set_extent_bits(..., bit | EXTENT_LOCKED) things are replaced by
new the btrfs_add_block_group_cache() which inserts caches into
btrfs_fs_info::block_group_cache_tree directly. Other operations are
converted to tree operations.

Signed-off-by: Su Yue <Damenly_Su@gmx.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-03 19:58:54 +01:00
Su Yue
162d891e4a btrfs-progs: pass @trans to functions working with dirty block groups
We are going to touch dirty_bgs in transaction directly, so every call
chain should pass @trans to the leaf functions.

Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Su Yue <Damenly_Su@gmx.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-03 19:58:54 +01:00
Qu Wenruo
989a99b5f8 btrfs-progs: Replace btrfs_block_group_cache::item with dedicated members
We access btrfs_block_group_cache::item mostly for @used and @flags.

@flags is already a dedicated member in btrfs_block_group_cache, only
@used doesn't have a dedicated member.

This patch will remove btrfs_block_group_cache::item and add
btrfs_block_group_cache::used.

It's the btrfs-progs equivalent of the following kernel patches:
btrfs: move block_group_item::used to block group
btrfs: move block_group_item::flags to block group
btrfs: remove embedded block_group_cache::item

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18 19:21:09 +01:00
Qu Wenruo
1796d5099f btrfs-progs: image: Rework how we search chunk tree blocks
Before this patch, we were using a very inefficient way to search
chunks:

We iterate through all clusters to find the chunk root tree block first,
then re-iterate all clusters again to find every child tree block.

Each time we need to iterate all clusters just to find a chunk tree
block.  This is obviously inefficient, especially when chunk tree gets
larger.  So the original author leaves a comment on it:

  /* If you have to ask you aren't worthy */
  static int search_for_chunk_blocks()

This patch will change the behavior so that we will only iterate all
clusters once.

The idea behind the optimization is, since we have the superblock
restored first, we could use the CHUNK_ITEMs in
super_block::sys_chunk_array to build a SYSTEM chunk mapping.

Then, when we start to iterate through all items, we can easily skip
unrelated items at different level:

- At cluster level
  If a cluster starts beyond last system chunk map, it must not contain
  any chunk tree blocks (as chunk tree blocks only lives inside system
  chunks)

- At item level
  If one item has no intersection with any system chunk map, then it
  must not contain any tree blocks.

By this, we can iterate through all clusters just once, and find out all
CHUNK_ITEMs.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18 19:21:06 +01:00
Qu Wenruo
7abe4c3385 btrfs-progs: image: determine if a tree block is in the range of system chunks
Introduce a new helper function, is_in_sys_chunks(), to determine if an
item is in the range of system chunks.

Since btrfs-image will merge adjacent same type extents into one item,
this function is designed to return true for any bytes in system chunk
range.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18 19:21:06 +01:00
Qu Wenruo
6cef330d2d btrfs-progs: image: Allow restore to record system chunk ranges for later usage
Currently we are doing a pretty slow search for system chunks before
restoring real data.
The current behavior is to search all clusters for chunk tree root
first, then search all clusters again and again for every chunk tree
block.

This causes recursive calls and pretty slow start up, the only good news
is since chunk tree are normally small, we don't need to iterate too
many times, thus overall it's acceptable.

To address such bad behavior, we could take usage of system chunk array
in the super block.
By recording all system chunks ranges, we could easily determine if an
extent belongs to chunk tree, thus do one loop simple linear search for
chunk tree leaves.

This patch only introduces the code base for later patches.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18 19:21:06 +01:00
Qu Wenruo
52ad6dcbcd btrfs-progs: image: Don't waste memory when we're just extracting super block
There is no need to allocate 2 * max_pending_size (which can be 256M) if
we're just extracting super block.

We only need to prepare BTRFS_SUPER_INFO_SIZE as buffer size.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18 19:21:06 +01:00
Qu Wenruo
d216266340 btrfs-progs: image: Fix error output to show correct return value
We can easily get confusing error message like:
  ERROR: restore failed: Success

This is caused by wrong "%m" usage, as we normally use ret to indicate
error, without populating errno.

This patch will fix it by output the return value directly as normally
we have extra error message to show more meaning message than the return
value.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18 19:21:06 +01:00
Qu Wenruo
4dd66c7991 btrfs-progs: image: Output error message for chunk tree build error
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18 19:21:06 +01:00
Johannes Thumshirn
c04bcdcacc btrfs-progs: move crc32c implementation to crypto/
With the introduction of xxhash64 to btrfs-progs we created a crypto/
directory for all the hashes used in btrfs (although no
cryptographically secure hash is there yet).

Move the crc32c implementation from kernel-lib/ to crypto/ as well so we
have all hashes consolidated.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18 19:20:02 +01:00
Johannes Thumshirn
e4a8e1916d btrfs-progs: add table for checksum type and name
Adding this table will make extending btrfs-progs with new checksum types
easier.

Also add accessor functions to access the table fields.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-10-14 17:29:05 +02:00
Johannes Thumshirn
7b4f1035a6 btrfs-progs: pass checksum type to btrfs_csum_data()/btrfs_csum_final()
In preparation to supporting new checksum algorithm pass the checksum type
to btrfs_csum_data/btrfs_csum_final, this allows us to encapsulate any
differences in processing into the respective functions

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-10-14 17:28:28 +02:00
David Sterba
9b3a9daacd btrfs-progs: build: add stub makefile to image and mkfs
Add makefiles that work inside the image and mkfs directories, this
already exists for convert.

Signed-off-by: David Sterba <dsterba@suse.com>
2019-07-04 15:36:01 +02:00
David Sterba
bd4a386ec5 btrfs-progs: build most common tools into one binary (busybox style)
Build several standalone tools into one binary and switch the function
by name (symlink or hardlink).

* btrfs
* mkfs.btrfs
* btrfs-image
* btrfs-convert
* btrfstune

The static target is also supported. The name of resulting boxed
binaries is btrfs.box and btrfs.box.static . All the binaries can be
built at the same time without prior configuration.

   text    data     bss     dec     hex filename
 822454   27000   19724  869178   d433a btrfs
 927314   28816   20812  976942   ee82e btrfs.box
2067745   58004   44736 2170485  211e75 btrfs.static
2627198   61724   83800 2772722  2a4ef2 btrfs.box.static

File sizes:

  857496  btrfs
  968536  btrfs.box
 2141400  btrfs.static
 2704472  btrfs.box.static

Standalone utilities:

  512504  btrfs-convert
  495960  btrfs-image
  471224  btrfstune
  491864  mkfs.btrfs

 1747720  btrfs-convert.static
 1411416  btrfs-image.static
 1304256  btrfstune.static
 1361696  mkfs.btrfs.static

So the shared 900K binary saves ~2M, or ~5.7M for static build.

Signed-off-by: David Sterba <dsterba@suse.cz>
2019-07-04 15:30:40 +02:00
David Sterba
ccbea0977b btrfs-progs: utils: split device handling functions to own file
Helpers that read size, do zeoring, trim or prepare/finalize the device.

Signed-off-by: David Sterba <dsterba@suse.com>
2019-07-04 02:06:34 +02:00
David Sterba
94fced6353 btrfs-progs: build: drop kernel-lib from -I and update paths
Include the files by full path to avoid any confusion in case of
potentially duplicate names.

Signed-off-by: David Sterba <dsterba@suse.com>
2019-07-03 20:49:04 +02:00
David Sterba
c07960c8be btrfs-progs: move utils.[ch] to common/
Update include paths and remove some duplicates.

Signed-off-by: David Sterba <dsterba@suse.com>
2019-07-03 20:49:04 +02:00
David Sterba
f93b471143 btrfs-progs: move help.[ch] to common/
Signed-off-by: David Sterba <dsterba@suse.com>
2019-07-03 20:49:03 +02:00
David Sterba
d1efe50d0a btrfs-progs: move messages.[ch] to common/
Signed-off-by: David Sterba <dsterba@suse.com>
2019-07-03 20:49:03 +02:00
David Sterba
f63f29e9e9 btrfs-progs: move internal.h to common/
Create directory for all sources that can be used by anything that's not
rellated to a relevant kernel part, all common functions, helpers,
utilities that do not fit any other specific category.

The traditional location would be probably lib/ with all things that are
statically linked to the main binaries, but we have libbtrfs and
libbtrfsutil so this would be confusing.

Signed-off-by: David Sterba <dsterba@suse.com>
2019-07-03 20:49:03 +02:00
Qu Wenruo
ab5079c19a btrfs-progs: image: Verify the superblock before restore
This patch will export disk-io.c::check_super() as btrfs_check_super()
and use it in btrfs-image for extra verification.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-06-14 17:42:03 +02:00
Qu Wenruo
686e86d82d btrfs-progs: image: Fix a access-beyond-boundary bug when there are 32 online CPUs
[BUG]
When there are over 32 (in my example, 35) online CPUs, btrfs-image -c9
will just hang.

[CAUSE]
Btrfs-image has a hard coded limit (32) on how many threads we can use.
For the "-t" option we do the up limit check.

But when we don't specify "-t" option and speicified "-c" option, then
btrfs-image will try to auto detect the number of online CPUs, and use
it without checking if it's over the up limit.

And for num_threads larger than the up limit, we will over write the
adjust members of metadump_struct/mdrestore_struct, corrupting
pthread_mutex_t and pthread_cond_t, causing synchronising problem.

Nowadays, with SMT/HT and higher cpu core counts, it's not hard to go
beyond 32 threads, and hit the bug.

[FIX]
Just do extra num_threads check before using the number from sysconf().

Reviewed-by: Su Yue <Damenly_Su@gmx.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-06-14 17:41:35 +02:00
Qu Wenruo
d8c27db9ac btrfs-progs: image: Fix a indent misalign
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-06-14 17:39:30 +02:00
Qu Wenruo
0520abec4f btrfs-progs: image: Use SZ_* to replace intermediate size
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-06-14 17:39:22 +02:00
Qu Wenruo
085445e793 btrfs-progs: Cleanup BTRFS_COMPAT_EXTENT_TREE_V0
BTRFS_COMPAT_EXTENT_TREE_V0 is introduced for a short time in kernel,
and it's over 10 years ago.

Nowadays there should be no user for that feature, and kernel has remove
this support in Jun, 2018. There is no need for btrfs-progs to support
it.

This patch will remove EXTENT_TREE_V0 related code and replace those
BUG_ON() to a more graceful error message.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-06-05 18:00:07 +02:00
Qu Wenruo
37cf7de3a1 btrfs-progs: image: Only enlarge result image if it's a regular file
[BUG]
misc-test/021 will fail with the following error message:

  ====== RUN CHECK root_helper btrfs-progs/btrfs-image -r btrfs-progs/tests//test.img /dev/loop2
  ERROR: failed to enlarge result image: Invalid argument
  ERROR: failed to fix chunks and devices mapping, the fs may not be mountable: Invalid argument
  extent buffer leak: start 30638080 len 16384
  extent buffer leak: start 30408704 len 16384
  WARNING: dirty eb leak (aborted trans): start 30408704 len 16384
  ERROR: restore failed: Invalid argument
  failed: root_helper btrfs-progs/btrfs-image -r btrfs-progs/tests//test.img /dev/loop2
  test failed for case 021-image-multi-devices

[CAUSE]
For misc-test/021, we're using loopback device, which doesn't support
ftruncate.  That's why it returns EINVAL.

[FIX]
Only call ftruncate64() if we're operating on a regluar file.

Fixes: a7a44164c84e ("btrfs-progs: image: Use correct device size when restoring")
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-01-15 18:42:12 +01:00
Qu Wenruo
9d5e88940a btrfs-progs: image: Use correct device size when restoring
When restoring btrfs image, the total_bytes of device item is not
updated correctly. In fact total_bytes can be left 0 for restored image.

It doesn't trigger any error because btrfs check never checks
total_bytes of dev item.  However this is going to change.

Fix it by populating total_bytes of device item with the end position of
last dev extent to make later btrfs check happy.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-01-15 18:42:12 +01:00
Nikolay Borisov
c4aadd9af2 btrfs-progs: Add support for metadata_uuid field
Add support for a new metadata_uuid field. This is just a preparatory
commit which switches all users of the fsid field for metdata comparison
purposes to utilize the new field. This more or less mirrors the
kernel patch, additionally:

 * Update 'btrfs inspect-internal dump-super' to account for the new
 field. This involes introducing the 'metadata_uuid' line to the
 output and updating the logic for comparing the fs uuid to the
 dev_item uuid.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2018-12-06 12:51:36 +01:00
Qu Wenruo
a1a98ee7a8 btrfs-progs: image: Remove all existing dev extents for later rebuild
For multi-disk btrfs image recovered to single disk, the dev tree would
look like:

        item 2 key (1 DEV_EXTENT 22020096)
                dev extent chunk_tree 3
                chunk_objectid 256 chunk_offset 22020096 length 8388608
        item 3 key (1 DEV_EXTENT 30408704)
                dev extent chunk_tree 3
                chunk_objectid 256 chunk_offset 30408704 length 1073741824
        item 4 key (1 DEV_EXTENT 1104150528)
                dev extent chunk_tree 3
                chunk_objectid 256 chunk_offset 1104150528 length 536870912
        item 5 key (2 DEV_EXTENT 1048576)
                dev extent chunk_tree 3
                chunk_objectid 256 chunk_offset 22020096 length 8388608
        item 6 key (2 DEV_EXTENT 9437184)
                dev extent chunk_tree 3
                chunk_objectid 256 chunk_offset 30408704 length 1073741824
        item 7 key (2 DEV_EXTENT 1083179008)
                dev extent chunk_tree 3
                chunk_objectid 256 chunk_offset 1104150528 length 536870912

However in chunk tree, we only use devid 2, thus devid 1 is completely
garbage:

        item 0 key (DEV_ITEMS DEV_ITEM 2)
        item 1 key (FIRST_CHUNK_TREE CHUNK_ITEM 22020096) itemoff 16105 itemsize 80
                length 8388608 owner 2 stripe_len 65536 type SYSTEM
                num_stripes 1 sub_stripes 0
                        stripe 0 devid 2 offset 1048576
        item 2 key (FIRST_CHUNK_TREE CHUNK_ITEM 30408704) itemoff 16025 itemsize 80
                length 1073741824 owner 2 stripe_len 65536 type METADATA
                num_stripes 1 sub_stripes 0
                        stripe 0 devid 2 offset 9437184
        item 3 key (FIRST_CHUNK_TREE CHUNK_ITEM 1104150528) itemoff 15945 itemsize 80
                length 1073741824 owner 2 stripe_len 65536 type DATA
                num_stripes 1 sub_stripes 0
                        stripe 0 devid 2 offset 1083179008

To fix the problem, the most straight-forward way is to remove all
existing dev extents, and then re-fill correct dev extents from chunk.

So this patch just follows the straight-forward way to fix it, causing
the final dev extents layout to match chunk tree, and make btrfs check
happy.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2018-12-05 15:47:24 +01:00
Qu Wenruo
9a65b425bb btrfs-progs: image: Fix block group item flags when restoring multi-device image to single device
Since btrfs-image is just restoring tree blocks without really checking
if that tree block contents makes sense, for multi-device image, block
group items will keep the incorrect block group flags.

For example, for a metadata RAID1 data RAID0 btrfs recovered to a single
disk, its chunk tree will look like:

	item 1 key (FIRST_CHUNK_TREE CHUNK_ITEM 22020096)
		length 8388608 owner 2 stripe_len 65536 type SYSTEM
	item 2 key (FIRST_CHUNK_TREE CHUNK_ITEM 30408704)
		length 1073741824 owner 2 stripe_len 65536 type METADATA
	item 3 key (FIRST_CHUNK_TREE CHUNK_ITEM 1104150528)
		length 1073741824 owner 2 stripe_len 65536 type DATA

All chunks have correct type (SINGLE).

While its block group items will look like:

	item 1 key (22020096 BLOCK_GROUP_ITEM 8388608)
		block group used 16384 chunk_objectid 256 flags SYSTEM|RAID1
	item 3 key (30408704 BLOCK_GROUP_ITEM 1073741824)
		block group used 114688 chunk_objectid 256 flags METADATA|RAID1
	item 11 key (1104150528 BLOCK_GROUP_ITEM 1073741824)
		block group used 1572864 chunk_objectid 256 flags DATA|RAID0

All block group items still have the wrong profiles.

And btrfs check (lowmem mode for better output) will report error for
such image:

  ERROR: chunk[22020096 30408704) related block group item flags mismatch, wanted: 2, have: 18
  ERROR: chunk[30408704 1104150528) related block group item flags mismatch, wanted: 4, have: 20
  ERROR: chunk[1104150528 2177892352) related block group item flags mismatch, wanted: 1, have: 9

This patch will do an extra repair for block group items to fix the
profile of them.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2018-12-05 15:47:23 +01:00