With the introduction of xxhash64 to btrfs-progs we created a crypto/
directory for all the hashes used in btrfs (although no
cryptographically secure hash is there yet).
Move the crc32c implementation from kernel-lib/ to crypto/ as well so we
have all hashes consolidated.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
Adding this table will make extending btrfs-progs with new checksum types
easier.
Also add accessor functions to access the table fields.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
In preparation to supporting new checksum algorithm pass the checksum type
to btrfs_csum_data/btrfs_csum_final, this allows us to encapsulate any
differences in processing into the respective functions
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
Build several standalone tools into one binary and switch the function
by name (symlink or hardlink).
* btrfs
* mkfs.btrfs
* btrfs-image
* btrfs-convert
* btrfstune
The static target is also supported. The name of resulting boxed
binaries is btrfs.box and btrfs.box.static . All the binaries can be
built at the same time without prior configuration.
text data bss dec hex filename
822454 27000 19724 869178 d433a btrfs
927314 28816 20812 976942 ee82e btrfs.box
2067745 58004 44736 2170485 211e75 btrfs.static
2627198 61724 83800 2772722 2a4ef2 btrfs.box.static
File sizes:
857496 btrfs
968536 btrfs.box
2141400 btrfs.static
2704472 btrfs.box.static
Standalone utilities:
512504 btrfs-convert
495960 btrfs-image
471224 btrfstune
491864 mkfs.btrfs
1747720 btrfs-convert.static
1411416 btrfs-image.static
1304256 btrfstune.static
1361696 mkfs.btrfs.static
So the shared 900K binary saves ~2M, or ~5.7M for static build.
Signed-off-by: David Sterba <dsterba@suse.cz>
This patch will export disk-io.c::check_super() as btrfs_check_super()
and use it in btrfs-image for extra verification.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
When there are over 32 (in my example, 35) online CPUs, btrfs-image -c9
will just hang.
[CAUSE]
Btrfs-image has a hard coded limit (32) on how many threads we can use.
For the "-t" option we do the up limit check.
But when we don't specify "-t" option and speicified "-c" option, then
btrfs-image will try to auto detect the number of online CPUs, and use
it without checking if it's over the up limit.
And for num_threads larger than the up limit, we will over write the
adjust members of metadump_struct/mdrestore_struct, corrupting
pthread_mutex_t and pthread_cond_t, causing synchronising problem.
Nowadays, with SMT/HT and higher cpu core counts, it's not hard to go
beyond 32 threads, and hit the bug.
[FIX]
Just do extra num_threads check before using the number from sysconf().
Reviewed-by: Su Yue <Damenly_Su@gmx.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
BTRFS_COMPAT_EXTENT_TREE_V0 is introduced for a short time in kernel,
and it's over 10 years ago.
Nowadays there should be no user for that feature, and kernel has remove
this support in Jun, 2018. There is no need for btrfs-progs to support
it.
This patch will remove EXTENT_TREE_V0 related code and replace those
BUG_ON() to a more graceful error message.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
misc-test/021 will fail with the following error message:
====== RUN CHECK root_helper btrfs-progs/btrfs-image -r btrfs-progs/tests//test.img /dev/loop2
ERROR: failed to enlarge result image: Invalid argument
ERROR: failed to fix chunks and devices mapping, the fs may not be mountable: Invalid argument
extent buffer leak: start 30638080 len 16384
extent buffer leak: start 30408704 len 16384
WARNING: dirty eb leak (aborted trans): start 30408704 len 16384
ERROR: restore failed: Invalid argument
failed: root_helper btrfs-progs/btrfs-image -r btrfs-progs/tests//test.img /dev/loop2
test failed for case 021-image-multi-devices
[CAUSE]
For misc-test/021, we're using loopback device, which doesn't support
ftruncate. That's why it returns EINVAL.
[FIX]
Only call ftruncate64() if we're operating on a regluar file.
Fixes: a7a44164c84e ("btrfs-progs: image: Use correct device size when restoring")
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When restoring btrfs image, the total_bytes of device item is not
updated correctly. In fact total_bytes can be left 0 for restored image.
It doesn't trigger any error because btrfs check never checks
total_bytes of dev item. However this is going to change.
Fix it by populating total_bytes of device item with the end position of
last dev extent to make later btrfs check happy.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Add support for a new metadata_uuid field. This is just a preparatory
commit which switches all users of the fsid field for metdata comparison
purposes to utilize the new field. This more or less mirrors the
kernel patch, additionally:
* Update 'btrfs inspect-internal dump-super' to account for the new
field. This involes introducing the 'metadata_uuid' line to the
output and updating the logic for comparing the fs uuid to the
dev_item uuid.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Since btrfs-image is just restoring tree blocks without really checking
if that tree block contents makes sense, for multi-device image, block
group items will keep the incorrect block group flags.
For example, for a metadata RAID1 data RAID0 btrfs recovered to a single
disk, its chunk tree will look like:
item 1 key (FIRST_CHUNK_TREE CHUNK_ITEM 22020096)
length 8388608 owner 2 stripe_len 65536 type SYSTEM
item 2 key (FIRST_CHUNK_TREE CHUNK_ITEM 30408704)
length 1073741824 owner 2 stripe_len 65536 type METADATA
item 3 key (FIRST_CHUNK_TREE CHUNK_ITEM 1104150528)
length 1073741824 owner 2 stripe_len 65536 type DATA
All chunks have correct type (SINGLE).
While its block group items will look like:
item 1 key (22020096 BLOCK_GROUP_ITEM 8388608)
block group used 16384 chunk_objectid 256 flags SYSTEM|RAID1
item 3 key (30408704 BLOCK_GROUP_ITEM 1073741824)
block group used 114688 chunk_objectid 256 flags METADATA|RAID1
item 11 key (1104150528 BLOCK_GROUP_ITEM 1073741824)
block group used 1572864 chunk_objectid 256 flags DATA|RAID0
All block group items still have the wrong profiles.
And btrfs check (lowmem mode for better output) will report error for
such image:
ERROR: chunk[22020096 30408704) related block group item flags mismatch, wanted: 2, have: 18
ERROR: chunk[30408704 1104150528) related block group item flags mismatch, wanted: 4, have: 20
ERROR: chunk[1104150528 2177892352) related block group item flags mismatch, wanted: 1, have: 9
This patch will do an extra repair for block group items to fix the
profile of them.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Current fixup_devices() will only remove DEV_ITEMs and reset DEV_ITEM
size. Later we need to do more fixup works, so change the name to
fixup_chunks_and_devices() and refactor the original device size fixup
to fixup_device_size().
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Similar to the changes where strerror(errno) was converted, continue
with the remaining cases where the argument was stored in another
variable.
The savings in object size are about 4500 bytes:
$ size btrfs.old btrfs.new
text data bss dec hex filename
805055 24248 19748 849051 cf49b btrfs.old
804527 24248 19748 848523 cf28b btrfs.new
Signed-off-by: David Sterba <dsterba@suse.com>
At restore time, btrfs-image will try to fixup dev extents, this will
trigger a transaction.
It's normally OK to start a new transaction, but for an image with log
tree, a new transaction will increment generation, and log tree generation
will not match super block generation + 1.
There is no good way to solve it yet. Just output a warning for now.
Signed-off-by: Qu Wenruo <wqu@suse.com>
[ update error message ]
Signed-off-by: David Sterba <dsterba@suse.com>
For read_data_extent() in convert/main.c it's using mirror number in a
incorrect way, which will not get correct copy for RAID1:
for (cur_mirror = 0; cur_mirror < num_copies; cur_mirror++) {
In such case, for RAID1 @cur_mirror will only be 0 and 1.
However for 0 and 1 case, btrfs_map_block() will only return the first
copy. To reach the 2nd copy, it correct @cur_mirror range should be 1
and 2.
So with this off-by-one error, btrfs-image will never be able to read
out data extent if the first stripe of the chunk is the missing one.
Fix it by starting @cur_mirror from 1 and to @num_copies (including).
Fixes: 2d46558b30 ("btrfs-progs: Use existing facility to replace read_data_extent function")
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The kernel code no longer has BTRFS_CRC32_SIZE and only uses
btrfs_csum_sizes[]. So, update the progs code as well.
Suggested-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
As btrfs is specific to Linux, %m can be used instead of strerror(errno)
in format strings. This has some size reduction benefits for embedded
systems.
glibc, musl, and uclibc-ng all support %m as a modifier to printf.
A quick glance at the BIONIC libc source indicates that it has
support for %m as well. BSDs and Windows do not but I do believe
them to be beyond the scope of btrfs-progs.
Compiled sizes on Ubuntu 16.04:
Before:
3916512 btrfs
233688 libbtrfs.so.0.1
4899 bcp
2367672 btrfs-convert
2208488 btrfs-corrupt-block
13302 btrfs-debugfs
2152160 btrfs-debug-tree
2136024 btrfs-find-root
2287592 btrfs-image
2144600 btrfs-map-logical
2130760 btrfs-select-super
2152608 btrfstune
2131760 btrfs-zero-log
2277752 mkfs.btrfs
9166 show-blocks
After:
3908744 btrfs
233256 libbtrfs.so.0.1
4899 bcp
2366560 btrfs-convert
2207432 btrfs-corrupt-block
13302 btrfs-debugfs
2151104 btrfs-debug-tree
2134968 btrfs-find-root
2281864 btrfs-image
2143536 btrfs-map-logical
2129704 btrfs-select-super
2151552 btrfstune
2130696 btrfs-zero-log
2276272 mkfs.btrfs
9166 show-blocks
Total savings: 23928 (24 kilo)bytes
Signed-off-by: Rosen Penev <rosenp@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The function uses the reverse CRC32C table to quickly calculate a
4-byte suffix, that when added to the original data will make it
match desired checksum.
Author: Piotr Pawlow <pp@siedziba.pl>
[ minor adjustments ]
Signed-off-by: David Sterba <dsterba@suse.com>
The table will be used to speed up calculations of CRC32C collisions.
Author: Piotr Pawlow <pp@siedziba.pl>
Signed-off-by: David Sterba <dsterba@suse.com>
Function find_collision sometimes generated file names with non-printable
DEL characters (code 127), for example file name "|5gp!" would be changed
to "U'2<DEL>y" when using "crc-collisions" sanitize mode.
Author: Piotr Pawlow <pp@siedziba.pl>
Signed-off-by: David Sterba <dsterba@suse.com>
Tree blocks are always nodesize. As readahead is only an optimization,
exact size is not required and is only advisory.
Signed-off-by: David Sterba <dsterba@suse.com>
Making the code data-race safe requires that reads *and* writes
happen under a mutex lock, if any of the access are writes. See
Dmitri Vyukov, "Benign data races: what could possibly go wrong?"
for more details.
The fix here was to put most of the main loop of restore_worker
under a mutex lock.
This race was detected using fsck-tests/012-leaf-corruption.
==================
WARNING: ThreadSanitizer: data race
Write of size 4 by main thread:
#0 add_cluster btrfs-progs/image/main.c:1931
#1 restore_metadump btrfs-progs/image/main.c:2566
#2 main btrfs-progs/image/main.c:2859
Previous read of size 4 by thread T6:
#0 restore_worker btrfs-progs/image/main.c:1720
Location is stack of main thread.
Thread T6 (running) created by main thread at:
#0 pthread_create <null>
#1 mdrestore_init btrfs-progs/image/main.c:1868
#2 restore_metadump btrfs-progs/image/main.c:2534
#3 main btrfs-progs/image/main.c:2859
SUMMARY: ThreadSanitizer: data race btrfs-progs/image/main.c:1931 in
add_cluster
Signed-off-by: Adam Buchbinder <abuchbinder@google.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The only reasom read_tree_block() needs a btrfs_root parameter is to get
its node/sector size.
And long ago, I have already introduced a compactible interface,
read_tree_block_fs_info() to pass btrfs_fs_info instead of btrfs_root.
Since we have cleaned up all root->sector/node/stripesize users, we
should be OK to refactor read_tree_block() function.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
We correctly build an image from a multiple devices filesystem but when
restoring the image into a single device we were missing updating the
number of devices in the superblock to the value 1 (we already took care
of setting the number of stripes to 1 for each chunk item and setting
the device id for each chunk item to match the device id from the super
block).
This missing update of the number of devices makes it impossible to mount
the restored filesystem on recent kernels, more specifically since the
linux kernel commit 99e3ecfcb9f4ca35192d20a5bea158b81f600062
("Btrfs: add more validation checks for superblock"), that produce the
following message in the dmesg/syslog:
[21097.542047] BTRFS error (device sdi): super_num_devices 2 mismatch with num_devices 1 found here
[21097.543972] BTRFS error (device sdi): failed to read chunk tree: -22
[21097.720360] BTRFS error (device sdi): open_ctree failed
So fix this by updating the number of devices to 1 in the superblock.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
While performing a memcpy, we are copying from uninitialized dst
as opposed to src->data. Though using eb->len is correct, I used
src->len to make it more readable.
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>