Print warning when one of the following is requested by some command
line option:
- btrfstune -b: conversion to block-group-tree
- mkfs.btrfs --num-global-roots: extent-tree-v2
- btrfs-image -d: dump image with data
Issue: #523
Signed-off-by: David Sterba <dsterba@suse.com>
The option is listed among normal options but getopt does not recognize
it outside of experimental build, so make it consistent. Also mention
the correct version and need of the special build in documentation.
Signed-off-by: David Sterba <dsterba@suse.com>
The block group tree doesn't yet have full bi-directional conversion
support from btrfstune, and it seems we may want one or two release
cycles to rule out some extra bugs before really releasing the progs
support.
This patch will hide the block group tree feature behind experimental
flag for the following tools:
- btrfstune
"-b" option to convert to bg tree.
- mkfs.btrfs
hide "block-group-tree" feature from both -O (the new default position
for all features) and -R (the old, soon to be deprecated one).
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The preferred order:
- system headers
- standard headers
- libraries
- kernel library
- kernel shared
- common headers
- other tools
- own headers
Signed-off-by: David Sterba <dsterba@suse.com>
The new '-b' option will be responsible for converting to block group
tree compat ro feature.
The workflow looks like this for new convert:
- Setting CHANGING_BG_TREE flag
And initialize fs_info->last_converted_bg_bytenr value to (u64)-1.
Any bg with bytenr >= last_converted_bg_bytenr will have its bg item
update go to the new root (bg tree).
- Iterate each block group by their bytenr in descending order
This involves:
* Delete the old bg item from the old tree (extent tree)
* Update last_converted_bg_bytenr to the bytenr of the bg
* Add the new bg item into the new tree (bg tree)
* If we have converted a bunch of bgs, commit current transaction
- Clear CHANGING_BG_TREE flag
And set the new BLOCK_GROUP_TREE compat ro flag and commit.
And since we're doing the convert in multiple transactions, we also need
to resume from last interrupted convert.
In that case, we just grab the last unconverted bg, and start from it.
And to co-operate with the new kernel requirement for both no-holes and
free-space-tree features, the convert tool will check for
free-space-tree feature. If not enabled, will error out with an error
message to how to continue (by mounting with "-o space_cache=v2").
For missing no-holes feature, we just need to set the flag during
convert.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Add constant for initial value to avoid unexpected clashes with user
defined getopt values and shift the common size getopt values.
Signed-off-by: David Sterba <dsterba@suse.com>
Qu noticed that the full checksums are still printed even if the
experimental build is not enabled. This is caused by wrong use of #ifdef
(as the macro is always defined), this must be "#if".
Fixes: 1bb6fb896d ("btrfs-progs: btrfstune: experimental, new option to switch csums")
Reported-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The function read_extent_from_disk() is only a wrapper to read tree
block.
And read_extent_data() is just a while loop to eliminate short read
caused by stripe boundary.
In fact, a lot of call sites of read_extent_data() are either reading
metadata (thus no possible short read) or doing extra loop by
themselves.
This patch will replace those two functions with read_data_from_disk(),
making it the only entrance for data/metadata read.
And update read_data_from_disk() to return the read bytes, so caller can
do a simple while loop.
For the few callers of read_extent_data(), open-code a small while loop
for them.
This will allow later RAID56 read repair using P/Q much easier.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
The following sequence operation can lead to a seed fs rejected by
kernel:
# Generate a fs with dirty log
mkfs.btrfs -f $file
mount $dev $mnt
xfs_io -f -c "pwrite 0 16k" -c fsync $mnt/file
cp $file $file.backup
umount $mnt
mv $file.backup $file
# now $file has dirty log, set seed flag on it
btrfstune -S1 $file
# mount will fail
mount $file $mnt
The mount failure with the following dmesg:
[ 980.363667] loop0: detected capacity change from 0 to 262144
[ 980.371177] BTRFS info (device loop0): flagging fs with big metadata feature
[ 980.372229] BTRFS info (device loop0): using free space tree
[ 980.372639] BTRFS info (device loop0): has skinny extents
[ 980.375075] BTRFS info (device loop0): start tree-log replay
[ 980.375513] BTRFS warning (device loop0): log replay required on RO media
[ 980.381652] BTRFS error (device loop0): open_ctree failed
[CAUSE]
Although btrfs will replay its dirty log even with RO mount, but kernel
will treat seed device as RO device, and dirty log can not be replayed
on RO device.
This rejection is already the better end, just imagine if we don't treat
seed device as RO, and replayed the dirty log.
The filesystem relying on the seed device will be completely screwed up.
[FIX]
Just add extra check on log tree in btrfstune to reject setting seed
flag on filesystems with dirty log.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This is still work in progress but can survive some stress testing.
There are still some sanity checks missing, do not user this on valuable
data. To enables this, configure must be run with the experimental
features enabled.
$ mkfs.btrfs --csum crc32c /dev/sdx
$ <mount, fill with data, unmount>
$ btrfstune --csum sha256
Will change the checksum to sha256.
Implementation:
- set bit on superblock when the checksums are being changed (similar to
the uuid rewrite)
- metadata checksums are overwritten in place
- data checksums:
- the checksum tree is completely deleted and no checksums are
verified
- data blocks are enumerated and all checksums generated (same as
check --init-csum-tree)
To make it usable, it should be restartable and track the current
progress somehow. Also the previous data checksums should be verified
any time they're available.
Signed-off-by: David Sterba <dsterba@suse.com>
When we switch to multiple global trees we'll need to access the
appropriate extent root depending on the block group or possibly root.
To handle this, use a helper in most places and then the actual root in
places where it is required. We will whittle down the direct accessors
with future patches, but this does the bulk of the preparatory work.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
There's a group of functions that are related to opening filesystem in
various modes, this can be moved to a separate file.
Signed-off-by: David Sterba <dsterba@suse.com>
Build several standalone tools into one binary and switch the function
by name (symlink or hardlink).
* btrfs
* mkfs.btrfs
* btrfs-image
* btrfs-convert
* btrfstune
The static target is also supported. The name of resulting boxed
binaries is btrfs.box and btrfs.box.static . All the binaries can be
built at the same time without prior configuration.
text data bss dec hex filename
822454 27000 19724 869178 d433a btrfs
927314 28816 20812 976942 ee82e btrfs.box
2067745 58004 44736 2170485 211e75 btrfs.static
2627198 61724 83800 2772722 2a4ef2 btrfs.box.static
File sizes:
857496 btrfs
968536 btrfs.box
2141400 btrfs.static
2704472 btrfs.box.static
Standalone utilities:
512504 btrfs-convert
495960 btrfs-image
471224 btrfstune
491864 mkfs.btrfs
1747720 btrfs-convert.static
1411416 btrfs-image.static
1304256 btrfstune.static
1361696 mkfs.btrfs.static
So the shared 900K binary saves ~2M, or ~5.7M for static build.
Signed-off-by: David Sterba <dsterba@suse.cz>
The error message about the unsatisfied argument count is scrolled away
by the full usage string dump. This is not considered a good usability
practice.
This commit switches all direct usage -> return patterns, where the
argument check has no other constraint, eg. dependency on an option.
Signed-off-by: David Sterba <dsterba@suse.com>
Reformat and update the help string so it's more aligned with other
utilities (mkfs.btrfs was taken as a source):
- reduce indentation
- group related options
- add references to mkfs features
- wording adjustments
Signed-off-by: David Sterba <dsterba@suse.com>
This function currently takes an fs_info only to reference the chunk
root. Just pass the root directly. No functional changes.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This member was used only by btrfstune when using the old method to
change fsid. It's only an in-memory value with a very specific
purpose so it makes no sense to pollute a generic structure such as
btrfs_fs_info with it. Just remove it and pass it as a function
argument where pertinent. No functional changes.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This allows us to change the use-visible UUID on filesytems from
userspace if desired, by copying the existing UUID to the new location
for metadata comparisons. If this is done, an incompat flag must be
set to prevent older filesystems from mounting the filesystem, but
the original UUID can be restored, and the incompat flag removed.
This introduces the new -m|-M UUID options similar to current
-u|-U UUID ones with the difference that we don't rewrite the fsid but
just copy the old uuid and set a new one. Additionally running with
[-M old-uuid] clears the incompat flag and retains only fsid on-disk.
Additionally it's not allowed to intermix -m/-u/-U/-M options in a
single invocation of btrfstune, nor is it allowed to change the uuid
while there is a uuid rewrite in-progress. Also changing the uuid of a
seed device is not currently allowed (can change in the future).
Example:
btrfstune -m /dev/loop1
btrfs inspect-internal dump-super /dev/loop1
superblock: bytenr=65536, device=/dev/loop1
---------------------------------------------------------
csum_type 0 (crc32c)
csum_size 4
csum 0x4b7ea749 [match]
<ommitted for brevity>
fsid 0efc41d3-4451-49f3-8108-7b8bdbcf5ae8
metadata_uuid 352715e7-62cf-4ae0-92ee-85a574adc318
<ommitted for brevity>
incompat_flags 0x541
( MIXED_BACKREF |
EXTENDED_IREF |
SKINNY_METADATA |
METADATA_UUID )
<omitted for brevity>
dev_item.uuid 0610deee-dfc3-498b-9449-a06533cdec98
dev_item.fsid 352715e7-62cf-4ae0-92ee-85a574adc318 [match]
<ommitted for brevity>
mount /dev/loop1 btrfs-mnt/
btrfs fi show btrfs-mnt/
Label: none uuid: 0efc41d3-4451-49f3-8108-7b8bdbcf5ae8
Total devices 1 FS bytes used 128.00KiB
devid 1 size 5.00GiB used 536.00MiB path /dev/loop1
In this case a new btrfs filesystem was created and the original uuid
was 352715e7-62cf-4ae0-92ee-85a574adc318, then btrfstune was run which
copied that value over to metadata_uuid field and set the current fsid
to 0efc41d3-4451-49f3-8108-7b8bdbcf5ae8. And as far as userspace is
concerned this is the fsid of the fs.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Add support for a new metadata_uuid field. This is just a preparatory
commit which switches all users of the fsid field for metdata comparison
purposes to utilize the new field. This more or less mirrors the
kernel patch, additionally:
* Update 'btrfs inspect-internal dump-super' to account for the new
field. This involes introducing the 'metadata_uuid' line to the
output and updating the logic for comparing the fs uuid to the
dev_item uuid.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Rename the function so its name falls in line with the rest of btrfs
nomenclature. So when we change the uuid of the header it's in fact of
the buffer header. Also remove the root argument since it's used solely
to take a reference to fs_info which can be done via the mandatory eb
argument. No functional changes.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This function already takes an extent buffer that contains a reference
to the fs_info. Use that and reduce argument count. No functional
changes.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Similar to the changes where strerror(errno) was converted, continue
with the remaining cases where the argument was stored in another
variable.
The savings in object size are about 4500 bytes:
$ size btrfs.old btrfs.new
text data bss dec hex filename
805055 24248 19748 849051 cf49b btrfs.old
804527 24248 19748 848523 cf28b btrfs.new
Signed-off-by: David Sterba <dsterba@suse.com>
When the 'btrfsune -u' command is interrupted, the final filesystem fsid
is not written to the superblock and it cannot be mounted. Too bad that
'btrfstune' cannot continue to finish the UUID change as it should.
This patch fixes that and passes the relaxed flags for superblock and
only warns when it detects the fsid mismatch. As this is something that
should be noted in case it would be needed for further debugging, it's
not just silent.
Signed-off-by: David Sterba <dsterba@suse.com>
We'll want to pass non-default superblock flags so let's use the other
helper that allows that. We can reuse the filesystem handle so it needs
to open it read-write, unlike what the plain check_mounted does.
Signed-off-by: David Sterba <dsterba@suse.com>
Currently transaction bugs out insided btrfs_start_transaction in case
of error, we want to lift the error handling to the callers. This patch
adds the BUG_ON anywhere it's been missing so far. This is not the best
way of course. Transforming BUG_ON to a proper error handling highly
depends on the caller and should be dealt with case by case.
Signed-off-by: David Sterba <dsterba@suse.com>
The only reasom read_tree_block() needs a btrfs_root parameter is to get
its node/sector size.
And long ago, I have already introduced a compactible interface,
read_tree_block_fs_info() to pass btrfs_fs_info instead of btrfs_root.
Since we have cleaned up all root->sector/node/stripesize users, we
should be OK to refactor read_tree_block() function.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>