This would sync the code between kernel and btrfs-progs, and save at
least 1 byte for each btrfs_block_group_cache.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Update the summary of 'fi usage' where the multiple profiles will be
listed by type, like:
Multiple profiles: yes (data, metadata)
The string is returned from btrfs_test_for_multiple_profiles so the
callers don't have to assemble it together from the other profile
strings.
Signed-off-by: David Sterba <dsterba@suse.com>
The term 'mixed' is confusing as it's commonly used for mised block
group profiles created by 'mkfs.btrfs --mixed'. We're interested in
multiple profiles for each type, so use the term 'multiple'.
Signed-off-by: David Sterba <dsterba@suse.com>
Add the warning to 'device usage' and 'filesystem df'.
Signed-off-by: Goffredo Baroncelli <kreijack@inwid.it>
Signed-off-by: David Sterba <dsterba@suse.com>
A new line in the "Overall" section is added to inform that 'Multiple
profiles' are present.
Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
Signed-off-by: David Sterba <dsterba@suse.com>
Add a check in some btrfs subcommands to detect if a filesystem
has mixed profiles for data/metadata/system. In this case
a warning is showed.
Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
Signed-off-by: David Sterba <dsterba@suse.com>
Some scripts can still rely on this message, so make it available with
-vv, so -v stays sane.
Fixes: #127
Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When we process a clone request, we look up the source subvolume by
UUID, even if the source is the subvolume that we're currently
receiving. Usually, this is fine. However, if for some reason we
previously received the same subvolume, then this will use paths
relative to the previously received subvolume instead of the current
one. This is incorrect, since the send stream may use temporary names
for the clone source. This can be reproduced as follows:
btrfs subvolume create subvol
dd if=/dev/urandom of=subvol/foo bs=1M count=1
cp --reflink subvol/foo subvol/bar
mkdir subvol/dir
mv subvol/foo subvol/dir/
btrfs property set subvol ro true
btrfs send -f send.data subvol
mkdir first second
btrfs receive -f send.data first
btrfs receive -f send.data second
The second receive results in this error:
ERROR: cannot open first/subvol/o259-7-0/foo: No such file or directory
Fix it by always cloning from the current subvolume if its UUID matches.
This has the nice side effect of avoiding unnecessary UUID tree lookups
in that case.
Fixes: f1c24cd80d ("Btrfs-progs: add btrfs send/receive commands")
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The checks for a subvolume being modified after it was received have
been commented out since they were added back in commit f1c24cd80d
("Btrfs-progs: add btrfs send/receive commands"). Let's just get rid of
the noise.
If they were ever in place, it would have never been possible
to do an incremental send and running dedupe against the parent
snapshot.
That particular use case used to cause send, the kernel side, to fail
(initially with a BUG_ON() and later with -EIO returned to user
space), see commit b4f9a1a87a48 ("Btrfs: fix incremental send failure
after deduplication").
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
[ add Filipe's note ]
Signed-off-by: David Sterba <dsterba@suse.com>
The old code of copy_one_extent() is a mess:
- The main loop is implemented using goto
- @mirror_num is reset to 1 for each loop
- @mirror num check against @num_copies is wrong for decompression error
This patch will fix this mess by:
- Use read_extent_data()
read_extent_data() has all the good wrapping of btrfs_map_block()
and length check.
This removes a lot of complexity.
- Add extra file extent offset check
To prevent underflow for memory allocation
- Do proper mirror_num check for decompression error
Issue: #221
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Previously, no filenames/xattrs would be printed with --nofilename, but
to keep the format of dump, print a placeholder instead of all names.
This is:
* directory entries (files, directories, subvolumes)
* default subvolume
* extended attributes (name, value)
* hardlink names if stored inside another item
Note that lengths are not hidden because they can be calculated from the
item size anyway.
Signed-off-by: David Sterba <dsterba@suse.com>
In the mail list, it's pretty common that a developer is asking dump tree
output from the reporter, it's better to protect those kind reporters by
hiding the filename if the reporter wants.
This option will skip @name/@data output for the following items:
- DIR_INDEX
- DIR_ITEM
- INODE_REF
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
According to the documentation, btrfs qgroup remove takes the same
options as qgroup assign, i.e., --rescan and --no-rescan. However,
currently no options are accepted. Activate option handling also for
qgroup remove, so that automatic rescan can be disabled by the user.
Signed-off-by: Michael Lass <bevan@bi-co.net>
Signed-off-by: David Sterba <dsterba@suse.com>
LOGICAL_INO v1 ignored the reserved fields, so they could be filled
with random stack garbage and have no effect. LOGICAL_INO_V2 requires
all unused reserved bits to be set to zero, and returns EINVAL if they
are not, to guard against future kernel versions which may interpret
non-zero bit values.
Sometimes when 'btrfs ins log' runs, the stack garbage is zeros, so the
-o (ignore offsets) option for logical-resolve works. Sometimes the
stack garbage is something else, and 'btrfs ins log -o' fails with
invalid argument. This depends mostly on compiler version and build
environment details, so a binary typically either always works or never
works.
Fix by initializing logical-resolve's argument structure with a C99
compound literal zero.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Signed-off-by: David Sterba <dsterba@suse.com>
This ioctl will be responsible for deleting a subvolume using its id.
This can be used when a system has a file system mounted from a
subvolume, rather than the root file system, like below:
/
@subvol1/
@subvol2/
@subvol_default/
If only @subvol_default is mounted, we have no path to reach @subvol1
(id 256) and @subvol2 (id 257), thus no way to delete them. Current
subvolume delete ioctl takes a file handle point as argument, and if
@subvol_default is mounted, we can't reach @subvol1 and @subvol2 from
the same mount point.
$ mount -o subvol=subvol_default /mnt
$ btrfs subvolume delete -i 257 /mnt
This will delete @subvol2 although it's path is hidden.
Fixes: #152
Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This commit organises block groups cache in
btrfs_fs_info::block_group_cache_tree. And any dirty block groups are
linked in transaction_handle::dirty_bgs.
To keep coherence of bisect, it does almost replace in place:
1. Replace the old btrfs group lookup functions with new functions
introduced in former commits.
2. set_extent_bits(..., BLOCK_GROUP_DIRYT) things are replaced by linking
the block group cache into trans::dirty_bgs. Checking and clearing bits
are transformed too.
3. set_extent_bits(..., bit | EXTENT_LOCKED) things are replaced by
new the btrfs_add_block_group_cache() which inserts caches into
btrfs_fs_info::block_group_cache_tree directly. Other operations are
converted to tree operations.
Signed-off-by: Su Yue <Damenly_Su@gmx.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We are going to touch dirty_bgs in transaction directly, so every call
chain should pass @trans to the leaf functions.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Su Yue <Damenly_Su@gmx.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Filesystems with nontrivial snapshots or dedupe will easily overflow
a 4K buffer. Bump the size up to the largest size supported by the
V1 ioctl.
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Signed-off-by: David Sterba <dsterba@suse.com>
Increase the maximum buffer size to SZ_16M.
Add an option (-o) to set the ..._IGNORE_OFFSET flag.
If the buffer size is greater than 64K or the IGNORE_OFFSET option
is used, call ioctl V2; otherwise, use ioctl V1 to be compatible with
older kernels.
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Signed-off-by: David Sterba <dsterba@suse.com>
When kernel returns ENOTCONN after the user tries to create a qgroup on
a subvolume without quota enabled, present a meaningful message to the
user. Kernels before 5.5 return EINVAL for that.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This is based on idea from Stanislaw Gruszka to print the ratio,
originally suggested for the 'fi df', but we don't want to add new
things there and let people use 'fi us' instead. The new output fits
there and is printed by default without options:
Example output:
$ btrfs fi us /mnt
[...]
Data,single: Size:339.00GiB, Used:172.05GiB (50.75%)
Metadata,single: Size:7.00GiB, Used:3.41GiB (48.70%)
System,single: Size:32.00MiB, Used:64.00KiB (0.20%)
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
Even "btrfs rescue zero-log" only reset btrfs_super_block::log_root and
btrfs_super_block::log_root_level, we still use trasction to write all
super blocks for all devices.
This means we can't handle things like corrupted extent tree:
checksum verify failed on 2172747776 found 000000B6 wanted 00000000
checksum verify failed on 2172747776 found 000000B6 wanted 00000000
bad tree block 2172747776, bytenr mismatch, want=2172747776, have=0
WARNING: could not setup extent tree, skipping it
Clearing log on /dev/nvme/btrfs, previous log_root 0, level 0
ERROR: Corrupted fs, no valid METADATA block group found
ERROR: attempt to start transaction over already running one
[CAUSE]
Because we have extra check in transaction code to ensure we have valid
METADATA block groups.
In fact we don't really need transaction at all.
[FIX]
Instead of commit transaction, we can just call write_all_supers()
manually, so we can still handle multi-device fs while avoid above
error.
Also, add OPEN_CTREE_NO_BLOCK_GROUPS open ctree flag to make it more
robust.
Link: https://lore.kernel.org/linux-btrfs/CAKbQEqG35D_=8raTFH75-yCYoqH2OvpPEmpj2dxgo+PTc=cfhA@mail.gmail.com/
Reported-by: Christian Pernegger <pernegger@gmail.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We access btrfs_block_group_cache::item mostly for @used and @flags.
@flags is already a dedicated member in btrfs_block_group_cache, only
@used doesn't have a dedicated member.
This patch will remove btrfs_block_group_cache::item and add
btrfs_block_group_cache::used.
It's the btrfs-progs equivalent of the following kernel patches:
btrfs: move block_group_item::used to block group
btrfs: move block_group_item::flags to block group
btrfs: remove embedded block_group_cache::item
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
btrfs balance status supports both short and long option -v|--verbose
but usage failed to show it in its --help. This patch fixes the --help.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
btrfs balance start supports both short and long option -v|--verbose
however usage failed to show the long option. This patch fixes the --help.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Even when -q option specified, the receive sub-command is not quiet as
shown below.
$ btrfs receive -q -f /tmp/t /btrfs1
At snapshot ss3
It must be quiet at least when it's been asked to be quiet.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Add definition, crypto wrappers and support to mkfs for blake2 for
checksumming. There are 2 aliases either blake2 or blake2b.
Signed-off-by: David Sterba <dsterba@suse.com>
Add the definition to the checksum types and let mkfs accept it.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
With the introduction of xxhash64 to btrfs-progs we created a crypto/
directory for all the hashes used in btrfs (although no
cryptographically secure hash is there yet).
Move the crc32c implementation from kernel-lib/ to crypto/ as well so we
have all hashes consolidated.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
Adding this table will make extending btrfs-progs with new checksum types
easier.
Also add accessor functions to access the table fields.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
Add a helper to check if we have a valid csum type from the super block.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
Update the checksumming API to be able to cope with more checksum types
than just CRC32C. The finalization call is merged into btrfs_csum_data.
There are some fixme's and asserts added that need to be resolved.
Co-developed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
In preparation to supporting new checksum algorithm pass the checksum type
to btrfs_csum_data/btrfs_csum_final, this allows us to encapsulate any
differences in processing into the respective functions
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
Pass pointer to a generic buffer instead of fixed size that crc32c
currently uses.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
Add the checksum type to csum_tree_block_size(), __csum_tree_block_size()
and verify_tree_block_csum_silent().
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
Cache the super-block's checksum type field in 'struct recover_control'.
This will be needed for further refactoring the checksum handling.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
The only difference between parse_limit and parse_size is that
parse_limit accepts "none" as a valid input. That's easy enough to
handle as a special case and lets us drop the duplicate code.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We want just one header for the check API (similar to what mkfs does)
but as btrfsck.h is exported header (libbtrfs), it needs some
deprecation beriod before it's moved through there are probably no users
of that header file in particular.
Copy the header to check, all modifications and cleanups won't affect
the public header.
Signed-off-by: David Sterba <dsterba@suse.com>
The default traversal has been switched to BFS due, update the
documentation accordingly. Also fix the help text of the command that
ommitted to mention the options.
Signed-off-by: David Sterba <dsterba@suse.com>
Move the full-balance warning to before the fork, so that the user can
see and react to it.
Notes on test:
- Don't use grep -q, as it causes a SIGPIPE during the countdown, and
the balance thus doesn't start.
- The "balance cancel" is superfluous as the last command, but it
provides some idempotence and allows adding more tests below it.
Issue: #168
Signed-off-by: Vladimir Panteleev <git@vladimir.panteleev.md>
Signed-off-by: David Sterba <dsterba@suse.com>
User reports:
"When I execute btrfs restore -S to restore a symlink, it prints:
SYMLINK: 'dest/path/of/symlink' => 'symlink/target'
Failed to change owner: Bad file descriptor
And at cmds-restore.c#L937:
ret = fchownat(-1, file, btrfs_inode_uid(path.nodes[0], inode_item),
btrfs_inode_gid(path.nodes[0], inode_item),
AT_SYMLINK_NOFOLLOW);
"
The -1 is indeed a bad descriptor, and should be probably AT_FDCWD as
this is documented. The path passed as 'file' is always absolute, so the
semantics are unaffected.
Issue: #183
Signed-off-by: David Sterba <dsterba@suse.com>
ETA is calculated in a wrong way. It should be just current time in
seconds + sec_left, independently if the job was resumed or not.
Pull-request: #190
Signed-off-by: Grzegorz Kowal <grzegorz@amuncode.org>
Signed-off-by: David Sterba <dsterba@suse.com>
In process_clone(), we're not checking the return value of strdup().
But, there's no reason to strdup() in the first place: we just pass the
path into path_cat_out(). Get rid of the strdup().
Fixes: f1c24cd80d ("Btrfs-progs: add btrfs send/receive commands")
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
- add first line of the long description
- list the type values in all commands (set, get, list)
- enhance
- split option description
Signed-off-by: David Sterba <dsterba@suse.com>
The comman 'btrfs inspect dump-tree <dev>' will scan all the devices
from the filesystem by defaul.
So as of now you can not inspect each mirrored device independently.
This patch adds option --noscan, which when used won't scan the system
for the partner devices, instead it just uses the devices provided in
the argument.
For example:
btrfs inspect dump-tree --noscan <dev> [<dev>..]
This helps to debug degraded raid1 and raid10.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
fgets consumes n-1 bytes from input buffer.
When a user types y\n, the newline is left in the buffer. As a result,
the next fgets uses that \n as answer without waiting for the user to
type.
This patch also fix a bug that dereference the ret without checking if
it's NULL.
* Consumes the `\n` from stdin buffer
* Avoid NULL pointer dereference: treat EOF as default value
Pull-request: #182
Author: pjw91 <mail6543210@yahoo.com.tw>
Signed-off-by: David Sterba <dsterba@suse.com>
When a scrub completes or is cancelled, statistics are updated for
reporting in a later btrfs scrub status command and for resuming the
scrub. Most statistics (such as bytes scrubbed) are additive so scrub
adds the statistics from the current run to the saved statistics.
However, the last_physical statistic is not additive. The value from the
current run should replace the saved value. The current code incorrectly
adds the last_physical from the current run to the previous saved value.
This bug causes the resume point to be incorrectly recorded, so large
areas of the disk are skipped when the scrub resumes. As an example,
assume a disk had 1000000 bytes and scrub was cancelled and resumed each
time 10% (100000 bytes) had been scrubbed.
Run | Start byte | bytes scrubbed | kernel last_physical | saved last_physical
1 | 0 | 100000 | 100000 | 100000
2 | 100000 | 100000 | 200000 | 300000
3 | 300000 | 100000 | 400000 | 700000
4 | 700000 | 100000 | 800000 | 1500000
5 | 1500000 | 0 | immediately completes| completed
In this example, only 40% of the disk is actually scrubbed.
This patch changes the saved/displayed last_physical to track the last
reported value from the kernel.
Signed-off-by: Graham R. Cobb <g.btrfs@cobb.uk.net>
Signed-off-by: David Sterba <dsterba@suse.com>
This adds a global --format option to request extended output formats
from each command.
We currently only support text mode. Command help reports what
output formats are available for each command. Global help reports
what valid formats are.
If an invalid format is requested, an error is reported and lists the
valid formats.
Each command sets a bitmask that describes which formats it is capable
of outputting. If a globally valid format is requested of a command
that doesn't support it, an error is reported and command usage dumped.
Commands don't need to specify that they support text output. All
commands are required to output text.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
[ use global config instead of passing cmd_context ]
Signed-off-by: David Sterba <dsterba@suse.com>
For options that do not have the long description, the empty string is
required to mark where the options start. Some commands were missing
that.
Signed-off-by: David Sterba <dsterba@suse.com>
This is first instance of commands files moving to a separate directory,
that will be cmds/, thus the files can drop the prefix. We can further
split files into specific parts of a given command. The quota file was
selected as the smallest.
Signed-off-by: David Sterba <dsterba@suse.com>