The search for the default subvolume can fail for two reasons: the
caller lacks CAP_SYS_ADMIN for the TREE_SEARCH ioctl, or the default
subvolume is unset, in which case there are no restrictions on deletion.
Signed-off-by: David Sterba <dsterba@suse.com>
Checking the default subvolume uses TREE_SEARCH which is a CAP_SYS_ADMIN
only operation, and thus will fail when unprivileged, even if we have
permissions to actually delete the subvolume.
This produces a warning even if all is ok. Let's hide it if we're not
root (root but !CAP is odd enough to warn).
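The rule above could be sketched roughly as follows. This is only an
illustration: the helper names and the warning text are hypothetical and
not taken from the actual patch.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Decide whether a failed default-subvolume lookup deserves a warning.
 * lookup_ret is the (hypothetical) result of the lookup, negative on
 * failure; euid is the effective user id of the caller.
 */
static bool should_warn_default_subvol(int lookup_ret, uid_t euid)
{
	if (lookup_ret >= 0)
		return false;	/* lookup worked, nothing to report */
	/* Unprivileged callers are expected to fail, so stay quiet. */
	return euid == 0;
}

/* Print the warning only when the rule above asks for it. */
static void report_default_subvol_lookup(int lookup_ret)
{
	if (should_warn_default_subvol(lookup_ret, geteuid()))
		fprintf(stderr, "WARNING: cannot read default subvolume id: %d\n",
			lookup_ret);
}
```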
Fixes: 87804a3f06 ("btrfs-progs: subvolume: check deleting default subvolume")
Link: https://bugs.debian.org/998840
Signed-off-by: Adam Borowski <kilobyte@angband.pl>
Signed-off-by: David Sterba <dsterba@suse.com>
The pointer returned from get_parent needs additional handling,
otherwise we could return an error and then try to free it. Reset the
pointer when an error occurs so the cleanup is always done on a valid
pointer.
Issue: #423
Signed-off-by: David Sterba <dsterba@suse.com>
The function autodetect_object_types() tries to detect the type of the
btrfs object passed. If it is an "inode" type (e.g. a file), the function
returns the type as "inode". If it is a block device, it returns it as
"block device".
However, it doesn't handle the case where the object passed is a link
to a block device (which could be a valid btrfs device). For example,
LVM/DM creates links to block devices. In this case it should return
the type as "block device".
This patch replaces the lstat() call with stat().
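A simplified sketch of the difference, not the real
autodetect_object_types(): the enum and the function name below are
made up for illustration, and the real function distinguishes more
object types.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/stat.h>

enum object_kind {
	KIND_ERROR = -1,
	KIND_OTHER,
	KIND_INODE,
	KIND_BLOCK_DEVICE,
};

/*
 * stat() follows symbolic links, so a link that points to a block
 * device is reported as a block device. lstat() would describe the
 * link itself and the S_ISBLK() test would never match.
 */
static enum object_kind classify_object(const char *path)
{
	struct stat st;

	if (stat(path, &st) < 0)
		return KIND_ERROR;
	if (S_ISBLK(st.st_mode))
		return KIND_BLOCK_DEVICE;
	if (S_ISREG(st.st_mode) || S_ISDIR(st.st_mode))
		return KIND_INODE;
	return KIND_OTHER;
}
```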
Reported-by: Boris Burkov <boris@bur.io>
Reviewed-by: Boris Burkov <boris@bur.io>
Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
Signed-off-by: David Sterba <dsterba@suse.com>
When an error happens while searching for the parent subvolume,
parent_subvol contains an errno value, so don't try to free it.
The crash backtrace would look like:
0 process_snapshot at cmds/receive.c:358
358 free(parent_subvol->path);
1 0x00005646898aaa67 in read_and_process_cmd at common/send-stream.c:348
2 btrfs_read_and_process_send_stream at common/send-stream.c:525
3 0x00005646898c9b8b in do_receive at cmds/receive.c:1113
4 cmd_receive at cmds/receive.c:1316
5 0x00005646898750b1 in cmd_execute at cmds/commands.h:125
6 main at btrfs.c:405
(gdb) p parent_subvol
$1 = (struct subvol_info *) 0xfffffffffffffffe
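The pointer value above is what a small negative errno looks like when
stored in a pointer on a 64-bit system (it would match -ENOENT, which is
-2 on Linux). A minimal sketch of the pattern, with helpers written in
the style of the kernel's ERR_PTR macros; the struct and the _demo
functions are hypothetical stand-ins, not the code from cmds/receive.c:

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Encode and decode a negative errno stored in a pointer. */
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct subvol_info_demo {
	char *path;
};

/* Hypothetical lookup that always fails with -ENOENT. */
static struct subvol_info_demo *find_parent_demo(void)
{
	return ERR_PTR(-ENOENT);
}

static int process_snapshot_demo(void)
{
	struct subvol_info_demo *parent = find_parent_demo();
	int ret = 0;

	if (IS_ERR(parent)) {
		ret = (int)PTR_ERR(parent);
		parent = NULL;	/* reset so cleanup never sees the error value */
	}

	/* Common cleanup path, safe for both success and failure. */
	if (parent) {
		free(parent->path);
		free(parent);
	}
	return ret;
}
```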
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Dāvis Mosāns <davispuh@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Add the on disk definitions for the block group tree. This will be part
of the super block so we need to add the appropriate helpers to the
super block, as well as adding it to the backup roots.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When we switch to multiple global trees we'll need to access the
appropriate extent root depending on the block group or possibly root.
To handle this, use a helper in most places and then the actual root in
places where it is required. We will whittle down the direct accessors
with future patches, but this does the bulk of the preparatory work.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The 'filesystem du' command fails and exits when it accesses a file for
which permission is denied, although it could skip that file and
continue. This patch prints an error message just like /bin/du does and
continues if it can.
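The behaviour can be illustrated with a small stand-alone loop. This is
a sketch only: sum_sizes() is a hypothetical helper that adds up
st_size, it is not how 'btrfs filesystem du' accounts extents.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Sum the sizes of the given paths. A path that cannot be examined is
 * reported on stderr and skipped instead of aborting the whole run.
 * Returns the number of skipped paths.
 */
static int sum_sizes(const char *const *paths, int count,
		     unsigned long long *total)
{
	int skipped = 0;

	*total = 0;
	for (int i = 0; i < count; i++) {
		struct stat st;

		if (lstat(paths[i], &st) < 0) {
			fprintf(stderr, "ERROR: cannot access '%s': %s\n",
				paths[i], strerror(errno));
			skipped++;
			continue;	/* keep going with the next path */
		}
		*total += (unsigned long long)st.st_size;
	}
	return skipped;
}
```

A caller can still exit with a non-zero status at the end if the
returned count is not zero.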
Issue: #421
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Sidong Yang <realwakka@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
With extent tree v2 we will have per-block group checksums, so add a
helper to access the csum root and rename the fs_info csum_root to
_csum_root to catch all the places that are accessing it directly.
Convert everybody to use the helper except for internal things.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Running with ASAN we won't pass the self tests because we leak the whole
fs_info with btrfs filesystem show. Fix this by making sure we close
out the fs_info and clean up all of the memory and such.
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
There is a bug report that a corrupted key type (expected
UUID_KEY_SUBVOL, has EXTENT_ITEM) causes newer kernels to reject the
mount.
Although the root cause is not determined yet, with the rollout of the
v5.11 kernel to various distros such a problem should be caught by the
tree-checker, no matter whether it's a hardware problem or not.
An older kernel with the "-o uuid_rescan" mount option won't help
either, as uuid_rescan only deletes items with
UUID_KEY_SUBVOL/UUID_KEY_RECEIVED_SUBVOL key types and leaves such a
corrupted key in place.
[FIX]
To fix such a problem we have to rely on an offline tool, so introduce
a new rescue tool, clear-uuid-tree, to empty and then remove the uuid
tree.
The kernel will regenerate the correct uuid tree at the next mount.
Reported-by: S. <sb56637@gmail.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The current formula calculates the stripe size, however that's not what
we want in the case of RAID1/DUP profiles. In those cases, since chunks
are mirrored across devices, we want the full size of the chunk. Without
this patch the 'btrfs fi usage' output from an fs which is using RAID1 is:
Data,RAID1: Size:2.00GiB, Used:1.00GiB (50.03%)
/dev/vdc 1.00GiB
/dev/vdf 1.00GiB
Metadata,RAID1: Size:256.00MiB, Used:1.34MiB (0.52%)
/dev/vdc 128.00MiB
/dev/vdf 128.00MiB
System,RAID1: Size:8.00MiB, Used:16.00KiB (0.20%)
/dev/vdc 4.00MiB
/dev/vdf 4.00MiB
Unallocated:
/dev/vdc 8.87GiB
/dev/vdf 8.87GiB
So a 2 gigabyte RAID1 chunk actually takes up 4 gigabytes on the disks,
2 on each. In this case this is being miscalculated as taking up 1GiB
on each device.
This also leads to erroneously calculated unallocated space. The correct
output in this case is:
Data,RAID1: Size:2.00GiB, Used:1.00GiB (50.03%)
/dev/vdc 2.00GiB
/dev/vdf 2.00GiB
Metadata,RAID1: Size:256.00MiB, Used:1.34MiB (0.52%)
/dev/vdc 256.00MiB
/dev/vdf 256.00MiB
System,RAID1: Size:8.00MiB, Used:16.00KiB (0.20%)
/dev/vdc 8.00MiB
/dev/vdf 8.00MiB
Unallocated:
/dev/vdc 7.74GiB
/dev/vdf 7.74GiB
Fix it by only utilising the chunk formula for profiles which are not
RAID1/DUP.
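A rough sketch of the distinction, assuming the stripe formula divides
the chunk length by the number of data stripes. The function name and
its parameters are hypothetical; the real code takes the values from
the raid table.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Bytes that one stripe of a chunk occupies on its device.
 *
 * chunk_length: logical size of the chunk
 * num_stripes:  number of stripes in the chunk
 * nparity:      parity stripes (RAID5: 1, RAID6: 2, otherwise 0)
 * sub_stripes:  stripes per mirror group (RAID10: 2, otherwise 1)
 * mirror_only:  true for 1:1 copy profiles such as RAID1 and DUP
 */
static uint64_t device_bytes_per_stripe(uint64_t chunk_length,
					int num_stripes, int nparity,
					int sub_stripes, bool mirror_only)
{
	int data_stripes;

	/* Each copy holds the whole chunk, no division. */
	if (mirror_only)
		return chunk_length;

	data_stripes = (num_stripes - nparity) / sub_stripes;
	if (data_stripes <= 0)
		return 0;	/* invalid input, nothing sensible to report */
	return chunk_length / (uint64_t)data_stripes;
}
```

With a 2GiB chunk this gives 2GiB per device for RAID1, and 1GiB per
device for a two-device RAID0, a three-device RAID5 or a four-device
RAID10.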
Issue: #422
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Commit 80714610f3 ("btrfs-progs: use raid table for ncopies")
slightly broke how the raid ratio is calculated, since the resulting
code would always reset the ratio to 1 when the profile was not RAID56.
The correct behavior is to set it to 0 only for RAID56, as the
calculation is different in that case, and to leave it intact
otherwise.
This bug manifests as all size-related calculations of the 'btrfs
filesystem usage' command being done as if all block groups were of
type SINGLE. Fix this by resetting the ratio to 0 only in case of RAID56.
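The intended logic, reduced to a sketch. The function name is
hypothetical, and ncopies stands for the value read from the raid table.

```c
#include <stdbool.h>

/*
 * Return the ratio used for size calculations. RAID56 is calculated
 * differently by the caller, which is signalled by a ratio of 0.
 */
static int usage_ratio(int ncopies, bool is_raid56)
{
	int ratio = ncopies;

	/*
	 * The broken version was equivalent to
	 *     ratio = is_raid56 ? 0 : 1;
	 * which threw away ncopies for every other profile.
	 */
	if (is_raid56)
		ratio = 0;
	return ratio;
}
```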
Issue: #422
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Just like kernel commit 22b6331d9617 ("btrfs: store precalculated
csum_size in fs_info"), we can cache csum_size and csum_type in
btrfs_fs_info.
Furthermore, there is already a 32-bit hole in btrfs_fs_info, and we
can fit csum_type and csum_size into the hole without increasing the
size of btrfs_fs_info.
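How two 16-bit members can fill a 32-bit padding hole is shown below on
a made-up structure. The layouts are hypothetical and much smaller than
the real btrfs_fs_info; _Alignas (C11) is used only to make the hole
independent of the ABI.

```c
#include <stdint.h>

/* Hypothetical layout with a 4-byte padding hole after 'flags'. */
struct fs_info_before {
	uint32_t flags;
	/* 4 bytes of padding here */
	_Alignas(8) uint64_t generation;
};

/* Same layout with the two cached values placed into the hole. */
struct fs_info_after {
	uint32_t flags;
	uint16_t csum_type;	/* cached once when the filesystem is opened */
	uint16_t csum_size;
	_Alignas(8) uint64_t generation;
};

_Static_assert(sizeof(struct fs_info_before) == sizeof(struct fs_info_after),
	       "cached members must fit into the existing hole");
```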
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
There are a lot of call sites where we use the following code snippet:
u8 super_block_data[BTRFS_SUPER_INFO_SIZE];
struct btrfs_super_block *sb;
u64 ret;
sb = (struct btrfs_super_block *)super_block_data;
The reason for this is that structure btrfs_super_block was smaller than
BTRFS_SUPER_INFO_SIZE.
Thus for anything with csum involved, we had to use a proper 4K buffer.
Since the recent unification of sizeof(struct btrfs_super_block), we no
longer need such a workaround, and can use struct btrfs_super_block
directly to do any operation.
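What the simplified call sites look like can be sketched with a stand-in
structure. Everything with a demo prefix is hypothetical; only the idea
that the structure now covers the full 4K comes from the text above.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_SUPER_INFO_SIZE 4096

/* Stand-in for the real structure, padded to the full on-disk size. */
struct demo_super_block {
	uint8_t csum[32];
	uint8_t fsid[16];
	uint64_t bytenr;
	uint8_t reserved[DEMO_SUPER_INFO_SIZE - 32 - 16 - 8];
};

_Static_assert(sizeof(struct demo_super_block) == DEMO_SUPER_INFO_SIZE,
	       "structure must cover the whole super block");

/*
 * Copy a raw 4K block straight into the structure. No separate byte
 * buffer and no pointer cast is needed any more.
 */
static void load_super_demo(struct demo_super_block *sb, const void *raw)
{
	memcpy(sb, raw, sizeof(*sb));
}
```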
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
There's a report that a read-only subvolume with a received_uuid set
emits the warning in command 'btrfs subvolume show', which is obviously
wrong.
The reason is that there are different types of root item flags,
depending on how we read them. The check in cmd_subvol_show uses the
ioctl GET_SUBVOL_INFO and the appropriate flag is raw
BTRFS_ROOT_SUBVOL_RDONLY (0x1), while there's another SUBVOL_GETFLAGS that
maps the flags and the raw value is different (BTRFS_SUBVOL_RDONLY, 0x2).
Due to this the warning was issued. Fix that by using the right flag
constant. The test has been extended to check for all combinations of
read-write and received_uuid.
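The mixup can be reproduced in isolation. The constants below are local
copies with a DEMO prefix that use the values quoted above; the helper
names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_ROOT_SUBVOL_RDONLY	(1ULL << 0)	/* raw root item flag, 0x1 */
#define DEMO_SUBVOL_RDONLY	(1ULL << 1)	/* SUBVOL_GETFLAGS value, 0x2 */

/* Correct: raw root item flags are tested with the raw constant. */
static bool raw_flags_readonly(uint64_t root_item_flags)
{
	return (root_item_flags & DEMO_ROOT_SUBVOL_RDONLY) != 0;
}

/* Mistaken: the mapped constant is applied to raw flags. */
static bool raw_flags_readonly_wrong(uint64_t root_item_flags)
{
	return (root_item_flags & DEMO_SUBVOL_RDONLY) != 0;
}
```

For a read-only subvolume the raw flags have bit 0 set, so the mistaken
check reports it as read-write and the warning fires.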
Issue: #419
Signed-off-by: David Sterba <dsterba@suse.com>
The profile descriptions allow us to use a single formula to calculate
chunk size. Right now there are no profiles with parity (raid5-like) and
sub_stripes (raid10-like), which makes it easier.
- parity stripes are subtracted from the total count
- then divided by number of sub stripes
Practically speaking, 1:1 copy profiles do not have any adjustments.
Signed-off-by: David Sterba <dsterba@suse.com>
The striped profiles covering arbitrary number of devices are often
hardcoded so use the new helper btrfs_bg_type_is_stripey for that.
Signed-off-by: David Sterba <dsterba@suse.com>
There's an open-coded value of raid table ncopies in
print_filesystem_usage_overall, add a helper and use it.
Signed-off-by: David Sterba <dsterba@suse.com>
After removing uuid search fallback code the structure has become
trivial and copies the fd that all callers have in their context.
Signed-off-by: David Sterba <dsterba@suse.com>
After the uuid search fallback code has been removed, the finit helper
has become empty and can be removed.
Signed-off-by: David Sterba <dsterba@suse.com>
All the comparators switch the result based on is_descending, but that
can be factored to the caller to simplify the comparators.
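The idea in a reduced form, with a hypothetical entry type and
comparator; the real comparators work on the subvolume list structures.

```c
struct entry_demo {
	unsigned long long id;
};

typedef int (*cmp_fn)(const struct entry_demo *a, const struct entry_demo *b);

/* A comparator only knows the ascending order. */
static int cmp_id(const struct entry_demo *a, const struct entry_demo *b)
{
	return (a->id > b->id) - (a->id < b->id);
}

/* The caller flips the result once, for any comparator. */
static int compare_entries(cmp_fn cmp, int is_descending,
			   const struct entry_demo *a,
			   const struct entry_demo *b)
{
	int ret = cmp(a, b);

	return is_descending ? -ret : ret;
}
```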
Signed-off-by: David Sterba <dsterba@suse.com>
The remaining functions are too entangled to be moved separately without
too much churn from exporting and unexporting them, so move all the code
at once. No refactoring or coding style fixups.
Signed-off-by: David Sterba <dsterba@suse.com>
There's only one caller of btrfs_list_alloc_filter_set so move it there.
Also move the definitions of BTRFS_LIST_* to the header so they can be
used by both btrfs-list and subvolume.c.
Signed-off-by: David Sterba <dsterba@suse.com>
There's only one caller of btrfs_list_alloc_comparer_set so move it
there. Also move the definitions of BTRFS_LIST_* to the header so they
can be used by both btrfs-list and subvolume.c.
Signed-off-by: David Sterba <dsterba@suse.com>
The actual implementation of find-new functionality is outside of
subvolume.c, copy it where it's supposed to be. No reformatting or style
changes.
Signed-off-by: David Sterba <dsterba@suse.com>
The main functionality of subvolume listing is now in btrfs-list.c but
there are no other commands using the API so this will be merged. It's a
lot of code so split it to another file.
Signed-off-by: David Sterba <dsterba@suse.com>
The btrfs_list_* functions come with some overhead and for simple path
resolution we can use btrfs_subvolid_resolve.
Signed-off-by: David Sterba <dsterba@suse.com>
We don't need to include this besides btrfs-list.c itself and
subvolume.c that does use the btrfs_list_* API.
Signed-off-by: David Sterba <dsterba@suse.com>
Add a slightly more convenient way to identify subvolumes with a bad
combination of flags and received uuid.
Signed-off-by: David Sterba <dsterba@suse.com>
Implement a safety check for when a read-only subvolume is switched
to read-write and there's a received_uuid set.
This prevents accidental breakage of the incremental send use case but
still allows the user to do the rw change, in which case the
received_uuid is reset.
As this is implemented entirely in userspace, it's racy and using the
raw ioctl won't prevent it nor reset the received_uuid. A change in the
ioctl implementation might do that in the future.
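The decision can be written down as a small function. This is a sketch
under the assumption that the override is a boolean force flag; the
enum and the function name are hypothetical.

```c
#include <stdbool.h>

enum rw_action {
	RW_ALLOW,				/* no received_uuid, nothing to protect */
	RW_REFUSE,				/* would break incremental send */
	RW_ALLOW_AND_CLEAR_RECEIVED_UUID,	/* user insisted on the change */
};

/* Decide what to do when a read-only subvolume is switched to read-write. */
static enum rw_action check_ro_to_rw(bool has_received_uuid, bool force)
{
	if (!has_received_uuid)
		return RW_ALLOW;
	if (!force)
		return RW_REFUSE;
	return RW_ALLOW_AND_CLEAR_RECEIVED_UUID;
}
```

Because the check and the flag change are separate steps in userspace,
another process can still change the subvolume in between.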
Signed-off-by: David Sterba <dsterba@suse.com>
Add support for an option to force the value change. This allows doing
safety checks by default and warning the user that something might
break. Using force overrides that: changing the property then performs
the change itself and additionally any other changes that could break
some use cases.
Signed-off-by: David Sterba <dsterba@suse.com>
Some send/receive related data is not printed in subvol show, although
it is exported by the ioctls. Print it for convenience:
$ btrfs subvol show test
test
Name: test
UUID: dc16dd1b-825f-3245-94a8-557672d6cf85
Parent UUID: -
Received UUID: -
Creation time: 2021-05-17 16:17:14 +0200
Subvolume ID: 19112
Generation: 7730702
Gen at creation: 7730701
Parent ID: 5
Top level ID: 5
Flags: -
Send transid: 0
Send time: 2021-05-17 16:17:14 +0200
Receive transid: 0
Receive time: -
Snapshot(s):
test-snap
Signed-off-by: David Sterba <dsterba@suse.com>
I had to go back to find what BTRFS_ARG_REG is, so add a comment for it.
Also, search_umounted_fs_uuids() is used to find the seed device as
well, so move the related comment above it.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The commands initializing a new device (mkfs, device add) do discard by
default, while this is missing from replace start. For parity, add the
options with the same name and semantics.
Issue: #390
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
There is a report that btrfstune proceeds even when the fs has transid
mismatch problems.
$ btrfstune -f -u /dev/sdb1
Current fsid: b2b5ae8d-4c49-45f0-b42e-46fe7dcfcb07
New fsid: b2b5ae8d-4c49-45f0-b42e-46fe7dcfcb07
Set superblock flag CHANGING_FSID
Change fsid in extents
parent transid verify failed on 792854528 wanted 20103 found 20091
parent transid verify failed on 792854528 wanted 20103 found 20091
parent transid verify failed on 792854528 wanted 20103 found 20091
Ignoring transid failure
parent transid verify failed on 792870912 wanted 20103 found 20091
parent transid verify failed on 792870912 wanted 20103 found 20091
parent transid verify failed on 792870912 wanted 20103 found 20091
Ignoring transid failure
parent transid verify failed on 792887296 wanted 20103 found 20091
parent transid verify failed on 792887296 wanted 20103 found 20091
parent transid verify failed on 792887296 wanted 20103 found 20091
Ignoring transid failure
ERROR: child eb corrupted: parent bytenr=38010880 item=69 parent level=1 child level=1
ERROR: failed to change UUID of metadata: -5
ERROR: btrfstune failed
This leaves a corrupted fs even more corrupted, and due to the extra
CHANGING_FSID flag, btrfs check will not even try to run on it:
Opening filesystem to check...
ERROR: Filesystem UUID change in progress
ERROR: cannot open file system
[CAUSE]
Unlike the kernel, btrfs-progs has a less strict check on transid
mismatch. In read_tree_block() we fall back to using the tree block even
if its transid mismatches when we can't find any better copy.
However, not all commands in btrfs-progs need this feature; only
btrfs-check (which may fix the problem) and btrfs-restore (which just
tries to ignore any problems) really utilize it.
[FIX]
Introduce a new open ctree flag, OPEN_CTREE_ALLOW_TRANSID_MISMATCH, to
be explicit about whether we really want to ignore transid error.
Currently only btrfs-check and btrfs-restore will utilize this new flag.
Also add btrfs-image to allow opening such fs with transid error.
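A sketch of the kind of check such a flag enables. The flag name comes
from the text above, but the bit value, the DEMO prefix and the helper
are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit value, the real one is defined in btrfs-progs. */
#define DEMO_OPEN_CTREE_ALLOW_TRANSID_MISMATCH	(1U << 0)

/*
 * Decide whether a tree block may be used. A block with the wrong
 * transid is accepted only when the caller explicitly opted in.
 */
static bool accept_tree_block(uint64_t wanted_transid, uint64_t found_transid,
			      unsigned int open_flags)
{
	if (found_transid == wanted_transid)
		return true;
	return (open_flags & DEMO_OPEN_CTREE_ALLOW_TRANSID_MISMATCH) != 0;
}
```

With the transids from the report above (wanted 20103, found 20091), a
tool opened without the flag stops at the first bad block instead of
modifying the filesystem further.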
Link: https://www.reddit.com/r/btrfs/comments/pivpqk/failure_during_btrfstune_u/
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The refactoring f3a132fa1b ("btrfs-progs: factor out compression type
name parsing to common utils") caused a bug with parsing option -c with
defrag:
# btrfs fi defrag -v -czstd file
ERROR: unknown compression type: zstd
# btrfs fi defrag -v -clzo file
ERROR: unknown compression type: lzo
# btrfs fi defrag -v -czlib file
ERROR: unknown compression type: zlib
Fix it by properly checking the value representing unknown compression
algorithm.
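One way such a bug can arise is sketched below; this is not the actual
code. The parser returns a dedicated value for an unknown name, and the
caller has to compare against exactly that value. A generic non-zero or
truthiness test would reject every valid type as well. All names are
hypothetical.

```c
#include <string.h>

enum demo_compress {
	DEMO_COMPRESS_UNKNOWN = -1,
	DEMO_COMPRESS_NONE = 0,
	DEMO_COMPRESS_ZLIB,
	DEMO_COMPRESS_LZO,
	DEMO_COMPRESS_ZSTD,
};

/* Map a name to a type, DEMO_COMPRESS_UNKNOWN if it is not recognized. */
static int parse_compress_demo(const char *name)
{
	if (strcmp(name, "zlib") == 0)
		return DEMO_COMPRESS_ZLIB;
	if (strcmp(name, "lzo") == 0)
		return DEMO_COMPRESS_LZO;
	if (strcmp(name, "zstd") == 0)
		return DEMO_COMPRESS_ZSTD;
	return DEMO_COMPRESS_UNKNOWN;
}

/* Returns 0 and stores the type, or -1 for an unknown name. */
static int set_defrag_compress_demo(const char *name, int *type)
{
	int parsed = parse_compress_demo(name);

	/* Check the sentinel itself, not "parsed != 0". */
	if (parsed == DEMO_COMPRESS_UNKNOWN)
		return -1;
	*type = parsed;
	return 0;
}
```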
Issue: #403
Signed-off-by: David Sterba <dsterba@suse.com>
The function btrfs_list_get_path_rootid is exported to libbtrfs so it
needs to stay, but we can inline the implementation.
Signed-off-by: David Sterba <dsterba@suse.com>