In case of a raid5/6 filesystem 'btrfs fi us' returns wrong values
without the root capabilities:
$ sudo btrfs fi us /tmp/raid5fs # as root
Overall:
Device size: 3.00GiB
Device allocated: 1.51GiB <--- OK
Device unallocated: 1.49GiB <--- OK
Device missing: 0.00B
Device slack: 0.00B
Used: 769.03MiB <--- OK
Free (estimated): 1.32GiB (min: 1.32GiB) <-OK
Free (statfs, df): 1.32GiB
Data ratio: 1.50 <--- OK
Metadata ratio: 1.50 <--- OK
Global reserve: 5.50MiB (used: 0.00B)
Multiple profiles: no
[...]
$ btrfs fi us /tmp/raid5fs # as user
WARNING: cannot read detailed chunk info, per-device usage will not be shown, run as root
Overall:
Device size: 3.00GiB
Device allocated: 0.00B <--- WRONG
Device unallocated: 3.00GiB <--- WRONG
Device missing: 0.00B
Device slack: 0.00B
Used: 0.00B <--- WRONG
Free (estimated): 0.00B (min: 8.00EiB) <- WRONG
Free (statfs, df): 1.32GiB
Data ratio: 0.00 <--- WRONG
Metadata ratio: 0.00 <--- WRONG
Global reserve: 5.50MiB (used: 0.00B)
Multiple profiles: no
[...]
The reason is that the BTRFS_IOC_SPACE_INFO ioctl doesn't return enough
information. To bypass it a scan of the chunks is required when a
raid5/6 profile is present.
To avoid providing wrong information, in case of a raid5/6 filesystem
without the root capabilities the "btrfs fi us" is not executed at all
and a warning with a suggestion to run it as root is printed.
$ ./btrfs fi us /tmp/t/
WARNING: cannot read detailed chunk info, per-device usage will not be shown, run as root
WARNING: due to the presence of a raid5/raid6 profile, we cannots compute some values;
WARNING: run as root instead.
Signed-off-by: Goffredo Baroncelli <kreijack@libero.it>
Signed-off-by: David Sterba <dsterba@suse.com>
If "btrfs dev us" is invoked by a not root user, it is impossible to
collect the chunk info data (not enough privileges). This causes
"btrfs dev us" to print as "Unallocated" value the size of the disk.
This patch handles the case where print_device_chunks() is invoked
without the chunk info data, printing "Unallocated N/A":
Before the patch:
$ btrfs dev us t/
WARNING: cannot read detailed chunk info, per-device usage will not be shown, run as root
/dev/loop0, ID: 1
Device size: 5.00GiB
Device slack: 0.00B
Unallocated: 5.00GiB <-- Wrong
$ sudo btrfs dev us t/
[sudo] password for ghigo:
/dev/loop0, ID: 1
Device size: 5.00GiB
Device slack: 0.00B
Data,single: 8.00MiB
Metadata,DUP: 512.00MiB
System,DUP: 16.00MiB
Unallocated: 4.48GiB <-- Correct
After the patch:
$ ./btrfs dev us /tmp/t/
WARNING: cannot read detailed chunk info, per-device usage will not be shown, run as root
/dev/loop0, ID: 1
Device size: 5.00GiB
Device slack: 0.00B
Unallocated: N/A
$ sudo ./btrfs dev us /tmp/t/
[sudo] password for ghigo:
/dev/loop0, ID: 1
Device size: 5.00GiB
Device slack: 0.00B
Data,single: 8.00MiB
Metadata,DUP: 512.00MiB
System,DUP: 16.00MiB
Unallocated: 4.48GiB
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Goffredo Baroncelli <kreijack@libero.it>
Signed-off-by: David Sterba <dsterba@suse.com>
This patch introduces a new parser helper, parse_u64_with_suffix(),
which has a better error handling, following all the parse_*()
helpers to return non-zero value for errors.
This new helper is going to replace parse_size_from_string(), which
would directly call exit(1) to stop the whole program.
Furthermore most callers of parse_size_from_string() are expecting
exit(1) for error, so that they can skip the error handling.
For those call sites, introduce a wrapper, arg_strtou64_with_suffix(),
to do that. The only disadvantage is a little less detailed error
report for why the parse failed, but for most cases the generic error
string should be enough.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
On multi-device filesystems, a scrub limit may be applied to any of the
devices. Ensure that any limit found is not disregarded.
Since it's more intuitive, keep the lowest non-zero limit found, even
though at the present we don't actually use the exact value.
Pull-request: #733
Issue: #727
Fixes: 7e4a235df1 ("btrfs-progs: scrub status: print device speed limit in status if set")
Signed-off-by: Jonas Malaco <jonas@protocubo.io>
Signed-off-by: David Sterba <dsterba@suse.com>
On multi-device filesystems, a scrub limit may be applied to any of the
devices. Ensure that any limit found is not disregarded.
Since it's more intuitive, keep the lowest non-zero limit found, even
though at the present we don't actually use the exact value.
Pull-request: #733
Issue: #727
Fixes: 7e4a235df1 ("btrfs-progs: scrub status: print device speed limit in status if set")
Signed-off-by: Jonas Malaco <jonas@protocubo.io>
Signed-off-by: David Sterba <dsterba@suse.com>
On multi-device filesystems, scrub status should report "some limits
set" if at least one device has a scrub limit set.
However, with btrfs-progs 6.6.3, this was being reported regardless of
whether any limit actually being set:
# sudo btrfs scrub limit /more/butter
UUID: 989129d9-c96f-4d52-9d68-cbb6d9b2c499
Id Limit Path
-- ----- ---------
1 - /dev/sdc1
2 - /dev/sdd1
# sudo btrfs scrub status /more/butter/
UUID: 989129d9-c96f-4d52-9d68-cbb6d9b2c499
Scrub started: Mon Jan 15 02:00:30 2024
Status: running
Duration: 6:23:19
Time left: 0:49:08
ETA: Mon Jan 15 09:12:57 2024
Total to scrub: 9.83TiB
Bytes scrubbed: 8.72TiB (88.64%)
Rate: 397.47MiB/s (some device limits set)
Error summary: no errors found
Fix it by only setting `limit` to the special marker value 1 if at least
one actual limit is found.
Pull-request: #733
Issue: #727
Fixes: 7e4a235df1 ("btrfs-progs: scrub status: print device speed limit in status if set")
Signed-off-by: Jonas Malaco <jonas@protocubo.io>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
When try to create a subvolume where the target path already exists, the
"btrfs" command doesn't return error code correctly.
# mkfs.btrfs -f $dev
# mount $dev $mnt
# touch $mnt/subv1
# btrfs subvolume create $mnt/subv1
ERROR: target path already exists: $mnt/subv1
# echo $?
0
[CAUSE]
The check on whether target exists is done by path_is_dir(), if it
returns 0 or 1, it means there is something in that path already.
But unfortunately commit 5aa959fb34 ("btrfs-progs: subvolume create:
accept multiple arguments") only changed the out label, which would
directly return @ret, not updating the return value correctly.
[FIX]
Make sure all error out branch has their @ret manually updated.
Fixes: 5aa959fb34 ("btrfs-progs: subvolume create: accept multiple arguments")
Issue: #730
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Scrubs which complete in under one second may carry a duration rounded
down to zero. This subsequently results in a bytes_per_sec value of
zero, which corresponds to the Rate metric output, causing intermittent
tests/btrfs/282 failures.
This change ensures that Rate reflects any sub-second bytes processed.
Time left and ETA metrics are also affected by this change, in that they
increase to account for (sub-second) bytes_per_sec.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Disseldorp <ddiss@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
Print one message per scrubbed device and also print the limit if set:
$ btrfs scrub start /mnt
scrub started on /mnt, fsid 9ee93131-f680-4d6c-8ca4-a194506e3081 (pid=27257)
Starting scrub on devid 1 (limit 100.00MiB/s)
Signed-off-by: David Sterba <dsterba@suse.com>
The statfs(2) syscall is deprecated by LSB in favor of statvfs(2),
however we can't replace all uses because we still need the
statfs::f_type to determine the filesystem by magic numer.
Signed-off-by: David Sterba <dsterba@suse.com>
The subvolume cleaning is done by polling but it's possible that the
filesystem turns to read-only (as reported), either due to an error
intentionally. In that case the waiting would be indefinite without an
obvious reason.
To fix that check if the filesystem is still writable in each iteration.
Issue: #535
Link: https://github.com/btrfs/fstests/issues/40
Signed-off-by: David Sterba <dsterba@suse.com>
When there's a speed limit set for a device via
/sysfs/fs/btrfs/FSID/devinfo/scrub_speed_max, show it in the scrub status
output like below:
$ btrfs scrub status -d /mnt
...
Rate: 47.98MiB/s (limit 60MiB/s)
...
If the limit is 0 this means unlimited and is not printed.
For a single device filesystem the limit is printed even without '-d' as
it's clear which device limit applies. For multi-device filesysetms,
without any limits nothing is printed, if there at least one device
limit set then the following is printed:
Rate: 36.37MiB/s (some device limits set)
More details with the -d option.
Issue: #531
Signed-off-by: David Sterba <dsterba@suse.com>
If zstd is not compiled in then a stream fails with a generic error
message:
ERROR: unknown compression: 2
Where BTRFS_ENCODED_IO_COMPRESSION_ZSTD is 2 and there's a case for that
but behind the '#if COMPRESSION_ZSTD'.
Signed-off-by: David Sterba <dsterba@suse.com>
Currently the path of deleted subvolume is printed, we should also print
the numeric id as it's another identifier commonly found and can be used
for a cross reference. In connection with the qgroup deletion it's
making the output clear:
...
Delete subvolume 258 (no-commit): '/mnt/subv1'
Delete qgroup 0/258
...
Signed-off-by: David Sterba <dsterba@suse.com>
The 0/subvolid qgroups are not automatically deleted when the subvolume
is deleted, for historical reasons. There's a command to clean up all
such stale qgroups (btrfs qgroup clean-stale) but this should be also
possible with the subvolume deletion.
With the options we can switch the default to delete the qgroup by
default eventually, if somebody depends on the not deleting behaviour
the negation option can be used.
Issue: #366
Signed-off-by: David Sterba <dsterba@suse.com>
Reported on IRC, that it's unexpected that passing several devices on
command line for 'btrfs device delete' still uses some of the devices
during deletion. The expectation was that they'd be removed at once (and
thus not used for the intermediate chunk relocation).
As it works now, the ioctl removes only one device. As a workaround, add
a timeout (like we have for the full balance and others) when there are
more devices passed on the command line. This can be skipped by the
--force parameter.
Issue: #708
Signed-off-by: David Sterba <dsterba@suse.com>
This patch would make "btrfs subvolume create" to accept multiple
arguments, just like "mkdir".
The existing options like "-i <qgroupid>" and "-p" would all be applied
to all subvolume(s).
If one destination failed, the command would return 1, while still retry
the remaining destinations.
Issue: #695
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The function strdup() can return NULL if the system is out of memory,
thus we need to hanle the rare but possible -ENOMEM case.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Previous fix for char devices and properties opens the path in non
blocking mode but this still triggers the watchdog, as reported. Add a
workaround to properties to completely skip opening the path and just
stat() it.
Issue: #699
Signed-off-by: David Sterba <dsterba@suse.com>
GCC 14 introduces a new -Walloc-size included in -Wextra which gives:
```
common/utils.c:983:15: warning: allocation of insufficient size ‘1’ for type ‘struct config_param’ with size ‘32’ [-Walloc-size]
cmds/qgroup.c:1644:13: warning: allocation of insufficient size ‘1’ for type ‘struct btrfs_qgroup_inherit’ with size ‘72’ [-Walloc-size]
```
The calloc prototype is:
```
void *calloc(size_t nmemb, size_t size);
```
So, just swap the number of members and size arguments to match the prototype, as
we're initialising 1 struct of size `sizeof(struct ...)`. GCC then sees we're not
doing anything wrong.
Pull-request: #707
Signed-off-by: Sam James <sam@gentoo.org>
Signed-off-by: David Sterba <dsterba@suse.com>
The function almost always returns 0 even for errors as the ret value is
not used in the final return. This was attempted to be fixed in
55438f3930 ("btrfs-progs: resize: return error value from
check_resize_args()") but this broke 'resize cancel' when devid 1 was
missing and was later reverted as 4286eb552e ("Revert "btrfs-progs:
resize: return error value from check_resize_args()"").
The devid fallback has been fixed so the proper return value can be
returned now.
Issue: #539
Signed-off-by: David Sterba <dsterba@suse.com>
The implicit devid is 1 but when it does not exist then the command
'btrfs fi resize max /path' fails and requires the user to specify the
number (and finding it elsewhere, e.g. in 'btrfs fi us -T' output).
This is a usability bug, we can verify if devid 1 exists and use the
lowest devid as a fallback. This does what user would expected, though
there's still a warning. Kernel has the hardcoded devid 1 when none is
specified, with this fix in user space the kernel does not need to be
changed (or could behave the same eventually).
Example use:
$ btrfs fi us -T .
Data Metadata System
Id Path single single single Unallocated Total Slack
-- ---------- --------- --------- -------- ----------- ------- -----
4 /dev/loop3 - - - 4.00GiB 4.00GiB -
-- ---------- --------- --------- -------- ----------- ------- -----
Total 416.00MiB 256.00MiB 64.00MiB 4.00GiB 4.00GiB 0.00B
Used 0.00B 128.00KiB 16.00KiB
$ btrfs fi resize max .
WARNING: no devid specified means devid 1 which does not exist, using
lowest devid 4 as a fallback
Resize device id 4 (/dev/loop3) from 4.00GiB to max
Issue: #470
Signed-off-by: David Sterba <dsterba@suse.com>
Add a convenience option to processing the range in smaller steps than
the whole file, where a flush is done after each steps. This could be
potentially used to measure progress with 'btrfs -vv fi defrag'.
Issue: #616
Signed-off-by: David Sterba <dsterba@suse.com>
The option "--clear-space-cache" is not really that suitable for "btrfs
check" group, as there are some concerns:
- Allowing transid mismatch
- No leaf item checks
Thoe behaviour are inherited from the default open ctree flags for
"btrfs check", which can be unsafe if the end user just wants to clear
the cache.
- Unclear if the cache clearing would happen along with repair
Thankfully the clearing of space cache is done without any repair
Thus there is a proposal to move space cache removal to rescue group,
and this patch would do that exactly.
However this would lead to some behavior changes:
- Transid mismatch would be treated as error
- Leaf items size/offset would still be checked
If we hit any above error, we should just abort without doing any
write.
These change would increase the safety of the space cache removal, thus
I believe it's worthy to introduce such behavior change.
Since we're here, also add a small explanation on why we need this
dedicated tool to clear space cache (especially for v1 cache).
Issue: #698
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
There was a support for a short syntax of 'btrfs balance' that accepted
a path where normally would be the mandatory subcommand. This was a
heuristic and nowadays everybody should be using the
'btrfs balance action' syntax. The warning was in place for a year, it's
time to remove the short syntax completely.
Issue: #517
Signed-off-by: David Sterba <dsterba@suse.com>
Allow user to do dry-run deletion, doing any pre-checks and printing
which subvolumes would be deleted. Lack of access rights can still lead
to errors.
Issue: #629
Signed-off-by: David Sterba <dsterba@suse.com>
Some commands could be run in a dry-run mode, i.e. not doing any
write/change actions, only printing the steps and ignoring errors.
There are two possibilities where to put the option:
- as a global one: btrfs --dry-run subvolume delete /path
- local option: btrfs subvolume delete --dry-run /path
As we have several global options already, let's put it there, dry-run
should not be very common so the slight inconvenience of writing the
option out of order of command arguments should be acceptable.
Issue: #629
Signed-off-by: David Sterba <dsterba@suse.com>
Due to refactoring in 88c25674c7 ("btrfs-progs: convert device info
to struct array") the variable tracking number of devices was not
updated and led to an error.
$ btrfs device usage /path
ERROR: unexpected number of devices: 0 != 1
...
Issue: #697
Signed-off-by: David Sterba <dsterba@suse.com>
Add new option -p to 'subvolume create' so it behaves like 'mkdir -p'
and create all missing path components before the subvolume.
Issue: #429
Signed-off-by: Sidong Yang <realwakka@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
For running scrubs, with v6.3 and newer btrfs-progs, it can report
incorrect "Total to scrub":
Scrub resumed: Mon Oct 9 11:28:33 2023
Status: running
Duration: 0:44:36
Time left: 0:00:00
ETA: Mon Oct 9 11:51:38 2023
Total to scrub: 625.49GiB
Bytes scrubbed: 625.49GiB (100.00%)
Rate: 239.35MiB/s
Error summary: no errors found
[CAUSE]
Commit c88ac0170b ("btrfs-progs: scrub: unify the output numbers for
"Total to scrub"") changed the output method for "Total to scrub", but
that value is only suitable for finished scrubs.
For running scrubs, if we use the currently scrubbed values, it would
lead to the above problem.
The real scrubbed bytes is only reliable for finished scrubs, not for
running/canceled/interrupted ones.
[FIX]
Change print_scrub_dev() to do extra checks, and only for finished
scrubs to use the scrubbed bytes.
Otherwise fall back to the device's bytes_used.
Issue: #690
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This patch fixes a bug that could occur when comparing paths in showing
qgroups list. Old code doesn't check it and the bug occurs when there is
stale qgroup and its path is null. Check whether it is null and return
without comparing paths.
Issue: #687
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Sidong Yang <realwakka@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The option "--clear-ino-cache" is not really that suitable for "btrfs
check" group.
Let's move it to "btrfs rescue" group to fix those small hiccups, just
like the existing "btrfs rescue fix-device-size" command.
For now, "btrfs check --clear-ino-cache" would still work, with one
extra warning referring to "btrfs rescue clear-ino-cache".
This is mostly to reduce the surprise, and keep script users (I doubt if
there is any though) happy for now.
In the next or two releases, we would fully remove the support in "btrfs
check" group.
Another small change is, in the documents, we refer to the feature as
"inode map", which doesn't match with the mount option documents.
Since we're here, unify them to "inode cache" feature.
Issue: #669
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
There are quite some variable shadowing in btrfs-progs, most of them are
just reusing some common names like tmp.
And those are quite safe and the shadowed one are even different type.
But there are some exceptions:
- @end in traverse_tree_blocks()
There is already an @end with the same type, but a different meaning
(the end of the current extent buffer passed in).
Just rename it to @child_end.
- @start in generate_new_data_csums_range()
Just rename it to @csum_start.
- @size of fixup_chunk_tree_block()
This one is particularly bad, we declare a local @size and initialize
it to -1, then before we really utilize the variable @size, we
immediately reset it to 0, then pass it to logical_to_physical().
Then there is a location to check if @size is -1, which will always be
true.
According to the code in logical_to_physical(), @size would be clamped
down by its original value, thus our local @size will always be 0.
This patch would rename the local @size to @found_size, and only set
it to -1.
The call site is only to pass something as logical_to_physical()
requires a non-NULL pointer.
We don't really need to bother the returned value.
- duplicated @ref declaration in run_delayed_tree_ref()
- duplicated @super_flags in change_meta_csums()
Just delete the duplicated one.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Currently for multi-word tree names, we only allow '_' to connect the
two words, like "block_group".
Meanwhile for mkfs features, we go '-' to connect two words, like
"block-group-tree".
This makes users to use different separators for different commands.
This patch would allow using both '-' and '_' for tree ids.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
In the kernel we've added a control struct to handle the different
checks we want to do on extent buffers when we read them. Update our
copy of read_tree_block to take this as an argument, then update all of
the callers to use the new structure.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
In the kernel we pass in the parent to btrfs_alloc_tree_block instead of
the blocksize and simply derive the blocksize from the fs_info. Update
the function to match the kernel's convention and update all of the
callers so we can sync ctree.c easily.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This simply zero's out the path, and this is used everywhere we use a
stack path. Drop this usage and simply init the path's to empty instead
of using a function to do the memset.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
In the kernel we have btrfs_print_leaf(eb) instead of
btrfs_print_leaf(eb, mode). In fact in all of the kernel-shared sources
we're just using the default mode. Fix this to have a
__btrfs_print_leaf() which handles the mode for the user space utilities
that want the different behavior, and then change btrfs_print_leaf() to
just be the normal default style.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
In the kernel this is called btrfs_read_node_slot, and it doesn't take a
btrfs_fs_info. Update the btrfs-progs version to match the kernel and
update all of the callers.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This is the calling convention in the kernel because we track dirty
blocks per transaction instead of globally in the fs_info. Simply
mirror what we do in the kernel to make it easier to sync ctree.c
locally.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>