In commit 57cfe29e69 ("btrfs-progs: utils: introduce
find_mount_fsroot") the entries in /proc/self/mountinfo are parsed by a
convenience library libmount, because getmntent does not provide the
information we need to distinguish bind mounts.
Using libmount turned out to be problematic in several ways:
- static build got broken due to clashing symbols, eg. for parsing size
or path canonicalization (#333)
- long-term distros do not have libmount new enough (2.24+) to provide
some functions (mnt_table_is_empty, #334)
- libmount internally uses getgrnam_r/mnt_get_uid/... that are not
static-build friendly, a warning is printed during link time for each
binary; we don't use any of the functions
- libmount has further library dependencies that we don't need:
$ ldd /usr/lib64/libmount.so.1
linux-vdso.so.1 (0x00007fff4f175000)
libc.so.6 => /lib64/libc.so.6 (0x00007f44a1763000)
libblkid.so.1 => /usr/lib64/libblkid.so.1 (0x00007f44a1730000)
libselinux.so.1 => /usr/lib64/libselinux.so.1 (0x00007f44a1704000)
/lib64/ld-linux-x86-64.so.2 (0x00007f44a1998000)
libpcre.so.1 => /usr/lib64/libpcre.so.1 (0x00007f44a166c000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f44a1666000)
namely selinux, pcre and dl.
Summing it up, libmount causes more trouble than it's worth using a
convenience library, we want to keep the dependencies minimal so the
custom mountinfo parser was inevitable.
Issue: #333
Issue: #334
Issue: #336
Signed-off-by: David Sterba <dsterba@suse.com>
The libmount dependency has been added in commit 61ecaff036
("btrfs-progs: build: add libmount dependency"), and static build got
broken. There are functions that do basically the same thing and also
share the name, which in turn fails at link time.
ld: /../lib64/libmount.a(libcommon_la-canonicalize.o): in function `canonicalize_dm_name':
util-linux-2.34/lib/canonicalize.c:58: multiple definition of `canonicalize_dm_name';
common/path-utils.static.o:btrfs-progs/common/path-utils.c:286: first defined here
In case the collision can be resolved by renaming, it's done
(canonicalize_path and parse_size). There are 2 symbols from selinux
that are substituted by a weak aliases during the static build.
There's one new warning due to use of getgrnam_r in libmount that
depends on dynamic linking and may not work properly with static build.
We're not using the related functions directly or indirectly, so it
should be safe to ignore the warnings.
ld: ../lib64/libmount.a(la-utils.o): in function `mnt_get_gid':
util-linux-2.34/libmount/src/utils.c:625: warning: Using 'getgrnam_r' in statically linked applications
+requires at runtime the shared libraries from the glibc version used for linking
Issue: #333
Signed-off-by: David Sterba <dsterba@suse.com>
To use optimized CRC implementation, the input buffer must be
unsigned long aligned. btrfs receive calculates checksum based on
read_buf, including btrfs_cmd_header (with zeroed CRC field)
and command content.
Reorder the buffer to the beginning of the structure and force the
alignment to 64, this should be cacheline friendly and could speed up
the data transfers.
Interesting parts from the report:
Sending host:
Fedora 33
AMD ThreadRipper 1920X - 128GB RAM
2x10GBit Ethernet, bonded
MegaRaid 9270
6x16TB Seagate Exos in RAID5
Receiving host:
Fedora 33
Intel i3-7300 - HT enabled - 32GB RAM
10GBit Ethernet, single connection
MegaRaid 9260
12x8TB WD NAS drives in RAID5
The 2 hosts are connected to the same 10G switch. The sender could definitely
saturate a 10GBit link. The practically achievable writes on the backup host
would be lower, but still at least 400MB/s. The file system contains mostly
large files of 1GB+, so there is little meta-data.
With btrfs send/receive I'm getting a steady transfer rate of 60MB/s. The copy
has been running for a little over 5 days now, having only transferred some
25TB. This is way too slow for this setup.
Analyzing resource usage, the sender side is fine, both the btrfs send and the
corresponding ssh process only use about 10-10% CPU, which on a 24 threaded
machine is virtually nothing. However, the receiver is running with a load of
~2.6, with the sshd using 30-50% CPU and the btrfs receive a further 60-70%.
The rest of the load comes from IO wait. So the bottleneck is the btrfs receive
clearly.
Issue: #324
Signed-off-by: Sheng Mao <shngmao@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This new function checks for filesystem path name that was mounted, thus
being different from find_mount_root. By using libmount we can easily
parse /proc/self/mountinfo file and check for the pathname field.
The function is useful to filter bind mounts with content different from
the original mount, thus making it safe to assume that the reported path
can be accessed by the user, with the right content.
Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
In cases where the compiler does not initialize the formatter context to
all zeros, there could be garbage values left on the depth 0 that is not
explicitly initialized. This could lead to mistakenly printing a ","
separator before the last closing "}", like
{
"__header": {
"version": "1"
},
}
Signed-off-by: David Sterba <dsterba@suse.com>
Extends fmt_print_start_group() so it can handle when name argument is
NULL. It is useful for printing unnamed array or map.
Signed-off-by: Sidong Yang <realwakka@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The exclusive ops will not start if there's one already running. Now
that we have the sysfs export (since kernel 5.10) to check if there's
one already running, use it to allow enqueueing of the operations as a
convenience.
Supported enqueuing:
btrfs balance start --enqueue
btrfs filesystem resize --enqueue
btrfs device add --enqueue
btrfs device delete --enqueue
btrfs replace start --enqueue
This patch implements the functionality based on Goldwyn's patch
https://lore.kernel.org/linux-btrfs/?q=20200825150338.32610-4-rgoldwyn%40suse.de
but on top of previous preparatory patches.
Note that 'filesystem resize' options could confuse getopt as the
negative size change looks like a series of short options and there's no
way to make getopt ignore the short options, so there's a custom option
parser.
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
There are several problems for current sectorsize check:
- No check at all for sectorsize
This means you can even specify "-s 62k".
- No way to specify sectorsize smaller than page size
Fix all these problems by:
- Introduce btrfs_check_sectorsize()
To do:
* power of 2 check for sectorsize
* lower and upper boundary check for sectorsize
* warn about sectorsize mismatch with page size
- Remove the max() between page size and sectorsize
This allows us to override the sectorsize for 64K page systems.
- Make nodesize calculation based on sectorsize
No need to use page size any more.
Users who specify sectorsize manually really know what they are doing,
and we have warned them already.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Add helper that will either check a running operation or wait until it's
done, so that commands can be started and enqueued. If there are more
enqueued, an attempt to avoid racing is done based on the remaining
waiting time of each command.
Signed-off-by: David Sterba <dsterba@suse.com>
Since kernel 5.10, the file /sys/fs/btrfs/FSID/exclusive_operation
exports textual id of the running exclusive operation (balance, device
add/remove, ...). Add definitions and parsing functions so they can be
used to check before another operation is started and potentially
blocks.
Signed-off-by: David Sterba <dsterba@suse.com>
Add helpers to open and read sysfs files from the per-fs directory.
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The path-util.[ch] is the right place, keep the send-utils.h prototypes
as it's part of libbtrfs headers.
Signed-off-by: David Sterba <dsterba@suse.com>
Add a function get_fsid_fd() to use an open file fd to get the
fsid of the mounted filesystem.
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Add a runtime feature (-R) flag for the free space tree. A filesystem
that is mkfs'd with -R free-space-tree then mounted with no options has
the same contents as one mkfs'd without the option, then mounted with
'-o space_cache=v2'.
The only tricky thing is in exactly how to call the tree creation code.
Using btrfs_create_free_space_tree as is did not quite work, because an
extra reference to the eb (root->commit_root) is leaked, which mkfs
complains about with a warning. I opted to follow how the uuid tree is
created by adding it to the dirty roots list for cleanup by
commit_tree_roots in commit_transaction. As a result,
btrfs_create_free_space_tree no longer exactly matches the version in
the kernel sources.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Signed-off-by: David Sterba <dsterba@suse.com>
[WARNING]
When compiling btrfs-progs, the following warning pops up:
In file included from /usr/include/stdio.h:867,
from ./kerncompat.h:22,
from common/fsfeatures.c:17:
In function 'printf',
inlined from 'process_features' at common/fsfeatures.c:192:4,
inlined from 'btrfs_process_runtime_features' at common/fsfeatures.c:205:2:
/usr/include/bits/stdio2.h:107:10: warning: '%s' directive argument is null [-Wformat-overflow=]
107 | return __printf_chk (__USE_FORTIFY_LEVEL - 1, __fmt, __va_arg_pack ());
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This only occur with default make parameters. If compiling with D=1, the
warning just disappears.
The involved tool chain is:
- GCC 10.1.0
[CAUSE]
The offending code is:
static void process_features(u64 flags, enum feature_source source)
{
...
if (flags & feat->flag) {
printf("Turning ON incompat feature '%s': %s\n",
feat->name, feat->desc);
}
...
}
Currently, there is no runtime/fs feature without a name nor
description. So we shouldn't hit a feature with NULL as name nor
description.
This looks like a bug in GCC though.
[WORKAROUND]
However can workaround it by doing an explicit check on feat->name and
feat->desc to teach GCC not to do a wrong warning.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
For backward compatibility with tools that may rely on the messages we
need a special level to print the message unless the verbosity settings
haven't been set on command line.
Signed-off-by: David Sterba <dsterba@suse.com>
Function btrfs_scan_devices() is being used by commands such as
'btrfs filesystem' and 'btrfs device', by having the verbose argument in
the btrfs_scan_devices() we can control which threads to print the
messages when verbose is enabled by the global option.
Add an option %verbose to btrfs_scan_devices().
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Add --verbose and --quiet command options to show verbose or no output
from the subcommands. By introducing global a bconf::verbose memeber to
propagate the same down to the subcommand.
Further the added helper function pr_verbose() helps to logs the verbose
messages, based on the state of the %bconf::verbose. And further HELPINFO_
defines are provided for the usage.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
As of now the define HELPINFO_INSERT_GLOBALS if used as in the example
as below (as of now its not been used anywhere) will print the help
texts as shown below
$ ./btrfs fi show --help
<snip>
Global options:
--format TYPE where TYPE is: text
So in preparation to add --verbose and --quiet global options, and
apparently --format is not being used yet, this patch splits the global
options into two defines.
"Global options:"
So that the currently added global options --verbose and --quiet can use
the define HELPINFO_INSERT_GLOBALS header as shown below.
$ ./btrfs fi show --help
<snip>
Global options:
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Add support for enabling quotas at mkfs time. The qgroup accounting will
be consistent, ie. works with --rootdir.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Make the features structures more generic to allow mkfs-time and
mount-time sets to be defined.
This provides base for later mkfs support of mount-time features like
quotas.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Update the summary of 'fi usage' where the multiple profiles will be
listed by type, like:
Multiple profiles: yes (data, metadata)
The string is returned from btrfs_test_for_multiple_profiles so the
callers don't have to assemble it together from the other profile
strings.
Signed-off-by: David Sterba <dsterba@suse.com>
The warning header was printed always, even if there weren't multiple
block group profiles. This also fixes the mixed block group profile
detection.
Signed-off-by: David Sterba <dsterba@suse.com>
Use simpler output format for easier parsing and place each block group
type on a separate line.
Example output:
WARNING: Multiple block group profiles detected, see 'man btrfs(5)'.
WARNING: Data: single, raid1
Signed-off-by: David Sterba <dsterba@suse.com>
Move 'single' as the first in the list of the multiple block groups, as
it's the default block group and the simplest.
Example output:
WARNING: data -> [single, raid1], metadata -> [single], system -> [single]
WARNING: data+metadata -> [single], system -> [raid1]
Signed-off-by: David Sterba <dsterba@suse.com>
Simpify sprint_profiles so it does not take the output parameters
optionally and add stubs to btrfs_test_for_multiple_profiles_by_fd.
This allows to remove all conditionals and reduce parameters of
sprint_profiles so that the output is returned directly.
Signed-off-by: David Sterba <dsterba@suse.com>
The term 'mixed' is confusing as it's commonly used for mised block
group profiles created by 'mkfs.btrfs --mixed'. We're interested in
multiple profiles for each type, so use the term 'multiple'.
Signed-off-by: David Sterba <dsterba@suse.com>
Add code to show a warning if a mixed profiles filesystem is detected.
Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
Signed-off-by: David Sterba <dsterba@suse.com>
When compiling with clang, this warning is shown:
common/utils.c:404:3: warning: declaration does not declare anything [-Wmissing-declarations]
__attribute__ ((fallthrough));
This attribute seems to silence the same warning in GCC. Changing this
attribute with /* fallthrough */ fixes the warning for both gcc and
clang.
Full support for the attribute will be in clang 10, gcc supports that
now. Let's use what works for both and switch to the attribute in the
future.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
With the introduction of xxhash64 to btrfs-progs we created a crypto/
directory for all the hashes used in btrfs (although no
cryptographically secure hash is there yet).
Move the crc32c implementation from kernel-lib/ to crypto/ as well so we
have all hashes consolidated.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
Discovered with cppcheck. Fix signed/unsigned int mismatches, sizeof and
long formats.
Pull-request: #197
Signed-off-by: Rosen Penev <rosenp@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The argument isn't changed inside the function.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
[ split from the original patch ]
Signed-off-by: David Sterba <dsterba@suse.com>
It's theoretically possible to add multiple devices with sizes that add up
to or exceed 16EiB. A file system will be created successfully but will
have a superblock with incorrect values for total_bytes and other fields.
Kernels up to v5.0 will crash when they encounter this scenario.
We need to check for overflow and reject the device if it would overflow.
Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1099147
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
At least in Debian, default build flags include -Werror=format-security,
for good reasons in most cases. Here, the string comes from strftime --
and though I don't suspect any locale would be crazy enough to have %X
include a '%' char, the compiler has no way to know that.
Signed-off-by: Adam Borowski <kilobyte@angband.pl>
Signed-off-by: David Sterba <dsterba@suse.com>
Build several standalone tools into one binary and switch the function
by name (symlink or hardlink).
* btrfs
* mkfs.btrfs
* btrfs-image
* btrfs-convert
* btrfstune
The static target is also supported. The name of resulting boxed
binaries is btrfs.box and btrfs.box.static . All the binaries can be
built at the same time without prior configuration.
text data bss dec hex filename
822454 27000 19724 869178 d433a btrfs
927314 28816 20812 976942 ee82e btrfs.box
2067745 58004 44736 2170485 211e75 btrfs.static
2627198 61724 83800 2772722 2a4ef2 btrfs.box.static
File sizes:
857496 btrfs
968536 btrfs.box
2141400 btrfs.static
2704472 btrfs.box.static
Standalone utilities:
512504 btrfs-convert
495960 btrfs-image
471224 btrfstune
491864 mkfs.btrfs
1747720 btrfs-convert.static
1411416 btrfs-image.static
1304256 btrfstune.static
1361696 mkfs.btrfs.static
So the shared 900K binary saves ~2M, or ~5.7M for static build.
Signed-off-by: David Sterba <dsterba@suse.cz>
Add structures and API for unified output definition and multiple
formatting backends. Currently there's plain text and json.
The format of each row is defined in struct rowspec, selected using a
key and formatted according to the type. There are extended types for
eg. UUID or pretty size, while direct printf format specifiers work too.
Due to different nature of the outputs, the context structure members
are not always used.
* text output mostly uses indentation and formats the name to a given
width
* json output tracks nesting depth and keeps stack of previous groups
(list or array) and how many member have been printed, as the
separators are allowed only between values and must not preced the
group closing bracket
the nesting depth is hardcoded to 16, counting the global group
The API provides functions to print simple values and some helpers to
format more complex structures.
Signed-off-by: David Sterba <dsterba@suse.com>
Global options should be printed right after the command options, but
there could be text following the options. Add a marker that will allow
to order the options before that text.
Signed-off-by: David Sterba <dsterba@suse.com>
This adds a global --format option to request extended output formats
from each command.
We currently only support text mode. Command help reports what
output formats are available for each command. Global help reports
what valid formats are.
If an invalid format is requested, an error is reported and lists the
valid formats.
Each command sets a bitmask that describes which formats it is capable
of outputting. If a globally valid format is requested of a command
that doesn't support it, an error is reported and command usage dumped.
Commands don't need to specify that they support text output. All
commands are required to output text.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
[ use global config instead of passing cmd_context ]
Signed-off-by: David Sterba <dsterba@suse.com>
Create directory for all sources that can be used by anything that's not
rellated to a relevant kernel part, all common functions, helpers,
utilities that do not fit any other specific category.
The traditional location would be probably lib/ with all things that are
statically linked to the main binaries, but we have libbtrfs and
libbtrfsutil so this would be confusing.
Signed-off-by: David Sterba <dsterba@suse.com>