Commit Graph

120 Commits

Author SHA1 Message Date
David Sterba
76ab1fa364 btrfs-progs: rename and move group_profile_max_safe_loss
The helper belongs to the others that translate bg flags to the raid
attr table member.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-09-07 16:38:56 +02:00
David Sterba
a5becbde35 btrfs-progs: use raid attr table in group_profile_max_safe_loss
The helper open codes what we already have in the raid attr table, so
use it. We assume a valid flags so there's no error value.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-09-07 16:32:55 +02:00
David Sterba
45d034d774 btrfs-progs: move btrfs_tree_search2_ioctl_supported to fsfeatures.c
The helper detects a feature support, so put it to the right file.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-09-07 16:20:17 +02:00
David Sterba
47b04b747d btrfs-progs: move parse_qgroupid to parse utils
Signed-off-by: David Sterba <dsterba@suse.com>
2021-09-07 14:20:42 +02:00
David Sterba
cb5a542871 btrfs-progs: factor out plain qgroupid parsing
We'll use plain qgroupid parsing function elsewhere so split that part
from parse_qgroupid_or_path. The parsing is slightly reworked and goes
from start to end, while previously it looked up the slash and worked
from there. In case a valid qgroupid is also a valid path, the path must
be specified as absolute.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-09-07 14:20:42 +02:00
David Sterba
ebf500b43d btrfs-progs: rename parse_qgroupid
This helper can parse a qgroupid or a path, so rename it accordingly, so
a plain qgroupid parsing can be factored out as a standalone helper.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-09-07 14:20:42 +02:00
David Sterba
c3ee6a8a09 btrfs-progs: unify GPL header comments
Add the GPL v2 header to files where it was missing and is not from an
external source, update to the most recent version with the address.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-09-07 13:58:44 +02:00
David Sterba
f3a132fa1b btrfs-progs: factor out compression type name parsing to common utils
Signed-off-by: David Sterba <dsterba@suse.com>
2021-09-07 13:58:44 +02:00
David Sterba
31df7dc295 btrfs-progs: factor out profile parsing to common utils
There are some duplicate parsers of the profile names, factor out the
one from balance to the common code.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-09-07 13:58:44 +02:00
David Sterba
6b93a7336c btrfs-progs: move number and range parsing helpers to parse-utils.c
Some of the parsers in cmds/balance.c are generic enough for the common
part.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-09-07 13:58:43 +02:00
David Sterba
af56460de8 btrfs-progs: split parsing helpers from utils.c
There are various parsing helpers scattered everywhere, unify them to
one file and start with helpers already in utils.c.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-09-06 17:15:51 +02:00
David Sterba
a177ef7dd4 btrfs-progs: mkfs: allow degenerate raid0/raid10
Kernel patch b2f78e88052bc0bee ("btrfs: allow degenerate raid0/raid10")
in
5.15 will allow mounting and converting to single device raid0 or two
device raid10.  Let mkfs create such filesystem.

"The motivation is to allow to preserve the profile type as long as it
 possible for some intermediate state (device removal, conversion), or
 when there are disks of different size, with raid0 the otherwise
 unusable space of the last device will be used too.  Similarly for
 raid10, though the two largest devices would need to be the same."

Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-27 15:40:53 +02:00
Qu Wenruo
773afad3e6 btrfs-progs: slightly enhance btrfs_format_csum()
- Change it void
  The old one always return csum_size.

- Use snprintf()

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-26 14:27:01 +02:00
Qu Wenruo
991a598f53 btrfs-progs: move btrfs_format_csum() to common/utils.[ch]
Function btrfs_format_csum() is a special helper only used in
btrfs-progs.

Move it to common/utils.[ch] other than leaving it in
kernel-shared/disk-io.c.

Since we're moving the code, also introduce a macro,
BTRFS_CSUM_STRING_LEN, to replace open-coded string length calculation.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-26 14:26:13 +02:00
Goldwyn Rodrigues
a7ed5b0ced btrfs-progs: correct check_running_fs_exclop() return value
check_running_fs_exclop() can return 1 when exclop is changed to "none"
The ret is set by the return value of the select() operation. Checking
the exclusive op changes just the exclop variable while ret is still
set to 1.

Set ret = 0 if exclop is set to BTRFS_EXCL_NONE or BTRFS_EXCL_UNKNOWN.
Remove unnecessary continue statement at the end of the block.

The command appears to have executed, but does not. This was found when
balance which typically reports chunks relocated did not print anything
on screen.

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-07-02 17:27:53 +02:00
David Sterba
3d52313591 btrfs-progs: add helper to read zone size from sysfs
Sysfs hides the zone size of a block device in the queue/chunk_sectors
file, so add a helper that will read it for us when given the short
device name (that can be found in FSID/devices).

Signed-off-by: David Sterba <dsterba@suse.com>
2021-07-02 17:27:53 +02:00
David Sterba
e25c8f00a3 btrfs-progs: add helper for opening sysfs fsid directory
There are several directories in /sys/fs/btrfs/FSID that contain more
than one file/directory. Add a helper to open the directory so that the
file descriptor can be used for fdopendir.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-07-02 17:27:53 +02:00
Su Yue
80a86f1b47 btrfs-progs: do not BUG_ON if btrfs_add_to_fsid succeeded to write superblock
Commit 8ef9313cf2 ("btrfs-progs: zoned: implement log-structured
superblock") changed to write BTRFS_SUPER_INFO_SIZE bytes to device.
The before num of bytes to be written is sectorsize.
It causes mkfs.btrfs failed on my 16k pagesize kvm:

  $ /usr/bin/mkfs.btrfs -s 16k -f -mraid0 /dev/vdb2 /dev/vdb3
  btrfs-progs v5.12
  See http://btrfs.wiki.kernel.org for more information.

  ERROR: superblock magic doesn't match
  ERROR: superblock magic doesn't match
  common/device-scan.c:195: btrfs_add_to_fsid: BUG_ON `ret != sectorsize`
  triggered, value 1
  /usr/bin/mkfs.btrfs(btrfs_add_to_fsid+0x274)[0xaaab4fe8a5fc]
  /usr/bin/mkfs.btrfs(main+0x1188)[0xaaab4fe4dc8c]
  /usr/lib/libc.so.6(__libc_start_main+0xe8)[0xffff7223c538]
  /usr/bin/mkfs.btrfs(+0xc558)[0xaaab4fe4c558]

  [1]    225842 abort (core dumped)  /usr/bin/mkfs.btrfs -s 16k -f -mraid0
  /dev/vdb2 /dev/vdb3

btrfs_add_to_fsid() now always calls sbwrite() to write
BTRFS_SUPER_INFO_SIZE bytes to device, so change condition of
the BUG_ON().
Also add comments for sbread() and sbwrite().

Signed-off-by: Su Yue <l@damenly.su>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-12 16:00:14 +02:00
David Sterba
3dad6bd6e9 btrfs-progs: use proper array designator for exclop initialization
Compiling with clang produces warnings like:

  common/utils.c:1670:31: warning: use of GNU 'missing =' extension in designator [-Wgnu-designator]
	  [BTRFS_EXCLOP_SWAP_ACTIVATE]    "swap activate",
					  ^
					  =

Use the form with '=' to fix the warning.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-08 00:58:51 +02:00
David Sterba
a2495d1e6e btrfs-progs: export get_zone_unusable and move to utils.c
Getting the per bg type zone unusable space will be used in other size
reports like 'fi us', so export it to the device utils.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-08 00:58:50 +02:00
David Sterba
c19ac510a7 btrfs-progs: move repair.[ch] to common/
Move the file to common as it's used by several parts, while still
keeping the name 'repair' although the only thing it does is adding a
corrupted extent.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-06 16:41:47 +02:00
David Sterba
cfbcfaa4e4 btrfs-progs: mkfs: move btrfs_make_root_dir from utils.c
The helper is used in several tools but logically belongs to mkfs, so
put it to the common section.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-06 16:41:47 +02:00
David Sterba
d591cd7c08 btrfs-progs: split unit related helpers from utils.c
Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-06 16:41:47 +02:00
David Sterba
7fa07e2abb btrfs-progs: split open/close helpers from utils.c
There's a group of functions that are related to opening filesystem in
various modes, this can be moved to a separate file.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-06 16:41:47 +02:00
David Sterba
b19a603d62 btrfs-progs: remove unnecessary linux/*.h includes
Decrease dependency on system headers, remove where they're not needed
or became stale after code moved. The path-utils.h encapsulate path
operations so include linux/limits.h here, that's where PATH_MAX is
defined.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-06 16:41:47 +02:00
David Sterba
600d2dba6f btrfs-progs: add fd version of device_get_partition_size
The helper wraps a raw ioctl but some users may already have the fd and
not necessarily the path. Add a suitable helper for convenience.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-06 16:41:46 +02:00
David Sterba
2cfd248ddf btrfs-progs: remove unused disk_size
This helper hasn't been used since 63bbf2931d ("btrfs-progs: rework
calculations of fi usage") a few years ago and we don't need the statfs
based calculations anywhere.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-06 16:41:46 +02:00
David Sterba
a2dbbcfe88 btrfs-progs: update comments for device helpers
Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-06 16:41:46 +02:00
David Sterba
51c0ece9f6 btrfs-progs: add prefix to get_partition_size
This is a public helper for devices, add the prefix to make it clear.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-06 16:41:46 +02:00
David Sterba
c7b5f884e0 btrfs-progs: add prefix to zero_blocks
This is a public helper for devices, add the prefix to make it clear.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-06 16:41:46 +02:00
David Sterba
2b5d4f2e6f btrfs-progs: add prefix to discard_blocks
This is a helper for devices, make it clear in the function name.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-06 16:41:46 +02:00
David Sterba
bc6864967b btrfs-progs: add prefix to exported queue_param
As this is a public helper, add a prefix that makes it clear what is the
queue related to.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-06 16:41:46 +02:00
Naohiro Aota
b42b7fbc32 btrfs-progs: zoned: support wiping superblock on sequential write zone
We cannot overwrite superblock magic in a sequential required zone.
Instead, we can reset the zone to wipe it.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-06 16:41:46 +02:00
Naohiro Aota
8bbb0c5744 btrfs-progs: zoned: support zero out on zoned block device
If we zero out a region in a sequential write required zone, we cannot
write to the region until we reset the zone. Thus, we must prohibit zeroing
out to a sequential write required zone.

zero_dev_clamped() is modified to take the zone information and it calls
zero_zone_blocks() if the device is host managed to avoid writing to
sequential write required zones.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-06 16:41:46 +02:00
Naohiro Aota
58ec593892 btrfs-progs: zoned: support resetting zoned device
All zones of zoned block devices should be reset before writing. Support
this by introducing PREP_DEVICE_ZONED.

btrfs_reset_all_zones() walk all the zones on a device, and reset a zone if
it is sequential required zone, or discard the zone range otherwise.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-06 16:41:46 +02:00
Naohiro Aota
8ef9313cf2 btrfs-progs: zoned: implement log-structured superblock
Superblock (and its copies) is the only data structure in btrfs which has a
fixed location on a device. Since we cannot overwrite in a sequential write
required zone, we cannot place superblock in the zone.  One easy solution
is limiting superblock and copies to be placed only in conventional zones.

However, this method has two downsides: one is reduced number of superblock
copies. The location of the second copy of superblock is 256GB, which is in
a sequential write required zone on typical devices in the market today.
So, the number of superblock and copies is limited to be two.  Second
downside is that we cannot support devices which have no conventional zones
at all.

To solve these two problems, we employ superblock log writing. It uses two
adjacent zones as a circular buffer to write updated superblocks.  Once the
first zone is filled up, start writing into the second one.  Then, when
both zones are filled up and before starting to write to the first zone
again, reset the first zone.

We can determine the position of the latest superblock by reading write
pointer information from a device. One corner case is when both zones are
full. For this situation, we read out the last superblock of each zone, and
compare them to determine which zone is older.

The following zones are reserved as the circular buffer on ZONED btrfs.

- primary superblock: offset   0B (and the following zone)
- first copy:         offset 512G (and the following zone)
- Second copy:        offset   4T (4096G, and the following zone)

If these reserved zones are conventional, superblock is written fixed at
the start of the zone without logging.

Currently, superblock reading/writing is done by pread/pwrite. This
commit replace the call sites with sbread/sbwrite to wrap the functions.
For zoned btrfs, btrfs_sb_io which is called from sbread/sbwrite
reverses the IO position back to a mirror number, maps the mirror number
into the superblock logging position, and do the IO.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-06 16:41:45 +02:00
Naohiro Aota
384840b9c0 btrfs-progs: zoned: get zone information of zoned block devices
Get the zone information (number of zones and zone size) from all the
devices, if the volume contains a zoned block device. To avoid costly
run-time zone report commands to test the device zones type during block
allocation, it also records all the zone status (zone type, write
pointer position, etc.).

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-06 16:41:45 +02:00
Naohiro Aota
242c8328bc btrfs-progs: zoned: add new ZONED feature flag
With the zoned feature enabled, a zoned block device-aware btrfs
allocates block groups aligned to the device zones and always written in
sequential zones at the zone write pointer position.

It also supports "emulated" zoned mode on a non-zoned device. In the
emulated mode, btrfs emulates conventional zones by slicing the device
into fixed-size zones.

We don't support conversion from the ext4 volume with the zoned feature
because we can't be sure all the converted block groups are aligned to
zone boundaries.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-06 16:41:45 +02:00
Naohiro Aota
acdd22ab68 btrfs-progs: provide fs_info from btrfs_device
Likewise in the kernel code, provide fs_info access from struct
btrfs_device. This will help to unify the code between the kernel and
the userland.

Since fs_info can be NULL at the time of btrfs_add_to_fsid(), let's use
btrfs_open_devices() to set fs_info to the devices.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-06 16:41:45 +02:00
Naohiro Aota
c4d2704c9a btrfs-progs: utils: introduce queue_param helper function
Introduce the queue_param helper function to get a device request queue
parameter. This helper will be used later to query information of a zoned
device.

Furthermore, rewrite is_ssd() using the helper function.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
[Naohiro] fixed error return value
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-06 16:41:45 +02:00
Qu Wenruo
82611f51a5 btrfs-progs: mkfs: only output the warning if the sectorsize is not supported
Currently mkfs.btrfs will output a warning message if the sectorsize is
not the same as page size:

  WARNING: the filesystem may not be mountable, sectorsize 4096 doesn't match page size 65536

But since btrfs subpage support for 64K page size is coming, this output
is populating the golden output of fstests, causing tons of false
alerts.

This patch will teach mkfs.btrfs to check
/sys/fs/btrfs/features/supported_sectorsizes and check if the sector
size is supported.

Then only output above warning message if the sector size is not
supported or the file is not found (ie. kernel does not export the file
yet).

Reviewed-by: Boris Burkov <boris@bur.io>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-06 16:40:51 +02:00
David Sterba
ce2de8282b btrfs-progs: change formatting for plain text lines
Line continuations and not simple "\n" for the json output, this got
inherited to the plain text output, but this is not necessary.

This also caused problems in fstests btrfs/006 where the extra newline
does not match the golden output and the test fails, when printing
device stats that now use the output formatter.

Change the plain text formatting to always expect that a fmt_print or a
manual line print (like is for the device stats) will append the newline
and remove it from the end of formatting.

Link: https://lore.kernel.org/linux-btrfs/CAL3q7H4b7QhL02aSOpN0-k_9P2EAbj1t+NkA6VwidKEg4S996w@mail.gmail.com
Reported-by: Filipe Manana <fdmanana@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-02-22 16:22:21 +01:00
Boris Burkov
feb8f56ba2 btrfs-progs: receive: fix btrfs_mount_root substring bug
The current mount detection code in btrfs receive is not quite perfect.
For example, suppose /tmp is mounted as a tmpfs. In that case,
btrfs receive /tmp2 will find /tmp as the longest mount that matches a
prefix of /tmp2 and blow up because it is not a btrfs filesystem, even
if /tmp2 is just a directory in / mounted as btrfs.

Fix this by replacing the substring check with a dirname recursion to
only check the directories in the path of the dir, rather than every
substring.

Add a new test for this case.

Signed-off-by: Boris Burkov <boris@bur.io>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-02-19 16:29:40 +01:00
Nikolay Borisov
3e653e801c btrfs-progs: remove duplicate checks from cmd_filesystem_resize
btrfs_open_dir already has a check whether the passed path is a
directory and if so it returns a specific error code (-3) when such an
error occurs. Use this instead of open-coding the directory check. To
avoid regression in cli/003 test also move directory checks before fs
type in btrfs_open.

Output before this check:

  ERROR: resize works on mounted filesystems and accepts only
  directories as argument. Passing file containing a btrfs image
  would resize the underlying filesystem instead of the image.

After:

  ERROR: not a directory: /root/btrfs-progs/tests/test.img
  ERROR: resize works on mounted filesystems and accepts only
  directories as argument. Passing file containing a btrfs image
  would resize the underlying filesystem instead of the image.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-02-19 15:24:42 +01:00
David Sterba
b604e0eda2 btrfs-progs: remove workarounds for libmount and static build
Partial revert of 922eaa7b54 ("btrfs-progs: build: fix linking with
static libmount"), remove the necessary workarounds like the weak
symbols and link time warnings. Symbols renamed not to clash with
libmount (parse_size, canonicalize_path) haven't been reverted because
the new names are acceptable.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-02-19 15:24:42 +01:00
David Sterba
e31679c1ec btrfs-progs: reimplement find_mount_fsroot without libmount
In commit 57cfe29e69 ("btrfs-progs: utils: introduce
find_mount_fsroot") the entries in /proc/self/mountinfo are parsed by a
convenience library libmount, because getmntent does not provide the
information we need to distinguish bind mounts.

Using libmount turned out to be problematic in several ways:

- static build got broken due to clashing symbols, eg. for parsing size
  or path canonicalization (#333)

- long-term distros do not have libmount new enough (2.24+) to provide
  some functions (mnt_table_is_empty, #334)

- libmount internally uses getgrnam_r/mnt_get_uid/... that are not
  static-build friendly, a warning is printed during link time for each
  binary; we don't use any of the functions

- libmount has further library dependencies that we don't need:

  $ ldd /usr/lib64/libmount.so.1
	  linux-vdso.so.1 (0x00007fff4f175000)
	  libc.so.6 => /lib64/libc.so.6 (0x00007f44a1763000)
	  libblkid.so.1 => /usr/lib64/libblkid.so.1 (0x00007f44a1730000)
	  libselinux.so.1 => /usr/lib64/libselinux.so.1 (0x00007f44a1704000)
	  /lib64/ld-linux-x86-64.so.2 (0x00007f44a1998000)
	  libpcre.so.1 => /usr/lib64/libpcre.so.1 (0x00007f44a166c000)
	  libdl.so.2 => /lib64/libdl.so.2 (0x00007f44a1666000)

  namely selinux, pcre and dl.

Summing it up, libmount causes more trouble than it's worth using a
convenience library, we want to keep the dependencies minimal so the
custom mountinfo parser was inevitable.

Issue: #333
Issue: #334
Issue: #336
Signed-off-by: David Sterba <dsterba@suse.com>
2021-02-19 15:24:40 +01:00
David Sterba
2347b34af4 btrfs-progs: fix device mapper path canonicalization
Commit 922eaa7b54 ("btrfs-progs: build: fix linking with static
libmount") broke path canonicalization, that prevented eg 'device add
/dev/dm-0' to properly recognize the device mapper names.

Issue: #339
Signed-off-by: David Sterba <dsterba@suse.com>
2021-02-19 15:18:39 +01:00
David Sterba
922eaa7b54 btrfs-progs: build: fix linking with static libmount
The libmount dependency has been added in commit 61ecaff036
("btrfs-progs: build: add libmount dependency"), and static build got
broken. There are functions that do basically the same thing and also
share the name, which in turn fails at link time.

  ld: /../lib64/libmount.a(libcommon_la-canonicalize.o): in function `canonicalize_dm_name':
  util-linux-2.34/lib/canonicalize.c:58: multiple definition of `canonicalize_dm_name';
	  common/path-utils.static.o:btrfs-progs/common/path-utils.c:286: first defined here

In case the collision can be resolved by renaming, it's done
(canonicalize_path and parse_size). There are 2 symbols from selinux
that are substituted by a weak aliases during the static build.

There's one new warning due to use of getgrnam_r in libmount that
depends on dynamic linking and may not work properly with static build.
We're not using the related functions directly or indirectly, so it
should be safe to ignore the warnings.

  ld: ../lib64/libmount.a(la-utils.o): in function `mnt_get_gid':
  util-linux-2.34/libmount/src/utils.c:625: warning: Using 'getgrnam_r' in statically linked applications
  +requires at runtime the shared libraries from the glibc version used for linking

Issue: #333
Signed-off-by: David Sterba <dsterba@suse.com>
2021-01-25 23:31:56 +01:00
Sheng Mao
9870d504d8 btrfs-progs: align receive buffer to enable fast CRC
To use optimized CRC implementation, the input buffer must be
unsigned long aligned. btrfs receive calculates checksum based on
read_buf, including btrfs_cmd_header (with zeroed CRC field)
and command content.

Reorder the buffer to the beginning of the structure and force the
alignment to 64, this should be cacheline friendly and could speed up
the data transfers.

Interesting parts from the report:

Sending host:
Fedora 33
AMD ThreadRipper 1920X - 128GB RAM
2x10GBit Ethernet, bonded
MegaRaid 9270
6x16TB Seagate Exos in RAID5

Receiving host:
Fedora 33
Intel i3-7300 - HT enabled - 32GB RAM
10GBit Ethernet, single connection
MegaRaid 9260
12x8TB WD NAS drives in RAID5

The 2 hosts are connected to the same 10G switch. The sender could definitely
saturate a 10GBit link. The practically achievable writes on the backup host
would be lower, but still at least 400MB/s. The file system contains mostly
large files of 1GB+, so there is little meta-data.

With btrfs send/receive I'm getting a steady transfer rate of 60MB/s. The copy
has been running for a little over 5 days now, having only transferred some
25TB. This is way too slow for this setup.

Analyzing resource usage, the sender side is fine, both the btrfs send and the
corresponding ssh process only use about 10-10% CPU, which on a 24 threaded
machine is virtually nothing. However, the receiver is running with a load of
~2.6, with the sshd using 30-50% CPU and the btrfs receive a further 60-70%.
The rest of the load comes from IO wait. So the bottleneck is the btrfs receive
clearly.

Issue: #324
Signed-off-by: Sheng Mao <shngmao@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-01-18 17:49:22 +01:00
Marcos Paulo de Souza
57cfe29e69 btrfs-progs: utils: introduce find_mount_fsroot
This new function checks for filesystem path name that was mounted, thus
being different from find_mount_root. By using libmount we can easily
parse /proc/self/mountinfo file and check for the pathname field.

The function is useful to filter bind mounts with content different from
the original mount, thus making it safe to assume that the reported path
can be accessed by the user, with the right content.

Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-01-18 17:49:22 +01:00