Get the zone information (number of zones and zone size) from all the
devices, if the volume contains a zoned block device. To avoid costly
run-time zone report commands to test the device zone type during block
allocation, also record the status of every zone (zone type, write
pointer position, etc.).
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
With the zoned feature enabled, btrfs is aware of zoned block devices:
it allocates block groups aligned to the device zones and always writes
sequential zones at the zone write pointer position.
It also supports "emulated" zoned mode on a non-zoned device. In the
emulated mode, btrfs emulates conventional zones by slicing the device
into fixed-size zones.
We don't support conversion from an ext4 volume with the zoned feature
because we can't be sure all the converted block groups are aligned to
zone boundaries.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
If the kernel supports zoned block devices, the file
/usr/include/linux/blkzoned.h will be present. Check this and define
BTRFS_ZONED if the file is present.
If it is present, enable the ZONED feature; if not, disable it.
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Like in the kernel code, provide fs_info access from struct
btrfs_device. This will help to unify the code between the kernel and
the userland.
Since fs_info can be NULL at the time of btrfs_add_to_fsid(), let's use
btrfs_open_devices() to set fs_info to the devices.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Introduce the queue_param helper function to get a device request queue
parameter. This helper will be used later to query information of a zoned
device.
Furthermore, rewrite is_ssd() using the helper function.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
[Naohiro] fixed error return value
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
alloc_chunk_ctl::calc_size is actually the stripe_size in the kernel
side code. Let's rename it to clarify what the "calc" is.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
chunk_bytes_by_type() takes type, calc_size, and ctl as arguments. But
the first two can be obtained from the ctl. Let's drop these arguments
for simplicity.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Since commit b9444efb66 ("btrfs-progs: don't pretend RAID56 has a
different stripe length"), alloc_chunk_ctl::stripe_len is always fixed
to BTRFS_STRIPE_LEN. Let's replace alloc_chunk_ctl::stripe_len with
BTRFS_STRIPE_LEN, like in the kernel code.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Several calculations in the chunk allocation process use this pattern:
x /= y;
x *= y;
Replace this pattern with round_down().
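
As a minimal sketch of the equivalence, the two helpers below compute the
same value. Both function names are made up for this illustration and are
not the round_down() definition used by btrfs-progs; the modulo form is
chosen here because it works for any non-zero y, not only powers of two.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-in for round_down(); valid for any non-zero y. */
static uint64_t round_down_u64(uint64_t x, uint64_t y)
{
	assert(y != 0);
	return x - (x % y);
}

/* The open-coded divide-then-multiply pattern described above. */
static uint64_t div_then_mul(uint64_t x, uint64_t y)
{
	assert(y != 0);
	x /= y;
	x *= y;
	return x;
}
```

For example, both helpers turn 1000 into 960 when y is 64, and leave an
already aligned value such as 128 unchanged.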
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
In the DUP profile, we can use only half of the space available in a
device extent. Fix the calculation of calc_size for it.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
btrfs_alloc_data_chunk() and create_chunk() have most of their code in
common.
Let's rewrite btrfs_alloc_data_chunk() using create_chunk().
There are two differences between btrfs_alloc_data_chunk() and
create_chunk(). create_chunk() uses find_next_chunk() to decide the
logical address of the chunk, and it uses btrfs_alloc_dev_extent() to
decide the physical address of a device extent. On the other hand,
btrfs_alloc_data_chunk() uses *start for both logical and physical
addresses.
To support the btrfs_alloc_data_chunk() use case, we use ctl->start
and ctl->dev_offset. If these values are set (non-zero), use the
specified values as the address. It is safe to use 0 to indicate that
the value is not set, because the low addresses of both the logical
(0..BTRFS_FIRST_CHUNK_TREE_OBJECT_ID) and the physical
(0..BTRFS_BLOCK_RESERVED_1M_FOR_SUPER) address space are reserved.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Factor out create_chunk() from btrfs_alloc_chunk(). This new function
creates a chunk.
There are no functional changes.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Factor out decide_stripe_size() from btrfs_alloc_chunk(). This new
function calculates the actual stripe size to allocate and decides the
size of a stripe (ctl->calc_size).
This commit has no functional changes.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Move the parameter initialization code for the regular allocator to
init_alloc_chunk_ctl_policy_regular(). This will help adding another
allocator in the future.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Convert alloc_chunk_ctl::type to take the original type in
btrfs_alloc_chunk(). This will help refactoring in the following commits.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Factor out the function dev_extent_search_start() from
find_free_dev_extent_start() to decide the starting position of a device
extent search.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Introduce chunk allocation policy for btrfs. This policy controls how
chunks and device extents are allocated from devices.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Someone asked on reddit how to automatically mount a swapfile from
fstab. As this is not completely obvious, document it with an example.
Signed-off-by: David Sterba <dsterba@suse.com>
There were plans to add X as a flag to set/unset the btrfs NOCOMPRESS
attribute but that never materialized. In e2fsprogs the letter 'm' has
been assigned to the same functionality and released in version 1.46.2.
Update the docs and mention that the compression options are
conflicting.
Signed-off-by: David Sterba <dsterba@suse.com>
Resizing to a size without a sign prefix produces wrong output:
$ btrfs fi resize 1:150g /srv/extra
Resize device id 1 (/dev/sdb1) from 298.09GiB to 0.00B
The resize operation would take effect though.
Fix it by handling the case where mod is 0 in check_resize_args().
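
A minimal sketch of the three cases follows. The function name and the
clamping on underflow are assumptions made for this illustration; it is
not the code of check_resize_args(), only the arithmetic implied by the
description, where mod is -1, 0 or 1 for a '-' prefix, no prefix, or a
'+' prefix.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute the size to report after a resize.
 * mod < 0: shrink by amount, mod > 0: grow by amount,
 * mod == 0: amount is the absolute target size.
 */
static uint64_t resized_size(uint64_t old_size, int mod, uint64_t amount)
{
	assert(mod >= -1 && mod <= 1);
	if (mod < 0)
		/* Clamp at zero for simplicity; real code may reject this. */
		return amount > old_size ? 0 : old_size - amount;
	if (mod > 0)
		return old_size + amount;
	return amount;
}
```

With no sign prefix the reported new size is simply the requested size,
rather than a value derived from the old size.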
Issue: #307
Reported-by: Chris Murphy <lists@colorremedies.com>
Reviewed-by: Boris Burkov <boris@bur.io>
Signed-off-by: Su Yue <l@damenly.su>
Signed-off-by: David Sterba <dsterba@suse.com>
Currently mkfs.btrfs will output a warning message if the sectorsize is
not the same as page size:
WARNING: the filesystem may not be mountable, sectorsize 4096 doesn't match page size 65536
But since btrfs subpage support for 64K page size is coming, this output
is polluting the golden output of fstests, causing tons of false
alerts.
This patch will teach mkfs.btrfs to check
/sys/fs/btrfs/features/supported_sectorsizes and check if the sector
size is supported.
Then only output the above warning message if the sector size is not
supported or the file is not found (i.e. the kernel does not export the
file yet).
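
The lookup could be sketched as below. This assumes the sysfs file holds
a whitespace-separated list of decimal sizes (for example "4096 65536");
the function name is hypothetical and reading the file itself is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Return true if sectorsize appears in list, which is assumed to be a
 * whitespace-separated list of decimal sizes such as "4096 65536\n".
 */
static bool sectorsize_listed(const char *list, unsigned long sectorsize)
{
	assert(list != NULL);
	while (*list) {
		char *end;
		unsigned long val = strtoul(list, &end, 10);

		if (end == list)	/* no more numbers to parse */
			break;
		if (val == sectorsize)
			return true;
		list = end;
	}
	return false;
}
```

A caller would print the warning when the file cannot be read or when
this check returns false for the requested sector size.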
Reviewed-by: Boris Burkov <boris@bur.io>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This relicenses the libbtrfsutil library to LGPLv2.1+ from LGPLv3.
People that have contributed non-trivial changes acknowledged the change
and are listed below.
There's a potential licensing conflict with the 'btrfs' utility that is
GPLv2 and statically links libbtrfsutil; this is not a valid combination
per the compatibility matrix as found in
https://www.gnu.org/licenses/gpl-faq.html#AllCompatibility or
http://gplv3.fsf.org/dd3-faq .
We also have an explicit request to change the license [1] (issue #323)
from LGPLv3 to allow use in environments that don't like GPLv3. Though
the library license is not GPLv3, the full text of the license is in the
repository and the 'lesser' part is an addendum. This was perhaps a bit
confusing, nevertheless this gets clarified as well.
[1] https://lore.kernel.org/linux-btrfs/b927ca28-e280-4d79-184f-b72867dbdaa8@denx.de/
Acked-by: Omar Sandoval <osandov@fb.com>
Acked-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com>
Acked-by: Qu Wenruo <wqu@suse.com>
Acked-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Acked-by: Anand Jain <anand.jain@oracle.com>
Acked-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Link: https://bugs.debian.org/985400
Issue: #323
Signed-off-by: Neal Gompa <ngompa@fedoraproject.org>
Signed-off-by: David Sterba <dsterba@suse.com>
In case the right buffer is emptied it's first set to NULL and
subsequently it's dereferenced to get its size to pass to root_sub_used.
This naturally leads to a NULL pointer dereference. The correct thing to
do is to pass the stashed right->len in "blocksize".
Issue: #296
Pull-request: #360
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Marking BUG() unreachable helps us silence unnecessary warnings e.g.
"warning: control reaches end of non-void function [-Wreturn-type]" like
the code below.
	int foo()
	{
		...
		if (XXX)
			return 0;
		else if (YYY)
			return 1;
		else
			BUG();
	}
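
A compilable sketch of the idea is shown here. MY_BUG() is a hypothetical
macro for this illustration, not the btrfs-progs definition, and
__builtin_unreachable() is a GCC/Clang builtin. The assert(0) disappears
when NDEBUG is defined, which is one way a compiler can end up seeing a
path that falls off the end of the function.

```c
#include <assert.h>

/* Hypothetical BUG()-like macro; the real definition may differ. */
#define MY_BUG()				\
	do {					\
		assert(0);			\
		__builtin_unreachable();	\
	} while (0)

/* Every path either returns a value or is marked unreachable. */
static int sign_class(int v)
{
	if (v < 0)
		return 0;
	else if (v >= 0)
		return 1;
	else
		MY_BUG();
}
```

Without the __builtin_unreachable() call, building this with -DNDEBUG and
-Wreturn-type may produce the warning quoted above.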
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Add general paragraphs and establish per-ioctl argument description
formatting, using [horizontal] where the struct member and the
description are on the same line, unlike ordinary ::.
Signed-off-by: David Sterba <dsterba@suse.com>
Notes for making it work:
- "CFLAGS=-m32 LDFLAGS=-m32 ./configure ..." passes the right flags
everywhere, otherwise just passing it by EXTRA_CFLAGS/EXTRA_LDFLAGS
might miss some binaries or libraries
- note that passing CFLAGS/LDFLAGS to configure will override the
default flags
- libbtrfsutil and python bindings require passing the -m32 flags via
EXTRA_PYTHON_CFLAGS/EXTRA_PYTHON_LDFLAGS
- all the e2fsprogs, zlib, lzo, util-linux etc need packaging changes to
provide static libraries - usually frowned upon by distributions unless
justified, even less so for static + 32bit versions, very niche
use case
Signed-off-by: David Sterba <dsterba@suse.com>
The correct checksum type value is set a few lines below, there seems
to be a stale crc32 initialization. Also remove the crc32c.h include as
it's not used directly anymore.
Signed-off-by: David Sterba <dsterba@suse.com>
For passing authentication keys to the checksumming functions we need a
container for the key.
Pass in a btrfs_fs_info to btrfs_csum_data() so we can use the fs_info
as a container for the authentication key.
Note this is not always possible for all callers of btrfs_csum_data(), so
we're just passing in NULL for now.
Functions calling btrfs_csum_data() with a NULL fs_info argument are
currently not supported in the context of an authenticated file system.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
Extending open_ctree with more parameters would be difficult. We'll need
to add more, so factor out the parameters to a structure for easier
extension.
Signed-off-by: David Sterba <dsterba@suse.com>
The paths are not adapted to the TEST_TOP added a long time ago in
e44f595dd7 ("btrfs-progs: tests: unify test drivers, make ready for
extenral testsuite").
Signed-off-by: David Sterba <dsterba@suse.com>
Add test vectors, a subset without keys as found in the Linux kernel
sources in crypto/testmgr.h, for all supported hash algorithms.
Signed-off-by: David Sterba <dsterba@suse.com>
Autoheader uses the AC_DEFINE macros (and a few others) to populate
the config.h.in file. The autotools documentation does not tell
what happens if AC_DEFINE is used twice for the same identifier.
This patch prevents using AC_DEFINE twice for
HAVE_OWN_FIEMAP_EXTENT_DEFINE, preserving the logic (using the
fact that an undefined identifier in a preprocessor directive is
taken as zero).
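
The preprocessor rule relied on here can be sketched as follows.
HYPOTHETICAL_FEATURE and the other names are made up for this
illustration and are deliberately never defined, so the #if sees the
identifier as 0.

```c
#include <assert.h>

/* HYPOTHETICAL_FEATURE is intentionally not defined anywhere. */
#if HYPOTHETICAL_FEATURE
#define FEATURE_ENABLED 1
#else
#define FEATURE_ENABLED 0
#endif

/* Checked at compile time: the undefined identifier evaluated as 0. */
static_assert(FEATURE_ENABLED == 0, "undefined identifier is taken as 0");

static int feature_enabled(void)
{
	return FEATURE_ENABLED;
}
```

Note that compilers can warn about this usage when -Wundef is enabled.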
Signed-off-by: Pierre Labastie <pierre.labastie@neuf.fr>
Signed-off-by: David Sterba <dsterba@suse.com>
Make output of 'btrfs filesystem resize' command more readable and
describe the changes in more detail.
Before:
Resize '/mnt' of '1:-1G'
After:
Resize device id 1 (/dev/vdb) from 4.00GiB to 3.00GiB
Issue: #307
Signed-off-by: Sidong Yang <realwakka@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
To properly check the 64bit timestamp conversion, the filesystem must
support it. Check that a freshly created filesystem contains the
feature.
Signed-off-by: David Sterba <dsterba@suse.com>
Commit b3df561fbf ("btrfs-progs: convert: copy extra timespec on
ext4") has introduced the ability to convert extended inode time
precision on ext4, but this breaks builds on older distros, where ext4
does not have the nsec time precision.
Commit c615287cc0 ("btrfs-progs: a bunch of typo fixes") tried to fix
that by testing the availability of the EXT4_EPOCH_MASK macro, but the
test is not complete.
This patch aims at fixing the macro test, and changes the
name of the associated HAVE_ macro, since the logic is inverted.
This fixes #353 when ext4 has nsec time precision. Note that the test
convert/019-ext4-copy-timestamps fails when ext4 does not have the nsec
time precision and needs to check for the support.
Issue: #353
Signed-off-by: Pierre Labastie <pierre.labastie@neuf.fr>
Signed-off-by: David Sterba <dsterba@suse.com>
The warning is printed for profiles where it's not intended (like raid0
or raid1c4). Check the correct variable for the target profiles.
Issue: #355
Fixes: 1ed5db8db4 ("btrfs-progs: balance convert: add a warning and countdown for RAID56 conversion")
Signed-off-by: David Sterba <dsterba@suse.com>
Image of ext4 with the needs_recovery incompat bit set. This bit cannot
be set by regular tune2fs, so the image was created from an empty 4M
image by a patched tune2fs that sets the bit unconditionally (the image
still passed e2fsck, with journal recovery).
Issue: #348
Signed-off-by: David Sterba <dsterba@suse.com>
As Chris reports: This ext4 file system has 'needs_recovery' feature set, and
if mounted rw, log replay happens. But btrfs-convert doesn't check for it and
converts anyway. It probably shouldn't.
# debugfs -R stats /dev/loop0
debugfs 1.45.6 (20-Mar-2020)
Filesystem volume name: <none>
Last mounted on: /mnt/0
Filesystem UUID: d3e3862e-f892-4ab7-ae91-84eb4be4a3ef
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index
filetype needs_recovery extent 64bit flex_bg
sparse_super large_file huge_file dir_nlink
extra_isize metadata_csum
Filesystem flags: signed_directory_hash
Default mount options: user_xattr acl
Filesystem state: clean
Errors behavior: Continue
...
Then 'btrfs-convert' proceeds, while 'e2fsck -fvn /dev/loop1' finds some
problems and wants to fix them.
Add a check for the 'needs_recovery' incompat bit set and don't convert
the filesystem.
Issue: #348
Signed-off-by: David Sterba <dsterba@suse.com>
Hard-coding the pkg-config executable might result in build errors
on system and cross environments that have prefixed toolchains. The
PKG_CONFIG variable already holds the proper one and is already used
in a few other places.
Reviewed-by: Neal Gompa <ngompa13@gmail.com>
Signed-off-by: Heiko Becker <heirecka@exherbo.org>
Signed-off-by: David Sterba <dsterba@suse.com>
When running the tests as root, SUDO_HELPER is empty, so that
`run_check "" dd ...` is run, and it fails because the empty command is
not found.
Pull-request: #351
Issue: #352
Author: Pierre Labastie <pierre.labastie@neuf.fr>
Signed-off-by: David Sterba <dsterba@suse.com>
Patch "btrfs-progs: fix false alert on tree block crossing 64K page
boundary" fixes a false alert (warning) when 'btrfs check' is called
right after mkfs.
As we have extensive mkfs coverage in test mkfs/001, add the check for
the warning there instead of a separate test.
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
When btrfs-check is executed even on a newly created fs, it can report
tree blocks crossing the 64K page boundary like this:
Opening filesystem to check...
Checking filesystem on /dev/test/test
UUID: 80d734c8-dcbc-411b-9623-a10bd9e7767f
[1/7] checking root items
[2/7] checking extents
WARNING: tree block [30523392, 30539776) crosses 64K page boudnary, may cause problem for 64K page system
[3/7] checking free space cache
[4/7] checking fs roots
[5/7] checking only csums items (without verifying data)
[6/7] checking root refs
[7/7] checking quota groups skipped (not enabled on this FS)
found 131072 bytes used, no error found
total csum bytes: 0
total tree bytes: 131072
total fs tree bytes: 32768
total extent tree bytes: 16384
btree space waste bytes: 125199
file data blocks allocated: 0
referenced 0
[CAUSE]
Tree block [30523392, 30539776) is in the last 16K slot of a page, as
30523392 % 65536 = 49152 and 30539776 % 65536 = 0.
The cross boundary check uses the exclusive end, which causes false
alerts.
[FIX]
Use inclusive end to do the cross 64K boundary check.
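
The difference between the two checks can be sketched as below. The
function and macro names are hypothetical, and the actual check in
btrfs-progs may be written differently; the sketch only contrasts an
exclusive end with an inclusive one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BOUNDARY_64K	65536ULL

/* Buggy form: compares the page of start with the page of the exclusive end. */
static bool crosses_64k_exclusive(uint64_t start, uint64_t len)
{
	assert(len > 0);
	return start / BOUNDARY_64K != (start + len) / BOUNDARY_64K;
}

/* Fixed form: uses the last byte that belongs to the block. */
static bool crosses_64k_inclusive(uint64_t start, uint64_t len)
{
	assert(len > 0);
	return start / BOUNDARY_64K != (start + len - 1) / BOUNDARY_64K;
}
```

For the 16K tree block at 30523392 from the report, the exclusive form
flags a crossing while the inclusive form does not.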
Reported-by: Wang Yugui <wangyugui@e16-tech.com>
Issue: #352
Issue: #354
Fixes: fc38ae7f48 ("btrfs-progs: check: detect and warn about tree blocks crossing 64K page boundary")
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Currently only a single checksum byte is printed. Fix it so that
the whole checksum is printed, in the order the bytes are stored in
the buffer. This matches what the kernel does, though for CRC32C and
XXHASH the result might differ from what would be printed if the
checksum were stored in an integer variable and printed in the native
format. For consistency we need to print the same format.
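
A minimal sketch of printing in buffer order follows. The function name
and the "0x" prefix are assumptions made for this illustration and are
not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format csum_size bytes of csum as "0x" followed by two hex digits per
 * byte, in buffer order. out must hold at least 2 * csum_size + 3 bytes.
 */
static void format_csum(char *out, const uint8_t *csum, size_t csum_size)
{
	size_t i;

	assert(out != NULL && csum != NULL);
	out += sprintf(out, "0x");
	for (i = 0; i < csum_size; i++)
		out += sprintf(out, "%02x", csum[i]);
}
```

On a little-endian machine, loading the same four bytes into a 32-bit
integer and printing it with %x would show them in the opposite order,
which is the mismatch the message refers to.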
Signed-off-by: Dāvis Mosāns <davispuh@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>