Even with "mkfs.btrfs -b", mkfs.btrfs resets all the zones on the device.
Limit the reset target within the specified length.
Also, we need to check that there is no active zone outside of the FS
range. Having an active zone outside FS reduces the number of zones btrfs
can write simultaneously. Technically, we can still scan all the device
zones and keep active zones outside FS intact and try to live with the
limited active zones. But, that will make btrfs operations harder.
It is generally bad idea to use "-b" on a non-test usage on a device with
active zone limit in the first place. You really need to take care that FS
and outside the FS goes over the limit. That means you'll never be able to
use zones outside the FS anyway.
So, until there is a strong request for that, I don't think it's worthwhile
to do so.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
block_count and dev_block_count are counting the size in bytes. And,
comparing them with e.g, "min_dev_size" is confusing. Rename them to
represent the unit better.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
What basename(3) does with the argument depends on _GNU_SOURCE and
inclusion of libgen.h. This is problematic on Musl (1.2.5) as reported.
We want the GNU semantics that does not modify the argument. Common way
to make it portable is to add own helper. This is now implemented in
path_basename() that does not use the libc provided basename but preserves
the semantics. The path_dirname() is just for parity, otherwise same as
dirname().
Sources:
- https://bugs.gentoo.org/926288
- https://git.musl-libc.org/cgit/musl/commit/?id=725e17ed6dff4d0cd22487bb64470881e86a92e7
Issue: #778
Signed-off-by: David Sterba <dsterba@suse.com>
To be consistent with the rest of the code the sysfs helper should
return the -errno instead of passing -1 from various syscalls. Update
callers that relied on -1 as the invalid file descriptor.
Signed-off-by: David Sterba <dsterba@suse.com>
The sysfs could use more convenience helpers so move the current code to
own file before adding more helpers.
Signed-off-by: David Sterba <dsterba@suse.com>
It's not a common practice to use the same io function for both read and
write (we have pread() and pwrite(), not pio()).
Furthermore the original function has the following problems:
- Not returning proper error number
If we had ioctl/stat errors we just return 0 with errno set.
Thus caller would treat it as a short read, not a proper error.
- Unnecessary @ret_rw
This is not that obvious if we have different handling for read and
write, but if we split them it's super obvious we can reuse @ret.
- No proper copy back for short read
- Unable to constify the @buf pointer for write operation
All those problems would be addressed in this patch.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The fixes involve the following changes:
- Unexport functions which are not utilized out of the file
* print_path_column()
* parse_reflink_range()
* btrfs_list_setup_print_column()
* device_get_partition_size_sysfs()
* max_zone_append_size()
- Include related headers before implementing the function
* change-uuid.c
* convert-bgt.c
* seed.h
- Add missing headers caused by the above header changes
* include <uuid/uuid.h> for tune/tune.h.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
While syncing messages.[ch] I had to back out the ASSERT() code in
kerncompat.h, which means we now rely on the kernel code for ASSERT().
In order to maintain some semblance of separation introduce UASSERT()
and use that in all the purely userspace code.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
There's a group of helpers to read device size, the btrfs_device_size
should be one of them. Rename it and so minor cleanup.
Signed-off-by: David Sterba <dsterba@suse.com>
The tool IWYU (include what you use) suggests to remove and add some
includes. This is only partial to avoid accidental build breakage, the
includes are entangled and will have to be cleaned in the future again.
Signed-off-by: David Sterba <dsterba@suse.com>
The preferred order:
- system headers
- standard headers
- libraries
- kernel library
- kernel shared
- common headers
- other tools
- own headers
Signed-off-by: David Sterba <dsterba@suse.com>
This file includes linux/fs.h which includes linux/mount.h and with
glibc 2.36 linux/mount.h and glibc mount.h are not compatible [1]
therefore try to avoid including both headers
[1] https://sourceware.org/glibc/wiki/Release/2.36
Signed-off-by: Khem Raj <raj.khem@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
mkfs.btrfs v5.15 outputs a message even if the disk is a HDD without
TRIM/DISCARD support:
Performing full device TRIM /dev/sdc2 (326.03GiB) ...
[CAUSE]
mkfs.btrfs check TRIM/DISCARD support through the content of
queue/discard_granularity, but compare it against a wrong value.
When HDD without TRIM/DISCARD support, the content of
queue/discard_granularity is '0' '\n' '\0', rather than '0' '\0'.
[FIX]
- compare the value based on atoi() to provide more robustness
- delete unnecessary '\n' in pr_verbose()
Fixes: c50c448518 ("btrfs-progs: do sysfs detection of device discard capability")
Signed-off-by: Wang Yugui <wangyugui@e16-tech.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Wrap pwrite with btrfs_pwrite(). It simply calls pwrite() on non-zoned
btrfs (opened without O_DIRECT). On zoned mode (opened with O_DIRECT),
it allocates an aligned bounce buffer, copies the contents and uses it
for direct-IO writing.
Writes in device_zero_blocks() and btrfs_wipe_existing_sb() are a little
tricky. We don't have fs_info on our hands, so use zinfo to determine it
is a zoned device or not.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The detection of the discard status of a device is done by issuing a
real discard request but on an empty range. This works in most cases.
However there's a case of a VirtualBox driver that returns 'Operation
not supported' in that case, and then discard is skipped during mkfs.
The other tools like fstrim check the sysfs queue file
discard_granularity which is the recommended way. Do that as well.
Issue: #390
Signed-off-by: David Sterba <dsterba@suse.com>
We cannot zone reset a regular file with emulated zones. So, mkfs.btrfs
on such a file causes the following error.
ERROR: zoned: failed to reset device '/home/naota/tmp/btrfs.img' zones: Inappropriate ioctl for device
Introduce btrfs_zoned_device_info->emulated to distinguish the zones are
emulated or not. And, use it to decide it needs zone reset or not.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Reading partition size using an ioctl requires the device open, but that
does not work for unprivileged users. This leads to 0 size in device
info structures filled by device_get_partition_size.
As a consequence, this also misreports such devices as missing in 'fi
us' overview:
$ btrfs fi us /
WARNING: cannot read detailed chunk info, per-device usage will not be shown, run as root
Overall:
Device size: 411.35GiB
Device allocated: 53.01GiB
Device unallocated: 358.34GiB
Device missing: 411.35GiB
Used: 31.99GiB
Free (estimated): 379.16GiB (min: 379.16GiB)
Free (statfs, df): 379.35GiB
Data ratio: 1.00
Metadata ratio: 1.00
Global reserve: 194.77MiB (used: 0.00B)
Multiple profiles: no
There should be 0 for 'Device missing'.
Add a fallback to read the device size from sysfs in case the ioctl is
not available.
Issue: #395
Signed-off-by: David Sterba <dsterba@suse.com>
Sysfs hides the zone size of a block device in the queue/chunk_sectors
file, so add a helper that will read it for us when given the short
device name (that can be found in FSID/devices).
Signed-off-by: David Sterba <dsterba@suse.com>
Getting the per bg type zone unusable space will be used in other size
reports like 'fi us', so export it to the device utils.
Signed-off-by: David Sterba <dsterba@suse.com>
The helper wraps a raw ioctl but some users may already have the fd and
not necessarily the path. Add a suitable helper for convenience.
Signed-off-by: David Sterba <dsterba@suse.com>
This helper hasn't been used since 63bbf2931d ("btrfs-progs: rework
calculations of fi usage") a few years ago and we don't need the statfs
based calculations anywhere.
Signed-off-by: David Sterba <dsterba@suse.com>
We cannot overwrite superblock magic in a sequential required zone.
Instead, we can reset the zone to wipe it.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
If we zero out a region in a sequential write required zone, we cannot
write to the region until we reset the zone. Thus, we must prohibit zeroing
out to a sequential write required zone.
zero_dev_clamped() is modified to take the zone information and it calls
zero_zone_blocks() if the device is host managed to avoid writing to
sequential write required zones.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
All zones of zoned block devices should be reset before writing. Support
this by introducing PREP_DEVICE_ZONED.
btrfs_reset_all_zones() walk all the zones on a device, and reset a zone if
it is sequential required zone, or discard the zone range otherwise.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Introduce the queue_param helper function to get a device request queue
parameter. This helper will be used later to query information of a zoned
device.
Furthermore, rewrite is_ssd() using the helper function.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
[Naohiro] fixed error return value
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>