For historical reasons the helpers [btrfs_]open_dir... return also
the 'DIR *dirstream' value when a directory is opened.
However this is never used. So avoid calling diropen() and return
only the fd.
Replace the last btrfs_open_dir() call with btrfs_open_dir_fd()
removing any reference to the unused/useless dirstream variables.
Also update the add_seen_fsid() function removing any reference to dir
stream (again this is never used).
Signed-off-by: Goffredo Baroncelli <kreijack@libero.it>
Signed-off-by: David Sterba <dsterba@suse.com>
For historical reasons the helpers [btrfs_]open_dir... return also
the 'DIR *dirstream' value when a directory is opened.
Replace btrfs_open_dir() with btrfs_open_dir_fd() removing
any reference to the unused/useless dirstream variables.
Calling btrfs_open_dir_fd() with only the path is equivalent to
btrfs_open_dir(_, _, 1).
Signed-off-by: Goffredo Baroncelli <kreijack@libero.it>
Signed-off-by: David Sterba <dsterba@suse.com>
For historical reasons the helpers [btrfs_]open_dir... return also
the 'DIR *dirstream' value when a directory is opened.
However this is never used. So avoid calling diropen() and return only
the fd. This is a preparatory patch.
Signed-off-by: Goffredo Baroncelli <kreijack@libero.it>
Signed-off-by: David Sterba <dsterba@suse.com>
There's a report that 'btrfs balance start --enqueue' does not properly
wait when there are multiple instances started. The command does a busy
wait instead of timeouts.
Strace output:
0.000006 pselect6(5, NULL, NULL, [4], {tv_sec=60, tv_nsec=0}, NULL) = 1 (except [4], left {tv_sec=59, tv_nsec=999999716})
0.000008 pselect6(5, NULL, NULL, [4], {tv_sec=29, tv_nsec=999999000}, NULL) = 1 (except [4], left {tv_sec=29, tv_nsec=999998786})
After the first select there's almost the entire time left, the second
one starts right after it.
Polling/selecting sysfs files is possible under some conditions:
- the file descriptor must be reopened before each poll/select
- the whole buffer must be read too
With that in place it now works as expected. The remaining timeout logic
is slightly adjusted to wait at most 10 seconds so the pending jobs do
not wait too long if there's still a lot of time left from the first
select.
Issue: #746
Signed-off-by: David Sterba <dsterba@suse.com>
Use the new docref and duplabel syntax to fix build warnings and fix the
link targets in the pages.
[ci skip]
Signed-off-by: David Sterba <dsterba@suse.com>
Sphinx/RST does not(?) have native support for cross references to
labels in specific documents (doc, ref but not both at the same time),
also allowing duplicate labels. We have web and manual pages and to
share the same text there are common files included from each, defining
labels. This leads to warnings and clicking the links ends up in the
included document (like ch-volume-management-intro.rst) instead of the
parent.
As a last resort, implement own role that does the doc and ref in one
go. A special directive is used to define a label that is expected
to be duplicate (but otherwise it's an ordinary label) and this gets rid
of the warning too.
Signed-off-by: David Sterba <dsterba@suse.com>
Subpage promoted to 'OK', no serious problems since 6.0. Otherwise
references, adjustments, wording updates.
[ci skip]
Signed-off-by: David Sterba <dsterba@suse.com>
We've deprecated the -R option and runtime feature distinction, there's
only one option -O recommended so let it also print on the same line.
Incompat/compat/runtime status of a feature shall be consulted with the
documentation.
Signed-off-by: David Sterba <dsterba@suse.com>
Be verbose about the potential compatibility problems with the
sectorsize and page size. Also print the page size on the overview.
Signed-off-by: David Sterba <dsterba@suse.com>
There are new backends and builtin accelerated implementations.
Recalculate the table results, a different host than before, with SHANI
extension.
[ci skip]
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
Although commit b2a1be83b8 ("btrfs-progs: mkfs: keep file descriptors
open during whole time") is making sure we're only closing the writeable
fds after the fs is properly created, there is still a missing fd not
following the requirement.
And this explains the issue why sometimes after mkfs.btrfs, lsblk still
doesn't give a valid uuid.
Shown by the strace output (the command is "mkfs.btrfs -f
/dev/test/scratch1"):
openat(AT_FDCWD, "/dev/test/scratch1", O_RDWR) = 5 <<< Writeable open
fadvise64(5, 0, 0, POSIX_FADV_DONTNEED) = 0
sysinfo({uptime=2529, loads=[8704, 6272, 2496], totalram=4104548352, freeram=3376611328, sharedram=9211904, bufferram=43016192, totalswap=3221221376, freeswap=3221221376, procs=190, totalhigh=0, freehigh=0, mem_unit=1}) = 0
lseek(5, 0, SEEK_END) = 10737418240
lseek(5, 0, SEEK_SET) = 0
......
close(5) = 0 <<< Closed now
pwrite64(6, "O\250\22\261\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 16384, 1163264) = 16384
pwrite64(6, "\201\316\272\342\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 16384, 1179648) = 16384
pwrite64(6, "K}S\t\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 16384, 1196032) = 16384
pwrite64(6, "\207j$\265\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 16384, 1212416) = 16384
pwrite64(6, "q\267;\336\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 16384, 5242880) = 16384
fsync(6) <<< But we're still writing into the disk.
[CAUSE]
After more digging, it turns out we have a very obvious escape in
open_ctree_fs_info():
open_ctree_fs_info()
|- fp = open(oca->filename, flags);
|- info = __open_ctree_fd();
|- close(fp);
As later we only do IO using the device fd, this close() seems fine.
But the truth is, for mkfs usage, this fs_info is a temporary one, with
a special magic number for the disk. And since mkfs is doing writeable
operations, this close() would immediately trigger udev scan.
And since at this stage, the fs is not yet fully created, udev can race
with mkfs, and may get the invalid temporary superblock.
[FIX]
Introduce a new btrfs_fs_info member, initial_fd, for
open_ctree_fs_info() to record the fd.
And on close_ctree(), if we find fs_info::initial_fd is a valid fd, then
close it.
By this, we make sure all writeable fds are only closed after we have
written valid super blocks into the disk.
Issue: #734
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
In "btrfs tune" subcommand, we're using open_ctree_fd(), which
requires passing a valid fd, and the caller is responsible to properly
handling the lifespan of the fd.
But we just call close_ctree() and btrfs_close_all_devices(), without
closing the fd.
[FIX]
Just manually close the fd.
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
In cmd_rescue_clear_ino_cache(), we opened the fs, but without
closing it using close_ctree().
[CAUSE]
This was introduced in 42404a4e44 ("btrfs-progs: move inode cache
removal to rescue group"), the original code inside btrfs check
had a "goto out_close;" to properly close the fs.
[FIX]
Manually call close_ctree() on the fs_info->tree_root.
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The symbol BTRFS_UPDATE_KERNEL seems to be unused since 2f55fd7019
("btrfs-progs: optimize btrfs_scan_lblkid() for multiple calls"), remove
it.
Signed-off-by: David Sterba <dsterba@suse.com>
Wherever it makes sense, save the logs as artifacts when something
fails (most likely the main step with real tests).
Signed-off-by: David Sterba <dsterba@suse.com>
There's a report that passing raw device mapper path and -d don't work
together:
yyy@xxx ~ $ sudo btrfs filesystem show /dev/dm-0
Label: none uuid: a7fbb8d6-ec5d-4e88-bd8b-c686553e0dc7
Total devices 1 FS bytes used 144.00KiB
devid 1 size 256.00MiB used 88.00MiB path /dev/mapper/da0972636816-LogVol00
With --all-devices
yyy@xxx ~ $ sudo btrfs filesystem show --all-devices /dev/dm-0
ERROR: not a valid btrfs filesystem: /dev/dm-0
Where dm-0 corresponds to the LogVol00 device from above.
Passing the option -d skips some steps but still uses the real path of
the device that is required for scanning and identification, while
blkid uses the canonicalized path.
The combination of raw device name and -d was not handled as the raw
path is not in cache and thus not recognized. Canonicalization fixes
that although this changes the device name in the output.
Issue: #732
Signed-off-by: David Sterba <dsterba@suse.com>
There are now two tests with 060, we'd like to be that a unique number,
rename the one that was added more recently.
Signed-off-by: David Sterba <dsterba@suse.com>
The raid-stripe-tree can be enabled for convert, though it's still
considered incomplete and slightly experimental. Due to that the tests
need to be adjusted to check for support and skip mount eventually.
Possible remaining options to add: quota, squota
Issue: #694
Signed-off-by: David Sterba <dsterba@suse.com>
Now docker hub images can be pulled for build tests (sources are
downloaded) and this is faster than rebuilding them each time so this
can be enabled for all ci/* and devel branches.
Signed-off-by: David Sterba <dsterba@suse.com>
The runner scripts ci/ci-build... pass build options that work for each
distro, however this was not matching some Dockerfiles. Pulling such
image would then fail due to missing dependencies (namely libudev).
Signed-off-by: David Sterba <dsterba@suse.com>
There was a bug when a branch contained a slash then the file with
downloaded sources was not found. Update all, all images have to be
rebuilt and pushed to docker hub so the changes are applied inside
github actions.
Signed-off-by: David Sterba <dsterba@suse.com>
Add a possibility to test branches independent of devel, the pattern is
prefix "ci/" or "CI/".
[ci skip]
Signed-off-by: David Sterba <dsterba@suse.com>
We have had working subpage support in Btrfs for many cycles now.
Generally, we do not want people creating filesystems by default
with non-4k sectorsizes since it creates portability problems.
As the subpage has stabilized it seems to be safe to do the switch.
This may still affect users that relying on the previous behaviour.
Issue: #604
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Eric Curtin <ecurtin@redhat.com>
Signed-off-by: Neal Gompa <neal@gompa.dev>
Signed-off-by: David Sterba <dsterba@suse.com>
In case of a raid5/6 filesystem 'btrfs fi us' returns wrong values
without the root capabilities:
$ sudo btrfs fi us /tmp/raid5fs # as root
Overall:
Device size: 3.00GiB
Device allocated: 1.51GiB <--- OK
Device unallocated: 1.49GiB <--- OK
Device missing: 0.00B
Device slack: 0.00B
Used: 769.03MiB <--- OK
Free (estimated): 1.32GiB (min: 1.32GiB) <-OK
Free (statfs, df): 1.32GiB
Data ratio: 1.50 <--- OK
Metadata ratio: 1.50 <--- OK
Global reserve: 5.50MiB (used: 0.00B)
Multiple profiles: no
[...]
$ btrfs fi us /tmp/raid5fs # as user
WARNING: cannot read detailed chunk info, per-device usage will not be shown, run as root
Overall:
Device size: 3.00GiB
Device allocated: 0.00B <--- WRONG
Device unallocated: 3.00GiB <--- WRONG
Device missing: 0.00B
Device slack: 0.00B
Used: 0.00B <--- WRONG
Free (estimated): 0.00B (min: 8.00EiB) <- WRONG
Free (statfs, df): 1.32GiB
Data ratio: 0.00 <--- WRONG
Metadata ratio: 0.00 <--- WRONG
Global reserve: 5.50MiB (used: 0.00B)
Multiple profiles: no
[...]
The reason is that the BTRFS_IOC_SPACE_INFO ioctl doesn't return enough
information. To bypass it a scan of the chunks is required when a
raid5/6 profile is present.
To avoid providing wrong information, in case of a raid5/6 filesystem
without the root capabilities the "btrfs fi us" is not executed at all
and a warning with a suggestion to run it as root is printed.
$ ./btrfs fi us /tmp/t/
WARNING: cannot read detailed chunk info, per-device usage will not be shown, run as root
WARNING: due to the presence of a raid5/raid6 profile, we cannots compute some values;
WARNING: run as root instead.
Signed-off-by: Goffredo Baroncelli <kreijack@libero.it>
Signed-off-by: David Sterba <dsterba@suse.com>
If "btrfs dev us" is invoked by a not root user, it is impossible to
collect the chunk info data (not enough privileges). This causes
"btrfs dev us" to print as "Unallocated" value the size of the disk.
This patch handles the case where print_device_chunks() is invoked
without the chunk info data, printing "Unallocated N/A":
Before the patch:
$ btrfs dev us t/
WARNING: cannot read detailed chunk info, per-device usage will not be shown, run as root
/dev/loop0, ID: 1
Device size: 5.00GiB
Device slack: 0.00B
Unallocated: 5.00GiB <-- Wrong
$ sudo btrfs dev us t/
[sudo] password for ghigo:
/dev/loop0, ID: 1
Device size: 5.00GiB
Device slack: 0.00B
Data,single: 8.00MiB
Metadata,DUP: 512.00MiB
System,DUP: 16.00MiB
Unallocated: 4.48GiB <-- Correct
After the patch:
$ ./btrfs dev us /tmp/t/
WARNING: cannot read detailed chunk info, per-device usage will not be shown, run as root
/dev/loop0, ID: 1
Device size: 5.00GiB
Device slack: 0.00B
Unallocated: N/A
$ sudo ./btrfs dev us /tmp/t/
[sudo] password for ghigo:
/dev/loop0, ID: 1
Device size: 5.00GiB
Device slack: 0.00B
Data,single: 8.00MiB
Metadata,DUP: 512.00MiB
System,DUP: 16.00MiB
Unallocated: 4.48GiB
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Goffredo Baroncelli <kreijack@libero.it>
Signed-off-by: David Sterba <dsterba@suse.com>
This patch introduces a new parser helper, parse_u64_with_suffix(),
which has a better error handling, following all the parse_*()
helpers to return non-zero value for errors.
This new helper is going to replace parse_size_from_string(), which
would directly call exit(1) to stop the whole program.
Furthermore most callers of parse_size_from_string() are expecting
exit(1) for error, so that they can skip the error handling.
For those call sites, introduce a wrapper, arg_strtou64_with_suffix(),
to do that. The only disadvantage is a little less detailed error
report for why the parse failed, but for most cases the generic error
string should be enough.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Both functions are just doing the same thing, the only difference is
only in error handling, as parse_u64() requires callers to handle it,
meanwhile arg_strtou64() would call exit(1).
This patch would convert arg_strtou64() to utilize parse_u64(), and use
the return value to output different error messages.
This also means the return value of parse_u64() would be more than just
0 or 1, but -EINVAL for invalid string (including no numeric string at
all, has any tailing characters, or minus value), and -ERANGE for
overflow.
The existing callers are only checking if the return value is 0, thus
not really affected.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The CI test uses openssl 3.2 but it still hasn't made it to Tumbleweed,
it's still waiting in the queue. It'll be enabled again.
Signed-off-by: David Sterba <dsterba@suse.com>
Enhance the conversion macro to recognize the filesystem type in case
the ntfs-specific converter utility has to be used. Its current feature
set does not match btrfs-convert.
Signed-off-by: David Sterba <dsterba@suse.com>