Commit Graph

300 Commits

Author SHA1 Message Date
Josef Bacik
8069b8b8cd btrfs-progs: drop btrfs_init_path
This simply zero's out the path, and this is used everywhere we use a
stack path.  Drop this usage and simply init the path's to empty instead
of using a function to do the memset.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-03 01:11:56 +02:00
Boris Burkov
14ac1a6051 btrfs-progs: mkfs: add support for squota
Add the ability to enable simple quotas from mkfs with '-O squota'

There is some complication around handling enable_gen while still
counting the root node of an fs. To handle this, employ a hack of doing
a no-op write on the root node to bump its generation up above that of
the qgroup enable generation, which results in counting it properly.

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-03 01:11:55 +02:00
Anand Jain
ff4c4a3a00 btrfs-progs: allow duplicate fsid for single device filesystems
For single device btrfs filesystem, allow duplicate fsid to be created.
This should be used with caution as more devices with the same uuid
could be confused with each other.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-02 18:41:08 +02:00
Johannes Thumshirn
fff57d3774 btrfs-progs: load zone info for all zoned devices
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-02 18:41:08 +02:00
Johannes Thumshirn
b4ab282686 btrfs-progs: allow zoned RAID
Allow for RAID levels 0, 1 and 10 on zoned devices if the RAID stripe tree
is used.

Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-02 18:41:08 +02:00
David Sterba
76c0446bec btrfs-progs: mkfs: convert int to bool in a few helpers
Signed-off-by: David Sterba <dsterba@suse.com>
2023-07-27 14:45:29 +02:00
Anand Jain
d46a0ef6a0 btrfs-progs: rename struct open_ctree_flags to open_ctree_args
The struct open_ctree_flags currently holds arguments for
open_ctree_fs_info(), it can be confusing when mixed with a local variable
named open_ctree_flags as below in the function cmd_inspect_dump_tree().

  cmd_inspect_dump_tree()
  ::
  struct open_ctree_flags ocf = { 0 };
  ::
  unsigned open_ctree_flags;

So rename struct open_ctree_flags to struct open_ctree_args.

Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-07-26 15:00:47 +02:00
Dominique Martinet
9362803539 btrfs-progs: mkfs: make --quiet silence the 5.15 default change NOTE
mkfs.btrfs help message for --quiet is 'no message except errors' so
we probably ought to silence this as well in the quiet case.

Author: Dominique Martinet <dominique.martinet@atmark-techno.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-09 11:57:39 +02:00
David Sterba
ae73e89f28 btrfs-progs: mkfs: more verbose output for --rootdir
Print the source directory for --rootdir and if --shrink is used. With
-vv then print the individual files as added:

  $ mkfs.btrfs --rootdir dir --shrink -vv img
  ...
  Rootdir from:       Documentation
  ADD: /btrfs-progs/Documentation/btrfs-check.rst
  ...
  ADD: /btrfs-progs/Documentation/btrfs-send.rst
    Shrink:           yes
  Label:              (null)
  UUID:               40d3a16f-02d8-40d7-824b-239cee528093
  ...

The 'Rootdir from' is printed before the files are added so there's now
message before the files are added which could take some time.

Issue: #627
Signed-off-by: David Sterba <dsterba@suse.com>
2023-05-26 22:17:33 +02:00
David Sterba
95c1fa1871 btrfs-progs: mkfs: remove redundant variable for source dir
Validity of source dir can be determined by the variable itself, no need
to track it separately.

Signed-off-by: David Sterba <dsterba@suse.com>
2023-05-26 22:17:33 +02:00
Qu Wenruo
08a3bd7694 btrfs-progs: tune: add the ability to generate new data checksums
This patch would modify btrfs_csum_file_block() to handle csum type
other than the one used in the current fs.

The new data checksum would use a different objectid (-13) to
distinguish with the existing one (-10).
This needs to change tree-checker to skip the item size checks,
since new csum can be larger than the original csum.

After this stage, the resulted csum tree would look like this:

	item 0 key (CSUM_CHANGE EXTENT_CSUM 13631488) itemoff 8091 itemsize 8192
		range start 13631488 end 22020096 length 8388608
	item 1 key (EXTENT_CSUM EXTENT_CSUM 13631488) itemoff 7067 itemsize 1024
		range start 13631488 end 14680064 length 1048576

Note the itemsize is 8 times the original one, as the original csum is
CRC32, while target csum is SHA256, which is 8 times the size.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-05-26 18:02:32 +02:00
Qu Wenruo
46364d3766 btrfs-progs: replace write_and_map_eb() by write_data_to_disk()
The function write_and_map_eb() is quite abused as a way to write any
generic buffer back to disk.

But we have a more suitable function already, write_data_to_disk().

This patch would remove the abused write_data_to_disk() calls, and
convert the only three valid call sites to write_data_to_disk() instead.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-05-26 18:02:31 +02:00
Josef Bacik
f8efe9f724 btrfs-progs: sync file-item.h into progs
This patch syncs file-item.h into btrfs-progs.  This carries with it an
API change for btrfs_del_csums, which takes a root argument in the
kernel, so all callsites have been updated accordingly.

I didn't sync file-item.c because it carries with it a bunch of bio
related helpers which are difficult to adapt to the kernel.
Additionally there's a few helpers in the local copy of file-item.c that
aren't in the kernel that are required for different tools.

This requires more cleanups in both the kernel and progs in order to
sync file-item.c, so for now just do file-item.h in order to pull things
out of ctree.h.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-05-26 18:02:29 +02:00
Josef Bacik
c979ffd787 btrfs-progs: sync accessors.[ch] from the kernel
This syncs accessors.[ch] from the kernel.  For the most part
accessors.h will remain the same, there's just some helpers that need to
be adjusted for eb->data instead of eb->pages.  Additionally accessors.c
needed to be completely updated to deal with this as well.

This is a set of files where we will likely only sync the header going
forward, and leave the C file in place as it needs to be specific to
btrfs-progs.

This forced a few "unrelated" changes

- Using btrfs_dir_item_ftype() instead of btrfs_dir_item_type().  This
  is due to the encryption changes, and was simpler to just do in this
  patch.
- Adjusting some of the print tree code to use the actual helpers and
  not the btrfs-progs ones.

A local definition of static_assert is used to avoid compilation
failures on older gcc (< 9) where the 2nd parameter is mandatory.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-05-26 18:02:28 +02:00
Josef Bacik
a754fe29d9 btrfs-progs: sync uapi/btrfs.h into btrfs-progs
We want to keep this file locally as we want to be uptodate with
upstream, so we can build btrfs-progs regardless of which kernel is
currently installed.  Sync this with the upstream version and put it in
kernel-shared/uapi to maintain some semblance of where this file comes
from.

There are some changes that need to be synced back to kernel. A local
definition of static_assert is used to avoid compilation problems on gcc
(< 9) due to mandatory 2nd parameter.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-05-26 18:02:28 +02:00
Josef Bacik
bf0f3db765 btrfs-progs: introduce UASSERT() for purely userspace code
While syncing messages.[ch] I had to back out the ASSERT() code in
kerncompat.h, which means we now rely on the kernel code for ASSERT().
In order to maintain some semblance of separation introduce UASSERT()
and use that in all the purely userspace code.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-05-26 18:02:28 +02:00
psykose
c9abbf6264 btrfs-progs: stop using legacy *64 interfaces
The *64 interfaces, such as fstat64, off64_t, etc, are legacy interfaces
created at a time when 64-bit file support was still new. They are
generally exposed when defining a macro named _LARGEFILE64_SOURCE, as
e.g. the glibc docs[0] say.

The modern way to utilise largefile support, is to continue to use the
regular interfaces (off_t, fstat, ..), and define _FILE_OFFSET_BITS=64.

We already use the autoconf macro AC_SYS_LARGEFILE[1] which arranges this
and sets this macro for us. Therefore, we can utilise the non-64 names
without fear of breaking on 32-bit systems.

This fixes the build against musl libc, ever since musl dropped the
*64 compat from interfaces by default[2] just for _GNU_SOURCE, unless
_LARGEFILE64_SOURCE is defined. However, there are plans for a future
removal of the whole *64 header API, and that workaround (adding another
define) might cease to exist.

So, rename all *64 API use to the regular non-suffixed names. For
consistency, rename the internal functions that were *64 named
(lstat64_path, ..) too.

This should have no regressions on any platform.

[0]: https://www.gnu.org/software/libc/manual/html_node/Feature-Test-Macros.html#index-_005fLARGEFILE64_005fSOURCE
[1]: https://www.gnu.org/software/autoconf/manual/autoconf-2.67/html_node/System-Services.html
[2]: 25e6fee27f

Pull-request: #615
Signed-off-by: psykose <alice@ayaya.dev>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-25 16:59:42 +02:00
Qu Wenruo
b2a1be83b8 btrfs-progs: mkfs: keep file descriptors open during whole time
[BUG]
There is an internal bug report that, after mkfs.btrfs there is a chance
that no /dev/disk/by-uuid/<uuid> symlink is not created at all.

[CAUSE]
That uuid symlink is created by udev, which listens to inotify
IN_CLOSE_WRITE events from all block devices.

After such IN_CLOSE_WRITE event is triggered, udev would *disable*
inotify for that block device, and do a blkid scan on it.
After the blkid scan is done, re-enables the inotify listening.

This means normally mkfs tools should open the fd, do all the writes,
and close the fd after everything is done.

But unfortunately for mkfs.btrfs, it's not the case, we have a lot of
phases separated by different close() calls:

  open_ctree() would open fds of each involved device
  and close them at close_ctree()
  Only after close_ctree() we have a valid superblock -\
                                                       |
|<------- A -------->|<--------- B --------->|<------- C ------->|
          |                      |
          |                      `- open a new fd for make_btrfs()
          |                         and close it before open_ctree()
          |                         The device contains invalid sb.
          |
          `- open a new fd for each device, then call
             btrfs_prepare_device(), then close the fd.
             The device would contain no valid superblock.

If at the close() of phase A udev event is triggered, while doing udev
scan we go into phase C (but before the new valid super blocks written),
udev would only see no superblock or invalid superblock.

Then phase C finished, udev resumes its inotify listening, but at this
time mkfs is finished, while udev only sees the premature data from
phase A, and misses the IN_CLOSE_WRITE events from phase C.

[FIX]
Instead of opening and closing a new fd for each device, re-use the fd
opened during prepare_one_device(), and close all the fds until
close_ctree() is called.

By this, although we may still have race between close_ctree() and
explicit close() calls, at least udev can always see the properly
written super blocks.

To compensate the change, some extra cleanups are made:

- Do not touch @device_count
  Which makes later prepare_ctx iteration much easier.

- Remove top-level @fd variable
  Instead go with prepare_ctx[i].fd.

- Do not open with O_RDWR in test_dev_for_mkfs()
  as test_dev_for_mkfs() would close the fd, if we go O_RDWR, it can
  cause the udev race.

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-25 16:59:41 +02:00
Qu Wenruo
4dbe66ca2f btrfs-progs: mkfs: make -R|--runtime-features option deprecated
The option -R|--runtime-features was introduced to support features that
don't result in a full incompat flag change, thus things like
free-space-tree and quota features are put here.

But to end users, such separation of features is not helpful and can be
sometimes confusing.

Thus we're already migrating those runtime features into -O|--features
option under experimental builds.

I believe this is the proper time to move those runtime features into
-O|--features option, and mark the -R|--runtime-features option
deprecated.

For now we still keep the old option as for compatibility purposes.

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 19:27:53 +02:00
David Sterba
a7fa81f296 btrfs-progs: open code print_usage where applicable
After previous change to usage() that now has the return code, there's
no purpose of the print_usage() wrapper so it can be removed.

Signed-off-by: David Sterba <dsterba@suse.com>
2023-02-28 20:11:23 +01:00
Qu Wenruo
f61b90aff9 btrfs-progs: make usage call properly return an exit value
[BUG]
Currently cli/009 test case failed with different exit number:

  ====== RUN CHECK /home/adam/btrfs-progs/btrfstune --help
  usage: btrfstune [options] device
  [...]
  failed: /home/adam/btrfs-progs/btrfstune --help
  test failed for case 009-btrfstune

[CAUSE]
In tune/main.c, we have the following call on usage():

  static void print_usage(int ret)
  {
	usage(&tune_cmd);
	exit(ret);
  }

However usage() itself would always call exit(1):

  void usage(const struct cmd_struct *cmd)
  {
	usage_command_usagestr(cmd->usagestr, NULL, 0, true, true);
	exit(1);
  }

This makes prevents any caller of usage() to modify its exit number.

[FIX]
Add a new argument @error for print_usage(), so we can properly return 0
for -h/--help usage.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-02-28 20:11:23 +01:00
David Sterba
24ec095295 btrfs-progs: crypto: add common function for accelerated initialization
Prepare a single location that will detect or set accelerated versions
of hash algorithms. Right now it's the crc32c, blake2 and sha256 do
an if-else switch while crc32c sets a function pointer.

Signed-off-by: David Sterba <dsterba@suse.com>
2023-02-28 19:49:31 +01:00
Qu Wenruo
f914949b1a btrfs-progs: fix set but not used variables
[WARNING]
Clang 15.0.7 warns about several unused variables:

  kernel-shared/zoned.c:829:6: warning: variable 'num_sequential' set but not used [-Wunused-but-set-variable]
          u32 num_sequential = 0, num_conventional = 0;
              ^
  cmds/scrub.c:1174:6: warning: variable 'n_skip' set but not used [-Wunused-but-set-variable]
          int n_skip = 0;
              ^
  mkfs/main.c:493:6: warning: variable 'total_block_count' set but not used [-Wunused-but-set-variable]
          u64 total_block_count = 0;
              ^
  image/main.c:2246:6: warning: variable 'bytenr' set but not used [-Wunused-but-set-variable]
          u64 bytenr = 0;
              ^

[CAUSE]
Most of them are just straightforward set but not used variables.

The only exception is total_block_count, which has commented out code
relying on it.

[FIX]
Just remove those variables, and for @total_block_count, also remove the
comments.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-02-18 17:44:03 +01:00
David Sterba
347c8209e8 btrfs-progs: mkfs: convert help text to option formatter
Signed-off-by: David Sterba <dsterba@suse.com>
2023-01-25 19:55:47 +01:00
David Sterba
dfd58c294b btrfs-progs: mkfs: use help and cmd_struct for printing help text
Unify the mkfs help text so it uses the help framework. The cmd struct
is set up only partially.

Signed-off-by: David Sterba <dsterba@suse.com>
2023-01-25 19:55:47 +01:00
Naohiro Aota
d8c6021727 btrfs-progs: mkfs: check blkid version on zoned filesystems
Prior to version 2.38, libblkid fails to detect zoned mode's superblock
location resulting in blkid failing to detect btrfs on zoned block
devices. This patch suggest to the user to upgrade libblkid if it
detects a version lower then 2.38.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-01-25 16:19:55 +01:00
David Sterba
f5e07cc60a btrfs-progs: warn when an experimental functionality is used
Print warning when one of the following is requested by some command
line option:

- btrfstune -b: conversion to block-group-tree
- mkfs.btrfs --num-global-roots: extent-tree-v2
- btrfs-image -d: dump image with data

Issue: #523
Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-20 16:39:11 +02:00
Qu Wenruo
d8f1bd519f btrfs-progs: mkfs: fix a stack over-flow when features string are too long
[BUG]
Even with chunk_objectid bug fixed, mkfs.btrfs can still caused stack
overflow when enabling extent-tree-v2 feature (need experimental
features enabled):

  # ./mkfs.btrfs  -f -O extent-tree-v2 ~/test.img
  btrfs-progs v5.19.1
  See http://btrfs.wiki.kernel.org for more information.

  ERROR: superblock magic doesn't match
  NOTE: several default settings have changed in version 5.15, please make sure
        this does not affect your deployments:
        - DUP for metadata (-m dup)
        - enabled no-holes (-O no-holes)
        - enabled free-space-tree (-R free-space-tree)

  Label:              (null)
  UUID:               205c61e7-f58e-4e8f-9dc2-38724f5c554b
  Node size:          16384
  Sector size:        4096
  Filesystem size:    512.00MiB
  Block group profiles:
    Data:             single            8.00MiB
    Metadata:         DUP              32.00MiB
    System:           DUP               8.00MiB
  SSD detected:       no
  Zoned device:       no
  =================================================================
  [... Skip full ASAN output ...]
  ==65655==ABORTING

[CAUSE]
For experimental build, we have unified feature output, but the old
buffer size is only 64 bytes, which is too small to cover the new full
feature string:

  extref, skinny-metadata, no-holes, free-space-tree, block-group-tree, extent-tree-v2

Above feature string is already 84 bytes, over the 64 on-stack memory
size.

This can also be proved by the ASAN output:

  ==65655==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffc4e03b1d0 at pc 0x7ff0fc05fafe bp 0x7ffc4e03ac60 sp 0x7ffc4e03a408
  WRITE of size 17 at 0x7ffc4e03b1d0 thread T0
      #0 0x7ff0fc05fafd in __interceptor_strcat /usr/src/debug/gcc/libsanitizer/asan/asan_interceptors.cpp:377
      #1 0x55cdb7b06ca5 in parse_features_to_string common/fsfeatures.c:316
      #2 0x55cdb7b06ce1 in btrfs_parse_fs_features_to_string common/fsfeatures.c:324
      #3 0x55cdb7a37226 in main mkfs/main.c:1783
      #4 0x7ff0fbe3c28f  (/usr/lib/libc.so.6+0x2328f)
      #5 0x7ff0fbe3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
      #6 0x55cdb7a2cb34 in _start ../sysdeps/x86_64/start.S:115

[FIX]
Introduce a new macro, BTRFS_FEATURE_STRING_BUF_SIZE, along with a new
sanity check helper, btrfs_assert_feature_buf_size().

The problem is I can not find a build time method to verify
BTRFS_FEATURE_STRING_BUF_SIZE is large enough to contain all feature
names, thus have to go the runtime function to do the BUG_ON() to verify
the macro size.

Now the minimal buffer size for experimental build is 138 bytes, just
bump it to 160 for future expansion.

And if further features go beyond that number, mkfs.btrfs/btrfs-convert
will immediately crash at that BUG_ON(), so we can definitely detect it.

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Tested-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-11 09:08:12 +02:00
Qu Wenruo
56e75c9f75 btrfs-progs: mkfs: fix a crash when enabling extent-tree-v2
[BUG]
When enabling extent-tree-v2 feature at mkfs time (need to enable
experimental features), mkfs.btrfs will crash:

  # ./mkfs.btrfs  -f -O extent-tree-v2 ~/test.img
  btrfs-progs v5.19.1
  See http://btrfs.wiki.kernel.org for more information.

  ERROR: superblock magic doesn't match
  NOTE: several default settings have changed in version 5.15, please make sure
        this does not affect your deployments:
        - DUP for metadata (-m dup)
        - enabled no-holes (-O no-holes)
        - enabled free-space-tree (-R free-space-tree)

  Segmentation fault (core dumped)

[CAUSE]
The block group tree looks like this after make_btrfs() call:

  (gdb) call btrfs_print_tree(root->fs_info->block_group_root->node, 0)
  leaf 1163264 items 1 free space 16234 generation 1 owner BLOCK_GROUP_TREE
  leaf 1163264 flags 0x0() backref revision 1
  checksum stored f137c1ac
  checksum calced f137c1ac
  fs uuid 450d4b15-4954-4574-9801-8c6d248aaec6
  chunk uuid 4c4cc54d-f240-4aa4-b88b-bd487db43444
	item 0 key (1048576 BLOCK_GROUP_ITEM 4194304) itemoff 16259 itemsize 24
		block group used 131072 chunk_objectid 256 flags SYSTEM|single
						       ^^^

This looks completely sane, but notice that chunk_objectid 256.
That 256 value is the expected one for regular non-extent-tree-v2 btrfs,
but for extent-tree-v2, chunk_objectid is reused as the global id of
extent tree where the block group belongs to.

With the old 256 value as chunk_objectid, btrfs will not find an extent
tree root for the block group, and return NULL for btrfs_extent_root()
call, and trigger segfault.

This is a regression caused by commit 1430b41427 ("btrfs-progs:
separate block group tree from extent tree v2"), which doesn't take
extent-tree-v2 on-disk format into consideration.

[FIX]
For the initial btrfs created by make_btrfs(), all block group items
will be in extent-tree global id 0, thus we can reset chunk_objectid to
0, if and only if extent-tree-v2 is enabled.

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Tested-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-11 09:08:12 +02:00
Qu Wenruo
bed70b939f btrfs-progs: fsfeatures: properly merge -O and -R options
[BUG]
Commit "btrfs-progs: prepare merging compat feature lists" tries to
merged "-O" and "-R" options, as they don't correctly represents
btrfs features.

But that commit caused the following bug during mkfs for experimental
build:

  $ mkfs.btrfs -f -O block-group-tree  /dev/nvme0n1
  btrfs-progs v5.19.1
  See http://btrfs.wiki.kernel.org for more information.

  ERROR: superblock magic doesn't match
  ERROR: illegal nodesize 16384 (not equal to 4096 for mixed block group)

[CAUSE]
Currently btrfs_parse_fs_features() will return a u64, and reuse the
same u64 for both incompat and compat RO flags for experimental branch.

This can easily leads to conflicts, as
BTRFS_FEATURE_INCOMPAT_MIXED_BLOCK_GROUP and
BTRFS_FEATURE_COMPAT_RO_BLOCK_GROUP_TREE both share the same bit
(1 << 2).

Thus for above case, mkfs.btrfs believe it has set MIXED_BLOCK_GROUP
feature, but what we really want is BLOCK_GROUP_TREE.

[FIX]
Instead of incorrectly re-using the same bits in btrfs_feature, split
the old flags into 3 flags:

- incompat_flag
- compat_ro_flag
- runtime_flag

The first two flags are easy to understand, the corresponding flag of
each feature.
The last runtime_flag is to compensate features which doesn't have any
on-disk flag set, like QUOTA and LIST_ALL.

And since we're no longer using a single u64 as features, we have to
introduce a new structure, btrfs_mkfs_features, to contain above 3
flags.

This also mean, things like default mkfs features must be converted to
use the new structure, thus those old macros are all converted to
const static structures:

- BTRFS_MKFS_DEFAULT_FEATURES + BTRFS_MKFS_DEFAULT_RUNTIME_FEATURES
  -> btrfs_mkfs_default_features

- BTRFS_CONVERT_ALLOWED_FEATURES -> btrfs_convert_allowed_features

And since we're using a structure, it's not longer as easy to implement
a disallowed mask.

Thus functions with @mask_disallowed are all changed to using
an @allowed structure pointer (which can be NULL).

Finally if we have experimental features enabled, all features can be
specified by -O options, and we can output a unified feature list,
instead of the old split ones.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-11 09:08:11 +02:00
Qu Wenruo
2cdc8dddbf btrfs-progs: mkfs: offset inode numbers of the source filesystem
[BUG]
When running mkfs tests on a newly rebooted minimal system, it can cause
mkfs/009 to fail.

The reproduce steps requires /tmp to has minimal files in the first
place.

  # mkdir /tmp/rootdir
  # xfs_io -f -c "pwrite 0 16k" /tmp/rootdir
  # mkfs.btrfs --rootdir /tmp/rootdir -f $dev
  # btrfs check $dev
  Opening filesystem to check...
  Checking filesystem on /dev/test/scratch1
  UUID: 6821b3db-f056-4c18-b797-32679dcd4272
  [1/7] checking root items
  [2/7] checking extents
  data backref 13631488 root 5 owner 170 offset 0 num_refs 0 not found in extent tree
  incorrect local backref count on 13631488 root 5 owner 170 offset 0 found 1 wanted 0 back 0x55ff6cd72260
  backref 13631488 root 5 not referenced back 0x55ff6cd4c1f0
  incorrect global backref count on 13631488 found 2 wanted 1
  backpointer mismatch on [13631488 16384]
  ERROR: errors found in extent allocation tree or chunk allocation

[CAUSE]
The extent tree has the following weird item:

	item 0 key (13631488 EXTENT_ITEM 16384) itemoff 16250 itemsize 33
		refs 1 gen 0 flags DATA
		tree block backref root FS_TREE

This is an extent item for data, thus it should not have an inline tree
backref.

Then checking the fs tree:

	item 0 key (170 INODE_ITEM 0) itemoff 16123 itemsize 160
		generation 7 transid 0 size 16384 nbytes 16384
		block group 0 mode 100600 links 1 uid 1000 gid 1000 rdev 0
		sequence 0 flags 0x0(none)
		atime 1664866393.0 (2022-10-04 14:53:13)
		ctime 1664863510.0 (2022-10-04 14:05:10)
		mtime 1664863455.0 (2022-10-04 14:04:15)
		otime 0.0 (1970-01-01 08:00:00)

There is an inode item before the root dir inode.

And that inode number 170 is causing the problem.

In traverse_directory(), we use the inode number reported from stat()
directly as btrfs inode number, and pass it to
btrfs_record_file_extent(), which finally calls btrfs_inc_extent_ref(),
with above 170 passed as @owner parameter.

But inside btrfs_inc_extent_ref() we use that @owner value to determine
if it's a data backref.
Since we got a smaller than BTRFS_FIRST_FREE_OBJECTID, btrfs treats it
as tree block, and cause the above problem.

[FIX]
As a quick fix, always add BTRFS_FIRST_FREE_OBJECTID to all inode number
directly grabbed from stat().

And add an ASSERT() in __btrfs_record_file_extent() to catch unexpected
objectid.

This is not a perfect solution, as the resulted fs will has a huge gap
in its inodes:

	item 0 key (256 INODE_ITEM 0) itemoff 16123 itemsize 160
	item 4 key (426 INODE_ITEM 0) itemoff 15883 itemsize 160

For a proper fix, we should allocate new btrfs inode numbers in a
sequential order, but that would be another series of patches.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-11 09:08:10 +02:00
David Sterba
ccb2d4aa45 btrfs-progs: device-utils: rename btrfs_device_size
There's a group of helpers to read device size, the btrfs_device_size
should be one of them. Rename it and so minor cleanup.

Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-11 09:08:10 +02:00
David Sterba
ea0b894967 btrfs-progs: mkfs: do proper error handling
Replace BUG_ON after transaction start failures, all the functions
already handle errors and return them to the caller. The other error
handling is for impossible conditions.

Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-11 09:08:10 +02:00
David Sterba
a827bb2db8 btrfs-progs: use template for transaction commit error messages
Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-11 09:08:10 +02:00
David Sterba
8fcafae04a btrfs-progs: use template for transaction start error messages
Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-11 09:08:10 +02:00
David Sterba
c2be0e2ce0 btrfs-progs: use template for out of memory error messages
Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-11 09:08:09 +02:00
David Sterba
f7a768d624 btrfs-progs: mkfs: remove support for option --leafsize
The leafsize has never been different from nodesize and since 4.0 (2015)
it's been alias for nodesize. This should be enough time for everybody
to update so the support is removed.

Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-11 09:08:09 +02:00
David Sterba
48c5740e87 btrfs-progs: docs: clarify meaning of mkfs --byte-count
The meaning of the -b/--byte-count option is different than what the
help text says. Historically it was used to set the filesystem size but
with multiple devices it sets the size on each device:

  $ mkfs.btrfs /dev/sdx[1234]
  ...
  Number of devices:  4
  Devices:
     ID        SIZE  PATH
      1     2.00GiB  /dev/sdx1
      2     2.00GiB  /dev/sdx2
      3     2.00GiB  /dev/sdx3
      4     2.00GiB  /dev/sdx4

And when set to 1G:

  $ mkfs.btrfs -b 1G /dev/sdx[1234]
  ...
  Number of devices:  4
  Devices:
     ID        SIZE  PATH
      1     1.00GiB  /dev/sdx1
      2     1.00GiB  /dev/sdx2
      3     1.00GiB  /dev/sdx3
      4     1.00GiB  /dev/sdx4

Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-11 09:08:09 +02:00
David Sterba
6743a47b35 btrfs-progs: mkfs: rename dev_cnt to device_count
The variable name could expanded, it's not necessary to be abbreviated.

Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-11 09:08:09 +02:00
David Sterba
971bee96c2 btrfs-progs: mkfs: use _set suffix for option tracking
Unify naming of variables that track if the command option is set to use
the _set suffix.

Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-11 09:08:09 +02:00
David Sterba
c85bb9b5bf btrfs-progs: mkfs: group feature option declarations
Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-11 09:08:09 +02:00
David Sterba
b73a29936a btrfs-progs: remove unnecessary casts for u64
The (unsigned long long) type casts can be dropped, printf understands
%llu and u64 and does not warn. In cases where the type is not u64 keep
the cast.

Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-11 09:08:09 +02:00
Li Zhang
bb2eed3aa5 btrfs-progs: mkfs: run device preparation in parallel
When devices are formatted as btrfs, btrfs_prepare_device is called
sequentially for each device, which takes too much time.

Put each btrfs_prepare_device into a thread, wait for the first thread
to complete to mkfs.btrfs, and wait for other threads to complete before
adding other devices to the file system.

During the preparation it's either trim/discard or zone reset.

This was tested with TCMU emulation with two zoned devices.  Each device
is 2000G (about 19.53 TiB), the region size is 4MB, Use the following
parameters for targetcli:

  create name=zbc0 size=20000G cfgstring=model-HM/zsize-4/conv-100@~/zbc0.raw

Call difftime to calculate the running time of the function
btrfs_prepare_device.  Calculate the time from thread creation to
completion of all threads after patching:

  $ lsscsi -p
  [10:0:1:0]   (0x14)  LIO-ORG  TCMU ZBC device  0002  /dev/sdb   -          none
  [11:0:1:0]   (0x14)  LIO-ORG  TCMU ZBC device  0002  /dev/sdc   -          none

  $ sudo mkfs.btrfs -d single -m single -O zoned /dev/sdc /dev/sdb -f
  ....
  time for prepare devices:4.000000.
  ....

  $ sudo mkfs.btrfs -d single -m single -O zoned /dev/sdc /dev/sdb -f
  ...
  time for prepare devices:2.000000.
  ...

Issue: #496
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Li Zhang <zhanglikernel@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-11 09:08:08 +02:00
David Sterba
b19ab3ba9f btrfs-progs: mkfs: use message helpers for error messages
Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-11 09:08:07 +02:00
David Sterba
6edd4b2121 btrfs-progs: factor string helpers out of utils.c
Utils is the catch-all file, we can now separate some string utility
functions.

Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-11 09:06:13 +02:00
David Sterba
0f03da53cc btrfs-progs: mkfs: update include lists
The tool IWYU (include what you use) suggests to remove and add some
includes.

Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-11 09:06:12 +02:00
David Sterba
61de520917 btrfs-progs: mkfs: reorder includes
The preferred order:
- system headers
- standard headers
- libraries
- kernel library
- kernel shared
- common headers
- other tools
- own headers

Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-11 09:06:11 +02:00
David Sterba
c33c2c66b3 btrfs-progs: mkfs: duplicate argument for --rootdir path
The source dir points to the argv data, we should make a copy to be sure
it won't change due to further processing.

Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-11 09:06:11 +02:00
David Sterba
54fe8e648a btrfs-progs: mkfs: open code label parsing helper
The helper parse_label is used only once and is trivial. Open code it in
the argument parsing, also to make the exit() is more visible.

Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-11 09:06:11 +02:00
David Sterba
74be9f5115 btrfs-progs: mkfs: open code profile parsing helper
There's a helper to parse profile name and exits on error. As this is a
trivial helper we can open code it and adapt the error message to be
more specific what failed.

Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-11 09:06:11 +02:00
David Sterba
e331037d40 btrfs-progs: factor out and export rotational/ssd device helper
The helper belongs to device utils, move it from the mkfs.

Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-11 09:06:11 +02:00
Qu Wenruo
1c414061ed btrfs-progs: mkfs: add artificial dependency for block group tree
To reduce the test matrix and to follow the kernel behavior, make sure
for block-group-tree feature, we have no-holes and free-space-tree
features enabled.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-09-12 18:25:32 +02:00
Qu Wenruo
1430b41427 btrfs-progs: separate block group tree from extent tree v2
Block group tree feature is completely a standalone feature, and it has
been over 5 years before the initial introduction to solve the long
mount time.

I don't really want to waste another 5 years waiting for a feature which
may or may not work, but definitely not properly reviewed for its
preparation patches.

So this patch will separate the block group tree feature into a
standalone compat RO feature.

There is a catch, in mkfs create_block_group_tree(), current
tree-checker only accepts block group item with valid chunk_objectid,
but the existing code from extent-tree-v2 didn't properly initialize it.

This patch will also fix above mentioned problem so kernel can mount it
correctly.

Now mkfs/fsck should be able to handle the fs with block group tree.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-09-12 18:25:32 +02:00
Qu Wenruo
c5a21a7814 btrfs-progs: don't save block group root into super block
The extent tree v2 (thankfully not yet fully materialized) needs a
new root for storing all block group items.

My initial proposal years ago just added a new tree rootid, and load it
from tree root, just like what we did for quota/free space tree/uuid/extent
roots.

But the extent tree v2 patches introduced a completely new (and to me,
wasteful) way to store block group tree root into super block.

Currently there are only 3 trees stored in super blocks, and they all
have their valid reasons:

- Chunk root
  Needed for bootstrap.

- Tree root
  Really the entrance of all trees.

- Log root
  This is special as log root has to be updated out of existing
  transaction mechanism.

There is not even any reason to put block group root into super blocks,
the block group tree is updated at the same timing as old extent tree,
no need for extra bootstrap/out-of-transaction update.

So just move block group root from super block into tree root.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-09-12 15:31:27 +02:00
Qu Wenruo
cc4a249e99 btrfs-progs: mkfs: dynamically modify mkfs blocks array
In mkfs_btrfs(), we have a btrfs_mkfs_block array to store how many tree
blocks we need to reserve for the initial btrfs image.

Currently we have two very similar arrays, extent_tree_v1_blocks and
extent_tree_v2_blocks.

The only difference is just v2 has an extra block for block group tree.

This patch will add two helpers, mkfs_blocks_add() and
mkfs_blocks_remove() to properly add/remove one block dynamically from
the array.

This allows 3 things:

- Merge extent_tree_v1_blocks and extent_tree_v2_blocks into one array
  The new array will be the same as extent_tree_v1_blocks.
  For extent-tree-v2, we just dynamically add MKFS_BLOCK_GROUP_TREE.

- Remove free space tree block on-demand
  This only works for extent-tree-v1 case, as v2 has a hard requirement
  on free space tree.
  But this still make code much cleaner, not doing any special hacks.

- Allow future expansion without introduce new array
  I strongly doubt why this is not properly done in extent-tree-v2
  preparation patches.
  We should not allow bad practice to sneak in just because it's some
  preparation patches for a larger feature.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-09-12 15:31:26 +02:00
David Sterba
a33af50c52 btrfs-progs: add constant for initial getopt values
Add constant for initial value to avoid unexpected clashes with user
defined getopt values and shift the common size getopt values.

Signed-off-by: David Sterba <dsterba@suse.com>
2022-08-16 15:18:11 +02:00
Qu Wenruo
007c799ca8 btrfs-progs: mkfs: use sectorsize as nodesize fallback for mixed profiles
[BUG]
When running btrfs/011 with subpage case, even with RAID56 support, it
still fails with the following error:

 QA output created by 011
 *** test btrfs replace
 mkfs failed
 (see /home/adam/xfstests-dev/results//btrfs/011.full for details)

The full log shows:

  ---------workout "-m single -d single -M" 1 no 64-----------
  ERROR: illegal nodesize 65536 (not equal to 4096 for mixed block group)
  mkfs failed

This is a critical error, making test case to be aborted, without
checking the rest profiles.

[CAUSE]
Mkfs.btrfs always uses the maximum value between sectorsize and page
size for its mixed profile nodesize.

For subpage case, it means we always go PAGE_SIZE, no matter whatever
the sectorsize is passed in.

[FIX]
Just get rid of the direct PAGE_SIZE usage when determining nodesize for
mixed profiles.
And use sectorsize directly (either passed in by the user, or
determined from page size).

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-04-26 01:14:48 +02:00
Naohiro Aota
e273f9ebbd btrfs-progs: zoned: fix initial system BG location
Currently, we create the initial system block group in the zone 2. That
will create the BG at 64MB when the zone size is 32 MB, which collides with
the regular superblock location. It results in mount failure with:

  BTRFS info (device nullb0): zoned mode enabled with zone size 33554432
  BTRFS error (device nullb0): zoned: block group 67108864 must not contain super block
  BTRFS error (device nullb0): failed to read block groups: -117
  BTRFS error (device nullb0): open_ctree failed

Fix that by calculating the proper location of the initial system BG. It
avoids using zones reserved for zoned superblock logging and the zones
where a regular superblock resides.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-04-08 23:17:35 +02:00
Josef Bacik
659f041537 btrfs-progs: mkfs: create the global root's
Now that we have all of the supporting code, add the ability to create
all of the global roots for an extent tree v2 fs.  This will default to
nr_cpu's, but also allow the user to specify how many global roots they
would like.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-09 18:07:26 +01:00
Josef Bacik
1f89a5d461 btrfs-progs: mkfs: set chunk_item_objectid properly for extent tree v2
Our initial block group will use global root id 0 with extent tree v2,
so adjust the helper to take the chunk_objectid as an argument, as we'll
set this to 0 for extent tree v2 and then
BTRFS_FIRST_CHUNK_TREE_OBJECTID for extent tree v1.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-09 18:07:24 +01:00
Josef Bacik
02fb308bdc btrfs-progs: make btrfs_create_tree take a key for the root key
We're going to start create global roots from mkfs, and we need to have
a offset set for the root key.  Make the btrfs_create_tree() take a key
for the root_key instead of just the objectid so we can setup these new
style roots properly.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-09 18:07:22 +01:00
Josef Bacik
27eaa3b514 btrfs-progs: set the number of global roots in the super block
In order to make sure the file system is consistent we need to record
the number of global roots we should have in the super block.  We could
infer this from the number of global roots we find, however this could
lead to interesting fuzzing problems, so add a source of truth to the
super block in order to make it easier to verify the file system is
consistent.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-09 18:07:14 +01:00
Josef Bacik
7e3bf7fc44 btrfs-progs: mkfs: add support for the block group tree
Add the extent tree v2 table with the block group tree as a root, and
then create the empty root and use the proper root for cleanup up the
temporary block groups.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-09 18:07:02 +01:00
Josef Bacik
360103e610 btrfs-progs: mkfs: use the btrfs_block_group_root helper
Instead of accessing the extent root directory for modifying block
groups, use the helper which will do the correct thing based on the
flags of the file system.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-09 18:06:56 +01:00
Josef Bacik
5dc3964aaa btrfs-progs: remove the _nr from the item helpers
Now that all callers are using the _nr variations we can simply rename
these helpers to btrfs_item_##member/btrfs_set_item_##member and change
the actual item SETGET funcs to raw_item_##member/set_raw_item_##member
and then change all callers to drop the _nr part.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-09 15:13:13 +01:00
Josef Bacik
04ffea07e4 btrfs-progs: add btrfs_set_item_*_nr() helpers
We have a lot of the following patterns

	item = btrfs_item_nr(nr);
	btrfs_set_item_*(eb, item, val);

	btrfs_set_item_*(eb, btrfs_item_nr(nr), val);

in a lot of places in our code.  Instead add _nr variations of these
helpers and convert all of the users to this new helper.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-09 15:13:13 +01:00
Josef Bacik
ed098523dc btrfs-progs: reduce usage of __BTRFS_LEAF_DATA_SIZE
This helper only takes the nodesize, but in the future it'll take a bool
to indicate if we're extent tree v2.  The remaining users are all where
we only have extent_buffer, but we should always have a valid
eb->fs_info in these cases, so add BUG_ON()'s for the !eb->fs_info case
and then convert these callers to use BTRFS_LEAF_DATA_SIZE which takes
the fs_info.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-09 15:13:13 +01:00
Josef Bacik
7300aeecff btrfs-progs: store LEAF_DATA_SIZE in the mkfs_config
We use __BTRFS_LEAF_DATA_SIZE() in a few places for mkfs.  With extent
tree v2 we'll be increasing the size of btrfs_header, so it'll be kind
of annoying to add flags to all callers of __BTRFS_LEAF_DATA_SIZE, so
simply calculate it once and put it in the mkfs_config and use that.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-09 15:13:12 +01:00
Johannes Thumshirn
89191f8c12 btrfs-progs: pass in block-group type to zoned_profile_supported
Pass BTRFS_BLOCK_GROUP_DATA and BTRFS_BLOCK_GROUP_METADATA to
zoned_profile_supported(), so we can actually distinguish if it is a data
or a meta-data block group.

Fixes: 8f914d518a46 ("btrfs-progs: zoned support DUP on metadata block groups")
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-02-16 22:48:01 +01:00
Johannes Thumshirn
88895a920f btrfs-progs: use profile_supported in mkfs as well
Currently we have two places checking if a block-group profile is
supported on a zoned device, one in mkfs/main.c and one in
kernel-shared/zoned.c.

Use the one from kernel-shared/zoned.c in mkfs as well, unifying all
checks.

Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-02-01 18:41:51 +01:00
Nikolay Borisov
f59a229d2c btrfs-progs: remove redundant fs uuid validation from make_btrfs
cfg->fs_uuid is either 0 or set to the value of the -U parameter
passed to mkfs.btrfs. However the value of the latter is already being
validated in the main mkfs function. Just remove the duplicated checks
in make_btrfs as they effectively can never be executed.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-01-11 18:02:46 +01:00
Josef Bacik
3337b7993b btrfs-progs: common: allow users to select extent-tree-v2 option
We want to enable developers to test the extent tree v2 features as they
are added, add the ability to mkfs an extent tree v2 fs if we have
experimental enabled.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-11-30 19:07:34 +01:00
Josef Bacik
b057607325 btrfs-progs: track csum, extent, and free space trees in a rb tree
We are going to have multiples of these trees with extent tree v2, so
add a rb tree to track them based on their root key value.  This works
for both v1 and v2, so we can remove the direct pointers to these roots
in our fs_info.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-11-30 18:57:25 +01:00
Josef Bacik
0b23744de5 btrfs-progs: stop accessing ->free_space_root directly
We're going to have multiple free space roots in the future, so access
it via a helper in most cases.  We will address the remaining direct
accesses in future patches.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-11-30 18:57:19 +01:00
Josef Bacik
db2ab47823 btrfs-progs: stop accessing ->extent_root directly
When we switch to multiple global trees we'll need to access the
appropriate extent root depending on the block group or possibly root.
To handle this, use a helper in most places and then the actual root in
places where it is required.  We will whittle down the direct accessors
with future patches, but this does the bulk of the preparatory work.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-11-30 18:56:54 +01:00
Josef Bacik
639b1fc2e7 btrfs-progs: stop accessing ->csum_root directly
With extent tree v2 we will have per-block group checksums, so add a
helper to access the csum root and rename the fs_info csum_root to
_csum_root to catch all the places that are accessing it directly.
Convert everybody to use the helper except for internal things.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-11-22 21:45:37 +01:00
Josef Bacik
08b63c0fc5 btrfs-progs: stop passing root to csum related functions
We are going to need to start looking up the csum root based on the
bytenr with extent tree v2.  To that end stop passing the root to the
csum related functions so that can be done in the helper functions
themselves.

There's an unrelated deletion of a function prototype that no longer
exists.

Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-11-22 21:45:37 +01:00
Qu Wenruo
636b2e6027 btrfs-progs: remove temporary buffer for super block
There are a lot of call sites where we use the following code snippet:

	u8 super_block_data[BTRFS_SUPER_INFO_SIZE];
	struct btrfs_super_block *sb;
	u64 ret;

	sb = (struct btrfs_super_block *)super_block_data;

The reason for this is, structure btrfs_super_block was smaller than
BTRFS_SUPER_INFO_SIZE.

Thus for anything with csum involved, we have to use a proper 4K buffer.

Since the recent unification of sizeof(struct btrfs_super_block), we no
longer need such workaround, and can use struct btrfs_super_block
directly to do any operation.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-11-05 12:50:03 +01:00
David Sterba
c0c307c313 btrfs-progs: fix space_cache generation again when free-space-tree is enabled
In commit a138daac17 ("btrfs-progs: mkfs: set super_cache_generation
to 0 if we're using free space tree") the space cache (v1) generation
was reset to 0 to let kernel know it's not used when the free-space-tree
is enabled.

This got broken again in 5.14 when the free space tree code got
refactored in 4b6cf2a3eb ("btrfs-progs: mkfs: generate free space tree
at make_btrfs() time").

Reset the space cache generation to 0.

Issue: #414
Signed-off-by: David Sterba <dsterba@suse.com>
2021-11-05 12:50:03 +01:00
Qu Wenruo
0befc6dce2 btrfs-progs: mkfs: recow all tree blocks properly
[BUG]
Since btrfs-progs v5.14, mkfs.btrfs no longer cleans up the temporary
SINGLE metadata chunks if "-R free-space-tree" is specified:

 $ mkfs.btrfs  -f -R free-space-tree -m dup -d dup /dev/test/test
 $ btrfs ins dump-tree -t chunk /dev/test/test | grep "type METADATA"
		length 8388608 owner 2 stripe_len 65536 type METADATA
		length 268435456 owner 2 stripe_len 65536 type METADATA|DUP

[CAUSE]
Since commit 4b6cf2a3eb ("btrfs-progs: mkfs: generate free space tree
at make_btrfs() time"), free space tree is created when the temporary
btrfs image is created.

This behavior itself has no problem at all.  The problem happens when
"-m DUP -d DUP" (or other profiles) is specified.

This makes btrfs to create extra chunks, enlarging free space tree so
that it can be as high as level 1.

During mkfs, we rely on recow_roots() to re-COW all tree blocks to the
newly allocated chunks.

But __recow_root() can only handle tree root at level 0, as it forces
root node to be COWed, not bothering the children leaves/nodes.

This makes part of the free space cache tree still live on the old
temporary chunks, leaving later cleanup_temp_chunks() unable to delete
temporary SINGLE chunks.

[FIX]
Rework __recow_root() to do a proper COW of the whole tree.

But above rework is not enough, as if a free space tree block is
allocated during current transaction, but before new chunks added.
Then the reworked __recow_root() can't COW it, as btrfs_search_slot()
won't COW a tree block allocated in current transaction.

So this patch will also commit current transaction before calling
recow_roots(), to force us to re-cow all tree blocks.

This shouldn't be a problem, as at the time of calling, we should have
less than a dozen tree blocks, thus there won't be a performance impact.

Reported-by: FireFish5000 <firefish5000@gmail.com>
Fixes: 4b6cf2a3eb ("btrfs-progs: mkfs: generate free space tree at make_btrfs() time")
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-20 18:59:24 +02:00
Naohiro Aota
e9696b06f0 btrfs-progs: use direct-io for zoned device
We need to use direct-IO for zoned devices to preserve the write
ordering.  Instead of detecting if the device is zoned or not, we simply
use direct-IO for any kind of device (even if emulated zoned mode on a
regular device).

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-20 18:59:23 +02:00
Naohiro Aota
c821e5545f btrfs-progs: introduce btrfs_pwrite wrapper for pwrite
Wrap pwrite with btrfs_pwrite(). It simply calls pwrite() on non-zoned
btrfs (opened without O_DIRECT). On zoned mode (opened with O_DIRECT),
it allocates an aligned bounce buffer, copies the contents and uses it
for direct-IO writing.

Writes in device_zero_blocks() and btrfs_wipe_existing_sb() are a little
tricky. We don't have fs_info on our hands, so use zinfo to determine it
is a zoned device or not.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-20 18:59:23 +02:00
Naohiro Aota
40ab7530df btrfs-progs: set eb::fs_info properly everywhere
Several extent_buffer initializations miss fs_info initialization. This
is OK before the following patch ("btrfs-progs: use direct-io for zoned
device") as eb->fs_info is not always necessary. But, after that patch,
we will use fs_info to determine it is zoned or not and that causes
segfault in such cases.

Properly set fs_info when initializing extent_buffers to fix the issue.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-08 20:47:04 +02:00
Naohiro Aota
c9c36aaf5b btrfs-progs: mkfs: do not set zone size on non-zoned mode
Since zone_size() returns an emulated zone size even for non-zoned
device, we cannot use cfg.zone_size to determine the device is zoned or
not. Set zone_size = 0 on non-zoned mode.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-08 20:47:04 +02:00
David Sterba
522945efc8 btrfs-progs: remove unused prototypes from send-utils.h
The functions are part of path-utils.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-08 20:47:04 +02:00
David Sterba
5f1a09198f btrfs-progs: mkfs: print notice about 5.15 changes in defaults
Changing several defaults at once is desirable for easier reference,
rather than a number of scattered releases enabling each. The changes
are documented but printing a notice won't hurt as not everybody reads
the documentation or release notes.

Undesired features can be unselected by prepending ^ to the option name,
like:

   $ mkfs.btrfs -O ^no-holes

Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-08 20:47:03 +02:00
David Sterba
65181c273e btrfs-progs: mkfs: don't autoselect DUP on SSD for metadata anymore
The original idea of not doing DUP on SSD was that the duplicate blocks
get deduplicated again by the driver firmware. This was in 2013, years
ago. Then it was speculative and even nowadays we don't have much
reliable information from vendors what optimizations are done on the
drive level.

After the year there's enough information gathered by user community and
there's no simple answer. Expensive drives are more reliable but less
common, for cheap consumer drive it's vice versa. The characteristics
are described in more detail in manual page btrfs(5) in section "SOLID
STATE DRIVES (SSD)".

The reasoning is based on numerous reports on IRC and technical
difficulty on mkfs side to do the right decision. The default is chosen
to be the safe option and up to user to change that based on informed
decision.

Issue: #319
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-08 20:47:03 +02:00
David Sterba
72260bc3b9 btrfs-progs: mkfs: enable space_cache=v2 (free-space-tree) by default
The free space tree is a better way to track the free space and has been
tested in the wild for a long time. The backward compatibility is
sufficient, several long term kernels. On-line conversion from v1 to v2
can be done by mount, switching from v2 to v1 can be done by 'btrfs
check'.

Issue: #295
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-08 20:47:03 +02:00
David Sterba
c1fdf2f20d btrfs-progs: mkfs: switch status variables to bool
There are many variables for on/off tracking, use bool instead of int.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-08 20:46:36 +02:00
David Sterba
785218efb1 btrfs-progs: remove direct calls to crc32c from ctree.h
Make the helpers using crc32c not inline so the crc32c.h can be removed
from the public headers exported by libbtrfs.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-08 20:46:35 +02:00
David Sterba
5664631b5b btrfs-progs: clean up test_uuid_unique
Move the declaration to the right header, constify argument and document
the function.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-08 20:46:33 +02:00
David Sterba
64ed09c318 btrfs-progs: mkfs: switch to global verbosity options
Use the bconf verbosity so that other code outside of main.c respects
the command line settings.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-06 16:49:42 +02:00
David Sterba
2814ec6b1f btrfs-progs: mkfs: add option -v/--verbose
The default output of mkfs is intentionally verbose so we did not need
the verbosity option. For some additional information it could be useful
to increase the level in case it's wired to the global verbosity
settings.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-06 16:49:39 +02:00
David Sterba
76ab1fa364 btrfs-progs: rename and move group_profile_max_safe_loss
The helper belongs to the others that translate bg flags to the raid
attr table member.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-09-07 16:38:56 +02:00
David Sterba
c3ee6a8a09 btrfs-progs: unify GPL header comments
Add the GPL v2 header to files where it was missing and is not from an
external source, update to the most recent version with the address.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-09-07 13:58:44 +02:00
David Sterba
0c3991e171 btrfs-progs: mkfs: use common parser of bg profiles
Replace open-coded profile parser with the common one.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-09-07 13:58:44 +02:00
David Sterba
af56460de8 btrfs-progs: split parsing helpers from utils.c
There are various parsing helpers scattered everywhere, unify them to
one file and start with helpers already in utils.c.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-09-06 17:15:51 +02:00
David Sterba
7572839a74 btrfs-progs: add and use bit masks for RAID1 and RAID56 profiles
Many test conditions can be simplified in case they check all the
related profiles.

Signed-off-by: David Sterba <dsterba@suse.com>
2021-09-06 16:36:18 +02:00
Josef Bacik
79e534def9 btrfs-progs: add the incompat flag for extent tree v2
I will have a lot of preparatory patches to reduce the review pain of
this large feature.  In order to enable that work define the incompat
flag.  Once all of the work lands to support the feature there will be a
patch to actually enable us to select it and manipulate file systems
with that incompat flag set.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-09-06 16:36:17 +02:00
Josef Bacik
4b6cf2a3eb btrfs-progs: mkfs: generate free space tree at make_btrfs() time
With extent-tree-v2 we won't be able to cache block groups based on the
extent tree, so we need to have a valid free space tree before we open
the temporary file system to finish setting the file system up.  Set up
the basic free space entries for our temporary system chunk if we have
the free space tree enabled and stop generating the tree after the fact.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-09-06 16:36:17 +02:00