The function btrfs_mksubvol() is very different between btrfs-progs and
kernel, the former version is really just linking a subvolume to another
directory inode, but the kernel version is really to make a completely
new subvolume.
Instead of same-named function, introduce btrfs_link_subvolume() and use
it to replace the old btrfs_mksubvol().
This is done by:
- Introduce btrfs_link_subvolume()
Which does extra checks before doing any modification:
* Make sure the target inode is a directory
* Make sure no filename conflict
Then do the linkage:
* Add the dir_item/dir_index into the parent inode
* Add the forward and backward root refs into tree root
- Introduce link_image_subvolume() helper
Currently btrfs_mksubvol() has a dedicated convert filename retry
behavior, which is unnecessary and should be done by the convert code.
Now move the filename retry behavior into the helper.
- Remove btrfs_mksubvol()
Since there is only one caller utilizing btrfs_mksubvol(), and it's
now gone, we can remove the old btrfs_mksubvol().
Signed-off-by: Qu Wenruo <wqu@suse.com>
There are two different subvolume/data reloc tree creation routines:
- create_subvol() from convert/main.c
* calls btrfs_copy_root() to create an empty root
This is not safe, as it relies on the source root to be empty.
* calls btrfs_read_fs_root() to add it to the cache and trace it
properly
* calls btrfs_make_root_dir() to initialize the empty new root
- create_data_reloc_tree() from mkfs/main.c
* calls btrfs_create_tree() to create an empty root
* Manually add the root to fs_root cache
This is only safe for data reloc tree as it's never updated
inside btrfs-progs.
But not safe for other subvolume trees.
* manually setup the root dir
Both have their good and bad aspects, so here we introduce a new helper,
btrfs_make_subvolume():
- Calls btrfs_create_tree() to create an empty root
- Calls btrfs_read_fs_root() to setup the cache and tracking properly
- Calls btrfs_make_root_dir() to initialize the root dir
- Calls btrfs_update_root() to reflect the rootdir change
So this new helper can replace both create_subvol() and
create_data_reloc_tree().
Signed-off-by: Qu Wenruo <wqu@suse.com>
Commit 4db925911c ("btrfs-progs: use strncpy_null everywhere")
replaced strncpy with strncpy_null, the maximum xattr name length is 255
(current limit), the target buffer is large enough for the whole size so
make sure the last character is also copied.
Signed-off-by: David Sterba <dsterba@suse.com>
Use the safe version of strncpy that makes sure the string is
terminated.
To be noted:
- the conversion in scrub path handling was skipped
- sizes of device paths in some ioctl related structures is
BTRFS_DEVICE_PATH_NAME_MAX + 1
Recently gcc 13.3 started to detect problems with our use of strncpy
potentially lacking the null terminator, warnings like:
cmds/inspect.c: In function ‘cmd_inspect_logical_resolve’:
cmds/inspect.c:294:33: warning: ‘__builtin_strncpy’ specified bound 4096 equals destination size [-Wstringop-truncation]
294 | strncpy(mount_path, mounted, PATH_MAX);
| ^
Signed-off-by: David Sterba <dsterba@suse.com>
Now that there's only __strncpy_null we can drop the underscore and move
it to string-utils as it's a generic string function rather than
something for paths.
Signed-off-by: David Sterba <dsterba@suse.com>
The label is of a fixed size 256 bytes and expects the zero terminator.
Using __strncpy_null is correct as it makes sure there's always the zero
termination but the argument passed in skips the last character.
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
There is a bug report that btrfs-convert cannot handle unwritten extents
(EXT2_EXTENT_FLAGS_UNINIT set, which is pretty much the same as
BTRFS_FILE_EXTENT_PREALLOC), which can cause the converted image to have
incorrect contents.
[CAUSE]
Currently we use ext2fs_block_iterate2() to go through all data extents
of an ext2 inode, but it doesn't provide the info on if the range is
unwritten or not.
Thus for unwritten extents, the results btrfs would just treat it as
regular extents, and read the contents from disk other than setting the
contents to zero.
[FIX]
Instead of the ext2fs_block_iterate2(), here we follow the debugfs'
"dump_extents" command, to use ext2fs_extent_*() helpers to go through
every data extent of the inode, that's if the inode supports the
EXT4_EXTENTS_FL flag.
Now we can properly get the info of which extents are unwritten, and use
holes to replace those unwritten extents.
Reported-by: Yordan <y16267966@gmail.com>
Link: https://lore.kernel.org/all/d34c7d77a7f00c93bea6a4d6e83c7caf.mailbg@mail.bg/
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
There is a new release candidate for e2fsprogs https://github.com/tytso/e2fsprogs/releases/tag/v1.47.1-rc2
Linking btrfs-progs v6.8 against this version of e2fsprogs leads to the following compile error:
convert/source-ext2.c: In function 'ext4_copy_inode_timespec_extra':
convert/source-ext2.c:733:13: warning: implicit declaration of function 'inode_includes' [-Wimplicit-function-declaration]
733 | if (inode_includes(inode_size, i_ ## xtime ## _extra)) { \
| ^~~~~~~~~~~~~~
convert/source-ext2.c:769:9: note: in expansion of macro 'EXT4_COPY_XTIME'
769 | EXT4_COPY_XTIME(atime, dst, tv_sec, tv_nsec);
| ^~~~~~~~~~~~~~~
convert/source-ext2.c:733:40: error: 'i_atime_extra' undeclared (first use in this function)
733 | if (inode_includes(inode_size, i_ ## xtime ## _extra)) { \
| ^~
convert/source-ext2.c:769:9: note: in expansion of macro 'EXT4_COPY_XTIME'
769 | EXT4_COPY_XTIME(atime, dst, tv_sec, tv_nsec);
| ^~~~~~~~~~~~~~~
convert/source-ext2.c:733:40: note: each undeclared identifier is reported only once for each function it appears in
733 | if (inode_includes(inode_size, i_ ## xtime ## _extra)) { \
| ^~
convert/source-ext2.c:769:9: note: in expansion of macro 'EXT4_COPY_XTIME'
769 | EXT4_COPY_XTIME(atime, dst, tv_sec, tv_nsec);
| ^~~~~~~~~~~~~~~
convert/source-ext2.c:733:40: error: 'i_mtime_extra' undeclared (first use in this function)
733 | if (inode_includes(inode_size, i_ ## xtime ## _extra)) { \
| ^~
convert/source-ext2.c:770:9: note: in expansion of macro 'EXT4_COPY_XTIME'
770 | EXT4_COPY_XTIME(mtime, dst, tv_sec, tv_nsec);
| ^~~~~~~~~~~~~~~
convert/source-ext2.c:733:40: error: 'i_ctime_extra' undeclared (first use in this function)
733 | if (inode_includes(inode_size, i_ ## xtime ## _extra)) { \
| ^~
convert/source-ext2.c:771:9: note: in expansion of macro 'EXT4_COPY_XTIME'
771 | EXT4_COPY_XTIME(ctime, dst, tv_sec, tv_nsec);
| ^~~~~~~~~~~~~~~
convert/source-ext2.c:774:40: error: 'i_crtime_extra' undeclared (first use in this function)
774 | if (inode_includes(inode_size, i_crtime_extra)) {
| ^~~~~~~~~~~~~~
from tytso/e2fsprogs@ca8bc92
Fix inode_includes() macro to properly wrap "inode" parameter,
and rename to ext2fs_inode_includes() to avoid potential name
clashes. Use this to check inode field inclusion in debugfs
instead of bare constants for inode field offsets.
To fix that use the new prefixed macro and add backward compatibility that
would still use inode_includes().
Issue: #785
Signed-off-by: David Sterba <dsterba@suse.com>
Use the objectid, type, offset natural order as it's more readable and
we're used to read keys like that.
Signed-off-by: David Sterba <dsterba@suse.com>
This patch introduces a new parser helper, parse_u64_with_suffix(),
which has a better error handling, following all the parse_*()
helpers to return non-zero value for errors.
This new helper is going to replace parse_size_from_string(), which
would directly call exit(1) to stop the whole program.
Furthermore most callers of parse_size_from_string() are expecting
exit(1) for error, so that they can skip the error handling.
For those call sites, introduce a wrapper, arg_strtou64_with_suffix(),
to do that. The only disadvantage is a little less detailed error
report for why the parse failed, but for most cases the generic error
string should be enough.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
There is a report about failed btrfs-convert, which shows the following
error:
Create btrfs metadata
corrupt leaf: root=5 block=5001931145216 slot=1 ino=89911763, invalid previous key objectid, have 89911762 expect 89911763
leaf 5001931145216 items 336 free space 7 generation 90 owner FS_TREE
leaf 5001931145216 flags 0x1(WRITTEN) backref revision 1
fs uuid 8b69f018-37c3-4b30-b859-42ccfcbe2449
chunk uuid 448ce78c-ea41-49f6-99dc-46ad80b93da9
item 0 key (89911762 INODE_REF 3858733) itemoff 16222 itemsize 61
index 171 namelen 51 name: [FILENAME1]
item 1 key (89911763 INODE_REF 3858733) itemoff 16161 itemsize 61
index 103 namelen 51 name: [FILENAME2]
[CAUSE]
When iterating a directory, btrfs-convert would insert the DIR_ITEMs,
along with the INODE_REF of that inode.
This leads to above stray INODE_REFs, and trigger the tree-checker.
This can only happen for large fs, as for most cases we have all these
modified tree blocks cached, thus tree-checker won't be triggered.
But when the tree block cache is not hit, and we have to read from disk,
then such behavior can lead to above tree-checker error.
[FIX]
Insert a dummy INODE_ITEM for the INODE_REF first, the inode items would
be updated when iterating the child inode of the directory.
Issue: #731
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Although kernel scrub code has been updated to handle the unaligned
chunk length, there is also no harm if we can allocate data chunk with
both start and length aligned.
This patch handles this by rounding up the end bytenr when allocating
data chunks for the conversion.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
There is a report about failed btrfs-convert, which shows the following
error:
corrupt leaf: root=5 block=5001928998912 slot=1 ino=89911763, invalid previous key objectid, have 89911762 expect 89911763
ERROR: failed to copy ext2 inode 89911320: -5
ERROR: error during copy_inodes -5
WARNING: error during conversion, the original filesystem is not modified
[CAUSE]
Above error is triggered when checking the following items inside a
subvolume:
- inode ref
- dir item/index
- file extent
- xattr
This is to make sure these items have correct previous key.
However btrfs-convert is not following this requirement, it always
inserts those items first, then creates a btrfs_inode for it.
Thus it can lead to the error.
This can only happen for large fs, as for most cases we have all these
modified tree blocks cached, thus tree-checker won't be triggered.
But when the tree block cache is not hit, and we have to read from disk,
then such behavior can lead to above tree-checker error.
[FIX]
Make sure we insert the inode item first, then the file extents/dir
items/xattrs. And after the file extents/dir items/xattrs inserted, we
update the existing inode (to update its size and bytes).
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The clear-cache functionality is shared by several commands:
- btrfs check
For --clear-cache and --clear-ino-cache.
- btrfstune
Mostly for block-group-tree feature conversion.
- btrfs-convert
To enable the now default v2 space cache.
Thus it's no longer proper to keep clear-cache.[ch] under check/
directory, move them to common/ directory.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This function and it's related functions only exist for the utilities
that populate existing file systems, and do not exist in the upstream
kernel. Move this function and the related function into it's own
common source file and out of the kernel-shared sources, and then update
all of the users to include the new location of this code.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This simply zero's out the path, and this is used everywhere we use a
stack path. Drop this usage and simply init the path's to empty instead
of using a function to do the memset.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
We got some test failures related to btrfs-convert with subpage, e.g.
btrfs/012, the failure would cause the following dmesg:
BTRFS warning (device nvme0n1p7): v1 space cache is not supported for page size 16384 with sectorsize 4096
BTRFS error (device nvme0n1p7): open_ctree failed
[CAUSE]
v1 space cache has tons of hard coded PAGE_SIZE usage, and considering
v2 space cache is going to replace it (which is already the new default
since v5.15 btrfs-progs), thus for btrfs subpage support, we just simply
reject the v1 space cache, and utilize v2 space cache when possible.
But there is special catch in btrfs-convert, although we're specifying
v2 space cache as the new default for btrfs-convert, it doesn't really
follow the specification at all.
Thus the converted filesystem will still go v1 space cache.
[FIX]
It can be a huge change to btrfs-convert to make the initial btrfs image
to support v2 cache.
Thus this patch would change the fs at the final stage, just before we
finalize the btrfs.
This patch would drop all the v1 cache created, then call
btrfs_create_free_space_tree() to populate the free space tree and
commit the superblock with needed compat_ro flags.
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This patch would modify btrfs_csum_file_block() to handle csum type
other than the one used in the current fs.
The new data checksum would use a different objectid (-13) to
distinguish with the existing one (-10).
This needs to change tree-checker to skip the item size checks,
since new csum can be larger than the original csum.
After this stage, the resulted csum tree would look like this:
item 0 key (CSUM_CHANGE EXTENT_CSUM 13631488) itemoff 8091 itemsize 8192
range start 13631488 end 22020096 length 8388608
item 1 key (EXTENT_CSUM EXTENT_CSUM 13631488) itemoff 7067 itemsize 1024
range start 13631488 end 14680064 length 1048576
Note the itemsize is 8 times the original one, as the original csum is
CRC32, while target csum is SHA256, which is 8 times the size.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
There is a report that btrfs-convert leads to bad csum for the image
file.
The reproducer looks like this:
(note the 64K block size, it's used to force a certain chunk layout)
# touch test.img
# truncate -s 10G test.img
# mkfs.ext4 -b 64K test.img
# btrfs-convert -N 64K test.img
# btrfs check --check-data-csum test.img
Opening filesystem to check...
Checking filesystem on /home/adam/test.img
UUID: 39d49537-a9f5-47f1-b6ab-7857707b9133
[1/7] checking root items
[2/7] checking extents
[3/7] checking free space cache
[4/7] checking fs roots
[5/7] checking csums against data
mirror 1 bytenr 4563140608 csum 0x3f1fa0ef expected csum 0xa4c4c072
mirror 1 bytenr 4563206144 csum 0x55dcf0d3 expected csum 0xa4c4c072
mirror 1 bytenr 4563271680 csum 0x4491b00a expected csum 0xa4c4c072
mirror 1 bytenr 4563337216 csum 0x655d1f61 expected csum 0xa4c4c072
mirror 1 bytenr 4563402752 csum 0xd37114d3 expected csum 0xa4c4c072
mirror 1 bytenr 4563468288 csum 0x4c2dab30 expected csum 0xa4c4c072
mirror 1 bytenr 4563533824 csum 0xa80fceed expected csum 0xa4c4c072
mirror 1 bytenr 4563599360 csum 0xaf610db8 expected csum 0xa4c4c072
mirror 1 bytenr 4563795968 csum 0x67b3c8a0 expected csum 0xa4c4c072
ERROR: errors found in csum tree
[6/7] checking root refs
...
[CAUSE]
Above initial failure is for logical bytenr of 4563140608, which is
inside the relocated range of the image file offset [0, 1M).
During convert, we migrate the original image file ranges which would
later be covered by super and other reserved ranges.
The migration happens as:
- Read out the original data
- Reserve a new file extent
- Write the data back to the file extent
Note that, the new file extent can be inside some new data chunks,
thus it's no longer 1:1 mapped.
- Generate the new csum for the new file extent
The problem happens at the last stage. We should read out the data from
the new file extent, but we call read_disk_extent() using the logical
bytenr, however read_disk_extent() is not doing logical -> physical
mapping.
Thus we will read some garbage, not the newly written data, and use
those garbage to generate csum. And caused the above problem.
[FIX]
Instead of read_disk_extent(), call read_data_from_disk(), which would
do the proper logical -> physical mapping, thus would fix the bug.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The function write_and_map_eb() is quite abused as a way to write any
generic buffer back to disk.
But we have a more suitable function already, write_data_to_disk().
This patch would remove the abused write_data_to_disk() calls, and
convert the only three valid call sites to write_data_to_disk() instead.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This patch syncs file-item.h into btrfs-progs. This carries with it an
API change for btrfs_del_csums, which takes a root argument in the
kernel, so all callsites have been updated accordingly.
I didn't sync file-item.c because it carries with it a bunch of bio
related helpers which are difficult to adapt to the kernel.
Additionally there's a few helpers in the local copy of file-item.c that
aren't in the kernel that are required for different tools.
This requires more cleanups in both the kernel and progs in order to
sync file-item.c, so for now just do file-item.h in order to pull things
out of ctree.h.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We want to keep this file locally as we want to be uptodate with
upstream, so we can build btrfs-progs regardless of which kernel is
currently installed. Sync this with the upstream version and put it in
kernel-shared/uapi to maintain some semblance of where this file comes
from.
There are some changes that need to be synced back to kernel. A local
definition of static_assert is used to avoid compilation problems on gcc
(< 9) due to mandatory 2nd parameter.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
While syncing messages.[ch] I had to back out the ASSERT() code in
kerncompat.h, which means we now rely on the kernel code for ASSERT().
In order to maintain some semblance of separation introduce UASSERT()
and use that in all the purely userspace code.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
Since e2fsprog 1.47, even with a newly created empty ext4 filesystem,
btrfs-convert would result an fs that btrfs-check would complain:
# mkfs.ext4 -F test.img
# btrfs-convert test.img
# btrfs-check test.img
Opening filesystem to check...
Checking filesystem on test.img
UUID: e45da158-8967-4e4d-9c9f-66b0d127dbce
[1/7] checking root items
[2/7] checking extents
[3/7] checking free space cache
[4/7] checking fs roots
root 5 inode 266 errors 2000, link count wrong
ERROR: errors found in fs roots
found 26333184 bytes used, error(s) found <<<
total csum bytes: 25540
total tree bytes: 180224
total fs tree bytes: 49152
total extent tree bytes: 16384
btree space waste bytes: 145423
file data blocks allocated: 33947648
referenced 26284032
[CAUSE]
Ext4 has a new compat feature, COMPAT_ORPHAN_FILE, as a better way to
track all the orphan inodes.
This new feature would create a new special inode for this purpose, and
such orphan file inode would not be reachable from any other inode, but
only from super block.
Unfortunately btrfs-convert only skip ext2 known special inodes, not the
newer one.
[FIX]
According to the kernel document, we can locate the orphan file inode
using ext2 super block s_orphan_file_inum, and skip it for
btrfs-convert.
And such skip would only happen if we have the definition of
EXT4_FEATURE_COMPAT_ORPHAN_FILE, to be compatible with older e2fsprogs.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Prepare a single location that will detect or set accelerated versions
of hash algorithms. Right now it's the crc32c, blake2 and sha256 do
an if-else switch while crc32c sets a function pointer.
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
Even with chunk_objectid bug fixed, mkfs.btrfs can still caused stack
overflow when enabling extent-tree-v2 feature (need experimental
features enabled):
# ./mkfs.btrfs -f -O extent-tree-v2 ~/test.img
btrfs-progs v5.19.1
See http://btrfs.wiki.kernel.org for more information.
ERROR: superblock magic doesn't match
NOTE: several default settings have changed in version 5.15, please make sure
this does not affect your deployments:
- DUP for metadata (-m dup)
- enabled no-holes (-O no-holes)
- enabled free-space-tree (-R free-space-tree)
Label: (null)
UUID: 205c61e7-f58e-4e8f-9dc2-38724f5c554b
Node size: 16384
Sector size: 4096
Filesystem size: 512.00MiB
Block group profiles:
Data: single 8.00MiB
Metadata: DUP 32.00MiB
System: DUP 8.00MiB
SSD detected: no
Zoned device: no
=================================================================
[... Skip full ASAN output ...]
==65655==ABORTING
[CAUSE]
For experimental build, we have unified feature output, but the old
buffer size is only 64 bytes, which is too small to cover the new full
feature string:
extref, skinny-metadata, no-holes, free-space-tree, block-group-tree, extent-tree-v2
Above feature string is already 84 bytes, over the 64 on-stack memory
size.
This can also be proved by the ASAN output:
==65655==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffc4e03b1d0 at pc 0x7ff0fc05fafe bp 0x7ffc4e03ac60 sp 0x7ffc4e03a408
WRITE of size 17 at 0x7ffc4e03b1d0 thread T0
#0 0x7ff0fc05fafd in __interceptor_strcat /usr/src/debug/gcc/libsanitizer/asan/asan_interceptors.cpp:377
#1 0x55cdb7b06ca5 in parse_features_to_string common/fsfeatures.c:316
#2 0x55cdb7b06ce1 in btrfs_parse_fs_features_to_string common/fsfeatures.c:324
#3 0x55cdb7a37226 in main mkfs/main.c:1783
#4 0x7ff0fbe3c28f (/usr/lib/libc.so.6+0x2328f)
#5 0x7ff0fbe3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
#6 0x55cdb7a2cb34 in _start ../sysdeps/x86_64/start.S:115
[FIX]
Introduce a new macro, BTRFS_FEATURE_STRING_BUF_SIZE, along with a new
sanity check helper, btrfs_assert_feature_buf_size().
The problem is I can not find a build time method to verify
BTRFS_FEATURE_STRING_BUF_SIZE is large enough to contain all feature
names, thus have to go the runtime function to do the BUG_ON() to verify
the macro size.
Now the minimal buffer size for experimental build is 138 bytes, just
bump it to 160 for future expansion.
And if further features go beyond that number, mkfs.btrfs/btrfs-convert
will immediately crash at that BUG_ON(), so we can definitely detect it.
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Tested-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
Commit "btrfs-progs: prepare merging compat feature lists" tries to
merged "-O" and "-R" options, as they don't correctly represents
btrfs features.
But that commit caused the following bug during mkfs for experimental
build:
$ mkfs.btrfs -f -O block-group-tree /dev/nvme0n1
btrfs-progs v5.19.1
See http://btrfs.wiki.kernel.org for more information.
ERROR: superblock magic doesn't match
ERROR: illegal nodesize 16384 (not equal to 4096 for mixed block group)
[CAUSE]
Currently btrfs_parse_fs_features() will return a u64, and reuse the
same u64 for both incompat and compat RO flags for experimental branch.
This can easily leads to conflicts, as
BTRFS_FEATURE_INCOMPAT_MIXED_BLOCK_GROUP and
BTRFS_FEATURE_COMPAT_RO_BLOCK_GROUP_TREE both share the same bit
(1 << 2).
Thus for above case, mkfs.btrfs believe it has set MIXED_BLOCK_GROUP
feature, but what we really want is BLOCK_GROUP_TREE.
[FIX]
Instead of incorrectly re-using the same bits in btrfs_feature, split
the old flags into 3 flags:
- incompat_flag
- compat_ro_flag
- runtime_flag
The first two flags are easy to understand, the corresponding flag of
each feature.
The last runtime_flag is to compensate features which doesn't have any
on-disk flag set, like QUOTA and LIST_ALL.
And since we're no longer using a single u64 as features, we have to
introduce a new structure, btrfs_mkfs_features, to contain above 3
flags.
This also mean, things like default mkfs features must be converted to
use the new structure, thus those old macros are all converted to
const static structures:
- BTRFS_MKFS_DEFAULT_FEATURES + BTRFS_MKFS_DEFAULT_RUNTIME_FEATURES
-> btrfs_mkfs_default_features
- BTRFS_CONVERT_ALLOWED_FEATURES -> btrfs_convert_allowed_features
And since we're using a structure, it's not longer as easy to implement
a disallowed mask.
Thus functions with @mask_disallowed are all changed to using
an @allowed structure pointer (which can be NULL).
Finally if we have experimental features enabled, all features can be
specified by -O options, and we can output a unified feature list,
instead of the old split ones.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
There are cases where the BUG_ON should be replaced by error
handling as it's validating the data from the source filesystem or
possibility to convert. The unconverted cases are asserts and will be
replaced later.
Signed-off-by: David Sterba <dsterba@suse.com>
The (unsigned long long) type casts can be dropped, printf understands
%llu and u64 and does not warn. In cases where the type is not u64 keep
the cast.
Signed-off-by: David Sterba <dsterba@suse.com>
The logic at the beginning of this function to handle reserved ranges
was pretty complex and hard to follow. By refactoring it to use the
existing intersect_with_reserved() function, we can remove most of the
comparisons and boolean operators while preserving the exact same logic.
This change is only for readability. It does not change the logic itself
at all.
Author: Thomas Hebb <tommyhebb@gmail.com>
Pull-request: #494
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We currently open code a similar operation in create_image_file_range().
By exposing intersect_with_reserved() outside of source-fs.c and
slightly changing its semantics to return the entire range instead of
just the end address, we can reuse it in create_image_file_range().
Author: Thomas Hebb <tommyhebb@gmail.com>
Pull-request: #494
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When checking if the requested range starts in a valid region but later
hits a reserved range, we require the reserved range to end before the
requested one does.
This is incorrect. Since we're going to truncate the requested range
anyway, we want this check to pass even if the requested range ends
partway through a reserved range.
Fix the issue by checking against the reserved range's start address
instead of its end.
Luckily, I don't believe this bug makes a difference in the current code
path, since the range we pass to this function never ends before the end
of the filesystem.
Issue: #297
Issue: #349
Author: Thomas Hebb <tommyhebb@gmail.com>
Pull-request: #494
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
intersect_with_reserved() currently succeeds if (bytenr + num_bytes) is
greater than or equal to the first address in the range, assuming that
bytenr is also not past the end of the range.
This is wrong. (bytenr + num bytes) is one byte past the last address in
the range we're checking, meaning that our range only overlaps the
reserved range if it's strictly greater than the reserved range's start
address.
For example, imagine a range at 0x3000 with length 0x1000 that we're
checking against a reserved range that starts at 0x4000. The addresses
in our range are 0x3000-0x3fff: it doesn't overlap. But the current
check, (0x3000 + 0x1000 >= 0x4000), will erroneously pass.
Fix the issue by changing >= to >.
Issue: #297
Issue: #349
Author: Thomas Hebb <tommyhebb@gmail.com>
Pull-request: #494
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This is currently defined in source-fs.h, but main.c uses it far more
than source-fs.c does. Put it in common.h instead, since it's a useful
standalone type.
Author: Thomas Hebb <tommyhebb@gmail.com>
Pull-request: #494
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The preferred order:
- system headers
- standard headers
- libraries
- kernel library
- kernel shared
- common headers
- other tools
- own headers
Signed-off-by: David Sterba <dsterba@suse.com>
On large (blockcount > 32bit) filesystems reading directly
super_block->s_blocks_count is not sufficient as the block count is held
in 2 separate 32 bit variables. Instead always use the provided
ext2fs_blocks_count to read the value. This can result in assertion
failure, when the block count is only held in the high 32 bits, in this
case s_block_counts would be zero, which would result in
btrfs_convert_context::block_count/total_bytes to also be 0 and hit an
assertion failure:
convert/main.c:1162: do_convert: Assertion `cctx.total_bytes != 0` failed, value 0
btrfs-convert(+0xffb0)[0x557defdabfb0]
btrfs-convert(main+0x6c5)[0x557defdaa125]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xea)[0x7f66e1f8bd0a]
btrfs-convert(_start+0x2a)[0x557defdab52a]
Aborted
What's worse it can also result in btrfs-convert mistakenly thinking
that a filesystem is smaller than it actually is (ignoring the top 32 bits).
Link: https://lore.kernel.org/linux-btrfs/023b5ca9-0610-231b-fc4e-a72fe1377a5a@jansson.tech/
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Add constant for initial value to avoid unexpected clashes with user
defined getopt values and shift the common size getopt values.
Signed-off-by: David Sterba <dsterba@suse.com>
Creating a simple directory structure leads to the following error:
$ btrfs check
Checking filesystem on test.img
UUID: 8f2292ad-c80e-4ab4-8a72-29aa3a83002c
[1/7] checking root items
[2/7] checking extents
[3/7] checking free space cache
[4/7] checking fs roots
unresolved ref dir 260 index 0 namelen 2 name .. filetype 0 errors 3, no dir item, no dir index
ERROR: errors found in fs roots
found 101085184 bytes used, error(s) found
total csum bytes: 98460
total tree bytes: 262144
total fs tree bytes: 49152
total extent tree bytes: 16384
btree space waste bytes: 151864
file data blocks allocated: 167931904
referenced 167931904
The self-reference should exist for the toplevel directory, where the
parent directory points to itself.
Issue: #453
Author: tyan0
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
When running some tests, I notice that my debug build of btrfs-convert
is throwing out garbage for target fs label:
$ ./btrfs-convert ~/test.img
btrfs-convert from btrfs-progs v5.17
Source filesystem:
Type: ext2
Label:
Blocksize: 4096
UUID: 29d159a8-cb46-41d3-8089-3c5c65e4afae
Target filesystem:
Label: @pcwU <<< Garbage here
Blocksize: 4096
Nodesize: 16384
UUID: 682bf5f2-8cb1-4390-b9ac-6883cd87ed39
Checksum: crc32c
...
[CAUSE]
The fslabel[] array is just not initialized, thus it can contain
garbage.
[FIX]
Initialize fslabel[] array to all zero.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Now that all callers are using the _nr variations we can simply rename
these helpers to btrfs_item_##member/btrfs_set_item_##member and change
the actual item SETGET funcs to raw_item_##member/set_raw_item_##member
and then change all callers to drop the _nr part.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We have a lot of the following patterns
item = btrfs_item_nr(nr);
btrfs_set_item_*(eb, item, val);
btrfs_set_item_*(eb, btrfs_item_nr(nr), val);
in a lot of places in our code. Instead add _nr variations of these
helpers and convert all of the users to this new helper.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>