This patch syncs file-item.h into btrfs-progs. This carries with it an
API change for btrfs_del_csums, which takes a root argument in the
kernel, so all callsites have been updated accordingly.
I didn't sync file-item.c because it carries with it a bunch of bio
related helpers which are difficult to adapt to the kernel.
Additionally there's a few helpers in the local copy of file-item.c that
aren't in the kernel that are required for different tools.
This requires more cleanups in both the kernel and progs in order to
sync file-item.c, so for now just do file-item.h in order to pull things
out of ctree.h.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
Since e2fsprog 1.47, even with a newly created empty ext4 filesystem,
btrfs-convert would result an fs that btrfs-check would complain:
# mkfs.ext4 -F test.img
# btrfs-convert test.img
# btrfs-check test.img
Opening filesystem to check...
Checking filesystem on test.img
UUID: e45da158-8967-4e4d-9c9f-66b0d127dbce
[1/7] checking root items
[2/7] checking extents
[3/7] checking free space cache
[4/7] checking fs roots
root 5 inode 266 errors 2000, link count wrong
ERROR: errors found in fs roots
found 26333184 bytes used, error(s) found <<<
total csum bytes: 25540
total tree bytes: 180224
total fs tree bytes: 49152
total extent tree bytes: 16384
btree space waste bytes: 145423
file data blocks allocated: 33947648
referenced 26284032
[CAUSE]
Ext4 has a new compat feature, COMPAT_ORPHAN_FILE, as a better way to
track all the orphan inodes.
This new feature would create a new special inode for this purpose, and
such orphan file inode would not be reachable from any other inode, but
only from super block.
Unfortunately btrfs-convert only skip ext2 known special inodes, not the
newer one.
[FIX]
According to the kernel document, we can locate the orphan file inode
using ext2 super block s_orphan_file_inum, and skip it for
btrfs-convert.
And such skip would only happen if we have the definition of
EXT4_FEATURE_COMPAT_ORPHAN_FILE, to be compatible with older e2fsprogs.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
There are cases where the BUG_ON should be replaced by error
handling as it's validating the data from the source filesystem or
possibility to convert. The unconverted cases are asserts and will be
replaced later.
Signed-off-by: David Sterba <dsterba@suse.com>
On large (blockcount > 32bit) filesystems reading directly
super_block->s_blocks_count is not sufficient as the block count is held
in 2 separate 32 bit variables. Instead always use the provided
ext2fs_blocks_count to read the value. This can result in assertion
failure, when the block count is only held in the high 32 bits, in this
case s_block_counts would be zero, which would result in
btrfs_convert_context::block_count/total_bytes to also be 0 and hit an
assertion failure:
convert/main.c:1162: do_convert: Assertion `cctx.total_bytes != 0` failed, value 0
btrfs-convert(+0xffb0)[0x557defdabfb0]
btrfs-convert(main+0x6c5)[0x557defdaa125]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xea)[0x7f66e1f8bd0a]
btrfs-convert(_start+0x2a)[0x557defdab52a]
Aborted
What's worse it can also result in btrfs-convert mistakenly thinking
that a filesystem is smaller than it actually is (ignoring the top 32 bits).
Link: https://lore.kernel.org/linux-btrfs/023b5ca9-0610-231b-fc4e-a72fe1377a5a@jansson.tech/
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Add new option --uuid to convert with the following modes:
- 'copy' -- copy the UUID from the source filesystem
- 'new' -- (default) generate new UUID
- UUID -- a valid UUID that will be set on btrfs
Based on patch from Florian
https://lore.kernel.org/linux-btrfs/1357486331-4615-2-git-send-email-falbrechtskirchinger@gmail.com/
and ported to contemporary codebase.
Issue: #391
Signed-off-by: David Sterba <dsterba@suse.com>
Commit b3df561fbf ("btrfs-progs: convert: copy extra timespec on
ext4") has introduced the ability to convert extended inode time
precision on ext4, but this breaks builds on older distros, where ext4
does not have the nsec time precision.
Commit c615287cc0 ("btrfs-progs: a bunch of typo fixes") tried to fix
that by testing the availability of the EXT4_EPOCH_MASK macro, but the
test is not complete.
This patch aims at fixing the macro test, and changes the
name of the associated HAVE_ macro, since the logic is reverted.
This fixes#353 when ext4 has nsec time precision. Note that the test
convert/019-ext4-copy-timestamps fails when ext4 does not have the nsec
time precision and needs to check for the support.
Issue: #353
Signed-off-by: Pierre Labastie <pierre.labastie@neuf.fr>
Signed-off-by: David Sterba <dsterba@suse.com>
As Chris reports: This ext4 file system has 'needs_recovery' feature set, and
if mounted rw, log replay happens. But btrfs-convert doesn't check for it and
converts anyway. It probably shouldn't.
# debugfs -R stats /dev/loop0
debugfs 1.45.6 (20-Mar-2020)
Filesystem volume name: <none>
Last mounted on: /mnt/0
Filesystem UUID: d3e3862e-f892-4ab7-ae91-84eb4be4a3ef
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index
filetype needs_recovery extent 64bit flex_bg
sparse_super large_file huge_file dir_nlink
extra_isize metadata_csum
Filesystem flags: signed_directory_hash
Default mount options: user_xattr acl
Filesystem state: clean
Errors behavior: Continue
...
Then 'btrfs-convert' proceeds, while 'e2fsck -fvn /dev/loop1' finds some
problems and wants to fix them.
Add a check for the 'needs_recovery' incompat bit set and don't convert
the filesystem.
Issue: #348
Signed-off-by: David Sterba <dsterba@suse.com>
In 5.10 the convert gained support for extended inode time precision,
but this is not available on older distros and breaks build. Add a
configure-time check for the EXT4_EPOCH_MASK macro and add a stub in
case it's not detected.
This means that the 64bit timestamps will not be transferred from the
original filesystem in such environment, at least a warning is printed.
Issue: #344
Signed-off-by: David Sterba <dsterba@suse.com>
Currently btrfs-convert only copies ext2 inode timestamps
i_[cma]time from ext4, while filling 0 to nsec and crtime fields.
This change copies nsec and crtime by parsing i_[cma]time_extra fields.
Author: Jiachen YANG <farseerfc@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This patch will enhance the error handling of ext2_copy_inodes by:
- Return more meaningful error number
Instead of -1 (-EPERM), now return -EIO for ext2 calls error, and
proper error number from btrfs calls.
- Commit transaction if ext2fs_open_inode_scan() failed
- Call ext2fs_close_inode_scan() on error
- Hunt down the BUG_ON()s
- Add error messages for transaction related calls
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
When convert is called on a 64GiB ext4 fs, it fails like this:
$ btrfs-convert /dev/loop0p1
create btrfs filesystem:
blocksize: 4096
nodesize: 16384
features: extref, skinny-metadata (default)
checksum: crc32c
creating ext2 image file
ERROR: missing data block for bytenr 1048576
ERROR: failed to create ext2_saved/image: -2
WARNING: an error occurred during conversion, filesystem is partially created but not finalized and not mountable
Btrfs-convert also corrupts the source fs:
$ LANG=C e2fsck /dev/loop0p1 -f
e2fsck 1.45.6 (20-Mar-2020)
Resize inode not valid. Recreate<y>? yes
Pass 1: Checking inodes, blocks, and sizes
Deleted inode 3681 has zero dtime. Fix<y>? yes
Inodes that were part of a corrupted orphan linked list found. Fix<y>? yes
Inode 3744 was part of the orphaned inode list. FIXED.
Deleted inode 3745 has zero dtime. Fix<y>? yes
Inode 3747 has INLINE_DATA_FL flag on filesystem without inline data support.
Clear<y>? yes
...
[CAUSE]
After some debugging, the first strange behavior is, the value of
cctx->total_bytes is 0 in ext2_open_fs().
It turns out that, the value assign for cctx->total_bytes could lead to
bit overflow for the unsigned int value.
And that 0 cctx->total_bytes leads to various problems for later free
space calculation.
For example, in calculate_available_space(), we use cctx->total_bytes to
ensure we won't create a data chunk beyond device end:
cue_len = min(cctx->total_bytes - cur_off, cur_len);
If that cur_offset is also 0, we will create a cache_extent with 0 size,
which could cause a lot of problems for cache tree search.
[FIX]
Do manual casting for the multiply operation, so we could got a real u64
result. The fix will be applied to all supported fses (ext* and
reiserfs).
Reported-by: Christian Zangl <coralllama@gmail.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[WARNING]
When compiling btrfs-progs, there is one warning from convert ext2 code:
convert/source-ext2.c: In function 'ext2_open_fs':
convert/source-ext2.c:91:44: warning: pointer targets in passing argument 1 of 'strndup' differ in signedness [-Wpointer-sign]
91 | cctx->volume_name = strndup(ext2_fs->super->s_volume_name, 16);
| ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~
| |
| __u8 * {aka unsigned char *}
In file included from ./kerncompat.h:25,
from convert/source-ext2.c:19:
/usr/include/string.h:175:35: note: expected 'const char *' but argument is of type '__u8 *' {aka 'unsigned char *'}
175 | extern char *strndup (const char *__string, size_t __n)
| ~~~~~~~~~~~~^~~~~~~~
The toolchain involved is:
- GCC 10.1.0
- e2fsprogs 1.45.6
[CAUSE]
Obviously, in the offending e2fsprogs, the volume label is using u8,
which is unsigned char, not char.
/*078*/ __u8 s_volume_name[EXT2_LABEL_LEN]; /* volume name, no NUL? */
[FIX]
Just do a forced conversion to suppress the warning is enough.
I don't think we need to apply -Wnopointer-sign yet.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
In convert we use trans->block_reserved >= 4096 as a threshold to commit
transaction, where block_reserved is the number of new tree blocks
allocated inside a transaction.
The problem is, we still have a hidden bug in delayed ref implementation
in btrfs-progs, when we have a large enough transaction, delayed ref may
failed to find certain tree blocks in extent tree and cause transaction
abort.
This fix will workaround it by committing transaction at a much lower
threshold.
The old 4096 means 4096 new tree blocks, when using default (16K)
nodesize, it's 64M, which can contain over 12k inlined data extent or
csum for around 60G, or over 800K file extents.
The new threshold will limit the size of new tree blocks to 2M, aligning
with the chunk preallocator threshold, and reducing the possibility to
hit that delayed ref bug.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Similar to the changes where strerror(errno) was converted, continue
with the remaining cases where the argument was stored in another
variable.
The savings in object size are about 4500 bytes:
$ size btrfs.old btrfs.new
text data bss dec hex filename
805055 24248 19748 849051 cf49b btrfs.old
804527 24248 19748 848523 cf28b btrfs.new
Signed-off-by: David Sterba <dsterba@suse.com>
[Bug]
On btrfs converted from ext*, one user reported the following kernel
warning:
------------[ cut here ]------------
BTRFS: Transaction aborted (error -95)
WARNING: CPU: 0 PID: 324 at fs/btrfs/inode.c:3042 btrfs_finish_ordered_io+0x7ab/0x850 [btrfs]
Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
RIP: 0010:btrfs_finish_ordered_io+0x7ab/0x850 [btrfs]
...
Call Trace:
normal_work_helper+0x39/0x370 [btrfs]
process_one_work+0x1ce/0x410
worker_thread+0x2b/0x3d0
? process_one_work+0x410/0x410
kthread+0x113/0x130
? kthread_create_on_node+0x70/0x70
? do_syscall_64+0x74/0x190
? SyS_exit_group+0x10/0x10
ret_from_fork+0x35/0x40
---[ end trace c8ed62ff6a525901 ]---
BTRFS: error (device dm-2) in
btrfs_finish_ordered_io:3042: errno=-95 unknown
BTRFS info (device dm-2): forced readonly
BTRFS error (device dm-2): pending csums is 6447104
[Cause]
The call trace and the unique return value points to
__btrfs_drop_extents(), when we tries to drop pages of an inline extent,
we will trigger such -EOPNOTSUPP.
However kernel has limitation on the size of inline file extent
(sector size for ram size and sector size - 1 for on-disk size),
btrfs-convert doesn't have the same limitation, resulting much larger
file extent.
The lack of correct inline extent size check dates back to 2008 when
btrfs-convert is added into btrfs-progs.
[Fix]
Fix the inline extent creation condition, not only using
BTRFS_MAX_INLINE_DATA_SIZE(), which is only the maximum size of inline
data according to nodesize, but also limit it against sector size.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
In latest e2fsprogs (1.44.0) definition of ext2_ext_attr_entry has
removed member e_value_block, as currently ext* doesn't support it set
anyway.
So remove such check so that we can pass compile.
Issue: #110
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199071
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Exposed by convert-test with D=asan.
Unlike btrfs, ext2fs_close() still leaves its ext2_filsys parameter
filled with allocated pointers.
It needs ext2fs_free() to free those pointers.
Issue: #92
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Do a cleanup. Also make it consistent with kernel. Use fs_info instead
of root for BTRFS_MAX_INLINE_DATA_SIZE, since maybe in some situation we
do not know root, but just know fs_info.
Change macro to inline function to be consistent with kernel. And
change the function body to match kernel.
Signed-off-by: Gu Jinxiang <gujx@cn.fujitsu.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Do a cleanup. Also make it consistent with kernel. Use fs_info instead
of root for BTRFS_LEAF_DATA_SIZE, since maybe in some situation we do
not know root, but just know fs_info.
Signed-off-by: Gu Jinxiang <gujx@cn.fujitsu.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Currently transaction bugs out insided btrfs_start_transaction in case
of error, we want to lift the error handling to the callers. This patch
adds the BUG_ON anywhere it's been missing so far. This is not the best
way of course. Transforming BUG_ON to a proper error handling highly
depends on the caller and should be dealt with case by case.
Signed-off-by: David Sterba <dsterba@suse.com>
The status display was reading the state while the task was updating
it. Use a mutex to prevent the race.
This race was detected using ThreadSanitizer and
misc-tests/005-convert-progress-thread-crash.
==================
WARNING: ThreadSanitizer: data race
Write of size 8 by main thread:
#0 ext2_copy_inodes btrfs-progs/convert/source-ext2.c:853
#1 copy_inodes btrfs-progs/convert/main.c:145
#2 do_convert btrfs-progs/convert/main.c:1297
#3 main btrfs-progs/convert/main.c:1924
Previous read of size 8 by thread T1:
#0 print_copied_inodes btrfs-progs/convert/main.c:124
Location is stack of main thread.
Thread T1 (running) created by main thread at:
#0 pthread_create <null>
#1 task_start btrfs-progs/task-utils.c:50
#2 do_convert btrfs-progs/convert/main.c:1295
#3 main btrfs-progs/convert/main.c:1924
SUMMARY: ThreadSanitizer: data race
btrfs-progs/convert/source-ext2.c:853 in ext2_copy_inodes
Signed-off-by: Adam Buchbinder <abuchbinder@google.com>
Signed-off-by: David Sterba <dsterba@suse.com>
With the current btrfs-convert, if we convert a ext4 without data checksum,
it'd not set nodatasum flag in inode item, nor create csum item, reading
file ends up with checksum errors.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
With larger file system (in this case its 22TB), ext2fs_open() returns
EXT2_ET_CANT_USE_LEGACY_BITMAPS error message with
ext2fs_read_block_bitmap().
To overcome this issue,
(a) we need pass EXT2_FLAG_64BITS flag with ext2fs_open.
(b) use 64-bit functions like ext2fs_get_block_bitmap_range2,
ext2fs_inode_data_blocks2,ext2fs_read_ext_attr2
(c) use 64bit types with btrfs_convert_context fields
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=194795
Signed-off-by: Lakshmipathi.G <lakshmipathi.g@giis.co.in>
Signed-off-by: David Sterba <dsterba@suse.com>
Build under musl libc fails because of missing PATH_MAX and XATTR_NAME_MAX
macro declarations. Add the required headers.
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Use one flag field instead of several variables. The change cascades
down to the callchain and modifies several functions.
Signed-off-by: David Sterba <dsterba@suse.com>