The kernel code no longer has BTRFS_CRC32_SIZE and only uses
btrfs_csum_sizes[]. So, update the progs code as well.
Suggested-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[Bug]
On btrfs converted from ext*, one user reported the following kernel
warning:
------------[ cut here ]------------
BTRFS: Transaction aborted (error -95)
WARNING: CPU: 0 PID: 324 at fs/btrfs/inode.c:3042 btrfs_finish_ordered_io+0x7ab/0x850 [btrfs]
Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
RIP: 0010:btrfs_finish_ordered_io+0x7ab/0x850 [btrfs]
...
Call Trace:
normal_work_helper+0x39/0x370 [btrfs]
process_one_work+0x1ce/0x410
worker_thread+0x2b/0x3d0
? process_one_work+0x410/0x410
kthread+0x113/0x130
? kthread_create_on_node+0x70/0x70
? do_syscall_64+0x74/0x190
? SyS_exit_group+0x10/0x10
ret_from_fork+0x35/0x40
---[ end trace c8ed62ff6a525901 ]---
BTRFS: error (device dm-2) in
btrfs_finish_ordered_io:3042: errno=-95 unknown
BTRFS info (device dm-2): forced readonly
BTRFS error (device dm-2): pending csums is 6447104
[Cause]
The call trace and the unique return value points to
__btrfs_drop_extents(), when we tries to drop pages of an inline extent,
we will trigger such -EOPNOTSUPP.
However kernel has limitation on the size of inline file extent
(sector size for ram size and sector size - 1 for on-disk size),
btrfs-convert doesn't have the same limitation, resulting much larger
file extent.
The lack of correct inline extent size check dates back to 2008 when
btrfs-convert is added into btrfs-progs.
[Fix]
Fix the inline extent creation condition, not only using
BTRFS_MAX_INLINE_DATA_SIZE(), which is only the maximum size of inline
data according to nodesize, but also limit it against sector size.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
In latest e2fsprogs (1.44.0) definition of ext2_ext_attr_entry has
removed member e_value_block, as currently ext* doesn't support it set
anyway.
So remove such check so that we can pass compile.
Issue: #110
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199071
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Exposed by convert-test with D=asan.
Unlike btrfs, ext2fs_close() still leaves its ext2_filsys parameter
filled with allocated pointers.
It needs ext2fs_free() to free those pointers.
Issue: #92
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Do a cleanup. Also make it consistent with kernel. Use fs_info instead
of root for BTRFS_MAX_INLINE_DATA_SIZE, since maybe in some situation we
do not know root, but just know fs_info.
Change macro to inline function to be consistent with kernel. And
change the function body to match kernel.
Signed-off-by: Gu Jinxiang <gujx@cn.fujitsu.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Do a cleanup. Also make it consistent with kernel. Use fs_info instead
of root for BTRFS_LEAF_DATA_SIZE, since maybe in some situation we do
not know root, but just know fs_info.
Signed-off-by: Gu Jinxiang <gujx@cn.fujitsu.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
@chunk_objectid of btrfs_make_block_group() function is always fixed to
BTRFS_FIRST_FREE_OBJECTID, so there is no need to pass it as parameter
explicitly.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
As btrfs is specific to Linux, %m can be used instead of strerror(errno)
in format strings. This has some size reduction benefits for embedded
systems.
glibc, musl, and uclibc-ng all support %m as a modifier to printf.
A quick glance at the BIONIC libc source indicates that it has
support for %m as well. BSDs and Windows do not but I do believe
them to be beyond the scope of btrfs-progs.
Compiled sizes on Ubuntu 16.04:
Before:
3916512 btrfs
233688 libbtrfs.so.0.1
4899 bcp
2367672 btrfs-convert
2208488 btrfs-corrupt-block
13302 btrfs-debugfs
2152160 btrfs-debug-tree
2136024 btrfs-find-root
2287592 btrfs-image
2144600 btrfs-map-logical
2130760 btrfs-select-super
2152608 btrfstune
2131760 btrfs-zero-log
2277752 mkfs.btrfs
9166 show-blocks
After:
3908744 btrfs
233256 libbtrfs.so.0.1
4899 bcp
2366560 btrfs-convert
2207432 btrfs-corrupt-block
13302 btrfs-debugfs
2151104 btrfs-debug-tree
2134968 btrfs-find-root
2281864 btrfs-image
2143536 btrfs-map-logical
2129704 btrfs-select-super
2151552 btrfstune
2130696 btrfs-zero-log
2276272 mkfs.btrfs
9166 show-blocks
Total savings: 23928 (24 kilo)bytes
Signed-off-by: Rosen Penev <rosenp@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Commit 1170ac307900 ("btrfs-progs: convert: Introduce function to check if
convert image is able to be rolled back") reworked rollback check
condition, by checking 1:1 mapping of each file extent.
The idea itself has nothing wrong, but error handler is not implemented
correctly, which over writes the return value and always try to rollback
the fs even it fails to pass the check.
Fix it by correctly return the error before rollback the fs.
Fixes: 1170ac307900 ("btrfs-progs: convert: Introduce function to check if convert image is able to be rolled back")
Reported-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Build with musl libc needs the sys/types.h header for the dev_t type,
since this header is not included indirectly. This fixes the following
build failure:
In file included from convert/source-fs.c:23:0:
./convert/source-fs.h:112:1: error: unknown type name ‘dev_t’
dev_t decode_dev(u32 dev);
^~~~~
convert/source-fs.c:31:1: error: unknown type name ‘dev_t’
dev_t decode_dev(u32 dev)
^~~~~
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: David Sterba <dsterba@suse.com>
For rollback, we only needs to open the fs to check if it meets the
condition to rollback. And this RW read makes us failed to rollback
btrfs with v2 space cache.
In fact, we don't even start a transaction during rollback.
So open the fs RO for rollback, to avoid v2 space cache problem.
Reported-by: Gu Jinxiang <gujx@cn.fujitsu.com>
Reviewed-by: Gu JinXiang <gujx@cn.fujitsu.com>
Tested-by: Gu JinXiang <gujx@cn.fujitsu.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
For code reuse, btrfs_insert_dir_item() now calls
inserts_with_overflow() even if the dir_item existed.
Add a parameter @ignore_existed to btrfs_add_link().
If @ignore_existed is not zero, btrfs_add_link() continues to do link.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
A convert parameter is added as a flag to indicate if btrfs_mksubvol()
is used for btrfs-convert. The change cascades down to the callchain.
Signed-off-by: Yingyi Luo <yingyil@google.com>
Signed-off-by: David Sterba <dsterba@suse.com>
link_subvol() is moved to inode.c and renamed as btrfs_mksubvol().
The change cascades down to the callchain.
Signed-off-by: Yingyi Luo <yingyil@google.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Currently transaction bugs out insided btrfs_start_transaction in case
of error, we want to lift the error handling to the callers. This patch
adds the BUG_ON anywhere it's been missing so far. This is not the best
way of course. Transforming BUG_ON to a proper error handling highly
depends on the caller and should be dealt with case by case.
Signed-off-by: David Sterba <dsterba@suse.com>
This patch adds support to convert reiserfs file systems in-place to btrfs.
It will convert extended attribute files to btrfs extended attributes,
translate ACLs, coalesce tails that consist of multiple items into one item,
and convert tails that are too big into indirect files.
This requires that libreiserfscore 3.6.27 be available.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
There multiple places where we use well-known sizes - 1,8,16,32 megabytes. We
also have them defined as constants in the sizes.h header. So let's use them.
No functional changes.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When we are looking for extents in migrate_one_reserved_range, it's likely
that there will be multiple extents that fall into the 0-1MB range.
If lookup_cache_extent is called with a range that covers multiple cache
entries, it will return the first entry it encounters while searching
from the top of the tree that happens to fall in that range. That
means that we can end up skipping regions within that range, resulting
in a file system image that can't be rolled back since it wasn't
all migrated properly.
This is reproducible using convert-tests/008-readonly-image. There was
a range from 0-160kB, but the only entry that was returned began at
~ 280kB.
The fix is to use search_cache_extent to iterate through multiple regions
within that range.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reviewed-by: Qu Wenruo <quwenruo.btrfs@gmx.com>
Signed-off-by: David Sterba <dsterba@suse.com>
There are two printfs with missing newlines that end up making the
output wonky.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reviewed-by: Qu Wenruo <quwenruo.btrfs@gmx.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Commit 522ef705e3 (btrfs-progs: convert: Introduce function to calculate
the available space) changed how we handle migrating file data so that
we never have btrfs space associated with the reserved ranges. This
works pretty well and when we iterate over the file blocks, the
associations are redirected to the migrated locations.
This commit missed the case in block_iterate_proc where we just check
for intersection with a superblock location before looking up a block
group. intersect_with_sb checks to see if the range intersects with
a stripe containing a superblock but, in fact, we've reserved the
full 0-1MB range at the start of the disk. So a file block located
at e.g. 160kB will fall in the reserved region but won't be excepted
in block_iterate_block. We ultimately hit a BUG_ON when we fail
to look up the block group for that location.
This is reproducible using convert-tests/003-ext4-basic.
The fix is to have intersect_with_sb and block_iterate_proc understand
the full size of the reserved ranges. Since we use the range to
determine the boundary for the block iterator, let's just return the
boundary. 0 isn't a valid boundary and means that we proceed normally
with block group lookup.
Cc: Qu Wenruo <quwenruo.btrfs@gmx.com>
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reviewed-by: Qu Wenruo <quwenruo.btrfs@gmx.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The status display was reading the state while the task was updating
it. Use a mutex to prevent the race.
This race was detected using ThreadSanitizer and
misc-tests/005-convert-progress-thread-crash.
==================
WARNING: ThreadSanitizer: data race
Write of size 8 by main thread:
#0 ext2_copy_inodes btrfs-progs/convert/source-ext2.c:853
#1 copy_inodes btrfs-progs/convert/main.c:145
#2 do_convert btrfs-progs/convert/main.c:1297
#3 main btrfs-progs/convert/main.c:1924
Previous read of size 8 by thread T1:
#0 print_copied_inodes btrfs-progs/convert/main.c:124
Location is stack of main thread.
Thread T1 (running) created by main thread at:
#0 pthread_create <null>
#1 task_start btrfs-progs/task-utils.c:50
#2 do_convert btrfs-progs/convert/main.c:1295
#3 main btrfs-progs/convert/main.c:1924
SUMMARY: ThreadSanitizer: data race
btrfs-progs/convert/source-ext2.c:853 in ext2_copy_inodes
Signed-off-by: Adam Buchbinder <abuchbinder@google.com>
Signed-off-by: David Sterba <dsterba@suse.com>
4 functions are involved in this refactor: btrfs_make_block_group()
btrfs_make_block_groups(), btrfs_alloc_chunk, btrfs_alloc_data_chunk().
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Leafsize is deprecated for a long time, and kernel has already updated
ctree.h to rename sb->leafsize to sb->__unused_leafsize.
This patch will remove normal users of leafsize:
1) Remove leafsize member from btrfs_root structure
Now only root->nodesize and root->sectorisze.
No longer root->leafsize.
2) Remove @leafsize parameter from btrfs_setup_root() function
Since no root->leafsize, no need for @leafsize parameter.
The remaining user of leafsize will be:
1) btrfs inspect-internal dump-super
Reformat the "leafsize" output to "leafsize (deprecated)" and
use le32_to_cpu() to do the cast manually.
2) mkfs
We still need to set sb->__unused_leafsize to nodesize.
Do the manual cast too.
3) convert
Same as mkfs, these two superblock setup should be merged later
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
So btrfs_set_header_flags() vs btrfs_set_header_flag, the difference is
sort of similar to "=" vs "|=", when creating and initialising a new
extent buffer, convert uses the former one which clears header_rev by
accident.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
With the current btrfs-convert, if we convert a ext4 without data checksum,
it'd not set nodatasum flag in inode item, nor create csum item, reading
file ends up with checksum errors.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
With larger file system (in this case its 22TB), ext2fs_open() returns
EXT2_ET_CANT_USE_LEGACY_BITMAPS error message with
ext2fs_read_block_bitmap().
To overcome this issue,
(a) we need pass EXT2_FLAG_64BITS flag with ext2fs_open.
(b) use 64-bit functions like ext2fs_get_block_bitmap_range2,
ext2fs_inode_data_blocks2,ext2fs_read_ext_attr2
(c) use 64bit types with btrfs_convert_context fields
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=194795
Signed-off-by: Lakshmipathi.G <lakshmipathi.g@giis.co.in>
Signed-off-by: David Sterba <dsterba@suse.com>
The u32 types in the convert context might not be enough for some very
large filesytems (20TB). Use 64bit types to be safe.
Signed-off-by: Lakshmipathi.G <lakshmipathi.g@giis.co.in>
Signed-off-by: David Sterba <dsterba@suse.com>
In check_convert_image(), for normal HOLE case, if the file extents are
smaller than image size, we set ret to -EINVAL and print error message.
But forget to return.
This patch adds the missing return to fix it.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Since btrfs_reserved_ranges array is just used to store btrfs reserved
ranges, no one will nor should modify them at run time, make them static
and const will be better.
This also eliminates the use of immediate number 3.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
[ definition stays in source-fs.c ]
Signed-off-by: David Sterba <dsterba@suse.com>
Build under musl libc fails because of missing PATH_MAX and XATTR_NAME_MAX
macro declarations. Add the required headers.
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Rework rollback to a more easy to understand way.
New convert behavior makes us to have a more flex chunk layout, which
only data chunk containing old fs data will be at the same physical
location, while new chunks (data/meta/sys) can be mapped anywhere else.
This behavior makes old rollback behavior can't handle it.
As old behavior assumes all data/meta is mapped in a large chunk, which is
mapped 1:1 on disk.
So rework rollback to handle new convert behavior, enhance the check by
only checking all file extents of convert image, only to check if these
file extents and therir chunks are mapped 1:1.
This new rollback check behavior can handle both new and old convert
behavior, as the new behavior is a superset of old behavior.
Further more, introduce a simple rollback mechanisim:
1) Read reserved data (offset = file offset) from convert image
2) Write reserved data into disk (offset = physical offset)
Since old fs image is a valid fs, and we only need to rollback
superblocks (btrfs reserved ranges), then we just read out data in
reserved range, and write it back.
Due to the fact that all other file extents of converted image is mapped
1:1 on disk, we put the missing piece back, then the fs is as good as
old one.
Then what we do in btrfs is just another dream.
With this new rollback mechanisim, we can open btrfs read-only, so we
won't cause any damage to current btrfs, until the final piece (0~1M,
containing 1st super block) is put back.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
[ port to v4.10 ]
Signed-off-by: David Sterba <dsterba@suse.com>
Introduce a function, check_convert_image() to check if that image is
rollback-able.
This means all file extents except one of the btrfs reserved ranges, must
be mapped 1:1 on disk.
1:1 mapped file extents must match the following conditions:
1) Their file_offset(key.offset) matches its disk_bytenr
2) The corresponding chunk must be mapped 1:1 on disk
That's to say, it's a SINGLE chunk, and chunk logical matches with
stripe physical.
Above 2 conditions ensured that file extent lies the exactly the same
position as in the old filesystem.
For data in reserved ranges of btrfs, they are relocated to new places,
and in that case, we use btrfs_read_file() to read out the content for
later rollback use.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Introduce a new function, read_reserved_ranges(), to allow later
rollback to use these data to do rollback.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Since we have reserved ranges array now, we can use them to skip all
these open codes.
And add some comment and asciidoc art for related part.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
[ port to v4.10 ]
Signed-off-by: David Sterba <dsterba@suse.com>
Introduce a new strucutre, simple_range, to present one contingous
range.
Also, use such structure to define btrfs_reserved_ranges(), which
convert and rollback will use.
Suggested-by: David Sterba <dsterba@suse.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
[ split hunks to new file structure ]
Signed-off-by: David Sterba <dsterba@suse.com>
Convert is now a little complex due to that fact we need to separate
metadata and data chunks for different profiles.
Add a comment with ascii art explaining the whole design and point
out the really complex part, so any newcomers interested in convert can
get a quick overview of it before digging into the hard to read code.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
[ wording and formatting adjustments ]
Signed-off-by: David Sterba <dsterba@suse.com>