Create directory for all sources that can be used by anything that's not
rellated to a relevant kernel part, all common functions, helpers,
utilities that do not fit any other specific category.
The traditional location would be probably lib/ with all things that are
statically linked to the main binaries, but we have libbtrfs and
libbtrfsutil so this would be confusing.
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
There is a bug report of unexpected ENOSPC from btrfs-convert, issue #123.
After some debugging, even when we have enough unallocated space, we
still hit ENOSPC at btrfs_reserve_extent().
[CAUSE]
Btrfs-progs relies on chunk preallocator to make enough space for
data/metadata.
However after the introduction of delayed-ref, it's no longer reliable
to rely on btrfs_space_info::bytes_used and
btrfs_space_info::bytes_pinned to calculate used metadata space.
For a running transaction with a lot of allocated tree blocks,
btrfs_space_info::bytes_used stays its original value, and will only be
updated when running delayed ref.
This makes btrfs-progs chunk preallocator completely useless. And for
btrfs-convert/mkfs.btrfs --rootdir, if we're going to have enough
metadata to fill a metadata block group in one transaction, we will hit
ENOSPC no matter whether we have enough unallocated space.
[FIX]
This patch will introduce btrfs_space_info::bytes_reserved to track how
many space we have reserved but not yet committed to extent tree.
To support this change, this commit also introduces the following
modification:
- More comment on btrfs_space_info::bytes_*
To make code a little easier to read
- Export update_space_info() to preallocate empty data/metadata space
info for mkfs.
For mkfs, we only have a temporary fs image with SYSTEM chunk only.
Export update_space_info() so that we can preallocate empty
data/metadata space info before we start a transaction.
- Proper btrfs_space_info::bytes_reserved update
The timing is the as kernel (except we don't need to update
bytes_reserved for data extents)
* Increase bytes_reserved when call alloc_reserved_tree_block()
* Decrease bytes_reserved when running delayed refs
With the help of head->must_insert_reserved to determine whether we
need to decrease.
Issue: #123
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Although moderm hardware is fast enough and crc32c calculation is not a
hotspot, doing such optimization won't hurt anyway.
Issue: #175
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
For data reloc tree creation, we copy its contents from the fs tree just
for its INODE_ITEM, INODE_REF and dirid. This hides the detail and is
not obvious for why we're copying from fs root.
This patch will create data reloc tree from scratch:
- Create root, including root item and new tree root
- Change dirid to BTRFS_FIRST_FREE_OBJECTID
- Insert root INODE_ITEM and INODE_REF
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Similar to the changes where strerror(errno) was converted, continue
with the remaining cases where the argument was stored in another
variable.
The savings in object size are about 4500 bytes:
$ size btrfs.old btrfs.new
text data bss dec hex filename
805055 24248 19748 849051 cf49b btrfs.old
804527 24248 19748 848523 cf28b btrfs.new
Signed-off-by: David Sterba <dsterba@suse.com>
In case add_inode_items returned -EEXIST, traverse_directory would
handle the condition and still continue under certain circumstances, but
it would not reset the error code, leading to spurious failure later.
This patch fixes that.
Pull-request: #124
Author: Matthias Benkard <matthias.benkard@egym.de>
Signed-off-by: David Sterba <dsterba@suse.com>
The old flag OPEN_CTREE_FS_PARTIAL is in fact quite easy to be confused
with OPEN_CTREE_PARTIAL, which allow btrfs-progs to open damaged
filesystem (like corrupted extent/csum tree).
However OPEN_CTREE_FS_PARTIAL, unlike its name, is just allowing
btrfs-progs to open fs with temporary superblocks (which only has 6
basic trees on SINGLE meta/sys chunks).
The usage of FS_PARTIAL is really confusing here.
So rename OPEN_CTREE_FS_PARTIAL to OPEN_CTREE_TEMPORARY_SUPER, and add
extra comment for its behavior.
Also rename BTRFS_MAGIC_PARTIAL to BTRFS_MAGIC_TEMPORARY to keep the
naming consistent.
And with above comment, the usage of FS_PARTIAL in dump-tree is
obviously incorrect, fix it.
Fixes: 8698a2b9ba ("btrfs-progs: Allow inspect dump-tree to show specified tree block even some tree roots are corrupted")
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
With mkfs.btrfs on a thin provisioned device with very small backing
size and big virtual size, all code works well in mkfs.btrfs until
close_ctree() is called.
close_ctree() fails to sync device due to small backing size while
closing devices. However, mkfs returns 0 in such situation which causes
failure of fstests generic/405.
So, let mkfs returns nonzero value if previous steps succeeded but
close_ctree() failed. Then fstests generic/405 passes now.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
mkfs-test 016 "rootdir-bad-symbolic-link" fails when selinux is enabled.
This is because add_xattr_item() uses getxattr() and tries to follow a
bad symbolic link for selinux item, which causes ENOENT error.
The line above already uses llistxattr() for getting list of xattr in
order not to follow a symbolic link, so just use lgetxattr() too.
Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We can easily create the uuid tree that's usually created after first
mount. The kernel will still check the tree on first mount so we don't
try to fake the uuid tree generation so it appears consistent, even if
it's empty.
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
If we have a symbolic link in rootdir pointing to non-existing location,
mkfs.btrfs --rootdir will just fail:
------
$ mkfs.btrfs -f --rootdir /tmp/rootdir/ /dev/data/btrfs
btrfs-progs v4.15.1
See http://btrfs.wiki.kernel.org for more information.
ERROR: ftw subdir walk of /tmp/rootdir/ failed: No such file or directory
------
[CAUSE]
Commit 599a0abed5 ("btrfs-progs: mkfs/rootdir: Use over-reserve method
to make size estimate easier") add extra ftw walk to estimate the
filesystem size.
Such default ftw walk will follow symbolic link and gives ENOENT error.
[FIX]
Use nftw() to specify FTW_PHYS so we won't follow symbolic link for size
calculation.
Issue: #109
Reported-by: Alexander Kanavin <alexander.kanavin@intel.com>
Fixes: 599a0abed5 ("btrfs-progs: mkfs/rootdir: Use over-reserve method to make size estimate easier")
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The kernel code no longer has BTRFS_CRC32_SIZE and only uses
btrfs_csum_sizes[]. So, update the progs code as well.
Suggested-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Currently, the top-level subvolume lacks the UUID. As a result, both
non-snapshot subvolume and snapshot of top-level subvolume do not have
Parent UUID and cannot be distinguisued. Therefore "fi show" of
top-level lists all the subvolumes which lacks the UUID in
"Snapshot(s)" filed. Also, it lacks the otime information.
Fix this by adding the UUID and otime at the mkfs time. As a
consequence, snapshots of top-level subvolume now have a Parent UUID and
UUID tree will create an entry for top-level subvolume at mount time.
This should not cause the problem for current kernel, but user program
which relies on the empty Parent UUID may be affected by this change.
Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Just like convert, we need extra check against sector size for creating
inline extent.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The bug is exposed by mkfs test case 009, with D=asan.
We are leaking memory of parent_dir_entry->path() which ,except the
rootdir, is allocated by strdup().
Before fixing it, unifiy the allocation of parent_dir_entry() to heap
allocation.
Then fix it by adding "free(parent_dir_entry->path);" in
traverse_directory() and error handler.
Issue: #92
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Do a cleanup. Also make it consistent with kernel. Use fs_info instead
of root for BTRFS_MAX_INLINE_DATA_SIZE, since maybe in some situation we
do not know root, but just know fs_info.
Change macro to inline function to be consistent with kernel. And
change the function body to match kernel.
Signed-off-by: Gu Jinxiang <gujx@cn.fujitsu.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
@chunk_objectid of btrfs_make_block_group() function is always fixed to
BTRFS_FIRST_FREE_OBJECTID, so there is no need to pass it as parameter
explicitly.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
As btrfs is specific to Linux, %m can be used instead of strerror(errno)
in format strings. This has some size reduction benefits for embedded
systems.
glibc, musl, and uclibc-ng all support %m as a modifier to printf.
A quick glance at the BIONIC libc source indicates that it has
support for %m as well. BSDs and Windows do not but I do believe
them to be beyond the scope of btrfs-progs.
Compiled sizes on Ubuntu 16.04:
Before:
3916512 btrfs
233688 libbtrfs.so.0.1
4899 bcp
2367672 btrfs-convert
2208488 btrfs-corrupt-block
13302 btrfs-debugfs
2152160 btrfs-debug-tree
2136024 btrfs-find-root
2287592 btrfs-image
2144600 btrfs-map-logical
2130760 btrfs-select-super
2152608 btrfstune
2131760 btrfs-zero-log
2277752 mkfs.btrfs
9166 show-blocks
After:
3908744 btrfs
233256 libbtrfs.so.0.1
4899 bcp
2366560 btrfs-convert
2207432 btrfs-corrupt-block
13302 btrfs-debugfs
2151104 btrfs-debug-tree
2134968 btrfs-find-root
2281864 btrfs-image
2143536 btrfs-map-logical
2129704 btrfs-select-super
2151552 btrfstune
2130696 btrfs-zero-log
2276272 mkfs.btrfs
9166 show-blocks
Total savings: 23928 (24 kilo)bytes
Signed-off-by: Rosen Penev <rosenp@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When creating btrfs, mkfs.btrfs will firstly create a temporary system
chunk as basis, and then created needed trees or new devices.
However the layout temporary system chunk is hard-coded and uses
reserved [0, 1M) range of devid 1.
Change the temporary chunk layout from old:
0 1M 4M 5M
|<----------- temp chunk -------------->|
And it's 1:1 mapped, which means it's a SINGLE chunk,
and stripe offset is also 0.
to new layout:
0 1M 4M 5M
|<----------- temp chunk -------------->|
And still keeps the 1:1 mapping.
However this also affects btrfs_min_dev_size() which still assume
temporary chunks starts at device offset 0.
The problem can only be exposed by "-m single" or "-M" where we reuse the
temporary chunk.
With other meta profiles, system and meta chunks are allocated by later
btrfs_alloc_chunk() call, and old SINGLE chunks are removed, so it will
be no such problem for other meta profiles.
Reported-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Tested-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
[ folded fix for the minimal device size calculation ]
Signed-off-by: David Sterba <dsterba@suse.com>
For --rootdir, even for large existing file or block device, it will
always shrink the resulting filesystem.
The problem is, mkfs.btrfs will try to calculate the dir size, and use
it as @block_count to mkfs, which makes the filesystem shrunk.
Fix it by trying to get the original block device or file size as
@block_count, so mkfs.btrfs can use the full file/block device for
--rootdir option.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Commit 460e93f257 ("btrfs-progs: mkfs: check the status of file at mkfs")
will try to check the file state before creating fs on it.
The check is mostly fine for normal mkfs case, while for --rootdir
option, it's allowed to create a new file if the destination file
doesn't exist.
Fix it by allowing non-existent file if --rootdir is specified.
Fixes: 460e93f257 ("btrfs-progs: mkfs: check the status of file at mkfs")
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Make --shrink a separate option for --rootdir, and change the default to
off.
The shrinking behaviour is not a commonly used feature but can be useful
for creating minimal pre-filled images, in one step, without requiring
to mount.
Signed-off-by: Qu Wenruo <wqu@suse.com>
[ update changelog and error messages ]
Signed-off-by: David Sterba <dsterba@suse.com>
Use the new dev extent based shrink method for rootdir option. This
restores the original behaviour when --rootdir will create a minimal
filesystem size.
Signed-off-by: Qu Wenruo <wqu@suse.com>
[ update changelog ]
Signed-off-by: David Sterba <dsterba@suse.com>
Use an easier method to calculate the estimate device size for
mkfs.btrfs --rootdir.
The new method will over-estimate, but should ensure we won't encounter
ENOSPC.
It relies on the following data:
1) number of inodes -- for metadata chunk size
2) rounded up data size of each regular inode -- for data chunk size
Total meta chunk size = round_up(nr_inode * (PATH_MAX * 3 + sectorsize),
min_chunk_size) * profile_multiplier
PATH_MAX is the maximum size possible for INODE_REF/DIR_INDEX/DIR_ITEM.
Sectorsize is the maximum size possible for inline extent.
min_chunk_size is 8M for SINGLE, and 32M for DUP, get from
btrfs_alloc_chunk().
profile_multiplier is 1 for Single, 2 for DUP.
Total data chunk size is much easier.
Total data chunk size = round_up(total_data_usage, min_chunk_size) *
profile_multiplier
Total_data_usage is the sum of *rounded up* size of each regular inode
use.
min_chunk_size is 8M for SINGLE, 64M for DUP, get from btrfS_alloc_chunk().
Same profile_multiplier for meta.
This over-estimate calculate is, of course inacurrate, but since we will
later shrink the fs to its real usage, it doesn't matter much now.
Signed-off-by: Qu Wenruo <wqu@suse.com>
[ update comments ]
Signed-off-by: David Sterba <dsterba@suse.com>
Remove the custom chunk allocator for mkfs. It is buggy in connection to
the --rootdir option and puts file data to the reerved 1M area. The
feature of the custom allocator was to reserve only minimal amount of
blockgroup space. This will temporarily stop working and will need an
explicit request by option, added by following patches.
Use the generic chunk allocator.
Signed-off-by: Qu Wenruo <wqu@suse.com>
[ update changelog ]
Signed-off-by: David Sterba <dsterba@suse.com>
Cleanup of temporary chunks should be done as soon as possible, and it
should be especially before doing large tree operations, like filling
the filesystem when using --rootdir.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Since new --rootdir can allocate chunk, it will modify the chunk
allocation result.
This patch will update allocation info before verbose output to reflect
such info.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Also rename the function from size_sourcedir() to mkfs_size_dir().
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
In fact, --rootdir option is getting more and more independent from
normal mkfs code.
So move image creation function, make_image() and its related code to
mkfs/rootdir.[ch], and rename the function to btrfs_mkfs_fill_dir().
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
It's a waste of IO to fill the whole image before creating btrfs on it,
just wiping the first 1M, and then write 1 byte to the last position to
create a sparse file.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Since commit c11e36a29e ("Btrfs-progs: Do not force mixed block group
creation unless '-M' option is specified"), mkfs no longer use mixed
block group unless specified manually.
This breaks the minimal device size calculation, which only considered
mixed block group use case.
This patch enhances minimal device size calculation for mkfs, by using
different minimal stripe length (calculated from code) for different
profiles, and use them to calculate minimal device size.
Reported-by: Wesley Aptekar-Cassels <W.Aptekar@gmail.com>
Fixes: c11e36a29e ("Btrfs-progs: Do not force mixed block group creation unless '-M' option is specified")
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
[ updated comments ]
Signed-off-by: David Sterba <dsterba@suse.com>
Currently, only the status of block devices is checked at mkfs,
but we should also check for regular files whether they are already
formatted or mounted to prevent overwrite accidentally.
Device status is checked by test_dev_for_mkfs().
The part which is not related to block device is split from this
and used for both block device and regular file.
Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
test_minimum_size() function is only called to check if provided device
is large enough.
However the minimal device size only needs to be calculated once, and
can be reused everywhere.
Refactor that function to make later modification easier.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
--rootdir option will start a transaction to fill the fs, however if
something goes wrong, from ENOSPC to lack of permission, we won't commit
the transaction and cause BUG_ON triggered by uncommitted transaction:
------
extent buffer leak: start 29392896 len 16384
extent_io.c:579: free_extent_buffer: BUG_ON `eb->flags & EXTENT_DIRTY` triggered, value 1
------
The root fix is to introduce btrfs_abort_transaction() in btrfs-progs,
however in this particular case, we can workaround it by force
committing the transaction.
Since during mkfs, the magic of btrfs is set to an invalid one, without
setting fs_info->finalize_on_close() the fs is never able to be mounted.
So even we force to commit wrong transaction we won't screw up things
worse.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
For mkfs failure, especially --rootdir errors like EPERM/ENOSPC, the out
branch will overwrite the return value, causing wrong status code.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Since we're calling btrfs_search_slot() the return value can be
positive. However we just pass that return value out, causing undefined
return value.
This can cause mkfs to return 1, which indicates something wrong.
Fix it.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>