Add an option to mkfs to specify which checksum algorithm will be used
for the filesystem. Currently only crc32c is supported.
The option name is -c, presumably one of the comonly used options so it
gets the lowercase option.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
Add checksum type to the definition structure for a new filesystem, this
will be used in following patches.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
When btrfs_add_to_fsid fails in mkfs we try to close the ctree. That
complains that we already have a transaction open. We should be taking
the error path and exit cleanly without writing.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When creating a filesystem with mixed block groups, we are creating two
space info objects to track used/reserved/pinned space, one only for data
and another one only for metadata.
This is making fstests test case generic/416 fail, with btrfs' check
reporting over an hundred errors about bad extents:
(...)
bad extent [17186816, 17190912), type mismatch with chunk
bad extent [17195008, 17199104), type mismatch with chunk
bad extent [17203200, 17207296), type mismatch with chunk
(...)
Because, surprisingly, this results in block groups that do not have the
BTRFS_BLOCK_GROUP_DATA flag set but have data extents allocated in them.
This is a regression introduced in btrfs-progs v5.2.
So fix this by making sure we only create one space info object, for both
metadata and data, when mixed block groups are enabled.
Fixes: c31edf610c ("btrfs-progs: Fix false ENOSPC alert by tracking used space correctly")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Build several standalone tools into one binary and switch the function
by name (symlink or hardlink).
* btrfs
* mkfs.btrfs
* btrfs-image
* btrfs-convert
* btrfstune
The static target is also supported. The name of resulting boxed
binaries is btrfs.box and btrfs.box.static . All the binaries can be
built at the same time without prior configuration.
text data bss dec hex filename
822454 27000 19724 869178 d433a btrfs
927314 28816 20812 976942 ee82e btrfs.box
2067745 58004 44736 2170485 211e75 btrfs.static
2627198 61724 83800 2772722 2a4ef2 btrfs.box.static
File sizes:
857496 btrfs
968536 btrfs.box
2141400 btrfs.static
2704472 btrfs.box.static
Standalone utilities:
512504 btrfs-convert
495960 btrfs-image
471224 btrfstune
491864 mkfs.btrfs
1747720 btrfs-convert.static
1411416 btrfs-image.static
1304256 btrfstune.static
1361696 mkfs.btrfs.static
So the shared 900K binary saves ~2M, or ~5.7M for static build.
Signed-off-by: David Sterba <dsterba@suse.cz>
[BUG]
There is a bug report of unexpected ENOSPC from btrfs-convert, issue #123.
After some debugging, even when we have enough unallocated space, we
still hit ENOSPC at btrfs_reserve_extent().
[CAUSE]
Btrfs-progs relies on chunk preallocator to make enough space for
data/metadata.
However after the introduction of delayed-ref, it's no longer reliable
to rely on btrfs_space_info::bytes_used and
btrfs_space_info::bytes_pinned to calculate used metadata space.
For a running transaction with a lot of allocated tree blocks,
btrfs_space_info::bytes_used stays its original value, and will only be
updated when running delayed ref.
This makes btrfs-progs chunk preallocator completely useless. And for
btrfs-convert/mkfs.btrfs --rootdir, if we're going to have enough
metadata to fill a metadata block group in one transaction, we will hit
ENOSPC no matter whether we have enough unallocated space.
[FIX]
This patch will introduce btrfs_space_info::bytes_reserved to track how
many space we have reserved but not yet committed to extent tree.
To support this change, this commit also introduces the following
modification:
- More comment on btrfs_space_info::bytes_*
To make code a little easier to read
- Export update_space_info() to preallocate empty data/metadata space
info for mkfs.
For mkfs, we only have a temporary fs image with SYSTEM chunk only.
Export update_space_info() so that we can preallocate empty
data/metadata space info before we start a transaction.
- Proper btrfs_space_info::bytes_reserved update
The timing is the as kernel (except we don't need to update
bytes_reserved for data extents)
* Increase bytes_reserved when call alloc_reserved_tree_block()
* Decrease bytes_reserved when running delayed refs
With the help of head->must_insert_reserved to determine whether we
need to decrease.
Issue: #123
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Although moderm hardware is fast enough and crc32c calculation is not a
hotspot, doing such optimization won't hurt anyway.
Issue: #175
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
For data reloc tree creation, we copy its contents from the fs tree just
for its INODE_ITEM, INODE_REF and dirid. This hides the detail and is
not obvious for why we're copying from fs root.
This patch will create data reloc tree from scratch:
- Create root, including root item and new tree root
- Change dirid to BTRFS_FIRST_FREE_OBJECTID
- Insert root INODE_ITEM and INODE_REF
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Similar to the changes where strerror(errno) was converted, continue
with the remaining cases where the argument was stored in another
variable.
The savings in object size are about 4500 bytes:
$ size btrfs.old btrfs.new
text data bss dec hex filename
805055 24248 19748 849051 cf49b btrfs.old
804527 24248 19748 848523 cf28b btrfs.new
Signed-off-by: David Sterba <dsterba@suse.com>
The old flag OPEN_CTREE_FS_PARTIAL is in fact quite easy to be confused
with OPEN_CTREE_PARTIAL, which allow btrfs-progs to open damaged
filesystem (like corrupted extent/csum tree).
However OPEN_CTREE_FS_PARTIAL, unlike its name, is just allowing
btrfs-progs to open fs with temporary superblocks (which only has 6
basic trees on SINGLE meta/sys chunks).
The usage of FS_PARTIAL is really confusing here.
So rename OPEN_CTREE_FS_PARTIAL to OPEN_CTREE_TEMPORARY_SUPER, and add
extra comment for its behavior.
Also rename BTRFS_MAGIC_PARTIAL to BTRFS_MAGIC_TEMPORARY to keep the
naming consistent.
And with above comment, the usage of FS_PARTIAL in dump-tree is
obviously incorrect, fix it.
Fixes: 8698a2b9ba ("btrfs-progs: Allow inspect dump-tree to show specified tree block even some tree roots are corrupted")
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
With mkfs.btrfs on a thin provisioned device with very small backing
size and big virtual size, all code works well in mkfs.btrfs until
close_ctree() is called.
close_ctree() fails to sync device due to small backing size while
closing devices. However, mkfs returns 0 in such situation which causes
failure of fstests generic/405.
So, let mkfs returns nonzero value if previous steps succeeded but
close_ctree() failed. Then fstests generic/405 passes now.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We can easily create the uuid tree that's usually created after first
mount. The kernel will still check the tree on first mount so we don't
try to fake the uuid tree generation so it appears consistent, even if
it's empty.
Signed-off-by: David Sterba <dsterba@suse.com>
Currently, the top-level subvolume lacks the UUID. As a result, both
non-snapshot subvolume and snapshot of top-level subvolume do not have
Parent UUID and cannot be distinguisued. Therefore "fi show" of
top-level lists all the subvolumes which lacks the UUID in
"Snapshot(s)" filed. Also, it lacks the otime information.
Fix this by adding the UUID and otime at the mkfs time. As a
consequence, snapshots of top-level subvolume now have a Parent UUID and
UUID tree will create an entry for top-level subvolume at mount time.
This should not cause the problem for current kernel, but user program
which relies on the empty Parent UUID may be affected by this change.
Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
@chunk_objectid of btrfs_make_block_group() function is always fixed to
BTRFS_FIRST_FREE_OBJECTID, so there is no need to pass it as parameter
explicitly.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
As btrfs is specific to Linux, %m can be used instead of strerror(errno)
in format strings. This has some size reduction benefits for embedded
systems.
glibc, musl, and uclibc-ng all support %m as a modifier to printf.
A quick glance at the BIONIC libc source indicates that it has
support for %m as well. BSDs and Windows do not but I do believe
them to be beyond the scope of btrfs-progs.
Compiled sizes on Ubuntu 16.04:
Before:
3916512 btrfs
233688 libbtrfs.so.0.1
4899 bcp
2367672 btrfs-convert
2208488 btrfs-corrupt-block
13302 btrfs-debugfs
2152160 btrfs-debug-tree
2136024 btrfs-find-root
2287592 btrfs-image
2144600 btrfs-map-logical
2130760 btrfs-select-super
2152608 btrfstune
2131760 btrfs-zero-log
2277752 mkfs.btrfs
9166 show-blocks
After:
3908744 btrfs
233256 libbtrfs.so.0.1
4899 bcp
2366560 btrfs-convert
2207432 btrfs-corrupt-block
13302 btrfs-debugfs
2151104 btrfs-debug-tree
2134968 btrfs-find-root
2281864 btrfs-image
2143536 btrfs-map-logical
2129704 btrfs-select-super
2151552 btrfstune
2130696 btrfs-zero-log
2276272 mkfs.btrfs
9166 show-blocks
Total savings: 23928 (24 kilo)bytes
Signed-off-by: Rosen Penev <rosenp@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When creating btrfs, mkfs.btrfs will firstly create a temporary system
chunk as basis, and then created needed trees or new devices.
However the layout temporary system chunk is hard-coded and uses
reserved [0, 1M) range of devid 1.
Change the temporary chunk layout from old:
0 1M 4M 5M
|<----------- temp chunk -------------->|
And it's 1:1 mapped, which means it's a SINGLE chunk,
and stripe offset is also 0.
to new layout:
0 1M 4M 5M
|<----------- temp chunk -------------->|
And still keeps the 1:1 mapping.
However this also affects btrfs_min_dev_size() which still assume
temporary chunks starts at device offset 0.
The problem can only be exposed by "-m single" or "-M" where we reuse the
temporary chunk.
With other meta profiles, system and meta chunks are allocated by later
btrfs_alloc_chunk() call, and old SINGLE chunks are removed, so it will
be no such problem for other meta profiles.
Reported-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Tested-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
[ folded fix for the minimal device size calculation ]
Signed-off-by: David Sterba <dsterba@suse.com>
For --rootdir, even for large existing file or block device, it will
always shrink the resulting filesystem.
The problem is, mkfs.btrfs will try to calculate the dir size, and use
it as @block_count to mkfs, which makes the filesystem shrunk.
Fix it by trying to get the original block device or file size as
@block_count, so mkfs.btrfs can use the full file/block device for
--rootdir option.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Commit 460e93f257 ("btrfs-progs: mkfs: check the status of file at mkfs")
will try to check the file state before creating fs on it.
The check is mostly fine for normal mkfs case, while for --rootdir
option, it's allowed to create a new file if the destination file
doesn't exist.
Fix it by allowing non-existent file if --rootdir is specified.
Fixes: 460e93f257 ("btrfs-progs: mkfs: check the status of file at mkfs")
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Make --shrink a separate option for --rootdir, and change the default to
off.
The shrinking behaviour is not a commonly used feature but can be useful
for creating minimal pre-filled images, in one step, without requiring
to mount.
Signed-off-by: Qu Wenruo <wqu@suse.com>
[ update changelog and error messages ]
Signed-off-by: David Sterba <dsterba@suse.com>
Use the new dev extent based shrink method for rootdir option. This
restores the original behaviour when --rootdir will create a minimal
filesystem size.
Signed-off-by: Qu Wenruo <wqu@suse.com>
[ update changelog ]
Signed-off-by: David Sterba <dsterba@suse.com>
Use an easier method to calculate the estimate device size for
mkfs.btrfs --rootdir.
The new method will over-estimate, but should ensure we won't encounter
ENOSPC.
It relies on the following data:
1) number of inodes -- for metadata chunk size
2) rounded up data size of each regular inode -- for data chunk size
Total meta chunk size = round_up(nr_inode * (PATH_MAX * 3 + sectorsize),
min_chunk_size) * profile_multiplier
PATH_MAX is the maximum size possible for INODE_REF/DIR_INDEX/DIR_ITEM.
Sectorsize is the maximum size possible for inline extent.
min_chunk_size is 8M for SINGLE, and 32M for DUP, get from
btrfs_alloc_chunk().
profile_multiplier is 1 for Single, 2 for DUP.
Total data chunk size is much easier.
Total data chunk size = round_up(total_data_usage, min_chunk_size) *
profile_multiplier
Total_data_usage is the sum of *rounded up* size of each regular inode
use.
min_chunk_size is 8M for SINGLE, 64M for DUP, get from btrfS_alloc_chunk().
Same profile_multiplier for meta.
This over-estimate calculate is, of course inacurrate, but since we will
later shrink the fs to its real usage, it doesn't matter much now.
Signed-off-by: Qu Wenruo <wqu@suse.com>
[ update comments ]
Signed-off-by: David Sterba <dsterba@suse.com>
Remove the custom chunk allocator for mkfs. It is buggy in connection to
the --rootdir option and puts file data to the reerved 1M area. The
feature of the custom allocator was to reserve only minimal amount of
blockgroup space. This will temporarily stop working and will need an
explicit request by option, added by following patches.
Use the generic chunk allocator.
Signed-off-by: Qu Wenruo <wqu@suse.com>
[ update changelog ]
Signed-off-by: David Sterba <dsterba@suse.com>
Cleanup of temporary chunks should be done as soon as possible, and it
should be especially before doing large tree operations, like filling
the filesystem when using --rootdir.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Since new --rootdir can allocate chunk, it will modify the chunk
allocation result.
This patch will update allocation info before verbose output to reflect
such info.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Also rename the function from size_sourcedir() to mkfs_size_dir().
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
In fact, --rootdir option is getting more and more independent from
normal mkfs code.
So move image creation function, make_image() and its related code to
mkfs/rootdir.[ch], and rename the function to btrfs_mkfs_fill_dir().
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
It's a waste of IO to fill the whole image before creating btrfs on it,
just wiping the first 1M, and then write 1 byte to the last position to
create a sparse file.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Since commit c11e36a29e ("Btrfs-progs: Do not force mixed block group
creation unless '-M' option is specified"), mkfs no longer use mixed
block group unless specified manually.
This breaks the minimal device size calculation, which only considered
mixed block group use case.
This patch enhances minimal device size calculation for mkfs, by using
different minimal stripe length (calculated from code) for different
profiles, and use them to calculate minimal device size.
Reported-by: Wesley Aptekar-Cassels <W.Aptekar@gmail.com>
Fixes: c11e36a29e ("Btrfs-progs: Do not force mixed block group creation unless '-M' option is specified")
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
[ updated comments ]
Signed-off-by: David Sterba <dsterba@suse.com>
Currently, only the status of block devices is checked at mkfs,
but we should also check for regular files whether they are already
formatted or mounted to prevent overwrite accidentally.
Device status is checked by test_dev_for_mkfs().
The part which is not related to block device is split from this
and used for both block device and regular file.
Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
test_minimum_size() function is only called to check if provided device
is large enough.
However the minimal device size only needs to be calculated once, and
can be reused everywhere.
Refactor that function to make later modification easier.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
--rootdir option will start a transaction to fill the fs, however if
something goes wrong, from ENOSPC to lack of permission, we won't commit
the transaction and cause BUG_ON triggered by uncommitted transaction:
------
extent buffer leak: start 29392896 len 16384
extent_io.c:579: free_extent_buffer: BUG_ON `eb->flags & EXTENT_DIRTY` triggered, value 1
------
The root fix is to introduce btrfs_abort_transaction() in btrfs-progs,
however in this particular case, we can workaround it by force
committing the transaction.
Since during mkfs, the magic of btrfs is set to an invalid one, without
setting fs_info->finalize_on_close() the fs is never able to be mounted.
So even we force to commit wrong transaction we won't screw up things
worse.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
For mkfs failure, especially --rootdir errors like EPERM/ENOSPC, the out
branch will overwrite the return value, causing wrong status code.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Since we're calling btrfs_search_slot() the return value can be
positive. However we just pass that return value out, causing undefined
return value.
This can cause mkfs to return 1, which indicates something wrong.
Fix it.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Mkfs uses make_path that is duplicate of path_cat* functions, so we can
switch to them and add the error handling.
Signed-off-by: David Sterba <dsterba@suse.com>
Add an objectid parameter to make the function a general one for
inserting root items and rename it to create_tree. The change cascades
down to the callchain.
Signed-off-by: yingyil <yingyil@google.com>
Signed-off-by: David Sterba <dsterba@suse.com>