Now we set @refs to 2 on creating a new extent buffer, meanwhile we
allocate the needed free space, but we don't give enough free_extent_buffer()
to reduce the eb's references to zero so that the eb can finally be freed,
so the problem is we has decrease the referene count of backrefs to zero, which
ends up releasing the space occupied by the eb, and this space can be allocated
again for something else(another eb or disk), usually a crash(core dump) will
occur, I've hit a crash in rb_insert() because another eb re-use the space while
the original one is floating around.
We should do the same thing as the kernel code does, it's necessary to initialize
@refs to 1 instead of 2, this helps us get rid of the above problem.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
This fixes the regression introduced with the patch
btrfs-progs: avoid write to the disk before sure to create fs
what happened with this patch is it missed the check to see if the
user has the option set before pushing the defaults.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
The feature has been introduced in kernel 3.7 and enabling it by
default is desired.
All features enabled by default are marked as such in
'mkfs.btrfs -O list-all' output.
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
A way of disabling features that are on by default in case it's not
wanted, eg. due to lack of support in the used kernel.
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
16KB is faster and leads to less metadata fragmentation in almost all
workloads. It does slightly increase lock contention on the root nodes
in some workloads, but that is best dealt with by adding more subvolumes
(for now).
This uses 16KB or the page size, whichever is bigger. If you're doing a
mixed block group mkfs, it uses the sectorsize instead.
Since the kernel refuses to mount a mixed block group FS where the
metadata leaf size doesn't match the data sectorsize, this also adds a
similar check during mkfs.
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
add_file_items() leaked "buffer" on this error return.
Free it first.
Resolves-Coverity-CID: 1125937
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
So I needed to add a flag to not try to read block groups when doing
--init-extent-tree since we could hang there, but that meant adding a whole
other 0/1 type flag to open_ctree_fs_info. So instead I've converted it all
over to using a flags setting and added the flag that I needed. This has been
tested with xfstests and make test. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
mkfs -r wasn't creating chunks properly, making it very difficult to
allocate space for anything except tiny filesystems.
This changes it around to use more of the generic infrastructure, and
to do actual logical->physical block number translation.
It also allocates space to the files in smaller extents (max 1MB), which
keeps the allocator from trying to allocate an extent bigger than a
single chunk.
It doesn't quite support multi-device mkfs -r yet, but is much closer.
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Signed-off-by: "Chris West (Faux)" <git@goeswhere.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
The previous patch works fine if the size of specified volume to mkfs
is less than 4MB. However usually btrfs requires more than 4MB to work,
and the minimum preferred size is depending on the raid setting etc.
This patch let mkfs print error message if it cannot allocate one of
chunks should be there at first.
[before]
# truncate --size=4500K testfile
# ./mkfs.btrfs -f testfile
:
SMALL VOLUME: forcing mixed metadata/data groups
mkfs.btrfs: mkfs.c:84: make_root_dir: Assertion `!(ret)' failed.
Aborted (core dumped)
[After]
# truncate --size=4500K testfile
# ./mkfs.btrfs -f testfile
:
SMALL VOLUME: forcing mixed metadata/data groups
no space to alloc data/metadata chunk
failed to setup the root directory
TBD is calculate minimum size for setting and put it in the error
message to let user know how large amount of volume is required.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Eric pointed out that mkfs abort if specified volume is too small:
# truncate --size=2m testfile
# ./mkfs.btrfs testfile
:
SMALL VOLUME: forcing mixed metadata/data groups
mkfs.btrfs: volumes.c:852: btrfs_alloc_chunk: Assertion `!(ret)' failed.
Aborted (core dumped)
As the first step to fix problems around there, let mkfs to report
error if the size of target volume is less than the size of the first
system block group, BTRFS_MKFS_SYSTEM_GROUP_SIZE (= 4MB).
Reported-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
The local probe variable in is_ssd() freed upon unsuccessful return;
The local dir_head list in make_image() freed upon unsuccessful return.
Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
mkfs.c: In function ‘is_ssd’:
mkfs.c:1168:26: warning: ignoring return value of ‘blkid_devno_to_wholedisk’,
declared with attribute warn_unused_result [-Wunused-result]
blkid_devno_to_wholedisk(devno, wholedisk, sizeof(wholedisk), NULL);
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
This fix the regression introduced by 830427d
that it no more creates the FS if disk is small
and if no mixed option is provided.
This patch will bring it to the original design
which will force mixed profile when disk is small
and go ahead to create the FS.
Which also means that before we open the device
for the write we should also check if disk is small.
v2: fixes the checkpatch.pl warnings
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
This patch provides fix for the following bug,
When mkfs.btrfs fails the disks shouldn't be written.
------------
btrfs fi show /dev/sdb
Label: none uuid: 60fb76f4-3b4d-4632-a7da-6a44dea5573d
Total devices 1 FS bytes used 24.00KiB
devid 1 size 2.00GiB used 20.00MiB path /dev/sdb
mkfs.btrfs -dsingle -mraid1 /dev/sdb -f
::
unable to create FS with metadata profile 16 (have 1 devices)
btrfs fi show /dev/sdb
Label: none uuid: 2da2179d-ecb1-4a4e-a44d-e7613a08c18d
Total devices 1 FS bytes used 24.00KiB
devid 1 size 2.00GiB used 20.00MiB path /dev/sdb
-------------
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Before this change, passing -O skinny-metadata to mkfs.btrfs would
only set the skinny metadata incompat flag in the super block after
the filesystem was created. This change makes mkfs.btrfs directly
create a filesystem with only skinny extents for metadata.
Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
These were mostly in option structs but there were a few gross string
pointer arguments given as 0.
Signed-off-by: Zach Brown <zab@redhat.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
sparse can freak out when <linux/fs.h> is included because it redefines
approximately a gazillion symbols already found in <sys/mount.h>:
/usr/include/linux/fs.h:203:9: warning: preprocessor token MS_RDONLY redefined
/usr/include/sys/mount.h:37:9: this was the original definition
Happily, we don't actually need to include the low-level <linux/fs.h>
for anything. One assumes it was just carried over from kernel space.
Signed-off-by: Zach Brown <zab@redhat.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
__CHECKER__ is only for the type juggling used to tell sparse which
types need conversion between address spaces. It is not OK to use to
change the code that gets checked to avoid bugs elsewhere in the build
infrastructure. We want to check the code that builds when the checker
isn't enabled.
Signed-off-by: Zach Brown <zab@redhat.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Commit 8d082fb727ac11930ea20bf1612e334ea7c2b697 (Btrfs: do not mount when
we have a sectorsize unequal to PAGE_SIZE) requires the sectorsize to be
equal to the pagesize for the filesystem to be mountable.
The nodesize and leafsize should be equal, and not larger than 65536.
Adding this information to the manpage and usage instructions of mkfs.btrfs.
Signed-off-by: Koen De Wit <koen.de.wit@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Port of commit b3b4aa7 to userspace.
parameter tree root it's not used since commit
5f39d397dfbe140a14edecd4e73c34ce23c4f9ee ("Btrfs: Create extent_buffer
interface for large blocksizes")
This gets userspace a tad closer to kernelspace by removing
this unused parameter that was all over the codebase...
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Instead of aborting with a BUG_ON() statement, return a
negated errno code. Also updated mkfs and convert tools
to print a nicer error message when make_btrfs() returns
an error.
Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
We don't need callers to manage string storage for each pretty_sizes()
call. We can use a macro to have per-thread and per-call static storage
so that pretty_sizes() can be used as many times as needed in printf()
arguments without requiring a bunch of supporting variables.
This lets us have a natural interface at the cost of requiring __thread
and TLS from gcc and a small amount of static storage. This seems
better than the current code or doing something with illegible format
specifier macros.
Signed-off-by: Zach Brown <zab@redhat.com>
Acked-by: Wang Shilong <wangs.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Extend mkfs options to specify optional or potentially backwards
incompatible features.
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
When making btrfs filesystem. we firstly write root leaf to
specified filed, and then we recow the root. If we don't recow,
some trees are not in the correct block group.
Steps to reproduce:
dd if=/dev/zero of=test.img bs=1M count=100
mkfs.btrfs -f test.img
btrfs-debug-tree test.img
extent tree key (EXTENT_TREE ROOT_ITEM 0)
leaf 4210688 items 10 free space 3349 generation 4 owner 2
fs uuid 2e08fd93-f24d-4f44-a226-e2116fcd544f
chunk uuid dc482988-6246-46ce-9329-68bcf6d3683c
item 0 key (0 BLOCK_GROUP_ITEM 4194304) itemoff 3971 itemsize 24
block group used 12288 chunk_objectid 256 flags 2
[..snip..]
item 3 key (1138688 EXTENT_ITEM 4096) itemoff 3827 itemsize 42
extent refs 1 gen 1 flags 2
tree block key (0 UNKNOWN.0 0) level 0
item 4 key (1138688 TREE_BLOCK_REF 7) itemoff 3827 itemsize 0
tree block backref
[..snip..]
checksum tree key (CSUM_TREE ROOT_ITEM 0)
leaf 1138688 items 0 free space 3995 generation 1 owner 7
fs uuid 2e08fd93-f24d-4f44-a226-e2116fcd544f
chunk uuid dc482988-6246-46ce-9329-68bcf6d3683c
For the above example, csum root leaf comes into system block group which
is wrong,csum root leaf should be in metadata block group.
Signed-off-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com>
Reviewed-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
is_ssd() uses nondescript variable names; path - to what?
disk - it's a dev_t not a disk name, unlike dev, which is
a name not a dev_t!
Rename some vars to make things hopefully clearer:
wholedisk - the name of the node for the entire disk
devno - the dev_t of the device we're mkfs'ing
sysfs_path - the path in sysfs we ultimately check
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
blkid_probe_get_wholedisk_devno() isn't available in some older
versions of libblkid. It was used to work around an old
bug in blkid_devno_to_wholedisk(), but that has been fixed since
5cd0823 libblkid: fix blkid_devno_to_wholedisk(), present in
util-linux 2.17 and beyond.
If we happen to be missing that fix, the worst that happens is
that we'd fail to detect that a device is an ssd; the upside is
that this code compiles on older systems.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
This extref feature (lifting the single file hardlink limitation) is new
and not backward compatible with older kernels that are still in wide
use.
For now, use btrfstune to enable the feature, in the future it will be
possible to turn it on within mkfs by -O option.
Signed-off-by: David Sterba <dsterba@suse.cz>
We are going to unify enabling filesystem features via option -O.
For now, use btrfstune to enable the features.
Signed-off-by: David Sterba <dsterba@suse.cz>
This fixes up the progs to properly deal with skinny metadata. This adds the -x
option to mkfs and btrfstune for enabling the skinny metadata option. This also
makes changes to fsck so it can properly deal with the skinny metadata entries.
Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
In the cases where one of the disk is not suitable for
btrfs, then we would fail the mkfs, however we determine
that after we have written btrfs to the preceding disks.
At this time if user changes mind for not to use btrfs
will left with no choice.
So this patch will check if all the provided disks are
suitable for the btrfs at once before proceeding to
create btrfs on a disk.
Further this patch also removed duplicate code to check
device suitability for the btrfs.
Next, there is an existing bug about the -r mkfs option,
which this patch would carry forward most of it.
Ref:
[PATCH 2/2, RFC] btrfs-progs: overhaul mkfs.btrfs -r option
Signed-off-by: Anand Jain <anand.jain@oracle.com>
to merg prev
Signed-off-by: Anand Jain <anand.jain@oracle.com>
I missed updating the mkfs.btrfs usage() when I added the
option to force fs overwrite.
Update that, and while we're at it add a long option, since
all other commands have long counterparts.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Allocate fs_info::super_copy dynamically of full BTRFS_SUPER_INFO_SIZE
and use it directly for saving superblock to disk.
This fixes incorrect superblock checksum after mkfs.
Signed-off-by: David Sterba <dsterba@suse.cz>
The core of this is shamelessly stolen from xfsprogs.
Use blkid to detect an existing filesystem or partition
table on any of the target devices. If something is found,
require the '-f' option to overwrite it, hopefully avoiding
disaster due to mistyped devicenames, etc.
# mkfs.btrfs /dev/sda1
WARNING! - Btrfs v0.20-rc1-59-gd00279c-dirty IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using
/dev/sda1 appears to contain an existing filesystem (xfs).
Use the -f option to force overwrite.
#
This does introduce a requirement on libblkid.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Currently, the following commands succeed.
# cat /proc/swaps
Filename Type Size Used Priority
/dev/sda3 partition 8388604 0 -1
/dev/sdc8 partition 9765884 0 -2
# mkfs.btrfs /dev/sdc8
WARNING! - Btrfs v0.20-rc1-165-g82ac345 IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using
fs created label (null) on /dev/sdc8
nodesize 4096 leafsize 4096 sectorsize 4096 size 9.31GB
Btrfs v0.20-rc1-165-g82ac345
# btrfs fi sh /dev/sdc8
Label: none uuid: fc0bdbd0-7eed-460f-b4e9-131273b66df2
Total devices 1 FS bytes used 28.00KB
devid 1 size 9.31GB used 989.62MB path /dev/sdc8
Btrfs v0.20-rc1-165-g82ac345
#
But we should check out the swap device. Fixed it.
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Tested-by: David Sterba <dsterba@suse.cz>
size_sourcedir() uses shockingly bad code to try and estimate the size
of the files and directories in a subtree.
- Its use of snprintf(), strcat(), and sscanf() with arbitrarily small
on-stack buffers manages to overflow the stack a few times when given
long file names.
$ BIG=$(perl -e 'print "a" x 200')
$ mkdir -p /tmp/$BIG/$BIG/$BIG/$BIG/$BIG
$ mkfs.btrfs /tmp/img -r /tmp/$BIG/$BIG/$BIG/$BIG/$BIG
*** stack smashing detected ***: mkfs.btrfs terminated
- It passes raw paths to system() allowing interpreting file names as
shell control characters.
$ mkfs.btrfs /tmp/img -r /tmp/spacey\ dir/
du: cannot access `/tmp/spacey': No such file or directory
du: cannot access `dir/': No such file or directory
- It redirects du output to "temp_file" in the current directory,
allowing overwriting of files through symlinks.
$ echo hi > target
$ ln -s target temp_file
$ mkfs.btrfs /tmp/img -r /tmp/somedir/
$ cat target
3 /tmp/somedir/
This fixes the worst problems while maintaining -r functionality by
tearing out the system() code and using ftw() to walk the source tree
and sum up st.st_size.
Signed-off-by: Zach Brown <zab@redhat.com>
David Woodhouse originally contributed this code, and Chris Mason
changed it around to reflect the current design goals for raid56.
The original code expected all metadata and data writes to be full
stripes. This meant metadata block size == stripe size, and had a few
other restrictions.
This version allows metadata blocks smaller than the stripe size. It
implements both raid5 and raid6, although it does not have code to
rebuild from parity if one of the drives is missing or incorrect.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
This patch turns on the BTRFS_FEATURE_INCOMPAT_EXTENDED_IREF superblock flag
when creating a new file system in mkfs, enabling extended inode refs.
Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Commit 605e806166 broke the
mkfs.btrfs -r option, because it calls make_btrfs
without ever setting dev_block_count, in the -r case,
so we tell it to make a filesystem of size 0.
Then we wander into ENOSPC land and segfault.
As a quick one-line-fix, just set the dev_block_count
to the size of the destination image file.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
This patch kills a check in mkfs's label stuff which doesn't allow labels that
have /'s in them. This causes problems for Anaconda which try to label volumes
with their mountpoints. Thanks,
Signed-off-by: Josef Bacik <jbacik@redhat.com>
Patch rebased because of changes in mkfs.c but otherwise the same
as created by Josef Bacik
SSD's do not gain anything by having metadata DUP turned on. The underlying
file system that is a part of all SSD's could easily map duplicate metadat
blocks into the same erase block which effectively eliminates the benefit of
duplicating the metadata on disk. So detect if we are formatting a single
SSD drive and if we are do not use DUP. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Gene Czarcinski <gene@czarc.net>
Rawhide is getting cranky with posix compliance, and a few
things have stopped building.
getpagesize() is now only available -with- __USE_XOPEN_EXTENDED
or __USE_BSD, and NOT __USE_XOPEN2K.
_GNU_SOURCE must define __USE_XOPEN2K because getpagesize()
has gone away for mkfs. I gave up and used sysconf.
Also, something used to pull in stat that no longer does, so
things like S_ISREG weren't getting defined.
The following fixes things for me.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>