Instead of aborting with a BUG_ON() statement, return a
negated errno code. Also updated mkfs and convert tools
to print a nicer error message when make_btrfs() returns
an error.
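
A minimal sketch of the pattern, not the actual patch: make_fs_sketch() and report_make_fs() are hypothetical stand-ins for make_btrfs() and its callers, and the message text is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for make_btrfs(): returns 0 or a negated errno. */
static int make_fs_sketch(size_t blocksize)
{
	char *buf;

	if (blocksize == 0)
		return -EINVAL;	/* a failure like this used to abort via BUG_ON() */

	buf = calloc(1, blocksize);
	if (!buf)
		return -ENOMEM;

	/* ... the block would be filled in and written out here ... */
	free(buf);
	return 0;
}

/* Hypothetical caller: turn the negated errno into a readable message. */
static int report_make_fs(size_t blocksize)
{
	int ret = make_fs_sketch(blocksize);

	assert(ret <= 0);	/* contract: 0 or a negated errno, never positive */
	if (ret < 0) {
		fprintf(stderr, "error during mkfs: %s\n", strerror(-ret));
		return 1;
	}
	return 0;
}
```

The caller decides how to exit; the library-style function only reports what went wrong.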
Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
We don't need callers to manage string storage for each pretty_sizes()
call. We can use a macro to have per-thread and per-call static storage
so that pretty_sizes() can be used as many times as needed in printf()
arguments without requiring a bunch of supporting variables.
This lets us have a natural interface at the cost of requiring __thread
and TLS from gcc and a small amount of static storage. This seems
better than the current code or doing something with illegible format
specifier macros.
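
Roughly, the macro approach could look like the sketch below. The names pretty_size_sketch() and pretty_size_snprintf_sketch(), the 24-byte buffer, and the unit formatting are assumptions for illustration; it relies on gcc statement expressions and __thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical formatter: writes a human-readable size into a caller buffer. */
static int pretty_size_snprintf_sketch(unsigned long long size, char *str,
				       size_t str_bytes)
{
	static const char *const units[] = { "B", "KB", "MB", "GB", "TB", "PB", "EB" };
	double value = (double)size;
	size_t i = 0;

	assert(str != NULL && str_bytes > 0);
	while (value >= 1024.0 && i < sizeof(units) / sizeof(units[0]) - 1) {
		value /= 1024.0;
		i++;
	}
	return snprintf(str, str_bytes, "%.2f%s", value, units[i]);
}

/*
 * Every expansion of the macro declares its own static, thread-local
 * buffer, so several calls can appear in one printf() argument list
 * without clobbering each other.
 */
#define pretty_size_sketch(size)					\
	({								\
		static __thread char _str[24];				\
		(void)pretty_size_snprintf_sketch((size), _str,	\
						  sizeof(_str));	\
		_str;							\
	})
```

Usage would be something like printf("%s of %s\n", pretty_size_sketch(used), pretty_size_sketch(total)). One caveat of this design: the same call site reuses its buffer, so a pointer saved from one loop iteration is overwritten by the next.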
Signed-off-by: Zach Brown <zab@redhat.com>
Acked-by: Wang Shilong <wangs.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Extend mkfs options to specify optional or potentially backwards
incompatible features.
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
When making a btrfs filesystem, we first write the root leaf to a
specified location, and then we recow the root. If we don't recow,
some trees are not in the correct block group.
Steps to reproduce:
dd if=/dev/zero of=test.img bs=1M count=100
mkfs.btrfs -f test.img
btrfs-debug-tree test.img
extent tree key (EXTENT_TREE ROOT_ITEM 0)
leaf 4210688 items 10 free space 3349 generation 4 owner 2
fs uuid 2e08fd93-f24d-4f44-a226-e2116fcd544f
chunk uuid dc482988-6246-46ce-9329-68bcf6d3683c
item 0 key (0 BLOCK_GROUP_ITEM 4194304) itemoff 3971 itemsize 24
block group used 12288 chunk_objectid 256 flags 2
[..snip..]
item 3 key (1138688 EXTENT_ITEM 4096) itemoff 3827 itemsize 42
extent refs 1 gen 1 flags 2
tree block key (0 UNKNOWN.0 0) level 0
item 4 key (1138688 TREE_BLOCK_REF 7) itemoff 3827 itemsize 0
tree block backref
[..snip..]
checksum tree key (CSUM_TREE ROOT_ITEM 0)
leaf 1138688 items 0 free space 3995 generation 1 owner 7
fs uuid 2e08fd93-f24d-4f44-a226-e2116fcd544f
chunk uuid dc482988-6246-46ce-9329-68bcf6d3683c
In the above example, the csum root leaf ends up in the system block group,
which is wrong; the csum root leaf should be in the metadata block group.
Signed-off-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com>
Reviewed-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
is_ssd() uses nondescript variable names; path - to what?
disk - it's a dev_t not a disk name, unlike dev, which is
a name not a dev_t!
Rename some vars to make things hopefully clearer:
wholedisk - the name of the node for the entire disk
devno - the dev_t of the device we're mkfs'ing
sysfs_path - the path in sysfs we ultimately check
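
For orientation, the final sysfs check could be sketched as below. This is not the is_ssd() code itself: the helper names are hypothetical, and resolving devno to wholedisk (done through libblkid in the real tool) is left out.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Build the sysfs path for a whole-disk node name such as "sda". */
static int build_rotational_path(const char *wholedisk, char *buf, size_t len)
{
	int n = snprintf(buf, len, "/sys/block/%s/queue/rotational", wholedisk);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Returns 1 if the attribute file at sysfs_path starts with '0'
 * (non-rotational), 0 otherwise, including on any error.
 */
static int is_nonrotational_sketch(const char *sysfs_path)
{
	char rotational;
	ssize_t n;
	int fd;

	assert(sysfs_path != NULL);
	fd = open(sysfs_path, O_RDONLY);
	if (fd < 0)
		return 0;
	n = read(fd, &rotational, sizeof(rotational));
	close(fd);
	return n == 1 && rotational == '0';
}
```

Treating every error as "not an ssd" keeps the check advisory: a failure only means the optimization is skipped.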
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
blkid_probe_get_wholedisk_devno() isn't available in some older
versions of libblkid. It was used to work around an old
bug in blkid_devno_to_wholedisk(), but that has been fixed since
5cd0823 libblkid: fix blkid_devno_to_wholedisk(), present in
util-linux 2.17 and beyond.
If we happen to be missing that fix, the worst that happens is
that we'd fail to detect that a device is an ssd; the upside is
that this code compiles on older systems.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
This extref feature (lifting the single file hardlink limitation) is new
and not backward compatible with older kernels that are still in wide
use.
For now, use btrfstune to enable the feature. In the future it will be
possible to turn it on within mkfs with the -O option.
Signed-off-by: David Sterba <dsterba@suse.cz>
We are going to unify enabling filesystem features via option -O.
For now, use btrfstune to enable the features.
Signed-off-by: David Sterba <dsterba@suse.cz>
This fixes up the progs to properly deal with skinny metadata. This adds the -x
option to mkfs and btrfstune for enabling the skinny metadata option. This also
makes changes to fsck so it can properly deal with the skinny metadata entries.
Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
If one of the disks is not suitable for
btrfs, mkfs fails, but we only determine
that after we have already written btrfs to the preceding disks.
By then, a user who changes their mind about using btrfs
is left with no choice.
So this patch checks that all the provided disks are
suitable for btrfs up front, before proceeding to
create btrfs on any disk.
This patch also removes duplicate code that checks
device suitability for btrfs.
Note that there is an existing bug in the -r mkfs option,
most of which this patch carries forward.
Ref:
[PATCH 2/2, RFC] btrfs-progs: overhaul mkfs.btrfs -r option
Signed-off-by: Anand Jain <anand.jain@oracle.com>
to merge prev
Signed-off-by: Anand Jain <anand.jain@oracle.com>
I missed updating the mkfs.btrfs usage() when I added the
option to force fs overwrite.
Update that, and while we're at it add a long option, since
all other commands have long counterparts.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Allocate fs_info::super_copy dynamically of full BTRFS_SUPER_INFO_SIZE
and use it directly for saving superblock to disk.
This fixes incorrect superblock checksum after mkfs.
Signed-off-by: David Sterba <dsterba@suse.cz>
The core of this is shamelessly stolen from xfsprogs.
Use blkid to detect an existing filesystem or partition
table on any of the target devices. If something is found,
require the '-f' option to overwrite it, hopefully avoiding
disaster due to mistyped devicenames, etc.
# mkfs.btrfs /dev/sda1
WARNING! - Btrfs v0.20-rc1-59-gd00279c-dirty IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using
/dev/sda1 appears to contain an existing filesystem (xfs).
Use the -f option to force overwrite.
#
This does introduce a requirement on libblkid.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Currently, the following commands succeed.
# cat /proc/swaps
Filename Type Size Used Priority
/dev/sda3 partition 8388604 0 -1
/dev/sdc8 partition 9765884 0 -2
# mkfs.btrfs /dev/sdc8
WARNING! - Btrfs v0.20-rc1-165-g82ac345 IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using
fs created label (null) on /dev/sdc8
nodesize 4096 leafsize 4096 sectorsize 4096 size 9.31GB
Btrfs v0.20-rc1-165-g82ac345
# btrfs fi sh /dev/sdc8
Label: none uuid: fc0bdbd0-7eed-460f-b4e9-131273b66df2
Total devices 1 FS bytes used 28.00KB
devid 1 size 9.31GB used 989.62MB path /dev/sdc8
Btrfs v0.20-rc1-165-g82ac345
#
But mkfs should check whether the target is an active swap device
and refuse to overwrite it. Fix it.
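
A simplified sketch of such a check, assuming a plain name comparison against the first column of a /proc/swaps-style table. The function name is hypothetical, and a real check would more likely compare device numbers via stat() so that aliases of the same device are caught.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Returns 1 if dev is listed in the swaps table, 0 if it is not,
 * and -1 if the table cannot be opened.
 */
static int is_swap_device_sketch(const char *swaps_path, const char *dev)
{
	char line[4096];
	char name[4096];
	int found = 0;
	FILE *f;

	assert(swaps_path != NULL && dev != NULL);
	f = fopen(swaps_path, "r");
	if (!f)
		return -1;

	/* The first line is the column header; skip it. */
	if (!fgets(line, sizeof(line), f)) {
		fclose(f);
		return 0;
	}
	while (fgets(line, sizeof(line), f)) {
		/* The first whitespace-delimited column is the device name. */
		if (sscanf(line, "%4095s", name) == 1 && strcmp(name, dev) == 0) {
			found = 1;
			break;
		}
	}
	fclose(f);
	return found;
}
```

In mkfs this would be called with "/proc/swaps" and each target device before anything is written.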
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Tested-by: David Sterba <dsterba@suse.cz>
size_sourcedir() uses shockingly bad code to try and estimate the size
of the files and directories in a subtree.
- Its use of snprintf(), strcat(), and sscanf() with arbitrarily small
on-stack buffers manages to overflow the stack a few times when given
long file names.
$ BIG=$(perl -e 'print "a" x 200')
$ mkdir -p /tmp/$BIG/$BIG/$BIG/$BIG/$BIG
$ mkfs.btrfs /tmp/img -r /tmp/$BIG/$BIG/$BIG/$BIG/$BIG
*** stack smashing detected ***: mkfs.btrfs terminated
- It passes raw paths to system() allowing interpreting file names as
shell control characters.
$ mkfs.btrfs /tmp/img -r /tmp/spacey\ dir/
du: cannot access `/tmp/spacey': No such file or directory
du: cannot access `dir/': No such file or directory
- It redirects du output to "temp_file" in the current directory,
allowing overwriting of files through symlinks.
$ echo hi > target
$ ln -s target temp_file
$ mkfs.btrfs /tmp/img -r /tmp/somedir/
$ cat target
3 /tmp/somedir/
This fixes the worst problems while maintaining -r functionality by
tearing out the system() code and using ftw() to walk the source tree
and sum up st.st_size.
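
The ftw() approach can be sketched as follows. This is an illustration rather than the patch: the names are hypothetical, and it only counts file entries, whereas a real estimate would also have to account for directories and metadata.

```c
#include <assert.h>
#include <ftw.h>
#include <stdio.h>
#include <sys/stat.h>

/* ftw() has no user-data argument, so the running total lives here. */
static unsigned long long total_bytes_sketch;

/* Callback invoked by ftw() for every entry in the tree. */
static int add_file_size(const char *fpath, const struct stat *st, int type)
{
	(void)fpath;
	if (type == FTW_F)
		total_bytes_sketch += (unsigned long long)st->st_size;
	return 0;	/* returning 0 tells ftw() to keep walking */
}

/* Returns 0 and stores the sum in *out, or -1 if the walk fails. */
static int size_sourcedir_sketch(const char *dir, unsigned long long *out)
{
	assert(dir != NULL && out != NULL);
	total_bytes_sketch = 0;
	if (ftw(dir, add_file_size, 16) != 0) {
		perror("ftw");
		return -1;
	}
	*out = total_bytes_sketch;
	return 0;
}
```

No shell, no temporary file and no fixed-size path buffers are involved, so names with spaces or shell metacharacters are just bytes passed to the callback.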
Signed-off-by: Zach Brown <zab@redhat.com>
David Woodhouse originally contributed this code, and Chris Mason
changed it around to reflect the current design goals for raid56.
The original code expected all metadata and data writes to be full
stripes. This meant metadata block size == stripe size, and had a few
other restrictions.
This version allows metadata blocks smaller than the stripe size. It
implements both raid5 and raid6, although it does not have code to
rebuild from parity if one of the drives is missing or incorrect.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
This patch turns on the BTRFS_FEATURE_INCOMPAT_EXTENDED_IREF superblock flag
when creating a new file system in mkfs, enabling extended inode refs.
Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Commit 605e806166847872bb91831b397d58f95027975a broke the
mkfs.btrfs -r option, because it calls make_btrfs
without ever setting dev_block_count, in the -r case,
so we tell it to make a filesystem of size 0.
Then we wander into ENOSPC land and segfault.
As a quick one-line-fix, just set the dev_block_count
to the size of the destination image file.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
This patch kills a check in mkfs's label stuff which doesn't allow labels that
have /'s in them. This causes problems for Anaconda, which tries to label volumes
with their mountpoints. Thanks,
Signed-off-by: Josef Bacik <jbacik@redhat.com>
Patch rebased because of changes in mkfs.c but otherwise the same
as created by Josef Bacik
SSDs do not gain anything by having metadata DUP turned on. The underlying
file system that is a part of all SSDs could easily map duplicate metadata
blocks into the same erase block, which effectively eliminates the benefit of
duplicating the metadata on disk. So detect if we are formatting a single
SSD drive and, if we are, do not use DUP. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Gene Czarcinski <gene@czarc.net>
Rawhide is getting cranky with posix compliance, and a few
things have stopped building.
getpagesize() is now only available -with- __USE_XOPEN_EXTENDED
or __USE_BSD, and NOT __USE_XOPEN2K.
_GNU_SOURCE must define __USE_XOPEN2K because getpagesize()
has gone away for mkfs. I gave up and used sysconf.
Also, something used to pull in stat that no longer does, so
things like S_ISREG weren't getting defined.
The following fixes things for me.
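
For reference, the two replacements amount to something like this sketch (helper names are hypothetical): sysconf(_SC_PAGESIZE) instead of getpagesize(), and an explicit <sys/stat.h> include for S_ISREG.

```c
#include <assert.h>
#include <sys/stat.h>	/* S_ISREG; no longer pulled in indirectly */
#include <unistd.h>	/* sysconf */

/* POSIX way to get the page size without getpagesize(). */
static long page_size_sketch(void)
{
	long sz = sysconf(_SC_PAGESIZE);

	assert(sz > 0);
	return sz;
}

/* Returns 1 if path names a regular file, 0 otherwise or on error. */
static int is_regular_file_sketch(const char *path)
{
	struct stat st;

	if (stat(path, &st) < 0)
		return 0;
	return S_ISREG(st.st_mode) ? 1 : 0;
}
```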
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
The kernel uses unsigned long long for u64, but PPC64 uses unsigned
long by default. This results in compilation warnings such as:
print-tree.c:333: warning: format '%llu' expects type 'long long
unsigned int', but argument 4 has type 'u64'
To fix this, the macro __KERNEL__ needs to be defined before including
the file <asm/types.h>. This can be done by defining the macro in
"kerncompat.h" and making it the first included file in the relevant
header files; this fixes the compiler warnings on PPC64.
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Wade Cline <clinew@linux.vnet.ibm.com>
Using mkfs.btrfs like:
mkfs.btrfs -l 131072 /dev/sda
returns no error, but after mounting it, dmesg reports:
BTRFS: couldn't mount because metadata blocksize (131072) was too large
The leafsize and nodesize are equal at present, so we just use one function
"check_leaf_or_node_size" to limit leaf and node size below BTRFS_MAX_METADATA_BLOCKSIZE.
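
A sketch of what such a check might look like. The 65536 value for the maximum is my understanding of BTRFS_MAX_METADATA_BLOCKSIZE, and the lower-bound and alignment checks against the sector size are assumptions added for completeness.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed value of BTRFS_MAX_METADATA_BLOCKSIZE. */
#define MAX_METADATA_BLOCKSIZE_SKETCH 65536u

/* Returns 0 if size is acceptable for both leaves and nodes, -1 if not. */
static int check_leaf_or_node_size_sketch(unsigned int size,
					  unsigned int sectorsize)
{
	/* sectorsize must be a non-zero power of two for the mask below */
	assert(sectorsize != 0 && (sectorsize & (sectorsize - 1)) == 0);

	if (size < sectorsize) {
		fprintf(stderr, "illegal leafsize (or nodesize) %u (smaller than %u)\n",
			size, sectorsize);
		return -1;
	}
	if (size > MAX_METADATA_BLOCKSIZE_SKETCH) {
		fprintf(stderr, "illegal leafsize (or nodesize) %u (larger than %u)\n",
			size, MAX_METADATA_BLOCKSIZE_SKETCH);
		return -1;
	}
	if (size & (sectorsize - 1)) {
		fprintf(stderr, "illegal leafsize (or nodesize) %u (not aligned to %u)\n",
			size, sectorsize);
		return -1;
	}
	return 0;
}
```

With this in place, the -l 131072 example above is rejected at mkfs time instead of at mount time.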
Signed-off-by: Robin Dong <sanbai@taobao.com>
Reviewed-by: David Sterba <dave@jikos.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
My patch
04609add88
introduced a regression where if you mkfs'ed a group of disks with different
sizes it limited the disks to the size of the first one that is specified.
This was not the intent of my patch, I only want it to limit the size based
on the -b option, so I've reworked the code to pass in a max block count and
that fixes the issue. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
I had a test that creates a 7gig raid1 device but it was ending up wonky
because the second device that gets added is the full size of the disk
instead of the limited size. So enforce the limited size on all disks
passed in at mkfs time, otherwise our threshold calculations end up wonky
when doing chunk allocations. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
On Wed 08-02-12 22:05:26, Phillip Susi wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 02/08/2012 06:20 PM, Jan Kara wrote:
> > Thanks for your reply. I admit I was not sure what exactly size argument
> > should be. So after looking into the code for a while I figured it should
> > be a total size of the filesystem - or differently it should be size of
> > virtual block address space in the filesystem. Thus when filesystem has
> > more devices (or admin wants to add more devices later), it can be larger
> > than the first device. But I'm not really a btrfs developper so I might be
> > wrong and of course feel free to fix the issue as you deem fit.
>
> The size of the fs is the total size of the individual disks. When you
> limit the size, you limit the size of a disk, not the whole fs. IIRC,
> mkfs initializes the fs on the first disk, which is why it was using that
> size as the size of the whole fs, and then adds the other disks after (
> which then add their size to the total fs size ).
OK, I missed that btrfs_add_to_fsid() increases total size of the
filesystem. So now I agree with you. New patch is attached. Thanks for your
review.
> It might be nice if
> mkfs could take sizes for each disk, but it only seems to take one size
> for the initial disk.
Yes, but I don't see a realistic usecase so I don't think it's really
worth the work.
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
From e5f46872232520310c56327593c02ef6a7f5ea33 Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Fri, 10 Feb 2012 11:44:44 +0100
Subject: [PATCH] mkfs: Handle creation of filesystem larger than the first device
mkfs does not properly check the requested size of the filesystem. Thus if the
requested size is larger than the first device, it happily creates a filesystem
larger than the device it resides on, which results in 'attempt to access
beyond end of device' messages from the kernel. So verify the specified filesystem
size against the size of the first device.
CC: David Sterba <dsterba@suse.cz>
Signed-off-by: Jan Kara <jack@suse.cz>
* mkfs.c (parse_size): ./mkfs.btrfs -A '' would read and possibly
write the byte before beginning of strdup'd heap buffer. All other
size-accepting options were similarly affected.
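
A sketch of a size parser that cannot index before the start of the buffer. It is not the patched parse_size(): the name, the error-return convention and the forward-scanning strtoull() approach are assumptions, and overflow of the multiplication is not checked.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/*
 * Parse "<digits>[b|k|m|g]" into bytes.
 * Returns 0 and stores the result in *out, or -1 on malformed input.
 */
static int parse_size_sketch(const char *s, unsigned long long *out)
{
	unsigned long long mult = 1;
	unsigned long long val;
	char *end;

	assert(s != NULL && out != NULL);

	/* An empty string fails here, so s[len - 1] is never evaluated. */
	if (!isdigit((unsigned char)s[0]))
		return -1;

	val = strtoull(s, &end, 10);
	switch (tolower((unsigned char)*end)) {
	case 'g':
		mult *= 1024;
		/* fall through */
	case 'm':
		mult *= 1024;
		/* fall through */
	case 'k':
		mult *= 1024;
		/* fall through */
	case 'b':
		end++;
		break;
	case '\0':
		break;
	default:
		return -1;
	}
	if (*end != '\0')
		return -1;	/* trailing garbage after the suffix */

	*out = val * mult;
	return 0;
}
```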
Reviewed-by: Josef Bacik <josef@redhat.com>
We don't allow different leaf and node blocksizes, so
this just makes the two options mean the same thing
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Before commit a46e7ff2 was merged it was possible to create dup for
data+metadata chunks (mixed mode) by giving -m raid1 -d raid1 -M to
mkfs. a46e7ff2 purposefully disabled behind the scenes profile
upgrading/downgrading, so give users a chance to pick dup explicitly and
bail if dup for data is requested in normal mode.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Currently mkfs in response to
mkfs.btrfs -d raid10 dev1 dev2
instead of telling "you can't do that" creates a SINGLE on two devices,
and only rebalance can transform it to raid0. Generally, it never warns
users about decisions it makes and it's not at all obvious which profile
it picks when.
Fix this by checking the number of effective devices and reporting back
if the specified profile is impossible to create. Do not create the FS if
an invalid profile was given.
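
The check could be sketched like this. The helper names and message text are hypothetical, and the minimum device counts (4 for raid10, 2 for raid0 and raid1) are the conventional ones, stated here as an assumption rather than taken from the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Minimum number of devices assumed for each profile; -1 if unknown. */
static int min_devices_for_profile_sketch(const char *profile)
{
	assert(profile != NULL);
	if (strcmp(profile, "raid10") == 0)
		return 4;
	if (strcmp(profile, "raid0") == 0 || strcmp(profile, "raid1") == 0)
		return 2;
	if (strcmp(profile, "single") == 0 || strcmp(profile, "dup") == 0)
		return 1;
	return -1;
}

/* Returns 0 if the profile can be created on dev_cnt devices, -1 if not. */
static int check_profile_sketch(const char *profile, int dev_cnt)
{
	int min = min_devices_for_profile_sketch(profile);

	if (min < 0) {
		fprintf(stderr, "unknown profile %s\n", profile);
		return -1;
	}
	if (dev_cnt < min) {
		fprintf(stderr,
			"unable to create FS with profile %s (have %d devices, need at least %d)\n",
			profile, dev_cnt, min);
		return -1;
	}
	return 0;
}
```

With a check like this, the raid10-on-two-devices example above fails with a message instead of silently falling back to another profile.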
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
* Initialize ret in btrfs_csum_file_block
* Do not abort when xattr is not supported in the source directory
* Remove size limitation of 256M
* Alloc data chunk in a smaller size (8M) to make btrfs image smaller
* Let user specify the btrfs image name
Depends on below patch from samsung guys:
http://marc.info/?l=linux-btrfs&m=127858068226025&w=2
Signed-off-by: Zhong, Xin <xin.zhong@intel.com>
Hello,
While going through mkfs.c, I noticed an issue with label
length checking: mkfs.btrfs crashes if the label length exceeds
255 bytes. It's easy to trigger, as shown below:
jeff@pibroch:~/opensource/btrfs-progs$ sudo ./mkfs.btrfs -L `perl -e
'print "A"x256'` /usr/src/linux-3.0/img0
WARNING! - Btrfs v0.19-35-g1b444cd IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using
*** buffer overflow detected ***: ./mkfs.btrfs terminated
======= Backtrace: =========
/lib/i386-linux-gnu/libc.so.6(__fortify_fail+0x50)[0xb7774df0]
/lib/i386-linux-gnu/libc.so.6(+0xe4cca)[0xb7773cca]
/lib/i386-linux-gnu/libc.so.6(__strcpy_chk+0x3f)[0xb777305f]
./mkfs.btrfs[0x805acc4]
./mkfs.btrfs[0x805def6]
/lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xe7)[0xb76a5e37]
./mkfs.btrfs[0x8048ef1]
======= Memory map: ========
......
A tiny patch could fix it.
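
The idea of the fix, as a sketch: check the length before copying. The helper name is hypothetical, and the 256-byte buffer size is inferred from the 255-byte limit mentioned above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed label buffer size, including the terminating NUL. */
#define LABEL_SIZE_SKETCH 256

/* Copies label into dst if it fits; returns 0 on success, -1 if too long. */
static int copy_label_sketch(char dst[LABEL_SIZE_SKETCH], const char *label)
{
	size_t len;

	assert(dst != NULL && label != NULL);
	len = strlen(label);
	if (len >= LABEL_SIZE_SKETCH) {
		fprintf(stderr, "label is too long (%zu bytes, limit is %d)\n",
			len, LABEL_SIZE_SKETCH - 1);
		return -1;
	}
	memcpy(dst, label, len + 1);	/* len + 1 also copies the NUL */
	return 0;
}
```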
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
gcc 4.6 complains about several possible use-before-initialise cases
in mkfs, and stops. Fix these by initialising one of the variables in
question, and using the correct error-handling paths for the
remainder.
Signed-off-by: Hugo Mills <hugo@carfax.org.uk>
gcc warns about a possibly uninitialized variable when compiling
with -O0:
mkfs.c:1291: error: 'file' may be used uninitialized in this function
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
Signed-off-by: Hugo Mills <hugo@carfax.org.uk>
Found by valgrind:
==2559== 16 bytes in 1 blocks are definitely lost in loss record 3 of 19
==2559== at 0x4C2720E: malloc (vg_replace_malloc.c:236)
==2559== by 0x412F7E: pretty_sizes (utils.c:1054)
==2559== by 0x4179E9: main (mkfs.c:1395)
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
Signed-off-by: Hugo Mills <hugo@carfax.org.uk>
Found by valgrind:
==8968== Use of uninitialised value of size 8
==8968== at 0x41CE7D: crc32c_le (crc32c.c:98)
==8968== by 0x40A1D0: csum_tree_block_size (disk-io.c:82)
==8968== by 0x40A2D4: csum_tree_block (disk-io.c:105)
==8968== by 0x40A7D6: write_tree_block (disk-io.c:241)
==8968== by 0x40ACEE: __commit_transaction (disk-io.c:354)
==8968== by 0x40AE9E: btrfs_commit_transaction (disk-io.c:385)
==8968== by 0x42CF66: make_image (mkfs.c:1061)
==8968== by 0x42DE63: main (mkfs.c:1410)
==8968== Uninitialised value was created by a stack allocation
==8968== at 0x42B5FB: add_inode_items (mkfs.c:493)
1. On-disk inode format has reserved (and thus, random at alloc time) fields:
btrfs_inode_item: __le64 reserved[4]
2. Sometimes extents are created on disk without writing data there.
(Or at least not all data is written there). Kernel code always had
it kzalloc'ed.
Zero them all.
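
In sketch form, with a hypothetical, trimmed-down struct standing in for btrfs_inode_item, zeroing before filling looks like this:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an on-disk inode item with reserved fields. */
struct inode_item_sketch {
	uint64_t size;
	uint32_t mode;
	uint64_t reserved[4];	/* never assigned explicitly by the caller */
};

/* Zero the whole item first so reserved fields are not stack garbage. */
static void init_inode_item_sketch(struct inode_item_sketch *item,
				   uint64_t size, uint32_t mode)
{
	assert(item != NULL);
	memset(item, 0, sizeof(*item));
	item->size = size;
	item->mode = mode;
}
```

Zeroing the whole structure rather than individual fields means fields added later are covered as well, which mirrors what kzalloc gives the kernel code.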
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
Signed-off-by: Hugo Mills <hugo@carfax.org.uk>