The existence of a filesystem on a device is manifested by the signature.
During the mkfs process we write it first and then create the other
structures. Until that is done, the filesystem is not valid and should
not be registered during device scan nor listed among devices by blkid.
This patch introduces two-stage creation. In the first phase, the
signature is wrong, but recognized as a partially created filesystem (by
open or scan helpers). Once we successfully create and write everything,
we fix up the signature. At this point automated scanning should find
a valid filesystem on all devices.
We can also rely on the partially created filesystem to do better error
handling during creation. We can just bail out and do not need to clean
up.
The partial signature is '!BHRfS_M' and can be shown by
btrfs inspect-internal dump-super -F image
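The two-stage scheme can be sketched as a check on the 8-byte magic. This is a minimal illustration, not the btrfs-progs code: the enum and the function name are hypothetical, and only the two magic strings (the regular btrfs signature and the partial one from this patch) are taken from the format.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_MAGIC_LEN 8

/* Final on-disk btrfs signature and the temporary one from this patch. */
static const char sketch_magic_valid[] = "_BHRfS_M";
static const char sketch_magic_partial[] = "!BHRfS_M";

/* Hypothetical classification used only in this sketch. */
enum sketch_sb_state {
	SKETCH_SB_UNKNOWN,	/* not a btrfs signature at all */
	SKETCH_SB_PARTIAL,	/* mkfs started but did not finish */
	SKETCH_SB_VALID,	/* fully created filesystem */
};

/* Classify the 8 magic bytes read from a superblock. */
enum sketch_sb_state sketch_classify_magic(const unsigned char *magic)
{
	assert(magic != NULL);
	if (memcmp(magic, sketch_magic_valid, SKETCH_MAGIC_LEN) == 0)
		return SKETCH_SB_VALID;
	if (memcmp(magic, sketch_magic_partial, SKETCH_MAGIC_LEN) == 0)
		return SKETCH_SB_PARTIAL;
	return SKETCH_SB_UNKNOWN;
}
```

A scan helper built this way would register only SKETCH_SB_VALID devices, while mkfs error paths can leave a SKETCH_SB_PARTIAL device behind without further cleanup.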
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We call scan ioctl on the devices too early, when most of the filesystem
structures are not yet created. Move the registration to the end, after
the filesystem gets closed.
Signed-off-by: David Sterba <dsterba@suse.com>
Do not use fprintf, adjust messages, add verbose errno or at least the
error code if there's no clear mapping to a string.
Signed-off-by: David Sterba <dsterba@suse.com>
The message about discard is printed unconditionally and does not
conform to the --quiet option, e.g. in mkfs. Consolidate the operation
flags into one argument and add support for verbosity.
Signed-off-by: David Sterba <dsterba@suse.com>
When cleanup_temp_chunks() removes block groups, it forgets to update
mkfs_allocation accordingly. Fix this.
Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
[ minor adjustments ]
Signed-off-by: David Sterba <dsterba@suse.com>
stripesize should ideally be set to the value of sectorsize. However
previous versions of btrfs-progs/mkfs.btrfs had set stripesize to a
value of 4096. On machines with PAGE_SIZE other than 4096, this could
lead to the following scenario:
- /dev/loop0, /dev/loop1 and /dev/loop2 are mounted as a single
filesystem. The filesystem was created by an older version of mkfs.btrfs
which set stripesize to 4k.
- losetup -a
/dev/loop0: [0030]:19477 (/root/disk-imgs/file-0.img)
/dev/loop1: [0030]:16577 (/root/disk-imgs/file-1.img)
/dev/loop2: [64770]:3423229 (/root/disk-imgs/file-2.img)
- /etc/mtab lists only /dev/loop0
- losetup /dev/loop4 /root/disk-imgs/file-1.img
The new mkfs.btrfs invoked as 'mkfs.btrfs -f /dev/loop4' succeeds even
though /dev/loop1 has already been mounted and has
/root/disk-imgs/file-1.img as its backing file.
The above behaviour occurs because check_super() function returns an
error code (due to stripesize not being set to 4096) and hence
check_mounted_where() function treats /dev/loop1 as a disk containing a
filesystem other than Btrfs.
Hence as a workaround this commit allows 4096 as a valid stripesize.
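The relaxed check can be sketched as follows. This is a minimal illustration under stated assumptions: the function and macro names are hypothetical and do not reproduce check_super(); only the rule itself (accept sectorsize, and also the legacy 4096) comes from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Value hard-coded by older mkfs.btrfs releases, per the text above. */
#define SKETCH_LEGACY_STRIPESIZE 4096u

/*
 * Accept a stripesize that matches sectorsize, and also the legacy 4096,
 * so that filesystems made by older tools are still recognized as btrfs.
 */
bool sketch_stripesize_ok(uint32_t stripesize, uint32_t sectorsize)
{
	assert(sectorsize != 0);
	return stripesize == sectorsize ||
	       stripesize == SKETCH_LEGACY_STRIPESIZE;
}
```

With a 64k sectorsize, a superblock carrying stripesize 4096 passes this check, so the mounted-device test no longer mistakes the device for a non-btrfs disk.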
Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Introduce a new function, make_convert_data_chunks(), to build up data
chunks for convert.
It will call a modified version of btrfs_alloc_data_chunk() to force
data chunks to cover all known ext* data.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Introduce a new function make_convert_btrfs() for convert.
This new function has the following features:
1) Allocate temporary sb/metadata/system chunks, avoiding old used data
2) More structured functions
No more functions of over 1000 lines, better function split and code
reuse
This will eventually replace the current make_btrfs(), but for now it is
only used for convert.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
ftw_add_entry_size() assumes 4k as the block size of the underlying
filesystem and hence the computed file sizes are incorrect for non-4k
sectorsized filesystems. Fix this by rounding up file sizes to
sectorsize.
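The rounding itself can be sketched as below. The helper name is hypothetical, and the sketch assumes sectorsize is a power of two, which is what makes the mask form valid.

```c
#include <assert.h>
#include <stdint.h>

/* Round size up to the next multiple of sectorsize (a power of two). */
uint64_t sketch_round_up(uint64_t size, uint64_t sectorsize)
{
	/* The mask trick below is only valid for powers of two. */
	assert(sectorsize != 0 && (sectorsize & (sectorsize - 1)) == 0);
	return (size + sectorsize - 1) & ~(sectorsize - 1);
}
```

A 1-byte file then accounts for 4096 bytes on a 4k filesystem but 65536 bytes on a 64k one, which is the difference a hard-coded 4k misses.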
Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: David Sterba <dsterba@suse.com>
In current versions of util-linux the buffer passed to blkid_devno_to_wholedisk
has to be sufficiently large to not only hold the device name but the complete
target of the /sys/dev/block/<maj:min> symlink. This was changed only recently
in 4419ffb9eff5801fdbd385a4a6199b3877f802ad.
The small buffer size currently can lead to failure of is_ssd due to truncated
device names:
readlink("/sys/dev/block/254:7", "../../devices/virtual/block/dm-", 31) = 31
open("/sys/block/dm-/queue/rotational", O_RDONLY) = -1 ENOENT (No such file or directory)
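The truncation can be reproduced with a small simulation. This sketch does not call readlink(2) or libblkid; sketch_readlink() only mimics the readlink behaviour of copying at most buflen bytes without a terminating NUL, and both function names are hypothetical. The full target ending in "dm-7" used in the note below is an assumed example, chosen to match the 31-byte prefix in the trace.

```c
#include <assert.h>
#include <string.h>

/*
 * Mimic readlink(2): copy at most buflen bytes of target into buf,
 * without a terminating NUL, and return the number of bytes copied.
 */
size_t sketch_readlink(const char *target, char *buf, size_t buflen)
{
	size_t len = strlen(target);

	if (len > buflen)
		len = buflen;	/* silent truncation, as readlink does */
	memcpy(buf, target, len);
	return len;
}

/*
 * Extract the last path component of the (possibly truncated) link
 * target into name. Returns 0 on success, -1 if name is too small.
 */
int sketch_link_to_name(const char *target, size_t buflen,
			char *name, size_t namelen)
{
	char buf[4096];
	size_t len, start;

	assert(buflen <= sizeof(buf));
	len = sketch_readlink(target, buf, buflen);
	/* Walk back to the character after the last '/'. */
	for (start = len; start > 0 && buf[start - 1] != '/'; start--)
		;
	if (len - start + 1 > namelen)
		return -1;
	memcpy(name, buf + start, len - start);
	name[len - start] = '\0';
	return 0;
}
```

With a 31-byte buffer and the assumed target "../../devices/virtual/block/dm-7", the extracted name is "dm-", matching the failing open() in the trace; a buffer large enough for the whole target yields "dm-7".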
Signed-off-by: Michael Lass <bevan@bi-co.net>
Signed-off-by: David Sterba <dsterba@suse.com>
Rewrite the loop so we don't need to allocate sectorsize and write in 4k
steps instead. We know that sectorsize is divisible by 4096.
Signed-off-by: David Sterba <dsterba@suse.com>
With the rootdir option we try to guess the final size of the image and
fill it with zeros, preceded by truncation. After patch
"Btrfs-progs: Do not force mixed block group creation unless '-M' option
is specified"
the misc test 002 will fail, because of the non-mixed mode. I think we
should not touch the image size (no change for block devices) and try to
fit into whatever is provided by user.
Signed-off-by: David Sterba <dsterba@suse.com>
The variable dev_uuid and the uuid_unparse() call that sets its value are
not used, remove them.
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
list_for_each_entry_reverse() in the current code cannot output
devices in sorted order, because the sequence is broken in
btrfs_alloc_chunk().
We can use list_sort() instead.
Before patch:
# mkfs.btrfs -f /dev/vdd /dev/vde /dev/vdf
...
Number of devices: 3
Devices:
ID SIZE PATH
3 2.60GiB /dev/vdf
1 2.60GiB /dev/vdd
2 2.60GiB /dev/vde
After patch:
# mkfs.btrfs -f /dev/vdd /dev/vde /dev/vdf
...
Number of devices: 3
Devices:
ID SIZE PATH
1 2.60GiB /dev/vdd
2 2.60GiB /dev/vde
3 2.60GiB /dev/vdf
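The effect of sorting by device id can be sketched with the C library. Note the swapped-in technique: the patch uses list_sort() on a linked list, while this sketch uses qsort() on an array to stay self-contained. The reduced device structure and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, reduced device record; the real structure has more fields. */
struct sketch_device {
	uint64_t devid;
	uint64_t total_bytes;
	const char *path;
};

/* qsort comparator: ascending devid, written to avoid integer overflow. */
static int sketch_cmp_devid(const void *a, const void *b)
{
	const struct sketch_device *da = a;
	const struct sketch_device *db = b;

	return (da->devid > db->devid) - (da->devid < db->devid);
}

/* Sort the devices in place so they print in devid order. */
void sketch_sort_devices(struct sketch_device *devs, size_t count)
{
	assert(devs != NULL || count == 0);
	if (count > 1)
		qsort(devs, count, sizeof(*devs), sketch_cmp_devid);
}
```

Feeding it the "Before patch" order (3, 1, 2) gives the "After patch" order (1, 2, 3).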
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We no longer force mixed-bg mode since "Btrfs-progs: Do not force mixed
block group creation unless '-M' option is specified", so the message is
not relevant anymore.
Signed-off-by: David Sterba <dsterba@suse.com>
mkfs.btrfs allows creation of Btrfs filesystem instances with mixed block
group feature enabled and having a sectorsize different from nodesize.
For example:
[root@localhost btrfs-progs]# mkfs.btrfs -f -M -s 4096 -n 16384 /dev/loop0
Forcing mixed metadata/data groups
btrfs-progs v3.19-rc2-404-gbbbd18e-dirty
See http://btrfs.wiki.kernel.org for more information.
Performing full device TRIM (4.00GiB) ...
Label: (null)
UUID: c82b5720-6d88-4fa1-ac05-d0d4cb797fd5
Node size: 16384
Sector size: 4096
Filesystem size: 4.00GiB
Block group profiles:
Data+Metadata: single 8.00MiB
System: single 4.00MiB
SSD detected: no
Incompat features: mixed-bg, extref, skinny-metadata
Number of devices: 1
Devices:
ID SIZE PATH
1 4.00GiB /dev/loop6
This commit fixes the issue by setting BTRFS_FEATURE_INCOMPAT_MIXED_GROUPS
feature bit before checking the validity of nodesize that was specified on the
command line.
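The ordering issue can be sketched as follows. The function names are hypothetical and the checks are reduced to the ones relevant here; the bit value mirrors what is believed to be the kernel's definition of the mixed groups incompat flag and should be treated as an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror the kernel's incompat bit for mixed block groups. */
#define SKETCH_INCOMPAT_MIXED_GROUPS (1ULL << 2)

/*
 * Validate nodesize against the feature set. With mixed block groups,
 * data and metadata share block groups, so nodesize must equal sectorsize.
 */
bool sketch_nodesize_ok(uint32_t nodesize, uint32_t sectorsize,
			uint64_t features)
{
	assert(sectorsize != 0);
	if (nodesize < sectorsize || nodesize % sectorsize != 0)
		return false;
	if ((features & SKETCH_INCOMPAT_MIXED_GROUPS) &&
	    nodesize != sectorsize)
		return false;
	return true;
}

/* Set the feature bit first, then check; the order is the point of the fix. */
bool sketch_check_options(bool mixed, uint32_t nodesize, uint32_t sectorsize)
{
	uint64_t features = 0;

	if (mixed)
		features |= SKETCH_INCOMPAT_MIXED_GROUPS;
	return sketch_nodesize_ok(nodesize, sectorsize, features);
}
```

If the bit were set only after the call to the check, the "-M -s 4096 -n 16384" combination from the example would pass validation, which is the behaviour described above.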
Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When creating small Btrfs filesystem instances (i.e. filesystem size <= 1GiB),
mkfs.btrfs fails if both sectorsize and nodesize are specified on the command
line and sectorsize != nodesize, since mixed block groups involves both data
and metadata blocks sharing the same block group. This is an incorrect behavior
when '-M' option isn't specified on the command line.
This commit makes the creation of mixed block groups optional, i.e. mixed block
groups are created only when the -M option is specified on the command line.
Since we now allow small filesystem instances with sectorsize != nodesize to
be created, we can end up in the following situation,
[root@localhost ~]# mkfs.btrfs -f -n 65536 /dev/loop0
btrfs-progs v3.19-rc2-405-g976307c
See http://btrfs.wiki.kernel.org for more information.
Performing full device TRIM (512.00MiB) ...
Label: (null)
UUID: 49fab72e-0c8b-466b-a3ca-d1bfe56475f0
Node size: 65536
Sector size: 4096
Filesystem size: 512.00MiB
Block group profiles:
Data: single 8.00MiB
Metadata: DUP 40.00MiB
System: DUP 12.00MiB
SSD detected: no
Incompat features: extref, skinny-metadata
Number of devices: 1
Devices:
ID SIZE PATH
1 512.00MiB /dev/loop0
[root@localhost ~]# mount /dev/loop0 /mnt/
mount: mount /dev/loop0 on /mnt failed: No space left on device
The ENOSPC occurs during the creation of the UUID tree. This is because of
things like large metadata block size, DUP mode used for metadata and global
reservation consuming space. Also, large nodesize does not make sense on small
filesystems, hence this should not be an issue.
Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This patch is generated from a coccinelle semantic patch:
@@
identifier t;
expression e;
statement s;
@@
-t = malloc(e);
+t = calloc(1, e);
(
if (!t) s
|
if (t == NULL) s
|
)
-memset(t, 0, e);
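In plain C, the transformation performed by the semantic patch corresponds to the following before and after pair. The function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Before: allocate, check, then clear by hand. */
void *sketch_alloc_zeroed_old(size_t size)
{
	void *buf = malloc(size);

	if (!buf)
		return NULL;
	memset(buf, 0, size);
	return buf;
}

/* After: calloc() returns memory that is already zero-filled. */
void *sketch_alloc_zeroed_new(size_t size)
{
	return calloc(1, size);
}

/* Helper for checking the result: true if all bytes are zero. */
bool sketch_is_zeroed(const unsigned char *buf, size_t size)
{
	size_t i;

	assert(buf != NULL);
	for (i = 0; i < size; i++)
		if (buf[i] != 0)
			return false;
	return true;
}
```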
Signed-off-by: Silvio Fricke <silvio.fricke@gmail.com>
[squashed patches into one]
Signed-off-by: David Sterba <dsterba@suse.com>
It was pointed out to me that is_block_device() returns
1 if the file is a block device,
< 0 in case of an error (eg: file not found)
0 otherwise
This patch makes proper return checks at all the places
where is_block_device() is used. Thanks to Goffredo.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Suggested-by: Goffredo Baroncelli <kreijack@inwind.it>
Signed-off-by: David Sterba <dsterba@suse.com>
Check nodesize against features, not only sectorsize.
In fact, btrfs-convert and mkfs differ in their nodesize
check.
This patch also provides the basis for a later btrfs-convert fix.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Added a missing newline to some error messages.
Also printf() was changed to fprintf(stderr) for error messages.
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Reviewed-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
mkfs creates more than one fs_devices in fs_uuids.
1: one is for file system being created
2: others are created in test_dev_for_mkfs in order to check mount point
test_dev_for_mkfs()-> ... -> btrfs_scan_one_device()
The current code only closes 1; this patch also closes the ones from case 2.
Similar problems exist in other tools, e.g.:
cmd-check.c: the function is:
cmd_check()->check_mounted()-> ... -> btrfs_scan_one_device()
...
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
In the parse_profile() function, the error handling path prints an error
message but forgets to exit(1), so even if the profile is not valid, it
will just fall back to single.
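The intended behaviour can be sketched as a parser that reports unknown names instead of defaulting. The enum and function name are hypothetical; the real code works with block group flag bits and calls exit(1) directly, whereas this sketch returns an error value so the caller decides.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical profile identifiers; the real code uses block group flag bits. */
enum sketch_profile {
	SKETCH_PROFILE_INVALID = -1,
	SKETCH_PROFILE_SINGLE,
	SKETCH_PROFILE_DUP,
	SKETCH_PROFILE_RAID0,
	SKETCH_PROFILE_RAID1,
	SKETCH_PROFILE_RAID10,
};

/*
 * Map a profile name to its identifier. Unknown names are reported as
 * invalid instead of silently falling back to single.
 */
enum sketch_profile sketch_parse_profile(const char *s)
{
	assert(s != NULL);
	if (strcmp(s, "single") == 0)
		return SKETCH_PROFILE_SINGLE;
	if (strcmp(s, "dup") == 0)
		return SKETCH_PROFILE_DUP;
	if (strcmp(s, "raid0") == 0)
		return SKETCH_PROFILE_RAID0;
	if (strcmp(s, "raid1") == 0)
		return SKETCH_PROFILE_RAID1;
	if (strcmp(s, "raid10") == 0)
		return SKETCH_PROFILE_RAID10;
	return SKETCH_PROFILE_INVALID;
}
```

A caller would print the error and exit with status 1 when it sees SKETCH_PROFILE_INVALID, rather than continuing with single.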
Reported-by: James Harvey <jamespharvey20@gmail.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
# mkfs.btrfs /dev/sdb /dev/sdd -m raid0 -d raid0
# mount /dev/sdb /mnt/btrfs
# btrfs balance start /mnt/btrfs
# btrfs fi df /mnt/btrfs
Data, single: total=1.00GiB, used=320.00KiB
System, single: total=32.00MiB, used=16.00KiB
Metadata, RAID0: total=256.00MiB, used=112.00KiB
GlobalReserve, single: total=16.00MiB, used=0.00B
Only metadata stays RAID0. Data and system go from RAID0 to single.
[REASON]
The problem is caused by the temporary single chunks.
In mkfs, we always create single data/metadata/sys chunks and then
add devices into the temporary btrfs.
When doing an all-chunk balance, the data and system chunks are almost
empty, so balance will move their contents into the single chunks and
remove the old RAID0 chunks.
The metadata chunk holds more data and will trigger metadata chunk
preallocation, so a new RAID0 chunk is allocated and the old metadata is
moved there. The old RAID0 and single chunks are removed.
[FIX]
Now we add a new function to clean up the temporary chunks at the end of
the mkfs routine.
It will clean up the chunks which are empty and whose profile differs from
the mkfs profile.
So in balance, btrfs will always allocate a new chunk to keep the profile,
rather than moving data into the single chunk.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This reverts commit 5f8232e5c8.
This commit causes a regression:
$ mkfs.btrfs -f /dev/sda6
$ btrfsck /dev/sda6
Checking filesystem on /dev/sda6
UUID: 2ebb483c-1986-4610-802a-c6f3e6ab4b76
checking extents
Chunk[256, 228, 0]: length(4194304), offset(0), type(2) mismatch with
block group[0, 192, 4194304]: offset(4194304), objectid(0), flags(34)
Chunk[256, 228, 4194304]: length(8388608), offset(4194304), type(4)
mismatch with block group[4194304, 192, 8388608]: offset(8388608),
objectid(4194304), flags(36)
Block group[0, 4194304] (flags = 34) didn't find the relative chunk.
Block group[4194304, 8388608] (flags = 36) didn't find the relative
chunk.
......
The commit has the following bugs causing the problem.
1) Due to a typo, meta/data_profile is not passed to alloc_chunk.
meta/data_profile is only used when allocating the block group, not the
chunk.
2) The type of the first system chunk cannot be modified yet.
The type for the first chunk and its stripe is hard coded into
the make_btrfs() function.
So even if we try to modify the type of the block group, we are unable to
change the type of the first chunk, which causes the chunk type mismatch
problem.
The first bug can be fixed quite easily but the second cannot.
The good news is, the last patch "btrfs-progs: mkfs: Cleanup temporary
chunk to avoid strange balance behavior." from my patchset can handle it
quite well alone.
So just revert the patch.
A new bug fix for btrfsck (err is 0 even if the chunk/extent tree is
corrupted) and new test cases for mkfs will follow soon.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The filesystem creation has to solve some chicken-egg problems and
creates some temporary objects. In our case it's an extra single/single
pair of block groups that's not used unless the user asks that
explicitly.
Example:
Data, single: total=8.00MiB, used=64.00KiB
System, DUP: total=8.00MiB, used=16.00KiB
System, single: total=4.00MiB, used=0.00B
Metadata, DUP: total=153.56MiB, used=112.00KiB
Metadata, single: total=8.00MiB, used=0.00B
GlobalReserve, single: total=16.00MiB, used=0.00B
Even with a single device filesystem and defaults, there's a single
block group for metadata and system. The single device case is easy to
fix, we'll simply create the right type from the beginning.
Example:
Data, single: total=8.00MiB, used=64.00KiB
System, DUP: total=4.00MiB, used=16.00KiB
Metadata, DUP: total=136.00MiB, used=112.00KiB
GlobalReserve, single: total=16.00MiB, used=0.00B
A filesystem on top of multiple devices still leaves the single/single
groups behind.
Signed-off-by: David Sterba <dsterba@suse.com>