check_node_or_leaf_size in utils.c now prints 'nodesize (or leafsize)'
instead of 'leafsize (or nodesize)' in the error messages, in order to
be less confusing for the user, as leafsize in mkfs is deprecated.
'ERROR: ' is also prepended to be consistent with other error messages.
Signed-off-by: Sebastian Thorarensen <sebth@naju.se>
Signed-off-by: David Sterba <dsterba@suse.cz>
Move the constant DEFAULT_MKFS_LEAF_SIZE to utils.h and rename it to
BTRFS_MKFS_DEFAULT_NODE_SIZE for consistency. Move the function
check_leaf_or_node_size to utils.c and rename it to
btrfs_check_node_or_leaf_size.
Signed-off-by: Sebastian Thorarensen <sebth@naju.se>
[added btrfs_ prefix]
Signed-off-by: David Sterba <dsterba@suse.cz>
glibc 2.10+ (5+ years old) enables all the desired features:
_XOPEN_SOURCE 700, __XOPEN2K8, POSIX_C_SOURCE, DEFAULT_SOURCE; with a
single _GNU_SOURCE define in the makefile alone. For portability to
other libc implementations (e.g. dietlibc) _XOPEN_SOURCE=700 is also
defined.
This also resolves Debian bug report filed by Michael Tautschnig -
"Inconsistent use of _XOPEN_SOURCE results in conflicting
declarations". Whilst I was not able to reproduce the results, the
reported fact is that _XOPEN_SOURCE set to 500 in one set of files
(e.g. cmds-filesystem.c) generates/defines different struct stat from
other files (cmds-replace.c).
This patch thus cleans up all feature defines, and sets them at a
consistent level.
Bug-Debian: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=747969
Signed-off-by: Dimitri John Ledkov <dimitri.j.ledkov@intel.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
The newly introduced search_chunk_tree_for_fs_info() won't count devid 0
in fi_arg->num_devices, which will cause buffer overflow since later
get_device_info() will fill di_args with devid.
This can be triggered by fstests/btrfs/069 and by any operation that needs
to iterate over all the devices, like 'fi show' or 'dev stat', while
replacing.
The fix is to do an extra probe specifically for devid 0 after
search_chunk_tree_for_fs_info() and change num_devices if needed.
Reported-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Commit 1bad43fbe002 ("btrfs-progs: refine btrfs-debug-tree error
prompt when a mount point given")
added a check to btrfs-debug-tree that restricts it to block devices,
but the command can also be used on a regular file, so add
regular file support to the check.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
The check_arg_type() function does a quite generic thing, so move it to
utils.c.
Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Reviewed-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
The @fi_args->num_devices in @get_fs_info() does not include seed devices.
We can correct it by searching the chunk tree and counting how
many dev_items there are in total, which includes seed devices.
Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
The following BUG_ON:
BUG_ON(ndevs >= fi_args->num_devices)
is not needed, because it always fails with seed devices present.
Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Current error messages look like the following:
Error: unable to create FS with metadata profile 32 (have 2 devices)
Error: unable to create FS with metadata profile 256 (have 2 devices)
Obviously it is hard for users to map "profile XX" to its proper
meaning, such as "raidN". So use a recognizable string instead of the
internal numerical value. In the case of "DUP", use an explicit message.
This patch also fixes a bug where the message mistakes the metadata
profile for the data profile.
After applying this patch, messages will be like:
Error: DUP is not allowed when FS have multiple devices
Error: unable to create FS with metadata profile RAID6 (have 2
devices but 3 devices are required)
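The mapping from the numeric profile to a readable name can be sketched
as below. This is only an illustration: the macro and function names are
made up for this example, and only the bit values (32 for DUP, 256 for
RAID6, matching the messages above) follow the btrfs block group flags.

```c
#include <stdint.h>

/* Block group profile bits; names are local to this sketch. */
#define SKETCH_BG_RAID0		(1ULL << 3)
#define SKETCH_BG_RAID1		(1ULL << 4)
#define SKETCH_BG_DUP		(1ULL << 5)
#define SKETCH_BG_RAID10	(1ULL << 6)
#define SKETCH_BG_RAID5		(1ULL << 7)
#define SKETCH_BG_RAID6		(1ULL << 8)

/* Translate a profile bit into the string shown to the user. */
static const char *sketch_profile_name(uint64_t flags)
{
	if (flags & SKETCH_BG_RAID0)
		return "RAID0";
	if (flags & SKETCH_BG_RAID1)
		return "RAID1";
	if (flags & SKETCH_BG_DUP)
		return "DUP";
	if (flags & SKETCH_BG_RAID10)
		return "RAID10";
	if (flags & SKETCH_BG_RAID5)
		return "RAID5";
	if (flags & SKETCH_BG_RAID6)
		return "RAID6";
	return "single";	/* no profile bit set */
}
```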
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Enhance the command "btrfs filesystem df" to show space usage information
for one or more mount points. It also shows an estimation of the space
available, on the basis of the current one used.
Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
[code moved under #if 0 instead of deletion]
Signed-off-by: David Sterba <dsterba@suse.cz>
Running make from a long base path will overflow the argv0 buffer during
tests. Otherwise, this would happen for all the standalone binaries that
use set_argv0.
Original report:
https://bbs.archlinux.org/viewtopic.php?id=189861
Reported-by: WorMzy Tykashi <wormzy.tykashi@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
btrfs_scan_lblkid() is called by most of the device related command
functions. It is the most expensive function, and it becomes more expensive
as the number of devices in the system increases. Further, some threads call
this function more than once for absolutely no extra benefit, which is a
real waste of resources. Below is the list of threads and the number of
times btrfs_scan_lblkid() is called in each thread.
  btrfs-find-root               1
  btrfs rescue super-recover    2
  btrfs-debug-tree              1
  btrfs-image -r                2
  btrfs check                   2
  btrfs restore                 2
  calc-size                     NC
  btrfs-corrupt-block           NC
  btrfs-image                   NC
  btrfs-map-logical             1
  btrfs-select-super            NC
  btrfstune                     2
  btrfs-zero-log                NC
  tester                        NC
  quick-test.c                  NC
  btrfs-convert                 0
  mkfs                          #number of devices to be mkfs
  btrfs label set unmounted     2
  btrfs get label unmounted     2
This patch will:
1) move the call to register_one_device out of btrfs_scan_lblkid(),
   so a function that sets BTRFS_UPDATE_KERNEL to yes will
   call btrfs_register_all_devices() separately.
2) introduce a global variable scan_done, which is set when a scan
   completes successfully in a thread, so that following calls to this
   function just return success.
Further, if any function needs to force a scan after scan_done is set,
that can be added when there is such a requirement, but as of now there
isn't any such requirement.
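A minimal sketch of the scan_done idea follows. Only the variable name
scan_done comes from the text above; the other names are hypothetical
and the expensive libblkid walk is replaced by a stub that counts calls.

```c
static int scan_done;		/* set once a scan has succeeded */
static int sketch_scan_calls;	/* counts real scans, illustration only */

/* Stand-in for the expensive device scan; always succeeds here. */
static int sketch_expensive_scan(void)
{
	sketch_scan_calls++;
	return 0;
}

/* Scan at most once; later calls just return success. */
static int sketch_scan_once(void)
{
	int ret;

	if (scan_done)
		return 0;

	ret = sketch_expensive_scan();
	if (ret == 0)
		scan_done = 1;
	return ret;
}
```

A failed scan leaves scan_done unset, so the next caller retries.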
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
This function registers all devices found after scanning
the system. Before, we had this functionality within
btrfs_scan_lblkid(); however, scanning and registering are two
distinct operations, so it's better to keep them separate.
Also we want to optimize btrfs_scan_lblkid and avoid multiple
system scans unless needed. As of now device scan uses this function.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
There is a compatibility issue with older kernels with the progs commit
below:
d0588bfa479409b2a0f6243f894338a01a56221a
btrfs-progs: do a separate probe for _transient_ replacing device
So as of now, revert the above commit.
The upcoming sysfs interface would help to fix the remaining issue, which is
that a seed device fails to show in the 'btrfs fi show' output of a sprout
device.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
cmd_scan_dev() has its own code to register a device (calling ioctl
BTRFS_IOC_SCAN_DEV); it could use btrfs_register_one_device() instead.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
This change adds code to detect and fix the issue introduced in the kernel
release 3.17, where creation of read-only snapshots led to a corrupted
filesystem if they were created at a moment when the source subvolume/snapshot
had orphan items. The issue was that the on-disk root items became incorrect,
referring to the pre orphan cleanup root node instead of the post orphan
cleanup root node.
A test filesystem can be generated with the test case recently submitted for
xfstests/fstests, which is essentially the following (bash script):
workout()
{
	ops=$1
	procs=$2
	num_snapshots=$3

	_scratch_mkfs >> $seqres.full 2>&1
	_scratch_mount

	snapshot_cmd="$BTRFS_UTIL_PROG subvolume snapshot -r $SCRATCH_MNT"
	snapshot_cmd="$snapshot_cmd $SCRATCH_MNT/snap_\`date +'%H_%M_%S_%N'\`"
	run_check $FSSTRESS_PROG -p $procs \
		-x "$snapshot_cmd" -X $num_snapshots -d $SCRATCH_MNT -n $ops
}

ops=10000
procs=4
snapshots=500
workout $ops $procs $snapshots
Example of btrfsck's (btrfs check) behaviour against such filesystem:
$ btrfsck /dev/loop0
root item for root 311, current bytenr 44630016, current gen 60, current level 1, new bytenr 44957696, new gen 61, new level 1
root item for root 1480, current bytenr 1003569152, current gen 1271, current level 1, new bytenr 1004175360, new gen 1272, new level 1
root item for root 1509, current bytenr 1037434880, current gen 1300, current level 1, new bytenr 1038467072, new gen 1301, new level 1
root item for root 1562, current bytenr 33636352, current gen 1354, current level 1, new bytenr 34455552, new gen 1355, new level 1
root item for root 3094, current bytenr 1011712000, current gen 2935, current level 1, new bytenr 1008484352, new gen 2936, new level 1
root item for root 3716, current bytenr 80805888, current gen 3578, current level 1, new bytenr 73515008, new gen 3579, new level 1
root item for root 4085, current bytenr 714031104, current gen 3958, current level 1, new bytenr 716816384, new gen 3959, new level 1
Found 7 roots with an outdated root item.
Please run a filesystem check with the option --repair to fix them.
$ echo $?
1
$ btrfsck --repair /dev/loop0
enabling repair mode
fixing root item for root 311, current bytenr 44630016, current gen 60, current level 1, new bytenr 44957696, new gen 61, new level 1
fixing root item for root 1480, current bytenr 1003569152, current gen 1271, current level 1, new bytenr 1004175360, new gen 1272, new level 1
fixing root item for root 1509, current bytenr 1037434880, current gen 1300, current level 1, new bytenr 1038467072, new gen 1301, new level 1
fixing root item for root 1562, current bytenr 33636352, current gen 1354, current level 1, new bytenr 34455552, new gen 1355, new level 1
fixing root item for root 3094, current bytenr 1011712000, current gen 2935, current level 1, new bytenr 1008484352, new gen 2936, new level 1
fixing root item for root 3716, current bytenr 80805888, current gen 3578, current level 1, new bytenr 73515008, new gen 3579, new level 1
fixing root item for root 4085, current bytenr 714031104, current gen 3958, current level 1, new bytenr 716816384, new gen 3959, new level 1
Fixed 7 roots.
Checking filesystem on /dev/loop0
UUID: 2186e9b9-c977-4a35-9c7b-69c6609d4620
checking extents
checking free space cache
cache and super generation don't match, space cache will be invalidated
checking fs roots
checking csums
checking root refs
found 618537000 bytes used err is 0
total csum bytes: 130824
total tree bytes: 601620480
total fs tree bytes: 580288512
total extent tree bytes: 18464768
btree space waste bytes: 136939144
file data blocks allocated: 34150318080
referenced 27815415808
Btrfs v3.17-rc3-2-gbbe1dd8
$ echo $?
0
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Coverity warned that the return code from sscanf() assigned to 'i'
wasn't checked before being assigned again. Check it.
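The pattern is roughly the following. The input format and the function
name are invented for this sketch and are not the code Coverity flagged.

```c
#include <stdio.h>

/*
 * Hypothetical parser for a "major:minor" string. sscanf() returns the
 * number of conversions it completed, so anything other than 2 means
 * the outputs must not be trusted.
 */
static int sketch_parse_pair(const char *s, int *major, int *minor)
{
	int i;

	i = sscanf(s, "%d:%d", major, minor);
	if (i != 2)
		return -1;
	return 0;
}
```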
Signed-off-by: Zach Brown <zab@zabbo.net>
Signed-off-by: David Sterba <dsterba@suse.cz>
We are passing the device path to be registered within the kernel,
so we need to open it with RW.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
The size unit format is a longstanding annoyance. This patch is based on
the work of Nils and Alexandre and enhances the options. It's possible
to select raw bytes, SI-based or IEC-based compact units (human
friendly) or a fixed base from kilobytes to terabytes. The default is
compact human readable IEC-based, no change to current version.
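A rough sketch of the three compact modes (raw, IEC, SI) is shown below.
The enum, the function name and the output format are assumptions made
for this example; the fixed-base modes are left out.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

enum sketch_unit_mode { SKETCH_RAW, SKETCH_IEC, SKETCH_SI };

/* Format size into buf according to mode; returns snprintf()'s result. */
static int sketch_pretty_size(char *buf, size_t len, uint64_t size,
			      enum sketch_unit_mode mode)
{
	static const char * const iec[] =
		{ "B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB" };
	static const char * const si[] =
		{ "B", "kB", "MB", "GB", "TB", "PB", "EB" };
	const char * const *suffix = (mode == SKETCH_SI) ? si : iec;
	double base = (mode == SKETCH_SI) ? 1000.0 : 1024.0;
	double val = (double)size;
	int i = 0;

	if (mode == SKETCH_RAW)
		return snprintf(buf, len, "%" PRIu64, size);

	/* Divide down until the value fits the largest unit below it. */
	while (val >= base && i < 6) {
		val /= base;
		i++;
	}
	return snprintf(buf, len, "%.2f%s", val, suffix[i]);
}
```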
CC: Nils Steinger <nst@voidptr.de>
CC: Alexandre Oliva <oliva@gnu.org>
Reviewed-by: Hugo Mills <hugo@carfax.org.uk>
Signed-off-by: David Sterba <dsterba@suse.cz>
'const int const *x' means the same thing as 'const int *x' or
'int const *x'; the intent was probably 'const int * const x'.
However, this won't work for the 'suffix' variable, as it has
to be assigned, and making the static tables into const pointers
to const chars leads to a mismatch there.
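The difference between the two declarations can be seen in a small
standalone example (not taken from utils.c; the function is made up):

```c
/*
 * 'a' is a pointer to const int: the pointer may be changed, the int may
 * not. 'b' is a const pointer to const int: neither may be changed.
 */
static int sketch_sum_two(const int *a, const int * const b)
{
	const int *cur = a;	/* assignable, like the 'suffix' variable */
	int total = *cur;

	cur = b;		/* fine: cur itself is not const */
	total += *cur;
	/* "b = a;" would be rejected by the compiler: b is a const pointer */
	return total;
}
```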
This was found with clang's duplicate-decl-specifier warning.
Signed-off-by: Adam Buchbinder <abuchbinder@google.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
The functionality of pretty unit printing was duplicated by
df_pretty_sizes; merge it with pretty_size and enhance the interface
with more suffix modes: raw, binary or decimal.
Signed-off-by: David Sterba <dsterba@suse.cz>
As mentioned in the kernel patch
btrfs: ioctl BTRFS_IOC_FS_INFO and
BTRFS_IOC_DEV_INFO miss-matched with slots
The count as returned by BTRFS_IOC_FS_INFO is the number of slots that
btrfs-progs would allocate for the BTRFS_IOC_DEV_INFO ioctl. Since
BTRFS_IOC_DEV_INFO would loop across the seed devices, it's better that
ioctl BTRFS_IOC_FS_INFO returns total_devices instead of num_devices.
The above mentioned patch does just that. That is, it returns
total_devices instead of num_devices.
This means we need to probe for the replacing device separately,
which is what this patch does.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
With the changes as in the previous patch, now scan_for_btrfs()
is an unused function. So delete it.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
The libblkid scan method which was introduced later, will also
scan devices under /proc/partitions. So we don't have to do
the explicit scan of the same.
Remove the scan method BTRFS_SCAN_PROC.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
(I am unable to reproduce the issue; I tried going back through progs
versions but the result is still the same. So as of now this code remains
untested. I suggest waiting until we have a reproducible test case.)
Here is a test case which says it all:
mkfs.xfs -f $DEV
mkfs.btrfs -f $DEV
mount $DEV $MNT
mount: /dev/vdiskc: more filesystems detected. This should not happen,
use -t <type> to explicitly specify the filesystem type or
use wipefs(8) to clean up the device.
mount: you must specify the filesystem type
With this patch btrfs_prepare_device() also wipes the old FS, if any.
btrfs_prepare_device() is called after we have verified that the
user has provided the -f option.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
After the previous 2 patches, nothing uses
whole-dev-tree scanning, so remove the code which
implemented that functionality.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
If we didn't find what we are looking for in /proc/partitions,
we're not going to find it by scanning every node under /dev, either.
But that's just what btrfs_scan_for_fsid() does.
Remove that fallback; at that point btrfs_scan_for_fsid() just calls
scan_for_btrfs(), so remove the wrapper & call it directly.
Side note: so, these paths always use /proc/partitions, not libblkid.
Userspace-initiated scans default to libblkid. I presume this is
part of the design, and intentional? Anyway, not changing it now!
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
The original find_mount_root() uses the first mount point match and
returns it.
This was OK until the following commit, which also checks the fstype:
de22c28ef31d9721606ba059 btrfs-progs: Check fstype in find_mount_root()
With the fstype check, we should check the last match, not only the first
one.
Otherwise the following mounts will not pass find_mount_root():
/dev/sdc on /mnt/test type ext4 (rw,relatime,data=ordered)
/dev/sdb on /mnt/test type btrfs (rw,relatime,space_cache)
This patch will use the last match to do the fstype check.
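The selection rule can be sketched over an in-memory list of mount
entries. The struct and function names are hypothetical, and the plain
prefix comparison is a simplification; the real function walks the mount
table instead.

```c
#include <string.h>

struct sketch_mnt {
	const char *dir;	/* mount point */
	const char *fstype;	/* filesystem type */
};

/*
 * Return the index of the longest mount point that prefixes path. On a
 * tie the later entry wins, because it is mounted on top of the earlier
 * one. Returns -1 if nothing matches.
 */
static int sketch_find_mount(const struct sketch_mnt *ents, int n,
			     const char *path)
{
	size_t best_len = 0;
	int best = -1;
	int i;

	for (i = 0; i < n; i++) {
		size_t len = strlen(ents[i].dir);

		if (strncmp(ents[i].dir, path, len) != 0)
			continue;
		if (len >= best_len) {	/* ">=" keeps the last match */
			best_len = len;
			best = i;
		}
	}
	return best;
}

/* The fstype check is done on the selected (last) match only. */
static int sketch_is_btrfs_mount(const struct sketch_mnt *ents, int n,
				 const char *path)
{
	int idx = sketch_find_mount(ents, n, path);

	return idx >= 0 && strcmp(ents[idx].fstype, "btrfs") == 0;
}
```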
Reported-by: Remco Hosman <remco@yerf-it.nl>
Signed-off-by: Remco Hosman <remco@yerf-it.nl>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Reviewed-by: Omar Sandoval <osandov@osandov.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Fix (at least one user-visible) typos: it's its, not it's.
Signed-off-by: Holger Hoffstätte <holger.hoffstaette@googlemail.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Since test_isdir() is a utility function, it's better to
move it to utils.c. In addition, "const char *" is a
more appropriate type for its "path" argument because
this argument is not changed in this function.
Signed-off-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
Cc: David Sterba <dsterba@suse.cz>
Cc: Mike Fleetwood <mike.fleetwood@googlemail.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
There is a lot of duplicated code that checks whether a given string is
a valid subvolume name. Introduce test_issubvolname() for this
purpose, for simplicity.
Signed-off-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
Cc: David Sterba <dsterba@suse.cz>
Cc: Mike Fleetwood <mike.fleetwood@googlemail.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
The function @is_existing_blk_or_reg_file can return -errno,
which indicates that the @stat call failed with a non-ENOENT error.
In this condition, we should not continue with the following work.
But -errno evaluates to true and lets the following work proceed.
So we should check precisely whether the return value of
@is_existing_blk_or_reg_file is > 0 or not to decide our behavior.
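A sketch of the problem and the fix follows. The helper below is a
stand-in written for this example; it mirrors the described return
convention but is not the code from utils.c.

```c
#include <errno.h>
#include <sys/stat.h>

/*
 * Stand-in with the convention described above:
 *   1       path exists and is a block device or regular file
 *   0       path does not exist, or is some other file type
 *   -errno  stat() failed with an error other than ENOENT
 */
static int sketch_is_existing_blk_or_reg_file(const char *path)
{
	struct stat st;

	if (stat(path, &st) < 0) {
		if (errno == ENOENT)
			return 0;
		return -errno;
	}
	return S_ISBLK(st.st_mode) || S_ISREG(st.st_mode);
}

/* Hypothetical caller: only a positive value means "go ahead". */
static int sketch_may_proceed(const char *path)
{
	int ret = sketch_is_existing_blk_or_reg_file(path);

	/* "if (ret)" would wrongly treat -errno as success */
	return ret > 0;
}
```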
Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Fix the following build warnings on 32bit platforms:
...
utils.c:1708:3: warning: left shift count >= width of
type [enabled by default]
if (x << i & (1UL << 63))
^
qgroup-verify.c:393:9: warning: cast to pointer from integer
of different size [-Wint-to-pointer-cast]
return (struct tree_block *)unode->aux;
^
qgroup-verify.c:407:38: warning: cast from pointer to integer
of different size [-Wpointer-to-int-cast]
if (ulist_add(tree_blocks, bytenr, (unsigned long long)block, 0) >= 0)
^
cmds-restore.c:120:4: warning: format %lu expects argument of type
long unsigned int, but argument 3 has type size_t [-Wformat=]
fprintf(stderr, "bad compress length %lu\n", in_len);
...
BTW, this patch also replaces other casts with the new helpers.
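The three classes of warning above can be avoided along these lines.
The helper names are assumptions for this sketch, not necessarily the
ones the patch adds.

```c
#include <stdint.h>
#include <stdio.h>

/* 1ULL is 64 bits wide even where unsigned long is only 32 bits. */
static int sketch_top_bit_set(uint64_t x)
{
	return (x & (1ULL << 63)) != 0;
}

/* Going through uintptr_t avoids the size-mismatch cast warnings. */
static uint64_t sketch_ptr_to_u64(const void *ptr)
{
	return (uint64_t)(uintptr_t)ptr;
}

static void *sketch_u64_to_ptr(uint64_t x)
{
	return (void *)(uintptr_t)x;
}

/* %zu is the portable conversion for size_t. */
static int sketch_format_len(char *buf, size_t len, size_t in_len)
{
	return snprintf(buf, len, "bad compress length %zu", in_len);
}
```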
Signed-off-by: Wang Shilong <wangshilong1991@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
When calling find_mount_root(), the caller in fact wants to find the mount
point of *BTRFS*.
So also check ent->fstype in find_mount_root() and output a special error
string in the caller.
This will suppress a lot of "Inappropriate ioctl for device" error
messages.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
The find_mount_root() function in utils.c should not print an error string.
The caller should be responsible for printing it.
This patch removes the only fprintf in find_mount_root() and modifies
the caller a little to use strerror() to inform users.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
mkfs can try to write outside of small devices. The zeroing code
doesn't test the device size and runs before mkfs tests for small
devices and exits.
Testers experienced this as small regular files being extended as mkfs
failed:
$ truncate -s 1m /tmp/some-file
$ strace -epwrite ./mkfs.btrfs /tmp/some-file
SMALL VOLUME: forcing mixed metadata/data groups
WARNING! - Btrfs v3.14.2 IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using
pwrite(3, ..., 2097152, 0) = 2097152
pwrite(3, ..., 4096, 65536) = 4096
pwrite(3 ..., 2097152, 18446744073708503040) = -1 EINVAL (Invalid argument)
ERROR: failed to zero device '/tmp/some-file' - Input/output error
$ ls -lh /tmp/some-file
-rw-rw-r--. 1 zab zab 2.0M Jul 16 13:49 /tmp/some-file
This simple fix adds a helper that clamps a region to be zeroed to the
size of the device. It doesn't address the larger questions of whether
to modify the device before the size test or whether to zero regions
that have been trimmed.
Finally, the error handling mess after the zeroing calls is cleaned up.
zero_blocks() and its callers only return -errno.
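The clamping step might look like the sketch below. The function name
and its exact contract are assumptions for illustration.

```c
#include <stdint.h>

/*
 * Trim the region [start, start + len) so it does not extend past a
 * device of dev_size bytes. Returns the number of bytes that may be
 * zeroed, which is 0 when the region starts at or beyond the end.
 */
static uint64_t sketch_clamp_len(uint64_t start, uint64_t len,
				 uint64_t dev_size)
{
	if (start >= dev_size)
		return 0;
	if (len > dev_size - start)
		len = dev_size - start;
	return len;
}
```

With the 1M file from the example above, the 2M write at offset 0 would
be cut down to 1M and the write at the wrapped-around offset would be
skipped entirely.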
Signed-off-by: Zach Brown <zab@zabbo.net>
Signed-off-by: David Sterba <dsterba@suse.cz>
mkfs uses a cutoff size of '1024 * 1024 * 1024' to mark a dev as a small
volume and so force mixed groups. Use a define for that.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
The btrfs-progs superblock checksum check is somewhat too restrictive for
super-recover, since current btrfs-progs will only read the 1st
superblock, and if you need super-recover the 1st superblock is
possibly already damaged.
The fix is to introduce a super_recover parameter for
btrfs_read_dev_super() and its callers to allow scanning backup
superblocks if needed.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
To let the independent tools (e.g. btrfs-image, btrfs-convert, etc.)
share the convenience of the check_argc_* functions, move them into
utils.c.
Also add a new function "set_argv0" to set the correct tool name:
*btrfs-image*: too few arguments
The original btrfs* tools work as before.
Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
[moved argv0 and check_argc to utils.*]
Signed-off-by: David Sterba <dsterba@suse.cz>
Btrfs has a global block reservation, so even if mkfs.btrfs executes
without problems, there is still a possibility that the filesystem can't
be mounted.
For example, after mkfs.btrfs on an 8M file on an x86_64 platform, the
kernel will refuse to mount due to ENOSPC, since the system block group
takes 4M and the mixed block group takes 4M, and the global block
reservation takes all the 4M from the mixed block group, which makes btrfs
unable to create the uuid tree.
This patch adds a minimum device size check before the actual mkfs.
The minimum size calculation uses a simplified formula:
minimum_size_for_each_dev = 2 * (system block group + global block rsv)
and global block rsv = leafsize << 10
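As a worked example of the formula, assuming the 4M system block group
from the case above (the macro and function names are made up here):

```c
#include <stdint.h>

/* 4M system block group, as in the 8M example above. */
#define SKETCH_SYSTEM_BG_SIZE	(4ULL * 1024 * 1024)

/* minimum size = 2 * (system block group + (leafsize << 10)) */
static uint64_t sketch_min_dev_size(uint32_t leafsize)
{
	uint64_t global_rsv = (uint64_t)leafsize << 10;

	return 2 * (SKETCH_SYSTEM_BG_SIZE + global_rsv);
}
```

With a 4096 byte leafsize the reservation is 4M and the minimum comes
out at 16M, so the 8M file would be rejected up front.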
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
When parse_size() is passed a non-numeric value, it only
gives the error message "ERROR: size value is empty", which is quite
confusing for end users.
This patch introduces more meaningful error messages for the
following new cases:
1) Invalid size string (non-numeric string)
2) Negative size value (like "-1K")
This patch also makes full use of the endptr returned by strtoll() to
avoid an unneeded loop.
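A sketch of how endptr separates these cases is given below. It returns
error codes rather than printing messages, and the function name and the
accepted suffixes are assumptions for this example.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns 0 and stores the size, or a negative errno value on failure. */
static int sketch_parse_size(const char *s, uint64_t *out)
{
	char *endptr;
	long long val;
	int shift = 0;

	if (!s || !*s)
		return -EINVAL;		/* empty value */

	errno = 0;
	val = strtoll(s, &endptr, 10);
	if (endptr == s)
		return -EINVAL;		/* no digits: non-numeric string */
	if (errno == ERANGE)
		return -ERANGE;
	if (val < 0)
		return -EINVAL;		/* negative size, like "-1K" */

	/* endptr already points at the suffix, no need to scan for it */
	switch (tolower((unsigned char)*endptr)) {
	case 'k': shift = 10; break;
	case 'm': shift = 20; break;
	case 'g': shift = 30; break;
	case 't': shift = 40; break;
	case '\0': break;
	default:
		return -EINVAL;		/* unknown suffix */
	}
	if (*endptr && endptr[1] != '\0')
		return -EINVAL;		/* trailing characters */
	if ((uint64_t)val > (UINT64_MAX >> shift))
		return -ERANGE;

	*out = (uint64_t)val << shift;
	return 0;
}
```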
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
mount(8) will canonicalize pathnames before passing them to the kernel.
Links to e.g. /dev/sda will be resolved to /dev/sda. Links to /dev/dm-#
will be resolved using the name of the device mapper table to
/dev/mapper/<name>.
Btrfs will use whatever name the user passes to it, regardless of whether
it is canonical or not. That means that if a 'btrfs device ready' is
issued on any device node pointing to the original device, it will adopt
the new name instead of the name that was used during mount.
Mounting using /dev/sdb2 will result in df:
/dev/sdb2 209715200 39328 207577088 1% /mnt
lrwxrwxrwx 1 root root 4 Jun 4 13:36 /dev/whatever-i-like -> sdb2
/dev/whatever-i-like 209715200 39328 207577088 1% /mnt
Likewise, mounting with /dev/mapper/whatever and using /dev/dm-0 with a
btrfs device command results in df showing /dev/dm-0. This can happen with
multipath devices with friendly names enabled and doing something like
'partprobe' which (at least with our version) ends up issuing a 'change'
uevent on the sysfs node. That *always* uses the dm-# name, and we get
confused users.
This patch does the same canonicalization of the paths that mount does
so that we don't end up having inconsistent names reported by ->show_devices
later.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
[use PATH_MAX in canonicalize_dm_name]
Signed-off-by: David Sterba <dsterba@suse.cz>
Allow the specification of the filesystem UUID at mkfs time.
Non-unique unique IDs are rejected. This includes attempting
to re-mkfs with the same UUID; if you really want to do that,
you can mkfs with a new UUID, then re-mkfs with the one you
wanted.
(Implemented only for mkfs.btrfs, not btrfs-convert).
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
[converted help to asciidoc]
Signed-off-by: David Sterba <dsterba@suse.cz>
Linking with libbtrfs fails because arg_strtou64 is not defined and we
cannot just add utils.o to library objects because it's not
library-clean.
Reported-by: Arvin Schnell <aschnell@suse.com>
Reported-by: Anton Farygin <rider@altlinux.org>
Signed-off-by: David Sterba <dsterba@suse.cz>
In utils.c, zero_end is used as a parameter, should not force it to 1.
In mkfs.c, zero_end is set to 1 or 0(-b) at the beginning, should not
force it to 1 unconditionally.
Signed-off-by: Li Yang <liyang.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Because the function open_file_or_dir() always opened the input file in
read/write mode (O_RDWR), we were not able to do a compression property
get against a file living in a read-only subvolume/snapshot.
Fix this by opening the file with O_RDONLY mode if we're doing a property
get.
Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
It was added in 25d82d22 but broke recently in 4724d7b0 while making
discard interruptible.
Signed-off-by: Rakesh Pandit <rakesh@tuxera.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
The ioctl for the whole range is not interruptible, which can be
annoying when the discard is not wanted but user forgets to use the -K
option.
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
In btrfs_scan_lblkid(), blkid_get_cache() is called but the cache is not
freed. This patch adds blkid_put_cache() to free it.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
get_fs_info() provides the info of the specific
device/devid. However, when we delete the missing disk,
the super-block on the disk isn't cleared, and since
btrfs-progs makes its decision by reading the on-disk super
block, it doesn't know about the kernel's previous action.
So when we then try to probe the kernel for the devid, it fails.
reproducer:
$ mkfs.btrfs -d raid1 -m raid1 /dev/sde /dev/sdf
$ modprobe -r btrfs && modprobe btrfs
$ mount -o degraded /dev/sde /btrfs
$ btrfs dev add /dev/sdd /btrfs
$ btrfs dev del missing /btrfs
$ btrfs scrub start -B /dev/sdf
btrfs: utils.c:1741: get_fs_info: Assertion `!(ndevs == 0)' failed.
Aborted (core dumped)
Signed-off-by: Anand Jain <Anand.Jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
btrfs-progs picks the latest_dev based on the first probed
greatest trans-id. However, the test case below proves that this
approach is wrong.
$ mkfs.btrfs -d raid1 -m raid1 /dev/sde /dev/sdf
$ modprobe -r btrfs && modprobe btrfs
$ mount -o degraded /dev/sde /btrfs
$ touch /btrfs/testfile && btrfs fi sync /btrfs
The above steps will make /dev/sdf no longer part of the btrfs.
As shown below, when you use /dev/sdf, btrfs dev stat
and dev scrub pick up the wrong disk.
$ btrfs dev stat /dev/sdf
[/dev/sde].write_io_errs 0
[/dev/sde].read_io_errs 0
[/dev/sde].flush_io_errs 0
[/dev/sde].corruption_errs 0
[/dev/sde].generation_errs 0
$ btrfs scrub start -B /dev/sdf
scrub done for 2e99c881-6abd-4f8a-8290-e2f8d0acc575
scrub started at Mon Feb 24 14:45:06 2014 and finished after 0 seconds
total bytes scrubbed: 256.00KiB with 0 errors
Signed-off-by: Anand Jain <Anand.Jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
as of now, when we replace a disk, it is added to the
dev list with devid 0. And we fail to obtain details
of devid 0 because we don't query devid 0 at all.
reproducer:
btrfs rep start /dev/sdb /dev/sdf /btrfs
btrfs fi show
Label: none uuid: f8fb9819-16c8-47b7-b62f-0ff90f8c56cd
Total devices 3 FS bytes used 1.94GiB
devid 1 size 1.10GiB used 1.10GiB path /dev/sdb
devid 2 size 1.10GiB used 1.08GiB path /dev/sdc
devid 0 size 0.00 used 0.00 path
this patch will make it proper by querying devid 0.
btrfs repl start /dev/sdb /dev/sdf /btrfs
btrfs fi show /btrfs
Label: none uuid: f8fb9819-16c8-47b7-b62f-0ff90f8c56cd
Total devices 3 FS bytes used 1.94GiB
devid 0 size 1.10GiB used 1.10GiB path /dev/sdf
devid 1 size 1.10GiB used 1.10GiB path /dev/sdb
devid 2 size 1.10GiB used 1.08GiB path /dev/sdc
It's fine to query devid 0 when there is no replace
activity as well, because we just skip the error ENODEV.
btrfs fi show /btrfs
Label: none uuid: f8fb9819-16c8-47b7-b62f-0ff90f8c56cd
Total devices 2 FS bytes used 1.94GiB
devid 1 size 1.10GiB used 1.10GiB path /dev/sdf
devid 2 size 1.10GiB used 1.08GiB path /dev/sdc
Signed-off-by: Anand Jain <Anand.Jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
Allow the use of get_device_info() for different units.
Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
When btrfsck is executed as a non-root user on a disk, it always
warns "No such file or directory", even though the directory
(e.g. /dev/vboxusb) actually exists. We just have no permission.
In this case, return the -errno set by the opendir call in
btrfs_scan_one_dir rather than blindly returning -ENOENT.
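In sketch form (the function below is an invented stand-in that only
counts directory entries, not the real btrfs_scan_one_dir):

```c
#include <dirent.h>
#include <errno.h>

/* Returns the number of entries in dirname, or -errno from opendir(). */
static int sketch_scan_one_dir(const char *dirname)
{
	DIR *dirp;
	int count = 0;

	dirp = opendir(dirname);
	if (!dirp)
		return -errno;	/* e.g. -EACCES, not a blanket -ENOENT */

	while (readdir(dirp))
		count++;
	closedir(dirp);
	return count;
}
```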
Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
There are many places in btrfs commands that need to parse a string to u64.
In fact, we do this *too casually*: using atoi/atol/atoll is not
right at all, and we don't even check whether the string is valid.
Let's do everything more gracefully: introduce a new helper
arg_strtou64() which does all the necessary checks. If we fail to
parse the string to u64, we output a message and exit directly, which is
similar to what usage() does. It is OK not to return an error to
its caller, because this function should be called when parsing args
(just like usage!)
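A sketch of such a helper is below. It is split in two so the checks can
be exercised without exiting; the names, the error text and the exact
set of checks are assumptions, not the code of arg_strtou64() itself.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Strict decimal parse: 0 on success, negative errno value otherwise. */
static int sketch_try_strtou64(const char *str, uint64_t *out)
{
	char *endptr;
	unsigned long long val;

	/* strtoull() accepts a leading '-' and negates, so reject it */
	if (str[0] == '-')
		return -EINVAL;

	errno = 0;
	val = strtoull(str, &endptr, 10);
	if (endptr == str || *endptr != '\0')
		return -EINVAL;		/* empty, or trailing garbage */
	if (errno == ERANGE)
		return -ERANGE;		/* does not fit */

	*out = val;
	return 0;
}

/* Argument-parsing wrapper: prints a message and exits on bad input. */
static uint64_t sketch_arg_strtou64(const char *str)
{
	uint64_t val;

	if (sketch_try_strtou64(str, &val)) {
		fprintf(stderr, "ERROR: %s is not a valid numeric value.\n",
			str);
		exit(1);
	}
	return val;
}
```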
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
Move find_mount_root to utils.[ch] for general use.
Signed-off-by: Qu Wenruo <quwenruo@cn.fuijitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
add_seen_fsid(), which was introduced recently, eliminates
the mounted disks, so we don't need test_skip_this_disk()
anymore.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
This patch reports errors via strerror() instead of
printing errno, and also replaces the BUG_ON with error handling.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
Previously, open_file_or_dir() would open a block device successfully;
however, we should tighten the checks to make sure we are really opening a
file or dir.
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
Internally, btrfs_header_chunk_tree_uuid() calculates an unsigned
long, but casts it to a pointer, while all callers cast it to unsigned
long again.
From btrfs commit b308bc2f05a86e728bd035e21a4974acd05f4d1e
Signed-off-by: Ross Kirk <ross.kirk@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
get_label prints the label at the moment. Change this so that
the label is returned and printing is done by the caller.
Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
When creating a fs on a loop device, mkfs checks whether the same file
is already mounted. But if a backing file of another loop dev does not
exist, mkfs fails. This fixes a bug seen during openSUSE installation.
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
We intentionally fall through these case statements;
just annotate it to be clear.
Resolves-Coverity-CID: 1054887
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Even if it's "definitely" btrfs at this point,
btrfs_scan_one_device could fail for other reasons.
Check the return value, warn if it fails, and skip
the device register.
Resolves-Coverity-CID: 1125925
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
open can fail, of course.
Resolves-Coverity-CID: 1125925
Resolves-Coverity-CID: 1125930
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
If any pwrite failed we leaked the allocated "buf" on
return from the function. "goto out" takes care of
those paths.
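
The single-exit pattern referred to here, sketched on a made-up function
(write_blocks() is hypothetical and is not the function touched by the patch):

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Write 'count' zeroed blocks; every exit path frees buf via "out". */
static int write_blocks(int fd, size_t blocksize, int count)
{
	char *buf;
	int ret = 0;
	int i;

	buf = calloc(1, blocksize);
	if (!buf)
		return -ENOMEM;

	for (i = 0; i < count; i++) {
		ssize_t written = pwrite(fd, buf, blocksize,
					 (off_t)i * (off_t)blocksize);

		if (written < 0) {
			ret = -errno;
			goto out;	/* an early "return" here would leak buf */
		}
		if ((size_t)written != blocksize) {
			ret = -EIO;
			goto out;
		}
	}
out:
	free(buf);
	return ret;
}
```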
Resolves-Coverity-CID: 1125938
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Close fd before we return on error paths.
Resolves-Coverity-CID: 1125939
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Use strncpy(... ,PATH_MAX) to be sure we don't overflow
the path[PATH_MAX] array.
Resolves-Coverity-CID: 1125941
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
So I needed to add a flag to not try to read block groups when doing
--init-extent-tree since we could hang there, but that meant adding a whole
other 0/1 type flag to open_ctree_fs_info. So instead I've converted it all
over to using a flags setting and added the flag that I needed. This has been
tested with xfstests and make test. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
This got changed to a double but all the callers still use a u64, which causes
us to segfault sometimes because of some weird C voodoo that I had to have
explained to me. Apparently because we're using a double the compiler will use
the floating point registers to hold our argument which ends up not being
aligned properly if you don't actually give it a double so it will cause
problems for other things, in our case it was screwing up str_bytes so it was
larger than the actual size of the str. This patch fixes the segfault. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Originally, the thinking was that the user would use the mount point
if the disk is mounted. But that's not really true: the user doesn't
(or shouldn't) care whether the disk is mounted, so specifying a
disk path should work whether the disk is mounted or unmounted.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
get_btrfs_mount is a reusable function, but it prints
errors; this patch removes that. Here the parent function of
open_path_or_dev_mnt already prints an error message on error.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Sometimes we need the length returned by snprintf() in pretty_size_snprintf().
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Internally, btrfs_header_fsid() calculates an unsigned long, but casts
it to a pointer, while all callers cast it to unsigned long again.
Committed to btrfs as fba6aa75654394fccf2530041e9451414c28084f
Fix line length issues and match changes to kernelspace
Signed-off-by: Ross Kirk <ross.kirk@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
The message about trim was printed unconditionally, we should check if
trim is supported at all.
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Find the tree id of the containing subvolume for a given file or
directory. For a subvolume, return its own id.
$ btrfs inspect-internal rootid <path>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Remove unused parameter, 'eb'. Unused since introduction in
7777e63b425f1444d2472ea05a6b2b9cf865f35b
Signed-off-by: Ross Kirk <ross.kirk@gmail.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Remove unused eb parameter from btrfs_item_nr, unused since introduced
in 7777e63b425f1444d2472ea05a6b2b9cf865f35b
Signed-off-by: Ross Kirk <ross.kirk@gmail.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
We don't need to run ioctls when checking whether btrfs
has mounted somewhere.
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Originally, the local pending_list was not guaranteed to be freed on
failure; it should be emptied and its elements freed.
Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
This fixes the regression introduced by 830427d:
mkfs no longer creates the FS if the disk is small
and no mixed option is provided.
This patch restores the original design,
which forces the mixed profile when the disk is small
and goes ahead to create the FS.
This also means that before we open the device
for writing we should check whether the disk is small.
v2: fixes the checkpatch.pl warnings
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
This patch fixes the following bug:
when mkfs.btrfs fails, the disks shouldn't be written.
------------
btrfs fi show /dev/sdb
Label: none uuid: 60fb76f4-3b4d-4632-a7da-6a44dea5573d
Total devices 1 FS bytes used 24.00KiB
devid 1 size 2.00GiB used 20.00MiB path /dev/sdb
mkfs.btrfs -dsingle -mraid1 /dev/sdb -f
::
unable to create FS with metadata profile 16 (have 1 devices)
btrfs fi show /dev/sdb
Label: none uuid: 2da2179d-ecb1-4a4e-a44d-e7613a08c18d
Total devices 1 FS bytes used 24.00KiB
devid 1 size 2.00GiB used 20.00MiB path /dev/sdb
-------------
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Before this change, passing -O skinny-metadata to mkfs.btrfs would
only set the skinny metadata incompat flag in the super block after
the filesystem was created. This change makes mkfs.btrfs directly
create a filesystem with only skinny extents for metadata.
Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
__CHECKER__ is only for the type juggling used to tell sparse which
types need conversion between address spaces. It is not OK to use to
change the code that gets checked to avoid bugs elsewhere in the build
infrastructure. We want to check the code that builds when the checker
isn't enabled.
Signed-off-by: Zach Brown <zab@redhat.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Mark many functions as static, and remove any resulting dead code.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Commit 55061a98 adds a cut & paste error that makes mkfs.btrfs fail
if leafsize != sectorsize.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Reviewed-by: Filipe Manana <fdmanana@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
test_dev_for_mkfs() is a common place where
we check if a device is fit for the btrfs use.
cmd_start_replace() should make use of test_dev_for_mkfs(),
and here the test_dev_for_mkfs() is further enhanced
to fit the cmd_start_replace() needs.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Port of commit b3b4aa7 to userspace.
The tree root parameter has not been used since commit
5f39d397dfbe140a14edecd4e73c34ce23c4f9ee ("Btrfs: Create extent_buffer
interface for large blocksizes")
This gets userspace a tad closer to kernelspace by removing
this unused parameter that was all over the codebase...
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
When we scan /proc/partitions, the cdrom is scanned
as well, and we don't have to report ENOMEDIUM errors
against it.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
This would help to reuse the function
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
The dev scan to find btrfs is performed at two locations
in almost the same way: once at filesystem show and once
at device scan. They both follow the same steps. This
patch does not alter anything except that it moves this
shared logic into the function scan_for_btrfs so that
we can tweak it in one place.
The patch which recommends using /dev/mapper
will also need it.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
btrfs_scan_for_fsid uses only one of its 3 arguments, run_ioctl,
so remove the other two.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
valgrind complains that open_file_or_dir() causes a memory leak. That is
because if we open a directory with opendir(), we should call closedir()
to free the memory.
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
As implemented now, we use 1024-based units but report them as 1000-based.
Let's finally fix that, and add optional unit bases later.
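
A sketch of 1024-based formatting with matching binary suffixes. The function
name pretty_size_sketch(), the suffix table and the two-decimal format are
assumptions; like snprintf(), it returns the length of the formatted string:

```c
#include <stdint.h>
#include <stdio.h>

static const char * const unit_suffix_sketch[] = {
	"B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"
};

/* Format size using 1024-based units; returns what snprintf() returns. */
static int pretty_size_sketch(uint64_t size, char *str, size_t str_size)
{
	uint64_t last = size;
	int idx = 0;

	if (str_size == 0)
		return 0;

	/* Divide down, remembering the previous value for the fraction. */
	while (size >= 1024 && idx < 6) {
		last = size;
		size /= 1024;
		idx++;
	}

	if (idx == 0)
		return snprintf(str, str_size, "%llu%s",
				(unsigned long long)size, unit_suffix_sketch[0]);

	return snprintf(str, str_size, "%.2f%s",
			(double)last / 1024, unit_suffix_sketch[idx]);
}
```

With this, 24576 bytes is reported as 24.00KiB rather than with a
1000-based "KB" label.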
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Instead of aborting with a BUG_ON() statement, return a
negated errno code. Also updated mkfs and convert tools
to print a nicer error message when make_btrfs() returns
an error.
Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Assert that the writes of the device and chunk tree
roots succeed. This verification is currently done
for all other tree roots, however it was missing for
those 2 trees.
Should these tree root writes fail while all others succeed,
it would lead to a corrupted/incomplete btrfs filesystem,
or, more likely, to some weird failure later on in mkfs.btrfs
inside open_ctree().
Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
We don't need callers to manage string storage for each pretty_sizes()
call. We can use a macro to have per-thread and per-call static storage
so that pretty_sizes() can be used as many times as needed in printf()
arguments without requiring a bunch of supporting variables.
This lets us have a natural interface at the cost of requiring __thread
and TLS from gcc and a small amount of static storage. This seems
better than the current code or doing something with illegible format
specifier macros.
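
The macro trick can be sketched as follows. format_bytes() and pretty_bytes()
are stand-in names invented for this example, and the code relies on the GCC
extensions mentioned above (statement expressions and __thread):

```c
#include <stdio.h>

/* Hypothetical formatter standing in for the real size-printing helper. */
static int format_bytes(unsigned long long size, char *str, size_t str_size)
{
	return snprintf(str, str_size, "%lluB", size);
}

/*
 * Each expansion of the macro declares its own static, per-thread buffer,
 * so several uses can appear in one printf() argument list without
 * overwriting each other.
 */
#define pretty_bytes(size)						\
	({								\
		static __thread char str_[32];				\
		(void)format_bytes((size), str_, sizeof(str_));		\
		str_;							\
	})
```

Note that one expansion executed repeatedly, for example inside a loop, reuses
the same buffer, so the returned string must be consumed before the next
iteration.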
Signed-off-by: Zach Brown <zab@redhat.com>
Acked-by: Wang Shilong <wangs.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Some code still uses cpu_to_lexx instead of the
BTRFS_SETGET_STACK_FUNCS declared in ctree.h.
Also added some BTRFS_SETGET_STACK_FUNCS for btrfs_header and
btrfs_super.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
With commit
87c09f7 Btrfs-progs: fix memory leaks on cleanup
mkfs on multiple dev is ending with segfault at
close_all_devices() during kfree(device->name)
because mkfs calls btrfs_add_to_fsid, which does not initialize
name when dev is added to the list.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
In cases where one of the disks is not suitable for
btrfs, we fail the mkfs. However, we determine
that only after we have written btrfs to the preceding disks.
If the user changes their mind about using btrfs at this
point, they are left with no choice.
So this patch checks that all the provided disks are
suitable for btrfs at once, before proceeding to
create btrfs on any disk.
Further, this patch also removes duplicate code that checks
device suitability for btrfs.
Next, there is an existing bug in the -r mkfs option,
most of which this patch carries forward.
Ref:
[PATCH 2/2, RFC] btrfs-progs: overhaul mkfs.btrfs -r option
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Previously btrfs-image would set a METADUMP flag and would make one big system
chunk to cover the entire file system in the super in order to get around the
unpleasant business of having to adjust the chunk tree. This meant that you
could use the progs stuff on a restored file system, which is great for testing
btrfsck and other such things. But we want to be able to run the tree log
replay on a file system that is not able to run the tree log replay. So in
order to do this we need to fixup the super's chunk array and the chunk tree
itself. This is pretty easy since we restore using the logical offsets of the
metadata, so we just have to set the chunk items to have 1 stripe and have the
stripes point at the primary device and then use the logical offset of the chunk
as the physical offset. With this patch I can restore a file system image that
had a tree log and mount the file system and have the log be replayed
successfully. This patch also gives you the -o option in case you want the old
restore way, in the case where we want to make sure the system chunks as they
were given to us are correct. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Recently and intermittently I have been seeing open fail
for /dev/btrfs-control (btrfs is loaded), and there are no
dmesg errors. This may not be a complete help in digging into
the issue, but it is something which is necessary.
Thanks
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
get_fs_info() has been silently switching from a device to a mounted
path as needed; the caller's filehandle was unexpectedly closed &
reopened outside the caller's scope. Not so great.
The callers do want "fdmnt" to be the filehandle for the mount point
in all cases, though - the various ioctls act on this (not on an fd
for the device). But switching it in the local scope of get_fs_info
is incorrect; it just so happens that *usually* the fd number is
unchanged.
So - use the new helpers to detect when an argument is a block
device, and open the mounted path more obviously / explicitly
for ioctl use, storing the filehandle in fdmnt.
Then, in get_fs_info, ignore the fd completely, and use the path on
the argument to determine if the caller wanted to act on just that
device, or on all devices for the filesystem.
Affects those commands which are documented to accept either
a block device or a path:
* btrfs device stats
* btrfs replace start
* btrfs scrub start
* btrfs scrub status
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Add 3 new helpers:
* is_block_device(), to test if a path is a block device.
* get_btrfs_mount(), to get the mountpoint of a device,
if mounted.
* open_path_or_dev_mnt(path), to open either the pathname
or, if it's a mounted btrfs dev, the mountpoint. Useful
for some commands which can take either type of arg.
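
The first helper in the list could be as small as the following sketch. The
name is_block_device_sketch() and the return convention are assumptions, not
taken from the patch:

```c
#include <errno.h>
#include <sys/stat.h>

/* Returns 1 for a block device, 0 for anything else, -errno on failure. */
static int is_block_device_sketch(const char *path)
{
	struct stat st;

	if (stat(path, &st) < 0)
		return -errno;

	return S_ISBLK(st.st_mode) ? 1 : 0;
}
```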
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Allocate fs_info::super_copy dynamically of full BTRFS_SUPER_INFO_SIZE
and use it directly for saving superblock to disk.
This fixes incorrect superblock checksum after mkfs.
Signed-off-by: David Sterba <dsterba@suse.cz>
Clean btrfslabel.[c|h] out of the source tree and move those related
functions to utils.[c|h].
CC: Gene Czarcinski <gene@czarc.net>
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Refactor check_label().
- Make it static first. This is a preparation step: since we'll remove
btrfslabel.[c|h] and move those functions from there to utils.[c|h], we can
use it to pre-check the input label string.
- Fix the label length check, from BTRFS_LABEL_SIZE to BTRFS_LABEL_SIZE - 1.
- Kill the check for an invalid character in the label, see the commit below for detail:
79e0e445fc2365e47fc7f060d5a4445d37e184b8
btrfs-progs: kill check for /'s in labels.
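
The length rule from the list above, as a sketch. LABEL_SIZE_SKETCH is an
assumed value standing in for BTRFS_LABEL_SIZE, and check_label_sketch() is a
made-up name:

```c
#include <string.h>

/* Assumed size of the on-disk label field, including the trailing NUL. */
#define LABEL_SIZE_SKETCH 256

/* Returns 0 if the label fits, -1 if it is too long. */
static int check_label_sketch(const char *label)
{
	size_t len = strlen(label);

	/* One byte must stay free for the terminating NUL. */
	if (len > LABEL_SIZE_SKETCH - 1)
		return -1;

	/* No character check: '/' and friends are allowed in labels. */
	return 0;
}
```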
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
CC: David Sterba <dsterba@suse.cz>
CC: Gene Czarcinski <gene@czarc.net>
Currently, the following commands succeed.
# cat /proc/swaps
Filename Type Size Used Priority
/dev/sda3 partition 8388604 0 -1
/dev/sdc8 partition 9765884 0 -2
# mkfs.btrfs /dev/sdc8
WARNING! - Btrfs v0.20-rc1-165-g82ac345 IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using
fs created label (null) on /dev/sdc8
nodesize 4096 leafsize 4096 sectorsize 4096 size 9.31GB
Btrfs v0.20-rc1-165-g82ac345
# btrfs fi sh /dev/sdc8
Label: none uuid: fc0bdbd0-7eed-460f-b4e9-131273b66df2
Total devices 1 FS bytes used 28.00KB
devid 1 size 9.31GB used 989.62MB path /dev/sdc8
Btrfs v0.20-rc1-165-g82ac345
#
But we should check whether the target is a swap device. Fixed it.
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Tested-by: David Sterba <dsterba@suse.cz>
print more informative error when we fail to open a device
If open() fails, we should let the user know why it failed.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Gene Czarcinski <gene@czarc.net>
In the places where we copy a string into the name
member of btrfs_ioctl_vol_args or btrfs_ioctl_vol_args_v2,
we use strncpy (to not overflow the name array) and then
set the last position to the null character.
However, in both cases the arrays are defined with:
char name[MAX+1];
hence the last array position is name[MAX].
In most cases, we now insert the null at name[MAX-1]
which deprives us of one useful character.
Even the above isn't consistent through the code, so
make some helper code to make it simple, i.e.
strncpy_null(dest, src) which automatically does the
right thing based on the size of dest.
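
A sketch of how such a macro can be built. strncpy_null_impl() is a
hypothetical name for the backing function; the macro only works when dest is
a real array, because it relies on sizeof(dest):

```c
#include <string.h>

/* Copy at most n bytes and always terminate in the last array slot. */
static char *strncpy_null_impl(char *dest, const char *src, size_t n)
{
	strncpy(dest, src, n);
	if (n > 0)
		dest[n - 1] = '\0';
	return dest;
}

/* dest must be an array, not a pointer, so that sizeof gives its size. */
#define strncpy_null(dest, src) strncpy_null_impl(dest, src, sizeof(dest))
```

For a field declared as char name[MAX+1], this puts the terminator at
name[MAX] and keeps all MAX usable characters.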
Thanks to Zach Brown for the macro suggestion.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
btrfs_scan_one_dir() can overflow an arbitrarily small 256 byte buffer
with an arbitrarily slightly larger 1024 byte buffer as it remembers the
path of a dir to later descend.
Make these buffers the same size to stop the overflow and choose PATH_MAX
for that size so that it won't fail on legitimately bonkers paths.
Signed-off-by: Zach Brown <zab@redhat.com>
The super block magic is a le64 whose value looks like an unterminated
string in memory. The lack of null termination leads to clumsy use of
string functions and causes static analysis tools to warn that the
string will be unterminated.
So let's just treat it as the le64 that it is. Endian wrappers are used
on the constant so that they're compiled into run-time constants.
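
To illustrate the idea, a sketch that compares the on-disk field as a
little-endian 64-bit number. It decodes the bytes by hand instead of using the
endian wrappers from the patch, and the helper names are made up; the constant
is the value whose bytes spell "_BHRfS_M" in memory:

```c
#include <stdint.h>

#define BTRFS_MAGIC 0x4D5F53665248425FULL	/* ascii _BHRfS_M, no null */

/* Decode 8 bytes stored in little-endian order, on any host. */
static uint64_t get_le64_sketch(const unsigned char *p)
{
	uint64_t value = 0;
	int i;

	for (i = 7; i >= 0; i--)
		value = (value << 8) | p[i];
	return value;
}

/* Integer comparison: no strncmp() on an unterminated byte sequence. */
static int has_btrfs_magic(const unsigned char *magic_field)
{
	return get_le64_sketch(magic_field) == BTRFS_MAGIC;
}
```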
Signed-off-by: Zach Brown <zab@redhat.com>
Two convenient utility functions that have so far been local to scrub are
moved to utils.c.
They will be used in the device stats code in a following commit.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
The definition of the function open_file_or_dir() is moved from common.c
to utils.c in order to be able to share some common code between scrub
and the device stats in the following step. That common code uses
open_file_or_dir(). Since open_file_or_dir() makes use of the function
dirfd(3), the required XOPEN version was raised from 6 to 7.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Original-Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
The LOOP_GET_STATUS ioctl truncates filenames to 64 characters. We should get
the backing file for a given loop device from /sys/. This is how losetup does it
as well.
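
A sketch of the sysfs lookup, assuming the usual
/sys/block/loopN/loop/backing_file attribute. Both function names are
hypothetical:

```c
#include <stdio.h>
#include <string.h>

/* Build the sysfs attribute path for a loop device such as /dev/loop0. */
static int loop_backing_file_attr(const char *dev, char *out, size_t out_size)
{
	const char *name = strrchr(dev, '/');
	int len;

	name = name ? name + 1 : dev;
	if (strncmp(name, "loop", 4) != 0)
		return -1;

	len = snprintf(out, out_size, "/sys/block/%s/loop/backing_file", name);
	if (len < 0 || (size_t)len >= out_size)
		return -1;
	return 0;
}

/* Read the first line of the attribute; no 64 character limit applies. */
static int read_backing_file(const char *attr_path, char *out, size_t out_size)
{
	FILE *f = fopen(attr_path, "r");

	if (!f)
		return -1;
	if (!fgets(out, (int)out_size, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	out[strcspn(out, "\n")] = '\0';	/* strip the trailing newline */
	return 0;
}
```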
Signed-off-by: Nirbheek Chauhan <nirbheek.chauhan@collabora.co.uk>
Signed-off-by: Gene Czarcinski <gene@czarc.net>
Tested-By: Hector Oron <hector.oron@collabora.co.uk>
Ignore the errors ENXIO (device doesn't exist) and ENOMEDIUM
(no medium found, e.g. an empty CD tray) in the function
btrfs_scan_one_dir.
This avoids spurious errors due to an empty CD or a block device node
without a device (which is frequent in a static /dev).
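
In code, the filter amounts to something like this sketch
(is_expected_scan_error() is a made-up name; ENOMEDIUM is Linux-specific):

```c
#include <errno.h>

/* Errors that are normal while scanning /dev and not worth reporting. */
static int is_expected_scan_error(int err)
{
	return err == ENXIO || err == ENOMEDIUM;
}
```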
Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
Add new suffixes to the parse_size() function. The new suffixes are: T for
terabyte, P for petabyte, E for exabyte. Note that these units are
powers of 2.
Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
Replace the function atoll with strtoull(); Check that the suffix for the
parse_size() input is of only one character.
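
A sketch combining both ideas: strtoull() for the number and a single-character
binary suffix. parse_size_sketch() is a hypothetical name, the K/M/G cases and
the overflow check are assumptions, and a shift table is used here rather than
whatever the real function does:

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse "<number>[kmgtpe]" into bytes; returns 0 on success, -1 on error. */
static int parse_size_sketch(const char *s, uint64_t *out)
{
	char *end;
	unsigned long long value;
	int shift = 0;

	if (s[0] == '-')
		return -1;

	errno = 0;
	value = strtoull(s, &end, 10);
	if (end == s || errno == ERANGE)
		return -1;

	if (*end != '\0') {
		switch (tolower((unsigned char)*end)) {
		case 'e': shift = 60; break;
		case 'p': shift = 50; break;
		case 't': shift = 40; break;
		case 'g': shift = 30; break;
		case 'm': shift = 20; break;
		case 'k': shift = 10; break;
		default:
			return -1;
		}
		/* The suffix must be exactly one character long. */
		if (end[1] != '\0')
			return -1;
	}

	/* Refuse values that would not fit into 64 bits after scaling. */
	if (shift && value > (UINT64_MAX >> shift))
		return -1;

	*out = (uint64_t)value << shift;
	return 0;
}
```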
Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
My patch
04609add88ef8428d725de6ef60f46a3ff0dbc8e
introduced a regression where if you mkfs'ed a group of disks with different
sizes it limited the disks to the size of the first one that is specified.
This was not the intent of my patch, I only want it to limit the size based
on the -b option, so I've reworked the code to pass in a max block count and
that fixes the issue. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
I had a test that creates a 7gig raid1 device but it was ending up wonky
because the second device that gets added is the full size of the disk
instead of the limited size. So enforce the limited size on all disks
passed in at mkfs time, otherwise our threshold calculations end up wonky
when doing chunk allocations. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
If we iterate the "goto again" loop, we've called "closedir(dirp)",
yet at the top of the loop, upon malloc failure we "goto fail",
where we test dirp and if non-NULL, call closedir(dirp) again.
* utils.c (btrfs_scan_one_dir): Clear "dirp" after closedir to avoid
use-after-free upon failed fullpath = malloc(...
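
The fix boils down to never leaving a stale DIR pointer behind. A sketch of
that rule as a tiny helper (close_dir_once() is a hypothetical name, not part
of the patch):

```c
#include <dirent.h>
#include <stddef.h>

/* Close the stream and clear the pointer so a later cleanup is a no-op. */
static void close_dir_once(DIR **dirp)
{
	if (*dirp) {
		closedir(*dirp);
		*dirp = NULL;
	}
}
```

With the pointer cleared, a shared "fail:" label can test dirp safely even
after the loop has already closed the directory.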
Signed-off-by: Jim Meyering <meyering@redhat.com>
When we're using multipath or raid0, it is possible
that btrfs dev scan will find one of the component devices
instead of the proper virtual device the kernel creates.
We want to make sure the kernel scans the virtual devices last,
since it always remembers the last device it finds with a given fsid.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
btrfs_scan_for_fsid is used by open_ctree and by mkfs when it is
checking for mounted devices. It currently scans all of /dev,
which is rarely the right answer.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
/proc/mounts contains device names that don't exist,
we end up erroring out because we're not able to stat
the device (that doesn't exist).
Fix this by allowing the mkfs when the target device doesn't exist.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
During the commands:
- btrfs filesystem show
- btrfs device scan
the devices "scanned" are extracted from /proc/partitions. This
avoids scanning devices not suitable for a btrfs filesystem, like cdrom
and floppy, and avoids scanning nonexistent devices.
The old behavior (scan all the block devices under /dev) may be
forced by passing the "--all-devices" switch.
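
A sketch of extracting device paths from the /proc/partitions format (a header
line, a blank line, then "major minor #blocks name" rows).
list_partitions() and its fixed-size output array are assumptions made for
this example; it reads from any FILE so it can be tried on sample text:

```c
#include <stdio.h>

/* Fill paths[] with "/dev/<name>" for each row; returns the row count. */
static int list_partitions(FILE *f, char paths[][64], int max)
{
	char line[256];
	char name[48];
	unsigned long long blocks;
	int major, minor;
	int count = 0;

	while (count < max && fgets(line, sizeof(line), f)) {
		/* The header and blank lines do not match and are skipped. */
		if (sscanf(line, " %d %d %llu %47s",
			   &major, &minor, &blocks, name) != 4)
			continue;
		snprintf(paths[count], 64, "/dev/%s", name);
		count++;
	}
	return count;
}
```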
New version of check_mounted() returning more information gathered while
searching. check_mounted() is now a wrapper for check_mounted_where(). The
new version is needed by scrub.c.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Signed-off-by: Hugo Mills <hugo@carfax.org.uk>
We discovered that the speed setting is (probably unintentionally) initialized
to 1 in make_btrfs(), while it is initialized to 0 in btrfs_add_to_fsid(). The
initialization in make_btrfs() is due to reuse of buf after pwrite() without
clearing it. Consequently, code like
btrfs_set_extent_generation(buf, extent_item, 1);
writes to the same location in buf where speed will be placed later. It may be
a good idea to clear buf after each pwrite(), though leaving the struct
btrfs_header intact.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Signed-off-by: Hugo Mills <hugo@carfax.org.uk>
This patch adds the setting of time to the root directory to the
mkfs.btrfs command.
As a result, the time of the mount point, which was not displayed
correctly before, is now displayed correctly.
[before]
# mkfs.btrfs /dev/sdd10
# mount /dev/sdd10 /test1
# ls -ld /test1
dr-xr-xr-x 1 root root 0 Jan 1 1970 /test1
[after]
# date
Tue Nov 16 18:06:05 JST 2010
# mkfs.btrfs /dev/sdd10
# mount /dev/sdd10 /test1
# ls -ld /test1
dr-xr-xr-x 1 root root 0 Nov 16 18:06 /test1
Thanks,
Tsutomu
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Hugo Mills <hugo@carfax.org.uk>
Discard the whole device before starting to create the filesystem structures.
Modelled after similar support in mkfs.xfs.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Hi all,
this patch adds the command "btrfs filesystem label" to change (or show) the
label of a filesystem.
This patch is a subset of the one written previously by Morey Roof. I
included the user space part only. So it is only possible to change/show the
label of a *single device* and *unmounted* filesystem.
The reason for excluding the kernel space part is to simplify the patch in
order to speed up the review and then the merging of the patch itself. In fact I
have to point out that in the past there were about three attempts to propose
this patch, without success or complaints.
Chris, let me know how you want to proceed. I know that you are very busy,
and you prefer to work on stabilizing btrfs instead of adding new features. But I
think that changing a label is an *essential* feature for a filesystem
managing tool. Think about a mount by LABEL.
To show a label
$ btrfs filesystem label <device>
To set a label
$ btrfs filesystem label <device> <newlabel>
Please guys, give a look to the source.
Comments are welcome.
You can pull the source from the branch "label" of the repository
http://cassiopea.homelinux.net/git/btrfs-progs-unstable.git
Regards
G.Baroncelli
Signed-off-by: Chris Mason <chris.mason@oracle.com>
So a lot of crazy people (I'm looking at you Meego) want to use btrfs on phones
and such with small devices. Unfortunately the way we split out metadata/data
chunks makes space usage inefficient for volumes that are smaller than
1 gigabyte. So add a -M option for mixing metadata+data, and default to this
mixed mode if the filesystem is less than or equal to 1 gigabyte. I've tested
this with xfstests on a 100mb filesystem and everything is a-ok.
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
This patch updates the super field to add the cache_generation member. It also
makes us set it to -1 on mkfs so any new filesystem will get the space cache
stuff turned on. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Hi Chris,
below is enclosed a trivial patch, which has the aim to improve the error
reporting of the "btrfs" command.
You can pull from
http://cassiopea.homelinux.net/git/btrfs-progs-unstable.git
branch
strerror
I changed every printf("some-error") to something like:
e = errno;
fprintf(stderr, "ERROR: .... - %s", strerror(e));
so:
1) all the errors are reported to standard error
2) the error as returned by the system is printed at the end of the message.
The change is quite simple: I replaced every printf("some-error") with the lines
above. I didn't touch anything else.
I also integrated a missing "printf" on the basis of Ben's patch.
This patch makes the btrfs command more "user friendly" :-)
Regards
G.Baroncelli
btrfs-list.c | 40 ++++++++++++++++++++++--------
btrfs_cmds.c | 77 ++++++++++++++++++++++++++++++++++++++++-----------------
utils.c | 6 ++++
3 files changed, 89 insertions(+), 34 deletions(-)
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Check_mount() should also work with multi device filesystems.
This patch adds checks that allow to detect if a file is a device
file used by a mounted single or multi device btrfs or if it is a
regular file used by a loopback device that is part of a mounted
single or multi device btrfs.
The single device checks also work for non-btrfs filesystems.
This might be helpful to prevent users from running btrfs programs
(e.g. mkfs.btrfs) accidentally on a filesystem used somewhere else.
Signed-off-by: Andi Drebes <lists-receive@programmierforen.de>
This commit introduces a new kind of back reference for btrfs metadata.
Once a filesystem has been mounted with this commit, IT WILL NO LONGER
BE MOUNTABLE BY OLDER KERNELS.
The new back ref provides information about pointer's key, level and in which
tree the pointer lives. This information allows us to find the pointer by
searching the tree. The shortcoming of the new back ref is that it only works
for pointers in tree blocks referenced by their owner trees.
This is mostly a problem for snapshots, where resolving one of these fuzzy back
references would be O(number_of_snapshots) and quite slow. The solution used
here is to use the fuzzy back references in the common case where a given tree
block is only referenced by one root, and use the full back references when
multiple roots have a reference
The structure used to send device in btrfs ioctl calls was not
properly aligned, and so 32 bit ioctls would not work properly on
64 bit kernels.
We could fix this with compat ioctls, but we're just one byte away
and it doesn't make sense at this stage in the project to carry the
compat ioctls around forever.
This patch brings the ioctl arg up to an evenly aligned 4k.
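
A sketch of what "an evenly aligned 4k" means for such an argument structure.
The struct name, the field layout and the 4087 constant are assumptions chosen
so that the total comes out at exactly 4096 bytes on both 32 bit and 64 bit
builds:

```c
#include <stddef.h>
#include <stdint.h>

#define PATH_NAME_MAX_SKETCH 4087	/* assumed: 8 + 4087 + 1 == 4096 */

struct vol_args_sketch {
	int64_t fd;				/* fixed width on every ABI */
	char name[PATH_NAME_MAX_SKETCH + 1];	/* includes the NUL */
};

/* Fail the build if the layout ever drifts away from 4k. */
_Static_assert(sizeof(struct vol_args_sketch) == 4096,
	       "ioctl argument must be exactly 4k");
```

Because the size is part of the ioctl number encoding, a layout that has the
same size for 32 bit and 64 bit callers avoids the need for compat handlers.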
Signed-off-by: Chris Mason <chris.mason@oracle.com>
btrfsctl -a does nothing and outputs no error
if btrfs.ko is not inserted.
Since no caller does error processing for btrfs_register_one_device,
make it return void and do the error processing inside.
Signed-off-by: Shen Feng <shen@cn.fujitsu.com>
This patch updates the ext3 to btrfs converter for the new
disk format. This mainly involves changing the convert's
data relocation and free space management code. This patch
also ports some functions from kernel module to btrfs-progs.
Thank you,
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
This patch updates btrfs-progs for superblock duplication.
Note: I didn't make this patch as complete as the one for
kernel since updating the converter requires changing the
code again. Thank you,
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Btrfs stores checksums for each data block. Until now, they have
been stored in the subvolume trees, indexed by the inode that is
referencing the data block. This means that when we read the inode,
we've probably read in at least some checksums as well.
But, this has a few problems:
* The checksums are indexed by logical offset in the file. When
compression is on, this means we have to do the expensive checksumming
on the uncompressed data. It would be faster if we could checksum
the compressed data instead.
* If we implement encryption, we'll be checksumming the plain text and
storing that on disk. This is significantly less secure.
* For either compression or encryption, we have to get the plain text
back before we can verify the checksum as correct. This makes the raid
layer balancing and extent moving much more expensive.
* It makes the front end caching code more complex, as we have to touch
the subvolume and inodes as we cache extents.
* There is potentially one copy of the checksum in each subvolume
referencing an extent.
The solution used here is to store the extent checksums in a dedicated
tree. This allows us to index the checksums by physical extent
start and length. It means:
* The checksum is against the data stored on disk, after any compression
or encryption is done.
* The checksum is stored in a central location, and can be verified without
following back references, or reading inodes.
This makes compression significantly faster by reducing the amount of
data that needs to be checksummed. It will also allow much faster
raid management code in general.
The checksums are indexed by a key with a fixed objectid (a magic value
in ctree.h) and offset set to the starting byte of the extent. This
allows us to copy the checksum items into the fsync log tree directly (or
any other tree), without having to invent a second format for them.
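The key layout described above can be sketched as follows. This is only an
illustration: the struct, the helper names and the two constants are
placeholders (the real magic objectid lives in ctree.h and is not
reproduced here). It shows that every checksum key shares one objectid and
that keys sort by the starting byte of the extent.

```c
#include <stdint.h>

/* Placeholder values, NOT the real constants from ctree.h. */
#define EXAMPLE_CSUM_OBJECTID ((uint64_t)-10)
#define EXAMPLE_CSUM_TYPE     128

/* Hypothetical in-memory key: (objectid, type, offset). */
struct example_key {
	uint64_t objectid;
	uint8_t type;
	uint64_t offset;
};

/* Build the key for the checksums of the extent starting at extent_start. */
static struct example_key example_csum_key(uint64_t extent_start)
{
	struct example_key key;

	key.objectid = EXAMPLE_CSUM_OBJECTID; /* same for every csum item */
	key.type = EXAMPLE_CSUM_TYPE;
	key.offset = extent_start;            /* starting byte of the extent */
	return key;
}

/* Order keys by objectid, then type, then offset. */
static int example_key_cmp(const struct example_key *a,
			   const struct example_key *b)
{
	if (a->objectid != b->objectid)
		return a->objectid < b->objectid ? -1 : 1;
	if (a->type != b->type)
		return a->type < b->type ? -1 : 1;
	if (a->offset != b->offset)
		return a->offset < b->offset ? -1 : 1;
	return 0;
}
```

Because nothing in the key refers to an inode or subvolume, the same item
can be placed in another tree without translation.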
Signed-off-by: Chris Mason <chris.mason@oracle.com>
This is the btrfs-progs version of the patch to add the ability to have
different csum algorithms. Note I didn't change the image maker since it
seemed a bit more complicated than just changing some stuff around so I will let
Yan take care of that.
Everything else was converted and for now a mkfs just
sets the type to be BTRFS_CSUM_TYPE_CRC32.
Signed-off-by: Josef Bacik <jbacik@redhat.com>
This patch does the following:
1) Update device management code to match the kernel code.
2) Allocator fixes.
3) Add a program called btrfstune to set/clear the SEEDING
super block flags.
This patch adds transaction IDs to root tree pointers.
Transaction IDs in tree pointers are compared with the
generation numbers in block headers when reading root
blocks of trees. This can detect some types of IO errors.
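A rough sketch of the comparison, with hypothetical struct and function
names (the real code works on btrfs root items and extent buffers, which are
not shown here): the pointer remembers the transaction ID it was written
with, and the block read from disk must carry the same generation.

```c
#include <stdint.h>

/* Hypothetical pointer stored in the root tree. */
struct example_root_ptr {
	uint64_t bytenr;      /* where the root block lives */
	uint64_t generation;  /* transaction ID recorded with the pointer */
};

/* Hypothetical header found at the start of the block on disk. */
struct example_block_header {
	uint64_t bytenr;
	uint64_t generation;  /* transaction ID that wrote this block */
};

/*
 * Return 0 if the block matches what the pointer expects, -1 otherwise.
 * A mismatch suggests a stale or misdirected block was read.
 */
static int example_verify_root_block(const struct example_root_ptr *ptr,
				     const struct example_block_header *hdr)
{
	if (hdr->bytenr != ptr->bytenr)
		return -1;
	if (hdr->generation != ptr->generation)
		return -1;
	return 0;
}
```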
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
The offset field in struct btrfs_extent_ref records the position
inside the file at which the file extent is referenced. In the new back
reference system, tree leaves holding references to file extents
are recorded explicitly. We can quickly scan these tree leaves, so the
offset field is not required.
This patch also makes the back reference system check the objectid
when extents are being deleted.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
This patch makes the back reference system explicitly record the
location of the parent node for all types of extents. The location of
the parent node is placed into the offset field of the backref key. Every
time a tree block is balanced, the back references for the affected
lower level extents are updated.
Gcc only sends warnings for uninitialized variables when you compile with -O,
and there were a couple of bugs sprinkled in the code. The biggest was the
alloc_start variable for mkfs, which can cause strange things to happen.
(thanks to Gabor Micsko for helping to find this)
The main changes in this patch are adding chunk handling and data relocation
ability. In the last step of conversion, the converter relocates data in the
system chunk and moves the chunk tree into the system chunk. In the rollback
process, the converter removes the chunk tree from the system chunk and copies
the data back.
Regards
YZ
---
Block headers now store the chunk tree uuid
Chunk items record the device uuid for each stripe
Device extent items record better back refs to the chunk tree
Block groups record better back refs to the chunk tree
The chunk tree format has also changed. The objectid of BTRFS_CHUNK_ITEM_KEY
used to be the logical offset of the chunk. Now it is a chunk tree id,
with the logical offset being stored in the offset field of the key.
This allows a single chunk tree to record multiple logical address spaces,
upping the number of bytes indexed by a chunk tree from 2^64 to
2^128.
The mkfs code bootstraps the filesystem on a single device. Once
the raid block groups are set up, it needs to recow all of the blocks so
that each tree is properly allocated.
We get lots of warnings of the flavor:
utils.c:441: warning: format '%Lu' expects type 'long long unsigned int' but argument 2 has type 'u64'
And thanks to -Werror, the build fails. Clean up these printfs
by properly casting the arg to the format specified.
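The cast pattern looks roughly like this. The typedef and the helper name
are stand-ins for illustration, and the sketch uses the standard "%llu"
spelling rather than the "%Lu" form from the warning above.

```c
#include <stdint.h>
#include <stdio.h>

/* Stand-in for the project's u64 typedef. */
typedef uint64_t u64;

/*
 * Format a u64 portably: on some platforms u64 is 'unsigned long', so
 * the argument is cast to exactly the type the format specifier expects.
 */
static int example_format_u64(char *buf, size_t len, u64 value)
{
	return snprintf(buf, len, "%llu", (unsigned long long)value);
}
```

An alternative is the PRIu64 macro from inttypes.h, which avoids the cast
but makes the format strings harder to read.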
Signed-off-by: Alex Chiang <achiang@hp.com>
This guards against the blunder of formatting a live, mounted filesystem.
This can be extended to get the mount flags of the filesystem
mounted.
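One way such a check could look, as a sketch only: it scans a mount table
with the glibc getmntent() interface and compares device names literally.
The function name is hypothetical, and a real check would also need to
handle symlinks, loop devices and multi-device filesystems.

```c
#include <mntent.h>
#include <stdio.h>
#include <string.h>

/*
 * Return 1 if 'device' appears as a mounted source in the mount table at
 * 'mtab_path' (for example /proc/mounts), 0 if not, -1 if the table
 * cannot be opened.
 */
static int example_is_mounted(const char *mtab_path, const char *device)
{
	FILE *f = setmntent(mtab_path, "r");
	struct mntent *ent;
	int found = 0;

	if (!f)
		return -1;

	while ((ent = getmntent(f)) != NULL) {
		if (strcmp(ent->mnt_fsname, device) == 0) {
			found = 1;
			break;
		}
	}
	endmntent(f);
	return found;
}
```

The struct mntent entry also carries mnt_opts, which is where the mount
flags mentioned above could be read from.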
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@gmail.com>
Using strncpy avoids a 1 byte overflow into the next field
of the struct. The overflow is harmless, but does
trip automated tools.
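For illustration, a bounded copy of this kind might look like the sketch
below. The struct and function are made up for the example and do not
reproduce the patched call in utils.c. Note that strncpy does not
terminate the string when the source is too long, so the terminator is
written explicitly.

```c
#include <string.h>

/* Hypothetical struct: a fixed-size name followed by another field. */
struct example_args {
	char name[16];
	int flags;  /* the "next field" an unbounded copy could spill into */
};

/* Copy src into args->name without ever writing past the buffer. */
static void example_set_name(struct example_args *args, const char *src)
{
	/* Copy at most sizeof(name) - 1 bytes, leaving room for the NUL. */
	strncpy(args->name, src, sizeof(args->name) - 1);
	args->name[sizeof(args->name) - 1] = '\0';
}
```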
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
---
utils.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
This patch adds rollback support for the converter. The converter can
roll back a conversion if the image file hasn't been modified. In
addition, I rearranged some code in convert.c and added a few comments.