When running mkfs.btrfs on a thin-provisioned device with a very small
backing size and a big virtual size, all code in mkfs.btrfs works well
until close_ctree() is called.
close_ctree() fails to sync the device while closing devices because of
the small backing size. However, mkfs returns 0 in this situation, which
causes fstests generic/405 to fail.
So, make mkfs return a nonzero value if the previous steps succeeded but
close_ctree() failed. With this change fstests generic/405 passes.
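The status handling described above can be sketched as follows. This is
only an illustration: final_status() is a hypothetical helper, not a
function from mkfs.btrfs, and the two arguments stand in for the result of
the earlier steps and the result of close_ctree().

```c
#include <stdio.h>

/*
 * Hypothetical helper: combine the result of the earlier mkfs steps with
 * the result of closing the filesystem. An earlier error wins; otherwise
 * a close failure is reported instead of being silently dropped.
 */
static int final_status(int prev_ret, int close_ret)
{
	if (prev_ret)
		return prev_ret;
	if (close_ret) {
		fprintf(stderr, "ERROR: failed to close ctree: %d\n",
			close_ret);
		return close_ret;
	}
	return 0;
}
```

The process exit code would then be nonzero whenever final_status()
returns nonzero.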
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
mkfs-test 016 "rootdir-bad-symbolic-link" fails when selinux is enabled.
This is because add_xattr_item() uses getxattr() and tries to follow a
bad symbolic link for the selinux item, which causes an ENOENT error.
The line above already uses llistxattr() to get the list of xattrs
without following a symbolic link, so just use lgetxattr() too.
Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Using cp -a to install files will preserve the ownership of the original
files (if possible), which is typically not wanted. E.g. if the files
were built by a normal user, but are being installed by root, then the
installed files would maintain the UIDs/GIDs of the user that built the
files rather than be owned by root.
Signed-off-by: Peter Kjellerstedt <peter.kjellerstedt@axis.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
A few more typo fixes, merged with the pull request.
Pull-request: #120
Signed-off-by: Gu Jinxiang <gujx@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Since "btrfs-progs: mkfs: add uuid and otime to ROOT_ITEM of FS_TREE",
the top-level subvolume has a non-zero UUID, ctime, and otime. Fix the
subvolume_info() test to not check for zero.
Signed-off-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Otherwise, make test-libbtrfsutil from a fresh checkout fails.
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
If we fail to reallocate the ID array, we still need to free it.
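A minimal sketch of the pattern, assuming a hypothetical growable array
(struct id_array and id_array_push() are illustrative names, not the
libbtrfsutil code): realloc() leaves the old block untouched on failure,
so the error path has to free it explicitly.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable array of 64-bit IDs. */
struct id_array {
	uint64_t *ids;
	size_t len;
	size_t cap;
};

/*
 * Append one ID. On allocation failure the old buffer is released and
 * the struct is reset, so the caller has nothing left to free.
 */
static int id_array_push(struct id_array *a, uint64_t id)
{
	if (a->len == a->cap) {
		size_t new_cap = a->cap ? a->cap * 2 : 4;
		uint64_t *tmp = realloc(a->ids, new_cap * sizeof(*tmp));

		if (!tmp) {
			/* realloc() keeps the old block valid on failure. */
			free(a->ids);
			a->ids = NULL;
			a->len = 0;
			a->cap = 0;
			return -1;
		}
		a->ids = tmp;
		a->cap = new_cap;
	}
	a->ids[a->len++] = id;
	return 0;
}
```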
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Deleted free space cache inodes also get an orphan item in the root
tree, but we shouldn't report those as deleted subvolumes. Deleted
subvolumes will still have the root item, so we can just do an extra
tree search.
Reported-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We can easily create the uuid tree that's usually created after first
mount. The kernel will still check the tree on first mount so we don't
try to fake the uuid tree generation so it appears consistent, even if
it's empty.
Signed-off-by: David Sterba <dsterba@suse.com>
read_data_extent() in convert/main.c uses the mirror number in an
incorrect way, so it will not get the correct copy for RAID1:
	for (cur_mirror = 0; cur_mirror < num_copies; cur_mirror++) {
In this case, for RAID1 @cur_mirror will only be 0 and 1.
However, for both 0 and 1 btrfs_map_block() will only return the first
copy. To reach the 2nd copy, the correct @cur_mirror range is 1 and 2.
So with this off-by-one error, btrfs-image will never be able to read
out a data extent if the first stripe of the chunk is the missing one.
Fix it by iterating @cur_mirror from 1 to @num_copies (inclusive).
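The effect of the loop bounds can be shown with a toy model. Everything
below is hypothetical (struct toy_chunk, toy_read_mirror() and
toy_read_any() are not btrfs-progs code); the only behaviour taken from
the description above is that mirror numbers 0 and 1 both select the
first copy and mirror number 2 selects the second.

```c
#include <stdbool.h>

/* Toy model of a two-copy (RAID1-like) chunk. */
struct toy_chunk {
	bool copy_ok[2];	/* true if that copy can be read */
};

/* Mirror 0 and 1 map to the first copy, mirror 2 to the second. */
static bool toy_read_mirror(const struct toy_chunk *c, int mirror)
{
	int index = mirror <= 1 ? 0 : mirror - 1;

	return c->copy_ok[index];
}

/* Try every mirror in [first, last]; return the one that worked or -1. */
static int toy_read_any(const struct toy_chunk *c, int first, int last)
{
	for (int mirror = first; mirror <= last; mirror++) {
		if (toy_read_mirror(c, mirror))
			return mirror;
	}
	return -1;
}
```

With the first copy unreadable, iterating over 0..num_copies-1 never
reaches the good copy, while iterating over 1..num_copies does.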
Fixes: 2d46558b30 ("btrfs-progs: Use existing facility to replace read_data_extent function")
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When a device is missing, read_extent_data() (a function exported from
the old btrfs check code) has the following problems:
1) Modifies @len parameter if device is missing
   If the device returned in @multi is missing, @len can be larger than
   @max_len (the original length).
   This could confuse the caller and cause an underflow in the read loop.
2) Still returns 0 for missing device
   It only handles read errors; a missing device is not handled and 0 is
   returned.
3) Wrong check for device->fd
   In fact, 0 is also a valid fd.
   Although this is unlikely in most cases, it still needs to be fixed.
Fix them all.
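A rough sketch of the three checks, assuming a simplified device
structure. struct toy_device and toy_prepare_read() are hypothetical
names, and the convention that a missing device has fd == -1 is an
assumption made for this example, not taken from read_extent_data().

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical device handle; fd is assumed to be -1 when missing. */
struct toy_device {
	int fd;
};

/*
 * Decide how many bytes to read from one mapped stripe. mapped_len is
 * what the (hypothetical) mapping step returned; the result must never
 * exceed max_len, the length the caller asked for.
 */
static int toy_prepare_read(const struct toy_device *dev,
			    uint64_t mapped_len, uint64_t max_len,
			    uint64_t *read_len)
{
	/* 0 is a valid descriptor, so only negative values mean missing. */
	if (!dev || dev->fd < 0)
		return -EIO;

	*read_len = mapped_len < max_len ? mapped_len : max_len;
	return 0;
}
```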
Fixes: 1bad2f2f2d ("Btrfs-progs: fsck: add an option to check data csums")
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Parameter usagestr is not used, remove it.
Signed-off-by: Gu Jinxiang <gujx@cn.fujitsu.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
If we have a symbolic link in rootdir pointing to a non-existent location,
mkfs.btrfs --rootdir will just fail:
------
$ mkfs.btrfs -f --rootdir /tmp/rootdir/ /dev/data/btrfs
btrfs-progs v4.15.1
See http://btrfs.wiki.kernel.org for more information.
ERROR: ftw subdir walk of /tmp/rootdir/ failed: No such file or directory
------
[CAUSE]
Commit 599a0abed5 ("btrfs-progs: mkfs/rootdir: Use over-reserve method
to make size estimate easier") added an extra ftw walk to estimate the
filesystem size.
This default ftw walk follows symbolic links and gives an ENOENT error.
[FIX]
Use nftw() to specify FTW_PHYS so we won't follow symbolic link for size
calculation.
Issue: #109
Reported-by: Alexander Kanavin <alexander.kanavin@intel.com>
Fixes: 599a0abed5 ("btrfs-progs: mkfs/rootdir: Use over-reserve method to make size estimate easier")
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Modify the cscope/ctags rule to include directories such as check/,
libbtrfsutil/, kernel-lib/ and kernel-shared/.
Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
$ make btrfs-sb-mod
$ ./btrfs-sb-mod image field1 operation1 ...
Fields (only u64 supported for now):
* total_bytes
* root
* generation
* chunk_root
* chunk_root_generation
* cache_generation
* uuid_tree_generation
Operations:
* read value ?0
* set value =NUMBER
* add to +NUMBER
* subtract from value -NUMBER
* xor with value ^NUMBER
* byteswap (u64) @0
Use with care!
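The operation prefixes listed above could be interpreted roughly as in
the following sketch. toy_bswap64() and toy_apply_op() are hypothetical
helpers written for illustration and are not taken from btrfs-sb-mod;
parsing of the NUMBER argument is left out.

```c
#include <stdint.h>

/* Reverse the byte order of a 64-bit value. */
static uint64_t toy_bswap64(uint64_t v)
{
	uint64_t out = 0;

	for (int i = 0; i < 8; i++) {
		out = (out << 8) | (v & 0xff);
		v >>= 8;
	}
	return out;
}

/*
 * Apply one operation, identified by its prefix character, to the
 * current field value. Returns 0 on success, -1 for an unknown prefix.
 */
static int toy_apply_op(uint64_t cur, char op, uint64_t arg,
			uint64_t *result)
{
	switch (op) {
	case '?': *result = cur; break;			/* read value */
	case '=': *result = arg; break;			/* set value */
	case '+': *result = cur + arg; break;		/* add to */
	case '-': *result = cur - arg; break;		/* subtract from */
	case '^': *result = cur ^ arg; break;		/* xor with */
	case '@': *result = toy_bswap64(cur); break;	/* byteswap */
	default:
		return -1;
	}
	return 0;
}
```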
Signed-off-by: David Sterba <dsterba@suse.com>
The @first_key variable was introduced in f5c4c4f3b7
("btrfsck: add code to rebuild extent records"). However, its value is
not only never used, it is also filled in incorrectly:
btrfs_item_key_to_cpu() is called on a node extent buffer.
Anyway, just remove it.
Fixes: f5c4c4f3b7 ("btrfsck: add code to rebuild extent records")
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The kernel code no longer has BTRFS_CRC32_SIZE and only uses
btrfs_csum_sizes[]. So, update the progs code as well.
Suggested-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Currently, the top-level subvolume lacks a UUID. As a result, both
non-snapshot subvolumes and snapshots of the top-level subvolume have no
Parent UUID and cannot be distinguished. Therefore "fi show" of the
top-level subvolume lists all the subvolumes which lack the UUID in the
"Snapshot(s)" field. The top-level subvolume also lacks the otime
information.
Fix this by adding the UUID and otime at mkfs time. As a consequence,
snapshots of the top-level subvolume now have a Parent UUID and the
UUID tree will get an entry for the top-level subvolume at mount time.
This should not cause a problem for current kernels, but user programs
which rely on the empty Parent UUID may be affected by this change.
Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Currently we print the raw values of the owner field of leaf/nodes.
This can result in output like the following:
leaf 30490624 items 2 free space 16061 generation 4 owner 18446744073709551607
With the patch applied the same leaf looks like:
leaf 30490624 items 2 free space 16061 generation 4 owner DATA_RELOC_TREE
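The raw value in the first example is simply (u64)-9, the objectid of the
data reloc tree. A minimal sketch of such a translation, assuming a
hypothetical helper toy_owner_name() and covering only a handful of
well-known objectids (the TOY_* macros are local to this example):

```c
#include <stddef.h>
#include <stdint.h>

/* A few well-known tree objectids; negative ones wrap around as u64. */
#define TOY_ROOT_TREE_OBJECTID		1ULL
#define TOY_EXTENT_TREE_OBJECTID	2ULL
#define TOY_CHUNK_TREE_OBJECTID		3ULL
#define TOY_FS_TREE_OBJECTID		5ULL
#define TOY_TREE_RELOC_OBJECTID		((uint64_t)-8)
#define TOY_DATA_RELOC_TREE_OBJECTID	((uint64_t)-9)

/* Return a symbolic name for the owner, or NULL to print the number. */
static const char *toy_owner_name(uint64_t owner)
{
	switch (owner) {
	case TOY_ROOT_TREE_OBJECTID:		return "ROOT_TREE";
	case TOY_EXTENT_TREE_OBJECTID:		return "EXTENT_TREE";
	case TOY_CHUNK_TREE_OBJECTID:		return "CHUNK_TREE";
	case TOY_FS_TREE_OBJECTID:		return "FS_TREE";
	case TOY_TREE_RELOC_OBJECTID:		return "TREE_RELOC";
	case TOY_DATA_RELOC_TREE_OBJECTID:	return "DATA_RELOC_TREE";
	default:				return NULL;
	}
}
```

A caller would fall back to printing the number when NULL is returned,
for example for ordinary subvolume ids.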
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Add a test case for mkfs --rootdir, using files with different file
sizes to check if invalid large inline extent could exist.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
For inline compressed file extent, kernel doesn't allow inline extent
ram size larger than sector size and on-disk inline extent size should
not exceed BTRFS_MAX_INLINE_DATA_SIZE().
For inline uncompressed file extent, kernel doesn't allow inline extent
ram and on-disk size larger than either BTRFS_MAX_INLINE_DATA_SIZE() or
sector size.
Check it in original mode.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Just like convert, we need extra check against sector size for creating
inline extent.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[Bug]
On btrfs converted from ext*, one user reported the following kernel
warning:
------------[ cut here ]------------
BTRFS: Transaction aborted (error -95)
WARNING: CPU: 0 PID: 324 at fs/btrfs/inode.c:3042 btrfs_finish_ordered_io+0x7ab/0x850 [btrfs]
Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
RIP: 0010:btrfs_finish_ordered_io+0x7ab/0x850 [btrfs]
...
Call Trace:
normal_work_helper+0x39/0x370 [btrfs]
process_one_work+0x1ce/0x410
worker_thread+0x2b/0x3d0
? process_one_work+0x410/0x410
kthread+0x113/0x130
? kthread_create_on_node+0x70/0x70
? do_syscall_64+0x74/0x190
? SyS_exit_group+0x10/0x10
ret_from_fork+0x35/0x40
---[ end trace c8ed62ff6a525901 ]---
BTRFS: error (device dm-2) in
btrfs_finish_ordered_io:3042: errno=-95 unknown
BTRFS info (device dm-2): forced readonly
BTRFS error (device dm-2): pending csums is 6447104
[Cause]
The call trace and the unique return value point to
__btrfs_drop_extents(): when we try to drop pages of an inline extent,
we trigger this -EOPNOTSUPP.
The kernel limits the size of an inline file extent (sector size for
ram size and sector size - 1 for on-disk size), but btrfs-convert
doesn't enforce the same limit, resulting in much larger inline file
extents.
The lack of a correct inline extent size check dates back to 2008, when
btrfs-convert was added to btrfs-progs.
[Fix]
Fix the inline extent creation condition, not only using
BTRFS_MAX_INLINE_DATA_SIZE(), which is only the maximum size of inline
data according to nodesize, but also limit it against sector size.
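The combined condition might look like the sketch below. toy_can_inline()
is a hypothetical helper; max_inline stands in for the value of
BTRFS_MAX_INLINE_DATA_SIZE(), and the strict comparison against the sector
size follows the "sector size - 1 for on-disk size" limit quoted above. It
is an illustration of the rule, not the code from convert.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide whether uncompressed file data of the given size may be stored
 * inline. max_inline is the nodesize-derived limit; the sector-size
 * bound is the additional check described above.
 */
static bool toy_can_inline(uint64_t size, uint64_t max_inline,
			   uint32_t sectorsize)
{
	return size <= max_inline && size < sectorsize;
}
```

With a large nodesize the first limit alone can exceed the sector size,
which is why the second comparison is needed.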
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When debugging with "btrfs inspect dump-tree", it's not that handy if we
want to iterate all child tree blocks starting from a specified block.
-b can only print a single block, while without -b "btrfs inspect dump-tree"
needs the extra tree roots to be valid in order to continue, which is
not possible for a damaged filesystem.
Add a new option --follow to iterate a sub-tree starting from block
specified by --block.
Signed-off-by: Qu Wenruo <wqu@suse.com>
[ remove the short option for now ]
Signed-off-by: David Sterba <dsterba@suse.com>
This patch enhances the handling of tree block level mismatches in the
following ways:
1) Merge same warning branches into one
   We had two branches showing the same message, and their conditions
   were also the same. Merge them.
2) Only skip bad slot
   The old code skipped all the remaining slots; here we just skip one
   slot to output as many correct tree blocks as possible.
3) Enhance warning message
   Output the parent bytenr and the expected and wrong levels, so we
   don't need to refer to stdout to find out which tree block is
   corrupted.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Commit 8c36786c81 ("btrfs-progs: print-tree: Print offset as
tree objectid for ROOT_ITEM") changed how we translate the offset of
ROOT_ITEM.
However, even for ROOT_ITEM the offset has different meanings.
For a tree reloc tree, it's indeed the subvolume id. But for other
trees, it's the transid of when the tree was created.
Reported-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
In the latest e2fsprogs (1.44.0), the definition of ext2_ext_attr_entry
no longer has the member e_value_block, as ext* currently doesn't
support setting it anyway.
So remove the check so that we can compile again.
Issue: #110
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199071
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Verify that a filesystem check operation (fsck) does not report the
following scenario as an error:
An extent is shared between two inodes, as a result of clone/reflink
operation, and for one of the inodes, let's call it inode A, the extent is
referenced through a file extent item as a prealloc extent, while for the
other inode, call it inode B, the extent is referenced through a regular
file extent item, that is, it was written to. The goal of this test is to
make sure a filesystem check operation will not report "odd csum items"
errors for the prealloc extent at inode A, because this scenario is valid
since the extent was written through inode B and therefore it is expected
to have checksum items in the filesystem's checksum btree for that shared
extent.
Such scenario can be created with the following steps for example:
mkfs.btrfs -f /dev/sdb
mount /dev/sdb /mnt
touch /mnt/foo
xfs_io -c "falloc 0 256K" /mnt/foo
sync
xfs_io -c "pwrite -S 0xab 0 256K" /mnt/foo
touch /mnt/bar
xfs_io -c "reflink /mnt/foo 0 0 256K" /mnt/bar
xfs_io -c "fsync" /mnt/bar
<power fail>
mount /dev/sdb /mnt
umount /mnt
This scenario is fixed by the following patch for the filesystem checker:
"Btrfs-progs: check, fix false error reports for shared prealloc extents"
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Under some cases the filesystem checker reports an error when it finds
checksum items for an extent that is referenced by an inode as a prealloc
extent. Such cases are not an error when the extent is actually shared
(was cloned/reflinked) with other inodes and was written through one of
those other inodes.
Example:
$ mkfs.btrfs -f /dev/sdb
$ mount /dev/sdb /mnt
$ touch /mnt/foo
$ xfs_io -c "falloc 0 256K" /mnt/foo
$ sync
$ xfs_io -c "pwrite -S 0xab 0 256K" /mnt/foo
$ touch /mnt/bar
$ xfs_io -c "reflink /mnt/foo 0 0 256K" /mnt/bar
$ xfs_io -c "fsync" /mnt/bar
<power fail>
$ mount /dev/sdb /mnt
$ umount /mnt
$ btrfs check /dev/sdb
Checking filesystem on /dev/sdb
UUID: 52d3006e-ee3b-40eb-aa21-e56253a03d39
checking extents
checking free space cache
checking fs roots
root 5 inode 257 errors 800, odd csum item
ERROR: errors found in fs roots
found 688128 bytes used, error(s) found
total csum bytes: 256
total tree bytes: 163840
total fs tree bytes: 65536
total extent tree bytes: 16384
btree space waste bytes: 138819
file data blocks allocated: 10747904
referenced 10747904
$ echo $?
1
So teach check to not report such cases as errors: check whether the
extent is shared with other inodes and, if so, treat the existence of
checksum items as an error only if all those other inodes reference the
extent as a prealloc extent.
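The decision rule can be sketched over a list of references to one
extent. This is a simplified illustration: enum toy_extent_type and
toy_csum_is_odd() are hypothetical, and how check actually collects the
references to a shared extent is not shown.

```c
#include <stdbool.h>
#include <stddef.h>

/* How one inode references the extent. */
enum toy_extent_type {
	TOY_EXTENT_REG,		/* regular, written extent */
	TOY_EXTENT_PREALLOC	/* preallocated, never written via this ref */
};

/*
 * Checksum items for an extent are only unexpected when every reference
 * to it is a prealloc reference. nr_refs is assumed to be at least 1.
 */
static bool toy_csum_is_odd(const enum toy_extent_type *refs,
			    size_t nr_refs)
{
	for (size_t i = 0; i < nr_refs; i++) {
		if (refs[i] != TOY_EXTENT_PREALLOC)
			return false;
	}
	return true;
}
```

In the example above, inode 257 references the extent as prealloc while
the other inode references it as a regular extent, so no error would be
reported under this rule.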
This case can be hit often when running the generic/475 testcase from
fstests.
A test case will follow in a separate patch.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Strangely, we have level check in btrfs_print_tree() while we don't have
the same check in read_node_slot().
That's to say, for the following corruption, btrfs_search_slot() or
btrfs_next_leaf() can return invalid leaf:
Parent eb:
node XXXXXX level 1
^^^^^^^
Child should be leaf (level 0)
...
key (XXX XXX XXX) block YYYYYY
Child eb:
leaf YYYYYY level 1
^^^^^^^
Something went wrong now
A later caller that receives such a corrupted leaf can then easily
misbehave.
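The missing check amounts to comparing the child's level with the
parent's. A minimal sketch, assuming a hypothetical helper
toy_check_child_level() that works on plain level numbers rather than on
extent buffers; the choice of error codes is also an assumption:

```c
#include <errno.h>

/*
 * Verify that a child block read through a parent slot has the level
 * the parent implies. Leaves are level 0, so a child must be exactly
 * one level below its parent.
 */
static int toy_check_child_level(int parent_level, int child_level)
{
	if (parent_level <= 0)
		return -EINVAL;	/* a leaf has no children */
	if (child_level != parent_level - 1)
		return -EIO;	/* level mismatch, treat as corruption */
	return 0;
}
```

For the corruption shown above (parent level 1, child claiming level 1)
the helper returns -EIO instead of handing the block to the caller.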
Reported-by: Ralph Gauges <ralphgauges@googlemail.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Commit cebf3b3722 ("btrfs-progs: introduce TEST_TOP and
INTERNAL_BIN for tests") did not convert all test paths. This would
break the exported testsuite.
Signed-off-by: David Sterba <dsterba@suse.com>