If a file lost all its extents, fsck will unable to print out the hole.
Add an extra check to print out the hole range.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
- limits.h must be included to pick up PATH_MAX.
- remove double declaration of BTRFS_DISABLE_BACKTRACE
kerncompat.h assumed that if __GLIBC__ was not defined,
it could safely define BTRFS_DISABLE_BACKTRACE, however this can be
defined by the configure script. Added a check to ensure it is not
defined first.
Signed-off-by: Brendan Heading <brendanheading@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
In parse_profile() function, in error handling route, it output error
message but forgot to exit(1), causing even profile is not valid, it
will just fallback to single.
Reported-by: James Harvey <jamespharvey20@gmail.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Previous patch detecs inconsistency and unconditionally triggers quota
rescan. This may not be always desired as it's a heavy metadata
operation. In case of batch assignments it's better to trigger the
rescan at the end.
Signed-off-by: David Sterba <dsterba@suse.com>
NOTE: This patch needs to cooperate with kernel patches, which will fix
a kernel bug that never clear INCONSISTENT bit and return 1 if quota
assign makes qgroup data inconsistent.
Some qgroup assign will cause qgroup data inconsistent, like remove a
qgroup with shared extents from a parent qgroup. But some won't, like
assign a empty(OK, nodesize rfer and exel) to a qgroup.
Newer kernel will return 1 if qgroup data inconsistent and in that case
we should schedule a quota rescan.
This patch will do this in btrfs-progs.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Current code of btrfs-convert have a bug of thread conflict, which caused
invalid memory accessing between threads, and make program panic.
This patch add a test item for above bug, as:
# ./misc-tests.sh
[TEST] 001-btrfstune-features
[TEST] 002-uuid-rewrite
[TEST] 003-zero-log
[TEST] 004-convert-thread-conflict
failed: btrfs-convert /root/btrfsprogs/tests/test.img
test failed for case 004-convert-thread-conflict
#
# cat misc-tests-results.txt
...
############### btrfs-convert /root/btrfsprogs/tests/test.img
trans 7 running 5
ctree.c:363: btrfs_cow_block: Assertion `1` failed.
btrfs-convert(btrfs_cow_block+0x92)[0x40acaf]
btrfs-convert(btrfs_search_slot+0x1cb)[0x40c50f]
btrfs-convert(btrfs_csum_file_block+0x20f)[0x41d83a]
btrfs-convert[0x43422d]
btrfs-convert[0x4342cd]
btrfs-convert[0x4345ca]
btrfs-convert[0x434767]
btrfs-convert[0x435770]
btrfs-convert[0x439748]
btrfs-convert(main+0x13f8)[0x43b09d]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x335e01ecdd]
btrfs-convert[0x407649]
create btrfs filesystem:
blocksize: 4096
nodesize: 16384
features: extref, skinny-metadata (default)
creating btrfs metadata.
creating ext2fs image file.
failed: btrfs-convert /root/btrfsprogs/tests/test.img
test failed for case 004-convert-thread-conflict
#
Note that this bug is not happened every time, especilly in slow
device as loop(slow cpu with fast block device is likely to trigger).
I set loop count to 20 to make bug happened in 90% tests.
Suggested-by: David Sterba <dsterba@suse.com>
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This code block is used several tests, move it to ./common and add a
helper.
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
In current code, (info->periodic.timer_fd == 0) means it is not
valid, as:
if (info->periodic.timer_fd) {
close(info->periodic.timer_fd);
...
We need set its value from -1 to 0 in create-fail case, to make
code like above works.
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
task_period_stop() is used to close a timer explicitly, to avoid
the timer handle closed again by task_stop(), we should reset its
value after close.
Also add value-reset for info->id for safe.
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
No need to close timer handle afain in subthread-close-callback,
it is closed by task_stop() automatically.
This patch also add a fflush() to make log output on time.
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The timer handle have possibility in using by sub thread,
better to close it after sub process exit.
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
btrfs-convert sometimes show 'Assertion failed' in converting a nearly blank
file system, as:
create btrfs filesystem:
blocksize: 4096
nodesize: 16384
features: extref, skinny-metadata (default)
creating btrfs metadata.
creating ext2fs image file.
trans 7 running 5
ctree.c:363: btrfs_cow_block: Assertion `1` failed.
btrfs-convert(btrfs_cow_block+0x92)[0x40acaf]
btrfs-convert(btrfs_search_slot+0x1cb)[0x40c50f]
btrfs-convert(btrfs_csum_file_block+0x20f)[0x41d83a]
btrfs-convert[0x43422d]
btrfs-convert[0x4342cd]
btrfs-convert[0x4345ca]
btrfs-convert[0x434767]
btrfs-convert[0x435770]
btrfs-convert[0x439748]
btrfs-convert(main+0x13f8)[0x43b09d]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x335e01ecdd]
btrfs-convert[0x407649]
Reason is complex:
1: main thread allocated a block of memory,
shared with sub thread
2: main thread killed sub thread, and free above memory
3: main thread malloc a new one(in same address),
and use it
4: sub thread(which is not really quit), write into
this address, and caused this bug.
By adding some debug lines into code, we can see following output:
create btrfs filesystem:
blocksize: 4096
nodesize: 16384
features: extref, skinny-metadata (default)
creating btrfs metadata.
1: ctx(0x7ffe1abde230)->info=0xc65b80
2: task_period_start: will create periodic.timer_fd
3: task_stop: info->periodic.timer_fd = NULL
4: task_stop: begin pthread_cancel info->id=-1746053376
5: task_stop: done pthread_cancel ret=0
6: task_stop: begin info->postfn
7: task_period_stop: periodic.timer_fd NULL
8: task_stop: done info->postfn
9: task_stop: done all
10: creating ext2fs image file.
trans 7 running 5
11: task_period_start: create periodic.timer_fd done info->periodic.timer_fd(0xc65b80)=7
12: btrfs_cow_block: root->fs_info->generation(0xc63568) = 5 trans->transid(0xc65b80)=7
13: ctree.c:368: btrfs_cow_block: Assertion `1` failed.
./btrfs-convert(btrfs_cow_block+0xda)[0x40ad37]
./btrfs-convert(btrfs_search_slot+0x1cb)[0x40c5b4]
./btrfs-convert(btrfs_insert_empty_items+0xac)[0x40d9f6]
./btrfs-convert(btrfs_record_file_extent+0xc0)[0x4183fe]
./btrfs-convert[0x435796]
./btrfs-convert[0x439b0c]
./btrfs-convert(main+0x13f8)[0x43b45d]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x335e01ecdd]
./btrfs-convert[0x407689]
Conclusion:
a: subthread should exit before step 5, but it is still running
in step 11
b: task_stop() hadn't close periodic.timer_fd in step3,
because periodic.timer_fd is not initialized yet.
c. address of 0xc65b80 is overwrited by subthread in step 11,
but this address is freed and re-malloc by main thread
before step 10, and used for trans->transid.
d: trans->transid which is overwrite by subthread caused error
in step 13.
Fix:
pthread_cancel() only send a cancellation request to the thread,
thread will quit in next cancellation point by default.
To make sub thread quit in time, this patch add pthread_join()
after pthread_cancel() call.
And to make pthread_join() works, pthread_detach() is removed.
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
To avoid following mount error in test:
mount: /root/btrfs/progs/tests/fsck-tests/012-leaf-corruption/test.img
is not a block device (maybe try `-o loop'?)
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
As convert implement its own alloc extent, avoid such metadata problem
too.
Reported-by: Chris Murphy <lists@colorremedies.com>
Reported-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Now find_free_extent() function won't return a metadata extent that
crosses stripe boundary.
Reported-by: Chris Murphy <lists@colorremedies.com>
Reported-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Kernel btrfs_map_block() function has a limitation that it can only
map BTRFS_STRIPE_LEN size.
That will cause scrub fails to scrub tree block which crosses strip
boundary, causing BUG_ON().
Normally, it's OK as metadata is always in metadata chunk and
BTRFS_STRIPE_LEN can always be divided by node/leaf size.
So without mixed block group, tree block won't cross stripe boundary.
But for mixed block group, especially for btrfs converted from ext4,
it's almost sure one or more tree blocks are not aligned with node size
and cross stripe boundary.
Causing bug with kernel scrub.
This patch will report the problem, although we don't have a good idea
how to fix it in user space until we add the ability to relocate tree
block in user space.
Also, kernel code should also be checked for such tree block alloc
problems.
Reported-by: Chris Murphy <lists@colorremedies.com>
Reported-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Although it is fixed to BTRFS_STRIPE_LEN(64K) now, it's still used in a
lot of code, just output it for user who wants to trace the source of
stripe_len in btrfs_map_bio() code.
Reported-by: Chris Murphy <lists@colorremedies.com>
Reported-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
btrfs progs output following error message when doing resize on
no-enouth-free-space case:
# btrfs filesystem resize +10g /mnt/btrfs_5gb
Resize '/mnt/btrfs_5gb' of '+10g'
ERROR: unable to resize '/mnt/btrfs_5gb' - File too large
#
It is not a good description for users, and this patch changed it to:
# ./btrfs filesystem resize +10G /mnt/tmp1
Resize '/mnt/tmp1' of '+10G'
ERROR: unable to resize '/mnt/tmp1' - no enouth free space
#
Reported-by: Taeha Kim <kthguru@gmail.com>
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
A leftover from when recursive defrag was added.
Signed-off-by: Patrik Lundquist <patrik.lundquist@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Commit dedb1ebeee broke commit
96cfbbf0ea.
Casting thresh value greater than (u32)-1 simply truncates bits while
desired value is (u32)-1 for max defrag threshold.
I.e. "btrfs fi defrag -t 4g" is trimmed/truncated to 0
and "-t 5g" to 1073741824.
Also added a missing newline.
Signed-off-by: Patrik Lundquist <patrik.lundquist@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
To run a given test set the variable TEST like
$ make test TEST=002-bad-transid
$ make test TEST=002-*
and only tests matching the value will be run. The pattern is glob and
pased to 'find -name'.
The convert tests do not follow the fsck and misc layout and are skipped
if TEST is set.
Signed-off-by: David Sterba <dsterba@suse.com>
Previously in 'filesystem resize get_min_size', now
'inspect-internal min-dev-size'. We'd like to avoid cluttering the
'resize' syntax further.
The test has been updated to exercise the new option.
Signed-off-by: David Sterba <dsterba@suse.com>
Currently there is not way for a user to know what is the minimum size a
device of a btrfs filesystem can be resized to. Sometimes the value of
total allocated space (sum of all allocated chunks/device extents), which
can be parsed from 'btrfs filesystem show' and 'btrfs filesystem usage',
works as the minimum size, but sometimes it does not, namely when device
extents have to relocated to holes (unallocated space) within the new
size of the device (the total allocated space sum).
This change adds the ability to reliably compute such minimum value and
extents 'btrfs filesystem resize' with the following syntax to get such
value:
btrfs filesystem resize [devid:]get_min_size
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
# mkfs.btrfs /dev/sdb /dev/sdd -m raid0 -d raid0
# mount /dev/sdb /mnt/btrfs
# btrfs balance start /mnt/btrfs
# btrfs fi df /mnt/btrfs
Data, single: total=1.00GiB, used=320.00KiB
System, single: total=32.00MiB, used=16.00KiB
Metadata, RAID0: total=256.00MiB, used=112.00KiB
GlobalReserve, single: total=16.00MiB, used=0.00B
Only metadata stay RAID0. Data and system goes from RAID0 to single.
[REASON]
The problem is caused by the temporary single chunk.
In mkfs, it will always create single data/metadata/sys chunk and them
add device into the temporary btrfs.
When doing all chunk balance, for data and syschunk, they are almost
empty, so balance will move them into the single chunk and remove the
old RAID0 chunk.
For metadata, it has more data and will kick the metadata chunk pre
alloc, so new RAID0 chunk is allocated and the old metadata is move
there. Old RAID0 and single chunks are removed.
[FIX]
Now we add a new function to cleanup the temporary chunks at the end of
mkfs routine.
It will cleanup the chunks which is empty and its profile differs from
the mkfs profile.
So in balance, btrfs will always alloc a new chunk to keep the profile,
other than moving data into the single chunk.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Man manual need to be updated since RAID5/6 has been supported
by btrfs-replace.
Signed-off-by: Wang Yanfeng <wangyf-fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This reverts commit 5f8232e5c8.
This commit causes a regression:
$ mkfs.btrfs -f /dev/sda6
$ btrfsck /dev/sda6
Checking filesystem on /dev/sda6
UUID: 2ebb483c-1986-4610-802a-c6f3e6ab4b76
checking extents
Chunk[256, 228, 0]: length(4194304), offset(0), type(2) mismatch with
block group[0, 192, 4194304]: offset(4194304), objectid(0), flags(34)
Chunk[256, 228, 4194304]: length(8388608), offset(4194304), type(4)
mismatch with block group[4194304, 192, 8388608]: offset(8388608),
objectid(4194304), flags(36)
Block group[0, 4194304] (flags = 34) didn't find the relative chunk.
Block group[4194304, 8388608] (flags = 36) didn't find the relative
chunk.
......
The commit has the following bug causing the problem.
1) Typo forgets to add meta/data_profile for alloc_chunk.
Only meta/data_profile is added to allocate a block group, but not
chunk.
2) Type for the first system chunk is impossible to modify yet.
The type for the first chunk and its stripe is hard coded into
make_btrfs() function.
So even we try to modify the type of the block group, we are unable to
change the type of the first chunk.
Causing the chunk type mismatch problem.
The 1st bug can be fixed quite easily but the second is not.
The good news is, the last patch "btrfs-progs: mkfs: Cleanup temporary
chunk to avoid strange balance behavior." from my patchset can handle it
quite well alone.
So just revert the patch.
New bug fix for btrfsck(err is 0 even chunk/extent tree is corrupted) and
new test cases for mkfs will follow soon.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This function will be used to free a empty chunk.
This provides the basis for later temp chunk cleanup.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Introduce two functions, free_space_info and free_block_group_cache.
The former will free the space of a empty block group.
The latter will free the in memory block group cache along with its
space in space_info and device space.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Introduce two functions, free_chunk_item and free_system_chunk_item.
First one will free chunk item in chunk tree.
The latter one will free a system chunk in super block.
They are used for later chunk/block group free function.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Introduce two functions, free_dev_extent_item and
free_chunk_dev_extent_items, to free dev extent items in a chunk.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This function is used to free a block group item. It must be called
with all the space in the block group pinned. Or there is a possibility
that tree blocks be allocated into the range.
The function is used for later block group/chunk free function.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
As chunk tree is only stored in super block, chunk tree commit doesn't
need to go through tree root update.
Or a BUG_ON will be triggered.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Add a new test case for I_ERR_FILE_WRONG_NBYTES.
The new btrfs-image dump image contains a file in 12K size.
But nbytes in its inode item is a random number.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Some unknown kernel bug makes inode nbytes modification out of sync with
file extent update.
But it's quite easy to fix in btrfs-progs anyway.
So just fix it by adding a new function repair_inode_nbytes by using the
found_size in inode_record.
Reported-by: Christian <cdysthe@gmail.com>
Reported-by: Chris Murphy <lists@colorremedies.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The filesystem creation has to solve some chicken-egg problems and
creates some temporary objects. In our case it's an extra single/single
pair of block groups that's not used unless the user asks that
explicitly.
Example:
Data, single: total=8.00MiB, used=64.00KiB
System, DUP: total=8.00MiB, used=16.00KiB
System, single: total=4.00MiB, used=0.00B
Metadata, DUP: total=153.56MiB, used=112.00KiB
Metadata, single: total=8.00MiB, used=0.00B
GlobalReserve, single: total=16.00MiB, used=0.00B
Even with a single device filesystem and defaults, there's single
block group for metadata and system. The single device case is easy to
fix, we'll simply create the right type from the beginning.
Example:
Data, single: total=8.00MiB, used=64.00KiB
System, DUP: total=4.00MiB, used=16.00KiB
Metadata, DUP: total=136.00MiB, used=112.00KiB
GlobalReserve, single: total=16.00MiB, used=0.00B
Filesystem on top of multiple devices still leaves the single/single
groups behind.
Signed-off-by: David Sterba <dsterba@suse.com>