We reuse the task_position enum and task_ctx struct of the original progress
indicator, adding more values and fields for our needs.
Then add hooks in all steps of the check to properly record progress.
Here's how the output looks like on a 22 Tb 5-disk RAID1 FS:
Opening filesystem to check...
Checking filesystem on /dev/mapper/luks-ST10000VN0004-XXXXXXXX
UUID: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
[1/7] checking extents (0:20:21 elapsed, 950958 items checked)
[2/7] checking root items (0:01:29 elapsed, 15121 items checked)
[3/7] checking free space cache (0:00:11 elapsed, 4928 items checked)
[4/7] checking fs roots (0:51:31 elapsed, 600892 items checked)
[5/7] checking csums (0:14:35 elapsed, 754522 items checked)
[6/7] checking root refs (0:00:00 elapsed, 232 items checked)
[7/7] checking quota groups skipped (not enabled on this FS)
found 5286458060800 bytes used, no error found
Signed-off-by: Stéphane Lesimple <stephane_btrfs@lesimple.fr>
Signed-off-by: David Sterba <dsterba@suse.com>
This is a single-line fix on the preexisting task_period_start function.
Signed-off-by: Stéphane Lesimple <stephane_btrfs@lesimple.fr>
Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
periodic.timer_fd's value is 0 on inititlize-failed case,
if no value-checking before read(), the code will run as
read(STDIN).
This patch fixed above case.
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
In current code, (info->periodic.timer_fd == 0) means it is not
valid, as:
if (info->periodic.timer_fd) {
close(info->periodic.timer_fd);
...
We need set its value from -1 to 0 in create-fail case, to make
code like above works.
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
task_period_stop() is used to close a timer explicitly, to avoid
the timer handle closed again by task_stop(), we should reset its
value after close.
Also add value-reset for info->id for safe.
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The timer handle have possibility in using by sub thread,
better to close it after sub process exit.
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
btrfs-convert sometimes show 'Assertion failed' in converting a nearly blank
file system, as:
create btrfs filesystem:
blocksize: 4096
nodesize: 16384
features: extref, skinny-metadata (default)
creating btrfs metadata.
creating ext2fs image file.
trans 7 running 5
ctree.c:363: btrfs_cow_block: Assertion `1` failed.
btrfs-convert(btrfs_cow_block+0x92)[0x40acaf]
btrfs-convert(btrfs_search_slot+0x1cb)[0x40c50f]
btrfs-convert(btrfs_csum_file_block+0x20f)[0x41d83a]
btrfs-convert[0x43422d]
btrfs-convert[0x4342cd]
btrfs-convert[0x4345ca]
btrfs-convert[0x434767]
btrfs-convert[0x435770]
btrfs-convert[0x439748]
btrfs-convert(main+0x13f8)[0x43b09d]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x335e01ecdd]
btrfs-convert[0x407649]
Reason is complex:
1: main thread allocated a block of memory,
shared with sub thread
2: main thread killed sub thread, and free above memory
3: main thread malloc a new one(in same address),
and use it
4: sub thread(which is not really quit), write into
this address, and caused this bug.
By adding some debug lines into code, we can see following output:
create btrfs filesystem:
blocksize: 4096
nodesize: 16384
features: extref, skinny-metadata (default)
creating btrfs metadata.
1: ctx(0x7ffe1abde230)->info=0xc65b80
2: task_period_start: will create periodic.timer_fd
3: task_stop: info->periodic.timer_fd = NULL
4: task_stop: begin pthread_cancel info->id=-1746053376
5: task_stop: done pthread_cancel ret=0
6: task_stop: begin info->postfn
7: task_period_stop: periodic.timer_fd NULL
8: task_stop: done info->postfn
9: task_stop: done all
10: creating ext2fs image file.
trans 7 running 5
11: task_period_start: create periodic.timer_fd done info->periodic.timer_fd(0xc65b80)=7
12: btrfs_cow_block: root->fs_info->generation(0xc63568) = 5 trans->transid(0xc65b80)=7
13: ctree.c:368: btrfs_cow_block: Assertion `1` failed.
./btrfs-convert(btrfs_cow_block+0xda)[0x40ad37]
./btrfs-convert(btrfs_search_slot+0x1cb)[0x40c5b4]
./btrfs-convert(btrfs_insert_empty_items+0xac)[0x40d9f6]
./btrfs-convert(btrfs_record_file_extent+0xc0)[0x4183fe]
./btrfs-convert[0x435796]
./btrfs-convert[0x439b0c]
./btrfs-convert(main+0x13f8)[0x43b45d]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x335e01ecdd]
./btrfs-convert[0x407689]
Conclusion:
a: subthread should exit before step 5, but it is still running
in step 11
b: task_stop() hadn't close periodic.timer_fd in step3,
because periodic.timer_fd is not initialized yet.
c. address of 0xc65b80 is overwrited by subthread in step 11,
but this address is freed and re-malloc by main thread
before step 10, and used for trans->transid.
d: trans->transid which is overwrite by subthread caused error
in step 13.
Fix:
pthread_cancel() only send a cancellation request to the thread,
thread will quit in next cancellation point by default.
To make sub thread quit in time, this patch add pthread_join()
after pthread_cancel() call.
And to make pthread_join() works, pthread_detach() is removed.
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Support for monitoring progress of running tasks, based on timerfd and
pthreads.
Signed-off-by: Silvio Fricke <silvio.fricke@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.cz>