Large numbers like (1024 * 1024 * 1024) may cost reader/reviewer to
waste one second to convert to 1G.
Introduce kernel include/linux/sizes.h to replace any intermediate
number larger than 4096 (not including 4096) to SZ_*.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Commands that do not take any options do not use getopt, which means the
standard option separator "--" does not work. Update all command
handlers that need it, argv needs to be referenced using the optind that
is correctly pointed after the separator.
Signed-off-by: David Sterba <dsterba@suse.com>
Use btrfs_open_dir() in open_path_or_dev_mnt() to make the function
return error when target is neither block device nor btrfs mount point.
Also add "verbose" argument to let function output common error
message instead of putting duplicated lines in caller.
Before patch:
# ./btrfs device stats /mnt/tmp1
ERROR: getting dev info for devstats failed: Inappropriate ioctl for device
# ./btrfs replace start /dev/vdd /dev/vde /mnt/tmp1
ERROR: ioctl(DEV_REPLACE_STATUS) failed on "/mnt/tmp1": Inappropriate ioctl for device
After patch:
# ./btrfs device stats /mnt/tmp1
ERROR: not a btrfs filesystem: /mnt/tmp1
# ./btrfs replace start /dev/vdd /dev/vde /mnt/tmp1
ERROR: not a btrfs filesystem: /mnt/tmp1
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Use common warning/error functions in cmds-scrub.c, it can make
message format unified and make code simple.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
[removed ending newlines from messages]
Signed-off-by: David Sterba <dsterba@suse.com>
Scrub output prints the following error message in my test:
ERROR: scrubbing /var/ltf/tester/scratch_mnt failed for device id 5 (Success)
It is caused by a broken kernel and fs, but the we need to avoid
printing both "error and success" on one line as above.
This patch modified above message to:
ERROR: scrubbing /var/ltf/tester/scratch_mnt failed for device id 5: ret=1, errno=0 (Success)
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
[minor updates in formatting]
Signed-off-by: David Sterba <dsterba@suse.com>
Anthony Plack <anthony@plack.net> reported a output bug in maillist:
title: btrfs-progs SCRUB reporting aborted but still running - minor
btrfs scrub status report it was aborted but still runs to completion.
# btrfs scrub status /mnt/data
scrub status for f591ac13-1a69-476d-bd30-346f87a491da
scrub started at Mon Apr 27 06:48:44 2015 and was aborted after 1089 seconds
total bytes scrubbed: 1.02TiB with 0 errors
#
# btrfs scrub status /mnt/data
scrub status for f591ac13-1a69-476d-bd30-346f87a491da
scrub started at Mon Apr 27 06:48:44 2015 and was aborted after 1664 seconds
total bytes scrubbed: 1.53TiB with 0 errors
#
...
Reason:
When scrub multi-device simultaneously, if some device canceled,
and some device is still running, cancel state have higher priority to
be outputed in global report.
So we can see "scrub aborted" in status line, with running-time keeps
increased.
Fix:
We can increase running state's priority in output, if there is
some device in scrub state, we output running state instead of
cancelled state.
Reported-by: Anthony Plack <anthony@plack.net>
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This patch includes below fixes in error path:
1. fix memory leaks if realloc() fails
2. add missing call free_history() before return error in scrub_read_file()
Signed-off-by: Byongho Lee <bhlee.kernel@gmail.com>
Reviewed-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
- limits.h must be included to pick up PATH_MAX.
- remove double declaration of BTRFS_DISABLE_BACKTRACE
kerncompat.h assumed that if __GLIBC__ was not defined,
it could safely define BTRFS_DISABLE_BACKTRACE, however this can be
defined by the configure script. Added a check to ensure it is not
defined first.
Signed-off-by: Brendan Heading <brendanheading@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The path bufferes should be PATH_MAX but BTRFS_PATH_NAME_MAX is shorter
due to embedding in 4k aligned structures.
The only reason to use BTRFS_PATH_NAME_MAX is for the respective
structures btrfs_ioctl_vol_args::name.
Signed-off-by: David Sterba <dsterba@suse.cz>
scrub status for d4dc0da9-e8cc-4bfe-9b6f-2dcf8e0754f5
scrub started at Sat Jan 1 00:00:01 UTC 2000 and finished after 00:43:05
total bytes scrubbed: 111.17GiB with 0 errors
Signed-off-by: David Sterba <dsterba@suse.cz>
Scrub on multiple devices may report wrong status if scrub finishes
early on one of them.
Reported-by: Sandy McArthur Jr <sandymac@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
If scrub is not cancelled nor finished, the recorded status will prevent
scrub to start again though it's not running. There's a force option to
run it anyway, but this is just a bandaid and the true status of scrub
should be detected automatically. The force option should not be
necessary anymore.
The test introduced in 9681f82853 checks only the status file,
not kernel status of scrub.
Signed-off-by: David Sterba <dsterba@suse.cz>
- Add missing description about "-R" option in the command
usage of "btrfs scrub resume".
- Add missing comma to avoid the following misformatted command
usage of "btrfs scrub start". See the line of "-R" option.
===
usage: btrfs scrub start [-BdqrRf] [-c ioprio_class -n ioprio_classdata] <path>|<device>
Start a new scrub
-B do not background
-d stats per device (-B only)
-q be quiet
-r read only mode
-R raw print mode, print full data instead of summary-c set ioprio class (see ionice(1) manpage)
-n set ioprio classdata (see ionice(1) manpage)
-f force to skip checking whether scrub has started/resumed in userspace
this is useful when scrub stats record file is damaged
===
Signed-off-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
The raw number 36 for the uuid string length is somewhat confusing,
use a macro to define replace it.
Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
[Use BTRFS_UUID_UNPARSED_SIZE]
Signed-off-by: David Sterba <dsterba@suse.cz>
The -f option of scrub means to
"force starting new scrub even if a scrub is already running"
*not*
"force to check whether scrub has started or resumed in userspace"
as described originally.
So replace the orignal description in the manpage and code.
Also, add description of the potential failure as follows
"If a scrub is already running, it fails."
Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
Cc: David Sterba <dsterba@suse.cz>
Signed-off-by: David Sterba <dsterba@suse.cz>
The scrub_read_file function is always on a branch,
which has (fd >= 0), so there is not need to judgment
the pasted in arg.
Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
o Return 0 to indicate success,
when detected errors were corrected during scrubbing.
P.s. This is also to facilitate scripting when return value
is to be checked.
o Warn the users if there are uncorrectable errors detected.
Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
open_path_or_dev_mnt() is used to on *mounted* btrfs device or mount
point, when a unmounted btrfs device is passed, errno is set to EINVAL to
info the caller.
If ignore the errno and just print "ERROR: can't access '%s'", end users
will get confused.
This patch will add check for open_path_or_dev_mnt() caller and print
more meaningful error message when a unmounted btrfs device path is
given.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
scrub_progress_cycle thread runs in asynchronous type but locks mutex
while reading shared data. This patch disables cancelability for a
brief time while locks are on so as to make sure they are unlocked
before thread is canceled.
scrub_write_progress gets called from scrub_progress_cycle in
asynchronous thread but cancelability is disabled after mutex is
locked. This patch moves the call to set cancelability type before
mutex lock and makes corresponding changes to labels for error
handling.
Signed-off-by: Rakesh Pandit <rakesh@tuxera.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Threads always use default attributes in all tools, so pthread
attribute objects and their initializations are of no use. Just pass
NULL as attr attribute to pthread_create for default attributes.
Signed-off-by: Rakesh Pandit <rakesh@tuxera.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
If pthread_mutex_lock fails (rare but fix it anyway), don't call
pthread_mutex_unlock on mutex.
Rationale being that if pthread_mutex_lock fails pthread_mutex_unlock
will always fail and overwrite actual error value in err.
Signed-off-by: Rakesh Pandit <rakesh@tuxera.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
Remove the extraneous `to' from `Can't access to X'.
Signed-off-by: Mitchel Humpherys <mitch.special@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
I sometimes get segfault in cmd_scrub_status(), this is because
free_history() forgot to check whether pointer address is valid,fix it.
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
I hit a problem that i can not start scrub when i am trying to track
superblock generation mismatch problems.
The fact is that we are trying to check whether we have started a scrub operation
in userspace, this will make us can't start scrub if that record file is damaged
itself. By adding a option to skip that check, everything will be fine.
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
we use 37 as the allocation size to hold the uuid_unparse, here
it defines BTRFS_UUID_UNPARSE_SIZE for the same.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
There will be four kinds of return value for command "scrub start":
0: scrub dosen't find errors and return success.
1: usage or syntax errors.
3: scrub finds errors and correct all of them.
4: scrub finds errors and some of them are not correctable.
Three kinds of return values for scrub cancel/resume:
0: cancel successfully.
1: usage or syntax errors.
2: cancel a not started or finished scrub.
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
These were mostly in option structs but there were a few gross string
pointer arguments given as 0.
Signed-off-by: Zach Brown <zab@redhat.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Mark many functions as static, and remove any resulting dead code.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Update the usage strings of some cmds to keep the them consistent with
the source.
Also some minor changes are done to fit the man page syntax.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
valgrind complains open_file_or_dir() causes a memory leak.That is because
if we open a directoy by opendir(), and then we should call closedir()
to free memory.
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
We don't need callers to manage string storage for each pretty_sizes()
call. We can use a macro to have per-thread and per-call static storage
so that pretty_sizes() can be used as many times as needed in printf()
arguments without requiring a bunch of supporting variables.
This lets us have a natural interface at the cost of requiring __thread
and TLS from gcc and a small amount of static storage. This seems
better than the current code or doing something with illegible format
specifier macros.
Signed-off-by: Zach Brown <zab@redhat.com>
Acked-by: Wang Shilong <wangs.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Check whether any involved device is already busy running a
scrub. This would cause damaged status messages and the state
"aborted" without the explanation that a scrub was already
running. Therefore check it first, prevent it and give some
feedback to the user if scrub is already running.
Note that if scrub is started with a block device as the
parameter, only that particular block device is checked. It
is a normal mode of operation to start scrub on multiple
single devices, there is no reason to prevent this.
Here is an example:
/mnt2 is the mountpoint of a filesystem.
/dev/sdk and /dev/sdl are the block devices for that filesystem.
case 1:
btrfs scrub start /mnt2
btrfs scrub start /mnt2
-> complain
case 1:
btrfs scrub start /dev/sdk
btrfs scrub start /dev/sdk
-> complain
case 3:
btrfs scrub start /dev/sdk
btrfs scrub start /dev/sdl
-> don't complain
case 4:
btrfs scrub start /dev/sdk
btrfs scrub start /mnt2
-> complain
case 5:
btrfs scrub start /mnt2
btrfs scrub start /dev/sdk
-> complain if the scrub on /dev/sdk is still running.
-> don't complain if the scrub on /dev/sdk is finished, the
status messages will be fine.
Reported-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
The btrfs tool is changed in order to support command line parameters
to configure the IO priority of the scrub tasks. Also the default is
changed. The default IO priority for scrub is the idle class now.
The behavior is the same as when one would type
'ionice ... btrfs scrub start ...' or 'ionice ... btrfs scrub resume ...'
(without this patch applied).
The only reason for adding this to the btrfs tool is that it was not
documented and not obvious that it worked like this, that all internal
scrub tasks inherited the IO priority values of the btrfs tool that is
starting or resuming the scrub operation.
Note that after applying the patch it is no longer possible to set
the IO priority using ionice since the btrfs tool always configures
the priority in order to run in the idle class by default.
Some basic performance measurements have been done with the goal to
measure which IO priority for scrub gives the best overall disk data
throughput. The kernel was configured to use the CFQ IO scheduler
with default configuration and without support for throttling. The
summary is, that the more the disk head movements are avoided, the
faster the overall disk transfer capacity is, which is not really a
big surprise. Therefore it makes sense that the best data throughput
was measured setting the scrub IO priority and the scrub readahead
IO priority to the idle class priority. Running with idle class IO
priority means that scrub and scrub readahead IO is paused while
other tasks access the disk. Doing the tasks one after the other
instead of concurrently avoids many disk head movements. The
overall data throughput of rotating disks is improved this way.
However, if it is desired to have the scrub task done within a
reasonable time, and if at the same time the filesystem is heavily
loaded, the idle IO priority should be avoided. Otherwise the scrub
operation will never take place and thus never terminate.
The best effort IO priority class with the subclass 7 (the lowest
one in the best effort class) is recommended in the case of always
heavily loaded hard disks. If the filesystem is not loaded all the
time and leaves some idle slots for scrub, the idle class IO priority
is recommended. The idle class now is the default if the scrub
operation is started with the btrfs-progs tools.
Note that the patch that sets the scrub readahead IO priority to the
idle class is a seperate patch, this needs to be done in the kernel.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
get_fs_info() has been silently switching from a device to a mounted
path as needed; the caller's filehandle was unexpectedly closed &
reopened outside the caller's scope. Not so great.
The callers do want "fdmnt" to be the filehandle for the mount point
in all cases, though - the various ioctls act on this (not on an fd
for the device). But switching it in the local scope of get_fs_info
is incorrect; it just so happens that *usually* the fd number is
unchanged.
So - use the new helpers to detect when an argument is a block
device, and open the the mounted path more obviously / explicitly
for ioctl use, storing the filehandle in fdmnt.
Then, in get_fs_info, ignore the fd completely, and use the path on
the argument to determine if the caller wanted to act on just that
device, or on all devices for the filesystem.
Affects those commands which are documented to accept either
a block device or a path:
* btrfs device stats
* btrfs replace start
* btrfs scrub start
* btrfs scrub status
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
cmd_scrub_cancel had its own mountpoint discovery routine;
just use open_path_or_dev_mnt() for that now.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
consolidate error handling to ensure that peer_fd
is closed on error paths. Add a couple comments
to the error handling after the thread is complete.
Note that scrub_progress_cycle returns negative
errnos on any error.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
The two sigint handlers issue ioctls to clean up, but if
they fail, noone would know. I'm not sure there is
any other error handling to be done at this point, but a
notification seems wise.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
If we request scrub cancel on an unmounted or
non-btrfs device, we still get a "scrub canceled"
success message:
# btrfs scrub cancel /dev/loop1
scrub cancelled
# blkid /dev/loop1
/dev/loop1: UUID="7f586941-1d5e-4ba7-9caa-b35934849957" TYPE="xfs"
Fix this so that if check_mounted_where returns 0
we don't report success.
While we're at it, use perror to report the reason for an open
failure, if we get one.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
If we retry opening the mountpoint and fail, we'll call
close on a filehandle w/ value -1. Rearrange so the
retry uses the same open and same error handling.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>