Add find_file_name() and find_file_type() functions for later nlink and
inode_item repair.
Later nlink repair will use both functions, and inode_item repair will
use find_file_type().
Both work by searching the backref list: dir_item/index for the type
search, and dir_item/index or inode_ref for the name search.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Add count_digits() function in utils.h to help compute a filename with
an ino suffix.
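A minimal sketch of the idea, not the code added to utils.h: counting the
decimal digits of an ino tells how much room a "<name>.<ino>" filename
needs. name_with_ino_suffix() and the "." separator are assumptions made
for illustration only.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Number of decimal digits needed to print num; 0 still needs one digit. */
static int count_digits(uint64_t num)
{
	int ret = 0;

	if (num == 0)
		return 1;
	while (num > 0) {
		ret++;
		num /= 10;
	}
	return ret;
}

/*
 * Hypothetical helper (not from the patch): write "<name>.<ino>" into buf.
 * Returns the resulting name length, or -1 if buf is too small.
 */
static int name_with_ino_suffix(char *buf, size_t buflen, const char *name,
				uint64_t ino)
{
	size_t need = strlen(name) + 1 + (size_t)count_digits(ino);

	if (need + 1 > buflen)
		return -1;
	snprintf(buf, buflen, "%s.%llu", name, (unsigned long long)ino);
	return (int)need;
}
```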
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
With the previous btrfs inode operations patches, now we can use
btrfs_mkdir() to create the 'lost+found' dir to do some data salvage in
btrfsck.
This patch along with previous ones will make data salvage easier.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Add btrfs_unlink() and btrfs_add_link() functions in inode.c,
for the incoming btrfs_mkdir() and later inode operations functions.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Record the highest inode number before inode repair.
This is especially important for the corrupted leaf case.
In that case, btrfs_find_free_objectid may find an ino that existed in
the corrupted leaf but was dropped by btree_recover.
If that happens, the created dir will be referenced incorrectly, since an
inode_ref or dir_index/item may still refer to it.
So we must record the highest inode number according to the inode_cache.
The inode_cache is reliable here: when an inode_ref or dir_index/item is
found, an entry is created even if the referenced inode is not found.
If we record the highest inode number in the inode_cache and use
highest_inode + 1 for the 'lost+found' dir, the newly created dir cannot
conflict with any possible inode.
This provides the basis for nlink or inode rebuild when repairing btrfs
with leaf/node corruption.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Allow direct search for the last cache extent.
Provide the basis for finding the last ino in inode_cache.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Import lookup/del_inode_ref() function in inode-item.c, as base functions
for the incoming btrfs_add_link() and btrfs_unlink() functions.
Also modify btrfs_insert_inode_ref() and split_leaf() making them able
to deal with EXTENT_IREF incompat flag.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Import btrfs_insert/del/lookup_extref() functions from the kernel for the
incoming btrfs_add_link() and btrfs_unlink() functions.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Run fstests: btrfs/012 will fail with message:
unable to do rollback
It is because the rollback function checks sequentially each piece of space
to map to a certain block group. If some piece doesn't, rollback refuses to continue.
After kernel commit:
commit 47ab2a6c689913db23ccae38349714edf8365e0a
Btrfs: remove empty block groups automatically
Empty block groups are removed, so there are possible gaps:
    |--block group 1--|    |--block group 2--|
                         ^
                         |
                        gap
So the piece of space of the gap belongs to a removed empty block group,
and rollback should detect this case, and feel free to continue.
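A rough sketch of the check described above, not the actual rollback
code: each piece of space is looked up in a list of block groups, and a
piece that falls into a gap is counted and skipped instead of aborting.
struct bg_range, offset_is_mapped() and count_gap_pieces() are
hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of a block group: [start, start + len). */
struct bg_range {
	uint64_t start;
	uint64_t len;
};

/* Return 1 if offset lies inside one of the block groups, 0 if in a gap. */
static int offset_is_mapped(const struct bg_range *groups, size_t count,
			    uint64_t offset)
{
	size_t i;

	for (i = 0; i < count; i++) {
		if (offset >= groups[i].start &&
		    offset - groups[i].start < groups[i].len)
			return 1;
	}
	return 0;
}

/*
 * Walk [start, end) in fixed steps. A piece in a gap is treated as space of
 * a removed empty block group: it is counted and the walk continues.
 */
static uint64_t count_gap_pieces(const struct bg_range *groups, size_t count,
				 uint64_t start, uint64_t end, uint64_t step)
{
	uint64_t gaps = 0;
	uint64_t offset;

	if (step == 0)
		return 0;
	for (offset = start; offset < end; offset += step) {
		if (!offset_is_mapped(groups, count, offset))
			gaps++;
	}
	return gaps;
}
```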
Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
The @search_cache_extent() only returns the next cache_extent or NULL,
it will never return the previous cache_extent.
So just remove the dead condition that handles a previous cache_extent.
Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
The value of the variable leaf does not have to be set in every round
of the while loop. Just move it outside.
Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
With the task-utils it's in the default LIBS flags now. We want to use
-pthread as it also sets flags for the preprocessor.
Signed-off-by: David Sterba <dsterba@suse.cz>
Support for monitoring progress of running tasks, based on timerfd and
pthreads.
Signed-off-by: Silvio Fricke <silvio.fricke@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Sometimes we have a pretty corrupted fs but have an old tree bytenr that we
could use, add the ability to specify the tree root bytenr. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Tested-by: Ansgar Hockmann-Stolle <ansgar.hockmann-stolle@uni-osnabrueck.de>
Signed-off-by: David Sterba <dsterba@suse.cz>
This is a starting point for a debugfs style python interface using
the search ioctl. For now it can only do one thing, which is to
print out all the extents in a file and calculate the compression ratio.
Over time it will grow more features, especially for the kinds of things
we might run btrfs-debug-tree to find out. Expect the usage and output
to change dramatically over time (don't hard code to it).
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
The @fi_args->num_devices in @get_fs_info() does not include seed devices.
We can correct it by searching the chunk tree and counting how
many dev_items there are in total, which includes seed devices.
Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
The following BUG_ON:
BUG_ON(ndevs >= fi_args->num_devices)
is not needed, because it always fails with seed devices present.
Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
There is no need to try to build seed/sprout mapping for those btrfs
without seed devices, so just skip such fs.
We could get the total number of devices from the disk super block, if it
equals the number of items in list @fs_devices->devices, then there shouldn't
be any seed devices.
Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Before this patch, a chunk was considered bad if the corresponding
block group was missing, even though the only uncertain data is the
'used' member of the block group.
This patch tries to recalculate the 'used' value of the block group
and rebuild it.
So even if only the chunk item and dev extent item are found, the chunk
can be recovered.
However, if the extent tree is damaged and the needed extent items can't
be read, the block group's 'used' value is set to the block group length,
to prevent any later write/block reserve from damaging the block group.
In that case, we prompt the user and recommend using
'--init-extent-tree' to rebuild the extent tree if possible.
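A simplified sketch of the recalculation, assuming the extents are
already available as a flat array (the real code reads them from the
extent tree). struct extent_range and rebuild_bg_used() are hypothetical
names, not functions from this patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified extent record. */
struct extent_range {
	uint64_t start;
	uint64_t len;
};

/*
 * Recompute 'used' for the block group [bg_start, bg_start + bg_len).
 * If the extents cannot be read, fall back to the full block group length
 * so that nothing new gets allocated from it.
 */
static uint64_t rebuild_bg_used(uint64_t bg_start, uint64_t bg_len,
				const struct extent_range *extents,
				size_t count, int extents_readable)
{
	uint64_t used = 0;
	size_t i;

	if (!extents_readable)
		return bg_len;
	for (i = 0; i < count; i++) {
		if (extents[i].start >= bg_start &&
		    extents[i].start - bg_start < bg_len)
			used += extents[i].len;
	}
	return used;
}
```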
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Extract the procedure of searching for a target device for fi show
from the @map_seed_devices() function to make it more clear.
Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
This patch reworks the basic calculations of 'fi usage'. It does not address
all problems but should make the code more prepared to do so.
The original code tries to estimate the free space that could lead to negative
numbers for some raid profiles:
Data, RAID1: total=147.00GiB, used=141.92GiB
System, RAID1: total=32.00MiB, used=36.00KiB
Metadata, RAID1: total=2.00GiB, used=1.17GiB
GlobalReserve, single: total=404.00MiB, used=0.00B
Overall:
Device size: 279.46GiB
Device allocated: 298.06GiB
Device unallocated: 16.00EiB
Used: 286.18GiB
Free (estimated): 8.00EiB (min: 8.00EiB)
Data ratio: 2.00
Metadata ratio: 2.00
Global reserve: 404.00MiB (used: 0.00B)
Eg. "Device size" - "Device allocated" = negative number or a very large
positive, hence the EiB values.
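A small illustration of where the EiB values come from, assuming the
sizes are kept as unsigned 64-bit byte counts. device_unallocated() is a
hypothetical name, and the sizes are rounded to whole GiB.

```c
#include <stdint.h>

#define GiB (1024ULL * 1024 * 1024)

/*
 * Plain unsigned subtraction. If 'allocated' is larger than 'size', the
 * result wraps around to just below 2^64 bytes, which prints as 16.00EiB.
 */
static uint64_t device_unallocated(uint64_t size, uint64_t allocated)
{
	return size - allocated;
}
```

With a size of 279GiB and 298GiB allocated the result wraps; with both
values taken as raw sizes (558GiB and 298GiB) the subtraction gives the
expected 260GiB.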
There are logical and raw numbers multiplied by ratios mixed together,
so the new code makes it explicit which kind is being used. The data and
metadata ratios are calculated separately.
Output after this patch will look like:
Overall:
Device size: 558.92GiB
Device allocated: 298.06GiB
Device unallocated: 260.86GiB
Used: 286.18GiB
Free (estimated): 135.51GiB (min: 135.51GiB)
Data ratio: 2.00
Metadata ratio: 2.00
Global reserve: 404.00MiB (used: 0.00B)
Data,RAID1: Size:147.00GiB, Used:141.92GiB
/dev/sdc 147.00GiB
/dev/sdd 147.00GiB
Metadata,RAID1: Size:2.00GiB, Used:1.17GiB
/dev/sdc 2.00GiB
/dev/sdd 2.00GiB
System,RAID1: Size:32.00MiB, Used:36.00KiB
/dev/sdc 32.00MiB
/dev/sdd 32.00MiB
Unallocated:
/dev/sdc 130.43GiB
/dev/sdd 130.43GiB
Changes:
* Device size is now the raw size, same for the following three
* Free is the logical size
* Max/min were reduced to just min
Filesystem Size Used Avail Use% Mounted on
/dev/sdc 280G 144G 141G 51% /mnt/sdc
The difference between Avail and Free is there because userspace tool does a
different guesswork than kernel.
Issues not addressed by this patch:
* RAID56 profiles are not handled
* mixed profiles are not handled
Signed-off-by: David Sterba <dsterba@suse.cz>
Even if run as root:
# su
# btrfs file usage <path> <== path exists outside the mnt point
We get the output:
WARNING: ..., run as root
WARNING: ..., run as root
ERROR:...
This is because load_chunk_info checks the return value of ioctl rather
than its errno. On failure ioctl returns -1, which happens to equal
-EPERM exactly, so the outer 'run as root' warning is printed.
Check the errno of ioctl instead, and prevent load_chunk_info from
returning -1 in other error conditions.
For load_device_info, the problem and the fix are the same.
After the fix, the 'run as root' WARNINGs do not show up in this condition.
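A minimal sketch of the confusion, using a stand-in for the failing
ioctl. All function names here are hypothetical; only the -1 vs. -EPERM
coincidence (EPERM is 1 on Linux) is taken from the description above.

```c
#include <errno.h>

/* Stand-in for a failing ioctl(): returns -1 and sets errno. */
static int fake_ioctl(int err)
{
	errno = err;
	return -1;
}

/* Old pattern: propagate the raw return value, which is always -1. */
static int load_info_buggy(int err)
{
	int ret = fake_ioctl(err);

	if (ret < 0)
		return ret;
	return 0;
}

/* New pattern: report the real reason taken from errno. */
static int load_info_fixed(int err)
{
	int ret = fake_ioctl(err);

	if (ret < 0)
		return -errno;
	return 0;
}

/* The caller prints the 'run as root' warning only for -EPERM. */
static int caller_warns_run_as_root(int ret)
{
	return ret == -EPERM;
}
```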
Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
When we exec the following cmd:
# btrfs file usage -t <path> <-- an invalid path
output:
# ERROR: can't access '-t'
should be:
# ERROR: can't access 'path'
Just replace the static 'argv[1]' with 'argv[i]'.
Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
When running btrfs-file-usage on a btrfs with data profile raid5/6,
the output for "Free" & "Data to device ratio" is wrong,
as follows:
...
Device size: 100.00GiB
Device allocated: 2.04GiB
Device unallocated: 97.96GiB
Used: 1.12MiB
Free (Estimated): 197.89GiB <== Free > Device size
Data to device ratio: 198 % <== > 100%
Global reserve: 0.00B
...
This is because get_raid56_used() does not iterate over the
chunk_info array correctly; it just adds the statistics of the first
chunk_info repeatedly.
Just add a ptr to iterate over the array.
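A sketch of the bug pattern and the fix, assuming a simplified chunk
record. struct chunk_info_sketch and both functions are hypothetical and
only mirror the shape of get_raid56_used().

```c
#include <stdint.h>

#define SKETCH_RAID5 (1ULL << 7)
#define SKETCH_RAID6 (1ULL << 8)

/* Hypothetical, simplified chunk record. */
struct chunk_info_sketch {
	uint64_t type;
	uint64_t size;
};

/* Buggy: the loop runs 'count' times but always reads the first element. */
static uint64_t raid56_size_buggy(const struct chunk_info_sketch *chunks,
				  int count)
{
	uint64_t total = 0;
	int i;

	for (i = 0; i < count; i++) {
		if (chunks->type & (SKETCH_RAID5 | SKETCH_RAID6))
			total += chunks->size;
	}
	return total;
}

/* Fixed: a separate pointer advances together with the loop counter. */
static uint64_t raid56_size_fixed(const struct chunk_info_sketch *chunks,
				  int count)
{
	const struct chunk_info_sketch *p = chunks;
	uint64_t total = 0;
	int i;

	for (i = 0; i < count; i++, p++) {
		if (p->type & (SKETCH_RAID5 | SKETCH_RAID6))
			total += p->size;
	}
	return total;
}
```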
Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Current error messages look like the following:
Error: unable to create FS with metadata profile 32 (have 2 devices)
Error: unable to create FS with metadata profile 256 (have 2 devices)
It is hard for users to map "profile XX" to its proper
meaning, such as "raidN". So use a recognizable string instead of the
internal numerical value. In case of "DUP", use an explicit message.
This patch also fixes a bug where the message mistook the metadata
profile for the data profile.
After applying this patch, messages will be like:
Error: DUP is not allowed when FS have multiple devices
Error: unable to create FS with metadata profile RAID6 (have 2
devices but 3 devices are required)
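A sketch of the number-to-name mapping, not the code from this patch.
The bit values follow the on-disk block group profile flags, which is
why the old messages showed 32 for DUP and 256 for RAID6; profile_name()
and the SKETCH_* macros are hypothetical names.

```c
#include <stdint.h>

/* Same bit positions as the block group profile flags. */
#define SKETCH_RAID0	(1ULL << 3)
#define SKETCH_RAID1	(1ULL << 4)
#define SKETCH_DUP	(1ULL << 5)
#define SKETCH_RAID10	(1ULL << 6)
#define SKETCH_RAID5	(1ULL << 7)
#define SKETCH_RAID6	(1ULL << 8)
#define SKETCH_PROFILE_MASK (SKETCH_RAID0 | SKETCH_RAID1 | SKETCH_DUP | \
			     SKETCH_RAID10 | SKETCH_RAID5 | SKETCH_RAID6)

/* Map a profile value to a readable name; type bits are ignored. */
static const char *profile_name(uint64_t profile)
{
	switch (profile & SKETCH_PROFILE_MASK) {
	case 0:
		return "single";
	case SKETCH_RAID0:
		return "RAID0";
	case SKETCH_RAID1:
		return "RAID1";
	case SKETCH_DUP:
		return "DUP";
	case SKETCH_RAID10:
		return "RAID10";
	case SKETCH_RAID5:
		return "RAID5";
	case SKETCH_RAID6:
		return "RAID6";
	default:
		return "unknown";
	}
}
```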
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
The term has not seen an agreement and we don't want to change it once
it's in non-development branches or even released.
Discussion under the patch:
http://thread.gmane.org/gmane.comp.file-systems.btrfs/34627
Signed-off-by: David Sterba <dsterba@suse.cz>
It looks confusing among the chunks, it is not in fact a chunk type.
Sample:
Overall:
Device size: 35.00GiB
Device allocated: 8.07GiB
Device unallocated: 26.93GiB
Used: 1.12MiB
Free (Estimated): 17.57GiB (Max: 30.98GiB, min: 17.52GiB)
Data to device ratio: 50 %
Global reserve: 16.00MiB (used: 0.00B)
...
Signed-off-by: David Sterba <dsterba@suse.cz>
The main point of this is to load the device and chunk infos at one
place and pass down to the printers. The EPERM is handled separately, in
case kernel does not give us all the information about chunks or
devices, but we want to warn and print at least something.
For non-root users, 'filesystem usage' prints only the overall stats and
warns about RAID5/6.
The sole cleanup changes affect mostly the modified code and the related
functions, should be reasonably small.
Signed-off-by: David Sterba <dsterba@suse.cz>
The 'fi usage' lacks an overall report, this used to be in the enhanced
df command. Add it back.
Sample:
Overall:
Device size: 35.00GiB
Device allocated: 8.07GiB
Device unallocated: 26.93GiB
Used: 1.12MiB
Free (Estimated): 17.57GiB (Max: 30.98GiB, min: 17.52GiB)
Data to device ratio: 50 %
...
Signed-off-by: David Sterba <dsterba@suse.cz>
The device may not be fully occupied by the filesystem, the value of
Unallocated should not be calculated against the device size but the
size provided by DEV_INFO.
Signed-off-by: David Sterba <dsterba@suse.cz>
The entire device size may not be available to the filesystem, eg. if
it's modified via resize. Print this information if it can be obtained
from the DEV_INFO ioctl.
Print the device ID on the same line as the device name and move size to
the next line.
Sample:
/dev/sda7, ID: 3
Device size: 10.00GiB
FS occupied: 5.00GiB
Data,RAID10: 512.00MiB
Metadata,RAID10: 512.00MiB
System,RAID10: 4.00MiB
Unallocated: 9.00GiB
Signed-off-by: David Sterba <dsterba@suse.cz>
The TREE_SEARCH ioctl is root-only, FS_INFO will be available for
non-root users with an updated kernel, let the user know.
Signed-off-by: David Sterba <dsterba@suse.cz>
Move the command definitions where they belong, keep common 'usage'
functions in cmds-fi-disk_usage.c and add exports.
Rename structures containing 'disk' to 'device'.
Fix whitespace in the modified code.
Signed-off-by: David Sterba <dsterba@suse.cz>
Add back the original output of the 'btrfs fi df' command for backward
compatibility. The rich output is moved from 'disk_usage' to 'usage'.
Agreed in http://www.spinics.net/lists/linux-btrfs/msg31698.html
Signed-off-by: David Sterba <dsterba@suse.cz>
Let's not assign 0 to *info_ptr before calling free on it, which loses
track of the already allocated memory if realloc fails in
add_info_to_list. Let's call free first.
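A minimal sketch of the ordering, using a plain int list instead of the
real structures. append_value() is a hypothetical helper and not the
code of add_info_to_list().

```c
#include <stdlib.h>

/*
 * Append one value to a growing list. If realloc() fails, the old block is
 * still valid, so it is freed first and the pointer is cleared afterwards.
 */
static int append_value(int **list, size_t *count, int value)
{
	int *tmp = realloc(*list, (*count + 1) * sizeof(**list));

	if (!tmp) {
		free(*list);	/* free while we still have the pointer */
		*list = NULL;	/* only now drop the reference */
		*count = 0;
		return -1;
	}
	tmp[*count] = value;
	*list = tmp;
	(*count)++;
	return 0;
}
```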
Signed-off-by: Rakesh Pandit <rakesh@tuxera.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
The usage() calls exit() internally, so remove the return after it.
Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
This patch adds some functions to manage the printing of the data in
tabular format.
The function
struct string_table *table_create(int columns, int rows)
creates an (empty) table.
The functions
char *table_printf(struct string_table *tab, int column,
int row, char *fmt, ...)
char *table_vprintf(struct string_table *tab, int column,
int row, char *fmt, va_list ap)
populate the table with text. To align the text to the left, the text
shall be prefixed with '<'; otherwise the text shall be prefixed with
'>'. If the first character is '=', the text is replaced by a
sequence of '=' that fills the column width.
The function
void table_free(struct string_table *)
frees all the data associated to the table.
The function
void table_dump(struct string_table *tab)
prints the table on stdout.
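A sketch of how the alignment prefixes could be interpreted when a cell
is rendered. render_cell() is a hypothetical helper written for this
illustration and is not part of the table API listed above; it assumes a
non-negative width.

```c
#include <stdio.h>
#include <string.h>

/*
 * Render one cell into out, padded to 'width' characters:
 *   "<text"  left-aligned
 *   ">text"  right-aligned (also the default without a prefix)
 *   "="      a run of '=' filling the column
 */
static void render_cell(char *out, size_t outlen, const char *cell, int width)
{
	const char *text = cell;

	if (outlen == 0)
		return;
	if (cell[0] == '=') {
		size_t n = (size_t)width < outlen - 1 ? (size_t)width
						      : outlen - 1;

		memset(out, '=', n);
		out[n] = '\0';
		return;
	}
	/* Skip the alignment prefix, if there is one. */
	if (cell[0] == '<' || cell[0] == '>')
		text = cell + 1;
	if (cell[0] == '<')
		snprintf(out, outlen, "%-*s", width, text);
	else
		snprintf(out, outlen, "%*s", width, text);
}
```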
Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
Signed-off-by: David Sterba <dsterba@suse.cz>
Enhance the command "btrfs filesystem df" to show space usage information
for one or more mount points. It also shows an estimate of the available
space, based on the space currently used.
Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
[code moved under #if 0 instead of deletion]
Signed-off-by: David Sterba <dsterba@suse.cz>
Before this patch, when btrfsck found an error in the root dir, it only
output the message "root %llu root dir %llu error" without any
detailed error.
Just add print_inode_error() to print out the whole error.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
resolve_one_root() returns the objectid of a tree rather than the logical
address of the root node. Hence using root_bytenr is misleading. Fix this.
Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
For now,
# btrfs fi show /mnt/btrfs
gives info correctly, while
# btrfs fi show /mnt/btrfs/
gives nothing.
This implies that the @realpath() function should be applied to
unify the behavior.
Made a more clear comment right above the call as well.
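A small sketch of why canonicalizing helps, assuming only POSIX
realpath(). same_canonical_path() is a hypothetical helper, not a
function from this patch.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700	/* for realpath() under strict -std modes */
#endif
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return 1 if both paths resolve to the same canonical path, 0 if they
 * differ, and -1 if either path cannot be resolved.
 */
static int same_canonical_path(const char *a, const char *b)
{
	char ra[PATH_MAX];
	char rb[PATH_MAX];

	if (!realpath(a, ra) || !realpath(b, rb))
		return -1;
	return strcmp(ra, rb) == 0;
}
```

A plain string compare treats "/mnt/btrfs" and "/mnt/btrfs/" as
different, while the resolved forms are identical.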
Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>