The option "--clear-space-cache" is not really that suitable for "btrfs
check" group, as there are some concerns:
- Allowing transid mismatch
- No leaf item checks
Thoe behaviour are inherited from the default open ctree flags for
"btrfs check", which can be unsafe if the end user just wants to clear
the cache.
- Unclear if the cache clearing would happen along with repair
Thankfully the clearing of space cache is done without any repair
Thus there is a proposal to move space cache removal to rescue group,
and this patch would do that exactly.
However this would lead to some behavior changes:
- Transid mismatch would be treated as error
- Leaf items size/offset would still be checked
If we hit any above error, we should just abort without doing any
write.
These change would increase the safety of the space cache removal, thus
I believe it's worthy to introduce such behavior change.
Since we're here, also add a small explanation on why we need this
dedicated tool to clear space cache (especially for v1 cache).
Issue: #698
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
There was a support for a short syntax of 'btrfs balance' that accepted
a path where normally would be the mandatory subcommand. This was a
heuristic and nowadays everybody should be using the
'btrfs balance action' syntax. The warning was in place for a year, it's
time to remove the short syntax completely.
Issue: #517
Signed-off-by: David Sterba <dsterba@suse.com>
Allow user to do dry-run deletion, doing any pre-checks and printing
which subvolumes would be deleted. Lack of access rights can still lead
to errors.
Issue: #629
Signed-off-by: David Sterba <dsterba@suse.com>
Some commands could be run in a dry-run mode, i.e. not doing any
write/change actions, only printing the steps and ignoring errors.
There are two possibilities where to put the option:
- as a global one: btrfs --dry-run subvolume delete /path
- local option: btrfs subvolume delete --dry-run /path
As we have several global options already, let's put it there, dry-run
should not be very common so the slight inconvenience of writing the
option out of order of command arguments should be acceptable.
Issue: #629
Signed-off-by: David Sterba <dsterba@suse.com>
The most common sector sizes are 4k, 16k and 64k, we don't really need
to test that with convert, this takes a long time for little benefit as
the node size is only for metadata, while the rest remains the same.
Signed-off-by: David Sterba <dsterba@suse.com>
The ntfs2btrfs tool recently found a bug in 'check', add the conversion
test support to our testsuite. It's optional and depends on kernel
support of the 'ntfs3' driver and the availability of mkntfs and
ntfs2btrfs, which might not be available everywhere.
Signed-off-by: David Sterba <dsterba@suse.com>
Recently the functionality has been added to the 'rescue' group and
check prints a warning when the option is used but this should be also
visible in the help text.
Signed-off-by: David Sterba <dsterba@suse.com>
Bit shifts should be done on unsigned types as we're approaching 32,
also update some missing descriptions.
Signed-off-by: David Sterba <dsterba@suse.com>
Minor refactoring, check arguments and print parameters. This can be
used to stress btree allocation by changing run_size.
Signed-off-by: David Sterba <dsterba@suse.com>
There's an old test for btree code that could be used in the testsuite.
Still needs some polishing and review what tests make sense.
Signed-off-by: David Sterba <dsterba@suse.com>
Previously the XFS-specific code was commented out so we don't need the
headers for building fsstress, this changes how the utility behaves
compared to the one in fstests, e.g. randomness or additional open/close
operations. Add enough code from xfsprogs to make it compile and enable
the #if0-ed code.
Signed-off-by: David Sterba <dsterba@suse.com>
In experimental build, read global '--param zone-size=SIZE' and use it
as emulated zone size. This is for testing only, will be promoted to a
proper option in the future.
Signed-off-by: David Sterba <dsterba@suse.com>
./btrfs --param key=value command ...
./btrfs --param key command ...
To pass various tuning data for testing and debugging, undocumented
for regular users.
To add support add reading of the parameter value after option parsing
bconf_param_value("key") and convert to what you need.
Signed-off-by: David Sterba <dsterba@suse.com>
Add a document describing the layout and functionality of the newly
introduced RAID Stripe Tree.
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Due to refactoring in 88c25674c7 ("btrfs-progs: convert device info
to struct array") the variable tracking number of devices was not
updated and led to an error.
$ btrfs device usage /path
ERROR: unexpected number of devices: 0 != 1
...
Issue: #697
Signed-off-by: David Sterba <dsterba@suse.com>
Issue #622 reported a case where ntfs2btrfs can generate out-of-order
inline backref items, which can lead to kernel transaction abort, but
not detected by btrfs-check.
This patch would add such image, whose extent tree looks like this for
the only data extent:
item 0 key (13631488 EXTENT_ITEM 4096) itemoff 16172 itemsize 111
refs 3 gen 7 flags DATA
(178 0xdfb591fa80d95ea) extent data backref root FS_TREE objectid 257 offset 0 count 1
(178 0xdfb591fbbf5f519) extent data backref root FS_TREE objectid 258 offset 0 count 1
(178 0xdfb591f49f9f8e7) extent data backref root FS_TREE objectid 259 offset 0 count 1
While the original good base image has the following backrefs for the
same data extent:
item 0 key (13631488 EXTENT_ITEM 4096) itemoff 16172 itemsize 111
refs 3 gen 7 flags DATA
(178 0xdfb591fbbf5f519) extent data backref root FS_TREE objectid 258 offset 0 count 1
(178 0xdfb591fa80d95ea) extent data backref root FS_TREE objectid 257 offset 0 count 1
(178 0xdfb591f49f9f8e7) extent data backref root FS_TREE objectid 259 offset 0 count 1
Notice the sequence (the 2nd number in the round brackets) should be
descending. (Meanwhile type should be ascending, but it's way harder to
create.)
For now we don't have a way to fix it, but at least we should detect it.
Issue: #622
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Commit 6cf11f3e38 ("btrfs-progs: check: check order of inline extent
refs") added the ability to detect out-of-order inline extent backref
items.
Meanwhile there is no such ability in lowmem mode, this patch would
introduce such ability to lowmem mode.
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Commit 6cf11f3e38 ("btrfs-progs: check: check order of inline extent
refs") fixes a problem that btrfs check never properly verify the
sequence of inline references.
It's not obvious because by default kernel handles EXTENT_DATA_REF_KEY
using its own hash, resulting some seemingly out-of-order result:
item 0 key (13631488 EXTENT_ITEM 4096) itemoff 16143 itemsize 140
refs 4 gen 7 flags DATA
extent data backref root FS_TREE objectid 258 offset 0 count 1
extent data backref root FS_TREE objectid 257 offset 0 count 1
extent data backref root FS_TREE objectid 260 offset 0 count 1
extent data backref root FS_TREE objectid 259 offset 0 count 1
By a quick glance, no one can see the above inline backref items are in
any order.
To make such sequence more obvious, let dump-tree to output a new prefix
to indicate the type and the internal sequence number:
For above case, the new output would look like this:
item 0 key (13631488 EXTENT_ITEM 4096) itemoff 16143 itemsize 140
refs 4 gen 7 flags DATA
(178 0xdfb591fbbf5f519) extent data backref root FS_TREE objectid 258 offset 0 count 1
(178 0xdfb591fa80d95ea) extent data backref root FS_TREE objectid 257 offset 0 count 1
(178 0xdfb591f9c0534ff) extent data backref root FS_TREE objectid 260 offset 0 count 1
(178 0xdfb591f49f9f8e7) extent data backref root FS_TREE objectid 259 offset 0 count 1
Although still not that obvious, it should show the inline data backrefs
has descending sequence number.
For the type part, it's anti-instinctive in ascending order, which is
not that easy to produce.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Commit 963188943f ("btrfs-progs: make
btrfs_super_block::log_root_transid deprecated") deprecated
log_root_transid and broke build of sb-mod. As an unused member of sb we
don't need to edit it, so remove it from the list.
Signed-off-by: David Sterba <dsterba@suse.com>
The kernel seems to order inline extent items in a particular way:
forward by sub-type, then reverse by hash. Having these out of order can
cause a volume to go readonly when deleting an inode.
See https://github.com/maharmstone/ntfs2btrfs/issues/51
With additional comments from the pull request:
- lookup_inline_extent_backref() is skipping the remaining backref item
if data/metadata item is smaller (either through the data hash, or
metadata parent/ref_root) than the target range
- the fix could be still missing in lowmem mode
- image could be created according this comment
https://github.com/maharmstone/ntfs2btrfs/issues/51#issuecomment-1500781204
- due to late merge, the squota newly added key
BTRFS_EXTENT_OWNER_REF_KEY was not part of the patch and the value of
'hash' needs to be verified
Pull-request: #622
Signed-off-by: Mark Harmstone <mark@harmstone.com>
Signed-off-by: David Sterba <dsterba@suse.com>
There were some files left after running 'make clean-all'. Reorganize
the target commands and group them by type of files so it's easier to
see what's cleaned and where to add new files.
Signed-off-by: David Sterba <dsterba@suse.com>
Add new option -p to 'subvolume create' so it behaves like 'mkdir -p'
and create all missing path components before the subvolume.
Issue: #429
Signed-off-by: Sidong Yang <realwakka@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We need to validate the device uuid the same way as the fsid:
$ ./mkfs.btrfs --device-uuid 18eabcf0-6766-4fbf-b366-71b4ae725b2- img
btrfs-progs v6.5.2
See https://btrfs.readthedocs.io for more information.
ERROR: could not parse device UUID: 18eabcf0-6766-4fbf-b366-71b4ae725b2-
Signed-off-by: David Sterba <dsterba@suse.com>
Print the device uuid in the summary in case it's specified on the
command line, not always as it would be confusing and is not usually
needed. Can be found in 'btrfs inspect-internal dump-super' as
device_item.uuid .
Signed-off-by: David Sterba <dsterba@suse.com>
Add option --device-uuid that will set the device uuid item in super
block.
This is useful for creating a filesystem with a specific device uuid,
namely for testing.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The commit ("btrfs-progs: allow duplicate fsid for single device
filesystems") lets the duplicate fsid used for a new mkfs document this.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The initial patch used -q for enabling simple quota but this is not
right, -q is reserved for --quiet and we want the long options first.
Signed-off-by: David Sterba <dsterba@suse.com>
The new test case would:
- Prepare two loopback devices
One is for storing the source directory (on a btrfs).
This is to ensure we can set xattr for the directory, as filesystems
like tmpfs (mostly utilized by mktemp) is not supporting xattr.
The other one is the real target fs where we call
"mkfs.btrfs --rootdir" on.
- Create the source directory with the following contents:
* rootdir inode attributs:
# mode (750)
# uid (1000)
# gid (1000)
# xattr (user.rootdir)
* one regular file, with attributes:
# xattr (user.foorbar)
- Execute "mkfs.btrfs --rootdir" and mount the new fs
- Verify the above attributes
The target fs should have the same attributes, especially for the
rootdir inode.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
When using "mkfs.btrfs" with "--rootdir" option, the top level inode
(rootdir) will not get the same xattr from the source dir:
mkdir -p source_dir/
touch source_dir/file
setfattr -n user.rootdir_xattr source_dir/
setfattr -n user.regular_xattr source_dir/file
mkfs.btrfs -f --rootdir source_dir $dev
mount $dev $mnt
getfattr $mnt
# Nothing <<<
getfattr $mnt/file
# file: $mnt/file
user.regular_xattr <<<
[CAUSE]
In function traverse_directory(), we only call add_xattr_item() for all
the child inodes, not really for the rootdir inode itself, leading to
the missing xattr items.
Not only xattr, in fact we also miss the uid/gid/timestamps/mode for the
rootdir inode.
[FIX]
Extract a dedicated function, copy_rootdir_inode(), to handle every
needed attributes for the rootdir inode, including:
- xattr
- uid
- gid
- mode
- timestamps
Issue: #688
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
For running scrubs, with v6.3 and newer btrfs-progs, it can report
incorrect "Total to scrub":
Scrub resumed: Mon Oct 9 11:28:33 2023
Status: running
Duration: 0:44:36
Time left: 0:00:00
ETA: Mon Oct 9 11:51:38 2023
Total to scrub: 625.49GiB
Bytes scrubbed: 625.49GiB (100.00%)
Rate: 239.35MiB/s
Error summary: no errors found
[CAUSE]
Commit c88ac0170b ("btrfs-progs: scrub: unify the output numbers for
"Total to scrub"") changed the output method for "Total to scrub", but
that value is only suitable for finished scrubs.
For running scrubs, if we use the currently scrubbed values, it would
lead to the above problem.
The real scrubbed bytes is only reliable for finished scrubs, not for
running/canceled/interrupted ones.
[FIX]
Change print_scrub_dev() to do extra checks, and only for finished
scrubs to use the scrubbed bytes.
Otherwise fall back to the device's bytes_used.
Issue: #690
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
When running mkfs.btrfs with --rootdir on a block device, and the source
directory contains a sparse file, whose size is larger than the block
size, then mkfs.btrfs would fail:
# lsblk /dev/test/test
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
test-test 253:0 0 10G 0 lvm
# mkdir -p /tmp/output
# truncate -s 20G /tmp/output/file
# mkfs.btrfs -f --rootdir /tmp/output /dev/test/test
# sudo mkfs.btrfs -f /dev/test/scratch1 --rootdir /tmp/output/
btrfs-progs v6.3.3
See https://btrfs.readthedocs.io for more information.
ERROR: unable to zero the output file
[CAUSE]
Mkfs.btrfs would try to zero out the target file according to the total
size of the directory.
However the directory size is calculated using the file size, not the
real bytes taken by the file, thus for such sparse file with holes only,
it would still take 20G.
Then we would use that 20G size to zero out the target file, but if the
target file is a block device, we would fail as we can not enlarge a block
device.
[FIX]
When zeroing the file, we only enlarge it if the target is a regular
file.
Otherwise we warn about the size and continue.
Please note that, since "mkfs.btrfs --rootdir" doesn't handle sparse
file any differently from regular file, above case would still fail due
to ENOSPC, as will write zeros into the target file inside the fs.
Proper handling for sparse files would need a new series of patch to
address.
Issue: #653
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Currently, write_dev_supers() compares the superblock location vs the size
of the device to check if it can write the superblock. This is not correct
for a zoned device, whose superblock location is different than a regular
device.
Introduce check_sb_location() to check if the superblock zone exists for
the zoned case.
Running btrfs check can fail on a certain zoned device setup (e.g,
zone size = 128MB, device size = 16GB).
From generic/330:
yes | btrfs check --repair --force /dev/nullb1
[1/7] checking root items
Fixed 0 roots.
[2/7] checking extents
ERROR: zoned: failed to read zone info of 4096 and 4097: Invalid argument
ERROR: failed to write super block for devid 1: write error: Input/output error
failed to write new super block err -5
failed to repair damaged filesystem, aborting
This happens because write_dev_supers() is comparing the original
superblock location vs the device size to check if it can write out a
superblock copy or not.
For the above example, since the first copy location (64MB) < device size
(16GB), it tries to write out the copy. But, the copy must be written into
zone 4096 (512G / zone size (128M) = 4096), which is out of the device.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Introduce sb_bytenr_to_sb_zone(), which converts the original superblock
location to the zone number of superblock log writing.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The convert tests take a long time and are not necessary for quick test
rounds under the convenience target 'make test'. We still run the tests
in CI.
Signed-off-by: David Sterba <dsterba@suse.com>