Now that we have all of the supporting code, add the ability to create
all of the global roots for an extent tree v2 fs. This will default to
nr_cpu's, but also allow the user to specify how many global roots they
would like.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Our initial block group will use global root id 0 with extent tree v2,
so adjust the helper to take the chunk_objectid as an argument, as we'll
set this to 0 for extent tree v2 and then
BTRFS_FIRST_CHUNK_TREE_OBJECTID for extent tree v1.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We're going to start create global roots from mkfs, and we need to have
a offset set for the root key. Make the btrfs_create_tree() take a key
for the root_key instead of just the objectid so we can setup these new
style roots properly.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
With extent tree v2 we'll have multiple free space trees, and we can't
just unset the feature flags for the free space tree. Fix this to loop
through all of the free space trees and clear them out properly.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The free space tree code already does this, but we need it for cleaning
up per block group roots. Abstract this code out into a helper so that
we can use it in multiple places in the future.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We will now be using block_group->chunk_objectid to point at the global
root id for this particular block group. For now we'll assign this
based on mod'ing the offset of the block group against the number of
global root id's and handle the block_group_item updating appropriately.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
In order to make sure the file system is consistent we need to record
the number of global roots we should have in the super block. We could
infer this from the number of global roots we find, however this could
lead to interesting fuzzing problems, so add a source of truth to the
super block in order to make it easier to verify the file system is
consistent.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We need to make sure we process the block group root, and mark its
blocks as used for the free space tree checking.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
In the case of per-bg roots we may be missing the root items. To
re-initialize them we want to add the root item as well as allocate the
empty block. To achieve this extract out the reinit root logic to a
helper that just takes the root key and then does the appropriate work
to allocate an empty root and update the root item. Fix the normal
reinit root helper to use this new helper.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The free space tree needs to be validated against all referenced blocks
in the file system, so use the btrfs_mark_used_blocks() helper to check
the free space tree and free space cache against. This will do the
right thing for both extent tree v1 and extent tree v2.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When we switch to per-block group extent roots we'll need to scan each
individual extent root. To make this easier in the future go ahead and
use the range of the block groups to scan the extents.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This makes the appropriate changes to enable the block group tree
checking for both lowmem and normal check modes. This is relatively
straightforward, simply need to use the helper to get the right root for
dealing with block groups.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Add the extent tree v2 table with the block group tree as a root, and
then create the empty root and use the proper root for cleanup up the
temporary block groups.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We have an ASSERT(ret == 0) when populating the free space tree as we
should at least find the block group item with extent tree v1. However
with v2 we no longer have the block group item in the extent tree, so
fix the population logic to handle an empty block group (which occurs
during mkfs) and only assert if ret != 0 and we don't have extent tree
v2 turned on.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When we're messing with block group items use the
btrfs_block_group_root() helper to get the correct root to search, and
this will do the right thing based on the file system flags.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Instead of accessing the extent root directory for modifying block
groups, use the helper which will do the correct thing based on the
flags of the file system.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Add the appropriate support to the print tree and dump tree code to spit
out the block group tree.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This adds the ability to load the block group root, as well as make sure
the various backup super block and super block updates are made
appropriately.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When we change the size of the btrfs_header we're going to need to
change how these helpers calculate where to find the start of items or
block ptrs. To prepare for that make these helpers take the
extent_buffer as an argument so we can do the appropriate math based on
the version type.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We are duplicating the offsetof(btrfs_node, key_ptr) logic everywhere,
instead use the helper to do this work for us, and make all the node
accessors use the helper.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
btrfs_item_nr_offset(0) is simply offsetof(struct btrfs_leaf, items),
which is the same thing as btrfs_leaf_data(), so replace all calls of
btrfs_item_nr_offset(0) with btrfs_leaf_data().
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Now that all callers are using the _nr variations we can simply rename
these helpers to btrfs_item_##member/btrfs_set_item_##member and change
the actual item SETGET funcs to raw_item_##member/set_raw_item_##member
and then change all callers to drop the _nr part.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
All callers use the btrfs_item_end_nr() variation, simply drop
btrfs_item_end() and make btrfs_item_end_nr() use the _nr() variations
of the item get helpers.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This matches how the kernel does it, simply pass in the slot and fix up
btrfs_file_extent_inline_item_len to use the btrfs_item_nr() helper and
the correct define. Fixup all the callers to use the slot now instead
of passing in the btrfs_item.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We have a lot of the following patterns
item = btrfs_item_nr(nr);
btrfs_set_item_*(eb, item, val);
btrfs_set_item_*(eb, btrfs_item_nr(nr), val);
in a lot of places in our code. Instead add _nr variations of these
helpers and convert all of the users to this new helper.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We have this pattern in a lot of places
item = btrfs_item_nr(slot);
btrfs_item_size(leaf, item);
btrfs_item_offset(leaf, item);
when we could simply use
btrfs_item_size_nr(leaf, slot);
btrfs_item_offset_nr(leaf, slot);
Fix all callers of btrfs_item_size() and btrfs_item_offset() to use the
_nr variation of the helpers.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This helper only takes the nodesize, but in the future it'll take a bool
to indicate if we're extent tree v2. The remaining users are all where
we only have extent_buffer, but we should always have a valid
eb->fs_info in these cases, so add BUG_ON()'s for the !eb->fs_info case
and then convert these callers to use BTRFS_LEAF_DATA_SIZE which takes
the fs_info.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The mkfs_config can hold the BTRFS_LEAF_DATA_SIZE, so calculate this at
config creation time and then use that value throughout convert instead
of calling __BTRFS_LEAF_DATA_SIZE.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This is going to be a different value based on the incompat settings of
the file system, just store this in the fs_info instead of calculating
it every time.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We use __BTRFS_LEAF_DATA_SIZE() in a few places for mkfs. With extent
tree v2 we'll be increasing the size of btrfs_header, so it'll be kind
of annoying to add flags to all callers of __BTRFS_LEAF_DATA_SIZE, so
simply calculate it once and put it in the mkfs_config and use that.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When adding the GC support I noticed we were failing fsck when we had a
directory that hadn't been cleaned up yet because of rm -rf. However
this isn't limited to extent-tree-v2, we'll actually fail in the same
way if we were unable to do the evict portion of the deletion and left
the orphan items around for everybody.
This is a valid file system, it'll be cleaned up properly at mount time,
so fsck shouldn't fail in this case.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When implementing the GC tree I started getting btrfsck errors when a
test rm -rf <directory> with files inside of it and immediately unmount,
leaving behind orphaned directory items that have GC items for them.
This made me realize that we don't actually handle this case currently
for our normal orphan path. If we fail to clean everything up and leave
behind the orphan items we'll fail fsck.
Fix this by not processing any backrefs we find if we found an inode
item and its nlink is 0. This allows us to pass the test case I've
provided to validate this patch.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
I started hitting a segfault on fuzz test 006 because we couldn't find
the extent root. This is because the global root search stuff expects
the actual key to be setup properly, not just an objectid. Fix this by
initializing the key properly so we can find the extent root and other
trees properly.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The chunk recover code passes in a buffer it allocates with metadata but
no fs_info, causing fuzz-test 008 to segfault. Fix this test to only
check the flag if we have buf->fs_info set.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
With my global roots prep patches I regressed us on handling the case
where we didn't find a root at all. In this case we need to return an
error (prior we returned -ENOENT) or we need to populate a dummy tree if
we have OPEN_CTREE_PARTIAL set. This fixes a segfault of fuzz test 006.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We were previously getting away with this because the
load_global_roots() treated ENOENT like everything was a-ok. However
that was a bug and fixing that bug uncovered a problem where we were
unconditionally trying to load the free space tree. Fix that by
skipping the load if we do not have the compat bit set.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The fuzz-test/003 was infinite looping when I reworked the code to
re-calculate the used bytes for the superblock. This is because fsck
wasn't properly fixing the bad extent before my change, it just happened
to error out nicely, whereas my change made it so we go the wrong bytes
used count and just infinite looped trying to fix the problem.
Fix this by sanity checking the extent when we try to re-calculate the
bytes_used. This makes us no longer infinite loop so we can get through
the fuzz tests.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
003-multi-check-unmounted was taking a long time, this is because it was
doing the 10 second countdown for each iteration of --init-csum-tree.
Fix this by using --force to bypass the countdown.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
While running make test on other patches I noticed we are now
segfaulting on the fuzz tests. This is because when I converted us to a
rb tree for the global roots we lost the ability to catch that there's
no extent root at all. Before we'd populate a dummy
fs_info->extent_root with a not uptodate node, but now you simply don't
get an extent root in the rb_tree. Fix the check_global_roots_uptodate
helper to count how many roots we find and make sure it matches the
number we expect. If it doesn't then we can return -EIO.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
'make clean' can fail in case the _build directory is not present, can
be observed with the build-tests.sh. Use plain 'rm -fd'.
Signed-off-by: David Sterba <dsterba@suse.com>
The fsstress tool is a useful file generator, pull it from fstests as
it's not packaged as a standalone tool anywhere and the LTP version is
out of date.
The file has been modified to build, some xfs-specific ioctls are not
supported.
Signed-off-by: David Sterba <dsterba@suse.com>
This is still work in progress but can survive some stress testing.
There are still some sanity checks missing, do not user this on valuable
data. To enables this, configure must be run with the experimental
features enabled.
$ mkfs.btrfs --csum crc32c /dev/sdx
$ <mount, fill with data, unmount>
$ btrfstune --csum sha256
Will change the checksum to sha256.
Implementation:
- set bit on superblock when the checksums are being changed (similar to
the uuid rewrite)
- metadata checksums are overwritten in place
- data checksums:
- the checksum tree is completely deleted and no checksums are
verified
- data blocks are enumerated and all checksums generated (same as
check --init-csum-tree)
To make it usable, it should be restartable and track the current
progress somehow. Also the previous data checksums should be verified
any time they're available.
Signed-off-by: David Sterba <dsterba@suse.com>
There was a workaround for asciidoc/xmlto build because page btrfs.5 and
btrfs.8 used the same intermediate file. To avoid clash for the section
5 page it is part of the name, this is still needed, but for sphinx we
can use the final name as it will get the right suffix. This affects
only the generated manual pages.
Signed-off-by: David Sterba <dsterba@suse.com>