The zlib and zstd compression methods support compression levels.
Enable defrag to pass them to the kernel.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Daniel Vacek <neelx@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When compression is null the code always goes through the LZO case,
or prints "lzo support not compiled in".
This bug was added by commit c6d24a363d ("btrfs-progs: mkfs: add lzo
to --compress option").
Pull-request: #967
Signed-off-by: Wang Mingyu <wangmy@fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The alias should be for btrfs_util_subvolume_snapshot instead of
btrfs_util_snapshot_snapshot.
Pull-request: #969
Signed-off-by: Junxuan Liao <ljx@cs.wisc.edu>
Signed-off-by: David Sterba <dsterba@suse.com>
Replace the use of IOW as abbreviation of "In other words" with the
original expanded meaning, as users who don't have English as their
first language may not know what it means.
Pull-request: #966
Signed-off-by: David Sterba <dsterba@suse.com>
When one of the two zones composing a DUP block group is a conventional zone,
we have zone_info[i]->alloc_offset = WP_CONVENTIONAL. That will, of course,
not match the write pointer of the other zone, so loading that block group fails.
This commit solves the issue by properly recovering the emulated write pointer
from the last allocated extent. The offset for SINGLE, DUP, and RAID1 is
straightforward: it is the same as the end of the last allocated extent. RAID0
and RAID10 are a bit trickier because we need to do the striping math.
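As a rough illustration of the striping math for RAID0, the sketch below derives a per-stripe emulated write pointer from the logical end of the last allocated extent. The function name raid0_emulated_wp() and the fixed 64KiB STRIPE_LEN are assumptions made for this example, not the code added by this commit, and RAID10 is not covered.

```c
#include <stdint.h>

/* Assumed stripe length for this sketch (64KiB). */
#define STRIPE_LEN (64 * 1024ULL)

/*
 * Hypothetical helper: given last_alloc, the number of bytes allocated
 * logically from the start of a RAID0 block group, return how far the
 * stripe with the given index has been written on its own device.
 */
static inline uint64_t raid0_emulated_wp(uint64_t last_alloc,
					 unsigned int num_stripes,
					 unsigned int index)
{
	uint64_t stripe_nr = last_alloc / STRIPE_LEN;      /* whole stripe units written */
	uint64_t stripe_off = last_alloc % STRIPE_LEN;     /* bytes in the partial unit */
	uint64_t full_rows = stripe_nr / num_stripes;      /* rows written on every device */
	unsigned int cur = stripe_nr % num_stripes;        /* device holding the partial unit */
	uint64_t offset = full_rows * STRIPE_LEN;

	if (index < cur)
		offset += STRIPE_LEN;   /* this device already got its unit in the current row */
	else if (index == cur)
		offset += stripe_off;   /* this device holds the partially written unit */
	return offset;
}
```

Summing the result over all stripe indexes gives back last_alloc, which is a quick sanity check for the arithmetic.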
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This is the same as the kernel side.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Implement it just like the kernel side.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Implement it just like the kernel side.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
DUP support is added like the kernel side.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Currently, the userland tool only considers the SINGLE profile, which makes it
fail when a DUP block group is created over one conventional zone and one
sequential write required zone.
Before adding support for the other profiles, let's factor out the per-profile
code (actually, SINGLE only) into functions, just like the kernel side.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Now that we have zone capacity and (basic) zone activeness support, it's time
to factor out btrfs_load_zone_info() the same way as the kernel side.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Introduce the "zone_is_active" member to struct btrfs_block_group and activate
it on loading a block group.
Note that the activeness check for extent allocation is currently not
implemented. Activeness checking requires activating a non-active block group
on extent allocation, which also requires finishing a zone when the active
zone limit is hit. Since mkfs should not hit the limit, implementing the zone
finishing code is not necessary at the moment.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Properly load the zone activeness in the userland tool. Also, check that a
device has a high enough active zone limit to run btrfs.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The userland tools did not load and use the zone capacity. Support it properly.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This is a userland-side update to follow kernel-side commit 15c12fcc50a1
("btrfs: zoned: introduce a zone_info struct in
btrfs_load_block_group_zone_info"). This will make the code unification easier.
This commit introduces the zone_info structure to hold per-zone information in
btrfs_load_block_group_zone_info.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Introduce min_not_zero() macro from the kernel.
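For readers unfamiliar with it, min_not_zero() returns the smaller of two values while treating zero as "unset". The kernel version is a type-generic macro; the sketch below is a plain function fixed to uint64_t, and the name min_not_zero_u64() is made up for this illustration.

```c
#include <stdint.h>

/*
 * Illustrative equivalent of min_not_zero() for uint64_t:
 * return the minimum of x and y that is not zero; if both are
 * zero the result is zero.
 */
static inline uint64_t min_not_zero_u64(uint64_t x, uint64_t y)
{
	if (x == 0)
		return y;
	if (y == 0)
		return x;
	return x < y ? x : y;
}
```

This is handy when combining limits where zero means "no limit", for example when picking the effective value from two optional per-device limits.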
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Some tests in mkfs or misc require the experimental build. Enable it on
the development workflows. Build coverage with or without experimental
build is covered by the CI image build tests.
Signed-off-by: David Sterba <dsterba@suse.com>
Commit 5bd97022f3 ("btrfs-progs: check: add support for squota")
allocates a ulist node, but this leaks on any error before the final
accounting. As it's freed right after that, we can use an on-stack variable
instead.
This was reported by 067-btrfstune-simple-quota with ASAN and experimental
features enabled.
Signed-off-by: David Sterba <dsterba@suse.com>
When a subvolume is deleted with the recursive option, any nested
subvolumes also get removed without reporting it. Update the subvolume
delete command to print the list of nested subvolumes.
Issue: #923
Signed-off-by: Sidong Yang <realwakka@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Fix an option typo in the mkfs help (`mkfs.btrfs -h`) introduced in the
most recent public release: `defalut-ro` instead of `default-ro`.
Pull-request: #958
Signed-off-by: David Sterba <dsterba@suse.com>
Add a "duration" format in seconds to fmt_print which will convert the
input to one of the following strings:
1. if the number of seconds represents more than one day, the output will be
for example: "1 days 01:30:00" (the plural is kept so parsing the string
back is easier)
2. if less than a day: "23:30:10"
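The conversion described above can be sketched as follows. This is a standalone illustration, not the fmt_print code itself: the helper name format_duration() is hypothetical, and treating exactly one day (86400 seconds) as the "days" case is an assumption.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: write "D days HH:MM:SS" into buf when the
 * duration is at least one day (assumed boundary), else "HH:MM:SS".
 * Returns what snprintf() returns.
 */
static inline int format_duration(char *buf, size_t size, uint64_t seconds)
{
	const uint64_t day = 24 * 60 * 60;
	unsigned long long days = seconds / day;
	unsigned int hours = (unsigned int)((seconds % day) / 3600);
	unsigned int minutes = (unsigned int)((seconds % 3600) / 60);
	unsigned int secs = (unsigned int)(seconds % 60);

	if (days > 0)
		return snprintf(buf, size, "%llu days %02u:%02u:%02u",
				days, hours, minutes, secs);
	return snprintf(buf, size, "%02u:%02u:%02u", hours, minutes, secs);
}
```

With this sketch, 91800 seconds formats as "1 days 01:30:00" and 84610 seconds as "23:30:10", matching the two examples in the list.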
Author: Racz Zoltan <racz.zoli@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
In the v6.14 kernel release, btrfs will force direct IO to fall back to
buffered IO if the inode requires a data checksum.
This causes a small performance drop but solves the false data checksum
mismatch problem caused by direct IO.
Although the change is small for most end users, for those requiring
zero-copy direct IO this is a behavior change, and it requires a proper
documentation update.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Commit 83bd999c3e ("libbtrfsutil: python: use version from primary
VERSION file") still does not fix all packaging problems. The release
build does not see the file.
Signed-off-by: David Sterba <dsterba@suse.com>
The version file of the python subpackage had to have the version set
manually in setup.py due to the out-of-tree build where it was not
possible to access the file VERSION. Manual updates were error prone.
Improve that by adding a separate file template that is finalized with
the version during the configure phase. Then it's included in setup.py as
it's in the same directory.
There are two exceptions when the file is not required to run setup.py:
- clean - allow running 'make clean' in partially configured directory
- (no arguments) - show the help and commands
In all other cases the file version.py must exist.
Signed-off-by: David Sterba <dsterba@suse.com>
Remove boilerplate text, reduce comments and keep only the links for
future reference. Add branch 'master' for the on-push event. Remove
scheduled scans; they are run frequently enough on 'devel' push.
Signed-off-by: David Sterba <dsterba@suse.com>
ASAN reports a memory leak when zlib is used. The missing part is
deflateEnd(), which frees the structures allocated by deflateInit(). Add it
to all exit paths.
Signed-off-by: David Sterba <dsterba@suse.com>
Create a source for --rootdir and then use it for mkfs with compression.
Try a few levels, nothing special.
Signed-off-by: David Sterba <dsterba@suse.com>
For quick checks before a push of non-code changes we may want to run
only the spellchecking workflow. Any pushed branch matching the prefix
"codespell/" will be picked up by this.
Signed-off-by: David Sterba <dsterba@suse.com>
The templates need to be renamed, so make it more suggestive by
prepending the expected name of the script, test.sh. Also fix the
permissions to 755 so this is not missed later, as the script would
otherwise not be executed.
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
There is a bug report that when deleting a device using sysfs
/sys/block/<dev>/device/delete, the kernel module will still try to read
and write the device.
Normally it's fine as long as all chunks can tolerate that removed
device (e.g. all RAID1).
But the problem is when one is trying to lower the redundancy by
converting to another profile:
# mkfs.btrfs -f -m raid1 -d raid1 /dev/sdd /dev/sde
# mount /dev/sdd /mnt
# echo 1 > /sys/block/sde/device/delete
# btrfs balance start --force -mdup -dsingle /mnt
This will lead to the filesystem being mounted RO, with the following error messages:
sd 6:0:0:0: [sde] Synchronizing SCSI cache
ata7.00: Entering standby power mode
btrfs: attempt to access beyond end of device
sde: rw=6145, sector=21696, nr_sectors = 32 limit=0
btrfs: attempt to access beyond end of device
sde: rw=6145, sector=21728, nr_sectors = 32 limit=0
btrfs: attempt to access beyond end of device
sde: rw=6145, sector=21760, nr_sectors = 32 limit=0
BTRFS error (device sdd): bdev /dev/sde errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
BTRFS error (device sdd): bdev /dev/sde errs: wr 2, rd 0, flush 0, corrupt 0, gen 0
BTRFS error (device sdd): bdev /dev/sde errs: wr 3, rd 0, flush 0, corrupt 0, gen 0
BTRFS error (device sdd): bdev /dev/sde errs: wr 3, rd 0, flush 1, corrupt 0, gen 0
btrfs: attempt to access beyond end of device
sde: rw=145409, sector=128, nr_sectors = 8 limit=0
BTRFS warning (device sdd): lost super block write due to IO error on /dev/sde (-5)
BTRFS error (device sdd): bdev /dev/sde errs: wr 4, rd 0, flush 1, corrupt 0, gen 0
btrfs: attempt to access beyond end of device
sde: rw=14337, sector=131072, nr_sectors = 8 limit=0
BTRFS warning (device sdd): lost super block write due to IO error on /dev/sde (-5)
BTRFS error (device sdd): bdev /dev/sde errs: wr 5, rd 0, flush 1, corrupt 0, gen 0
BTRFS error (device sdd): error writing primary super block to device 2
BTRFS info (device sdd): balance: start -dconvert=single -mconvert=dup -sconvert=dup
BTRFS info (device sdd): relocating block group 1372585984 flags data|raid1
BTRFS error (device sdd): bdev /dev/sde errs: wr 5, rd 0, flush 2, corrupt 0, gen 0
BTRFS warning (device sdd): chunk 2446327808 missing 1 devices, max tolerance is 0 for writable mount
BTRFS: error (device sdd) in write_all_supers:4044: errno=-5 IO failure (errors while submitting device barriers.)
BTRFS info (device sdd state E): forced readonly
BTRFS warning (device sdd state E): Skipping commit of aborted transaction.
BTRFS error (device sdd state EA): Transaction aborted (error -5)
BTRFS: error (device sdd state EA) in cleanup_transaction:2017: errno=-5 IO failure
BTRFS info (device sdd state EA): balance: ended with status: -5
[CAUSE]
Btrfs doesn't have any runtime device error handling; it fully relies on
the extra copy provided.
For the sysfs block device removal, normally there is a device shutdown
callback to the running fs, but unfortunately btrfs doesn't support this
callback yet.
Thus even with that device removed, btrfs will still access that
removed device (both read and write, even if they will fail).
Normally for a full RAID1 btrfs, it will still be fine reading/writing the
fs as usual. The proper action is to replace the
removed/missing/failing device with a new one using `btrfs device
replace`.
But when doing the convert, btrfs will allocate new metadata chunks
onto the removed device (which will lose all writes).
And since the new metadata profile is DUP, which cannot handle any
missing device of that metadata chunk, it eventually triggers the final
protection at transaction commit time and flips the filesystem to RO,
before causing any real data loss.
[DOC ENHANCEMENT]
Add a warning to the `convert` filter about the danger of converting
to a lower redundancy profile when there is a known failing/removed
device.
Also mention the proper way to handle such a failing/missing device.
The root fix is to introduce a failing/removed device detection for
btrfs, but that will be a pretty big feature and will take quite some
time before landing it upstream.
Link: https://lore.kernel.org/linux-btrfs/2cb1d81e-12a8-4fb1-b3fc-e7e83d31e059@siddall.name/
Reported-by: Jeff Siddall <news@siddall.name>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Just like all qgroup functions, if a qgroup is marked inconsistent, limit
will not work as expected. In fact, with recent kernels, limit and
qgroup number updating will be fully skipped if the qgroup is already
inconsistent.
Add an extra note about this to the `btrfs qgroup limit` subcommand.
Link: https://bugzilla.suse.com/show_bug.cgi?id=1235765
Reported-by: Vojtech Lacina <vlacina@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Follow the kernel by setting the BIG_METADATA incompat flag if nodesize
is greater than the page size.
This flag was introduced with commit 727011e07cbdf8 ("Btrfs: allow
metadata blocks larger than the page size") in 2010, as kernels before
2.6.36 would crash due to a buggy page cache implementation.
The flag has no real meaning anymore but we can at least set it at mkfs
time.
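The rule can be sketched as below. The flag value matches the on-disk format definition of BTRFS_FEATURE_INCOMPAT_BIG_METADATA, but the helper name mkfs_incompat_flags() and its parameters are assumptions for illustration and not the actual mkfs code.

```c
#include <stdint.h>

/* Incompat feature bit for metadata blocks larger than the page size. */
#define BTRFS_FEATURE_INCOMPAT_BIG_METADATA (1ULL << 5)

/*
 * Hypothetical helper: return the incompat flags with BIG_METADATA
 * added when nodesize is greater than the page size.
 */
static inline uint64_t mkfs_incompat_flags(uint64_t flags, uint32_t nodesize,
					   uint32_t page_size)
{
	if (nodesize > page_size)
		flags |= BTRFS_FEATURE_INCOMPAT_BIG_METADATA;
	return flags;
}
```

With the default 16KiB nodesize, the flag would be set on a system with 4KiB pages but not on one with 64KiB pages. A caller could obtain the page size from sysconf(_SC_PAGESIZE).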
Signed-off-by: Mark Harmstone <maharmstone@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The kernel adds a zeroed btrfs_dev_stats_item for each device on the
first mount. Preempt this by doing it at mkfs time.
Signed-off-by: Mark Harmstone <maharmstone@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The kernel commit 08fe4db170b419 ("Btrfs: Fix uninitialized root flags
for subvolumes") from 2011 sets the flag BTRFS_INODE_ROOT_ITEM_INIT on
root items, to work around a bug where flags and byte_limit weren't
being set.
Copy this behaviour in mkfs, to prevent the kernel from having to do it
on the first mount. We memset the btrfs_root_item, so there's no
corruption issue as there once was. We already do this in
btrfs_make_subvolume(), as otherwise the readonly flag of any subvolumes
would get reset.
Signed-off-by: Mark Harmstone <maharmstone@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The references used in :doc: need to use the relative path of the
document, otherwise this leads to a warning:
btrfs-progs/Documentation/Send-receive.rst:19: WARNING: unknown document: 'dev-send-stream' [ref.doc]
btrfs-progs/Documentation/dev/CmdLineConventions.rst:60: WARNING: unknown document: 'btrfstune' [ref.doc]
Signed-off-by: David Sterba <dsterba@suse.com>
Add sections for each directory so it's more visible where the sections
start and end. Update formatting, enhance descriptions.
Signed-off-by: David Sterba <dsterba@suse.com>