btrfs-progs: docs: update Zoned mode
Copy remaining text from wiki.

Signed-off-by: David Sterba <dsterba@suse.com>

parent cdb82bf762
commit 33560e2768

@@ -42,7 +42,7 @@ This is inherently incompatible with some features:

Initial support lacks some features but they're planned:

-* only single profile is supported
+* only single (data, metadata) and DUP (metadata) profile is supported
* fstrim - due to dependency on free space cache v1

Super block

@@ -64,3 +64,122 @@ devices is expected to be large, tens of terabytes. Maximum zone size supported
is 8GiB, which would mean that e.g. offset 0-16GiB would be reserved just for
the super block on a hypothetical device of that zone size. This is wasteful
but required to guarantee crash safety.

Devices
^^^^^^^

Real hardware
"""""""""""""

The WD Ultrastar series 600 advertises HM-SMR, i.e. the host-managed zoned
mode. There are two more modes: DM (device managed, no zoned information
exported to the system) and HA (host aware, can be used as a regular disk but
zoned writes improve performance). Not many devices are available at the
moment and information about the exact zoned mode is hard to find, so check
data sheets or community sources that gather information from real devices.

Note: zoned mode won't work with DM-SMR disks.

- Ultrastar® DC ZN540 NVMe ZNS SSD (`product
  brief <https://documents.westerndigital.com/content/dam/doc-library/en_us/assets/public/western-digital/collateral/product-brief/product-brief-ultrastar-dc-zn540.pdf>`__)

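When the data sheet is unclear, the kernel's view of a disk can be checked
directly. The following is a minimal sketch, assuming the block layer exposes
the zoned model in the *queue/zoned* sysfs attribute with the values
*host-managed*, *host-aware* and *none*. The helper names and the
``SYSFS_ROOT`` override are hypothetical and exist only for illustration and
testing.

```bash
#!/usr/bin/env bash

# Print the zoned model the kernel reports for a block device, e.g. "sda".
# SYSFS_ROOT is a hypothetical override so the helper can be tested
# against a fake directory tree; it defaults to the real /sys.
zoned_model() {
    local dev=$1 root=${SYSFS_ROOT:-/sys}
    cat "$root/block/$dev/queue/zoned"
}

# Translate the reported model into a short remark.
describe_zoned_model() {
    case "$1" in
        host-managed) echo "HM: zoned writes are required" ;;
        host-aware)   echo "HA: usable as a regular disk, zoned writes optional" ;;
        none)         echo "no zones exported (regular or DM-SMR disk)" ;;
        *)            echo "unknown model: $1" ;;
    esac
}
```

A typical call would be ``describe_zoned_model "$(zoned_model sda)"``. A
DM-SMR disk hides its zones from the system, so it is expected to show up as
*none*, just like a conventional disk.
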
Emulated: null_blk
""""""""""""""""""

The *null_blk* driver provides a memory-backed device and is suitable for
testing. Setting up the devices has some quirks. The module must be loaded
with *nr_devices=0*, otherwise the numbering of device nodes will be offset.
The *configfs* filesystem must be mounted at */sys/kernel/config*, and the
null_blk devices are administered in */sys/kernel/config/nullb*. The device
nodes are named like */dev/nullb0* and are numbered sequentially. NOTE: the
device name may differ from the name of the directory under
*/sys/kernel/config/nullb*!

Setup:

.. code-block:: bash

   modprobe configfs
   modprobe null_blk nr_devices=0

Create a device *mydev* (assuming no devices have been created before) with a
size of 2048MiB and a zone size of 256MiB. There are more tunable parameters;
this is a minimal example that takes the defaults:

.. code-block:: bash

   cd /sys/kernel/config/nullb/
   mkdir mydev                # the directory name identifies the device in configfs
   cd mydev
   echo 2048 > size           # device size in MiB
   echo 1 > zoned             # expose the device as zoned
   echo 1 > memory_backed     # keep written data in memory
   echo 256 > zone_size       # zone size in MiB
   echo 1 > power             # activate the device

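As a sanity check on the numbers above, the expected number of zones is the
device size divided by the zone size. The sketch below is only illustrative:
``zone_count`` is a hypothetical helper, and rejecting sizes that are not a
whole multiple of the zone size is an assumption made to keep the example
simple, not a statement about what null_blk accepts.

```bash
#!/usr/bin/env bash

# Number of zones for a device; both arguments are in MiB, the same
# units used for the size and zone_size values in the example above.
zone_count() {
    local size_mib=$1 zone_mib=$2
    if (( zone_mib <= 0 || size_mib % zone_mib != 0 )); then
        echo "size must be a whole multiple of the zone size" >&2
        return 1
    fi
    echo $(( size_mib / zone_mib ))
}

zone_count 2048 256   # prints 8
```

On a running system the result can be compared with the value the kernel
reports for the device, for example in */sys/block/nullb0/queue/nr_zones*.
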
This will create a device */dev/nullb0*; the value of the *index* file matches
the trailing number of the device node.

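Because the directory name and the device node name can differ, a script
should derive the node from the *index* file rather than guess it. A minimal
sketch, where ``nullb_devnode`` and the ``CONFIGFS_ROOT`` override are
hypothetical names introduced for illustration and testing:

```bash
#!/usr/bin/env bash

# Resolve the device node of a null_blk device from its configfs
# directory name by reading the index attribute.
# CONFIGFS_ROOT is a hypothetical override; it defaults to the
# standard configfs mount point.
nullb_devnode() {
    local name=$1 root=${CONFIGFS_ROOT:-/sys/kernel/config}
    local index
    index=$(cat "$root/nullb/$name/index") || return 1
    echo "/dev/nullb$index"
}
```

For the device created above, ``nullb_devnode mydev`` would be expected to
print */dev/nullb0*.
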
Remove the device:

.. code-block:: bash

   rmdir /sys/kernel/config/nullb/mydev

While the device exists, continue with *mkfs.btrfs /dev/nullb0*; the zoned
mode is auto-detected.

For convenience, there's a script wrapping the basic null_blk management
operations at https://github.com/kdave/nullb.git. With it, the above commands
become:

.. code-block:: bash

   nullb setup
   nullb create -s 2g -z 256
   mkfs.btrfs /dev/nullb0
   ...
   nullb rm nullb0

Emulated: TCMU runner
"""""""""""""""""""""

TCMU is a framework to emulate SCSI devices in userspace, providing various
backends for the storage, with zoned support as well. A file-backed zoned
device can provide more options for larger storage and zone size. Please follow
the instructions at https://zonedstorage.io/projects/tcmu-runner/ .

Compatibility, incompatibility
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

- the feature sets an incompat bit and requires a new kernel to access the
  filesystem (for both read and write)
- the superblock needs to be handled in a special way: there are still 3 copies
  but at different offsets (0, 512GiB, 4TiB), and the 2 consecutive zones at
  each offset form a ring buffer of superblocks; finding the latest one
  requires reading it from the write pointer or doing a full scan of the zones
- mixing zoned and non-zoned devices is possible (zones are emulated) but is
  recommended only for testing
- mixing zoned devices with different zone sizes is not possible
- zone sizes must be a power of two; zone sizes of real devices are e.g. 256MiB
  or 1GiB, larger sizes are expected, and the maximum zone size supported by
  btrfs is 8GiB

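The zone size constraints and the superblock placement from the list above can
be illustrated with a little arithmetic. This is a sketch under stated
assumptions: each superblock copy is taken to start at the zone that contains
its byte offset, and the helper names are hypothetical. It is not the kernel
implementation.

```bash
#!/usr/bin/env bash

GiB=$(( 1024 * 1024 * 1024 ))

# Succeed if the zone size (in bytes) is a power of two and no larger
# than the 8GiB maximum mentioned in the list above.
zone_size_ok() {
    local n=$1
    (( n > 0 && (n & (n - 1)) == 0 && n <= 8 * GiB ))
}

# Zone number of the first of the two consecutive superblock zones,
# assuming the copy starts at the zone containing its byte offset.
sb_first_zone() {
    local offset_bytes=$1 zone_size_bytes=$2
    echo $(( offset_bytes / zone_size_bytes ))
}

zone_size=$(( 256 * 1024 * 1024 ))   # 256MiB, as on some real devices
if zone_size_ok "$zone_size"; then
    # Offsets of the three copies: 0, 512GiB and 4TiB.
    for offset in 0 $(( 512 * GiB )) $(( 4096 * GiB )); do
        first=$(sb_first_zone "$offset" "$zone_size")
        echo "offset $offset: zones $first and $(( first + 1 ))"
    done
fi
```

With 256MiB zones the three copies land in zones 0, 2048 and 16384, each
followed by its second ring-buffer zone. A device smaller than 512GiB simply
has no room for the second and third copy.
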
Status, stability, reporting bugs
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Zoned mode was released in kernel 5.12, and there are still some rough edges
and corner cases one can hit during testing. Please report bugs to
https://github.com/naota/linux/issues/ .

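Before testing, it can help to confirm that the running kernel is new enough.
The following is a minimal sketch, assuming a ``sort`` that supports version
ordering with ``-V`` (as GNU coreutils does); ``kernel_at_least`` is a
hypothetical helper name.

```bash
#!/usr/bin/env bash

# Succeed if the kernel release in $2 (e.g. the output of uname -r) is at
# least the version in $1. Any suffix after the first "-" in the release
# string is ignored.
kernel_at_least() {
    local want=$1 have=${2%%-*}
    local lowest
    lowest=$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n1)
    [ "$lowest" = "$want" ]
}
```

A typical use would be
``kernel_at_least 5.12 "$(uname -r)" || echo "kernel too old for zoned mode"``.
Distribution kernels may backport features, so the version number is only a
first hint.
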
References
^^^^^^^^^^

- https://zonedstorage.io

  - https://zonedstorage.io/projects/libzbc/ -- *libzbc* is a library and set
    of tools to directly manipulate devices with ZBC/ZAC support
  - https://zonedstorage.io/projects/libzbd/ -- *libzbd* uses the kernel
    provided zoned block device interface based on the ioctl() system calls

- https://hddscan.com/blog/2020/hdd-wd-smr.html -- some details about exact device types
- https://lwn.net/Articles/853308/ -- *Btrfs on zoned block devices*
- https://www.usenix.org/conference/vault20/presentation/bjorling -- Zone
  Append: A New Way of Writing to Zoned Storage