Merge pull request #29135 from dillaman/wip-40486

doc/rbd: re-organize top-level and add live-migration docs

Reviewed-by: Mykola Golub <mgolub@suse.com>
Mykola Golub 2019-07-22 17:22:38 +03:00 committed by GitHub
commit 978d185cc8
8 changed files with 256 additions and 57 deletions


@ -12,6 +12,11 @@ Synopsis
| **rbd-fuse** [ -p pool ] [-c conffile] *mountpoint* [ *fuse options* ]
Note
====
**rbd-fuse** is not recommended for any production or high performance workloads.
Description
===========


@ -39,20 +39,19 @@ to operate the :ref:`Ceph RADOS Gateway <object-gateway>`, the
Ceph cluster.
.. toctree::
:maxdepth: 1
Commands <rados-rbd-cmds>
Kernel Modules <rbd-ko>
Snapshots<rbd-snapshot>
Mirroring <rbd-mirroring>
Persistent Cache <rbd-persistent-cache>
LIO iSCSI Gateway <iscsi-overview>
QEMU <qemu-rbd>
libvirt <libvirt>
librbd Settings <rbd-config-ref/>
OpenStack <rbd-openstack>
CloudStack <rbd-cloudstack>
RBD Replay <rbd-replay>
Basic Commands <rados-rbd-cmds>
.. toctree::
:maxdepth: 2
Operations <rbd-operations>
.. toctree::
:maxdepth: 2
Integrations <rbd-integrations>
.. toctree::
:maxdepth: 2


@ -9,8 +9,8 @@
rbd-fuse <../../man/8/rbd-fuse>
rbd-nbd <../../man/8/rbd-nbd>
rbd-ggate <../../man/8/rbd-ggate>
rbd-map <../../man/8/rbdmap>
ceph-rbdnamer <../../man/8/ceph-rbdnamer>
rbd-replay-prep <../../man/8/rbd-replay-prep>
rbd-replay <../../man/8/rbd-replay>
rbd-replay-many <../../man/8/rbd-replay-many>
rbd-map <../../man/8/rbdmap>


@ -1,6 +1,6 @@
=======================
Block Device Commands
=======================
=============================
Basic Block Device Commands
=============================
.. index:: Ceph Block Device; image management


@ -1,5 +1,5 @@
=======================
librbd Settings
Config Settings
=======================
See `Block Device`_ for additional details.
@ -9,7 +9,7 @@ Cache Settings
.. sidebar:: Kernel Caching
The kernel driver for Ceph block devices can use the Linux page cache to
improve performance.
The user space implementation of the Ceph block device (i.e., ``librbd``) cannot
@ -19,33 +19,36 @@ disk caching. When the OS sends a barrier or a flush request, all dirty data is
written to the OSDs. This means that using write-back caching is just as safe as
using a well-behaved physical hard disk with a VM that properly sends flushes
(i.e. Linux kernel >= 2.6.32). The cache uses a Least Recently Used (LRU)
algorithm, and in write-back mode it can coalesce contiguous requests for
better throughput.
.. versionadded:: 0.46
The librbd cache is enabled by default and supports three different cache
policies: write-around, write-back, and write-through. Writes return
immediately under both the write-around and write-back policies, unless there
are more than ``rbd cache max dirty`` unwritten bytes to the storage cluster.
The write-around policy differs from the write-back policy in that it does
not attempt to service read requests from the cache, and is therefore faster
for high performance write workloads. Under the
write-through policy, writes return only when the data is on disk on all
replicas, but reads may come from the cache.
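As an illustrative sketch only (the byte values below are examples, not
recommendations), the cache policy and limits could be set in the ``[client]``
section of ``ceph.conf``, subject to the constraints listed with each setting
below::

    [client]
        rbd cache = true
        rbd cache policy = writeback              # writearound | writeback | writethrough
        rbd cache size = 67108864                 # 64 MiB
        rbd cache max dirty = 50331648            # 48 MiB, must be < rbd cache size
        rbd cache target dirty = 33554432         # 32 MiB, must be < rbd cache max dirty
        rbd cache max dirty age = 2.0
        rbd cache writethrough until flush = true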
Ceph supports write-back caching for RBD. To enable it, add ``rbd cache =
true`` to the ``[client]`` section of your ``ceph.conf`` file. By default
``librbd`` does not perform any caching. Writes and reads go directly to the
storage cluster, and writes return only when the data is on disk on all
replicas. With caching enabled, writes return immediately, unless there are more
than ``rbd cache max dirty`` unflushed bytes. In this case, the write triggers
writeback and blocks until enough bytes are flushed.
Prior to receiving a flush request, the cache behaves like a write-through cache
to ensure safe operation for older operating systems that do not send flushes to
ensure crash consistent behavior.
.. versionadded:: 0.47
If the librbd cache is disabled, writes and
reads go directly to the storage cluster, and writes return only when the data
is on disk on all replicas.
.. note::
The cache is in memory on the client, and each RBD image has
its own. Since the cache is local to the client, there's no coherency
if there are others accessing the image. Running GFS or OCFS on top of
RBD will not work with caching enabled.
Ceph supports write-through caching for RBD. You can set the size of
the cache, and you can set targets and limits to switch from
write-back caching to write-through caching. To enable write-through
mode, set ``rbd cache max dirty`` to 0. This means writes return only
when the data is on disk on all replicas, but reads may come from the
cache. The cache is in memory on the client, and each RBD image has
its own. Since the cache is local to the client, there's no coherency
if there are others accessing the image. Running GFS or OCFS on top of
RBD will not work with caching enabled.
The ``ceph.conf`` file settings for RBD should be set in the ``[client]``
section of your configuration file. The settings include:
``rbd cache``
@ -56,12 +59,30 @@ section of your configuration file. The settings include:
:Default: ``true``
``rbd cache policy``
:Description: Select the caching policy for librbd.
:Type: Enum
:Required: No
:Default: ``writearound``
:Values: ``writearound``, ``writeback``, ``writethrough``
``rbd cache writethrough until flush``
:Description: Start out in write-through mode, and switch to write-back after the first flush request is received. Enabling this is a conservative but safe setting in case VMs running on rbd are too old to send flushes, like the virtio driver in Linux before 2.6.32.
:Type: Boolean
:Required: No
:Default: ``true``
``rbd cache size``
:Description: The RBD cache size in bytes.
:Type: 64-bit Integer
:Required: No
:Default: ``32 MiB``
:Policies: write-back and write-through
``rbd cache max dirty``
@ -71,6 +92,7 @@ section of your configuration file. The settings include:
:Required: No
:Constraint: Must be less than ``rbd cache size``.
:Default: ``24 MiB``
:Policies: write-around and write-back
``rbd cache target dirty``
@ -80,6 +102,7 @@ section of your configuration file. The settings include:
:Required: No
:Constraint: Must be less than ``rbd cache max dirty``.
:Default: ``16 MiB``
:Policies: write-back
``rbd cache max dirty age``
@ -88,15 +111,8 @@ section of your configuration file. The settings include:
:Type: Float
:Required: No
:Default: ``1.0``
:Policies: write-back
.. versionadded:: 0.60
``rbd cache writethrough until flush``
:Description: Start out in write-through mode, and switch to write-back after the first flush request is received. Enabling this is a conservative but safe setting in case VMs running on rbd are too old to send flushes, like the virtio driver in Linux before 2.6.32.
:Type: Boolean
:Required: No
:Default: ``true``
.. _Block Device: ../../rbd
@ -104,12 +120,10 @@ section of your configuration file. The settings include:
Read-ahead Settings
=======================
.. versionadded:: 0.86
RBD supports read-ahead/prefetching to optimize small, sequential reads.
librbd supports read-ahead/prefetching to optimize small, sequential reads.
This should normally be handled by the guest OS in the case of a VM,
but boot loaders may not issue efficient reads.
Read-ahead is automatically disabled if caching is disabled.
but boot loaders may not issue efficient reads. Read-ahead is automatically
disabled if caching is disabled or if the policy is write-around.
``rbd readahead trigger requests``
@ -136,8 +150,8 @@ Read-ahead is automatically disabled if caching is disabled.
:Default: ``50 MiB``
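For illustration, a sketch of a ``ceph.conf`` ``[client]`` entry tuning
read-ahead (the values are arbitrary examples; ``rbd readahead max bytes`` and
``rbd readahead disable after bytes`` are assumed to be the companion settings
elided from this hunk)::

    [client]
        rbd readahead trigger requests = 10           # sequential requests needed before read-ahead kicks in
        rbd readahead max bytes = 524288              # 512 KiB per read-ahead request; 0 disables read-ahead
        rbd readahead disable after bytes = 52428800  # stop read-ahead after the first 50 MiB read from an image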
RBD Features
============
Image Features
==============
RBD supports advanced features which can be specified via the command line when creating images. The default features can be specified via the Ceph config file with ``rbd_default_features = <sum of feature numeric values>`` or ``rbd_default_features = <comma-delimited list of CLI values>``.
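For example, a sketch of the two equivalent forms in ``ceph.conf`` (the numeric
sum assumes the usual feature bit values: layering=1, exclusive-lock=4,
object-map=8, fast-diff=16, deep-flatten=32)::

    [global]
        # sum of feature bits ...
        rbd_default_features = 61
        # ... or, equivalently, the same set by CLI name
        #rbd_default_features = layering,exclusive-lock,object-map,fast-diff,deep-flatten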
@ -233,10 +247,10 @@ RBD supports advanced features which can be specified via the command line when
:KRBD support: no
RBD QOS Settings
================
QOS Settings
============
RBD supports limiting per image IO, controlled by the following
librbd supports limiting per image IO, controlled by the following
settings.
``rbd qos iops limit``
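As a hedged sketch, the limit could be applied for all images opened by a
client via ``ceph.conf``, or overridden per image (the per-image form assumes
the ``rbd config image set`` command and a hypothetical ``rbd/myimage`` image;
the underscore form is the same option)::

    # ceph.conf: throttle every image opened by this client to 500 IOPS
    [client]
        rbd qos iops limit = 500

    # per-image override
    $ rbd config image set rbd/myimage rbd_qos_iops_limit 500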


@ -0,0 +1,13 @@
=========================================
Ceph Block Device 3rd Party Integration
=========================================
.. toctree::
:maxdepth: 1
Kernel Modules <rbd-ko>
QEMU <qemu-rbd>
libvirt <libvirt>
OpenStack <rbd-openstack>
CloudStack <rbd-cloudstack>
LIO iSCSI Gateway <iscsi-overview>


@ -0,0 +1,155 @@
======================
Image Live-Migration
======================
.. index:: Ceph Block Device; live-migration
RBD images can be live-migrated between different pools within the same cluster
or between different image formats and layouts. When started, the source image
will be deep-copied to the destination image, pulling all snapshot history and
optionally keeping any link to the source image's parent to help preserve
sparseness.
This copy process can safely run in the background while the new target image is
in use. There is currently a requirement to temporarily stop using the source
image before preparing a migration. This helps to ensure that the client using
the image is updated to point to the new target image.
.. note::
Image live-migration requires the Ceph Nautilus release or later. The krbd
kernel module does not support live-migration at this time.
.. ditaa:: +-------------+               +-------------+
           | {s} c999    |               | {s}         |
           |  Live       | Target refers |  Live       |
           |  migration  |<-------------*|  migration  |
           |  source     |  to Source    |  target     |
           |             |               |             |
           | (read only) |               | (writable)  |
           +-------------+               +-------------+

               Source                        Target
The live-migration process consists of three steps, sketched end-to-end as a command sequence after this list:
#. **Prepare Migration:** The initial step creates the new target image and
cross-links the source and target images. Similar to `layered images`_,
attempts to read uninitialized extents within the target image will
internally redirect the read to the source image, and writes to
uninitialized extents within the target will internally deep-copy the
overlapping source image block to the target image.
#. **Execute Migration:** This is a background operation that deep-copies all
initialized blocks from the source image to the target. This step can be
run while clients are actively using the new target image.
#. **Finish Migration:** Once the background migration process has completed,
the migration can be committed or aborted. Committing the migration will
remove the cross-links between the source and target images, and will
remove the source image. Aborting the migration will remove the cross-links,
and will remove the target image.
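Taken together, a migration is a short sequence of ``rbd`` commands, each of
which is detailed in the sections below::

    $ rbd migration prepare migration_source migration_target
    $ rbd migration execute migration_target
    $ rbd migration commit migration_target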
Prepare Migration
=================
The live-migration process is initiated by running the `rbd migration prepare`
command, providing the source and target images::
$ rbd migration prepare migration_source [migration_target]
The `rbd migration prepare` command accepts all the same optional layout
arguments as the `rbd create` command, which allows changes to the immutable image on-disk
layout. The `migration_target` can be skipped if the goal is only to change the
on-disk layout, keeping the original image name.
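As a sketch, a prepare step that also changes the on-disk layout might look
like the following (``--data-pool`` and ``--object-size`` are standard
`rbd create` layout options; the ``ec_data_pool`` name and ``8M`` value are
illustrative assumptions)::

    $ rbd migration prepare migration_source migration_target \
          --data-pool ec_data_pool --object-size 8M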
All clients using the source image must be stopped prior to preparing a
live-migration. The prepare step will fail if it finds any running clients with
the image open in read/write mode. Once the prepare step is complete, the
clients can be restarted using the new target image name. Attempting to restart
the clients using the source image name will result in failure.
The `rbd status` command will show the current state of the live-migration::
$ rbd status migration_target
Watchers: none
Migration:
source: rbd/migration_source (5e2cba2f62e)
destination: rbd/migration_target (5e2ed95ed806)
state: prepared
Note that the source image will be moved to the RBD trash to avoid mistaken
usage during the migration process::
$ rbd info migration_source
rbd: error opening image migration_source: (2) No such file or directory
$ rbd trash ls --all
5e2cba2f62e migration_source
Execute Migration
=================
After preparing the live-migration, the image blocks from the source image
must be copied to the target image. This is accomplished by running the
`rbd migration execute` command::
$ rbd migration execute migration_target
Image migration: 100% complete...done.
The `rbd status` command will also provide feedback on the progress of the
migration block deep-copy process::
$ rbd status migration_target
Watchers:
watcher=1.2.3.4:0/3695551461 client.123 cookie=123
Migration:
source: rbd/migration_source (5e2cba2f62e)
destination: rbd/migration_target (5e2ed95ed806)
state: executing (32% complete)
Commit Migration
================
Once the live-migration has completed deep-copying all data blocks from the
source image to the target, the migration can be committed::
$ rbd status migration_target
Watchers: none
Migration:
source: rbd/migration_source (5e2cba2f62e)
destination: rbd/migration_target (5e2ed95ed806)
state: executed
$ rbd migration commit migration_target
Commit image migration: 100% complete...done.
If the `migration_source` image is a parent of one or more clones, the `--force`
option will need to be specified after ensuring all descendant clone images are
not in use.
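In that case the commit becomes (a sketch, assuming the descendant clones have
been confirmed closed)::

    $ rbd migration commit migration_target --force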
Committing the live-migration will remove the cross-links between the source
and target images, and will remove the source image::
$ rbd trash list --all
Abort Migration
===============
If you wish to revert the prepare or execute step, run the `rbd migration abort`
command to cancel the migration process::
$ rbd migration abort migration_target
Abort image migration: 100% complete...done.
Aborting the migration will result in the target image being deleted and access
to the original source image being restored::
$ rbd ls
migration_source
.. _layered images: ../rbd-snapshot/#layering


@ -0,0 +1,13 @@
==============================
Ceph Block Device Operations
==============================
.. toctree::
:maxdepth: 1
Snapshots<rbd-snapshot>
Mirroring <rbd-mirroring>
Live-Migration <rbd-live-migration>
Persistent Cache <rbd-persistent-cache>
Config Settings (librbd) <rbd-config-ref/>
RBD Replay <rbd-replay>