mirror of
https://github.com/ceph/ceph
synced 2025-02-21 18:17:42 +00:00
doc/rbd: refine rbd-exclusive-locks.rst
Refine grammar (mostly semantics) in rbd-exclusive-locks.rst.

Co-authored-by: Ilya Dryomov <idryomov@redhat.com>
Signed-off-by: Zac Dover <zac.dover@gmail.com>
This commit is contained in:
parent
d7d177f5e8
commit
62b0012751
.. index:: Ceph Block Device; RBD exclusive locks; exclusive-lock

Exclusive locks are mechanisms designed to prevent multiple processes from
accessing the same Rados Block Device (RBD) in an uncoordinated fashion.
Exclusive locks are used heavily in virtualization (where they prevent VMs from
clobbering each other's writes) and in RBD mirroring (where they are a
prerequisite for journaling).

By default, exclusive locks are enabled on newly created images. This default
can be overridden via the ``rbd_default_features`` configuration option or the
``--image-feature`` option for ``rbd create``.

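
Image features are stored as a bitmask, and ``rbd_default_features`` is the
integer form of that mask. A minimal sketch of the relationship, using the
public ``RBD_FEATURE_*`` bit values (the default of 61 is an assumption that
can vary between Ceph releases):

```python
# Bit values mirror librbd's public RBD_FEATURE_* constants.
RBD_FEATURE_LAYERING = 1
RBD_FEATURE_EXCLUSIVE_LOCK = 4
RBD_FEATURE_OBJECT_MAP = 8
RBD_FEATURE_FAST_DIFF = 16
RBD_FEATURE_DEEP_FLATTEN = 32

# Assumed default feature set (integer value 61).
DEFAULT_FEATURES = (RBD_FEATURE_LAYERING | RBD_FEATURE_EXCLUSIVE_LOCK
                    | RBD_FEATURE_OBJECT_MAP | RBD_FEATURE_FAST_DIFF
                    | RBD_FEATURE_DEEP_FLATTEN)

def has_feature(mask: int, feature: int) -> bool:
    """Return True if the given feature bit is set in the mask."""
    return bool(mask & feature)

# Opting out of exclusive locking clears that bit; on a real cluster this
# also requires dropping the features that depend on it.
no_lock = DEFAULT_FEATURES & ~(RBD_FEATURE_EXCLUSIVE_LOCK
                               | RBD_FEATURE_OBJECT_MAP
                               | RBD_FEATURE_FAST_DIFF)
```

At creation time this corresponds to passing an explicit feature list, for
example ``rbd create --image-feature layering ...``.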
.. note::
   Many image features, including ``object-map`` and ``fast-diff``, depend upon
   exclusive locking. Disabling the ``exclusive-lock`` feature will negatively
   affect the performance of some operations.

In order to ensure proper exclusive locking operations, any client using an RBD
image whose ``exclusive-lock`` feature is enabled must have a CephX identity
whose capabilities include ``profile rbd``.

Exclusive locking is mostly transparent to the user.

#. … enabled, it obtains an exclusive lock on the image before the first
   write.

#. Whenever any such client process terminates gracefully, the process
   relinquishes the lock automatically.

#. This graceful termination enables another, subsequent, process to acquire
   the lock and to write to the image.

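
The lifecycle in the steps above can be illustrated with a toy model. This is
plain Python: the classes and names are invented for illustration and are not
the librbd API, and transparent lock transitions between live clients are
omitted.

```python
class ToyImage:
    """Invented stand-in for an RBD image with exclusive locking enabled."""
    def __init__(self):
        self.lock_owner = None

class ToyClient:
    """Invented stand-in for a client process using the image."""
    def __init__(self, name, image):
        self.name, self.image = name, image

    def write(self, data):
        owner = self.image.lock_owner
        if owner is None:
            self.image.lock_owner = self.name   # lock acquired on first write
        elif owner != self.name:
            raise RuntimeError(f"{owner} holds the exclusive lock")
        return f"{self.name} wrote {data!r}"

    def close(self):
        # Graceful termination relinquishes the lock automatically.
        if self.image.lock_owner == self.name:
            self.image.lock_owner = None

img = ToyImage()
a, b = ToyClient("client.a", img), ToyClient("client.b", img)
a.write("x")     # client.a acquires the lock on its first write
a.close()        # graceful exit releases the lock
b.write("y")     # a subsequent process can now acquire it
```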
.. note::
   It is possible for two or more concurrently running processes to open the
   image and to read from it. The client acquires the exclusive lock only when
   attempting to write to the image. To disable transparent lock transitions
   between multiple clients, the client must acquire the lock by using the
   ``RBD_LOCK_MODE_EXCLUSIVE`` flag.

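
To make the note concrete, here is a toy contrast between the default
transparent transitions and a lock held explicitly. Only the constant's name
mirrors librbd's ``RBD_LOCK_MODE_EXCLUSIVE``; the class and its behavior are
an invented sketch, not the librbd API.

```python
RBD_LOCK_MODE_EXCLUSIVE = "exclusive"   # stand-in for librbd's constant

class ToyLock:
    """Invented illustration of lock transitions, not the librbd API."""
    def __init__(self):
        self.owner = None
        self.mode = None

    def acquire(self, client, mode=None):
        if self.owner is None or self.mode != RBD_LOCK_MODE_EXCLUSIVE:
            # Default behavior: the lock transitions transparently
            # to the requesting client.
            self.owner, self.mode = client, mode
            return True
        return False   # explicitly held: the current holder keeps the lock

transparent = ToyLock()
transparent.acquire("client.a")                      # can still migrate later
pinned = ToyLock()
pinned.acquire("client.a", mode=RBD_LOCK_MODE_EXCLUSIVE)   # pinned to client.a
```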
Blocklisting
============

Sometimes a client process (or, in case of a krbd client, a client node's
kernel) that previously held an exclusive lock on an image does not terminate
gracefully, but dies abruptly. This may be because the client process received
a ``KILL`` or ``ABRT`` signal, or because the client node underwent a hard
reboot or suffered a power failure. In cases like this, the exclusive lock is
never gracefully released. This means that any new process that starts and
attempts to use the device must break the previously held exclusive lock.

However, a process (or kernel thread) may hang or merely lose network
connectivity to the Ceph cluster for some amount of time. In that case,
breaking the lock would be potentially catastrophic: the hung process or
connectivity issue could resolve itself and the original process might then
compete with one that started in the interim, thus accessing RBD data in an
uncoordinated and destructive manner.

In the event that a lock cannot be acquired in the standard graceful manner,
the overtaking process not only breaks the lock but also blocklists the
previous lock holder. This is negotiated between the new client process and the
Ceph Monitor.

* Upon receiving the blocklist request, the monitor instructs the relevant OSDs
  to no longer serve requests from the old client process;
* after the associated OSD map update is complete, the new client can break the
  previously held lock;
* after the new client has acquired the lock, it can commence writing
  to the image.

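
The negotiation above can be sketched as a toy sequence. ``ToyMon`` and
``ToyOSD`` are invented stand-ins for the Ceph Monitor and OSDs, not Ceph
APIs; the sketch only mirrors the ordering of the steps.

```python
class ToyMon:
    """Invented stand-in for the Ceph Monitor in the blocklist negotiation."""
    def __init__(self):
        self.blocklist = set()
        self.osd_map_epoch = 0

    def handle_blocklist_request(self, old_client):
        # The monitor instructs OSDs (via an OSD map update) to stop
        # serving requests from the old client.
        self.blocklist.add(old_client)
        self.osd_map_epoch += 1          # OSD map update completes

class ToyOSD:
    """Invented stand-in for an OSD consulting the monitor's blocklist."""
    def __init__(self, mon):
        self.mon = mon

    def serve(self, client):
        return client not in self.mon.blocklist

mon = ToyMon()
osd = ToyOSD(mon)
lock_owner = "client.dead"                   # holder that died abruptly
mon.handle_blocklist_request(lock_owner)     # new client asks the Mon first
lock_owner = "client.new"                    # then breaks the lock and writes
```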
Blocklisting is thus a form of storage-level resource `fencing`_.

In order for blocklisting to work, the client must have the ``osd
blocklist`` capability. This capability is included in the ``profile
rbd`` capability profile, which should be set generally on all Ceph
:ref:`client identities <user-management>` using RBD.

.. _fencing: https://en.wikipedia.org/wiki/Fencing_(computing)