doc: update ceph-volume lvm prepare

Add option to pass raw physical devices everywhere, restructure a little
(bluestore section before filestore) and reword a few things.

Signed-off-by: Jan Fajerski <jfajerski@suse.com>
Jan Fajerski 2019-11-05 13:39:27 +01:00
parent 01a603f6e9
commit f2018d76eb


@ -26,6 +26,75 @@ the back end can be specified with:
* :ref:`--filestore <ceph-volume-lvm-prepare_filestore>`
* :ref:`--bluestore <ceph-volume-lvm-prepare_bluestore>`
.. _ceph-volume-lvm-prepare_bluestore:
``bluestore``
-------------
The :term:`bluestore` objectstore is the default for new OSDs. It offers a bit
more flexibility for devices compared to :term:`filestore`.
Bluestore supports the following configurations:
* A block device, a block.wal, and a block.db device
* A block device and a block.wal device
* A block device and a block.db device
* A single block device
The bluestore subcommand accepts physical block devices, partitions on
physical block devices or logical volumes as arguments for the various device
parameters. If a physical device is provided, a logical volume will be created.
A volume group will either be created or reused if its name begins with ``ceph``.
This allows a simpler approach to using LVM but at the cost of flexibility:
there are no options or configurations to change how the LV is created.
The ``block`` is specified with the ``--data`` flag, and in its simplest use
case it looks like::
ceph-volume lvm prepare --bluestore --data vg/lv
A raw device can be specified in the same way::
ceph-volume lvm prepare --bluestore --data /path/to/device
For enabling :ref:`encryption <ceph-volume-lvm-encryption>`, the ``--dmcrypt`` flag is required::
ceph-volume lvm prepare --bluestore --dmcrypt --data vg/lv
If a ``block.db`` or a ``block.wal`` is needed (both are optional for
bluestore), they can be specified with ``--block.db`` and ``--block.wal``,
respectively. Each can be a physical device, a partition or a logical volume.
For both ``block.db`` and ``block.wal``, partitions are not turned into
logical volumes because they can be used as-is.
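For illustration, a call that keeps the data on one logical volume while
placing ``block.db`` on a partition and ``block.wal`` on another logical
volume could look like this (the device and volume names are only placeholders)::
ceph-volume lvm prepare --bluestore --data vg/lv --block.db /dev/sdb1 --block.wal vg/wal-lv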
While creating the OSD directory, the process will use a ``tmpfs`` mount to
place all the files needed for the OSD. These files are initially created by
``ceph-osd --mkfs`` and are fully ephemeral.
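Whether the OSD directory is indeed backed by ``tmpfs`` can be checked with
standard tooling once the OSD has been prepared; for example, assuming the
default cluster name and an OSD id of 0::
findmnt /var/lib/ceph/osd/ceph-0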
A symlink is always created for the ``block`` device, and optionally for
``block.db`` and ``block.wal``. For a cluster with a default name, and an OSD
id of 0, the directory could look like::
# ls -l /var/lib/ceph/osd/ceph-0
lrwxrwxrwx. 1 ceph ceph 93 Oct 20 13:05 block -> /dev/ceph-be2b6fbd-bcf2-4c51-b35d-a35a162a02f0/osd-block-25cf0a05-2bc6-44ef-9137-79d65bd7ad62
lrwxrwxrwx. 1 ceph ceph 93 Oct 20 13:05 block.db -> /dev/sda1
lrwxrwxrwx. 1 ceph ceph 93 Oct 20 13:05 block.wal -> /dev/ceph/osd-wal-0
-rw-------. 1 ceph ceph 37 Oct 20 13:05 ceph_fsid
-rw-------. 1 ceph ceph 37 Oct 20 13:05 fsid
-rw-------. 1 ceph ceph 55 Oct 20 13:05 keyring
-rw-------. 1 ceph ceph 6 Oct 20 13:05 ready
-rw-------. 1 ceph ceph 10 Oct 20 13:05 type
-rw-------. 1 ceph ceph 2 Oct 20 13:05 whoami
In the above case, a device was used for ``block`` so ``ceph-volume`` created
a volume group and a logical volume using the following convention:
* volume group name: ``ceph-{cluster fsid}`` or, if the vg exists already,
``ceph-{random uuid}``
* logical volume name: ``osd-block-{osd_fsid}``
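The volume group and logical volume that were created can be confirmed with
the LVM tooling itself; a minimal, illustrative query (assuming the LVM2
utilities are installed)::
lvs -o lv_name,vg_name,lv_size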
.. _ceph-volume-lvm-prepare_filestore:
``filestore``
@ -33,41 +102,47 @@ the back end can be specified with:
This is the OSD backend that allows preparation of logical volumes for
a :term:`filestore` objectstore OSD.
It can use a logical volume for the OSD data and a physical device, a partition
or logical volume for the journal. A physical device will have a logical volume
created on it. A volume group will either be created or reused if its name
begins with ``ceph``. No special preparation is needed for these volumes other
than following the minimum size requirements for data and journal.
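Whether a candidate device or partition is large enough can be checked up
front with e.g. ``lsblk``; the device name below is only an example::
lsblk -o NAME,SIZE,TYPE /dev/sdc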
The CLI call for a basic standalone filestore OSD looks like this::
ceph-volume lvm prepare --filestore --data <data block device>
To deploy filestore with an external journal::
ceph-volume lvm prepare --filestore --data <data block device> --journal <journal block device>
For enabling :ref:`encryption <ceph-volume-lvm-encryption>`, the ``--dmcrypt`` flag is required::
ceph-volume lvm prepare --filestore --dmcrypt --data <data block device> --journal <journal block device>
Both the journal and data block device can take three forms:
* a physical block device
* a partition on a physical block device
* a logical volume
When using logical volumes the value *must* be of the format
``volume_group/logical_volume``. Since logical volume names are not enforced
for uniqueness, this prevents accidentally choosing the wrong volume.
When using a partition, it *must* contain a ``PARTUUID`` that can be
discovered by ``blkid``. This ensures it can later be identified correctly
regardless of the device name (or path).
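Whether a partition carries a ``PARTUUID`` can be checked directly with
``blkid``; ``/dev/sdc1`` below is only an example device::
blkid -s PARTUUID -o value /dev/sdc1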
For example: passing a logical volume for data and a partition ``/dev/sdc1`` for
the journal::
ceph-volume lvm prepare --filestore --data volume_group/lv_name --journal /dev/sdc1
Passing a bare device for data and a logical volume as the journal::
ceph-volume lvm prepare --filestore --data /dev/sdc --journal volume_group/journal_lv
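Since raw devices are converted to logical volumes automatically, both
arguments may also be plain devices; the paths below are purely illustrative::
ceph-volume lvm prepare --filestore --data /dev/sdc --journal /dev/sdd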
A generated uuid is used to ask the cluster for a new OSD. These two pieces are
crucial for identifying an OSD and will later be used throughout the
@ -166,72 +241,6 @@ can be started later (for detailed metadata description see
:ref:`ceph-volume-lvm-tags`).
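The metadata referred to above is stored as LVM tags on the prepared volumes
and can be inspected with the LVM tooling; a minimal, illustrative query::
lvs -o lv_name,lv_tags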
Crush device class
------------------
@ -300,9 +309,8 @@ Summary
-------
To recap the ``prepare`` process for :term:`bluestore`:
#. Accept raw physical devices, partitions on physical devices or logical volumes as arguments.
#. Create logical volumes on any raw physical devices.
#. Generate a UUID for the OSD
#. Ask the monitor for an OSD ID, reusing the generated UUID
#. OSD data directory is created on a tmpfs mount.
@ -314,7 +322,7 @@ To recap the ``prepare`` process for :term:`bluestore`:
And the ``prepare`` process for :term:`filestore`:
#. Accept raw physical devices, partitions on physical devices or logical volumes as arguments.
#. Generate a UUID for the OSD
#. Ask the monitor for an OSD ID, reusing the generated UUID
#. OSD data directory is created and data volume mounted