30 Commits

Author SHA1 Message Date
Loic Dachary
eb968f886e Merge pull request #10135 from david-z/wip-enhance-ceph-disk-bluestore
ceph-disk: support creating block.db and block.wal with customized size for bluestore

Reviewed-by: Loic Dachary <ldachary@redhat.com>
2016-09-22 18:13:00 +02:00
Zhi Zhang
4c1cd4ab17 ceph-disk: support creating block.db and block.wal with customized size for bluestore
Signed-off-by: Zhi Zhang <zhangz.david@outlook.com>
2016-09-20 11:43:18 +08:00
fiskn
781ae3bbda udev: add krbd readahead placeholder
Signed-off-by: Nick Fisk <nick@fisk.me.uk>
2016-09-16 14:43:31 +01:00
Loic Dachary
35004a628b udev: always populate /dev/disk/by-parttypeuuid
ceph-disk activate-all walks /dev/disk/by-parttypeuuid at boot time. This
walk is not necessary when udev fires an ADD event for each partition and
95-ceph-osd.rules gets a chance to activate a ceph disk or journal.

There are various reasons why udev ADD events may not be fired at
boot (for instance Debian Jessie 8.5 never does it and CentOS 7.2 seems
to be racy in that regard when an LVM root is being used).

Populating /dev/disk/by-parttypeuuid fixes ceph-disk activate-all, which
would not work without it. It also guarantees disks are activated at boot
time regardless of whether udev fires ADD events at the right time (or at
all).

The new udev file is a partial resurrection of the
60-ceph-partuuid-workaround-rules that was removed by
9f77244b8e0782921663e52005b725cca58a8753. It is given a name that
reflects its new purpose.

Fixes http://tracker.ceph.com/issues/16351

Signed-off-by: Loic Dachary <loic@dachary.org>
2016-06-23 09:37:05 +02:00
Nathan Cutler
c86e0021d6 rpm: drop udev/95-ceph-osd-alt.rules
This udev rules file was needed on older RHEL platforms, which are
unsupported as of jewel.

Signed-off-by: Nathan Cutler <ncutler@suse.com>
2016-05-06 13:44:15 +02:00
Sage Weil
9f77244b8e udev: remove 60-ceph-partuuid-workaround-rules
These were added to get /dev/disk/by-partuuid/ symlinks to work on
wheezy.  They are no longer needed for the supported distros (el7+,
jessie+, trusty+), and they apparently break dm by opening devices they
should not.

Fixes: http://tracker.ceph.com/issues/15516
Signed-off-by: Sage Weil <sage@redhat.com>
2016-04-18 09:16:02 -04:00
Loic Dachary
1ec58fcfc8 ceph-disk: implement lockbox key management
Instead of storing the dmcrypt keys in the /etc/ceph/dmcrypt-keys
directory, they are stored in the monitor. If a machine with
OSDs created with ceph-disk prepare --dmcrypt is lost, it does
not contain the key needed to decrypt their content.

The dmcrypt key is retrieved from the monitor using a different keyring
for each OSD. It is stored in a small partition called the lockbox. At
boot time the lockbox is mounted at

    /var/lib/ceph/osd-lockbox/$uuid

and used when the $uuid partition is detected by udev to map it with
cryptsetup.

The OSDs that were prepared prior to the lockbox implementation are
supported by looking up the key found in /etc/ceph/dmcrypt-keys before
looking in /var/lib/ceph/osd-lockbox/$uuid.

http://tracker.ceph.com/issues/14669 Fixes: #14669

Signed-off-by: Loic Dachary <loic@dachary.org>
2016-03-04 09:13:35 +07:00
Loic Dachary
fbc0984b6e ceph-disk: bluestore trigger
Copy paste the journal code and s/journal/block/

More work will be needed to support multiple auxiliary
devices (block.wal etc). But the goal is to minimize the change because
this commit is part of a series of commits focusing on refactoring
prepare, not the entire ceph-disk codebase.

Signed-off-by: Loic Dachary <loic@dachary.org>
2016-02-04 17:01:46 +07:00
Loic Dachary
cc13fa05fd ceph-disk: fix typos in udev rules
Signed-off-by: Loic Dachary <ldachary@redhat.com>
2015-09-22 08:46:56 +02:00
Loic Dachary
b86d9fd973 ceph-disk: ensure ceph owner on udev change
On udev change the owner of the device switches back to the default. If
that happens on a journal while an OSD is being activated, it will fail
with permission denied.

Make sure all ceph device types are chowned to ceph on udev change.

http://tracker.ceph.com/issues/13000 Fixes: #13000

Signed-off-by: Loic Dachary <ldachary@redhat.com>
2015-09-22 08:46:56 +02:00
Sage Weil
3662a225b8 udev: use ceph-disk trigger ... with single set of udev rules
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 11:22:25 -04:00
Loic Dachary
d4869ac9e4 ceph-disk: add multipath support
A multipath device is detected because there is a
/sys/dev/block/M:m/dm/uuid file with the mpath- prefix (or part\w+-mpath
prefix).
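
The detection rule above can be sketched roughly as follows. This is an
illustration only, not the code from this commit: the function names
is_mpath_uuid and is_mpath, the sys_root parameter, and the exact regular
expression are assumptions.

```python
import re

# Assumed pattern: "mpath-..." for the multipath device itself, or
# "part<something>-mpath-..." for a partition on top of it.
_MPATH_RE = re.compile(r'^(part\w+-)?mpath-')


def is_mpath_uuid(dm_uuid):
    """Return True if a dm/uuid value looks like a multipath device."""
    return bool(_MPATH_RE.match(dm_uuid.strip()))


def is_mpath(major, minor, sys_root='/sys'):
    """Check /sys/dev/block/M:m/dm/uuid for the mpath prefix.

    A missing file means the device is not a device-mapper device,
    so it is reported as not multipath.
    """
    path = '{}/dev/block/{}:{}/dm/uuid'.format(sys_root, major, minor)
    try:
        with open(path) as f:
            return is_mpath_uuid(f.read())
    except OSError:
        return False
```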

When ceph-disk prepares data or journal devices on a multipath device,
it sets the partition typecode to MPATH_JOURNAL_UUID, MPATH_OSD_UUID and
MPATH_TOBE_UUID to

 a) help the udev rules distinguish them from other devices in
    devicemapper
 b) allow ceph-disk to fail if an attempt is made to activate a device
    with this type without accessing it via a multipath device

The 95-ceph-osd.rules call ceph-disk activate on partitions of type
MPATH_JOURNAL_UUID, MPATH_OSD_UUID. It relies on ceph-disk to do nothing
if the device is not accessed via multipath.

http://tracker.ceph.com/issues/11881 Fixes: #11881

Signed-off-by: Loic Dachary <ldachary@redhat.com>
2015-08-29 02:37:52 +02:00
Loic Dachary
42ad86e14e udev: add devicemapper to partuuid-workaround
The dm-* devices are not excluded and will have by-partuuid symlinks
etc. This will include devices managed by multipath as well as
others. Since this only is used on partitions:

  # ignore partitions that span the entire disk
  TEST=="whole_disk", GOTO="persistent_storage_end_two"

It may create symlinks for dm-* devices that are unrelated to Ceph and
we assume this is going to be ok.

Signed-off-by: Loic Dachary <ldachary@redhat.com>
2015-08-29 02:36:56 +02:00
Milan Broz
8bd35bd607 Set Ceph device partitions owner to ceph user in udev.
Signed-off-by: Milan Broz <mbroz@redhat.com>
2015-08-26 20:34:15 -04:00
Sage Weil
69cdfcb15f remove ceph-disk-{activate,prepare} wrappers
These ancient aliases shouldn't be used.  We should have removed them
immediately way back when, honestly.

Signed-off-by: Sage Weil <sage@redhat.com>
2015-08-01 09:58:34 -04:00
David Disseldorp
85a894697e systemd: activate disks via systemd service instead of udev
The udev(7) man page states:
  RUN
  ...
  This can only be used for very short-running foreground tasks. Running
  an event process for a long period of time may block all further
  events for this or a dependent device.

  Starting daemons or other long-running processes is not appropriate
  for udev; the forked processes, detached or not, will be
  unconditionally killed after the event handling has finished.

ceph-disk activate is far from a short-running task:
- check whether path is a block dev, for dirs call through to
  activate_dir()
- call blkid to obtain the filesystem type for the block dev
- pull mount options from hard-coded ceph.conf file
- mount the OSD dev at a temporary path
- check the ceph magic for mounted filesystem
- read cluster uuid and locate corresponding /etc/ceph/{cluster}.conf
  path
- read or generate (if missing) the OSD uuid
- create a file indicating init system usage (systemd)
- mount the device at a second (final) location
- umount (lazy) the temporary mount path
- enable the systemd ceph-osd@{osd_id} service
- start the systemd ceph-osd@{osd_id} service

This logic is therefore best left in a systemd service for execution, as
it is less limited in terms of execution time and also allows for
improved event handling in future (fsck, dmcrypt mapping etc.).

This change sees 95-ceph-osd.rules.systemd trigger ceph-disk activate or
ceph-disk activate-journal via new ceph-disk-activate-journal@.service,
ceph-disk-activate@.service and ceph-disk-dmcrypt-activate@.service
systemd service files.

ceph-disk-dmcrypt-activate@.service makes use of the newly added
--dmcrypt parameter for ceph-disk activate.

Signed-off-by: David Disseldorp <ddiss@suse.de>
2015-08-01 09:58:34 -04:00
Andrew Bartlett
c83a288ab8 Rework ceph-disk to allow LUKS for encrypted partitions
LUKS allows for validation of the key at mount time (rather than
simply mounting a random partition), specification of the encryption
parameters in the header and key rollover of the slot key (the one
that needs to be stored).

New parameters 'osd cryptsetup parameters' and 'osd dmcrypt key size' are
added.  These allow these important policy choices to be overridden or
kept consistent per-site.

The previous default plain mode (rather than using LUKS) remains; select
LUKS by setting 'osd dmcrypt type = luks'.

Signed-off-by: Andrew Bartlett <abartlet@catalyst.net.nz>
2015-01-30 14:34:42 +13:00
Sage Weil
d512dc9edd udev: /dev/disk/by-parttypeuuid/$type-$uuid
We need this to help trigger OSD activations.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-17 09:49:53 -07:00
Sage Weil
bcfd2f31a5 udev: drop useless --mount argument to ceph-disk
It doesn't mean anything anymore; drop it.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-14 14:04:54 -07:00
Sage Weil
a2a78e8d16 ceph-disk: implement 'activate-journal'
Activate an osd via its journal device.  udev populates its symlinks and
triggers events in an order that is not related to whether the device is
an osd data partition or a journal.  That means that triggering
'ceph-disk activate' can happen before the journal (or journal symlink)
is present and then fail.

Similarly, it may be that they are on different disks that are hotplugged
with the journal second.

This can be wired up to the journal partition type to ensure that osds are
started when the journal appears second.

Include the udev rules to trigger this.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-13 18:01:43 -07:00
Sage Weil
d8d7113c35 udev: install disk/by-partuuid rules
Wheezy's udev (175-7.2) has broken rules for the /dev/disk/by-partuuid/
symlinks that ceph-disk relies on.  Install parallel rules that work.  On
new udev, this is harmless; on older udev, this will make life better.

Fixes: #4865
Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
2013-05-16 18:40:29 -07:00
Gary Lowell
446641aa34 95-ceph-osd-alt.rules: Fix missing parent parameter
Signed-off-by: Gary Lowell  <gary.lowell@inktank.com>
2013-04-24 08:22:04 -07:00
Gary Lowell
7ad63d23d7 ceph-disk: OSD hotplug fixes for Centos
Two fixes for Centos 6.3 and other systems with udev versions
prior to 172.  The disk persistent name using the GPT UUID does
not exist, so use the by_path persistent name instead for the
journal symlink.

The gpt label fields are not available for use in udev rules. Add
ceph-disk-udev wrapper script that extracts the partition
type guid from the label and calls ceph-disk-activate if it is
a ceph guid type. (Bug #4632)

Signed-off-by: Gary Lowell  <gary.lowell@inktank.com>
2013-04-22 22:30:39 -07:00
Alexandre Marangone
785b25f53d Fix: use absolute path with udev
Avoids the following: udevd[61613]: failed to execute '/lib/udev/bash'
'bash -c 'while [ ! -e /dev/mapper/....

Signed-off-by: Alexandre Marangone <alexandre.marangone@inktank.com>
2013-04-15 15:57:00 -07:00
Sage Weil
e090a92a20 udev: trigger on dmcrypted osd partitions
Automatically map encrypted journal partitions.

For encrypted OSD partitions, map them, wait for the mapped device to
appear, and then ceph-disk-activate.

This is much simpler than doing the work in ceph-disk-activate.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-02-15 14:18:34 -08:00
Sage Weil
5bd85ee5aa udev: trigger ceph-disk-activate directly from udev
There is no need to depend on upstart for this.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-02-13 22:18:59 -08:00
Pascal de Bruijn | Unilogic Networks B.V
96587f39e3 Robustify ceph-rbdnamer and adapt udev rules
Below is a patch which makes the ceph-rbdnamer script more robust and
fixes a problem with the rbd udev rules.

On our setup we encountered a symlink which was linked to the wrong rbd:

  /dev/rbd/mypool/myrbd -> /dev/rbd1

While that link should have gone to /dev/rbd3 (on which a
partition /dev/rbd3p1 was present).

The old udev rule passes %n to the ceph-rbdnamer script. The problem
with %n is that it results in a value of 3 for rbd3 but in a value of
1 for rbd3p1, so it seems it can't be depended upon for rbd naming.

In the patch below the ceph-rbdnamer script is made more robust and it
can now be called in various ways:

  /usr/bin/ceph-rbdnamer /dev/rbd3
  /usr/bin/ceph-rbdnamer /dev/rbd3p1
  /usr/bin/ceph-rbdnamer rbd3
  /usr/bin/ceph-rbdnamer rbd3p1
  /usr/bin/ceph-rbdnamer 3

Even with all these different styles of calling the modified script, it
should now return the same rbdname. This change "has" to be combined
with calling it from udev with %k though.
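
To illustrate the idea (this is not the patched script itself), the
normalization of the calling styles listed above could look like the sketch
below. The function names and the regular expression are hypothetical, and
kernel_number is only a simplified model of what %n yields.

```python
import re

# Accepts /dev/rbd3, /dev/rbd3p1, rbd3, rbd3p1 and 3; captures the rbd
# device number and ignores an optional partition suffix "pN".
_RBD_RE = re.compile(r'^(?:/dev/)?(?:rbd)?(\d+)(?:p\d+)?$')


def rbd_device_number(arg):
    """Return the rbd device number for any of the accepted spellings."""
    m = _RBD_RE.match(arg)
    if m is None:
        raise ValueError('not an rbd device: {!r}'.format(arg))
    return int(m.group(1))


def kernel_number(kernel_name):
    """Simplified model of %n: the trailing digits of the kernel name."""
    m = re.search(r'(\d+)$', kernel_name)
    return int(m.group(1)) if m else None
```

With this model, kernel_number gives 3 for rbd3 but 1 for rbd3p1, while
rbd_device_number gives 3 for both, which is why the full kernel name (%k)
is the safer input.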

With that fixed, we hit the second problem. We ended up with:

  /dev/rbd/mypool/myrbd -> /dev/rbd3p1

So the rbdname was symlinked to the partition on the rbd instead of the
rbd itself. So what probably went wrong is udev discovering the disk and
running ceph-rbdnamer which resolved it to myrbd so the following
symlink was created:

  /dev/rbd/mypool/myrbd -> /dev/rbd3

However partitions would be discovered next and ceph-rbdnamer would be
run with rbd3p1 (%k) as parameter, resulting in the name myrbd too, with
the previous correct symlink being overwritten with a faulty one:

  /dev/rbd/mypool/myrbd -> /dev/rbd3p1

The solution to the problem is in differentiating between disks and
partitions in udev and handling them slightly differently. So with the
patch below partitions now get their own symlinks in the following style
(which is fairly consistent with other udev rules):

  /dev/rbd/mypool/myrbd-part1 -> /dev/rbd3p1
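
A minimal sketch of that naming scheme, assuming the pool and image names
are already known. The helper rbd_symlink and its parameters are
hypothetical and only mirror the symlink layout shown above; the real
change is made in the udev rules.

```python
import re


def rbd_symlink(pool, image, kernel_name):
    """Build the symlink path for a whole rbd disk or one of its partitions.

    kernel_name is the kernel device name, e.g. "rbd3" or "rbd3p1".
    """
    m = re.match(r'^rbd\d+(?:p(\d+))?$', kernel_name)
    if m is None:
        raise ValueError('not an rbd device: {!r}'.format(kernel_name))
    link = '/dev/rbd/{}/{}'.format(pool, image)
    if m.group(1) is not None:
        # Partitions get their own "-partN" suffix so they never
        # overwrite the symlink that points at the whole disk.
        link += '-part' + m.group(1)
    return link
```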

Please let me know any feedback you have on this patch or the approach
used.

Regards,
Pascal de Bruijn
Unilogic B.V.

Signed-off-by: Pascal de Bruijn <pascal@unilogicnetworks.net>
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2012-07-16 17:34:22 -07:00
Josh Durgin
891025e539 udev: drop device number from name
The device number depends on how many rbd images have been
mapped. Removing it makes the name determined solely by the name,
image, and snapshot that are mapped, for ease of scripting or persistence
across reboots.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-12-08 16:36:47 -08:00
Tommi Virtanen
da8fa7837a udev: c* -> ceph-* rename: missed crbdnamer.
Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
2011-09-23 15:55:01 -07:00
Samuel Just
863ef7c331 debian: add udev rules
Add /lib/udev/rules.d/50-rbd.rules to debian package.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-03-10 16:08:39 -08:00