ceph-disk: use blkid instead of sgdisk -i

sgdisk -i 1 /dev/vdb opens /dev/vdb in write mode, which indirectly
triggers a BLKRRPART ioctl from udev (version 214 and later) when
the device is closed (see the udev release note below). The kernel's
implementation of this ioctl (even in old kernels) removes
all partitions and adds them again (similar to what partprobe does
explicitly).

The side effects of partitions disappearing while ceph-disk is running
are devastating.

sgdisk is replaced by blkid, which only opens the device in read mode
and therefore does not trigger the partition rescan.

The problem does not show on Ubuntu 14.04 because it runs udev < 214,
but it does show on CentOS 7, which runs udev > 214.

git clone git://anonscm.debian.org/pkg-systemd/systemd.git
systemd/NEWS:
CHANGES WITH 214:

        * As an experimental feature, udev now tries to lock the
          disk device node (flock(LOCK_SH|LOCK_NB)) while it
          executes events for the disk or any of its partitions.
          Applications like partitioning programs can lock the
          disk device node (flock(LOCK_EX)) and claim temporary
          device ownership that way; udev will entirely skip all event
          handling for this disk and its partitions. If the disk
          was opened for writing, the close will trigger a partition
          table rescan in udev's "watch" facility, and if needed
          synthesize "change" events for the disk and all its partitions.
          This is now unconditionally enabled, and if it turns out to
          cause major problems, we might turn it on only for specific
          devices, or might need to disable it entirely. Device Mapper
          devices are excluded from this logic.
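
The locking protocol described in the release note can be sketched in Python. This is only an illustration of flock(LOCK_EX) on a device node, not what this commit does (the commit avoids the write-mode open instead). The helper name exclusive_disk_lock is hypothetical.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def exclusive_disk_lock(path):
    """Hold flock(LOCK_EX) on `path` for the duration of the with-block.

    Hypothetical helper: per the release note, udev (>= 214) takes
    flock(LOCK_SH|LOCK_NB) on the disk node and skips event handling
    when it cannot get the lock.
    """
    # Open read-only: per the release note, only a device that was
    # opened for writing triggers the partition table rescan on close.
    fd = os.open(path, os.O_RDONLY)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)
        yield fd
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)
```

A caller would wrap its partition-inspection work in `with exclusive_disk_lock('/dev/vdb'):`. While the block runs, a non-blocking shared lock attempt from another open of the same node fails.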

http://tracker.ceph.com/issues/14094 Fixes: #14094

Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
Signed-off-by: Loic Dachary <loic@dachary.org>
Loic Dachary 2015-12-18 17:03:21 +01:00
parent fe71647bc9
commit 9dce05a8cd

@@ -3076,19 +3076,29 @@ def split_dev_base_partnum(dev):
     return (base, partnum)
 def get_partition_type(part):
-    return get_sgdisk_partition_info(part, 'Partition GUID code: (\S+)')
+    return get_blkid_partition_info(part, 'ID_PART_ENTRY_TYPE')
 def get_partition_uuid(part):
-    return get_sgdisk_partition_info(part, 'Partition unique GUID: (\S+)')
+    return get_blkid_partition_info(part, 'ID_PART_ENTRY_UUID')
-def get_sgdisk_partition_info(dev, regexp):
-    (base, partnum) = split_dev_base_partnum(dev)
-    out, _, _ = command(['sgdisk', '-i', partnum, base])
+def get_blkid_partition_info(dev, what=None):
+    out, _, _ = command(
+        [
+            'blkid',
+            '-o',
+            'udev',
+            '-p',
+            dev,
+        ]
+    )
+    p = {}
     for line in out.splitlines():
-        m = re.match(regexp, line)
-        if m:
-            return m.group(1).lower()
-    return None
+        (key, value) = line.split('=')
+        p[key] = value
+    if what:
+        return p.get(what)
+    else:
+        return p
 def more_osd_info(path, uuid_map, desc):
     desc['ceph_fsid'] = get_oneliner(path, 'ceph_fsid')
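
The KEY=value parsing done by get_blkid_partition_info can be tried without a block device. The sketch below is a standalone variant, not the code in this commit. The name parse_blkid_udev and the sample text are assumptions for illustration. It splits on the first '=' only, so unlike line.split('='), a value containing '=' or an empty line would not raise ValueError.

```python
def parse_blkid_udev(output):
    """Parse KEY=value lines, as printed by `blkid -o udev -p`, into a dict."""
    info = {}
    for line in output.splitlines():
        # partition() splits on the first '=' only, so the value may
        # itself contain '=' characters.
        key, sep, value = line.partition('=')
        if sep:  # skip blank lines and lines without '='
            info[key] = value
    return info


# Illustrative text only: the UUIDs are placeholders, not captured output.
SAMPLE = """\
ID_PART_ENTRY_SCHEME=gpt
ID_PART_ENTRY_UUID=00000000-0000-0000-0000-000000000001
ID_PART_ENTRY_TYPE=00000000-0000-0000-0000-000000000002
ID_PART_ENTRY_NUMBER=1
"""

info = parse_blkid_udev(SAMPLE)
partition_type = info.get('ID_PART_ENTRY_TYPE')
```

Looking a key up with `info.get(...)` mirrors the `what` argument in the diff: a missing key yields None instead of an exception.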