Implement the KeyValueDB interface using libkinetic_client,
and allow it to be configured as the backend for the KeyValueStore,
running the entire OSD on it.
This prototype implementation has no transaction safety, and is
only suitable as a proof of concept. Since the libkinetic_client
API does not provide reverse iteration over keys without also reading
the value off disk, it implements iterators in a very slow but correct way.
These are used heavily by the KeyValueDB callers, so this is a bottleneck
in performance.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Add libkrbd libtool convenience library to provide an interface for
mapping and unmapping rbd images programmatically. This will be used
by the rbd binary itself and the librbd_fsx testing tool.
libkrbd takes care of the kernel module stuff (common/module.h) and
makes use of libudev to be able to properly wait for block device
creation and deletion and tell which block device got assigned by the
kernel to the newly created mapping.
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
strerror_r is not portable; on Gnu libc it returns char * and sometimes
does not fill in the supplied buffer. Use autoconf to test which
version this platform uses and adapt.
Clean up the random calls to strerror and strerror_r (along with all
their private little one-use buffers) and regularize the code to use
cpp_strerror almost everywhere. Where changed, any negation of the
error code is also removed, since cpp_strerror() will do that.
Note: some tools were using their own calls to strerror/strerror_r, so
will now get a (%d) in their output that wasn't there before; hence
the change to test/cli/monmaptool/print-nonexistent.t
Fixes: #8041
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Rename SIMD to INTEL for clarity.
Instead of agregating all flags in INTEL_FLAGS, create individual flags
for each feature (INTEL_SSE2_FLAGS etc.) for finer control in the
makefiles.
Signed-off-by: Loic Dachary <loic@dachary.org>
For each SSE feature supported by the compiler
* add the corresponding -msse* flag
* define HAVE_SSE*
Remove AX_EXT because it decides based on the CPU capabilities of the
machine compiling the binary which may or may not be the one running
them.
Signed-off-by: Loic Dachary <loic@dachary.org>
When configured with --without-libxfs, use GenericFileStoreBackend
instead of XfsFileStoreBackend for XFS. At this point this would only
impact the allocation hint op. The default is to compile with
--with-libxfs. (Previously it was unconditionally enabled on linux and
disabled for non-linux arches.)
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Introduce XfsFileStoreBackend class, currently the only filestore
backend implementing SETALLOCHINT op. This commit adds a build-time
dependency on libxfs as xfs-specific ioctl (XFS_IOC_FSSETXATTR /
XFS_XFLAG_EXTSIZE) is used to implement the new set_alloc_hint()
method.
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Currently CEPH_HAVE_SETPIPE_SZ is not set even if F_SETPIPE_SZ is
available, because AC_COMPILE_IFELSE test program as written always
fails to compile. F_SETPIPE_SZ is a macro, so use AC_EGREP_CPP which
works on the preprocessor output instead of trying to compile.
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Working around missing integer types is pretty easy. For example, the
__u32 family are Linux-specific types, and using these in Ceph
internally is fine because we can typedef them.
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
Adds a ceph_spinlock_t implementation that will use pthread_spinlock_t
if available, and otherwise reverts to pthread_mutex_t. Note that this
spinlock is not intended to be used in process-shared memory.
Switches implementation in:
ceph_context
SimpleMessenger
atomic_t
Only ceph_context initialized its spinlock with PTHREAD_PROCESS_SHARED.
However, there does not appear to be any instance in which CephContext
is allocated in shared memory, and thus can use the default private
memory space behavior.
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
get_linux_version() returns a version of the currently running kernel,
encoded as in int, and is contained in common/linux_version.[ch].
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Break up AC_CHECK_HEADERS macro into one header-file per line so it's
easier to read and make changes.
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Currently the way 'rbd unmap' translates a user-provided block device
into an rbd id is it matches the major number of the specified device
against /sys/bus/rbd/devices/<id>/major for each rbd mapping and
declares success on the first match. This works for both entire disks
and partitions, because under the current device number allocation
scheme, each mapping means a new major number.
In preparation for support for single-major device number allocation
scheme, which would require matching both major and minor numbers, make
sure to always match against entire disk device numbers, by converting
the specified device major:minor pair into wholdedisk major:minor pair.
To achive that, use the libblkid library, which accomplishes this goal
by walking stable sysfs structures.
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Checking for fdatasync uses the same approach as the qemu configure
script. The relevant commit is d1722a27f552a22561104210e0afad4577878e53.
Here is a copy of the commit message which explains the check:
Under Darwin, a symbol exists for the fdatasync() function, so that our
link test succeeds. However _POSIX_SYNCHRONIZED_IO is set to '-1'.
According to POSIX:2008, a value of -1 means the feature is not
supported.
A value of 0 means supported at compilation time, and a value greater 0
means supported at both compilation and run time.
Enable fdatasync() only if _POSIX_SYNCHRONIZED_IO is '>0'.
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>