Detect leveldb, but do not let autoconf blindly link it with everything on the
planet.
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Signed-off-by: Sage Weil <sage@redhat.com>
Implement the KeyValueDB interface using libkinetic_client,
and allow it to be configured as the backend for the KeyValueStore,
running the entire OSD on it.
This prototype implementation has no transaction safety, and is
only suitable as a proof of concept. Since the libkinetic_client
API does not provide reverse iteration over keys without also reading
the value off disk, it implements iterators in a very slow but correct way.
These are used heavily by the KeyValueDB callers, so this is a bottleneck
in performance.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Add libkrbd libtool convenience library to provide an interface for
mapping and unmapping rbd images programmatically. This will be used
by the rbd binary itself and the librbd_fsx testing tool.
libkrbd takes care of the kernel module stuff (common/module.h) and
makes use of libudev to be able to properly wait for block device
creation and deletion and tell which block device got assigned by the
kernel to the newly created mapping.
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
strerror_r is not portable; on GNU libc it returns char * and sometimes
does not fill in the supplied buffer. Use autoconf to test which
version this platform uses and adapt.
Clean up the random calls to strerror and strerror_r (along with all
their private little one-use buffers) and regularize the code to use
cpp_strerror almost everywhere. Where changed, any negation of the
error code is also removed, since cpp_strerror() will do that.
Note: some tools were using their own calls to strerror/strerror_r, so
will now get a (%d) in their output that wasn't there before; hence
the change to test/cli/monmaptool/print-nonexistent.t
Fixes: #8041
Signed-off-by: Dan Mick <dan.mick@inktank.com>
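A minimal sketch of the strerror_r adaptation described above. It is not the cpp_strerror implementation: instead of a configure-time macro it uses C++ overload resolution to accept either return type, and the names pick_message and sketch_strerror are hypothetical. The "(%d)" prefix and the handling of negated codes follow the description in the commit message.

```cpp
#include <cassert>
#include <climits>
#include <cstdio>
#include <cstring>
#include <string>

// XSI flavour: strerror_r() returns int (0 on success) and fills buf.
inline const char *pick_message(int rc, const char *buf) {
    return rc == 0 ? buf : "Unknown error";
}

// GNU flavour: strerror_r() returns the message, which may not be buf.
inline const char *pick_message(const char *msg, const char *) {
    return msg;
}

// Hypothetical stand-in for cpp_strerror(): accepts err or -err and
// prefixes the numeric code to the message text.
std::string sketch_strerror(int err) {
    assert(err != INT_MIN);  // negating INT_MIN would overflow
    if (err < 0)
        err = -err;
    char buf[128];
    buf[0] = '\0';
    const char *msg = pick_message(strerror_r(err, buf, sizeof(buf)), buf);
    char out[192];
    std::snprintf(out, sizeof(out), "(%d) %s", err, msg);
    return out;
}
```

Callers can pass a raw errno value or a negated return code; both produce the same string.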
Rename SIMD to INTEL for clarity.
Instead of aggregating all flags in INTEL_FLAGS, create individual flags
for each feature (INTEL_SSE2_FLAGS etc.) for finer control in the
makefiles.
Signed-off-by: Loic Dachary <loic@dachary.org>
For each SSE feature supported by the compiler
* add the corresponding -msse* flag
* define HAVE_SSE*
Remove AX_EXT because it decides based on the CPU capabilities of the
machine compiling the binaries, which may or may not be the one running
them.
Signed-off-by: Loic Dachary <loic@dachary.org>
When configured with --without-libxfs, use GenericFileStoreBackend
instead of XfsFileStoreBackend for XFS. At this point this would only
impact the allocation hint op. The default is to compile with
--with-libxfs. (Previously it was unconditionally enabled on linux and
disabled for non-linux arches.)
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Introduce XfsFileStoreBackend class, currently the only filestore
backend implementing SETALLOCHINT op. This commit adds a build-time
dependency on libxfs as xfs-specific ioctl (XFS_IOC_FSSETXATTR /
XFS_XFLAG_EXTSIZE) is used to implement the new set_alloc_hint()
method.
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Currently CEPH_HAVE_SETPIPE_SZ is not set even if F_SETPIPE_SZ is
available, because AC_COMPILE_IFELSE test program as written always
fails to compile. F_SETPIPE_SZ is a macro, so use AC_EGREP_CPP which
works on the preprocessor output instead of trying to compile.
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
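Because F_SETPIPE_SZ is a preprocessor macro, its presence can also be tested directly in source. The sketch below shows that idea only; it does not use CEPH_HAVE_SETPIPE_SZ, and try_set_pipe_size is a hypothetical helper name.

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>

// Try to resize a pipe. Returns the resulting capacity in bytes, or -1
// when F_SETPIPE_SZ is not defined or the kernel refuses the request.
int try_set_pipe_size(int pipe_fd, int bytes) {
    assert(bytes > 0);
#ifdef F_SETPIPE_SZ
    return fcntl(pipe_fd, F_SETPIPE_SZ, bytes);
#else
    (void)pipe_fd;
    return -1;
#endif
}
```

On Linux the kernel may round the requested size up, so the return value can be larger than the argument.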
Working around missing integer types is pretty easy. For example, the
__u32 family are Linux-specific types, and using these in Ceph
internally is fine because we can typedef them.
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
Adds a ceph_spinlock_t implementation that will use pthread_spinlock_t
if available, and otherwise reverts to pthread_mutex_t. Note that this
spinlock is not intended to be used in process-shared memory.
Switches implementation in:
ceph_context
SimpleMessenger
atomic_t
Only ceph_context initialized its spinlock with PTHREAD_PROCESS_SHARED.
However, there does not appear to be any instance in which CephContext
is allocated in shared memory, and thus can use the default private
memory space behavior.
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
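The spinlock fallback described above could look roughly like this. It is a sketch, not the ceph_spinlock_t code: HAVE_PTHREAD_SPINLOCK is an assumed configure macro name, and every sketch_* identifier is hypothetical. The spinlock branch uses PTHREAD_PROCESS_PRIVATE, matching the note that process-shared use is not intended.

```cpp
#include <cassert>
#include <pthread.h>

#ifdef HAVE_PTHREAD_SPINLOCK  // assumed macro name, set by configure
typedef pthread_spinlock_t sketch_spinlock_t;
inline int sketch_spin_init(sketch_spinlock_t *l) {
    return pthread_spin_init(l, PTHREAD_PROCESS_PRIVATE);
}
inline int sketch_spin_destroy(sketch_spinlock_t *l) { return pthread_spin_destroy(l); }
inline int sketch_spin_lock(sketch_spinlock_t *l) { return pthread_spin_lock(l); }
inline int sketch_spin_unlock(sketch_spinlock_t *l) { return pthread_spin_unlock(l); }
#else
// Fallback: a plain mutex with the same interface.
typedef pthread_mutex_t sketch_spinlock_t;
inline int sketch_spin_init(sketch_spinlock_t *l) { return pthread_mutex_init(l, nullptr); }
inline int sketch_spin_destroy(sketch_spinlock_t *l) { return pthread_mutex_destroy(l); }
inline int sketch_spin_lock(sketch_spinlock_t *l) { return pthread_mutex_lock(l); }
inline int sketch_spin_unlock(sketch_spinlock_t *l) { return pthread_mutex_unlock(l); }
#endif

// Scoped guard so callers cannot forget to unlock.
struct sketch_spin_guard {
    sketch_spinlock_t *lock;
    explicit sketch_spin_guard(sketch_spinlock_t *l) : lock(l) {
        int rc = sketch_spin_lock(lock);
        assert(rc == 0);
        (void)rc;
    }
    ~sketch_spin_guard() { sketch_spin_unlock(lock); }
    sketch_spin_guard(const sketch_spin_guard &) = delete;
    sketch_spin_guard &operator=(const sketch_spin_guard &) = delete;
};
```

Callers see one type and four functions regardless of which branch was compiled.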
get_linux_version() returns a version of the currently running kernel,
encoded as an int, and is contained in common/linux_version.[ch].
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
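The commit message does not spell out the integer encoding. The sketch below assumes the classic kernel KERNEL_VERSION(a, b, c) layout and parses the release string from uname(); encode_version, parse_release and sketch_linux_version are hypothetical names, not the functions in common/linux_version.[ch].

```cpp
#include <cassert>
#include <cstdio>
#include <sys/utsname.h>

// Assumed encoding: (major << 16) + (minor << 8) + patch.
inline int encode_version(int major, int minor, int patch) {
    assert(major >= 0 && minor >= 0 && patch >= 0);
    return (major << 16) + (minor << 8) + patch;
}

// Parse a release string such as "3.10.0-123.el7".
// Returns 0 when no major.minor pair can be read.
int parse_release(const char *release) {
    int major = 0, minor = 0, patch = 0;
    int n = std::sscanf(release, "%d.%d.%d", &major, &minor, &patch);
    if (n < 2 || major < 0 || minor < 0 || patch < 0)
        return 0;
    return encode_version(major, minor, patch);
}

// Version of the running kernel, or 0 if it cannot be determined.
int sketch_linux_version() {
    struct utsname u;
    if (uname(&u) != 0)
        return 0;
    return parse_release(u.release);
}
```

With an integer encoding, feature checks become plain comparisons such as sketch_linux_version() >= encode_version(3, 10, 0).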
Break up AC_CHECK_HEADERS macro into one header-file per line so it's
easier to read and make changes.
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Currently the way 'rbd unmap' translates a user-provided block device
into an rbd id is it matches the major number of the specified device
against /sys/bus/rbd/devices/<id>/major for each rbd mapping and
declares success on the first match. This works for both entire disks
and partitions, because under the current device number allocation
scheme, each mapping means a new major number.
In preparation for support for single-major device number allocation
scheme, which would require matching both major and minor numbers, make
sure to always match against entire disk device numbers, by converting
the specified device major:minor pair into whole-disk major:minor pair.
To achieve that, use the libblkid library, which accomplishes this goal
by walking stable sysfs structures.
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Checking for fdatasync uses the same approach as the qemu configure
script. The relevant commit is d1722a27f552a22561104210e0afad4577878e53.
Here is a copy of the commit message which explains the check:
Under Darwin, a symbol exists for the fdatasync() function, so that our
link test succeeds. However _POSIX_SYNCHRONIZED_IO is set to '-1'.
According to POSIX:2008, a value of -1 means the feature is not
supported.
A value of 0 means supported at compilation time, and a value greater 0
means supported at both compilation and run time.
Enable fdatasync() only if _POSIX_SYNCHRONIZED_IO is '>0'.
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
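The _POSIX_SYNCHRONIZED_IO rule quoted above can also be expressed as a preprocessor check in source. This is a sketch of that rule only, not the configure test itself; sketch_flush and write_durably are hypothetical helpers, and falling back to fsync() is an assumption about what a caller would want.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <unistd.h>

// Flush file data, using fdatasync() only where POSIX reports support.
int sketch_flush(int fd) {
    assert(fd >= 0);
#if defined(_POSIX_SYNCHRONIZED_IO) && _POSIX_SYNCHRONIZED_IO > 0
    return fdatasync(fd);
#else
    return fsync(fd);  // assumed fallback when fdatasync() is unsupported
#endif
}

// Write a buffer to path and flush it. Returns 0 on success, -1 on error.
int write_durably(const char *path, const void *data, size_t len) {
    assert(path != nullptr && data != nullptr);
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    ssize_t n = write(fd, data, len);
    int rc = (n == static_cast<ssize_t>(len)) ? sketch_flush(fd) : -1;
    close(fd);
    return rc;
}
```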
Make sure the requested length is below the maximum pipe size for now,
since we're only using one pipe and splicing once into and out of
it. The default max is 1MB on recent kernels, so this isn't such a
terrible limitation.
To get around this we could use multiple pipes, or keep both source and
destination fds open at the same time and call splice many times. This
is more usual usage for splice, but would require a lot more work to
restructure the filestore and messenger to handle it.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Check for the existence of the boost_program_options library in
configure.ac, since several files need that library.
Signed-off-by: Xing Lin <xinglin@cs.utah.edu>
Selects __PRETTY_FUNCTION__ or __func__. Linux assumes GNU, and chooses
__PRETTY_FUNCTION__ if gcc/g++ versions are favorable.
This also includes a fix in ax_c_var_func.m4:
AC_TRY_COMPILE will wrap the test in main{}, and then GCC will complain
about nested functions. Just use the original main{} body.
diff --git a/m4/ax_c_var_func.m4 b/m4/ax_c_var_func.m4
index 0ad7d2b..8b57563 100644
--- a/m4/ax_c_var_func.m4
+++ b/m4/ax_c_var_func.m4
@@ -57,9 +57,9 @@ AC_DEFUN([AX_C_VAR_FUNC],
[AC_REQUIRE([AC_PROG_CC])
AC_CACHE_CHECK(whether $CC recognizes __func__, ac_cv_c_var_func,
AC_TRY_COMPILE(,
-[int main() {
+[
char *s = __func__;
-}],
+],
AC_DEFINE(HAVE_FUNC,,
[Define if the C complier supports __func__]) ac_cv_c_var_func=yes,
ac_cv_c_var_func=no) )
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
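The selection between __PRETTY_FUNCTION__ and __func__ can be illustrated with a compile-time macro. In this sketch the choice keys off __GNUC__ rather than the configure result, and SKETCH_FUNC, trace_enter and demo::where_am_i are hypothetical names.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Prefer the GNU extension when the compiler advertises GNU support.
#if defined(__GNUC__)
#define SKETCH_FUNC __PRETTY_FUNCTION__
#else
#define SKETCH_FUNC __func__
#endif

// Log a function entry to stderr.
inline void trace_enter(const char *fn) {
    assert(fn != nullptr);
    std::fprintf(stderr, "enter %s\n", fn);
}

namespace demo {
// Returns whichever function-name string the compiler provides.
inline std::string where_am_i(int) {
    trace_enter(SKETCH_FUNC);
    return SKETCH_FUNC;
}
}
```

With GCC or Clang the returned string includes the namespace and parameter types; with __func__ it is just the bare function name.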
Creates a test that checks explicitly for res_nquery, which can be a
macro in resolv.h. Defines RESOLV_LIBS that contains any libraries that
need to be linked against.
Notes from later fix:
Based on the 2013-09-30 version of wip-port. On FreeBSD, one must
include netinet/in.h to get the definitions for stuff in resolv.h.
Also, resolv.h's functions are part of libc instead of libresolv.
Signed-off-by: Alan Somers <asomers@gmail.com>
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
Function `get_process_name` has platform specific dependencies. Check
for Linux prctl function and correct command flag.
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
Loop through a list of sensible default
locations for a JDK, stopping if a
workable JDK is found.
Also, add support for CentOS' default
java location.
Signed-off-by: Joe Buck <jbbuck@gmail.com>
The in-tree Hadoop shim was a combination of libcephfs wrapper, and the
bits to support Hadoop. This has been replaced by src/java that
implements generic libcephfs wrappers, and externally, the hadoop shim
(see docs).
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
This is from Intel's ISA-L library and licensed under BSD 3-clause.
It needs to build with yasm, which means we go through all sorts of pain
to make this work with libtool:
- strip out args it doesn't understand with yasm-wrapper
- detect whether it is recent enough during configure
The code is conditional on:
- build-time support (yasm)
- run-time support (sse4.2)
Signed-off-by: Sage Weil <sage@inktank.com>
This patch adds ZFS parallel journal support. It uses libzfs provided by
zfsonlinux to access ZFS' functionalities. To enable ZFS parallel journal
support, compile ceph by:
./configure --with-libzfs LIBZFS_CFLAGS="-I<libzfs header> -I<libspl header>"
make
Add the following line to the osd section of ceph.conf
filestore zfs_snap = 1
Note: ZFS (whether or not parallel journal is enabled) does not support
direct IO. To use it as the backend FS for an OSD, you need to add the following lines
to the osd section of ceph.conf
journal aio = 0
journal dio = 0
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Remove the rc suffix since RPM complains about it. For rc release
builds the "rc" in the git describe string is sufficient for
everything but RPM. For rc release builds (i.e. not gitbuilder)
add a flag to the spec file.
Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
The ac_check_func fails because -lfuse is not in LIBS. This also enables
code that wasn't being compiled, and fixes compiler errors that
resulted.
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
Reviewed-by: Sage Weil <sage@inktank.com>
This involves three pieces:
For intrusive_ptr type references, we use TrackedIntPtr instead. This
uses get_with_id and put_with_id to associate an id and backtrace with
each particular ref instance.
For refs taken via direct calls to get() and put(), get and put now
require a tag string. The PG tracks individual ref counts for each tag
as well as the total.
Finally, PGs register/unregister themselves on construction/destruction
with OSDService.
As a result, on shutdown, we can check for live pgs and determine where
the references are held.
This behavior is compiled out by default, but can be included with the
--enable-pgrefdebugging flag.
Signed-off-by: Samuel Just <sam.just@inktank.com>
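The tagged get()/put() bookkeeping described above can be sketched as a small counter class. This is only an illustration of per-tag and total counts; TaggedRefCount is a hypothetical name, the sketch is not thread-safe, and it does not model TrackedIntPtr, backtraces, or the OSDService registration.

```cpp
#include <cassert>
#include <map>
#include <string>

// Tracks a total reference count plus one count per tag string.
class TaggedRefCount {
    std::map<std::string, int> by_tag_;
    int total_ = 0;

public:
    void get(const std::string &tag) {
        ++by_tag_[tag];
        ++total_;
    }

    void put(const std::string &tag) {
        auto it = by_tag_.find(tag);
        assert(it != by_tag_.end() && it->second > 0);  // put without get
        if (--it->second == 0)
            by_tag_.erase(it);
        --total_;
    }

    int total() const { return total_; }

    int count(const std::string &tag) const {
        auto it = by_tag_.find(tag);
        return it == by_tag_.end() ? 0 : it->second;
    }

    // Tags that still hold references, e.g. for a shutdown report.
    const std::map<std::string, int> &live_tags() const { return by_tag_; }
};
```

At shutdown, a non-empty live_tags() map names the call sites that still hold references.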
The filter_policy (bloom filter) stuff is fairly new in LevelDB's life,
and it turns out that precise's version is too old for it. Add conditional
compilation for those members in order to build and work properly.
Signed-off-by: Greg Farnum <greg@inktank.com>
Dynamically link to the leveldb installed on the system rather than
statically linking ceph copy. Remove the --with-system-leveldb config
option, and add a requirement for leveldb libraries for rpm and debian
packages. Bug 3945.
Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
The AC_PROG_CXX macro sets a flag if a C++ compiler is found
but does not fail if one is not found; it is left to the application
to test the flags as needed. This fix will issue an error
when a C++ compiler is not found. Bug 3955.
Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
it's not installed, this fix adds an error message for a
Check for fuse_getgroups() only in case we have found libfuse already.
Moved the check to the check for --with-fuse.
Small fix: fix string for NO_ATOMIC_OPS, don't use "'".
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
Use git to get RPM_RELEASE only if this is a git repo
clone and if the git command is available on the system.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
The AS_IF used to cover java related checks via --enable-cephfs-java
didn't work correctly. Use a plain 'if/fi' instead to make sure this
section is only executed if --enable-cephfs-java is used.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
Check for org.junit.rules.ExternalResource if built with
--enable-cephfs-java and --with-debug. Checking for junit4
isn't enough, since junit4 only provides this class from version 4.7 onward.
Added some m4 files to get some JAVA related macros. Changed
autogen.sh to work with local m4 files/macros.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
Change handling of --with-debug and junit4. Add a new conditional HAVE_JUNIT4
to be able to build ceph-test package also if junit4 isn't available. In
this case simply don't build libcephfs-test.jar, but the rest of the tools.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
Remove the already commented-out AC_PROG_RANLIB to get rid of this warning:
libtoolize: `AC_PROG_RANLIB' is rendered obsolete by `LT_INIT'
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
In configure.ac, add the crypto library compiler flags to AM_CXXFLAGS and in
src/Makefile remove CRYPTO_CXXFLAGS and use only AM_CXXFLAGS which now has
the flags if needed.
radosgw has depended on libresolv since commit 951c6be, so we need to
add the -lresolv flag, or it cannot link against the right library.
Signed-off-by: Chen Baozi <baozich@gmail.com>
There is a difference in naming conventions between debian and
rpm based distributions for this library. In configure.ac we
check first for boost_thread-mt, then if it's not found check
for boost_thread. A side effect of the AC_CHECK_LIB macro is
to add the library to the $LIBS, so the explicit -llibboost_thread
in the Makefile has been removed.
(cherry picked from commit f0c7bb363000037bbf7d58ac6e2d39d0f10200fe)
of tests classes from build.xml to Makefile and editing configure.ac to
look for the junit4 jar in the default location of /usr/share/java. It
is still possible to build and run tests from build.xml as well as
Makefile.
Signed-off-by: Joe Buck <jbbuck@gmail.com>
Reviewed-by: Noah Watkins <noahwatkins@gmail.com>
Adds --enable-cephfs-java and --with-jdk to build
the libcephfs Java bindings and specify the default
JDK directory, respectively.
Also adds default JDK paths to avoid --with-jdk in
the common case. Currently set up for the default
provided by Debian's default-jdk package, but other
default search paths can easily be added.
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
Users of the libcephfs api (fuse in particular)
don't check the mode against the open flags. This
commit does the proper checks to grant/deny access
to the file. The check_mode() function constructs
a requested mode based on the flags, and compares that
to the mode of the file.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
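A simplified sketch of the idea behind check_mode(): derive the permission bits the open flags ask for, then compare them with the file mode. This is not the libcephfs implementation. The function names are hypothetical, and the sketch ignores root, supplementary groups, and flags such as O_TRUNC.

```cpp
#include <cassert>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>

// Permission bits (read = 4, write = 2) requested by the open flags.
inline int requested_bits(int flags) {
    switch (flags & O_ACCMODE) {
    case O_RDONLY: return 4;
    case O_WRONLY: return 2;
    default:       return 6;  // O_RDWR, or an unknown access mode
    }
}

// True when the caller's permission triad grants every requested bit.
inline bool mode_allows(mode_t file_mode, uid_t file_uid, gid_t file_gid,
                        uid_t uid, gid_t gid, int flags) {
    int want = requested_bits(flags);
    assert(want != 0);
    int have;
    if (uid == file_uid)
        have = (file_mode >> 6) & 7;  // owner bits
    else if (gid == file_gid)
        have = (file_mode >> 3) & 7;  // group bits
    else
        have = file_mode & 7;         // other bits
    return (have & want) == want;
}
```

For a file with mode 0644, for example, the owner may open it O_RDWR while other users are limited to O_RDONLY.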
/usr/include/linux/fs.h defines this on CentOS 5, even though it does not
in fact compile. This stupid workaround avoids the problem.
Reported-by: Nick Couchman <Nick.Couchman@seakr.com>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
First try the FALLOC_FL_PUNCH_HOLE fallocate() flag. If we get EOPNOTSUPP,
fall back to writing zeros.
Check for fallocate(2) with configure. Also, avoid this if we are not
Linux, since I'm not sure about the hard-coded FALLOC_FL_PUNCH_HOLE being
correct on other platforms.
Signed-off-by: Sage Weil <sage@newdream.net>
Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
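The punch-hole-then-fallback approach above could be sketched as follows, using the flag as spelled in the Linux headers (FALLOC_FL_PUNCH_HOLE). zero_range is a hypothetical name, not the filestore function. Treating ENOSYS like EOPNOTSUPP is an addition in this sketch, and the macro is taken from the headers rather than hard-coded.

```cpp
#include <algorithm>
#include <cassert>
#include <cerrno>
#include <cstring>
#include <fcntl.h>
#include <unistd.h>

// Zero the byte range [off, off + len) of fd.
// Returns 0 on success or a negative errno value.
int zero_range(int fd, off_t off, off_t len) {
    assert(off >= 0 && len >= 0);
    if (len == 0)
        return 0;
#if defined(__linux__) && defined(FALLOC_FL_PUNCH_HOLE)
    // Fast path: deallocate the range so it reads back as zeros.
    if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE, off, len) == 0)
        return 0;
    if (errno != EOPNOTSUPP && errno != ENOSYS)  // ENOSYS: sketch addition
        return -errno;
#endif
    // Fallback: overwrite the range with zeros in fixed-size chunks.
    char zeros[4096];
    std::memset(zeros, 0, sizeof(zeros));
    while (len > 0) {
        size_t chunk = static_cast<size_t>(
            std::min<off_t>(len, static_cast<off_t>(sizeof(zeros))));
        ssize_t n = pwrite(fd, zeros, chunk, off);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -errno;
        }
        if (n == 0)
            return -EIO;
        off += n;
        len -= n;
    }
    return 0;
}
```

The two paths differ for ranges past end-of-file: the punch-hole path keeps the file size, while the fallback extends the file.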
Add a resource agent for mapping, unmapping and monitoring RBD devices.
Maps an RBD on start, unmaps it on stop. Checks "rbd showmapped"
output for monitoring whether the device is mapped, thus does not
rely on the ceph-rbdnamer udev magic to be enabled.
This RA is cloneable and essentially allows people to use RBD devices
as a drop-in replacement for
- iSCSI devices,
- host-based mirrored devices using md RAID-1,
- DRBD devices
in Pacemaker clusters.