56407 Commits

Author SHA1 Message Date
Nathan Cutler
b8c24bf2f8 rpm: move mount.ceph from ceph-base to ceph-common
Ceph clients use mount.ceph to mount CephFS filesystems, and
ceph-base is not expected to be installed on client systems.

Signed-off-by: Nathan Cutler <ncutler@suse.com>
2016-07-18 16:16:11 +02:00
Nathan Cutler
b090e9da32 rpm: create mount.ceph symlink in /sbin (SUSE only)
Fixes: http://tracker.ceph.com/issues/16598
Signed-off-by: Nathan Cutler <ncutler@suse.com>
2016-07-18 15:36:29 +02:00
Radoslaw Zarzynski
9697ca3414 rgw, doc: fix formatting around Keystone-related options.
This patch brings a small fix for broken formatting around
two configurables in doc/radosgw/config-ref.rst. Those are:
  * rgw keystone admin user,
  * rgw keystone admin password.

Signed-off-by: Radoslaw Zarzynski <rzarzynski@mirantis.com>
2016-07-18 15:29:09 +02:00
Ramana Raja
1c1d65a45f ceph_volume_client: version on-disk metadata
Version on-disk metadata with two attributes,
'compat version', the minimum CephFSVolumeClient
version that can decode the metadata, and
'version', the version that encoded the metadata.

Signed-off-by: Ramana Raja <rraja@redhat.com>
2016-07-18 16:38:32 +05:30
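The two-attribute scheme described in this commit can be sketched as follows. This is a minimal illustration, not the code from the commit: the JSON encoding, the key names, the version numbers, and the function names `encode_metadata` and `decode_metadata` are all assumptions made for the example.

```python
import json

# Hypothetical version numbers chosen for this sketch.
CLIENT_VERSION = 2         # version of the client that encodes the metadata
CLIENT_COMPAT_VERSION = 1  # oldest client version able to decode it


def encode_metadata(data):
    """Serialize metadata, stamping it with both version attributes."""
    stamped = dict(data)
    stamped["version"] = CLIENT_VERSION
    stamped["compat_version"] = CLIENT_COMPAT_VERSION
    return json.dumps(stamped)


def decode_metadata(raw, reader_version=CLIENT_VERSION):
    """Decode metadata, refusing it if this reader is too old."""
    data = json.loads(raw)
    if data["compat_version"] > reader_version:
        raise RuntimeError(
            "metadata requires client version >= %d, reader is %d"
            % (data["compat_version"], reader_version)
        )
    return data
```

A reader older than the recorded 'compat version' fails early instead of misinterpreting a layout it does not understand.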
Ramana Raja
46876fb2ce ceph_volume_client: add versioning
Add class attributes to CephFSVolumeClient to version
its capabilities.

'version' attribute stores the current version number
of CephFSVolumeClient.

'compat_version' attribute stores the earliest version
number of CephFSVolumeClient that the current version is
compatible with.

Fixes: http://tracker.ceph.com/issues/15406

Signed-off-by: Ramana Raja <rraja@redhat.com>
2016-07-18 16:38:32 +05:30
Ramana Raja
82445a20a2 ceph_volume_client: disallow tenants to share auth IDs
Restrict an auth ID to a single OpenStack tenant to enforce
strong tenant isolation of shares.

Signed-off-by: Ramana Raja <rraja@redhat.com>
2016-07-18 16:38:32 +05:30
Ramana Raja
ec2e6e37d0 ceph_volume_client: cleanup auth meta files
Remove auth meta files when the last rule for an auth ID is deleted.

Signed-off-by: Ramana Raja <rraja@redhat.com>
2016-07-18 16:38:32 +05:30
Ramana Raja
7731287761 ceph_volume_client: fix log messages
Log the path of the volume during creation and deletion of volumes.

Signed-off-by: Ramana Raja <rraja@redhat.com>
2016-07-18 16:38:32 +05:30
Ramana Raja
37fbfc7aa8 ceph_volume_client: create/delete VMeta for create/delete volume
Create and delete volume meta files during creation and deletion of
volumes.

Signed-off-by: Ramana Raja <rraja@redhat.com>
2016-07-18 16:38:30 +05:30
xie xingguo
1e4735440c mon/osdmonitor: fix incorrect output of "osd df" due to osd out
If an osd is automatically marked as out, the output of "osd df"
is not right, as follows:

-5 10.00999        -  5586G  2989G  2596G     0    0     host ceph192-9-9-8
11  0.90999  1.00000   931G   542G   388G 58.25 0.99         osd.11
14  0.90999  1.00000   931G   530G   400G 56.97 0.97         osd.14
20  0.90999  1.00000   931G   716G   214G 76.99 1.31         osd.20
22  0.90999  1.00000   931G   477G   453G 51.29 0.87         osd.22
26  0.90999        0      0      0      0     0    0         osd.26
28  0.90999  1.00000   931G   587G   343G 63.09 1.07         osd.28
30  0.90999  1.00000   931G   602G   328G 64.75 1.10         osd.30
16  0.90999  1.00000   931G   589G   341G 63.34 1.08         osd.16
18  0.90999  1.00000   931G   530G   400G 56.93 0.97         osd.18
24  0.90999  1.00000   931G   202G   728G 21.77 0.37         osd.24
32  0.90999  1.00000   931G   477G   454G 51.23 0.87         osd.32

Two problems are identified from the above output:

1. the total capacity (total, total used, total avail)
only includes osd.32, osd.24, osd.18, osd.16, osd.30, osd.28, and other
healthy osds such as osd.11, osd.14 etc. are excluded.

2. the average utilization/deviation are forcibly reset.

Fixes: http://tracker.ceph.com/issues/16706
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2016-07-18 18:46:54 +08:00
xie xingguo
98f50cebe1 mon/osdmonitor: initialize local variable "kb_avail_i"
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2016-07-18 18:41:45 +08:00
xie xingguo
82ccf7e5c3 os/bluestore: fix error handling of posix_fadvise() syscall
According to Linux man page:
On success, zero is returned.  On error, an error number is returned.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2016-07-18 18:07:30 +08:00
Ramana Raja
f7c037229b ceph_volume_client: modify locking of meta files
File locks are applied on meta files before updating the meta
file contents. These meta files would need to be cleaned up
sometime, which could lead to locks being held on unlinked meta
files. Prevent this by checking whether the file had been deleted
after lock was acquired on it.

Signed-off-by: Ramana Raja <rraja@redhat.com>
2016-07-18 15:36:40 +05:30
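The check described in this commit can be sketched on a local POSIX file system as below. This is an assumption-laden illustration: the real client works through the CephFS bindings, and the function name `lock_meta_file` and the retry policy are hypothetical. The idea is that a link count of zero, seen after the lock is taken, means the file was unlinked in the meantime.

```python
import fcntl
import os


def lock_meta_file(path):
    """Open and exclusively lock `path`, retrying if it was unlinked meanwhile.

    Returns an open file descriptor that holds the lock.
    """
    while True:
        fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)
        fcntl.flock(fd, fcntl.LOCK_EX)
        # st_nlink == 0 means the file was unlinked between open() and
        # flock(), so the lock we hold is on an orphaned inode.
        if os.fstat(fd).st_nlink == 0:
            os.close(fd)  # closing the descriptor also drops the lock
            continue
        return fd
```

Retrying reopens (and, because of `O_CREAT`, recreates) the file, so the caller never updates a meta file that no longer has a name.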
Ramana Raja
f58403f3d1 cephfs.pyx: implement python bindings for fstat
Signed-off-by: Ramana Raja <rraja@redhat.com>
2016-07-18 15:36:40 +05:30
Ramana Raja
7f7d2a76ae ceph_volume_client: restrict volume group names
Prevent craftily-named volume groups from colliding with meta files.

Signed-off-by: Ramana Raja <rraja@redhat.com>
2016-07-18 15:36:40 +05:30
Ramana Raja
27eb51baab ceph_volume_client: use fsync instead of syncfs
Signed-off-by: Ramana Raja <rraja@redhat.com>
2016-07-18 15:36:40 +05:30
Ramana Raja
647a2447f0 ceph_volume_client: recover from dirty auth and auth meta updates
Check dirty flag after locking something and call recover() if we are
opening something dirty (racing with another instance of the driver
restarting after failure) -- only required if someone is running multiple
manila-share instances with Ceph loaded.

Signed-off-by: Ramana Raja <rraja@redhat.com>
2016-07-18 15:36:40 +05:30
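The dirty-flag pattern this commit relies on can be illustrated with a small in-memory sketch. Everything here is hypothetical: the dictionary layout, the function names, and the `recover` callback stand in for the real meta files and recovery logic, and the lock is assumed to be held by the caller.

```python
def update_meta(meta, apply_change):
    """Mark the metadata dirty, apply a change, then clear the flag."""
    meta["dirty"] = True   # in a real driver this would be persisted first
    apply_change(meta)     # if this raises, the flag stays set
    meta["dirty"] = False


def open_meta(meta, recover):
    """Return metadata ready for use, repairing it if it was left dirty."""
    if meta.get("dirty"):
        recover(meta)      # finish or roll back the interrupted update
        meta["dirty"] = False
    return meta
```

An update that fails part-way leaves `dirty` set, so the next instance that locks and opens the metadata runs recovery before touching it.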
Ramana Raja
7c8a28a7e8 ceph_volume_client: modify data layout in meta files
Notable changes to data layout in auth meta and volume meta files:

In the auth meta files, add a 'dirty' flag to track the status of auth
updates to a single volume.

In the volume meta file, make the 'dirty' flag track the status of
auth updates for a single ID.

Optimize the recovery of partial auth update changes to auth meta,
volume meta, and the Ceph backend, facilitated by changes in the
data layout in the meta files.

Signed-off-by: Ramana Raja <rraja@redhat.com>
2016-07-18 15:36:40 +05:30
John Spray
d2e9eb55ca pybind: ceph_volume_client authentication metadata
Store a two-way mapping between auth IDs and volumes.

Enables us to record some metadata on auth ids (which
openstack tenant created it) so that we can avoid exposing
keys to other tenants who try to use the same ceph
auth id.

Enables us to expose the list of which auth ids have access
to a volume, so that Manila's update_access() can be
implemented efficiently.

DNM: see TODOs inline.

Fixes: http://tracker.ceph.com/issues/15615

Signed-off-by: John Spray <john.spray@redhat.com>
2016-07-18 15:36:40 +05:30
John Spray
5678584f41 pybind: enable integer flags to libcephfs open
The 'rw+' style flags are handy and convenient, but
they don't capture all possibilities.  Change to
optionally accept an integer here for advanced users
who want to specify arbitrary combinations of
flags.

Signed-off-by: John Spray <john.spray@redhat.com>
2016-07-18 15:36:40 +05:30
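One way to accept either a string or an integer for open flags is sketched below. This is not the code from the bindings: the function name `normalize_open_flags` and the string-to-flag table are assumptions made for illustration, and the real mapping in the bindings may differ.

```python
import os

# Hypothetical mapping for this sketch; the real bindings may map differently.
_STRING_FLAGS = {
    "r": os.O_RDONLY,
    "w": os.O_WRONLY | os.O_CREAT | os.O_TRUNC,
    "r+": os.O_RDWR,
    "w+": os.O_RDWR | os.O_CREAT | os.O_TRUNC,
}


def normalize_open_flags(flags):
    """Turn 'rw+'-style strings or raw integers into an integer flag word."""
    if isinstance(flags, int):
        # Advanced callers pass an arbitrary combination of O_* bits.
        return flags
    if isinstance(flags, str):
        try:
            return _STRING_FLAGS[flags]
        except KeyError:
            raise ValueError("unsupported flag string: %r" % flags)
    raise TypeError("flags must be a str or an int")
```

A caller could then pass, for example, `os.O_WRONLY | os.O_APPEND`, a combination that the string forms in this sketch cannot express.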
xie xingguo
012f8f9eb6 os/bluestore: trim cache on collection_list() too
Currently we trim the cache on all read operations
other than this one.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2016-07-18 16:08:42 +08:00
xie xingguo
9397819522 os/bluestore: assert available of stat doesn't underflow
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2016-07-18 15:49:04 +08:00
Yan, Zheng
2910bf9269 kv/MemDB: fix compilation on OSX
Include "include/compat.h" for TEMP_FAILURE_RETRY

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-18 15:09:32 +08:00
Yan, Zheng
eaec77077b rgw: fix compilation on OSX
On OSX, some fields of struct stat have different names.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-18 15:09:29 +08:00
Yan, Zheng
1fb15a210a compat: define ceph_pthread_{set,get}name()
pthread_setname_np() exists on OSX, but it only accepts a single
argument. Defining a two-parameter version of pthread_setname_np()
does not work.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-18 15:01:32 +08:00
xie xingguo
4a09620a01 os/bluestore: en/decode csum_type/csum_chunk_order more efficiently
These two fields are of type uint8_t, so varint encoding/decoding
is unnecessary.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2016-07-18 14:32:28 +08:00
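The reasoning can be illustrated with a generic LEB128-style varint, which is only an assumption about the general shape of a varint encoder and not Ceph's actual helper. For a value that always fits in one byte, the varint needs a loop and can take two bytes, while a fixed-width encoding is always exactly one byte.

```python
import struct


def encode_varint(value):
    """LEB128-style unsigned varint: 7 payload bits per byte."""
    if value < 0:
        raise ValueError("value must be non-negative")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_u8(value):
    """Fixed-width encoding of a uint8_t-sized value: always one byte."""
    return struct.pack("B", value)
```

In this sketch, values of 128 and above cost two bytes as a varint but still one byte with `encode_u8`, and the fixed-width path has no loop or branching.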
xie xingguo
1c07f2e2e8 os/bluestore: make csum_type checking more efficient
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2016-07-18 11:31:43 +08:00
Yan, Zheng
200406fad5 compat: define HOST_NAME_MAX for OSX
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-18 11:08:48 +08:00
Yan, Zheng
c80eb1b5c0 msg: don't use thread_local variable
thread_local is not supported by OSX

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-18 11:08:48 +08:00
Yan, Zheng
620f3bdca7 build: don't link to libuuid
we have already moved to use boost uuid implementation

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-18 11:08:48 +08:00
Yan, Zheng
23f9505ec2 time: make ceph_time clocks work under OSX
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-18 11:08:48 +08:00
Yan, Zheng
edf23836c9 compat: error code translation for OSX
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-18 11:08:48 +08:00
Yan, Zheng
99cf3c13ee compat: define EREMOTEIO for OSX
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-18 11:08:48 +08:00
Yan, Zheng
c26b14649c build: check if posix_fadvise() exists
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-18 11:08:48 +08:00
Yan, Zheng
ddb4a28f40 ceph-fuse: fix compilation on OSX
Osxfuse does not define FUSE_CAP_DONT_MASK/FUSE_SET_ATTR_ATIME_NOW.
Besides, its header files are under osxfuse/ directory.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-18 11:08:48 +08:00
Yan, Zheng
a6eb0d7d17 os/FuseStore: fix build under FreeBSD/OSX
include proper headers for statfs(2)

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-18 11:08:48 +08:00
Yan, Zheng
0e7daa1974 ceph-dencoder: fix build with '--without-libaio'
bluestore is not compiled with '--without-libaio'

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-18 11:08:48 +08:00
Yan, Zheng
7e319f1493 encoding: don't encode/decode 'size_t'
encode/decode functions for type 'size_t' on OSX are ambiguous

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-18 11:08:48 +08:00
Mark Nelson
cab254e924 os/bluestore: revert preferred csum behavior
This passes "ceph_test_objectstore --gtest_filter=*/2".
This restores 4K random read performance to previous levels when objects
were previously written out using large IOs (4MB in this case):

pre-patch: 26MB/s
post-patch: 610MB/s

Closes #10320

Signed-off-by: Mark Nelson <mnelson@redhat.com>
2016-07-17 21:33:59 -05:00
Somnath Roy
bf70bcb6c5 Bluestore: Fixed a Bluestore crash
A bluestore race condition has been fixed by protecting txc structures
within _txc_state_proc with the collection lock.

Mark's comments:

This fixes segfaults during random write tests with bluestore.
This passes "ceph_test_objectstore --gtest_filter=*/2".
This may introduce a small performance regression, though there is enough
noise in the results to make it inconclusive.

Closes #10220

Signed-off-by: Somnath Roy <somnath.roy@sandisk.com>
2016-07-17 21:26:31 -05:00
Patrick Donnelly
0f044e0880 mds: add assertions for standby_daemons invariant
These assertions catch state changes of an mds in standby_daemons to a state
other than MDSMap::STATE_STANDBY. Currently this invariant is (sometimes!)
checked in other locations on access of standby_daemons. This commit allows us
to catch the violated invariant at the time it occurred.

Related to: http://tracker.ceph.com/issues/16592

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2016-07-16 13:31:00 -04:00
Willem Jan Withagen
aab972c0d5 Cmake: fix using CMAKE_DL_LIBS instead of dl
Signed-off-by: Willem Jan Withagen <wjw@digiware.nl>
2016-07-16 14:08:45 +02:00
Kefu Chai
c30c5223b5 cmake: only allow up to 1 hour for a ceph test
quote from
https://cmake.org/cmake/help/v3.0/prop_test/TIMEOUT.html?highlight=timeout

> If it exceeds that the test process will be killed and ctest will move
> to the next test.

this helps us to identify test hangs.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2016-07-16 12:00:47 +08:00
Kefu Chai
5a695974cc cmake: restructure src/CMakeLists.txt in a more hierarchical way
Signed-off-by: Kefu Chai <kchai@redhat.com>
2016-07-16 12:00:47 +08:00
Kefu Chai
45176611b3 cmake: do not pass --disable-pip-version-check if not supported
on older versions of pip, this option is not supported, and
--disable-pip-version-check is implied by --no-index. so there is no need to
pass it when --no-index is passed to pip.

this partially reverts 395f2c5

Signed-off-by: Kefu Chai <kchai@redhat.com>
2016-07-16 12:00:46 +08:00
Kefu Chai
941605d4f3 cmake: remove duplicated src from ceph_rgw_jsonparser
they are included by rgw_a as well. and ceph_rgw_jsonparser is linked
against rgw_a.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2016-07-16 12:00:46 +08:00
Kefu Chai
64a8bfbb4a cmake: link libcommon against $CRYPTO_LIBS
as ceph_crypt.cc is using the symbols in it, and libcommon contains
ceph_crypt.cc.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2016-07-16 12:00:46 +08:00
Kefu Chai
cd0dfc305d run-make-check.sh: run tests in two steps
this is a workaround for the timeouts found in jenkins. currently three
tests are found to time out, and they are labeled with "Racing" and
"LongRunning". so, to work around this issue, we run the tests in two
phases:

1. run the racing tests with -j1
2. run the non-racing tests with -jN

if we run all tests with -j1, the total test time is 2683.57 sec

Signed-off-by: Kefu Chai <kchai@redhat.com>
2016-07-16 12:00:46 +08:00
Kefu Chai
3a21d0040a cmake: label some tests with "Racing"
two tests time out for unknown reasons, so label them with
"Racing" and "LongRunning" labels.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2016-07-16 12:00:45 +08:00
Patrick Donnelly
9566ae27b3 mds: use FSMap::insert to add to standby_daemons
This reduces the number of code sites which modify standby_daemons.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2016-07-15 23:41:21 -04:00