Commit Graph

56083 Commits

Author SHA1 Message Date
Ramana Raja
647a2447f0 ceph_volume_client: recover from dirty auth and auth meta updates
Check the dirty flag after locking something and call recover() if we are
opening something dirty (racing with another instance of the driver
restarting after failure) -- only required if someone is running multiple
manila-share instances with Ceph loaded.

Signed-off-by: Ramana Raja <rraja@redhat.com>
2016-07-18 15:36:40 +05:30
Ramana Raja
7c8a28a7e8 ceph_volume_client: modify data layout in meta files
Notable changes to data layout in auth meta and volume meta files:

In the auth meta files, add a 'dirty' flag to track the status of auth
updates to a single volume.

In the volume meta file, make the 'dirty' flag track the status of
auth updates for a single ID.

Optimize the recovery of partial auth update changes to auth meta,
volume meta, and the Ceph backend, facilitated by changes in the
data layout in the meta files.

Signed-off-by: Ramana Raja <rraja@redhat.com>
2016-07-18 15:36:40 +05:30
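
The dirty-flag recovery described in 647a2447f0 and 7c8a28a7e8 can be sketched roughly as follows. This is a minimal illustration and not the ceph_volume_client code: the metadata is an in-memory dict rather than a locked meta file, and `update_auth`, `recover`, and `apply_backend` are hypothetical names.

```python
def update_auth(meta, volume, apply_backend):
    """Mark the entry dirty, apply the change, then clear the flag."""
    entry = meta.setdefault("volumes", {}).setdefault(volume, {})
    entry["dirty"] = True      # set before the backend is touched
    apply_backend(volume)      # may raise, leaving dirty=True behind
    entry["dirty"] = False


def recover(meta, apply_backend):
    """Re-apply every update whose dirty flag is still set."""
    recovered = []
    for volume, entry in meta.get("volumes", {}).items():
        if entry.get("dirty"):
            apply_backend(volume)
            entry["dirty"] = False
            recovered.append(volume)
    return recovered
```

A caller that finds an entry still marked dirty after taking the lock would call `recover()` before proceeding, which is the race the first commit guards against.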
John Spray
d2e9eb55ca pybind: ceph_volume_client authentication metadata
Store a two-way mapping between auth IDs and volumes.

Enables us to record some metadata on auth ids (which
openstack tenant created it) so that we can avoid exposing
keys to other tenants who try to use the same ceph
auth id.

Enables us to expose the list of which auth ids have access
to a volume, so that Manila's update_access() can be
implemented efficiently.

DNM: see TODOs inline.

Fixes: http://tracker.ceph.com/issues/15615

Signed-off-by: John Spray <john.spray@redhat.com>
2016-07-18 15:36:40 +05:30
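
The two-way mapping from d2e9eb55ca might look something like the sketch below. It assumes a plain in-memory index; `AuthIndex` and its methods are hypothetical names, and the real implementation persists this data in meta files.

```python
class AuthIndex:
    """Hypothetical two-way index between auth IDs and volumes."""

    def __init__(self):
        self.auth_to_volumes = {}  # auth id -> set of volumes
        self.volume_to_auths = {}  # volume -> set of auth ids
        self.auth_owner = {}       # auth id -> tenant that created it

    def authorize(self, volume, auth_id, tenant):
        # The first tenant to use an auth id becomes its owner.
        owner = self.auth_owner.setdefault(auth_id, tenant)
        if owner != tenant:
            raise PermissionError(
                f"auth id {auth_id!r} belongs to another tenant")
        self.auth_to_volumes.setdefault(auth_id, set()).add(volume)
        self.volume_to_auths.setdefault(volume, set()).add(auth_id)

    def auths_for(self, volume):
        """List the auth ids that have access to a volume."""
        return sorted(self.volume_to_auths.get(volume, ()))
```

Keeping both directions means listing the auth IDs for a volume is a single lookup, and the owner record lets a request from a different tenant be refused before any key is exposed.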
John Spray
5678584f41 pybind: enable integer flags to libcephfs open
The 'rw+' style flags are handy and convenient, but
they don't capture all possibilities.  Change to
optionally accept an integer here for advanced users
who want to specify arbitrary combinations of
flags.

Signed-off-by: John Spray <john.spray@redhat.com>
2016-07-18 15:36:40 +05:30
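
One way to accept either form, as 5678584f41 describes, is sketched below. The string table is an assumption made for illustration and is not the actual libcephfs binding's mapping; `resolve_flags` is a hypothetical helper.

```python
import os

# Hypothetical string-to-flag table; the real binding may differ.
_STRING_FLAGS = {
    "r": os.O_RDONLY,
    "w": os.O_WRONLY | os.O_CREAT | os.O_TRUNC,
    "rw": os.O_RDWR,
    "rw+": os.O_RDWR | os.O_CREAT,
}


def resolve_flags(flags):
    """Return integer open flags from a string alias or a raw integer."""
    if isinstance(flags, int):
        return flags               # caller supplied an arbitrary combination
    try:
        return _STRING_FLAGS[flags]
    except KeyError:
        raise ValueError(f"unsupported flag string: {flags!r}")
```

The integer path passes through combinations that have no string alias, such as `os.O_WRONLY | os.O_APPEND`.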
xie xingguo
012f8f9eb6 os/bluestore: trim cache on collection_list() too
Currently we trim the cache on all read operations
other than this one.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2016-07-18 16:08:42 +08:00
xie xingguo
9397819522 os/bluestore: assert available of stat doesn't underflow
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2016-07-18 15:49:04 +08:00
xie xingguo
4a09620a01 os/bluestore: en/decode csum_type/csum_chunk_order more efficiently
These two fields are of type uint8_t, so varint encoding/decoding
is unnecessary.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2016-07-18 14:32:28 +08:00
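
To see why a varint buys nothing for a uint8_t field (4a09620a01), compare a generic LEB128-style varint with a fixed one-byte encoding. This is an illustrative sketch; it assumes the general varint scheme of 7 payload bits per byte and is not Ceph's encoder.

```python
import struct


def encode_varint(value):
    """LEB128-style unsigned varint: 7 payload bits per byte."""
    if value < 0:
        raise ValueError("unsigned values only")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_u8(value):
    """Fixed-width encoding: always exactly one byte."""
    return struct.pack("B", value)
```

Under this scheme any value of 128 or more needs two varint bytes while the fixed encoding stays at one, and the varint also costs a loop on both encode and decode.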
xie xingguo
1c07f2e2e8 os/bluestore: make csum_type checking more efficient
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2016-07-18 11:31:43 +08:00
Mark Nelson
cab254e924 os/bluestore: revert preferred csum behavior
This passes "ceph_test_objectstore --gtest_filter=*/2".
This restores 4K random read performance to previous levels when objects
were previously written out using large IOs (4MB in this case):

pre-patch: 26MB/s
post-patch: 610MB/s

Closes #10320

Signed-off-by: Mark Nelson <mnelson@redhat.com>
2016-07-17 21:33:59 -05:00
Somnath Roy
bf70bcb6c5 Bluestore: Fixed a Bluestore crash
A bluestore race condition has been fixed by protecting txc structures
within _txc_state_poc with the collection lock.

Mark's comments:

This fixes segfaults during random write tests with bluestore.
This passes "ceph_test_objectstore --gtest_filter=*/2".
This may introduce a small performance regression, though there is enough
noise in the results to make it inconclusive.

Closes #10220

Signed-off-by: Somnath Roy <somnath.roy@sandisk.com>
2016-07-17 21:26:31 -05:00
Patrick Donnelly
0f044e0880 mds: add assertions for standby_daemons invariant
These assertions catch state changes of an mds in standby_daemons to a state
other than MDSMap::STATE_STANDBY. Currently this invariant is (sometimes!)
checked in other locations on access of standby_daemons. This commit allows us
to catch the violated invariant at the time it occurs.

Related to: http://tracker.ceph.com/issues/16592

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2016-07-16 13:31:00 -04:00
Willem Jan Withagen
aab972c0d5 Cmake: fix using CMAKE_DL_LIBS instead of dl
Signed-off-by: Willem Jan Withagen <wjw@digiware.nl>
2016-07-16 14:08:45 +02:00
Kefu Chai
c30c5223b5 cmake: only allow up to 1 hour for a ceph test
quote from
https://cmake.org/cmake/help/v3.0/prop_test/TIMEOUT.html?highlight=timeout

> If it exceeds that the test process will be killed and ctest will move
> to the next test.

this helps us identify test hangs.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2016-07-16 12:00:47 +08:00
Kefu Chai
5a695974cc cmake: restructure src/CMakeLists.txt in a more hierarchical way
Signed-off-by: Kefu Chai <kchai@redhat.com>
2016-07-16 12:00:47 +08:00
Kefu Chai
45176611b3 cmake: do not pass --disable-pip-version-check if not supported
on older versions of pip, this option is not supported, and
--disable-pip-version-check is implied with --no-index. so there is no
need to use it when --no-index is passed to pip.

this partially reverts 395f2c5

Signed-off-by: Kefu Chai <kchai@redhat.com>
2016-07-16 12:00:46 +08:00
Kefu Chai
941605d4f3 cmake: remove duplicated src from ceph_rgw_jsonparser
they are included by rgw_a as well. and ceph_rgw_jsonparser is linked
against rgw_a.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2016-07-16 12:00:46 +08:00
Kefu Chai
64a8bfbb4a cmake: link libcommon against $CRYPTO_LIBS
as ceph_crypt.cc is using the symbols in it, and libcommon contains
ceph_crypt.cc.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2016-07-16 12:00:46 +08:00
Kefu Chai
cd0dfc305d run-make-check.sh: run tests in two steps
this is a workaround for the timeout found in jenkins. currently three
tests are found to time out, and they are labeled with "Racing" and
"LongRunning". so, to work around this issue, we run the tests in two
phases:

1. run the racing tests with -j1
2. run the non-racing tests with -jN

if we run all tests with -j1, the total test time is 2683.57 sec

Signed-off-by: Kefu Chai <kchai@redhat.com>
2016-07-16 12:00:46 +08:00
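
The two phases from cd0dfc305d could be expressed as two ctest invocations. The sketch below only builds the argument lists; selecting on the "Racing" label with ctest's `-L` and `-LE` options is an assumption about how the phases are split, and `two_phase_ctest` is a hypothetical helper rather than part of run-make-check.sh.

```python
def two_phase_ctest(jobs, racing_label="Racing"):
    """Build ctest argument lists for a serial phase and a parallel phase."""
    return [
        # Phase 1: only tests carrying the label, one at a time.
        ["ctest", "-j", "1", "-L", racing_label],
        # Phase 2: everything else, in parallel.
        ["ctest", "-j", str(jobs), "-LE", racing_label],
    ]
```

Each list can be handed to `subprocess.run()` in turn, stopping if the first phase fails.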
Kefu Chai
3a21d0040a cmake: label some tests with "Racing"
two tests time out for unknown reasons, so label them with
"Racing" and "LongRunning" labels.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2016-07-16 12:00:45 +08:00
Patrick Donnelly
9566ae27b3 mds: use FSMap::insert to add to standby_daemons
This reduces the number of code sites which modify standby_daemons.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2016-07-15 23:41:21 -04:00
Kefu Chai
5b14e28392 cmake: no need to depend run-tox-ceph-disk on test
run-tox-ceph-disk and run-tox-ceph-detect-init are already added as
test, so "ctest" and "make {test,check}" will run them without extra
settings.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2016-07-16 10:57:14 +08:00
Kefu Chai
ce5724effb cmake: add a "tests" target to build tests
please note "make test" is used by cmake to run tests, so we cannot just
repurpose it to *build* them.

* AddCephTest.cmake: depends on "tests"
* CMakeLists.txt: let "check" depend on "tests"
* src/CMakeLists.txt: update the run-tox tests
* run-make-check.sh: use "make tests" and "ctest" instead of "make check"
* ceph-detect-init/CMakeLists.txt: let "tests" depend on
    "ceph-detect-init"
* ceph-disk/CMakeLists.txt: let "tests" depend on "ceph-disk"

Signed-off-by: Kefu Chai <kchai@redhat.com>
2016-07-16 10:57:12 +08:00
Yehuda Sadeh
3e5775475b Merge pull request #10301 from cbodley/wip-rgw-meta-stack-wakeup
rgw: RGWMetaSyncCR holds refs to stacks instead of crs

Reviewed-by: Yehuda Sadeh <yehuda@redhat.com>
Reviewed-by: Matt Benjamin <mbenjami@redhat.com>
2016-07-15 19:19:57 -07:00
Noah Watkins
d11a6be155 release: release notes update for objclass-perms
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
2016-07-15 12:41:18 -07:00
Noah Watkins
298bc93954 osd: use static method and simplify return
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
2016-07-15 12:28:10 -07:00
Casey Bodley
e4bc16044e rgw: RGWMetaSyncCR holds refs to stacks for wakeup
because RGWCoroutine::wakeup() calls RGWCoroutinesStack::wakeup(), the
stack must also stay alive

Fixes: http://tracker.ceph.com/issues/16666

Signed-off-by: Casey Bodley <cbodley@redhat.com>
2016-07-15 11:20:13 -04:00
Kefu Chai
4025a2eb80 Merge pull request #10306 from tchaikov/wip-no-mktemp-p
qa/workunits/cephtool/test.sh: s/TMPDIR/TEMP_DIR/

Reviewed-by: Haomai Wang <haomai@xsky.com>
2016-07-15 21:34:02 +08:00
Kefu Chai
bf58aeb6d5 Merge pull request #9972 from ceph/objclass-perm
osd: object class loading and execution permissions

Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
2016-07-15 17:25:44 +08:00
Kefu Chai
316c15bc9d Merge pull request #9980 from gaowanlong/split_out_handle_pg_scrub
osd: small cleanups

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2016-07-15 17:09:14 +08:00
Kefu Chai
9103958688 Merge pull request #10126 from dx9/wip-fcntl-warns
test/libcephfs: fix gcc sys/fcntl.h warnings

Reviewed-by: Kefu Chai <kchai@redhat.com>
2016-07-15 17:05:15 +08:00
Kefu Chai
56df5dcda4 Merge pull request #10166 from wjwithagen/wip-wjw-cmake-test_rados_tool.sh
test_rados_tool.sh: Make script work under ctest

Reviewed-by: Kefu Chai <kchai@redhat.com>
2016-07-15 17:03:59 +08:00
Kefu Chai
e0a5e67401 Merge pull request #9782 from Ved-vampir/zlib_cleanup
compressor: zlib compressor plugin cleanup

Reviewed-by: Kefu Chai <kchai@redhat.com>
2016-07-15 17:03:38 +08:00
Kefu Chai
83e64fad87 test: run_seed_to_range.sh: check existence of a directory using [ -d "$dir" ]
sadly, sh evaluates `[ -d ]` to true, as it treats "-d" as a non-empty
string, which is true.

this fixes following failure
```
2016-07-12T23:22:02.839 INFO:teuthology.orchestra.run.mira084.stderr:cp:
missing destination file operand after ‘.’
2016-07-12T23:22:02.839 INFO:teuthology.orchestra.run.mira084.stderr:Try
'cp --help' for more information.
```
see
http://pulpito.ceph.com/kchai-2016-07-12_23:09:35-rados-wip-kefu-testing2---basic-mira/311334/

Signed-off-by: Kefu Chai <kchai@redhat.com>
2016-07-15 16:48:22 +08:00
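
The pitfall fixed in 83e64fad87 can be reproduced by running both forms through `sh`. This is a small demonstration that assumes a POSIX `sh` is on the PATH; `sh_test` is a hypothetical helper.

```python
import subprocess


def sh_test(expression):
    """Return the exit status of a POSIX sh test expression."""
    return subprocess.run(["sh", "-c", expression]).returncode


# With an empty variable, the unquoted form collapses to `[ -d ]`,
# a one-argument test that is true because "-d" is a non-empty string.
unquoted = sh_test('dir=""; [ -d $dir ]')

# The quoted form keeps the empty operand, so -d really checks a path.
quoted = sh_test('dir=""; [ -d "$dir" ]')
```

Here `unquoted` is the exit status 0 (true), while `quoted` is non-zero because the empty string does not name a directory.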
Kefu Chai
7c1b456022 qa/workunits/cephtool/test.sh: s/TMPDIR/TEMP_DIR/
this fixes the test failure of
```
2016-07-12T23:29:40.935
INFO:tasks.workunit.client.0.mira101.stderr:/home/ubuntu/cephtest/workunit.client.0/cephtool/test.sh:
line 153: /CEPH_WATCH_9445: Permission denied
```
see
http://pulpito.ceph.com/kchai-2016-07-12_23:09:35-rados-wip-kefu-testing2---basic-mira/311333/

it's a regression introduced by e5c262b

Signed-off-by: Kefu Chai <kchai@redhat.com>
2016-07-15 16:20:37 +08:00
Kefu Chai
e5c262bfe6 qa/workunits/cephtool/test.sh: use mktemp $TEMP_DIR/XXX instead
mktemp -p is not supported on FreeBSD

Signed-off-by: Kefu Chai <kchai@redhat.com>
2016-07-15 14:21:52 +08:00
Kefu Chai
e0b94ff38d Merge pull request #10131 from badone/wip-peering-doc-fixes
doc: peering.rst, fix typo

Reviewed-by: Abhishek Lekshmanan <abhishek.lekshmanan@gmail.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2016-07-15 12:15:55 +08:00
Kefu Chai
b342e4f06b Merge pull request #10292 from badone/wip-perf-counters-doc-fixes
doc: perf_counters.rst fix trivial typo

Reviewed-by: Abhishek Lekshmanan <abhishek.lekshmanan@gmail.com>
Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2016-07-15 12:13:19 +08:00
Kefu Chai
3be6e1c76b test: ceph-detect-init/run-tox.sh: run it from any path
this follows the pattern in ceph-disk. this enables us to run
ceph-detect-init/run-tox.sh from the ${CMAKE_BINARY_DIRECTORY}
as well.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2016-07-15 09:55:51 +08:00
Yan, Zheng
265f96bda7 client: fix MetaRequest::set_other_inode()
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-15 09:15:55 +08:00
Yan, Zheng
3099cabd11 client: close directory's snapdir when deleting directory
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-15 09:15:55 +08:00
Yan, Zheng
f180ad149a client: invalidate snap inodes after removing snapshot
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-15 09:11:49 +08:00
Yan, Zheng
85e687de87 mds: fix incorrect "unconnected snaprealm xxx" warning
If a snaprealm has no child/parent snaprealm, and the snaprealm inode
is not in the cache while the client reconnects, the snaprealm does not
get properly removed from MDCache::reconnected_snaplrealm. This causes
an incorrect "unconnected snaprealm xxx" warning.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-15 09:11:49 +08:00
Yan, Zheng
16f7d7c71e qa/workunits/fs: fix expect_failure function in test scripts
The original expect_failure function returned 0 regardless of the
command's return value.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-15 09:11:49 +08:00
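
The intended behaviour of the helper in 16f7d7c71e, succeeding only when the wrapped command fails, can be sketched as below. The real helper is a shell function in the test scripts; this Python version is an illustrative assumption.

```python
import subprocess
import sys


def expect_failure(argv):
    """Return True only when the command exits with a non-zero status."""
    return subprocess.run(argv).returncode != 0


# A command that exits 1 is the expected failure.
failed_as_expected = expect_failure(
    [sys.executable, "-c", "import sys; sys.exit(1)"])

# A command that succeeds must make the check itself fail.
unexpected_success = not expect_failure([sys.executable, "-c", "pass"])
```

The bug being fixed corresponds to a helper that reports success in both cases, which would hide a command that unexpectedly succeeds.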
Yan, Zheng
d3916717e2 client: make sure snapflush is sent before normal cap message
MDS does null snapflush when it receives normal cap message. So client
must send snapflush first.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-15 09:11:49 +08:00
Yan, Zheng
a05e996b2a client: unify cap flush and snapcap flush
This patch includes following changes
- assign flush tid to snapcap flush
- remove session's flushing_capsnaps list. add inode with snapcap
  flushes to session's flushing_caps list instead.
- when reconnecting to MDS, re-send one inode's snapcap flushes and
  cap flushes at the same time.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-15 09:11:49 +08:00
Yan, Zheng
bc50e03092 mds: handle partly purged directory
For a snapshotted directory whose snaprealm parents are being opened,
MDS does not know if the directory is purgeable. So MDS can't skip
committing dirfrags of the directory. But if the directory is purgeable,
some dirfrags could have already been deleted during MDS failover.
Committing them could return -ENOENT.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-15 09:11:49 +08:00
Yan, Zheng
dd98448d3d mds: do files recovery after processing cap flushes
File recovery may update inode and trigger inode COW. MDS relies on
client caps to setup CInode::client_need_snapflush. But for a given
client, the reconnected caps may not include the flushing caps.
(Before MDS failover, the client released and flushed some caps at the
same time. When MDS recovers, the client re-sends the cap flush and sends
a cap reconnect to the MDS.) This may cause a later snapflush to get
dropped.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-15 09:11:49 +08:00
Yan, Zheng
57067e032e mds: combine MDCache::{reconnected_caps,cap_imports_dirty}
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-15 09:11:49 +08:00
Yan, Zheng
cfc3ec17b6 mds: remove CEPH_LOCK_IFLOCKL from cinode_lock_info
Currently we don't support dirty CEPH_CAP_FLOCK_EXCL

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-15 09:11:49 +08:00
Yan, Zheng
1b7d198f63 mds: rebuild the internal states that track pending snapflush
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-15 09:11:49 +08:00