39888 Commits

Author SHA1 Message Date
Yan, Zheng
8ea5a811b3 client: re-send requests before composing the cap reconnect message
After commit 419800fe (client: re-send request when MDS enters reconnecting
stage), the cephfs client can send both unsafe requests and normal requests
when the MDS is in the reconnecting stage. Normal requests can have embedded
cap releases, and the client code encodes these embedded cap releases after
composing the cap reconnect message. This causes the client to silently drop
some caps. The fix is to re-send requests (which adds the embedded cap
releases) before composing the cap reconnect message.

Fixes: #10912
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-02-26 12:12:30 +08:00
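The ordering problem described in this commit can be pictured with a much-simplified model. The sketch below is not the Ceph client code: the `Request` and `Client` types, their fields, and the idea that the reconnect message is a plain snapshot of held caps are all assumptions made for illustration.

```cpp
#include <set>
#include <vector>

// Hypothetical, simplified model; none of these types are Ceph's real classes.
struct Request {
    std::vector<int> embedded_cap_releases;  // inode numbers whose caps the request gives up
};

struct Client {
    std::set<int> held_caps;       // inodes on which the client holds caps
    std::vector<Request> pending;  // requests to re-send while the MDS reconnects

    // Re-sending a request encodes its embedded releases, dropping those caps locally.
    void resend_requests() {
        for (const Request& r : pending)
            for (int ino : r.embedded_cap_releases)
                held_caps.erase(ino);
    }

    // In this model the reconnect message is a snapshot of the caps held right now.
    std::set<int> compose_reconnect() const { return held_caps; }

    // Old order: the snapshot is taken before the releases are applied,
    // so it can list caps the client no longer holds afterwards.
    std::set<int> reconnect_old_order() {
        std::set<int> msg = compose_reconnect();
        resend_requests();
        return msg;
    }

    // Order used by the fix: re-send first, then compose.
    std::set<int> reconnect_fixed_order() {
        resend_requests();
        return compose_reconnect();
    }
};
```

With the fixed order, the message and the client's own view of its caps agree; with the old order they can diverge.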
Josh Durgin
01099766d7 Merge remote-tracking branch 'origin/hammer' 2015-02-25 18:52:36 -08:00
Josh Durgin
32fd355086 Merge pull request #3796 from ceph/wip-librbd-async-operations
librbd: better handling for async maintenance requests

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2015-02-25 18:52:19 -08:00
Jason Dillaman
6db92330d2 Merge pull request #3791 from ceph/wip-librbd-mdlock
librbd: fix object map locking

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
2015-02-25 21:37:48 -05:00
Loic Dachary
e285d87014 Merge pull request #3801 from ceph/wip-fix-typo-troubleshooting
doc: fix typo deebug

Reviewed-by: Loic Dachary <ldachary@redhat.com>
2015-02-26 02:43:53 +01:00
Ken Dreyer
f3ad61a674 packaging: move rbd udev rules to ceph-common
We should ship the RBD udev rules in the same package that ships
/usr/bin/rbd.  This package happens to be ceph-common, so move the udev
rules there.

The udev rules rely on the ceph-rbdnamer utility, so move that utility
and its man page as well.

http://tracker.ceph.com/issues/10864 Refs: #10864

Signed-off-by: Ken Dreyer <kdreyer@redhat.com>
2015-02-25 17:43:01 -08:00
Josh Durgin
ec26f086cc librbd: remove unnecessary md_lock usage
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2015-02-25 17:42:13 -08:00
Josh Durgin
1f9782ea3c librbd: move object_map_lock acquisition into refresh()
Every caller was acquiring this just for these calls.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2015-02-25 17:40:25 -08:00
Gregory Meno
40be1404f5 doc: fix typo deebug
Signed-off-by: Gregory Meno <gmeno@redhat.com>
2015-02-25 17:30:33 -08:00
Josh Durgin
27e5ae603b librbd: don't check if object map is enabled before refreshing
This check is now done internally by the object map.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2015-02-25 17:27:34 -08:00
Josh Durgin
876f128f8b librbd: remove object map on rollback if needed
When rolling back to a snapshot that did not have object map enabled,
delete the head object map.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2015-02-25 17:27:34 -08:00
Josh Durgin
f4d8d16fbb librbd: clarify md_lock usage
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2015-02-25 17:27:34 -08:00
Josh Durgin
01dc05b931 test_librbd: add simple test for object map snapshot consistency
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2015-02-25 17:27:34 -08:00
Josh Durgin
85825008bc librbd: use snap_lock to protect ImageCtx->size
Since this is often looked up by snap_id anyway, snap_lock
is easy to use for this.

This lets us avoid taking md_lock in many places.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2015-02-25 17:27:34 -08:00
Josh Durgin
7fed0a366d librbd: hold snap_lock while reading parent info in diff_iterate
Caught by the re-added assertions in ImageCtx::get_parent_info()

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2015-02-25 17:27:34 -08:00
Josh Durgin
df42fd3833 test_librbd: close ioctx after imagectx
There's no need to explicitly close the ioctx. Doing so may cause
problems when the Images using it are destroyed afterwards.  Just let
normal cleanup at the end of the block take care of it in the correct
order.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2015-02-25 17:27:33 -08:00
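The "correct order" this commit relies on is the C++ rule that local objects are destroyed in reverse order of declaration. A minimal sketch follows, using hypothetical `FakeIoCtx` and `FakeImage` stand-ins rather than the real librados and librbd classes:

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for an I/O context and an image that depends on it.
struct FakeIoCtx {
    std::vector<std::string>* log;
    explicit FakeIoCtx(std::vector<std::string>* l) : log(l) {}
    ~FakeIoCtx() { log->push_back("ioctx closed"); }
};

struct FakeImage {
    FakeIoCtx* ioctx;
    explicit FakeImage(FakeIoCtx* io) : ioctx(io) {}
    // The image still uses its ioctx while it is being torn down.
    ~FakeImage() { ioctx->log->push_back("image closed"); }
};

// Returns the order in which the two objects were cleaned up.
std::vector<std::string> run_scope() {
    std::vector<std::string> log;
    {
        FakeIoCtx ioctx(&log);    // declared first, destroyed last
        FakeImage image(&ioctx);  // declared second, destroyed first
    }  // no explicit close: scope exit destroys image, then ioctx
    return log;
}
```

Closing the ioctx by hand inside the block would invert this order and leave the image with a dead context.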
Josh Durgin
06e5a3979f rbd: fix --image-feature parsing
Need to use _witharg(), not _flag()

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2015-02-25 17:27:33 -08:00
Josh Durgin
eef7466a5c librbd: apply flag read failure to all snaps
Don't check just the features of head, since it may be possible to
disable object map in the future.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2015-02-25 17:27:33 -08:00
Josh Durgin
6ac8139b25 librbd: make ImageCtx->object_map always present
This simplifies locking by obviating the NULL checks.  We no longer
need md_lock to protect these accesses. We can use object_map_lock
instead, to make sure no one reads an object map while it's being
updated.

Keep track of whether the object map is enabled for a given snapshot
internally. In each public method, check this state, and automatically
set it correctly when refreshing the object map. During snapshot
removal, unconditionally try to remove the object map object, to
protect against bugs leaking objects, and to be consistent with image
removal.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2015-02-25 17:27:33 -08:00
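The "always present, enabled state tracked internally" design from this commit might look roughly like the sketch below. `ObjectMapSketch` and its methods are hypothetical, `std::mutex` stands in for the real lock type, and answering "may exist" when the map is disabled is an assumption of this sketch.

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical object map that always exists; callers never NULL-check it.
class ObjectMapSketch {
public:
    // Refreshing sets the enabled state for the snapshot being opened.
    void refresh(bool enabled_for_snap, std::size_t num_objects) {
        std::lock_guard<std::mutex> guard(lock_);
        enabled_ = enabled_for_snap;
        exists_.assign(enabled_ ? num_objects : 0, false);
    }

    void mark_exists(std::size_t object_no) {
        std::lock_guard<std::mutex> guard(lock_);
        if (enabled_ && object_no < exists_.size())
            exists_[object_no] = true;
    }

    // Each public method checks the enabled state itself. When the map is
    // disabled (or the index is out of range) it answers conservatively.
    bool object_may_exist(std::size_t object_no) const {
        std::lock_guard<std::mutex> guard(lock_);
        if (!enabled_ || object_no >= exists_.size())
            return true;
        return exists_[object_no];
    }

private:
    mutable std::mutex lock_;  // plays the role of object_map_lock
    bool enabled_ = false;
    std::vector<bool> exists_;
};
```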
Jason Dillaman
d611121ad7 tests: add unit test to verify async requests time out
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2015-02-25 20:00:54 -05:00
Jason Dillaman
c29548519e librbd: restart async requests if lock owner doesn't report progress
Detect the case of a crashed lock owner by waiting for up to 30 seconds
for an async request progress message from the leader.  If a progress
message isn't received, restart the request (and possibly take ownership
of the lock).

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2015-02-25 19:51:07 -05:00
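The timeout logic described here can be sketched as a small watchdog that remembers when progress was last reported. `ProgressWatchdog` is a hypothetical helper, not part of librbd; the caller passes in the current time so the behaviour is easy to test, and only the 30-second figure comes from the commit message.

```cpp
#include <chrono>

// Hypothetical helper that decides when a remote async request looks stalled.
class ProgressWatchdog {
public:
    typedef std::chrono::steady_clock Clock;

    ProgressWatchdog(Clock::time_point now, std::chrono::seconds timeout)
        : last_progress_(now), timeout_(timeout) {}

    // Call whenever a progress message arrives from the lock owner.
    void on_progress(Clock::time_point now) { last_progress_ = now; }

    // True once no progress has been seen for the whole timeout period,
    // at which point the caller would restart the request.
    bool should_restart(Clock::time_point now) const {
        return now - last_progress_ >= timeout_;
    }

private:
    Clock::time_point last_progress_;
    std::chrono::seconds timeout_;
};
```

A caller would construct it with `std::chrono::seconds(30)` and poll `should_restart()` from a timer callback.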
Jason Dillaman
c611936c83 librbd: replace Finisher/SafeTimer use with facade
Replace the two Context threading classes used within
ImageWatcher with a facade to orchestrate the scheduling
and canceling of Context task callbacks.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2015-02-25 19:51:07 -05:00
Jason Dillaman
41e186a24f librbd: cancel in-progress maint operations before releasing lock
Ensure that all in-flight maintenance operations (resize, flatten) are
not running when the exclusive lock is released.  The lock will be
released when transitioning to a snapshot, closing the image, or
cooperatively when another client requests the lock.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2015-02-25 19:51:07 -05:00
Jason Dillaman
dde882cd7a librbd: flush context potentially completing too early
If the async operation associated with a flush request completes,
only complete the flush contexts if no previous operations are
still in flight. Otherwise, move the flush contexts to an older
in-flight async operation.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2015-02-25 19:51:07 -05:00
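The hand-off rule in this commit can be sketched with an ordered map of in-flight operations. `InFlightOps` is a hypothetical tracker, not the librbd implementation; this sketch assumes the flush callbacks move to the nearest older operation still in flight.

```cpp
#include <functional>
#include <iterator>
#include <map>
#include <utility>
#include <vector>

// Hypothetical tracker of in-flight async operations and their flush callbacks.
class InFlightOps {
public:
    typedef std::function<void()> Callback;

    // Register a new operation; ids increase with start order.
    int start() {
        ops_[next_id_];  // creates an empty callback list for this id
        return next_id_++;
    }

    // Attach a flush callback to an operation that is still in flight.
    void add_flush(int id, Callback cb) { ops_.at(id).push_back(std::move(cb)); }

    void complete(int id) {
        std::map<int, std::vector<Callback> >::iterator it = ops_.find(id);
        if (it == ops_.end())
            return;
        std::vector<Callback> flushes = std::move(it->second);
        if (it == ops_.begin()) {
            // Nothing older is in flight: the flush is really done.
            ops_.erase(it);
            for (std::size_t i = 0; i < flushes.size(); ++i)
                flushes[i]();
        } else {
            // An older operation is still running: hand the callbacks to it.
            std::vector<Callback>& older = std::prev(it)->second;
            for (std::size_t i = 0; i < flushes.size(); ++i)
                older.push_back(std::move(flushes[i]));
            ops_.erase(it);
        }
    }

private:
    int next_id_ = 0;
    std::map<int, std::vector<Callback> > ops_;
};
```

Because the callbacks are re-checked each time their current holder completes, they fire only after every operation started before the flush has finished.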
Josh Durgin
17493d648f Merge pull request #3799 from ceph/wip-librbd-image-watcher-tests
tests: add additional test coverage for ImageWatcher RPC

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2015-02-25 16:42:46 -08:00
Josh Durgin
04d360a4c1 librbd: take ImageCtx->snap_lock for write in add_snap()
add_snap() updates the ImageCtx snapshot metadata in memory, as well
as reading the flags as part of the object map snapshot. Both of these
require holding snap_lock.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2015-02-25 15:42:32 -08:00
Josh Durgin
40c2abb08a librbd: use snap_lock to protect ImageCtx->flags
This is another step towards eliminating md_lock from the writeback
path. Almost all the places that use ImageCtx->flags already use
snap_lock, so there's no need to create a new lock. For the others,
add a helper, test_flags() that acquires the lock, similar to
test_features().

This also makes sure we look up flags of the snapshot we're operating
on, instead of those for head.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2015-02-25 15:42:31 -08:00
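A helper of the kind this commit describes could look like the following sketch. `ImageState`, its fields, and the `kNoSnap` marker are hypothetical simplifications of ImageCtx, and `std::mutex` stands in for the read/write lock used by the real code.

```cpp
#include <cstdint>
#include <map>
#include <mutex>

// Assumed marker meaning "no snapshot is open, operate on head".
static const std::uint64_t kNoSnap = ~0ULL;

// Hypothetical stand-in for ImageCtx.
struct ImageState {
    mutable std::mutex snap_lock;                       // protects everything below
    std::uint64_t snap_id = kNoSnap;                    // currently open snapshot
    std::uint64_t head_flags = 0;                       // flags of the head image
    std::map<std::uint64_t, std::uint64_t> snap_flags;  // flags per snapshot id

    // Acquires snap_lock itself, so callers that do not already hold it
    // can query flags safely. Looks up the flags of the open snapshot
    // rather than always using those of head.
    bool test_flags(std::uint64_t mask) const {
        std::lock_guard<std::mutex> guard(snap_lock);
        std::uint64_t flags = head_flags;
        if (snap_id != kNoSnap) {
            std::map<std::uint64_t, std::uint64_t>::const_iterator it =
                snap_flags.find(snap_id);
            flags = (it != snap_flags.end()) ? it->second : 0;
        }
        return (flags & mask) == mask;
    }
};
```

Callers that already hold the lock must not call this helper with a non-recursive mutex; they would read the fields directly instead.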
Josh Durgin
bb4041fc95 librbd: add locking asserts to ImageCtx
A bunch of these used to be here, but were removed when converting to
RWLocks, before RWLocks had is_[w]locked() methods.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2015-02-25 15:41:54 -08:00
Josh Durgin
4bcbdbfd0f librbd: fix ImageWatcher::is_lock_supported() locking
Take snap_lock while reading ImageCtx->snap_id, and
look up the features by snap_id as well.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2015-02-25 15:41:54 -08:00
Josh Durgin
a94ceb6088 librbd: add and use a test_features() helper
This gets the appropriate locks, and checks the currently open
snapshot instead of head.  Looking up features by snap_id prepares us
for future addition or removal of e.g. an object map throughout the
life of an image.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2015-02-25 15:41:54 -08:00
Josh Durgin
cffd93a32f librbd: use ImageCtx->snap_lock for ImageCtx->features
This was being protected by md_lock, but that has become too coarse
since it is used to prevent writes from proceeding while flushing
caches for a snapshot. With the addition of ObjectMap and
ImageWatcher, writeback could try to acquire md_lock again, leading to
a deadlock.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2015-02-25 15:41:50 -08:00
Jason Dillaman
468839eac9 tests: add additional test coverage for ImageWatcher RPC
Test flatten, resize, and snap create RPC messages along with
basic error code return paths.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2015-02-25 15:03:49 -05:00
Jason Dillaman
915064a732 librbd: add ostream formatter for NotifyOp
Allow for reuse of the NotifyOp to string conversions within
dencoder and tests.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2015-02-25 14:58:27 -05:00
Greg Farnum
260c8201a8 fuse: do not invoke ll_register_callbacks() on finalize
We were passing in a NULL data structure, probably in an attempt to
let things clean up -- but our implementation just returns with a NULL
pass-in value, so drop it for clarity.

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
2015-02-25 11:24:52 -08:00
Gregory Farnum
93e04a04a6 Merge pull request #3794 from ceph/wip-10862-hammer
Backport: mon: do not try and "deactivate" the last MDS
2015-02-25 11:07:59 -08:00
Ken Dreyer
05a4d6b492 Merge pull request #3788 from ceph/wip-devel-python-split
split python-ceph into python-{rados,rbd,cephfs}

Reviewed-by: Ken Dreyer <kdreyer@redhat.com>
2015-02-25 12:00:43 -07:00
Sage Weil
1c68264928 doc/release-notes: final v0.87.1 notes
Signed-off-by: Sage Weil <sage@redhat.com>
2015-02-25 10:57:43 -08:00
Gregory Farnum
551fffa3be Merge pull request #3790 from ceph/hadoop-workunits
hadoop: workunits don't need java path
2015-02-25 09:20:49 -08:00
Haomai Wang
ff2d497f86 TestMsgr: Add inject error tests for lossless_peer_reuse policy
Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
2015-02-25 23:43:43 +08:00
Haomai Wang
9f24a8c75c TestMsgr: Make SyntheticWorkload support policy passed in
Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
2015-02-25 23:43:32 +08:00
John Spray
75d8c01179 mon: do not try and "deactivate" the last MDS
Fixes: #10862
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit a2867987bc)
2015-02-25 14:12:31 +00:00
John Spray
8f18fd9cce Merge pull request #3793 from Ved-vampir/master
common: change default value for perfcounter description ("NO_INFO_FIX">...

Reviewed-by: John Spray <john.spray@redhat.com>
2015-02-25 14:04:28 +00:00
Yan, Zheng
ce3d79f5f3 mds: remove MDSCacheObject::get_pin_totals()
MDSCacheObject::get_pin_totals() is debug code for other debug code, and
is not very useful.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-02-25 20:51:20 +08:00
Yan, Zheng
d92dbfd0e6 mds: optimize CDir::is_{freezing,frozen}_tree()
avoid checking ancestor CDir(s) when there is no freezing/frozen tree

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-02-25 20:51:19 +08:00
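One way to get the short-circuit described above is to keep a cache-wide count of freezing trees and consult it before walking any ancestors. The sketch below uses hypothetical `Cache` and `Dir` types, not the real CDir; the `visits` counter exists only to make the saved work visible.

```cpp
// Hypothetical cache-wide bookkeeping.
struct Cache {
    int num_freezing_trees = 0;  // how many subtree roots are currently freezing
    int visits = 0;              // instrumentation for this example only
};

// Hypothetical, simplified directory node.
struct Dir {
    Cache* cache;
    Dir* parent;
    bool freezing_root;

    Dir(Cache* c, Dir* p) : cache(c), parent(p), freezing_root(false) {}

    void mark_freezing() {
        freezing_root = true;
        ++cache->num_freezing_trees;
    }

    bool is_freezing_tree() const {
        // Fast path: with no freezing tree anywhere, skip the ancestor walk.
        if (cache->num_freezing_trees == 0)
            return false;
        for (const Dir* d = this; d != nullptr; d = d->parent) {
            ++cache->visits;
            if (d->freezing_root)
                return true;
        }
        return false;
    }
};
```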
Yan, Zheng
d7936da0f1 mds: optimize get_projected_{xattrs,srnode}
these two functions traverse the whole projected_nodes list if there
are no projected xattrs/srnode. A busy directory inode can have a large
projected_nodes list.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-02-25 20:51:19 +08:00
Yan, Zheng
96a85e7868 mds: use compact_map/compact_set to optimize memory usage of CDir
define some rarely used containers as compact_map/compact_set. Each
replacement can save 40 bytes in a 64-bit program.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-02-25 20:51:19 +08:00
Yan, Zheng
151494f407 mds: dynamically allocate data structures for file locks
The size of ceph_lock_state_t is about 200 bytes, and CInode contains two
ceph_lock_state_t. Dynamically allocating them can save about 400
bytes.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-02-25 20:51:19 +08:00
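The saving comes from replacing two embedded structures with two pointers that are filled in only when an inode actually gets file locks. A minimal sketch of that pattern, with hypothetical `LockState` and `InodeSketch` types rather than the real ceph_lock_state_t and CInode:

```cpp
#include <map>
#include <memory>

// Hypothetical stand-in for a per-inode lock table.
struct LockState {
    std::map<int, int> held_locks;  // placeholder contents
};

// Hypothetical inode that allocates its lock tables on first use.
struct InodeSketch {
    std::unique_ptr<LockState> fcntl_locks;
    std::unique_ptr<LockState> flock_locks;

    // Returns the table, creating it only when the caller asks for that.
    // Read-only callers pass create = false and must handle a null result.
    LockState* get_fcntl_locks(bool create) {
        if (!fcntl_locks && create)
            fcntl_locks.reset(new LockState());
        return fcntl_locks.get();
    }

    LockState* get_flock_locks(bool create) {
        if (!flock_locks && create)
            flock_locks.reset(new LockState());
        return flock_locks.get();
    }
};
```

An inode that never sees a file lock then pays for two null pointers instead of two full tables.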
Yan, Zheng
00047fbefd mds: use compact_map/compact_set to optimize memory usage of CInode
define some rarely used containers as compact_map/compact_set. Each
replacement can save 40 bytes in a 64-bit program.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-02-25 20:51:19 +08:00
Yan, Zheng
aa46d487bf mds: optimize memory usage of inode_t
inode_t::old_pools is rarely used. Defining it as compact_set can save
40 bytes. inline_data is also rarely used; dynamically allocating the
bufferlist for inline_data can save another 72 bytes.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-02-25 20:50:36 +08:00
Yan, Zheng
3075a07902 mds: optimize memory usage of class InodeStore
InodeStore::old_inodes and InodeStore::snap_blob are for snapshotted
inodes only. Their sizes are 48 bytes and 80 bytes respectively. Defining
InodeStore::old_inodes as compact_map can save 40 bytes, and allocating
the bufferlist for snap_blob dynamically can save 72 bytes.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-02-25 20:12:13 +08:00