21809 Commits

Author SHA1 Message Date
Sage Weil
dc2ced96bf Merge branch 'wip-client-leaks' into next 2012-11-12 15:02:51 -08:00
Sage Weil
2f241685e8 client: fix null put in ~MetaSession
Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-12 15:02:41 -08:00
Sage Weil
a56c1ca3b3 Merge branch 'wip-client-leaks' into next 2012-11-10 02:38:26 -08:00
Dan Mick
07b4f8fa0a si_t was not properly converting values < 100KB
Signed-off-by: Dan Mick <dan.mick@inktank.com>
2012-11-09 19:26:42 -08:00
Sage Weil
8f49de0fb1 osdc/ObjectCacher: only call flush callback if we transition to clean
If we race with e.g. truncate and are in bh_write_commit but the oset
is already clean, we should not call the flush callback (again).

This is reproduced by:

 - kludging slow osd replies into the code (e.g., 2 second delay)
 - mount ceph-fuse with --client-oc-max-dirty-age 1
 - dd if=/dev/zero of=mnt/foo count=1
   sleep 1
   truncate --size 0 mnt/foo
 -> crash

Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-09 18:34:22 -08:00
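The rule described in 8f49de0fb1 (fire the flush callback only on a dirty-to-clean transition) can be sketched as follows. This is a minimal illustration, not the ObjectCacher code: ObjectSetSketch, dirty_or_tx, and on_write_commit are hypothetical names invented for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>

// Hypothetical stand-in for an object set tracked by a cache.
struct ObjectSetSketch {
    int dirty_or_tx = 0;              // buffers still dirty or in flight
    std::function<void()> flush_cb;   // invoked when the set becomes clean
};

// Handle the acknowledgement of `committed` buffers.
// Returns true only if the flush callback was invoked.
bool on_write_commit(ObjectSetSketch& oset, int committed) {
    assert(committed >= 0);
    const bool was_clean = (oset.dirty_or_tx == 0);
    oset.dirty_or_tx = std::max(0, oset.dirty_or_tx - committed);
    const bool now_clean = (oset.dirty_or_tx == 0);

    // Fire only on the transition; a set that was already clean
    // (for example, emptied by a racing truncate) is left alone.
    if (!was_clean && now_clean && oset.flush_cb) {
        oset.flush_cb();
        return true;
    }
    return false;
}
```

With one dirty buffer, the first commit triggers the callback; a second, late commit on the already-clean set does not.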
Sage Weil
54f6c17ae3 client: ensure we don't leak MClientReply
We are careful to clear this reference when processing it.

Add an assert here.  There's no way we can get 2 quick replies because
of the kick-back below.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-09 16:03:20 -08:00
Sage Weil
a8053c10d9 ceph-fuse: fix leak of args
Also fix up the helper we use to have fewer sharp edges.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-09 16:03:20 -08:00
Sage Weil
b305fc8735 osdc/ObjectCacher: fix leak on readahead
If we initiate io (success == false) but have no waiter, we need to
delete the OSDRead.

This affects libcephfs/ceph-fuse, but not librbd, which does no readahead.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-09 16:03:20 -08:00
Josh Durgin
9aae0eeaa7 rbd: check for second argument to mv/rename
Without this check, 'rbd mv foo' crashed trying to use a NULL char* as
a string.

Reported-by: Andrey Korolyov <andrey@xdel.ru>
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2012-11-09 12:11:20 -08:00
Sage Weil
b5ce4d0ed7 client: fix SnapRealm leak
Start ref count at 0; get_snap_realm() will increment it after alloc.

Fix the ref drop order so that the xlist is empty.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-09 10:12:21 -08:00
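The counting convention in b5ce4d0ed7 (a new object starts at zero references and the lookup takes the first one) might look like the sketch below. RealmSketch and RealmTableSketch are hypothetical names; the real client code is not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Hypothetical ref-counted object: the count starts at 0, not 1.
struct RealmSketch {
    int nref = 0;
};

class RealmTableSketch {
public:
    // Look up or allocate, then take a reference. Because allocation
    // leaves nref at 0, the first caller ends up holding exactly one.
    RealmSketch* get(int ino) {
        RealmSketch*& r = realms_[ino];
        if (!r)
            r = new RealmSketch;
        ++r->nref;
        return r;
    }

    // Drop a reference; free the object when the last one goes away.
    void put(int ino) {
        auto it = realms_.find(ino);
        assert(it != realms_.end());
        assert(it->second->nref > 0);
        if (--it->second->nref == 0) {
            delete it->second;
            realms_.erase(it);
        }
    }

    std::size_t size() const { return realms_.size(); }

    ~RealmTableSketch() {
        for (auto& kv : realms_)
            delete kv.second;
    }

private:
    std::map<int, RealmSketch*> realms_;
};
```

Starting the count at 1 and then incrementing in get() would leave one reference that nobody ever drops, which is the kind of leak the commit describes.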
Sage Weil
56a152b1f2 client: debug SnapRealm reference counting
Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-09 10:09:53 -08:00
Sage Weil
88cdde37d1 client: fix leak of Cap
Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-09 10:02:59 -08:00
Sage Weil
5e564f9b7f client: fix leak of session release msg on session close
Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-09 10:01:48 -08:00
Sage Weil
c352edd328 client: only start invalidator thread if cb != NULL
Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-09 09:59:39 -08:00
Sage Weil
44a7017868 ceph-fuse: deallocate messenger, g_ceph_context on stop
This lets us use valgrind to find leaks.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-09 09:52:08 -08:00
Sage Weil
66e6a63608 client: give get_caps() out-arg a less confusing name
No functional change.

Call this arg "have" and not "got", since we only take a ref on what we
need.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-09 07:38:42 -08:00
Sage Weil
ad4bd4e23e client: do not gratuitously drop FILE_CACHE ref in _read()
The get_caps() had a confusing out-arg called "got" that is really what
caps we *have*; it only takes a ref on the *need* cap.  We should only
put that one explicitly (CEPH_CAP_FILE_RD).  The _write() method already
does this properly, but _read() did not.

Fixes: #3470
Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-09 07:38:42 -08:00
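The distinction drawn in ad4bd4e23e, between the caps we *have* and the cap we took a reference on, can be illustrated with a small sketch. The types, bit values, and function bodies below are assumptions made for this example and do not mirror the actual client implementation.

```cpp
#include <cassert>
#include <map>

// Hypothetical capability bits for the sketch.
constexpr unsigned CAP_RD = 1u << 0;
constexpr unsigned CAP_CACHE = 1u << 1;

struct InodeSketch {
    unsigned issued = 0;               // caps currently held
    std::map<unsigned, int> cap_refs;  // reference count per cap mask
};

// Takes a reference on `need` only; `have` merely reports what is issued.
bool get_caps(InodeSketch& in, unsigned need, unsigned* have) {
    if ((in.issued & need) != need)
        return false;
    ++in.cap_refs[need];
    *have = in.issued;
    return true;
}

void put_cap_ref(InodeSketch& in, unsigned cap) {
    int& n = in.cap_refs[cap];
    assert(n > 0);  // refs must never go negative
    --n;
}

int read_sketch(InodeSketch& in) {
    unsigned have = 0;
    if (!get_caps(in, CAP_RD, &have))
        return -1;
    const bool may_cache = (have & CAP_CACHE) != 0;  // inspect, do not put
    (void)may_cache;
    put_cap_ref(in, CAP_RD);  // release only the reference we took
    return 0;
}
```

Calling put_cap_ref(in, CAP_CACHE) in read_sketch() would trip the assert, since no reference on that cap was ever taken.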
Sage Weil
128fed8e17 client: assert cap refs don't go negative
This is the root cause of #3470.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-09 07:38:42 -08:00
Dan Mick
faf23caa6a rbd: fix snap unprotect, which was succeeding while clones existed
1) use right snap id when forming parent spec to search for children
2) add test case for "unprotect with extant children"

Signed-off-by: Dan Mick <dan.mick@inktank.com>
2012-11-08 17:14:22 -08:00
Gary Lowell
a39110db47 ceph.spec.in: Remove ceph version requirement from ceph-fuse package.
The ceph-fuse rpm package now only requires ceph as a pre-req, not a specific
version.
2012-11-08 09:39:59 -08:00
caleb miles
e37c19285a rgw_admin: do not throw error when start-date and end-date are not
passed to usage::trim()

Signed-off-by: caleb miles <caleb.miles@inktank.com>
2012-11-08 09:31:30 -08:00
Sage Weil
ca8988020e client: kick waiters for an mds session to open on mds recovery
We already kicked waiters for request, but we need to kick waiters on open
too (e.g., a client trying to mount).

Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-07 04:42:01 -08:00
Sage Weil
f0927cbb76 qa: disable xfstest 45 until mount issue is resolved on precise
Meh

Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-07 04:15:31 -08:00
Dan Mick
0baa9275ea cls_rbd: send proper format of key to "last_read" for dir_list
rbd ls of format-2 images was looping on the first 64 (when more than 64
were present).  The key name passed to the omap layer needs to always
contain the prefix, and the "inside-the-loop next-chunk" statement
was missing the "add the prefix" call.

Also, add a test for listing 100 images, format 1 and 2.

Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2012-11-06 15:51:44 -08:00
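The paging bug described in 0baa9275ea can be sketched with an in-memory map standing in for the omap. Everything here is hypothetical (the "name_" prefix, get_keys_after, list_names); the point is only that the continuation key has to carry the prefix on every iteration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

using OmapSketch = std::map<std::string, std::string>;

// Hypothetical key prefix under which names are stored.
const std::string kPrefix = "name_";

// Return up to `max` keys strictly after `start_after`.
std::vector<std::string> get_keys_after(const OmapSketch& omap,
                                        const std::string& start_after,
                                        std::size_t max) {
    std::vector<std::string> out;
    for (auto it = omap.upper_bound(start_after);
         it != omap.end() && out.size() < max; ++it)
        out.push_back(it->first);
    return out;
}

std::vector<std::string> list_names(const OmapSketch& omap, std::size_t chunk) {
    std::vector<std::string> names;
    if (chunk == 0)
        return names;
    std::string last_read = kPrefix;  // first chunk starts at the prefix
    for (;;) {
        const auto keys = get_keys_after(omap, last_read, chunk);
        for (const auto& k : keys) {
            if (k.compare(0, kPrefix.size(), kPrefix) != 0)
                return names;  // walked past the prefixed range
            names.push_back(k.substr(kPrefix.size()));
        }
        if (keys.size() < chunk)
            return names;
        // The next-chunk key must include the prefix again. Passing the
        // bare name here can sort before the prefixed range and restart
        // the listing from the beginning.
        last_read = kPrefix + names.back();
    }
}
```

Listing 100 entries with a chunk size of 64 then takes two passes and returns each name once.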
Yehuda Sadeh
84299e16f3 rgw: fix multipart overwrite
Fixes: #3400
Removed a few lines of code that prematurely created the head
part of the final object (before creating the manifest).

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-11-06 10:17:14 -08:00
Yehuda Sadeh
be6d563653 rgw: don't reset multipart parts when updating their metadata
Fixes: #3401
The problem was that put_obj_meta() assumed the object was going
to be reset, so it reset the object unconditionally. That assumption
does not hold for the immutable multipart upload parts.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-11-06 10:16:59 -08:00
Yehuda Sadeh
488b019adf rgw: break out of read loop if we got zero bytes
If the part that we're reading is corrupted and we end up
reading zero bytes, we need to exit, otherwise we'd just
loop forever.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-11-06 10:16:17 -08:00
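The loop guard from 488b019adf can be sketched generically. ReadFn and read_range are hypothetical names for this example; the actual rgw read path is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical reader: appends up to `max` bytes starting at `off` to `out`
// and returns the number of bytes read, or a negative error code.
using ReadFn = std::function<long(std::size_t off, std::size_t max, std::string& out)>;

// Read `len` bytes starting at `off`, stopping early on error or on a
// zero-byte read. Returns the bytes read or a negative error code.
long read_range(const ReadFn& read_fn, std::size_t off, std::size_t len,
                std::string& out) {
    std::size_t done = 0;
    while (done < len) {
        const long r = read_fn(off + done, len - done, out);
        if (r < 0)
            return r;
        if (r == 0)
            break;  // no progress: exit instead of spinning forever
        assert(static_cast<std::size_t>(r) <= len - done);
        done += static_cast<std::size_t>(r);
    }
    return static_cast<long>(done);
}
```

Without the r == 0 check, a part that is shorter than expected would keep the while condition true indefinitely.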
Dan Mick
241569c595 rbd: allow removal of image even if rbd_children deletion fails
Users have been seeing failures where rbd rm is half-done; could be
because of outstanding watches on the rbd_header object.  The state
is that rbd_children no longer contains the child, but other pieces
remain; remove considers this a failure.

Fix: test for ENOENT from remove_child, and treat that as an ignorable
error and drive on.  Simulate this in copy.sh by removing the
rbd_children object altogether, which also results in ENOENT return
from remove_child.

Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2012-11-05 21:41:34 -08:00
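The error handling described in 241569c595 (treat ENOENT from the child removal as ignorable and carry on) can be sketched like this. The function name and the callback parameters are assumptions for illustration; only the ENOENT handling reflects the commit message.

```cpp
#include <cassert>
#include <cerrno>
#include <functional>

// Hypothetical removal driver: `remove_child` unregisters the image from
// its parent's child list, `remove_rest` deletes the remaining pieces.
// Both return 0 on success or a negative errno value.
int remove_image_sketch(const std::function<int()>& remove_child,
                        const std::function<int()>& remove_rest) {
    assert(remove_child && remove_rest);
    const int r = remove_child();
    // -ENOENT means the child entry is already gone (for example after a
    // half-finished earlier removal), so it is safe to keep going.
    if (r < 0 && r != -ENOENT)
        return r;
    return remove_rest();
}
```

Any other error from remove_child still aborts the removal, so only the "already gone" case is tolerated.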
Samuel Just
342c2c7077 PG::merge_old_entry: fix case for divergent prior_version
Previously, we asserted that a log entry with a divergent
prior_version must be a clone.  Consider the following
case:

6'11(6'2)  m foo
7'12(6'3) m bar
7'13(7'12) m bar

If this is merged with:

6'11(6'2)  m foo
8'12(6'4) m baz

we will hit the assert.  The correct behavior is simply to remove
the object as in the clone case.

Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2012-11-04 06:16:36 -08:00
Samuel Just
7e264678a9 PG: use remove_object_with_snap_hardlinks for divergent objects
Otherwise, we end up leaving snap hardlinks in the snapshot
index directories.  This eventually results in an EEXIST error
when we attempt to re-link the clone into place during
recovery.

Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2012-11-04 06:16:18 -08:00
Sage Weil
c435d314ca ceph-disk-activate: avoid duplicating mounts if already activated
If the given device is already mounted at the target location, do not
mount --move it again and create a bunch of dup entries in the /etc/mtab
and kernel mount table.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-10-31 17:14:31 -07:00
Mike Ryan
3f08e96cc0 PG: requeue snap_trimmer after scrub finishes
Previously the snap_trimmer would continuously requeue itself until the
end of scrub. This degrades performance and fills up logs for No Good
Reason.

Signed-off-by: Mike Ryan <mike.ryan@inktank.com>
2012-10-31 15:42:03 -07:00
Sage Weil
402e1f5319 ceph-disk-prepare: poke kernel into refreshing partition tables
Prod the kernel to refresh the partition table after we create one.  The
partprobe program is packaged with parted, which we already use, so this
introduces no new dependency.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-10-30 10:40:58 -07:00
Sage Weil
2e32a0ee2d ceph-disk-prepare: fix journal partition creation
The end value needs to have + to indicate it is relative to wherever the
start is.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-10-30 10:40:58 -07:00
Sage Weil
8921fc7c7b ceph-disk-prepare: assume parted failure means no partition table
If the disk has no valid label we get an error like

  Error: /dev/sdi: unrecognised disk label

Assume any error we get is that and go with an id label of 1.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-10-30 10:40:58 -07:00
Sage Weil
b9eccdf8ba osd: make pool_snap_info_t encoding backward compatible
Way back in fc869dee1e (v0.42) when we redid
the osd type encoding we forgot to make this conditionally encode the old
format for old clients.  In particular, this means that kernel clients
will fail to decode the osdmap if there is a rados pool with a pool-level
snapshot defined.

Fixes: #3290
Signed-off-by: Sage Weil <sage@inktank.com>
2012-10-29 11:05:59 -07:00
Sage Weil
2f09d47d21 mon: fix leading error string from 'ceph report'
Signed-off-by: Sage Weil <sage@inktank.com>
2012-10-26 14:55:43 -07:00
Sage Weil
c0df832877 osd: fix populate_obc_watchers() assert
There is one case where populate_obc_watchers gets called when the object
is missing: during a revert.  And in that case we *should* do the populate,
since all that is getting reverted is the object version.

Fixes: #3405
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Sam Just <sam.just@inktank.com>
2012-10-25 11:59:33 -07:00
Sage Weil
2248822b2c osd: drop conditional check in populate_obc_watchers
Turn these into asserts.  The only two callers are create_object_context()
and get_object_context(), and they only get called when the object is no
longer missing.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
2012-10-22 10:45:36 -07:00
Sage Weil
4156b984a5 osd: populate obc watchers even when degraded
Bug #3142 appears to be caused by the following sequence:

 - object X missing on primary and replica
 - [assert-ver,watch], notify, unwatch requests come in, get deferred
 - object is recovered on primary, !missing, create_object_context
   - populate_obc_watchers() does nothing, since still degraded
 - notify happens now (odd but ok?)
 - replica recovered, !degraded
 - watch skips bc of bad assert
 - unwatch trips up on an assert because populate_obc_watchers never
   ran

Fix this by populating the obc watcher when !missing, not when
!degraded.  This conditional dates back to Sam's original watch/notify
cleanup in October 2011.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
2012-10-22 10:45:20 -07:00
Sam Lang
233b0bdf0b test/libcephfs: Fix telldir/seekdir test
Signed-off-by: Sam Lang <sam.lang@inktank.com>
2012-10-19 09:22:29 -07:00
Sage Weil
1c382869ba addr_parsing: make , and ; and ' ' all delimiters
Instead of just ,.  Currently "foo.com, bar.com" will fail because of the
space after the comma.  This patch fixes that, and makes all delim
chars interchangeable.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-10-17 23:00:10 -07:00
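The delimiter handling described in 1c382869ba can be sketched with the standard string search functions. split_hosts is a hypothetical helper written for this example, not the addr_parsing code itself.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Split a host list on ',', ';' and ' ', treating the three characters as
// interchangeable and collapsing runs of them (so "a, b" yields two items).
std::vector<std::string> split_hosts(const std::string& s) {
    static const char* const kDelims = ",; ";
    std::vector<std::string> out;
    std::size_t pos = 0;
    while ((pos = s.find_first_not_of(kDelims, pos)) != std::string::npos) {
        std::size_t end = s.find_first_of(kDelims, pos);
        if (end == std::string::npos)
            end = s.size();
        assert(end > pos);
        out.push_back(s.substr(pos, end - pos));
        pos = end;
    }
    return out;
}
```

With this approach "foo.com, bar.com", "foo.com;bar.com" and "foo.com bar.com" all produce the same two entries.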
Sage Weil
6f74e6b36a radosgw: fix compile warning
Signed-off-by: Sage Weil <sage@inktank.com>
2012-10-16 17:27:28 -07:00
Gary Lowell
d78ba6af94 Merge branch 'next' 2012-10-16 23:27:21 +00:00
John Wilkins
ab4d8b75f3 doc: Updated the cephx section of the toc for cluster ops.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2012-10-16 15:24:11 -07:00
John Wilkins
256c665eab doc: Did a little clean-up work in the cephx guide.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2012-10-16 15:23:34 -07:00
John Wilkins
0818e1e95a doc: Added a new intro for cephx authentication.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2012-10-16 15:22:25 -07:00
Yehuda Sadeh
d2afddd457 rgw: multiple coverity fixes
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-10-16 14:13:38 -07:00
Sage Weil
db976663a5 mds: explicitly queue messages for unconnected clients
Previously, the messenger would queue messages for a destination that
didn't exist when you were a server; that changed a while back with the
wip-msgr merge (circa v0.52).  The result is that when we force open
client sessions and queue messages, they are dropped on the floor and the
client--when it does connect--gets confusing stuff from the MDS.

Instead, explicitly queue and send these messages.  Also, *always* send
via the Connection* instead of the inst.

Fixes: #2681
Signed-off-by: Sage Weil <sage@inktank.com>
2012-10-16 13:04:43 -07:00
Gary Lowell
2528b5ee10 v0.53 2012-10-16 17:42:36 +00:00