Commit Graph

22510 Commits

Author SHA1 Message Date
Josh Durgin
efdb209b0b doc: reorder some openstack/rbd instructions
Move client creation to the section on setting up client auth, so you
don't skip it if you already have pools created.

Move CEPH_ARGS setting to the section on configuring OpenStack, since
it's a change for the OpenStack services, not purely ceph client
configuration.

A couple people were confused by the placement of these parts, and
they make more sense in these sections.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2012-11-13 08:14:18 -08:00
Sage Weil
32fb8eaf7f Merge branch 'wip-client-asok' 2012-11-12 17:01:30 -08:00
Sage Weil
caed0e917f osdc/ObjectCacher: do not take Object ref for bh writes
This reverts commit 46897fd4ff.

There is no reason to carry a ref for the writes; it just makes things
more confusing because the refs aren't used for lifecycle, only for
LRU pinning.  We would need to duplicate the close_object() conditional
here for this to work right.

This fixes #3431, in which a slow osd reply has an Object pinned, but when
we try to truncate it away we hit the assert in can_close().

Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-12 16:31:35 -08:00
Sam Lang
8b4bdda58d client: Remove object from oset before deleting
Prevent invalid memory references for cases where
a truncate causes an object to be deleted but the
object set still references it.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2012-11-12 16:29:52 -08:00
Noah Watkins
0b85e43c76 java: fix build.xml formatting
set noet ts=2 sw=2 sws=2

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
2012-11-12 15:50:41 -08:00
Noah Watkins
8970e81afb java: fix javadoc builds
Don't build JavaDoc for tests, and fix the missing src.dir variable.

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
2012-11-12 15:50:35 -08:00
Sage Weil
a11940f56a osdc/ObjectCacher: only return ENOENT if ObjectSet is flagged
The fs client can't handle ENOENT from the cache, but librbd wants it.
Also, the fs client will send down multiple ObjectExtents per io, but that
is incompatible with the ENOENT behavior.

Indicate which behavior we want via the ObjectSet, and update librbd to
explicitly ask for it.  This fixes the fs client, which is currently
broken (it returns ENOENT on read).

Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-12 15:44:54 -08:00
Sage Weil
16db00d5d1 client: unregister commands on shutdown
Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-12 15:06:10 -08:00
Sage Weil
7d1974c540 client: fix null dentry crash on dump_cache
Dentries can be NULL!

Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-12 15:06:10 -08:00
Sage Weil
ad3063a4f8 client: dump mds session info
Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-12 15:06:10 -08:00
Sage Weil
fc6b82f01d client: add dump_cache asok command
Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-12 15:05:52 -08:00
Sage Weil
2c28e5dce9 common: dumpers for ceph_{file,dir}_layout
Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-12 15:05:52 -08:00
Sage Weil
69c47d3d8b client: add mds_requests asok command
Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-12 15:05:51 -08:00
Sage Weil
809d0e56c6 Makefile: fix hadoop lib build
Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-12 15:05:51 -08:00
Sage Weil
ef71f32af4 Makefile: use libclient.la for libcephfs
Avoid building these files twice!

Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-12 15:05:51 -08:00
Sage Weil
bc398c0321 Merge branch 'next' 2012-11-12 15:03:12 -08:00
Sage Weil
dc2ced96bf Merge branch 'wip-client-leaks' into next 2012-11-12 15:02:51 -08:00
Sage Weil
2f241685e8 client: fix null put in ~MetaSession
Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-12 15:02:41 -08:00
Josh Durgin
8b1f547243 librbd: fix create existence checking
cda9e516b8 made us return 0 when the
image already existed, causing copy to erroneously ignore an existing
image. Separate the case where we know the image exists from being
unable to tell whether it exists because of e.g. an authentication
problem.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2012-11-12 13:59:41 -08:00
Josh Durgin
614cf34bb2 librbd: debug when copy occurs
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2012-11-12 13:57:18 -08:00
Sage Weil
9c31d09f59 mon: kick failures when we lose leadership
If we were leader and are not anymore, kick any pending failure messages.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-12 13:54:08 -08:00
Sage Weil
e43f9d724b mon: process failures when osds go down
If we see an osd go down, process any pending failure_info reports we have.
Reply, and then remove the record from the map.

This ensures that we process failures and clean up regardless of *why* or
*who* did the marking down; either way, the osd was failed.

This fixes crashes like #3477, where the failure record was removed when we
updated the pending_map, but additional failures came in, and we didn't
clean up.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-12 13:53:10 -08:00
Sage Weil
763d348cc4 mon: ignore failure messages if already pending a failure
If a failure is already pending, do nothing in check_failure().

Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-12 13:53:10 -08:00
Sage Weil
f63b7711ab Merge remote-tracking branch 'gh/wip-javadoc' 2012-11-12 12:36:19 -08:00
Sage Weil
b94037ce94 Merge remote-tracking branch 'gh/wip-librbd-remove-cleanup' 2012-11-12 12:30:14 -08:00
Sage Weil
23531c297a osd: add 'osd debug drop op probability'
This is meant to exercise the kclient's timeout and osd reconnect logic
by dropping some client requests on the floor.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-12 12:04:34 -08:00
Sage Weil
efa03cef5e mon: require pgnum in 'ceph osd pool create <poolname> <pgnum> [<pgp_num>]' command
The default of 8 is virtually never the right answer.  Require the initial
pg count to be explicitly provided.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-12 12:01:07 -08:00
Josh Durgin
cda9e516b8 librbd: return actual error when detecting format fails during creation
This bit a couple users today, when bad osd caps resulted in a very
confusing error message.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2012-11-12 11:16:21 -08:00
Josh Durgin
ca9f93aa37 Merge remote branch 'origin/wip-rbd-read'
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2012-11-12 10:41:35 -08:00
Sage Weil
a1b950e6e9 Merge remote-tracking branch 'gh/wip-client-symlinks'
Reviewed-by: Sam Lang <sam.lang@inktank.com>
2012-11-12 11:08:46 -08:00
Noah Watkins
3d76e67730 java: add symlink/readlink tests
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
2012-11-12 10:45:26 -08:00
Sage Weil
6dd7925254 test_libcephfs: fix, add symlink loop tests
The first test did

 /a/b/file
 /a/b/sym -> /a/b

and opened /a/b/sym/file, which is valid.  Change it to

 /a/b/file
 /a/b/sym -> /a/b/sym

which is not.

Add another test that does

 /a -> /b
 /b -> /c
 /c -> /a

Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-12 10:07:49 -08:00
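The layouts described in the commit above can be reproduced on an ordinary local POSIX filesystem. The sketch below is only an illustration and is not the libcephfs test itself: it builds the three cases in a scratch directory and reports the errno from opening each one. The helper names (`open_errno`, `build_cases`) and the link names (`sym_ok`, `sym_loop`, `cycle/`) are hypothetical, and it assumes a platform that permits creating symlinks.

```python
import errno
import os
import tempfile


def open_errno(path):
    """Return 0 if the path opens read-only, otherwise the errno of the failure."""
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError as e:
        return e.errno
    os.close(fd)
    return 0


def build_cases(root):
    """Create the three layouts under root and try to open each one."""
    b = os.path.join(root, "a", "b")
    os.makedirs(b)
    with open(os.path.join(b, "file"), "w") as f:
        f.write("x")

    # Case 1: sym -> /a/b, so sym/file is a valid path.
    ok = os.path.join(b, "sym_ok")
    os.symlink(b, ok)

    # Case 2: sym -> itself, so nothing below it can ever resolve.
    loop = os.path.join(b, "sym_loop")
    os.symlink(loop, loop)

    # Case 3: a -> b, b -> c, c -> a (kept in a separate directory here).
    cyc = os.path.join(root, "cycle")
    os.mkdir(cyc)
    ca, cb, cc = (os.path.join(cyc, n) for n in ("a", "b", "c"))
    os.symlink(cb, ca)
    os.symlink(cc, cb)
    os.symlink(ca, cc)

    return {
        "valid": open_errno(os.path.join(ok, "file")),
        "self": open_errno(os.path.join(loop, "file")),
        "cycle": open_errno(ca),
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        for name, err in build_cases(root).items():
            print(name, errno.errorcode.get(err, "OK"))
```

On Linux the first case should open cleanly while the other two fail with ELOOP.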
Sage Weil
3902a01843 debug: adjust default debug levels
Trim out most noise, keep things that are interesting.

Notably, we are logging each message sent and received, and we are logging
the filestore operations when they get queued.  Those may still benefit
from being turned off in high IOPS environments.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-12 08:56:45 -08:00
Sage Weil
b17522d7ae Merge branch 'next' 2012-11-10 02:40:33 -08:00
Sage Weil
a56c1ca3b3 Merge branch 'wip-client-leaks' into next 2012-11-10 02:38:26 -08:00
Sage Weil
6c0be020c0 client: simplify/fix symlink loop check
Checking that we visit a symlink isn't correct; for example, the below is
valid, and we visit /b twice.

 /a/b -> c
 /a/c/d -> /a/b

In order to do this "correctly", I think we would need to track the pairs
of paths and symlinks we are resolving.  But, reading the man pages,
ELOOP is actually just defined as traversing more than MAXSYMLINKS syms.
(It appears to be 20 on my machine.)

Just do that instead.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-10 02:35:04 -08:00
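The counting approach described in the commit above can be sketched on a toy, in-memory symlink table. This is not the Ceph client's path_walk code: the function `resolve`, the `symlinks` dictionary, and the limit of 20 (the value the commit message says it "appears to be" on the author's machine) are assumptions made for illustration.

```python
import errno

MAXSYMLINKS = 20  # assumed limit for this sketch


def resolve(path, symlinks, max_symlinks=MAXSYMLINKS):
    """Resolve an absolute path against a toy symlink table.

    symlinks maps an absolute path to its link target (absolute or relative).
    Raises OSError(ELOOP) once more than max_symlinks links have been
    traversed, instead of tracking which links were already visited.
    """
    pending = [c for c in path.split("/") if c]
    resolved = []   # components of the path resolved so far
    traversed = 0   # number of symlinks followed during this walk

    while pending:
        comp = pending.pop(0)
        if comp == ".":
            continue
        if comp == "..":
            if resolved:
                resolved.pop()
            continue

        candidate = "/" + "/".join(resolved + [comp])
        target = symlinks.get(candidate)
        if target is None:
            # Ordinary component: keep walking.
            resolved.append(comp)
            continue

        traversed += 1
        if traversed > max_symlinks:
            raise OSError(errno.ELOOP, "too many levels of symbolic links")

        if target.startswith("/"):
            resolved = []  # absolute target restarts from the root
        # Splice the target's components in front of what is left to walk.
        pending = [c for c in target.split("/") if c] + pending

    return "/" + "/".join(resolved)


if __name__ == "__main__":
    # The valid layout from the commit message: /a/b is visited twice.
    links = {"/a/b": "c", "/a/c/d": "/a/b"}
    print(resolve("/a/b/d", links))
```

With the layout from the commit message, `/a/b/d` resolves to `/a/c` even though `/a/b` is visited twice, whereas a genuine cycle keeps incrementing the counter until it exceeds the limit and raises ELOOP.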
Sage Weil
d037ff4ca1 client: fix path_walk for directory symlinks
Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-10 02:14:02 -08:00
Josh Durgin
cd144534e9 OSDMonitor: remove max_devices and max_osd interdependency
Higher max_osd than max_devices doesn't hurt anything (and is the
normal way to add more osds). Higher max_devices than max_osds are
filtered out of crush results since e541c0f8d871172ec61962372efca943308e5fe,
so they don't matter either.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2012-11-10 01:42:09 -08:00
Sage Weil
ae1f46c3b5 mds: re-try_set_loner() after doing evals in eval(CInode*, int mask)
Consider a case where current loner is A and wanted loner is B.
At the top of the function we try to set the loner, but that may fail
because we haven't processed the gathered caps yet for the previous
loner.  In the body we do that and potentially drop the old loner, but we
do not try_set_loner() again on the desired loner.

Try after our drop.  If it succeeds, loop through the evals one more time
so that we can issue caps appropriately.

This fixes a hang induced by a simple loop like:

 while true ; do echo asdf >> mnt.a/foo ; tail mnt.b/foo ; done &
 while true ; do ls mnt.a mnt.b ; done

(The second loop may not be necessary.)

Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-10 01:38:15 -08:00
Dan Mick
4b94e83fc9 Merge branch 'next'
Pull in types.h fix
2012-11-09 19:35:12 -08:00
Dan Mick
07b4f8fa0a si_t was not properly converting values < 100KB
Signed-off-by: Dan Mick <dan.mick@inktank.com>
2012-11-09 19:26:42 -08:00
Sage Weil
8f49de0fb1 osdc/ObjectCacher: only call flush callback if we transition to clean
If we race with e.g. truncate and are in bh_write_commit but the oset
is already clean, we should not call the flush callback (again).

This is reproduced by:

 - kludging slow osd replies into the code (e.g., 2 second delay)
 - mount ceph-fuse with --client-oc-max-dirty-age 1
 - dd if=/dev/zero of=mnt/foo count=1
   sleep 1
   truncate --size 0 mnt/foo
 -> crash

Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-09 18:34:22 -08:00
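The rule in the commit above, firing the flush callback only on a dirty-to-clean transition, can be modelled with a small toy class. This is a sketch under stated assumptions and not the ObjectCacher implementation: `ToyObjectSet`, its dirty counter, and the choice to have truncate itself fire the callback are all hypothetical simplifications.

```python
class ToyObjectSet:
    """Toy stand-in for an object set that tracks a count of dirty buffers."""

    def __init__(self, on_flush):
        self.dirty = 0
        self.on_flush = on_flush  # hypothetical flush callback

    def _set_dirty(self, new_count):
        was_clean = self.dirty == 0
        self.dirty = new_count
        # Fire only when this change moved us from dirty to clean.
        if not was_clean and self.dirty == 0:
            self.on_flush()

    def write(self):
        self._set_dirty(self.dirty + 1)

    def truncate(self):
        # Assumption: truncating discards every dirty buffer.
        self._set_dirty(0)

    def write_commit(self):
        # A late commit may arrive after truncate already cleaned the set.
        self._set_dirty(max(0, self.dirty - 1))


if __name__ == "__main__":
    calls = []
    oset = ToyObjectSet(lambda: calls.append("flushed"))
    oset.write()         # buffer becomes dirty
    oset.truncate()      # set becomes clean, callback fires once
    oset.write_commit()  # slow reply arrives, set is already clean
    print(calls)
```

In this model the late `write_commit()` finds the set already clean and therefore does not invoke the callback a second time.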
tamil
0cfe6320a8 cleaned up scripts
Signed-off-by: tamil <tamil.muthamizhan@inktank.com>
2012-11-09 17:55:32 -08:00
Sage Weil
54f6c17ae3 client: ensure we don't leak MClientReply
We are careful to clear this reference when processing it.

Add an assert here.  There's no way we can get 2 quick replies because
of the kick-back below.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-09 16:03:20 -08:00
Sage Weil
a8053c10d9 ceph-fuse: fix leak of args
Also fix up the helper we use to have fewer sharp edges.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-09 16:03:20 -08:00
Sage Weil
b305fc8735 osdc/ObjectCacher: fix leak on readahead
If we initiate io (success == false) but have no waiter, we need to
delete the OSDRead.

This affects libcephfs/ceph-fuse, but not librbd, which does no readahead.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-09 16:03:20 -08:00
Gary Lowell
1c9ec26eea ceph.spec.in: Build debuginfo subpackage.
This is a partial fix for bug 3471.  Enable building of debuginfo package.
Some distributions enable this automatically by installing additional rpm
macros; on others it needs to be explicitly added to the spec file.
2012-11-09 13:28:13 -08:00
Josh Durgin
de2cd18c3d test: add cli test for missing args to rbd
This includes 'rbd mv foo', which used to crash

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2012-11-09 12:17:50 -08:00
Josh Durgin
34ebda2bab rbd: check for second argument to mv/rename
Without this check, 'rbd mv foo' crashed trying to use a NULL char* as
a string.

Reported-by: Andrey Korolyov <andrey@xdel.ru>
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2012-11-09 12:17:50 -08:00
Josh Durgin
9aae0eeaa7 rbd: check for second argument to mv/rename
Without this check, 'rbd mv foo' crashed trying to use a NULL char* as
a string.

Reported-by: Andrey Korolyov <andrey@xdel.ru>
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2012-11-09 12:11:20 -08:00