Move client creation to the section on setting up client auth, so you
don't skip it if you already have pools created.
Move CEPH_ARGS setting to the section on configuring OpenStack, since
it's a change for the OpenStack services, not purely ceph client
configuration.
A couple people were confused by the placement of these parts, and
they make more sense in these sections.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
This reverts commit 46897fd4ff.
There is no reason to carry a ref for the writes; it just makes things
more confusing because the refs aren't used for lifecycle, only for
LRU pinning. We would need to duplicate the close_object() conditional
here for this to work right.
This fixes #3431, in which a slow osd reply has an Object pinned, but when
we try to truncate it away we hit the assert in can_close().
Signed-off-by: Sage Weil <sage@inktank.com>
Prevent invalid memory references for cases where
a truncate causes an object to be deleted but the
object set still references it.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
The fs client can't handle ENOENT from the cache, but librbd wants it.
Also, the fs client will send down multiple ObjectExtents per io, but that
is incompatible with the ENOENT behavior.
Indicate which behavior we want via the ObjectSet, and update librbd to
explicitly ask for it. This fixes the fs client, which is currently
broken (it returns ENOENT on read).
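
A minimal Python sketch of the idea, not the actual cache code: a flag on
the object set decides whether a read of a missing object reports ENOENT
or not. The names `ObjectSet.return_enoent` and `read_extent` are
hypothetical, and treating a missing object as a zero-filled hole for the
fs client is an assumption made for illustration.

```python
import errno
from dataclasses import dataclass


@dataclass
class ObjectSet:
    # Hypothetical flag: True for librbd-style callers, False for the fs client.
    return_enoent: bool = False


def read_extent(oset, store, oid, length):
    """Return (result, data) for a read of `length` bytes from object `oid`.

    `store` is a plain dict standing in for the backing objects.
    """
    if oid not in store:
        if oset.return_enoent:
            # Caller explicitly asked to be told about missing objects.
            return -errno.ENOENT, b""
        # Assumed fs-client behavior: a missing object reads as a hole.
        return length, bytes(length)
    data = store[oid][:length]
    return len(data), data
```

With this shape the default stays safe for the fs client, and only a caller
that sets the flag ever sees ENOENT.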
Signed-off-by: Sage Weil <sage@inktank.com>
cda9e516b8 made us return 0 when the
image already existed, causing copy to erroneously ignore an existing
image. Separate the case where we know the image exists from being
unable to tell whether it exists because of e.g. an authentication
problem.
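
The three-way distinction can be sketched in Python as follows. This is an
illustration only: `check_destination` is a hypothetical helper, and the
choice of EEXIST for the "image exists" case is an assumption, not
necessarily what the rbd code returns.

```python
import errno


def check_destination(probe_result):
    """Map the result of probing the destination image to a copy decision.

    `probe_result` is 0 if the probe succeeded (the image exists) or a
    negative errno if it failed.
    """
    if probe_result == 0:
        # The image definitely exists: do not silently ignore it.
        return -errno.EEXIST
    if probe_result == -errno.ENOENT:
        # The image definitely does not exist: safe to create it.
        return 0
    # Anything else (e.g. -EPERM) means we could not tell; pass it up.
    return probe_result
```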
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
If we see an osd go down, process any pending failure_info reports we have.
Reply, and then remove the record from the map.
This ensures that we process failures and clean up regardless of *why* or
*who* did the marking down; either way, the osd was failed.
This fixes crashes like #3477, where the failure record was removed when we
updated the pending_map, but additional failures came in, and we didn't
clean up.
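
A rough Python sketch of the cleanup order described above (reply first,
then remove the record). The dict layout and the `send_reply` callback are
hypothetical stand-ins, not the monitor's real data structures.

```python
def process_osd_down(failure_info, osd, send_reply):
    """Reply to every pending failure report for `osd`, then drop the record.

    `failure_info` maps an osd id to the list of reporters waiting on it.
    Returns the number of replies sent. Safe to call when no record exists.
    """
    reporters = failure_info.get(osd, [])
    for reporter in reporters:
        send_reply(reporter, osd)
    # Remove the record no matter who or what marked the osd down.
    failure_info.pop(osd, None)
    return len(reporters)
```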
Signed-off-by: Sage Weil <sage@inktank.com>
This is meant to exercise the kclient's timeout and osd reconnect logic
by dropping some client requests on the floor.
Signed-off-by: Sage Weil <sage@inktank.com>
The default of 8 is virtually never the right answer. Require the initial
pg count to be explicitly provided.
Signed-off-by: Sage Weil <sage@inktank.com>
The first test did
  /a/b/file
  /a/b/sym -> /a/b
and opened /a/b/sym/file, which is valid. Change it to
  /a/b/file
  /a/b/sym -> /a/b/sym
which is not.
Add another test that does
  /a -> /b
  /b -> /c
  /c -> /a
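
For comparison, the same two layouts can be tried against a local POSIX
filesystem with a short Python script. This is only a sketch: it uses the
host filesystem rather than libcephfs, and the link names x, y, z stand in
for /a, /b, /c so they do not collide with the directory a.

```python
import errno
import os
import tempfile


def open_errno(path):
    """Try to open `path` read-only; return 0 on success, else the errno."""
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError as e:
        return e.errno
    os.close(fd)
    return 0


def run_symlink_loop_cases():
    results = {}
    with tempfile.TemporaryDirectory() as root:
        # Case 1: a/b/sym points at itself, so a/b/sym/file cannot resolve.
        b = os.path.join(root, "a", "b")
        os.makedirs(b)
        with open(os.path.join(b, "file"), "w"):
            pass
        sym = os.path.join(b, "sym")
        os.symlink(sym, sym)
        results["self"] = open_errno(os.path.join(sym, "file"))

        # Case 2: three links forming a cycle x -> y -> z -> x.
        x, y, z = (os.path.join(root, name) for name in "xyz")
        os.symlink(y, x)
        os.symlink(z, y)
        os.symlink(x, z)
        results["cycle"] = open_errno(x)
    return results
```

On a POSIX system both cases are expected to fail with ELOOP.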
Signed-off-by: Sage Weil <sage@inktank.com>
Trim out most noise, keep things that are interesting.
Notably, we are logging each message sent and received, and we are logging
the filestore operations when they get queued. Those may still benefit
from being turned off in high IOPS environments.
Signed-off-by: Sage Weil <sage@inktank.com>
Checking that we visit a symlink isn't correct; for example, the below is
valid, and we visit /b twice.
  /a/b -> c
  /a/c/d -> /a/b
In order to do this "correctly", I think we would need to track the pairs
of paths and symlinks we are resolving. But, reading the man pages,
ELOOP is actually just defined as traversing more than MAXSYMLINKS symlinks.
(It appears to be 20 on my machine.)
Just do that instead.
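
A small Python sketch of the counting approach, run against an in-memory
symlink table rather than the real client code. The `resolve` function,
the dict layout, and the limit of 20 are assumptions for illustration; the
real MAXSYMLINKS value is platform-defined.

```python
import errno
import posixpath

MAXSYMLINKS = 20  # assumed limit; the real value is platform-defined


def resolve(path, symlinks, max_symlinks=MAXSYMLINKS):
    """Resolve an absolute `path` against `symlinks` (link path -> target).

    Raises OSError(ELOOP) once more than `max_symlinks` links were followed.
    Visiting the same link twice is fine as long as the count stays low.
    """
    followed = 0
    pending = [c for c in path.split("/") if c]
    current = "/"
    while pending:
        comp = pending.pop(0)
        if comp == ".":
            continue
        if comp == "..":
            current = posixpath.dirname(current)
            continue
        candidate = posixpath.join(current, comp)
        target = symlinks.get(candidate)
        if target is None:
            current = candidate
            continue
        followed += 1
        if followed > max_symlinks:
            raise OSError(errno.ELOOP, "too many levels of symbolic links")
        if target.startswith("/"):
            current = "/"  # absolute target: restart from the root
        # Splice the link target in front of the remaining components.
        pending = [c for c in target.split("/") if c] + pending
    return current
```

With the layout from the example above, resolving /a/b/d follows /a/b twice
and still succeeds, while a true cycle only fails once the counter runs out.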
Signed-off-by: Sage Weil <sage@inktank.com>
Higher max_osd than max_devices doesn't hurt anything (and is the
normal way to add more osds). Devices beyond max_osd are
filtered out of crush results since e541c0f8d871172ec61962372efca943308e5fe,
so they don't matter either.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Consider a case where current loner is A and wanted loner is B.
At the top of the function we try to set the loner, but that may fail
because we haven't processed the gathered caps yet for the previous
loner. In the body we do that and potentially drop the old loner, but we
do not try_set_loner() again on the desired loner.
Try after our drop. If it succeeds, loop through the evals one more time
so that we can issue caps appropriately.
This fixes a hang induced by a simple loop like:
  while true ; do echo asdf >> mnt.a/foo ; tail mnt.b/foo ; done &
  while true ; do ls mnt.a mnt.b ; done
(The second loop may not be necessary.)
Signed-off-by: Sage Weil <sage@inktank.com>
If we race with e.g. truncate and are in bh_write_commit but the oset
is already clean, we should not call the flush callback (again).
This is reproduced by:
- kludging slow osd replies into the code (e.g., 2 second delay)
- mount ceph-fuse with --client-oc-max-dirty-age 1
- dd if=/dev/zero of=mnt/foo count=1
  sleep 1
  truncate --size 0 mnt/foo
  -> crash
Signed-off-by: Sage Weil <sage@inktank.com>
We are careful to clear this reference when processing it.
Add an assert here. There's no way we can get 2 quick replies because
of the kick-back below.
Signed-off-by: Sage Weil <sage@inktank.com>
If we initiate io (success == false) but have no waiter, we need to
delete the OSDRead.
This affects libcephfs/ceph-fuse, but not librbd, which does no readahead.
Signed-off-by: Sage Weil <sage@inktank.com>
This is a partial fix for bug 3471. Enable building of debuginfo package.
Some distributions enable this automatically by installing additional rpm
macros; on others it needs to be explicitly added to the spec file.
Without this check, 'rbd mv foo' crashed trying to use a NULL char* as
a string.
Reported-by: Andrey Korolyov <andrey@xdel.ru>
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>