Recent kernels got the new CEPH_LOCK_DN definition but we were still
setting the old bit. Set both so we work with both classes of clients. In
the meantime, update the kernel to ignore this field so that eventually we
can drop/reuse it.
Signed-off-by: Sage Weil <sage@newdream.net>
The scatter_writebehind_finish() is always followed up by an eval_gather(),
which does the clear_flushed(). For everyone else (replicas!), we need to
clear it immediately to avoid confusing things later.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Logrotate ignores entries after a rule that doesn't match any files.
Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br>
Signed-off-by: Sage Weil <sage@newdream.net>
If we are in XSYN state and want to move to anything else, we must go via
EXCL, but we may not be loner anymore. Weaken the file_excl() assert so we
don't crash.
Reported-by: Fyodor Ustinov <ufm@ufm.su>
Signed-off-by: Sage Weil <sage@newdream.net>
If the id is specified, mark a nonexistent osd rank as existent. The id
must fall within the current [0,max) range. This is the counterpart of
'osd rm <id>'.
If the id is not specified, allocate an unused osd id and set the EXISTS
flag. Increase max_osd as needed.
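The two cases above can be sketched as follows. This is an illustration only, not the monitor code: MiniOsdMap and osd_create() are hypothetical names, the vector length stands in for max_osd, and rejecting an id that already exists is an assumption of the sketch.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical stand-in for the OSD map: exists[i] is true when rank i
// has the EXISTS flag set; exists.size() plays the role of max_osd.
struct MiniOsdMap {
    std::vector<bool> exists;
};

// Returns the id that was marked existent, or std::nullopt when the
// requested id is out of range or already exists (assumed behaviour).
std::optional<int> osd_create(MiniOsdMap& m, std::optional<int> id = std::nullopt) {
    const int max_osd = static_cast<int>(m.exists.size());
    if (id) {
        // Explicit id: must fall within [0, max) and must not exist yet.
        if (*id < 0 || *id >= max_osd || m.exists[*id])
            return std::nullopt;
        m.exists[*id] = true;
        return id;
    }
    // No id given: reuse the first unused rank if there is one.
    for (int i = 0; i < max_osd; ++i) {
        if (!m.exists[i]) {
            m.exists[i] = true;
            return i;
        }
    }
    // Every rank is in use: grow max_osd by one and take the new slot.
    m.exists.push_back(true);
    assert(static_cast<int>(m.exists.size()) == max_osd + 1);
    return max_osd;
}
```

With ranks {0, 2} in use and max_osd 3, a call without an id returns 1; the next such call finds no free rank, grows the map, and returns 3.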
Closes: #1244
Signed-off-by: Sage Weil <sage@newdream.net>
We grew several copies of this code, and it turns out none of them were correct.
- assign flush tid in send_cap() helper
- pin inode on (dirty | flushing), not either/both
- add a proper mark_caps_flushing helper
and a bunch of other stuff. This brings this code into line with the
kernel implementation.
Also call flush_caps() on cap import.
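One reading of the "pin on (dirty | flushing)" item, as a minimal sketch: the inode holds a single pin for as long as the union of dirty and flushing caps is nonzero, instead of one pin per set. MiniInode and the three helpers below are hypothetical names and do not reproduce the client code.

```cpp
#include <cassert>

struct MiniInode {
    unsigned dirty_caps = 0;
    unsigned flushing_caps = 0;
    int pins = 0;  // stand-in for the inode reference count
};

void mark_caps_dirty(MiniInode& in, unsigned caps) {
    // Take the pin only on the transition from "nothing dirty or flushing".
    if ((in.dirty_caps | in.flushing_caps) == 0 && caps != 0)
        ++in.pins;
    in.dirty_caps |= caps;
}

void mark_caps_flushing(MiniInode& in) {
    // Dirty bits move to flushing; the union is unchanged, so is the pin.
    in.flushing_caps |= in.dirty_caps;
    in.dirty_caps = 0;
}

void caps_flushed(MiniInode& in, unsigned caps) {
    assert((in.flushing_caps & caps) == caps);  // only flushing caps can complete
    const bool was_pinned = (in.dirty_caps | in.flushing_caps) != 0;
    in.flushing_caps &= ~caps;
    // Drop the pin only when nothing is dirty or flushing any more.
    if (was_pinned && (in.dirty_caps | in.flushing_caps) == 0)
        --in.pins;
}
```

Dirtying new caps while an earlier flush is still in flight leaves the pin count at 1, and the pin is released only after the last flush completes.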
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
We need to wait for the client to flush snapped caps if the client has
not already flushed for the given snap. If the client has already flushed
caps through the last snapid for the old inode, we do not need to set up
the snapped inode's locks to wait for that.
This fixes an occasional hang on the snaps/snaptest-multiple-capsnaps.sh
workunit.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Libraries previously ignored the CEPH_CONF environment variable. They now
respect it: when set, it overrides the default used when reading the
config file.
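The override rule can be sketched like this. pick_conf_path() is a hypothetical helper, not the library function; the caller is assumed to pass in the result of std::getenv("CEPH_CONF"), and treating an empty value as unset is an assumption of the sketch.

```cpp
#include <cassert>
#include <string>

// env_value is what std::getenv("CEPH_CONF") would return (may be null).
// default_path is whatever the library would otherwise read.
std::string pick_conf_path(const char* env_value, const std::string& default_path) {
    assert(!default_path.empty());
    // A set, non-empty CEPH_CONF wins over the default.
    if (env_value != nullptr && env_value[0] != '\0')
        return env_value;
    return default_path;
}
```

Taking the environment value as a parameter keeps the rule testable without touching the process environment.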
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
This mirrors a kclient change a while back (e835124).
We only want to send one flushsnap cap message per MDS session:
- it's a waste to send multiples
- the mds will only reply to the first one
If the mds restarts we need to resend.
This fixes a hang where we send multiples, the first (and only) reply is
ignored (due to tid mismatch), and we are left with dangling references to
the inode and hang on umount. (Reliably reproduced by running the full
snaps/ workunit directory.)
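A minimal sketch of the "once per MDS session" rule, under stated assumptions: each cap snap remembers the session in which its flushsnap was last sent, session ids are nonzero, and a restarted MDS shows up as a new session id. MiniCapSnap and should_send_flushsnap() are hypothetical names, not the client's.

```cpp
#include <cassert>
#include <cstdint>

struct MiniCapSnap {
    std::uint64_t flushed_in_session = 0;  // 0 means "never sent"
};

// Returns true when a flushsnap message should be sent for this session,
// and records that it has been sent.
bool should_send_flushsnap(MiniCapSnap& cs, std::uint64_t session_id) {
    assert(session_id != 0);  // 0 is reserved for "never sent" in this sketch
    if (cs.flushed_in_session == session_id)
        return false;  // already sent in this session; the MDS replies only once
    cs.flushed_in_session = session_id;
    return true;
}
```

A second attempt within the same session is suppressed, while a new session id (for example after an MDS restart) triggers a resend.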
Fixes: #1239
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Add an activate() function that must be called before we call the
onfinish callback. This is especially important in multi-threaded
contexts, since otherwise if completions come in in the wrong order, we
may delete the C_Gather object right before calling new_sub on it!
Also delete rm_subs because it is redundant with sub_finish.
Finally, num_subs_created, num_subs_remaining are now methods on
C_GatherBuilder rather than C_Gather.
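The role of activate() can be illustrated with a small single-threaded sketch. MiniGather is a hypothetical class, not C_Gather: it has no locking and does not delete itself, and it only shows why the finish callback must wait until the caller has declared that no more subs will be added.

```cpp
#include <cassert>
#include <functional>
#include <utility>

class MiniGather {
public:
    explicit MiniGather(std::function<void()> onfinish)
        : onfinish_(std::move(onfinish)) {}

    // Register one more sub-operation; returns its completion callback.
    std::function<void()> new_sub() {
        assert(!activated_);  // subs may only be added before activate()
        ++remaining_;
        return [this] { sub_finish(); };
    }

    // Declare that no more subs will be added.
    void activate() {
        activated_ = true;
        maybe_finish();
    }

private:
    void sub_finish() {
        assert(remaining_ > 0);
        --remaining_;
        maybe_finish();
    }
    void maybe_finish() {
        // Without the activated_ check, an early completion would fire
        // onfinish while the caller is still adding subs.
        if (activated_ && remaining_ == 0 && !finished_) {
            finished_ = true;
            if (onfinish_) onfinish_();
        }
    }

    std::function<void()> onfinish_;
    int remaining_ = 0;
    bool activated_ = false;
    bool finished_ = false;
};
```

If the first sub completes before the second is created, the remaining count drops to zero, but nothing fires until activate() has been called and the second sub has also completed.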
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
Filer.h now uses C_GatherBuilder to avoid memory leaks.
Also, C_GatherBuilder's constructor now takes a Context.
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
C_Gather objects are deleted by the last sub-context to execute.
If you create a C_Gather object manually, you must worry about the case
where there are no sub-contexts.
C_GatherBuilder is a little object that sits on the stack that allows
you to build C_Gather objects without worrying about this.
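The zero-sub-context case can be sketched as below. This is not the C_GatherBuilder implementation: MiniGatherBuilder and GatherState are hypothetical, and shared ownership through std::shared_ptr is swapped in for the "deleted by the last sub-context" scheme so that the empty case needs no special handling by the caller.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical gather state, kept alive by the builder and by every sub.
struct GatherState {
    std::function<void()> onfinish;
    int remaining = 0;
    bool activated = false;

    void maybe_finish() {
        if (activated && remaining == 0 && onfinish) {
            auto f = std::move(onfinish);
            onfinish = nullptr;  // make sure the callback runs at most once
            f();
        }
    }
};

class MiniGatherBuilder {
public:
    explicit MiniGatherBuilder(std::function<void()> onfinish)
        : state_(std::make_shared<GatherState>()) {
        state_->onfinish = std::move(onfinish);
    }

    // Each sub-callback holds its own reference to the shared state.
    std::function<void()> new_sub() {
        ++state_->remaining;
        auto s = state_;
        return [s] {
            assert(s->remaining > 0);
            --s->remaining;
            s->maybe_finish();
        };
    }

    // With zero subs this fires the callback immediately.
    void activate() {
        state_->activated = true;
        state_->maybe_finish();
    }

private:
    std::shared_ptr<GatherState> state_;
};
```

The builder can live on the stack and go out of scope right after activate(); any outstanding sub-callbacks keep the state alive until the last one runs.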
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
The past primary was sending scrub unreserve messages to all the
non-primary OSDs in the acting set on a PG state change. These messages
are spurious, since the other OSDs cancel their scrubs themselves on
state change. They were also wrong: the loop sent a message to every
non-primary OSD, which could exclude the new primary (if it was a
replica before), include other OSDs new to the PG, and include the
current OSD.
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
The dentries we reference may have been unlinked prior to us sending this
request. That's fine as long as we don't dereference a null dentry.
Signed-off-by: Sage Weil <sage@newdream.net>
Previously, nobody pinned dentries except Dirs. The only valid ref
counts were 0 and 1, and unlink, rename, etc. would delete the unlinked
dentry.
Now ref can be anything > 0; ref > 1 means it is also pinned in the LRU.
Unlink/rename call ->put(), and the last put() deletes the dentry (via a
private destructor).
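The get()/put() lifetime described above follows a common pattern, sketched here with a hypothetical MiniDentry class. The destroyed counter exists only so the sketch can observe deletion; it is not part of the described change.

```cpp
#include <cassert>

class MiniDentry {
public:
    // Objects can only be heap-allocated through create(), and start with
    // one reference (the link held by the directory).
    static MiniDentry* create(int* destroyed) { return new MiniDentry(destroyed); }

    void get() { ++ref_; }

    // The last put() deletes the object; nobody else may call delete.
    void put() {
        assert(ref_ > 0);
        if (--ref_ == 0)
            delete this;
    }

    int ref() const { return ref_; }

private:
    explicit MiniDentry(int* destroyed) : destroyed_(destroyed) {}
    ~MiniDentry() {
        if (destroyed_) ++*destroyed_;
    }

    int ref_ = 1;
    int* destroyed_;
};
```

A holder that took an extra reference with get() keeps the dentry alive after unlink drops its own reference; the object is freed only by the final put().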
Signed-off-by: Sage Weil <sage@newdream.net>
scatter_writebehind is called by eval_gather on dirty locks, and
eval_gather is called by wrlock_finish on unstable locks when you
drop the last wrlock...and scatter_writebehind force-takes a wrlock.
This meant that a workload like:
    seq 3000|xargs -i mkdir a/b/{} &
    mkdir a/c
could cause the mkdir a/c to wait until after the other process
finished because rstats can propagate upwards asynchronously, but
mark the directory dirty synchronously, while the mkdir a/c requires
an actual wrlock in order to modify the rstats.
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>