There was some old, weird stuff going on here where we would wait for the
ACK and COMMIT separately. This is just wrong. Writeback does not
complete until the data is committed on disk.
Simplify by waiting only for commit, removing all the 'ack' code, and
going back to a single callback (flush_set).
I didn't notice this for 05063867e2a54176ffc9bbc73391f52766ab403f; both of
these cleanups are needed to fix this.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Besides being generally better, this means we can accept pool 0
as the pool to store stuff in.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
This way, the MDS can handle updates of some values without needing
the user to specify the entire layout (i.e., they can just switch pools).
This brings the behavior more in line with setting the dir layout.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
There was a problem where:
- we would dirty some buffers on an object
- bump dirty_tx count
- flush()
- this adds the Object to ObjectSet::uncommitted
- truncate
- client clears FILE_BUFFER cap_ref
- Object::purge()
- clear dirty_tx count
- client puts last inode
- Object::uncommitted is not empty in ~ObjectSet
(This was triggered after several runs of workunits/suites/blogbench.sh
on sepia.)
It turns out the uncommitted xlist<> is pretty useless, though: the same
information is captured in the dirty_tx counter. We add a separate
counter to the Object itself (for the benefit of Object::can_close()).
We also clean up Object::purge() to call truncate(0), a small
simplification.
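
A minimal sketch of the counter-based bookkeeping described above. The
struct layouts, mark_dirty(), and the per-object counter name are
assumptions made for illustration, not the actual ObjectCacher code:

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for ObjectSet.
struct ObjectSet {
  int dirty_tx = 0;  // buffers dirty or in flight across the whole set
};

// Hypothetical, simplified stand-in for Object.
struct Object {
  ObjectSet *oset;
  int dirty_tx = 0;  // assumed name for the per-object counter

  explicit Object(ObjectSet *s) : oset(s) {}

  // Dirtying buffers bumps both the object and the set counters.
  void mark_dirty(int n) {
    assert(n > 0);
    dirty_tx += n;
    oset->dirty_tx += n;
  }

  // Dropping all buffers gives back exactly what this object counted,
  // so the set counter cannot be left stale after a purge.
  void purge() {
    oset->dirty_tx -= dirty_tx;
    dirty_tx = 0;
  }

  // The object may be closed once nothing is dirty or in flight.
  bool can_close() const { return dirty_tx == 0; }
};
```

With a single counter per level there is no separate list membership to
forget about when an object is purged.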
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
When we assimilate readdir results into our cache, we need to be more
careful about replacing existing dentries. We were calling
insert_dentry_inode(), which would replace a name if it already exists,
which might include pd->first, an active iterator.
Move the dentry link/relink into the caller (where we already have an
iterator pointing to the existing item, if any). Then update the dentry
lease information separately.
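
A rough sketch of relinking through the iterator the caller already
holds, so the active iterator stays valid. DentryMap, Dentry, and
link_or_relink() are hypothetical simplifications, not the real client
types:

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical, simplified dentry: just the inode number it links to.
struct Dentry { int ino; };
using DentryMap = std::map<std::string, Dentry>;

// pd is the caller's active iterator: it points at the existing entry
// for `name` if there is one, otherwise at the position where `name`
// would be inserted.
DentryMap::iterator link_or_relink(DentryMap &dir, DentryMap::iterator pd,
                                   const std::string &name, int ino) {
  if (pd != dir.end() && pd->first == name) {
    pd->second.ino = ino;  // relink in place; nothing is erased
    return pd;
  }
  // Inserting into a std::map does not invalidate existing iterators.
  DentryMap::iterator it = dir.emplace_hint(pd, name, Dentry{ino});
  assert(it->first == name);
  return it;
}
```

The point is that no entry is erased and reinserted while an iterator
still refers to it.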
Fixes: #1391
Signed-off-by: Sage Weil <sage@newdream.net>
The first bit of insert_dentry_inode() handles the details of checking
whether an item still exists, un/relinking it, etc.
The second bit just updates the dentry lease information.
Signed-off-by: Sage Weil <sage@newdream.net>
shared_ptr calls the disposal function even when the pointer being
disposed of is null.
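
A small illustration of this behavior using std::shared_ptr. Handle and
both deleter functions are hypothetical names, not the code touched by
this change:

```cpp
#include <cassert>
#include <memory>

// Hypothetical resource type, for illustration only.
struct Handle { int fd; };

static int unguarded_calls = 0;
static int guarded_frees = 0;

// Deleter without a null check: it runs even for a null pointer, so
// real code that dereferenced h here would crash.
void release_unguarded(Handle *h) {
  (void)h;
  ++unguarded_calls;
}

// Deleter with a null check: safe to hand to shared_ptr.
void release_guarded(Handle *h) {
  if (!h)
    return;
  ++guarded_frees;
  delete h;
}

// Construct a shared_ptr that owns a null pointer and let it go out of
// scope; the deleter is invoked on destruction.
void drop_null_handle(void (*deleter)(Handle *)) {
  std::shared_ptr<Handle> p(static_cast<Handle *>(nullptr), deleter);
  assert(!p);  // compares as null, yet still carries the deleter
}
```

Calling drop_null_handle(release_unguarded) invokes the deleter once
with a null argument, which is why the disposal function needs its own
null check.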
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
The current race:
- we start readdir
- mds revokes dir cap
- some file gets removed
- mds reissues dir cap
- we finish readdir and set I_COMPLETE
We should only set it if there have been no FILE_SHARED reissues during
the readdir.
Note that we still set I_COMPLETE even if we don't have the cap; that's
useless but harmless, since it is undefined without FILE_SHARED being
set.
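
One way to detect a reissue during the readdir is a generation counter,
sketched below. The shared_gen field, ReaddirState, and the function
names are assumptions for illustration; the `complete` flag stands in
for I_COMPLETE:

```cpp
#include <cassert>

// Hypothetical, simplified inode state.
struct Inode {
  unsigned shared_gen = 0;  // assumed: bumped on every FILE_SHARED issue
  bool complete = false;    // stands in for the I_COMPLETE flag
};

// Remembers the generation seen when the readdir started.
struct ReaddirState {
  unsigned start_shared_gen;
};

ReaddirState begin_readdir(const Inode &in) {
  return ReaddirState{in.shared_gen};
}

// Called whenever the mds (re)issues FILE_SHARED on the directory.
void on_file_shared_issued(Inode &in) {
  ++in.shared_gen;
}

// Only mark the directory complete if no reissue happened in between.
void finish_readdir(Inode &in, const ReaddirState &rd) {
  assert(in.shared_gen >= rd.start_shared_gen);
  if (in.shared_gen == rd.start_shared_gen)
    in.complete = true;
}
```

A revoke followed by a reissue changes the generation, so a readdir
that spans it no longer claims the cached listing is complete.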
Signed-off-by: Sage Weil <sage@newdream.net>
We can only clear this when we have >= a period between flush_pos and
write_pos.
Clear the flag in _do_flush() so the check is not fragile, should this
ever be changed in the future.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Directories can only link into the hierarchy once. We assert as much
in readdir_r_cb(). Fix link() so that it unlinks the directory from the
old location when relinking somewhere new. Be careful to do this after
we take inode refs to avoid any unpleasantness.
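
A simplified sketch of the ordering. Inode, Dentry, link_dentry(), and
unlink_dentry() are hypothetical stand-ins, and the ref count is a bare
integer rather than the client's real reference handling:

```cpp
#include <cassert>
#include <set>

struct Inode;

// Hypothetical, simplified dentry.
struct Dentry {
  Inode *inode = nullptr;
};

// Hypothetical, simplified inode.
struct Inode {
  bool is_dir = false;
  int refs = 0;               // one ref per linked dentry in this sketch
  std::set<Dentry *> dn_set;  // dentries currently pointing at this inode
};

void unlink_dentry(Dentry *dn) {
  Inode *in = dn->inode;
  assert(in != nullptr);
  in->dn_set.erase(dn);
  dn->inode = nullptr;
  --in->refs;
}

void link_dentry(Dentry *dn, Inode *in) {
  assert(dn->inode == nullptr);
  ++in->refs;  // take the new ref first so refs never drops to zero
  if (in->is_dir && !in->dn_set.empty()) {
    // A directory may be linked only once: drop the old location.
    unlink_dentry(*in->dn_set.begin());
  }
  dn->inode = in;
  in->dn_set.insert(dn);
}
```

Relinking a directory therefore leaves exactly one dentry pointing at
it, while regular files may still accumulate multiple links.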
Fixes: #1429
Reported-by: Sam Lang <samlang@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>
We need to bound the range we write to disk by m->last_committed; this got
lost in translation in commit dfab2c91f5.
Fixes paxos crashes in handle_begin:
mon/Paxos.cc: In function 'void Paxos::handle_begin(MMonPaxos*)', in thread '0x7fc74d11f700'
mon/Paxos.cc: 393: FAILED assert(begin->last_committed == last_committed)
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Clear this flag only if we know no subsequent flushes could be waiting on
a prezero operation.
Fixes MDS journaling hang under heavy journal load.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>