These are added to the LogSegment list on the slaves, but also need to be
removed from that list when we replay a COMMIT|ROLLBACK or when the op's
fate is determined during the resolve stage.
This fixes a crash like
./include/elist.h: In function 'elist<T>::item::~item() [with T =
MDSlaveUpdate*]', in thread '0x7fb2004d5700'
./include/elist.h: 39: FAILED assert(!is_on_list())
ceph version 0.26 (commit:9981ff90968398da43c63106694d661f5e3d07d5)
1: (MDSlaveUpdate::~MDSlaveUpdate()+0x59) [0x4d9fe9]
2: (ESlaveUpdate::replay(MDS*)+0x422) [0x4d2772]
3: (MDLog::_replay_thread()+0xb90) [0x67f850]
4: (MDLog::ReplayThread::entry()+0xd) [0x4b89ed]
5: (()+0x7971) [0x7fb20564a971]
6: (clone()+0x6d) [0x7fb2042e692d]
Fixes: #1019
Signed-off-by: Sage Weil <sage@newdream.net>
Options that are inherently global (like malloc settings), as well as
inherently debugging or profiling settings, should be environment
variables.
tcmalloc_profiler_run, profiler_allocation_interval,
profiler_highwater_interval, and buffer_track_alloc fall into this
category.
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
We do not need to create the PG on query. If we are a new replica, we can
create it when we get the PGLog activate message.
Signed-off-by: Sage Weil <sage@newdream.net>
Remove/zero objects N periods ahead of the journal write position. This
ensures that when we reprobe the journal length, we will always detect the
end position as the correct write_pos, even when there is weird data
"ahead" of us that we may bump up against.
Signed-off-by: Sage Weil <sage@newdream.net>
Make Filer::zero() remove any whole objects covered by the zeroed range.
This is required by the Journaler, given the way it probes the journal
length.
Signed-off-by: Sage Weil <sage@newdream.net>
If we get an error code and assume we successfully wrote the head,
there are going to be all kinds of issues on replay!
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Fixes bug where oi.size gets out of sync with the object size because we
actually write zeros. (This explains #933.)
Signed-off-by: Sage Weil <sage@newdream.net>
finish_export_inode() changes states! That's not good for our checks,
so handle the unpinning and related cleanup before we call
finish_export_inode().
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
We need to handle locks and pins on exported inodes but we
were using a separate if block with its own (non-matching!) check
for no good reason.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Otherwise these pins are never dropped from the inode since we
don't go through our normal xlock teardown code. Now we do!
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
We don't want auth_pins on the LocalLocks (they're never auth_pinned),
and we only want new auth_pins for locks on the inode that we
imported -- not for each xlock that the mdr holds everywhere (like,
say, on the srcdn)!
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Because we can do an inode import during a rename that skips the usual
channels, we were getting into an odd state with the xlocks (which we
did as a slave for an inode that we exported away). Clean up the
record of these xlocks for inodes before we get into the request
cleanup (at which point we are labeled as no-longer-auth, and the
standard cleanup routines will break).
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Because we can do an inode import during a rename that skips the usual
channels, we were getting into an odd state with the xlocks (which
were formerly remote and are now local). Clean up the record of
those remote xlocks.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
This was broken a while ago during the last refactor. Whoops! Clean it
up to be smarter (and work at all).
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Rename all the get_uid_by_* functions to get_user_info_by_*, remove
get_user_info(), and call the appropriate function instead (either the
by_uid or the by_access_key variant).
In that case we get ENOSYS. This also implies an old version of the client
and that we should fall back.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
We previously dropped the request, but that was inappropriate in this
one case because the replica has no way to trigger a resend.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Previously we'd try and do the whole thing, which meant that
the replica got a lock twiddle before it had finished the export.
That broke things spectacularly, since we weren't respecting our
invariants about who gets remote locking messages.
Now we pass through a flag and respect our invariants.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
I don't remember why we needed can_xlock_local() to begin with, but
I can tell that adding this get_xlock_by() check won't break anything
that was working before (really, it's still not a strong enough check).
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Previously we just had to give up on ESTALE. Now
we can attempt to recover!
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
The slave also can hold some auth pins from locks which the
master has asked it to grab. It's possible we can intelligently
determine how many, but for now just drop the assert.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Previously it ignored the auth pin required to hold snap xlock, which
is currently always held for a rename on a dir. This would lead to
a permanent hang on the request. Now we account for it!
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>