The sync no longer cares if we trim Paxos versions as we go, as long as we
don't trim so fast that we fall behind between GET_CHUNK messages, which
we can consider a tuning problem.
Remove this extra complexity!
Signed-off-by: Sage Weil <sage@inktank.com>
We were using paxos_max_join_drift to control the minimum number of
paxos transactions to keep around. Instead, make this explicit, and
separate from the join drift.
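
A minimal sketch of the idea, assuming a dedicated lower bound on retained
versions. Only paxos_max_join_drift is named above; min_versions,
PaxosTrimConfig and trim_target() are hypothetical names used for
illustration, not the actual option or code:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical settings holder; only paxos_max_join_drift is a real name.
struct PaxosTrimConfig {
  std::uint64_t max_join_drift;  // how far behind a joining monitor may be
  std::uint64_t min_versions;    // hypothetical: versions always kept around
};

// Returns the new first_committed after trimming. The floor is taken from
// min_versions only; max_join_drift is deliberately not consulted here.
std::uint64_t trim_target(std::uint64_t first_committed,
                          std::uint64_t last_committed,
                          const PaxosTrimConfig& cfg) {
  assert(first_committed <= last_committed);
  std::uint64_t kept = last_committed - first_committed;
  if (kept <= cfg.min_versions)
    return first_committed;  // already at or under the floor, keep everything
  return last_committed - cfg.min_versions;
}
```

With a floor of 20, versions 1..100 would trim up to 80, while 90..100
would be left alone.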
Signed-off-by: Sage Weil <sage@inktank.com>
The previous sync implementation was highly stateful and very complex.
This made it very hard to understand and to debug, and there were bugs
still lurking in the timeout code (at least).
Replace it with something much simpler:
- Sync providers are almost stateless. They keep an iterator, identified
by a unique cookie, that times out in a simple way.
- Sync requesters sync from whomever they fancy, namely anyone with newer
committed paxos state.
There are a few extra fields that might allow sync continuation later, but
this is complex and not necessary at this point.
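
The provider side described above can be sketched roughly as follows. This
is an illustration under assumptions, not the monitor's code: SyncProvider,
ChunkReply, the in-memory std::map store and the choice to refresh the
deadline on every chunk are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

struct ChunkReply {
  std::vector<std::pair<std::string, std::string>> entries;
  bool last;  // true once the iterator has reached the end of the store
};

class SyncProvider {
  struct Cursor {
    std::string next_key;  // first key not yet sent
    double expires;        // absolute deadline, in seconds
  };
  const std::map<std::string, std::string>& store_;
  std::map<std::uint64_t, Cursor> cursors_;  // the only per-requester state
  std::uint64_t next_cookie_ = 1;
  double timeout_;

 public:
  SyncProvider(const std::map<std::string, std::string>& store, double timeout)
      : store_(store), timeout_(timeout) {}

  // Start a sync and hand back the cookie that identifies its iterator.
  std::uint64_t start(double now) {
    std::uint64_t cookie = next_cookie_++;
    cursors_.emplace(cookie, Cursor{std::string(), now + timeout_});
    return cookie;
  }

  // Fill *out with up to max_entries entries. Returns false if the cookie
  // is unknown or has timed out; the requester then simply starts over.
  bool get_chunk(std::uint64_t cookie, double now, std::size_t max_entries,
                 ChunkReply* out) {
    assert(out != nullptr);
    auto it = cursors_.find(cookie);
    if (it == cursors_.end())
      return false;
    if (now > it->second.expires) {
      cursors_.erase(it);
      return false;
    }
    *out = ChunkReply();
    auto k = store_.lower_bound(it->second.next_key);
    while (k != store_.end() && out->entries.size() < max_entries) {
      out->entries.emplace_back(k->first, k->second);
      ++k;
    }
    out->last = (k == store_.end());
    if (out->last) {
      cursors_.erase(it);  // nothing left to remember
    } else {
      it->second.next_key = k->first;
      it->second.expires = now + timeout_;  // assumption: refresh per chunk
    }
    return true;
  }
};
```

A requester would call start() once and then get_chunk() until last is set,
restarting from scratch if a call returns false.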
Signed-off-by: Sage Weil <sage@inktank.com>
Add an ObjectOperation::write() that takes an explicit len instead of using
the bufferlist length.
Have selfmanaged_snap_rollback_object() use mutate().
Signed-off-by: David Zafman <david.zafman@inktank.com>
If libcurl supports curl_multi_wait(), use it. Otherwise use select() and
force a timeout, even if the timeout has been disabled. Without a timeout
we may wait forever for events that select() cannot report, because
select() only handles fds < 1024.
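
The fallback decision can be sketched as below. This only models the choice
of timeout and does not call libcurl. The helper names, the convention that
a negative value means "timeout disabled", and the fallback_ms parameter are
assumptions made for illustration.

```cpp
#include <cassert>
#include <sys/select.h>  // FD_SETSIZE

// select() can only watch descriptors that fit in an fd_set.
bool fd_fits_in_select(int fd) {
  return fd >= 0 && fd < FD_SETSIZE;
}

// Pick the timeout to wait with. In this sketch a negative requested_ms
// means the caller disabled the timeout (hypothetical convention).
int effective_timeout_ms(bool have_multi_wait, int requested_ms,
                         int fallback_ms) {
  assert(fallback_ms >= 0);
  if (have_multi_wait)
    return requested_ms;  // curl_multi_wait() path: honour the caller
  if (requested_ms < 0)
    return fallback_ms;   // select() path: never block without a deadline
  return requested_ms;
}
```

The forced timeout turns a potential indefinite hang on an unwatchable fd
into a periodic wakeup, at the cost of some extra polling.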
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Do not process reads (or, by PaxosService::dispatch() implication, writes)
until we have committed the initial service state. This avoids things like
EPERM due to missing keys when we race with mon creation, triggered by
teuthology tests doing their health check after startup.
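
One way to picture the gate is sketched below. ServiceGate and its members
are hypothetical, and queueing the early requests for later replay is an
assumption of this sketch. The text above only says they are not processed
yet.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

class ServiceGate {
  bool initial_state_committed_ = false;
  std::deque<std::function<void()>> waiting_;

 public:
  // Run the request now if the initial state is committed, otherwise park
  // it. Returns true if the request ran immediately.
  bool dispatch(std::function<void()> request) {
    assert(request != nullptr);
    if (!initial_state_committed_) {
      waiting_.push_back(std::move(request));
      return false;
    }
    request();
    return true;
  }

  // Called once the initial service state has been committed.
  void on_initial_commit() {
    initial_state_committed_ = true;
    std::deque<std::function<void()>> ready;
    ready.swap(waiting_);  // drain a private copy in arrival order
    for (auto& r : ready)
      r();
  }
};
```

A health check arriving before the first commit would then be answered
after the keys exist rather than failing with EPERM.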
Fixes: #5515
Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
We want to trim old states even if there is no update activity. For
example, if a long-running rebalance finishes, all osdmap updates will
stop and we won't trim out old maps to free space.
Instead, trim at the same time as tick(). Remove the trim during
propose_pending() to force all trims through this path and avoid
introducing a new and rarely-exercised behavior.
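
The resulting control flow can be sketched as follows. VersionedService,
keep and maybe_trim() are hypothetical stand-ins. Only tick() and
propose_pending() are names taken from the text above.

```cpp
#include <cassert>
#include <cstdint>

struct VersionedService {
  std::uint64_t first_committed = 1;
  std::uint64_t last_committed = 1;
  std::uint64_t keep = 500;  // hypothetical number of versions to retain
  unsigned trims = 0;

  void maybe_trim() {
    assert(first_committed <= last_committed);
    if (last_committed - first_committed > keep) {
      first_committed = last_committed - keep;
      ++trims;
    }
  }

  // Proposals only advance the version; they no longer trim.
  void propose_pending() { ++last_committed; }

  // The periodic tick is the single place where trimming happens, so it
  // also runs when no updates are being proposed.
  void tick() { maybe_trim(); }
};
```

Because tick() fires whether or not anything is proposed, old versions are
still reclaimed once update activity stops.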
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
Right after cluster creation, first_committed is 1 and latest stashed is 0,
but we don't have the initial full map yet. Thereafter, we do (because we
write it with trim). Fixes afd6c7d824.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
MOSDPG(Push|PushReply|Pull|SubOp|SubOpReply) need the
same thing checked prior to queueing the op, so they
share a templated handler.
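
The shape of such a handler is sketched below. The message structs, the
OpQueue type and the epoch comparison are hypothetical stand-ins. The text
above does not say which condition is actually checked.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-ins for two of the MOSDPG* message types.
struct Push { std::uint32_t map_epoch; };
struct Pull { std::uint32_t map_epoch; };

struct OpQueue {
  std::uint32_t current_epoch;
  std::vector<std::string> queued;

  // One handler for every message type exposing map_epoch. The check is
  // written once and the compiler instantiates it per type.
  template <typename Msg>
  bool handle(const Msg& m, const char* name) {
    assert(name != nullptr);
    if (m.map_epoch > current_epoch)
      return false;  // hypothetical pre-queue check failed
    queued.push_back(name);
    return true;
  }
};
```

Adding another message type then needs no new handler as long as it
provides the field the check reads.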
Signed-off-by: Samuel Just <sam.just@inktank.com>