Previously, we used an empty transaction to indicate when we
were sending the op to a backfill peer which needs the logs,
but can't run the transaction. I'd like to be able to send
an empty transaction for the rollforward side effect without
it causing the peer to think it missed a backfill op, so
instead, use an explicit flag. Compatibility is handled by
interpreting an old version encoding with an empty transaction
as having the backfill field filled.
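Roughly, the decode-side rule is the following (a self-contained sketch
with made-up names and version numbers, not the actual message encoding):

    #include <cstdint>
    #include <vector>

    struct Txn {
      std::vector<uint8_t> ops;
      bool empty() const { return ops.empty(); }
    };

    struct RepOpSketch {
      Txn txn;
      bool backfill_skip_txn = false;   // the new explicit flag

      // struct_v is the version the peer encoded the message with.
      void finish_decode(uint8_t struct_v) {
        if (struct_v < 2) {
          // Older peers signalled "backfill target, don't apply the
          // transaction" with an empty transaction; map that onto the
          // new flag so the rest of the code only looks at the flag.
          backfill_skip_txn = txn.empty();
        }
      }
    };
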
Signed-off-by: Samuel Just <sjust@redhat.com>
If the read can be completed immediately, objects_read_async will call
the callback synchronously, which will result in ctx being cleaned up.
Clear pending_async_reads before the call.
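The safe pattern is simply to detach the list from ctx before making the
call; something like this sketch (stand-in types, not the real
ReplicatedPG code):

    #include <functional>
    #include <list>
    #include <utility>

    struct OpContextSketch {
      std::list<std::function<void()>> pending_async_reads;
    };

    // read_fn may complete synchronously and free ctx, so ctx must not
    // own pending_async_reads by the time read_fn is invoked.
    void issue_reads(OpContextSketch *ctx,
                     const std::function<void(std::list<std::function<void()>>)> &read_fn) {
      std::list<std::function<void()>> reads;
      reads.swap(ctx->pending_async_reads);   // ctx->pending_async_reads is now empty
      read_fn(std::move(reads));              // safe even if ctx is deleted inside
    }
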
Signed-off-by: Samuel Just <sjust@redhat.com>
Without this change, we might submit new log entries for marking objects
unfound in a way that causes replicas to process them out-of-order with
pending writes with lower version numbers. That would be bad. Instead,
add an interface to allow an arbitrary callback to be called after any
previously submitted transaction commit, but before any subsequently
submitted operations commit.
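The guarantee, sketched standalone with hypothetical names (not the
actual interface): callbacks and transactions share one submission-order
queue, and a callback runs only once everything queued ahead of it has
committed.

    #include <deque>
    #include <functional>
    #include <variant>

    class OrderedQueueSketch {
      struct Txn { bool committed = false; };
      using Item = std::variant<Txn, std::function<void()>>;
      std::deque<Item> q;

      void advance() {
        // Pop committed transactions and run callbacks from the front,
        // strictly in submission order.
        while (!q.empty()) {
          if (auto *t = std::get_if<Txn>(&q.front())) {
            if (!t->committed)
              break;                       // oldest transaction still in flight
          } else {
            std::get<std::function<void()>>(q.front())();
          }
          q.pop_front();
        }
      }

    public:
      void submit_transaction() { q.emplace_back(Txn{}); }

      void call_after_prior_commits(std::function<void()> cb) {
        q.emplace_back(std::move(cb));
        advance();                         // runs immediately if nothing is in flight
      }

      // Assumes commits are reported in submission order, as they are
      // for a single ordered backend.
      void on_commit() {
        for (auto &item : q) {
          if (auto *t = std::get_if<Txn>(&item)) {
            if (!t->committed) { t->committed = true; break; }
          }
        }
        advance();
      }
    };
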
Signed-off-by: Samuel Just <sjust@redhat.com>
The RMW pipeline means that we don't start committing an update
immediately, so we can't update the log synchronously with
submit_transaction. Thus, in order to pipeline writes, PG/ReplicatedPG
will need to project last_update and abstain from updating info
directly (updating info.stats was the only offender).
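Concretely, "projecting" last_update means something like the following
(made-up types, not the real PG code): the next version is assigned from
a projected head at submission time, and info only catches up as commits
land.

    #include <cstdint>

    struct VersionSketch { uint64_t epoch; uint64_t version; };

    struct PGSketch {
      VersionSketch last_update{0, 0};            // authoritative, moves on commit
      VersionSketch projected_last_update{0, 0};  // moves at submission time

      VersionSketch next_version(uint64_t cur_epoch) {
        // Plan against the projected head so pipelined writes get
        // monotonically increasing versions before any of them commit.
        projected_last_update = {cur_epoch, projected_last_update.version + 1};
        return projected_last_update;
      }

      void on_commit(const VersionSketch &v) {
        last_update = v;                          // info catches up here
      }
    };
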
Signed-off-by: Samuel Just <sjust@redhat.com>
Implements the rmw pipeline and integrates the cache.
HashInfo now maintains a projected size for use during the planning
phase of the pipeline.
(Doesn't build without subsequent patches, not worth stubbing out
the interfaces)
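The projected size idea, as a rough standalone sketch (not the actual
HashInfo interface): the planning phase advances a projected size
immediately, while the committed size still only moves when the
transaction applies.

    #include <cstdint>

    class HashInfoSketch {
      uint64_t committed_size = 0;   // reflects applied transactions
      uint64_t projected_size = 0;   // reflects planned, not-yet-committed writes
    public:
      uint64_t get_projected_size() const { return projected_size; }
      void plan_append(uint64_t len)  { projected_size += len; }
      void plan_truncate(uint64_t to) { projected_size = to; }
      void on_applied(uint64_t size)  { committed_size = size; }
    };
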
Signed-off-by: Samuel Just <sjust@redhat.com>
It was hard to reason about the validity of the IndexedLog internal
pointers and iterators during updates, so this patch cleans that up
a bunch. It also moves responsibility for doing rollbacks into
PGBackend. Finally, it adds support for the new log entry format.
Signed-off-by: Samuel Just <sjust@redhat.com>
trim_rollback_to was not a terrible name before, in that all
it ever did was (possibly) trim the stashed version of the
object. Now, however, it's going to encompass the roll_forward
part of a tpc in general (which will still be to delete the
stashed object in cases where that is appropriate).
Signed-off-by: Samuel Just <sjust@redhat.com>
For now, this is a white box testing flag to allow
us to start testing the supporting features before
ec overwrites can actually be implemented.
Signed-off-by: Samuel Just <sjust@redhat.com>
Signed-off-by: Tomy Cheru <tomy.cheru@sandisk.com>
This was used in the past as scaffolding while the ec pools were being
developed. There should be no legitimate users.
Signed-off-by: Samuel Just <sjust@redhat.com>
Signed-off-by: Tomy Cheru <tomy.cheru@sandisk.com>
This patch removes ReplicatedBackend::PGTransaction and implementations
and switches over all users. Happily, do_osd_ops loses the mod_desc
cruft and OpContext::pending_attrs. PGTransaction doesn't really
have a natural way to implement append, however. In reality, I think
this is probably an improvement, but it does mean that copy_from's
final transaction is now filled in by a lambda rather than by
appending a transaction fragment.
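The shape of that change, as a made-up sketch rather than the real code:
copy_from registers a callable that is handed the final transaction right
before submission and writes its ops into it then, instead of appending a
pre-built fragment.

    #include <functional>
    #include <vector>

    struct TxnSketch { std::vector<int> ops; };   // stand-in for the real type

    struct CtxSketch {
      TxnSketch txn;
      // Filled in by copy_from; run just before the transaction is submitted.
      std::function<void(TxnSketch &)> fill_in_final_txn;
    };

    void submit(CtxSketch &ctx) {
      if (ctx.fill_in_final_txn)
        ctx.fill_in_final_txn(ctx.txn);   // the lambda appends its ops here
      // ... hand ctx.txn to the backend ...
    }
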
Signed-off-by: Samuel Just <sjust@redhat.com>
ECBackend is going to need a transaction representation which reduces
the operational representation from the OSDOp to a descriptive one
which makes questions like "what is the largest offset written" and
"does this transaction delete the object?" simple to answer. At the
same time, we're going to eliminate the PGBackend::PGTransaction
interface since I don't think writing directly to an
ObjectStore::Transaction is buying us enough to offset the irritation
of having to update both implementations.
A happy consequence of this design will be that we can fill in the
pg_log_entry_t::mod_desc member after submission in the backend
rather than inline in do_osd_ops. We can also dispense with having
to maintain OpContext::pending_attrs separately from the ongoing
PGTransaction.
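As a rough sketch of what the descriptive representation buys
(hypothetical types, far simpler than the real thing), those questions
become trivial lookups over per-object state:

    #include <algorithm>
    #include <cstdint>
    #include <map>

    struct ObjectOpSketch {
      bool is_delete = false;                // does this transaction delete the object?
      std::map<uint64_t, uint64_t> writes;   // offset -> length of buffered writes

      void buffer_write(uint64_t off, uint64_t len) {
        writes[off] = std::max(writes[off], len);
      }

      uint64_t largest_offset_written() const {
        uint64_t end = 0;
        for (const auto &w : writes)
          end = std::max(end, w.first + w.second);
        return end;
      }
    };
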
Signed-off-by: Samuel Just <sjust@redhat.com>
The previous implementation was a bit more baroque than it
needed to be. Also, it made copies of the lambdas in a
few places. Finally, it caused segfaults. Not actually
sure why.
Signed-off-by: Samuel Just <sjust@redhat.com>
C++ doesn't have a sum type with nice pattern matching syntax.
Fortunately, someone on stack overflow fixed that.
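The flavor of the trick, sketched here with C++17's std::variant (the
patch itself predates C++17, so the details differ): an "overloaded"
helper fuses several lambdas into a single visitor, which gets close to
pattern-matching syntax.

    #include <cstdint>
    #include <iostream>
    #include <variant>

    template <class... Ts> struct overloaded : Ts... { using Ts::operator()...; };
    template <class... Ts> overloaded(Ts...) -> overloaded<Ts...>;

    struct Write { uint64_t off, len; };
    struct Delete {};
    using Op = std::variant<Write, Delete>;

    void describe(const Op &op) {
      std::visit(overloaded{
        [](const Write &w) { std::cout << "write " << w.off << "~" << w.len << "\n"; },
        [](const Delete &) { std::cout << "delete\n"; }
      }, op);
    }
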
Signed-off-by: Samuel Just <sjust@redhat.com>
Consider a sequence like:
0. foo object size is 15
1. clone_range foo -> foo.0 5~5
2. write foo 5~5
3. clone_range foo -> foo.1 10~5
4. write foo 10~5
5. rename foo -> foo.1
6. remove foo.0
7. remove foo.1
8. remove foo.2
If this sequence is interrupted after 8 and replayed from 1, by the time
it gets to 3 the object will only have size 10 and no replay guard
(since 1 was skipped and 2 recreated the object with size 10), resulting
in a short read. This should only happen if the replay guard is
missing, which should only happen if the object gets deleted later
in the sequence.
Signed-off-by: Samuel Just <sjust@redhat.com>
So that jenkins can use longer directories. We can't have both, otherwise
the UNIX domain socket path length limit triggers errors such as:
... client.admin.12750.asok is too long! The maximum length on this system is 107
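For reference, the 107 is where Linux puts the limit: sockaddr_un
reserves a fixed 108-byte buffer for the path, and one byte goes to the
terminating NUL, e.g.:

    #include <sys/un.h>
    #include <cstdio>

    int main() {
      // sun_path is a fixed 108-byte array on Linux; minus the trailing
      // NUL that leaves 107 usable characters for the .asok path.
      std::printf("%zu\n", sizeof(sockaddr_un::sun_path) - 1);   // prints 107
    }
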
Fixes: http://tracker.ceph.com/issues/16014
Signed-off-by: Loic Dachary <loic@dachary.org>
mgr: make init() return when connecting to daemons fails && add some err info
Reviewed-by: John Spray <john.spray@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
As of the Mitaka release show_image_direct_url is not needed;
show_multiple_locations should be used instead.
Adding the necessary guidance for the Mitaka release.
Signed-off-by: Sébastien Han <seb@redhat.com>
common osd: Improve scrub analysis, list-inconsistent-obj output and osd-scrub-repair test
Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>