60149 Commits

Author SHA1 Message Date
xie xingguo
8170b52e6b os/bluestore: kill dead gc-related counters
As the gc logic is deprecated by https://github.com/ceph/ceph/pull/12042

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2016-11-18 11:30:42 +08:00
Loic Dachary
8249503688 Merge pull request #10585 from zhjwpku/patch-2
doc/start/hardware-recommendations: cosmetic

Reviewed-by: Loic Dachary <ldachary@redhat.com>
2016-11-17 23:41:11 +01:00
Matt Benjamin
3f161671f7 Merge pull request #12008 from linuxbox2/civet-copy-target
cmake: produce civetweb.h, again

ok, w/upstream acks, merging--jenkins build did succeed (this is a build-only change)
2016-11-17 16:16:44 -05:00
Sage Weil
f61b7ce6a9 Merge pull request #12054 from ddiss/doc_osd_pool_restriction
doc: clarify file deletion from OSD restricted pool behaviour

Reviewed-by: Sage Weil <sage@redhat.com>
2016-11-17 14:57:24 -06:00
David Disseldorp
a138178fbe doc/cephfs/standby: fix minor typos
Signed-off-by: David Disseldorp <ddiss@suse.de>
2016-11-17 20:46:17 +01:00
David Disseldorp
f00546fee0 doc/cephfs: add note about deletion from OSD restricted pool
As described in http://tracker.ceph.com/issues/17937, a client with
restricted pool access can still delete files unless a corresponding
MDS path restriction is also in place.

Signed-off-by: David Disseldorp <ddiss@suse.de>
2016-11-17 20:45:57 +01:00
Samuel Just
64cc94c522 Merge pull request #11701 from athanatos/wip-ec-partial-overwrites
osd: EC Overwrites

Reviewed-by: Sage Weil <sage@redhat.com>
2016-11-17 10:54:19 -08:00
Matt Benjamin
53f6462a01 cmake: produce civetweb.h, again
The recent change to do this logic with file copy (and in src/rgw)
resolved the build problem, but now updates to the civetweb
submodule were not reflected in the build.

Move the copy into a custom target which will always source the
current submodule version at build time.

Avoid using the BYPRODUCTS option, as it is not supported in many
older cmake versions (e.g., CentOS 7).

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
2016-11-17 13:49:14 -05:00
Samuel Just
0cf383da07 ReplicatedPG: clamp SPARSE_READ to object size for ec pool
Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:41:35 -08:00
Samuel Just
0e7860b1e4 osd/: add some debugging to copyfrom
Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:41:34 -08:00
Samuel Just
d8e0965cc6 ReplicatedPG: we might actually recover an object past crt on repair
Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:41:34 -08:00
Samuel Just
ccbc90dd73 osd/: add projected_log to do client dup detection on not yet committed log entries
Log entries don't get added to the log for ECBackend until reads are
done, yet we still want any other requests with the same id to wait.

ReplicatedPG::update_range should consider the projected log as well.

Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:41:34 -08:00
Samuel Just
907b357e8f ReplicatedPG::calc_trim_to: don't trim past can_rollback_to
Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:41:34 -08:00
Samuel Just
4740fe46ec doc/dev/osd_internals: add some docs for ECBackend
Also, clean up some old ones.

Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:41:34 -08:00
Samuel Just
db798d9f25 ReplicatedPG: don't leave last_backfill pointing at head if snapdir exists
Fixes: http://tracker.ceph.com/issues/17668
Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:41:34 -08:00
Samuel Just
407eaaca32 osd/: cleanup the snap trimmer and deal with delayed repops
With the PGBackend changes, it's not necessarily the case that
calling simple_opc_submit synchronously updates the SnapMapper.
Thus, we can't rely on being able to just ask the snap mapper
for the next object immediately (we could well loop on the same
one if ECBackend is flushing the pipeline).  Instead, update
SnapMapper and the SnapTrimmer to grab N at a time.

Additionally, we need to make sure we don't try this again until
all of the previously submitted repops are flushed (a good idea
anyway).  To that end, this patch also refactors the SnapTrimmer
machine to be fully explicit about why it's blocked so we can be
sure that we don't queue an async work item unless we really
want to.

Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:41:34 -08:00
Samuel Just
327dd257d3 osd/ECBackend: use an explicit backfill field on ECSubWrite
Previously, we used an empty transaction to indicate when we
were sending the op to a backfill peer which needs the logs,
but can't run the transaction.  I'd like to be able to send
an empty transaction for the rollforward side effect without
it causing the peer to think it missed a backfill op, so
instead, use an explicit flag.  Compatibility is handled by
interpreting an old version encoding with an empty transaction
as having the backfill field filled.

Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:41:33 -08:00
Samuel Just
126ada4749 ReplicatedPG: update zero and truncate to only disallow aligned append pools
Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:41:33 -08:00
Samuel Just
b237321bda ReplicatedPG::OpContext::start_async_reads: tolerate case sync callback call
If the read can be completed immediately, objects_read_async will call
the callback synchronously, which will result in ctx being cleaned up.
Clear pending_async_reads before the call.

Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:41:33 -08:00
Samuel Just
967764707a osd/: use PGBackend::call_write_ordered to submit log entries in commit order
Without this change, we might submit new log entries for marking objects
unfound in a way that causes replicas to process them out-of-order with
pending writes with lower version numbers.  That would be bad.  Instead,
add an interface to allow an arbitrary callback to be called after any
previously submitted transaction commit, but before any subsequently
submitted operations commit.

Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:41:33 -08:00
Samuel Just
f7b55ec144 osd/: Update PGBackend users to project last_update and submit stat deltas
The RMW pipeline means that we don't start committing an update
immediately, so we can't update the log synchronously with
submit_transaction.  Thus, in order to pipeline writes, PG/ReplicatedPG
will need to project last_update and abstain from updating info
directly (updating info.stats was the only offender).

Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:41:33 -08:00
Samuel Just
1e95f2ce64 ECBackend: integrate cache and rmw pipeline
Implements the rmw pipeline and integrates the cache.

HashInfo now maintains a projected size for use during the planning
phase of the pipeline.

(Doesn't build without subsequent patches, not worth stubbing out
the interfaces)

Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:41:33 -08:00
Samuel Just
5e0ec06376 osd/: refactor PGLog a bit and add support for rolling back extents
It was hard to reason about the validity of the IndexedLog internal
pointers and iterators during updates, so this patch cleans that up
a bunch.  It also moves responsibility for doing rollbacks into
PGBackend.  Finally, it adds support for the new log entry format.

Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:40:19 -08:00
Samuel Just
ac89594d50 TestPGLog: fix bug with merge_log_split_missing_entries_at_head
Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:40:19 -08:00
Samuel Just
5360986f6d osd/: add support for rolling back overwritten extents to pg_log_entry_t
Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:40:19 -08:00
Samuel Just
74ddea4b2b ECBackend: remove unused hobject argument to read_request_t
Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:40:19 -08:00
Samuel Just
eb07d692b0 ECBackend: deduplicate start_remaining_read_ops and start_read_ops
Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:40:18 -08:00
Samuel Just
89fb686038 ReplicatedBackend: always set rollforward to head in submit
Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:40:18 -08:00
Samuel Just
4a193e828d osd/: split rollback info trims into trims and rollforwards
Also, rollforward on activate() and adjust read_log debugging to
account for non-rollforward entries.

Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:40:18 -08:00
Samuel Just
735af7c6fa osd/: 's/trim_rollback_to/roll_forward_to/g'
trim_rollback_to was a not terrible name before in that all
it ever did is (possibly) trim the stashed version of the
object.  However, now, it's going to encompass, in general,
the roll_forward part of a tpc (which will still be to
delete the stashed object in cases where that is
appropriate).

Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:40:18 -08:00
Samuel Just
90f5d3dfcc osd_types: allow non-aligned non-overwrites with ECOVERWRITES flag
Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:40:18 -08:00
Samuel Just
5e49a69cec osd/: add ExtentCache
Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:40:18 -08:00
Samuel Just
050c9010e2 src/test: update ceph_test_rados to support overwrites
Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:40:17 -08:00
Samuel Just
5fee194f13 osd,mon: add pool FLAG_EC_OVERWRITES flag
For now, this is a white box testing flag to allow
us to start testing the supporting features before
ec overwrites can actually be implemented.

Signed-off-by: Samuel Just <sjust@redhat.com>
Signed-off-by: Tomy Cheru <tomy.cheru@sandisk.com>
2016-11-17 10:40:17 -08:00
Samuel Just
9f17e2b535 osd,mon: remove FLAG_DEBUG_FAKE_EC_POOL
This was used in the past as scaffolding while the ec pools were being
developed.  There should be no legitimate users.

Signed-off-by: Samuel Just <sjust@redhat.com>
Signed-off-by: Tomy Cheru <tomy.cheru@sandisk.com>
2016-11-17 10:40:17 -08:00
Samuel Just
1dc6b4b694 osd_types: remove unused fill_in_setattrs
Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:40:17 -08:00
Samuel Just
c723f5417a osd_types::ObjectModDesc: remove claim_append
Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:40:17 -08:00
Samuel Just
2c99f864df osd/: switch all users of PGTransaction to use the new structure
This patch removes ReplicatedBackend::PGTransaction and implementations
and switches over all users.  Happily, do_osd_ops loses the mod_desc
cruft and OpContext::pending_attrs.  PGTransaction doesn't really
have a natural way to implement append, however.  In reality, I think
this is probably an improvement, but it does mean that copy_from's
final transaction is now filled in by a lambda rather than by
appending a transaction fragment.

Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:40:17 -08:00
Samuel Just
20204b642a osd/: introduce PGTransaction
ECBackend is going to need a transaction representation which reduces
the operational representation from the OSDOp to a descriptive one
which makes questions like "what is the largest offset written" and
"does this transaction delete the object?" simple to answer.  At the
same time, we're going to eliminate the PGBackend::PGTransaction
interface since I don't think writing directly to an
ObjectStore::Transaction is buying us enough to offset the irritation
of having to update both implementations.

A happy consequence of this design will be that we can fill in the
pg_log_entry_t::mod_desc member after submission in the backend
rather than inline in do_osd_ops.  We can also dispense with having
to maintain OpContext::pending_attrs separately from the ongoing
PGTransaction.

Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:40:17 -08:00
Samuel Just
6ae520da55 common/: add interval_map
Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:40:17 -08:00
Samuel Just
49275c2d27 inline_variant: simplify it a lot, enable perfect forwarding
The previous implementation was a bit more baroque than it
needed to be.  Also, it made copies of the lambdas in a
few places.  Finally, it caused segfaults.  Not actually
sure why.

Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:40:16 -08:00
Samuel Just
05ddacb192 common/: add match() utilities for boost::variant
C++ doesn't have a sum type with nice pattern matching syntax.
Fortunately, someone on stack overflow fixed that.

Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:40:16 -08:00
Samuel Just
9c65dee274 osd_types: update_snaps should take a const argument
Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:40:16 -08:00
Samuel Just
de3c22b0ee Context: add [Gen]LambdaContext and some related helpers
Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:40:16 -08:00
Samuel Just
1934874e44 hobject: add helper typedefs
Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:40:16 -08:00
Samuel Just
eb3b2024f2 PGBackend: add DoutPrefixProvider to parent interface
Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:40:16 -08:00
Samuel Just
9499cdcd87 FileStore::_do_copy_range: tolerate short reads on replay
Consider a sequence like:

0. foo object size is 15
1. clone_range foo -> foo.0 5~5
2. write foo 5~5
3. clone_range foo -> foo.1 10~5
4. write 10~5 foo
5. rename foo -> foo.1
6. remove foo.0
7. remove foo.1
8. remove foo.2

If this sequence is interrupted after 8 and replayed from 1, by the time
it gets to 3 the object will only have size 10 and no replay guard
(since 1 was skipped and 2 recreated the object with size 10), resulting
in a short read.  This should only happen if the replay guard is
missing, which should only happen if the object gets deleted later
in the sequence.

Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:40:16 -08:00
Samuel Just
bcb5a0da21 store_test::col_split_test: send bounded size transactions
Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:40:16 -08:00
Samuel Just
313e55d948 vstart: ratchet down the osd_copyfrom_max_chunk to make multiple chunks likely
Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:40:15 -08:00
Loic Dachary
654ad753c9 Merge pull request #12046 from dachary/wip-16014-cot
tests: use shorter directories for tests

Reviewed-by: Kefu Chai <kchai@redhat.com>
2016-11-17 17:37:27 +01:00