Commit Graph

60384 Commits

Author SHA1 Message Date
Pan Liu
c7f55c40ab os/bluestore: fix a bug: when using bluestore, the output of
"ceph osd perf" is always 0ms.

Signed-off-by: Pan Liu <pan.liu@istuary.com>
2016-11-18 16:26:19 +08:00
Pan Liu
3ea4760da8 OSD: change the prefix from fs_* to os_*, because commit_latency and
apply_latency will be used not only for filestore.

Signed-off-by: Pan Liu <pan.liu@istuary.com>
2016-11-18 16:25:31 +08:00
Loic Dachary
cd72ff9f74 tests: save 9 characters for asok paths
For vstart.sh powered tests, save 9 characters in the path name
by replacing testdir/test- with td/t-

60 characters imposed by jenkins
9 characters for src/test
5 characters for td/t-

33 left (instead of 24) for the test to create asok such as out/client.admin.25327.asok

Moving these files outside of the build directory is a bad idea because
tests should only create/use files within the builddir and not write
outside of this directory. Doing so would make things more complicated
for cleanup in case the test fails and create other problems as a
consequence (filling out disk space, conflicting directories between
runs etc.).

For ceph-helpers.sh tests replace testdir with td, saving 5 characters.
This is not strictly necessary but keeps the directory names consistent:
if the developer wants to get rid of all the test leftovers, it is
enough to remove a single directory: td.

Fixes: http://tracker.ceph.com/issues/16014

Signed-off-by: Loic Dachary <loic@dachary.org>
2016-11-18 09:19:18 +01:00
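
The character budget quoted in cd72ff9f74 can be checked with a few lines of arithmetic. This is only a sketch: the commit does not say where the overall limit comes from, so the 107-byte figure below is an assumption (the usable bytes of `sockaddr_un::sun_path` on Linux), and all constant and function names are hypothetical. The 60, 9, 5, 33 and 24 figures are taken from the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// ASSUMPTION (not stated in the commit): the overall limit is the 107 usable
// bytes of sockaddr_un::sun_path on Linux (108 minus the trailing NUL).
constexpr std::size_t kAssumedSocketPathLimit = 107;

// Figures quoted in the commit message.
constexpr std::size_t kJenkinsPrefix = 60;          // imposed by jenkins
constexpr std::size_t kSrcTest = 9;                 // src/test
constexpr std::size_t kNewPrefix = 5;               // td/t-
constexpr std::size_t kOldPrefix = kNewPrefix + 9;  // 9 characters longer before

// Characters left for the asok path relative to the test directory.
constexpr std::size_t room_for_asok(std::size_t test_prefix) {
  return kAssumedSocketPathLimit - kJenkinsPrefix - kSrcTest - test_prefix;
}

static_assert(room_for_asok(kNewPrefix) == 33, "33 left after the change");
static_assert(room_for_asok(kOldPrefix) == 24, "24 left before the change");

inline bool asok_fits(const std::string& relative_asok, std::size_t test_prefix) {
  return relative_asok.size() <= room_for_asok(test_prefix);
}

// The example path from the commit message is 27 characters long.
inline void example_asok_budget() {
  assert(asok_fits("out/client.admin.25327.asok", kNewPrefix));
  assert(!asok_fits("out/client.admin.25327.asok", kOldPrefix));
}
```

Under that assumed limit, the example path from the commit message fits in the 33 remaining characters but would not have fit in 24.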
xie xingguo
8037900c22 os/bluestore: add counter to trace extents that have been removed due to compression
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2016-11-18 16:18:39 +08:00
xie xingguo
5013f9a3f5 os/bluestore: rename Extent::end() to Extent::logical_end()
and use the new method to simplify code.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2016-11-18 14:11:54 +08:00
David Zafman
db1a9434f1 Merge pull request #12058 from dachary/wip-17830-eio
test: disable osd-scrub-repair and test-erasure-eio
2016-11-17 20:44:22 -08:00
Sage Weil
99536f351c PendingReleaseNotes: note on new omap limits
Signed-off-by: Sage Weil <sage@redhat.com>
2016-11-17 22:28:21 -06:00
xie xingguo
8170b52e6b os/bluestore: kill dead gc-related counters
As the gc logic is deprecated by https://github.com/ceph/ceph/pull/12042

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2016-11-18 11:30:42 +08:00
Patrick Donnelly
b8b110971b client: improve failure messages/debugging
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2016-11-17 21:03:02 -05:00
Sage Weil
a3747de953 os/bluestore: kill kv_submitted bool; use a new state
We have state definitions; use them!

Signed-off-by: Sage Weil <sage@redhat.com>
2016-11-17 19:33:12 -06:00
Sage Weil
af12f842a4 os/bluestore: remove unused KV_COMMITTING state
Signed-off-by: Sage Weil <sage@redhat.com>
2016-11-17 19:33:12 -06:00
Sage Weil
315e777c02 ceph_test_objectstore: test w/ and w/o sync_submit_transaction
Signed-off-by: Sage Weil <sage@redhat.com>
2016-11-17 19:33:12 -06:00
Sage Weil
652d18f359 os/bluestore: fix alloc release timing on sync submits
If we submit the txc synchronously, we can't immediately release our
freed space to the allocator; that still needs to be done between
commit_start() and commit_finish() from the kv_sync_thread, protected
by the bdev barriers.

Signed-off-by: Sage Weil <sage@redhat.com>
2016-11-17 19:33:12 -06:00
Sage Weil
cc37efa47c Merge pull request #11709 from iain-buclaw-sociomantic/librados_aioexec
librados: Add rados_aio_exec to the C API

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2016-11-17 18:50:04 -06:00
Sage Weil
060307cc49 Merge pull request #11921 from adamemerson/wip-clangtastic
build: The Light Clangtastic

Reviewed-by: Matt Benjamin <mbenjami@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2016-11-17 18:49:02 -06:00
Willem Jan Withagen
ce47832cf7 common/strtol.cc: Get error testing also to work on FreeBSD
- Change the order of testing.
- But report the same error types.
- Changed to report the last error, since the value is there but
  disallowed characters follow it.

Error found by: run-cli-tests, because the wrong string was returned.

Signed-off-by: Willem Jan Withagen <wjw@digiware.nl>
2016-11-18 00:37:37 +01:00
Sage Weil
41d46e4921 osd/ReplicatedPG: limit omap request by bytes
Signed-off-by: Sage Weil <sage@redhat.com>
2016-11-17 17:00:37 -06:00
Sage Weil
91bb5fbb87 osd/ReplicatedPG: osd configured limit on max omap keys during read
This doesn't apply to the ops that explicitly name keys to read; those
aren't as risky.

Signed-off-by: Sage Weil <sage@redhat.com>
2016-11-17 17:00:36 -06:00
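
The two omap limits described in 41d46e4921 and 91bb5fbb87 (a cap on keys and a cap on bytes per read) can be illustrated roughly as follows. This is a simplified sketch, not the ReplicatedPG code: the function name, the result struct, the way bytes are counted, and the `truncated` flag are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical result type: the entries returned plus a flag telling the
// caller that more data remains.
struct OmapReadResult {
  std::map<std::string, std::string> entries;
  bool truncated = false;
};

// Copy entries in key order until either limit would be exceeded.
// Bytes are counted here as key size plus value size (an assumption).
inline OmapReadResult read_omap_limited(
    const std::map<std::string, std::string>& omap,
    std::size_t max_keys, std::size_t max_bytes) {
  OmapReadResult out;
  std::size_t bytes = 0;
  for (const auto& kv : omap) {
    const std::size_t entry_bytes = kv.first.size() + kv.second.size();
    if (out.entries.size() >= max_keys || bytes + entry_bytes > max_bytes) {
      out.truncated = true;  // stop early; the caller can ask for the rest
      break;
    }
    out.entries.insert(kv);
    bytes += entry_bytes;
  }
  return out;
}

inline void example_omap_limits() {
  const std::map<std::string, std::string> omap = {
      {"a", "11"}, {"b", "22"}, {"c", "33"}};  // 3 bytes per entry
  assert(read_omap_limited(omap, 2, 100).entries.size() == 2);  // key limit
  assert(read_omap_limited(omap, 10, 4).entries.size() == 1);   // byte limit
  assert(!read_omap_limited(omap, 10, 100).truncated);          // no limit hit
}
```

As the second commit notes, ops that explicitly name the keys to read are not subject to the key limit; the sketch only models an unbounded listing read.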
Dan Mick
ee08d38cb9 test: disable osd-scrub-repair and test-erasure-eio
While it is being worked on, because it frequently fails.

Refs: http://tracker.ceph.com/issues/17830

Signed-off-by: Dan Mick <dan.mick@redhat.com>
Signed-off-by: Loic Dachary <loic@dachary.org>
2016-11-17 23:50:17 +01:00
Loic Dachary
8249503688 Merge pull request #10585 from zhjwpku/patch-2
doc/start/hardware-recommendations: cosmetic

Reviewed-by: Loic Dachary <ldachary@redhat.com>
2016-11-17 23:41:11 +01:00
Sage Weil
b52d602510 ceph_test_rados_api_tier: dump hitset that we fail to decode
See http://tracker.ceph.com/issues/17945

Signed-off-by: Sage Weil <sage@redhat.com>
2016-11-17 16:18:47 -06:00
Matt Benjamin
3f161671f7 Merge pull request #12008 from linuxbox2/civet-copy-target
cmake: produce civetweb.h, again

ok, w/upstream acks, merging--jenkins build did succeed (this is a build-only change)
2016-11-17 16:16:44 -05:00
Sage Weil
f61b7ce6a9 Merge pull request #12054 from ddiss/doc_osd_pool_restriction
doc: clarify file deletion from OSD restricted pool behaviour

Reviewed-by: Sage Weil <sage@redhat.com>
2016-11-17 14:57:24 -06:00
Sage Weil
a863ae1c0f osdc/Objecter: handle race between calc_target and handle_osd_map
If we fail to get an existing session and have to take the exclusive lock,
we may race with an OSDMap update and end up with a stale target.  Check
for an epoch change and, if it happens, recalculate the mapping.

Fixes: http://tracker.ceph.com/issues/17942
Reported-by: wangdongxu <wangdongxu@cmss.chinamobile.com>
Reported-by: menglingkun <menglingkun@cmss.chinamobile.com>
Signed-off-by: Sage Weil <sage@redhat.com>
2016-11-17 14:05:26 -06:00
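
The epoch recheck described in a863ae1c0f follows a common pattern: compute under a shared lock, then, after taking the exclusive lock, verify that the map has not changed and recompute if it has. The sketch below is a toy model rather than the Objecter code; `ToyMap`, `ToyObjecter`, `resolve_target` and the `between_locks` hook (used only to simulate the race deterministically) are hypothetical. Unlike the real code, it always takes the exclusive lock.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <mutex>
#include <shared_mutex>

// Hypothetical stand-in for an OSDMap: the mapping depends on the epoch.
struct ToyMap {
  std::uint32_t epoch = 1;
  int primary_for(int pg) const { return static_cast<int>((pg + epoch) % 3); }
};

struct ToyObjecter {
  std::shared_mutex lock;
  ToyMap map;

  // Models handle_osd_map: a new epoch arrives under the exclusive lock.
  void handle_map_update() {
    std::unique_lock<std::shared_mutex> l(lock);
    ++map.epoch;
  }

  // Models calc_target followed by the upgrade to the exclusive lock.
  int resolve_target(int pg, const std::function<void()>& between_locks = {}) {
    std::uint32_t seen_epoch;
    int target;
    {
      std::shared_lock<std::shared_mutex> rl(lock);
      seen_epoch = map.epoch;
      target = map.primary_for(pg);
    }
    if (between_locks) between_locks();  // window in which a map update may land
    std::unique_lock<std::shared_mutex> wl(lock);
    if (map.epoch != seen_epoch) {
      target = map.primary_for(pg);  // stale target: recalculate the mapping
    }
    return target;
  }
};

inline void example_epoch_recheck() {
  ToyObjecter o;
  const int t = o.resolve_target(0, [&o] { o.handle_map_update(); });
  assert(t == o.map.primary_for(0));  // reflects epoch 2, not the stale epoch 1
}
```

Without the epoch comparison, the call in the example would return the target computed for epoch 1 even though the map had already moved to epoch 2.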
David Disseldorp
a138178fbe doc/cephfs/standby: fix minor typos
Signed-off-by: David Disseldorp <ddiss@suse.de>
2016-11-17 20:46:17 +01:00
David Disseldorp
f00546fee0 doc/cephfs: add note about deletion from OSD restricted pool
As described in http://tracker.ceph.com/issues/17937, a client with
restricted pool access can still delete files unless a corresponding
MDS path restriction is also in place.

Signed-off-by: David Disseldorp <ddiss@suse.de>
2016-11-17 20:45:57 +01:00
Samuel Just
64cc94c522 Merge pull request #11701 from athanatos/wip-ec-partial-overwrites
osd: EC Overwrites

Reviewed-by: Sage Weil <sage@redhat.com>
2016-11-17 10:54:19 -08:00
Matt Benjamin
53f6462a01 cmake: produce civetweb.h, again
The recent change to do this logic with file copy (and in src/rgw)
resolved the build problem, but now updates to the civetweb
submodule were not reflected in the build.

Move the copy into a custom target which will always source the
current submodule version at build time.

Avoid using the BYPRODUCTS option, as it is not supported in many
older cmake versions (e.g., Centos 7).

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
2016-11-17 13:49:14 -05:00
Samuel Just
0cf383da07 ReplicatedPG: clamp SPARSE_READ to object size for ec pool
Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:41:35 -08:00
Samuel Just
0e7860b1e4 osd/: add some debugging to copyfrom
Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:41:34 -08:00
Samuel Just
d8e0965cc6 ReplicatedPG: we might actually recover an object past crt on repair
Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:41:34 -08:00
Samuel Just
ccbc90dd73 osd/: add projected_log to do client dup detection on not yet committed log entries
Log entries don't get added to the log for ECBackend until reads are
done, yet we still want any other requests with the same id to wait.

ReplicatedPG::update_range should consider the projected log as well.

Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:41:34 -08:00
Samuel Just
907b357e8f ReplicatedPG::calc_trim_to: don't trim past can_rollback_to
Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:41:34 -08:00
Samuel Just
4740fe46ec doc/dev/osd_internals: add some docs for ECBackend
Also, clean up some old ones.

Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:41:34 -08:00
Samuel Just
db798d9f25 ReplicatedPG: don't leave last_backfill pointing at head if snapdir exists
Fixes: http://tracker.ceph.com/issues/17668
Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:41:34 -08:00
Samuel Just
407eaaca32 osd/: cleanup the snap trimmer and deal with delayed repops
With the PGBackend changes, it's not necessarily the case that
calling simple_opc_submit synchronously updates the SnapMapper.
Thus, we can't rely on being able to just ask the snap mapper
for the next object immediately (we could well loop on the same
one if ECBackend is flushing the pipeline).  Instead, update
SnapMapper and the SnapTrimmer to grab N at a time.

Additionally, we need to make sure we don't try this again until
all of the previously submitted repops are flushed (a good idea
anyway).  To that end, this patch also refactors the SnapTrimmer
machine to be fully explicit about why it's blocked so we can be
sure that we don't queue an async work item unless we really
want to.

Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:41:34 -08:00
Samuel Just
327dd257d3 osd/ECBackend: use an explicit backfill field on ECSubWrite
Previously, we used an empty transaction to indicate when we
were sending the op to a backfill peer which needs the logs,
but can't run the transaction.  I'd like to be able to send
an empty transaction for the rollforward side effect without
it causing the peer to think it missed a backfill op, so
instead, use an explicit flag.  Compatibility is handled by
interpreting an old version encoding with an empty transaction
as having the backfill field filled.

Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:41:33 -08:00
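
The compatibility rule in 327dd257d3 amounts to a version check at decode time. The sketch below only illustrates that rule: the struct, the function, and the version number at which the field appears are hypothetical, and the real ECSubWrite encoding is not modelled.

```cpp
#include <cassert>

// Hypothetical version at which the explicit backfill field was introduced.
constexpr int kFirstVersionWithBackfillField = 4;

// Toy stand-in for a decoded ECSubWrite.
struct ToySubWrite {
  bool txn_empty;
  bool backfill;
};

// New encodings carry the flag explicitly. Old encodings have no flag, so an
// empty transaction is interpreted as "this was a backfill op".
inline ToySubWrite decode_sub_write(int struct_v, bool txn_empty,
                                    bool encoded_backfill) {
  if (struct_v < kFirstVersionWithBackfillField) {
    return ToySubWrite{txn_empty, txn_empty};
  }
  return ToySubWrite{txn_empty, encoded_backfill};
}

inline void example_backfill_compat() {
  // Old sender, empty transaction: treated as a backfill op.
  assert(decode_sub_write(3, true, false).backfill);
  // New sender, empty transaction sent only for the rollforward side effect.
  assert(!decode_sub_write(4, true, false).backfill);
}
```

With the explicit flag, an empty transaction from a new sender no longer implies backfill, which is the case the commit wants to allow.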
Samuel Just
126ada4749 ReplicatedPG: update zero and truncate to only disallow aligned append pools
Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:41:33 -08:00
Samuel Just
b237321bda ReplicatedPG::OpContext::start_async_reads: tolerate case sync callback call
If the read can be completed immediately, objects_read_async will call
the callback synchronously, which will result in ctx being cleaned up.
Clear pending_async_reads before the call.

Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:41:33 -08:00
Samuel Just
967764707a osd/: use PGBackend::call_write_ordered to submit log entries in commit order
Without this change, we might submit new log entries for marking objects
unfound in a way that causes replicas to process them out-of-order with
pending writes with lower version numbers.  That would be bad.  Instead,
add an interface to allow an arbitrary callback to be called after any
previously submitted transaction commit, but before any subsequently
submitted operations commit.

Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:41:33 -08:00
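
The ordering guarantee described in 967764707a (a callback that runs after every previously submitted transaction commits but before any later one does) can be pictured as a single FIFO shared by writes and callbacks. The class below is a toy model under that assumption, not the PGBackend interface; `ToyPipeline`, `submit_write` and `drain` are hypothetical names, and only `call_write_ordered` borrows its name from the commit.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

class ToyPipeline {
  std::deque<std::function<void()>> queue_;  // writes and callbacks, in order

 public:
  std::vector<std::string> committed;  // record of the commit order

  // Queue a write; it "commits" when the pipeline reaches it.
  void submit_write(std::string name) {
    queue_.push_back([this, name] { committed.push_back(name); });
  }

  // Queue an arbitrary callback at the current position in the write order.
  void call_write_ordered(std::function<void()> cb) {
    queue_.push_back(std::move(cb));
  }

  // Process everything queued so far, strictly in submission order.
  void drain() {
    while (!queue_.empty()) {
      auto item = std::move(queue_.front());
      queue_.pop_front();
      item();
    }
  }
};

inline void example_write_ordered() {
  ToyPipeline p;
  p.submit_write("w1");
  p.call_write_ordered([&p] { p.committed.push_back("log-entries"); });
  p.submit_write("w2");
  p.drain();
  const std::vector<std::string> expected = {"w1", "log-entries", "w2"};
  assert(p.committed == expected);
}
```

Because the callback sits in the same queue as the writes, log entries submitted from it cannot overtake pending writes with lower version numbers.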
Samuel Just
f7b55ec144 osd/: Update PGBackend users to project last_update and submit stat deltas
The RMW pipeline means that we don't start committing an update
immediately, so we can't update the log synchronously with
submit_transaction.  Thus, in order to pipeline writes, PG/ReplicatedPG
will need to project last_update and abstain from updating info
directly (updating info.stats was the only offender).

Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:41:33 -08:00
Samuel Just
1e95f2ce64 ECBackend: integrate cache and rmw pipeline
Implements the rmw pipeline and integrates the cache.

HashInfo now maintains a projected size for use during the planning
phase of the pipeline.

(Doesn't build without subsequent patches, not worth stubbing out
the interfaces)

Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:41:33 -08:00
Samuel Just
5e0ec06376 osd/: refactor PGLog a bit and add support for rolling back extents
It was hard to reason about the validity of the IndexedLog internal
pointers and iterators during updates, so this patch cleans that up
a bunch.  It also moves responsibility for doing rollbacks into
PGBackend.  Finally, it adds support for the new log entry format.

Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:40:19 -08:00
Samuel Just
ac89594d50 TestPGLog: fix bug with merge_log_split_missing_entries_at_head
Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:40:19 -08:00
Samuel Just
5360986f6d osd/: add support for rolling back overwritten extents to pg_log_entry_t
Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:40:19 -08:00
Samuel Just
74ddea4b2b ECBackend: remove unused hobject argument to read_request_t
Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:40:19 -08:00
Samuel Just
eb07d692b0 ECBackend: deduplicate start_remaining_read_ops and start_read_ops
Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:40:18 -08:00
Samuel Just
89fb686038 ReplicatedBackend: always set rollforward to head in submit
Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:40:18 -08:00
Samuel Just
4a193e828d osd/: split rollback info trims into trims and rollforwards
Also, rollforward on activate() and adjust read_log debugging to
account for non-rollforward entries.

Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:40:18 -08:00
Samuel Just
735af7c6fa osd/: 's/trim_rollback_to/roll_forward_to/g'
trim_rollback_to was a not terrible name before in that all
it ever did is (possibly) trim the stashed version of the
object.  However, now, it's going to encompass, in general,
the roll_forward part of a tpc (which will still be to
delete the stashed object in cases where that is
appropriate).

Signed-off-by: Samuel Just <sjust@redhat.com>
2016-11-17 10:40:18 -08:00