98086 Commits

Author SHA1 Message Date
Jason Dillaman
ef9d74720f librbd: switch to lock-free queue for event poll IO interface
'perf' shows several percent of CPU being wasted on lock contention
in the event poll interface. The 'fio' RBD engine uses this poll
IO interface by default when available.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2019-05-02 09:30:45 -04:00
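
For readers unfamiliar with the idea behind the commit above, here is a minimal sketch of a lock-free queue. It is an illustration only and not the librbd change itself: the commit message does not say which queue implementation was adopted, and the `SpscRing` name, its capacity parameter, and the single-producer/single-consumer restriction are all assumptions made for this example.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical single-producer/single-consumer ring buffer (not Ceph code).
// One slot is always left empty, so it holds at most Capacity - 1 items.
template <typename T, std::size_t Capacity>
class SpscRing {
  static_assert(Capacity >= 2, "need at least one usable slot");

public:
  // Called only by the producer thread. Returns false if the ring is full.
  bool try_push(const T& value) {
    const std::size_t tail = tail_.load(std::memory_order_relaxed);
    const std::size_t next = (tail + 1) % Capacity;
    if (next == head_.load(std::memory_order_acquire)) {
      return false;  // full
    }
    slots_[tail] = value;
    // Publish the slot contents before the consumer can observe the new tail.
    tail_.store(next, std::memory_order_release);
    return true;
  }

  // Called only by the consumer thread. Returns false if the ring is empty.
  bool try_pop(T& out) {
    const std::size_t head = head_.load(std::memory_order_relaxed);
    if (head == tail_.load(std::memory_order_acquire)) {
      return false;  // empty
    }
    out = slots_[head];
    // Hand the slot back to the producer only after it has been read.
    head_.store((head + 1) % Capacity, std::memory_order_release);
    return true;
  }

private:
  std::array<T, Capacity> slots_{};
  std::atomic<std::size_t> head_{0};  // next slot to read
  std::atomic<std::size_t> tail_{0};  // next slot to write
};
```

Neither operation takes a mutex, so a completion thread pushing events and a poller popping them never block each other. A queue with several producers or consumers needs a different design than this sketch.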
Jason Dillaman
002afa0fe3 librbd: avoid using lock within AIO completion where possible
'perf' shows several percent of CPU is being utilized handling the
heavyweight locking semantics of AIO completion. With these changes,
the lock contention disappears.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2019-05-02 09:30:45 -04:00
Jason Dillaman
b5fc7ecaf7 librbd: remove special case for starting AioCompletion ops
All ops can be immediately started now that flush ops won't
accidentally block themselves.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2019-05-02 09:30:45 -04:00
Jason Dillaman
09e4127d5d librbd: simplify IO flush handling through AsyncOperation
Allow ImageFlushRequest to directly execute a flush call through
AsyncOperation. This will allow the flush to be directly linked
to its preceding IOs.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2019-05-02 09:30:45 -04:00
Sage Weil
6ad73b2d0f Merge PR #27871 into master
* refs/pull/27871/head:
	ceph_test_objectstore: add very_large_write test
	os/bluestore: fix aio pwritev lost data problem.

Reviewed-by: Igor Fedotov <ifedotov@suse.com>
2019-05-02 08:22:04 -05:00
Yuval Lifshitz
83e5571c87 rgw/pubsub: fix test issue with 3 zones
Signed-off-by: Yuval Lifshitz <yuvalif@yahoo.com>
2019-05-02 10:15:00 +03:00
Jason Dillaman
d712aac916
Merge pull request #27844 from wjwithagen/wjw-fix-src_common_bit_vector.hpp.diff
common: Clang requires a default constructor, but it can be empty

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
2019-05-01 18:56:00 -04:00
Igor Fedotov
4464419dde os/bluestore: dump onode meta before "no spanning blob" assertion.
Signed-off-by: Igor Fedotov <ifedotov@suse.com>
2019-05-02 01:27:43 +03:00
Igor Fedotov
70640aaa12 os/bluestore: move _dump_xxx methods out of BlueStore class
Signed-off-by: Igor Fedotov <ifedotov@suse.com>
2019-05-02 01:19:26 +03:00
Samuel Just
0147ac2221
Merge pull request #27874 from athanatos/sjust/wip-peering-refactor-forreview
Extract peering logic into a module for use in crimson

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2019-05-01 14:33:38 -07:00
Patrick Donnelly
012809cf58
Merge PR #27763 into master
* refs/pull/27763/head:
	common/PriorityCache: fix over-aggressive assert when mem limited

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2019-05-01 13:22:58 -07:00
Mark Nelson
75f60f3776 common/PriorityCache: fix over-aggressive assert when mem limited
Fixes: https://tracker.ceph.com/issues/39437
Signed-off-by: Mark Nelson <mnelson@redhat.com>
2019-05-01 15:09:47 -05:00
Samuel Just
7508fffece PGStateUtils: improvements for PGStateHistory
Signed-off-by: Samuel Just <sjust@redhat.com>
2019-05-01 11:22:29 -07:00
sjust@redhat.com
f3a22feace PeeringState: don't zero backfill target num_bytes on activation
834d3c19a7 preserves num_bytes
on backfill targets in order to estimate space required to complete
backfill.  However, from activation until backfill reservation,
info.stats.stats.sum.num_bytes is persisted to disk as 0 messing
up future intervals.  Instead, preserve it in the info sent during
recovery and leave it alone in RequestBackfillPrio.

Additionally, it's possible for backfill to be preempted between
last_backfill=MAX being sent to the replica and Backfilled being
queued.  In that case, the stats get zeroed on reservation
and the replica ends up with invalid stats.

Fixes: https://tracker.ceph.com/issues/39401
Signed-off-by: sjust@redhat.com <sjust@redhat.com>
2019-05-01 11:22:29 -07:00
Samuel Just
25904513c2 PeeringState: use ceph_assert
Signed-off-by: Samuel Just <sjust@redhat.com>
2019-05-01 11:22:29 -07:00
Samuel Just
7915c89e19 admin/build-doc: use PeeringState* for gen_state_diagram.py
Signed-off-by: Samuel Just <sjust@redhat.com>
2019-05-01 11:22:29 -07:00
Samuel Just
f78bd13f24 PeeringState: add explanations for public interface methods
Also rearranges the methods a little for clarity.

Signed-off-by: Samuel Just <sjust@redhat.com>
2019-05-01 11:22:29 -07:00
sjust@redhat.com
6858ec29ca PG,PeeringState: Fix initialization order
PeeringState needs to be initialized last and destructed
first.

Signed-off-by: sjust@redhat.com <sjust@redhat.com>
2019-05-01 11:22:28 -07:00
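
The ordering requirement in the commit above relies on a C++ language rule: non-static data members are constructed in declaration order and destroyed in reverse. The sketch below shows that rule with hypothetical stand-in types. `Tracer`, `Owner`, and `lifetime_events` are invented for illustration and do not reflect the actual layout of `PG` or `PeeringState`.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper that records its own construction and destruction.
struct Tracer {
  std::vector<std::string>& log;
  std::string name;

  Tracer(std::vector<std::string>& l, std::string n)
      : log(l), name(std::move(n)) {
    log.push_back("ctor " + name);
  }
  ~Tracer() { log.push_back("dtor " + name); }
};

// Hypothetical owner: the member that depends on the others is declared last,
// so it is constructed last and destroyed first.
struct Owner {
  Tracer backend;        // constructed first, destroyed last
  Tracer peering_state;  // constructed last, destroyed first

  explicit Owner(std::vector<std::string>& log)
      : backend(log, "backend"), peering_state(log, "peering_state") {}
};

// Builds and tears down an Owner, returning the recorded event order.
std::vector<std::string> lifetime_events() {
  std::vector<std::string> log;
  {
    Owner owner(log);
  }  // owner is destroyed here, while log is still alive
  return log;
}
```

Calling `lifetime_events()` yields the construction events in declaration order, followed by the destruction events in the opposite order. A member that refers to its siblings is therefore safest declared after them.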
Samuel Just
4a0f770d6e PeeringState: mark state and helpers private
Signed-off-by: Samuel Just <sjust@redhat.com>
2019-05-01 11:22:28 -07:00
Samuel Just
e3fe19cd64 osd/: clean up remaining info mutators
Signed-off-by: Samuel Just <sjust@redhat.com>
2019-05-01 11:22:28 -07:00
Samuel Just
d33a8b8ab1 osd/: condense missing mutations for recovery/repair/errors
At a high level, this patch attempts to unify the various
sites at which some combination of
- mark object missing in one or more pg_missing_t
- mark object needs_recovery in missing_loc
- manipulate the locations map based on external information
occur.  It seems to me that the pg_missing_t and missing_loc
should be in sync except for the mark_unfound_lost_revert
case and the case where we are about to do a backfill push.

This patch also cleans up repair_object.  It sort of worked by accident
for non-ec non-primary bad peers.  It didn't update missing_loc, so
needs_recovery() returns the wrong answer.  However, is_unfound() does
as well, so ReplicatedBackend is nevertheless happy as the object would
be present on the primary.  This patch updates the behavior to be
uniform as in the other force_object_missing cases.

Signed-off-by: Samuel Just <sjust@redhat.com>
2019-05-01 11:22:28 -07:00
Samuel Just
5ea5c47152 test-erasure-eio: first eio may be fixed during recovery
The changes to the way EC/ReplicatedBackend communicate read
errors had a side effect of making first eio on the object in
TEST_rados_get_subread_eio_shard_[01] repair itself depending
on the timing of the killed osd recovering.  The test should
be improved to actually test that behavior at some point.

Signed-off-by: Samuel Just <sjust@redhat.com>
2019-05-01 11:22:28 -07:00
sjust@redhat.com
17d7a24d61 osd/: move pg_log, missing_loc mutations to PeeringState
Signed-off-by: Samuel Just <sjust@redhat.com>
2019-05-01 11:22:28 -07:00
Samuel Just
8a8947d2a3 osd/: unify PGBackend pull error pathways
This patch narrows the PGBackend -> PrimaryLogPG recovery
cancel/error interface to on_failed_pull and cancel_pull.

This patch requires careful review.

Signed-off-by: Samuel Just <sjust@redhat.com>
2019-05-01 11:22:27 -07:00
sjust@redhat.com
1a011fda05 osd/: move peer_info mutators into PeeringState
Signed-off-by: sjust@redhat.com <sjust@redhat.com>
2019-05-01 11:22:27 -07:00
Samuel Just
4eef7f4050 osd/: Move log version pointer updates to PeeringState
Signed-off-by: sjust@redhat.com <sjust@redhat.com>
2019-05-01 11:22:27 -07:00
Samuel Just
812bef0b2e osd/: refactor to avoid mutable peer_missing refs in PG
Signed-off-by: sjust@redhat.com <sjust@redhat.com>
2019-05-01 11:22:27 -07:00
Samuel Just
b67902ff01 PG: remove might_have_unfound, peer_*_requested
Signed-off-by: Samuel Just <sjust@redhat.com>
2019-05-01 11:22:27 -07:00
Samuel Just
699e87c5ae osd/: fix upset, actingset, acting_backfill_recovery references
Signed-off-by: Samuel Just <sjust@redhat.com>
2019-05-01 11:22:27 -07:00
Samuel Just
3745ab0ae5 osd/: fix primary/up_primary references
Signed-off-by: Samuel Just <sjust@redhat.com>
2019-05-01 11:22:26 -07:00
Samuel Just
9e51298590 osd/: move last_..._to_applied, backfill_targets, async_recovery_targets to PeeringState
Signed-off-by: Samuel Just <sjust@redhat.com>
2019-05-01 11:22:26 -07:00
Samuel Just
664f26a682 PG: fix last_peering_reset and past_intervals references
Signed-off-by: Samuel Just <sjust@redhat.com>
2019-05-01 11:22:26 -07:00
Samuel Just
ce143b66ce PeeringState::PeeringMachine::Deleting: rollforward before resetting backfill
This looked wrong when I was moving it over.

Signed-off-by: Samuel Just <sjust@redhat.com>
2019-05-01 11:22:26 -07:00
sjust@redhat.com
dfbe5e070c osd/: add helpers to add remaining info dirtiers into PeeringState
Signed-off-by: sjust@redhat.com <sjust@redhat.com>
2019-05-01 11:22:26 -07:00
sjust@redhat.com
8319c0f64c osd/: move append_log into PeeringState
Signed-off-by: sjust@redhat.com <sjust@redhat.com>
2019-05-01 11:22:26 -07:00
sjust@redhat.com
4faab419f3 osd/: move append_log_entries_update_missing and merge_new_log_entries to PeeringState
Signed-off-by: sjust@redhat.com <sjust@redhat.com>
2019-05-01 11:22:25 -07:00
Samuel Just
fd57e539a1 osd/: clean up PG deleted and deleting references
Signed-off-by: sjust@redhat.com <sjust@redhat.com>
2019-05-01 11:22:25 -07:00
sjust@redhat.com
bb86445e4b PG: remove direct acting and up references
Signed-off-by: sjust@redhat.com <sjust@redhat.com>
2019-05-01 11:22:25 -07:00
Samuel Just
85dbe8ec90 PG: begin cleaning up scrub stat and history mutations
Signed-off-by: sjust@redhat.com <sjust@redhat.com>
2019-05-01 11:22:25 -07:00
sjust@redhat.com
720cb40fd8 PG: remove first batch of unused references
Signed-off-by: sjust@redhat.com <sjust@redhat.com>
2019-05-01 11:22:25 -07:00
Samuel Just
a428973054 osd/: clarify interface for introducing disk state to PeeringState
Signed-off-by: sjust@redhat.com <sjust@redhat.com>
2019-05-01 11:22:25 -07:00
Samuel Just
2e06118f49 osd/: move ostream<< and dump logic into PeeringState
Signed-off-by: Samuel Just <sjust@redhat.com>
2019-05-01 11:22:25 -07:00
Samuel Just
d99fd0508e osd/: clean up PeeringState::write_if_dirty
Signed-off-by: Samuel Just <sjust@redhat.com>
2019-05-01 11:22:24 -07:00
Samuel Just
fe31c31ed3 osd/: Move init into PeeringState
Signed-off-by: Samuel Just <sjust@redhat.com>
2019-05-01 11:22:24 -07:00
sjust@redhat.com
252d5c20cf osd/: move stat updates and publishing to PeeringState
Signed-off-by: Samuel Just <sjust@redhat.com>
2019-05-01 11:22:24 -07:00
Samuel Just
726894353f osd/: move more state helpers to PeeringState
Signed-off-by: Samuel Just <sjust@redhat.com>
2019-05-01 11:22:24 -07:00
Samuel Just
ace6655f95 osd/: move split/merge helpers into PeeringState
Signed-off-by: Samuel Just <sjust@redhat.com>
2019-05-01 11:22:24 -07:00
Samuel Just
985c7f6b87 osd/: fix try_mark_clean/finish_recovery interface boundary
Signed-off-by: Samuel Just <sjust@redhat.com>
2019-05-01 11:22:24 -07:00
Samuel Just
13e1169097 PeeringState: remove PG references and include
Signed-off-by: Samuel Just <sjust@redhat.com>
2019-05-01 11:22:24 -07:00
Samuel Just
8907010dee osd/: move calc_min_last_complete_ondisk to PeeringState
Signed-off-by: Samuel Just <sjust@redhat.com>
2019-05-01 11:22:23 -07:00