The OSD recomputes the op target based on its current OSDMap. With an EC PG,
this can produce the following sequence:
1) client at map 512 sends an op to osd 3, pg_t 3.9 based on mapping
[CRUSH_ITEM_NONE, 2, 3]/3
2) OSD 3 at map 513 remaps op to osd 3, spg_t 3.9s0 based on mapping [3, 2, 3]/3
3) PG 3.9s0 dequeues the op at epoch 512 and notices that it isn't
primary -- misdirected op
4) client resends and this time PG 3.9s0, having caught up to 513, gets it and
fulfils it
We can't compute the op target based on the sending map epoch due to
splitting. The simplest thing is to detect such cases in
OSD::handle_misdirected_op and drop them without an error (the client
will resend anyway).
Signed-off-by: Samuel Just <sam.just@inktank.com>
Move agent_clear() so it is no longer done only when becoming a replica
Do it in clear_primary_state() whenever we stop being primary
clear_primary_state() is now passed whether we are staying primary
Add asserts in agent_stop(), which no longer needs to clear agent_queue
Fixes: #7458
Signed-off-by: David Zafman <david.zafman@inktank.com>
python-ceph does not require requests, but ceph-common does (for ceph-brag).
Signed-off-by: Dan Mick <dan.mick@inktank.com>
(cherry picked from commit 9a0ef6a181)
We can't set a shared_ptr to NULL, we need to reset it instead. Add
another test for various permutations of this.
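
As a minimal sketch of the point above: the commit does not say which shared_ptr type is involved, so this example assumes std::shared_ptr, and Snapshot, drop_reference() and owners_after_reset() are hypothetical names used only for illustration. It shows reset() emptying one holder while the other keeps the object alive.

```cpp
#include <memory>

// Hypothetical stand-in for whatever object the shared pointer owns.
struct Snapshot {
  int id;
};

// Drops this holder's reference. reset() leaves the pointer empty and
// decrements the shared use count; other holders are unaffected.
inline void drop_reference(std::shared_ptr<Snapshot>& p) {
  p.reset();
}

// Returns the number of remaining owners after one of two holders resets,
// or -1 if the reset pointer is unexpectedly still non-empty.
inline long owners_after_reset() {
  std::shared_ptr<Snapshot> first = std::make_shared<Snapshot>(Snapshot{7});
  std::shared_ptr<Snapshot> second = first;  // two owners now
  drop_reference(first);                     // first becomes empty
  return first ? -1 : second.use_count();
}
```

Calling reset() states the intent directly and does not depend on how a particular shared_ptr implementation treats assignment from NULL.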
Fixes: #7538
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Check whether the inode is anchored/unanchored before updating the inode.
Fixes: #7530
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
be2748c6d5 ensured that
if the temp acting mapping contains only CRUSH_ITEM_NONE,
the acting_primary is left at -1. However, even if
acting.empty(), we need to respect a temp_primary mapping.
Thus, use _acting_primary unless acting.empty() &&
acting_primary == -1.
Bug introduced in be2748c6d5.
Signed-off-by: Samuel Just <sam.just@inktank.com>
Since backfill peers are no longer placed into the acting set,
temp mappings will never exceed the pool size. Also, for ec
pools, temp mappings will never be less than the pool size.
Signed-off-by: Samuel Just <sam.just@inktank.com>
test/objectstore/store_test.cc: In member function ‘void SyntheticWorkloadState::read()’:
error: test/objectstore/store_test.cc:462:23: no matching function for call to ‘swap(uint64_t&, size_t&)’
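
The error above arises because std::swap takes two references of the same type, and on some platforms uint64_t and size_t are distinct types. The commit message does not show the actual change, so the following is only a sketch of one common way to avoid the mismatch; ordered_range() is a hypothetical helper, not a function from store_test.cc.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <utility>

// Hypothetical helper: returns (low, high) for two offsets.
// Both values are held as uint64_t before swapping, so the call is
// swap(uint64_t&, uint64_t&) on every platform.
inline std::pair<uint64_t, uint64_t> ordered_range(uint64_t a, size_t b_in) {
  uint64_t b = static_cast<uint64_t>(b_in);  // convert once, up front
  if (a > b)
    std::swap(a, b);
  return std::make_pair(a, b);
}
```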
Signed-off-by: Sage Weil <sage@inktank.com>
If we are encoding a full map based on an old Incremental that does not
encode the features, fall back to the quorum features or (barring that)
all features. Do *not* fall back to no features, or else we will end up with
encode_client_old which does not even include the extended info and will
cause the mon to crash when decoding.
This was observed when upgrading a 0.76 cluster to 0.77 (all mons stopped,
upgraded, and then started).
Reported-by: Aaron Ten Clay <aarontc@aarontc.com>
Signed-off-by: Sage Weil <sage@inktank.com>
The insert() call here does not overwrite a previous entry, which means
that the osd_epochs map never moves forward in time. This seems to
have been broken since it was introduced in 091809b814.
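
The behaviour described above can be illustrated with a plain std::map. This is a sketch only: the key and value types and the two helper names are assumptions, and the actual osd_epochs code is not shown in this message.

```cpp
#include <map>
#include <utility>

typedef unsigned epoch_t;  // assumed stand-in for the real epoch type

// insert() keeps the existing value when the key is already present,
// so a newer epoch for a known OSD is silently ignored.
inline epoch_t record_with_insert(std::map<int, epoch_t>& m, int osd, epoch_t e) {
  m.insert(std::make_pair(osd, e));
  return m[osd];
}

// operator[] assignment overwrites, so the stored epoch moves forward.
inline epoch_t record_with_assign(std::map<int, epoch_t>& m, int osd, epoch_t e) {
  m[osd] = e;
  return m[osd];
}
```

With an entry already stored for an OSD, record_with_insert() returns the old epoch while record_with_assign() returns the new one.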
Backport: emperor, dumpling
Signed-off-by: Sage Weil <sage@inktank.com>
There are two paths that jump to the out label for which 'in' can be
NULL and outp can be non-NULL. For those cases we want to fill in the
caller's pointer value (they asked for it) but we clearly cannot take
a reference.
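
A minimal sketch of the pattern described above, assuming a simple intrusive reference count. Inode, get() and lookup() here are hypothetical stand-ins, not the real client code; the point is only that the caller's pointer is always filled in, while the reference is taken only when 'in' is non-NULL.

```cpp
#include <cstddef>

// Hypothetical reference-counted object.
struct Inode {
  int nref;
  Inode() : nref(0) {}
  void get() { ++nref; }
};

// Hypothetical lookup: 'found' models whatever the earlier code paths
// produced, which may be NULL on the paths that jump to the exit.
inline int lookup(Inode* found, Inode** outp) {
  Inode* in = found;
  int r = in ? 0 : -2;  // -2 models -ENOENT in this sketch
  // Common exit path (the "out" label in the text above).
  if (outp) {
    if (in)
      in->get();  // only take a reference when there is an inode
    *outp = in;   // always fill in the caller's pointer, even with NULL
  }
  return r;
}
```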
Backport: emperor, dumpling
Signed-off-by: Sage Weil <sage@inktank.com>
The FileStore's leveldb currently uses libleveldb's defaults for cache and
write buffer size, which are both 4 MB. Increase the cache size to 128MB and
the write buffer to 8MB.
Tested-by: Dmitry Smirnov <onlyjob@member.fsf.org>
Signed-off-by: Sage Weil <sage@inktank.com>
KeyValueStore enhancements (backport to firefly)
Pulling this into firefly because it doesn't (substantively) touch anything outside of KeyValueStore.
Reviewed-by: Sage Weil <sage@inktank.com>
info.last_complete should be the entry before log.complete_to.
This appears to have been a typo introduced in
dd71051a8f.
Signed-off-by: Samuel Just <sam.just@inktank.com>
If there are no deep repairs, we don't want to assert.
Fixes:
-1> 2014-02-21 21:13:56.393087 7f0258ff9700 0 log [INF] : 0.0 repair ok, 0 fixed
0> 2014-02-21 21:13:56.428703 7f0258ff9700 -1 osd/PG.cc: In function 'void PG::scrub_finish()' thread 7f0258ff9700 time 2014-02-21 21:13:56.393127
osd/PG.cc: 4294: FAILED assert(deep_scrub)
Signed-off-by: David Zafman <david.zafman@inktank.com>
Signed-off-by: Sage Weil <sage@inktank.com>