When replaying an EImportFinish/EFragment event, the replay thread may
call MDS::queue_waiters(). MDS::queue_waiters() requires its caller to
hold mds_lock; otherwise
assert(waiter_mutex == __null || waiter_mutex->is_locked())
in Cond::Signal() is triggered.
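A minimal sketch of that locking contract, with illustrative stand-ins
(replay_lock, queue_waiters_sketch) rather than the actual MDS types:

  #include <cassert>
  #include <mutex>

  // Stand-ins for mds_lock and MDS::queue_waiters(); the real types
  // live in the MDS code, this only illustrates the locking contract.
  std::mutex replay_lock;                 // plays the role of mds_lock
  bool replay_lock_held = false;

  void queue_waiters_sketch() {
    // Mirrors assert(waiter_mutex == __null || waiter_mutex->is_locked())
    // in Cond::Signal(): the caller must already hold the lock.
    assert(replay_lock_held);
    // ... signal the waiting Conds here ...
  }

  void replay_event_sketch() {
    replay_lock.lock();
    replay_lock_held = true;
    queue_waiters_sketch();               // legal only while the lock is held
    replay_lock_held = false;
    replay_lock.unlock();
  }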
Signed-off-by: Yan, Zheng <zyan@redhat.com>
When upgrading from a build without the promotion-on-2nd-read feature,
we should set min_read_recency_for_promote to the default value 1
instead of 0.
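A minimal sketch of the intended defaulting, using an illustrative
struct (pool_opts_sketch) rather than the real pool-properties decode
path:

  #include <cstdint>

  // Illustrative pool-options struct; the real field lives in the pool
  // properties decoded from the OSDMap.
  struct pool_opts_sketch {
    uint32_t min_read_recency_for_promote = 0;
  };

  // When decoding a pool written by a build without the feature, the
  // field is absent: fall back to the default of 1 rather than 0.
  void decode_compat(pool_opts_sketch& p, bool field_present, uint32_t value) {
    p.min_read_recency_for_promote = field_present ? value : 1;
  }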
Signed-off-by: Zhiqiang Wang <wonzhq@hotmail.com>
Currently in CrushWrapper the member "struct crush_map *crush" is
public, so callers can break the encapsulation and manipulate the crush
structure directly. This leads to inconsistencies when code mixes the
CrushWrapper API with the crush C API. A simple example:
1. Some code uses crush_add_rule() (C API) to add a rule, which does not
   set the have_rmap flag to false in CrushWrapper.
2. Other code then uses CrushWrapper to look up the newly added rule by
   name and gets -ENOENT.
This patch moves CrushWrapper::crush to private, together with the
three reverse maps (type_rmap, name_rmap, rule_name_rmap), and updates
the code that accessed CrushWrapper::crush directly so it still
compiles.
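A rough sketch of the encapsulation idea, with illustrative names
(WrapperSketch, crush_map_sketch) rather than the actual CrushWrapper
code:

  #include <map>
  #include <string>

  struct crush_map_sketch { /* stands in for struct crush_map */ };

  class WrapperSketch {
    crush_map_sketch *crush;                    // private: no direct C-API access
    std::map<std::string, int> rule_name_rmap;  // reverse map kept consistent
    bool have_rmaps;
  public:
    WrapperSketch() : crush(nullptr), have_rmaps(false) {}
    int add_rule(const std::string &name) {
      // ... call the C API on `crush` here ...
      (void)name;
      have_rmaps = false;     // invalidate so the next lookup rebuilds the rmap
      return 0;
    }
  };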
Signed-off-by: Xiaoxi Chen <xiaoxi.chen@intel.com>
This is a bad assert. Specifically, handle_osd_op_reply may still be
holding the session ref while it is calling the completion for a previous
request. This is safe: it is only holding the session ref after it dropped
the global map rwlock because of the per-session completion locks. The
request in question was already marked completed by the time our thread
took the session lock.
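A rough sketch of why the check is safe, using generic stand-ins
(SessionSketch, RequestSketch) rather than the actual Objecter types:

  #include <mutex>

  struct RequestSketch { bool completed = false; };

  struct SessionSketch {
    std::mutex completion_lock;   // per-session completion lock
  };

  // The reply path can still hold a ref to the session while it runs the
  // completion for an earlier request, even after the global map rwlock
  // has been dropped; so instead of asserting on the ref, check the
  // request state under the session lock.
  bool already_done(SessionSketch &s, RequestSketch &r) {
    std::lock_guard<std::mutex> l(s.completion_lock);
    return r.completed;         // may legitimately already be true
  }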
Fixes: #9241
Signed-off-by: Sage Weil <sage@redhat.com>
If we cancel a read, revoke the rx buffers to avoid a use-after-free
and/or other undefined badness from using user buffers that may no
longer be present.
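A minimal sketch of the idea, using an illustrative registry type
rather than the real messenger API:

  #include <cstdint>
  #include <map>

  // tid -> user-supplied receive buffer registered for a pending read.
  struct rx_buffer_registry_sketch {
    std::map<uint64_t, char*> buffers;

    // On cancellation, drop the registration so a late reply can no
    // longer be written into memory the caller may already have freed.
    void revoke(uint64_t tid) { buffers.erase(tid); }
  };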
Fixes: #9362
Backport: firefly, dumpling
Reported-by: Matthias Kiefer <matthias.kiefer@1und1.de>
Signed-off-by: Sage Weil <sage@redhat.com>
Verify we don't receive data after a timeout.
Based on reproducer for #9362 written by
Matthias Kiefer <matthias.kiefer@1und1.de>.
Signed-off-by: Sage Weil <sage@redhat.com>
Suppose we start with the following in the cache pool:
30:[29,21,20,15,10,4]:[22(21), 15(15,10), 4(4)]+head
The object doesn't exist at 29 or 20.
First, we flush 4 leaving the backing pool with:
3:[]+head
Then, we begin to flush 15 with a delete with snapc 4:[4] leaving the
backing pool with:
4:[4]:[4(4)]
Then, we finish flushing 15 with snapc 9:[4], leaving the backing
pool with:
9:[4]:[4(4)]+head
Next, snaps 10 and 15 are removed causing clone 10 to be removed leaving
the cache with:
30:[29,21,20,4]:[22(21),4(4)]+head
We next begin to flush 22 by sending a delete with snapc 4:[4] since
prev_snapc is 4 <---------- here is the bug
The backing pool ignores this request since 4 < 9 (ORDERSNAP) leaving it
with:
9:[4]:[4(4)]
Then, we complete flushing 22 with snapc 19:[4] leaving the backing pool
with:
19:[4]:[4(4)]+head
Then, we begin to flush head by deleting with snapc 22:[21,20,4] leaving
the backing pool with:
22:[21,20,4]:[22(21,20), 4(4)]
Finally, we flush head leaving the backing pool with:
30:[29,21,20,4]:[22(21*,20*),4(4)]+head
When we go to flush clone 22, all we know is that 22 is dirty, has snaps
[21], and 4 is clean. As part of flushing 22, we need to do two things:
1) Ensure that the current head is cloned as cloneid 4 with snaps [4] by
sending a delete at snapc 4:[4].
2) Flush the data at snap sequence < 21 by sending a copyfrom with snapc
20:[20,4].
Unfortunately, it is possible that step 1, steps 1 and 2, or step 1
plus part of the flush process for some other, now non-existent clone
has already been performed. Because of that, between steps 1) and 2) we
need to send a second delete ensuring that the object does not exist
at 20.
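A rough sketch of the snap-context arithmetic above; the helper and
types are illustrative, and the context used for the extra delete is
inferred from the description rather than quoted from the code:

  #include <cstdint>
  #include <vector>

  typedef uint64_t snapid;
  struct snapc_sketch { snapid seq; std::vector<snapid> snaps; };

  // For the example above: clone 22 has snaps [21], and the cache
  // pool's snap list (newest to oldest) is [29,21,20,4].
  snapc_sketch copyfrom_snapc(const std::vector<snapid> &pool_snaps,
                              snapid oldest_clone_snap) {
    snapc_sketch out;
    out.seq = oldest_clone_snap - 1;                  // 21 - 1 = 20
    for (snapid s : pool_snaps)
      if (s < oldest_clone_snap)
        out.snaps.push_back(s);                       // -> [20, 4]
    return out;
  }

  // Step 1) deletes at the clean clone's own context, 4:[4]. The extra
  // delete inserted between 1) and 2) presumably uses the copy-from
  // context above (20:[20,4]), so the object is guaranteed not to exist
  // at 20 no matter what an earlier, partially completed flush attempt
  // left behind.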
Fixes: #9054
Backport: firefly
Related: 66c7439ea0
Signed-off-by: Samuel Just <sam.just@inktank.com>
Otherwise, hit_set_create could create a hitset object of unbounded
size.
Fixes: #9339
Backport: firefly
Signed-off-by: Samuel Just <sam.just@inktank.com>
This means distributing a few plugins that are only used for unit
testing, but they do not use much disk space and this is otherwise
harmless.
Explicitly listing which plugins are to be installed is problematic
because some of them (isa for now and maybe more later) are not
available for all architectures. Properly maintaining the list of
plugins to install would therefore mean exactly matching which
architecture has which plugins.
http://tracker.ceph.com/issues/9381
Fixes: #9381
Signed-off-by: Loic Dachary <loic-201408@dachary.org>
Because the erasure code plugin has default k/m values and can
auto-tune them if k or m is invalid, check that the resulting k/m are
the same as what we want.
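A rough sketch of such a check; the accessor names mirror the erasure
code plugin interface (get_data_chunk_count, get_chunk_count) but
should be treated as assumptions here:

  // Stand-in for the erasure code plugin interface.
  struct erasure_code_sketch {
    virtual unsigned get_data_chunk_count() const = 0;   // k
    virtual unsigned get_chunk_count() const = 0;        // k + m
    virtual ~erasure_code_sketch() {}
  };

  // After the plugin initializes (and possibly auto-tunes an invalid
  // profile), verify the effective k/m match the requested values.
  bool km_matches(const erasure_code_sketch &ec,
                  unsigned want_k, unsigned want_m) {
    return ec.get_data_chunk_count() == want_k &&
           ec.get_chunk_count() - ec.get_data_chunk_count() == want_m;
  }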
Signed-off-by: Ma Jianpeng <jianpeng.ma@intel.com>