We had a problem with bucket recreation: we detected that the bucket
already existed, but missed the fact that it was the same bucket, so
we wrongly removed the bucket index.
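A minimal sketch of the corrected check, using stand-in types and a
stubbed remove_bucket_index(); none of these names are the actual rgw
code:

  #include <iostream>
  #include <string>

  struct BucketInfo {        // stand-in for the rgw bucket metadata
    std::string name;
    std::string bucket_id;   // unique id of this bucket instance
  };

  void remove_bucket_index(const BucketInfo& b) {   // stub
    std::cout << "removing index of " << b.name << "\n";
  }

  // Called when a create request finds an existing bucket entry.
  // Only remove the old bucket index if the existing entry is a
  // *different* bucket instance; recreating the same bucket must
  // leave its index alone.
  void handle_existing_bucket(const BucketInfo& existing,
                              const BucketInfo& incoming) {
    if (existing.bucket_id == incoming.bucket_id)
      return;                // same bucket: nothing to tear down
    remove_bucket_index(existing);
  }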
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
We have the con handy; use it. This avoids generating a spurious RESET
event, which we do not need and do nothing useful with. Note that in this
case we are not attaching anything to the Connection priv field.
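For illustration, the two paths look roughly like this, sketched with a
stand-in Messenger class (not the real msgr API):

  #include <iostream>

  struct Connection { void* priv = nullptr; };  // stand-ins, not real types
  struct entity_addr_t {};

  struct Messenger {
    // Tear down exactly this connection.  The caller already holds
    // the Connection and can clean up its own state, so no RESET
    // event is generated.
    void mark_down(Connection* con) {
      std::cout << "closing known connection, no RESET\n";
      (void)con;
    }

    // Tear down whatever connection matches this address.  The caller
    // does not hold the Connection, so a RESET event is delivered to
    // let its owner drop con <-> session references.
    void mark_down(const entity_addr_t& addr) {
      std::cout << "closing connection by addr, queueing RESET\n";
      (void)addr;
    }
  };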
Signed-off-by: Sage Weil <sage@inktank.com>
If we get a reset during shutdown, we should still break the cycle to avoid
tripping the valgrind leak detection. Note that we are touching no
internal Monitor state here and the locking has not changed.
Signed-off-by: Sage Weil <sage@inktank.com>
Document these in the interface, not the implementation; having two copies
clutters the header and invites them to get out of sync.
Signed-off-by: Sage Weil <sage@inktank.com>
If the caller is marking down an addr, they presumably don't have the
Connection* handy, so we should generate a reset event to help them
clean up con <-> session ref cycles.
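A sketch of the cleanup the RESET event enables, using shared_ptr as a
stand-in for the real refcounting (the actual code uses intrusive refs
and the Connection priv field, not these types):

  #include <memory>

  struct Session;                          // simplified stand-ins
  struct Connection {
    std::shared_ptr<Session> priv;         // con -> session reference
  };
  struct Session {
    std::shared_ptr<Connection> con;       // session -> con reference (a cycle)
  };

  // Called on the dispatcher when a connection it does not hold a
  // pointer to is torn down (e.g. via mark_down(addr)).  Clearing both
  // links here is what lets the reference cycle be reclaimed.
  void ms_handle_reset(Connection* c) {
    std::shared_ptr<Session> s = std::move(c->priv);  // drop con -> session
    if (s)
      s->con.reset();                                 // drop session -> con
  }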
Signed-off-by: Sage Weil <sage@inktank.com>
This is a delta, not a timestamp.
This triggered when a cluster was idle for 2x the
mon_delta_reset_interval and required a mon restart to fix.
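A generic illustration of the distinction, not the actual monitor code
(the 10-second interval below is just a placeholder for
mon_delta_reset_interval):

  #include <chrono>

  using Clock = std::chrono::steady_clock;
  using Secs  = std::chrono::seconds;

  const Secs reset_interval(10);   // placeholder interval

  // A delta is an elapsed duration: compare it to the interval directly.
  bool delta_expired(Secs delta) {
    return delta > reset_interval;
  }

  // A timestamp has to be subtracted from "now" first.  Treating a
  // delta as if it were a timestamp (or vice versa) makes a check like
  // this misbehave.
  bool timestamp_expired(Clock::time_point stamp) {
    return Clock::now() - stamp > reset_interval;
  }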
Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
We periodically see strange values come out of the estimated cluster
throughput and recovery rates. Pretty sure this is caused by feeding
negative values into the rate arithmetic and then giving the si_t
helpers mangled (sign-extended + bit shifted) values.
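For illustration, here is how a negative rate turns into garbage once
it reaches an unsigned SI-style helper; to_kb() below is a stand-in,
not the real si_t/kb_t:

  #include <cstdint>
  #include <iostream>

  // Stand-in for an si_t/kb_t style helper: formats an *unsigned*
  // byte count in KB.
  uint64_t to_kb(uint64_t bytes) { return bytes >> 10; }

  int main() {
    int64_t rate = -1500;        // e.g. a bogus negative recovery rate

    // Sign extension turns -1500 into 18446744073709550116, and the
    // shift then yields a huge, meaningless "rate".
    std::cout << to_kb(rate) << " KB/s (mangled)\n";

    // Clamping before formatting keeps the output sane.
    if (rate < 0)
      rate = 0;
    std::cout << to_kb(rate) << " KB/s (clamped)\n";
  }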
Signed-off-by: Sage Weil <sage@inktank.com>
si_t (and friends) does not handle signed values, but at least we can
give the Formatters unmangled values. This shouldn't happen (tm), but
if it does, this will make things a bit less confusing and make the
code a bit less fragile.
Signed-off-by: Sage Weil <sage@inktank.com>
We also assert in on_flushed() that the temp collection is actually
empty.
Fixes: #5670
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
We often want to maintain a nonnegative value. We generalize this to
floors other than zero only because it makes the function call read
intuitively; I don't think the extra generality is actually useful.
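A sketch of such a helper; the name and signature are illustrative,
not necessarily what the tree ends up with:

  // Clamp v up to at least `floor`; with the default floor of 0 this
  // simply keeps the value nonnegative.
  template <typename T>
  void floor_to(T& v, T floor = T(0)) {
    if (v < floor)
      v = floor;
  }

  // e.g. floor_to(bytes_per_sec);  // clamp a possibly-negative rate to 0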
Signed-off-by: Sage Weil <sage@inktank.com>
If we see a peer reporting features ~0ull, we know they are deluded in a
particular way and should infer what features they *actually* have. Do
this right when the features come over the wire to catch all users.
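A sketch of the decode-time quirk; QUIRK_LEGACY_FEATURES below is a
made-up placeholder for the feature set such peers are known to
actually have:

  #include <cstdint>

  // Placeholder for the feature bits that peers reporting ~0ull are
  // known to actually support.
  const uint64_t QUIRK_LEGACY_FEATURES = 0x00000000000fffffull;  // made up

  // Applied right where the feature bits come off the wire so that
  // every user of the decoded value sees the corrected set.
  uint64_t fixup_features(uint64_t wire_features) {
    if (wire_features == ~0ull)
      return QUIRK_LEGACY_FEATURES;
    return wire_features;
  }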
Fixes: #5655
Signed-off-by: Samuel Just <sam.just@inktank.com>
Signed-off-by: Sage Weil <sage@inktank.com>
It's possible for us to just be really slow when getting the reply to the
first op or doing the second op, resulting in a successful lock. If we
do get a success, assert that at least that amount of time has passed to
avoid any false positives.
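The shape of the check, as a simplified standalone sketch (the lock
helpers and lock_duration are stand-ins for the actual test code):

  #include <cassert>
  #include <chrono>
  #include <functional>

  // first_lock / second_lock are stand-ins for the two lock ops in the
  // test; lock_duration is how long the first lock is held.
  void lock_race_check(std::function<void()> first_lock,
                       std::function<bool()> second_lock,
                       std::chrono::seconds lock_duration) {
    auto start = std::chrono::steady_clock::now();
    first_lock();

    // Normally the second attempt should fail because the first lock
    // is still held.  But if we were slow enough getting here, the
    // first lock may have legitimately expired, so a success is only
    // acceptable if at least lock_duration has elapsed.
    if (second_lock()) {
      auto elapsed = std::chrono::steady_clock::now() - start;
      assert(elapsed >= lock_duration);
    }
  }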
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
In the event of a split or collection rename, we need to ensure that
we don't replay any operations on objects within those collections
prior to that point. Thus, we mark a global replay guard on the
collection after doing a syncfs and make sure to check that in
_check_replay_guard() for all object operations.
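A high-level sketch of the ordering, with stand-in types and names
(not the FileStore code itself):

  #include <cstdint>
  #include <map>
  #include <string>

  // Stand-in: a "collection" keeps the sequence number of the last
  // split/rename that affected it.
  struct CollectionGuards {
    std::map<std::string, uint64_t> guard_seq;  // coll -> guard op_seq

    // After the split/rename has been made durable (the real code does
    // the syncfs first), stamp the collection with the current op
    // sequence number.
    void set_global_replay_guard(const std::string& coll, uint64_t op_seq) {
      guard_seq[coll] = op_seq;
    }

    // During journal replay, skip any object operation whose sequence
    // number predates the guard: its effects are already on disk.
    bool check_replay_guard(const std::string& coll, uint64_t op_seq) const {
      auto it = guard_seq.find(coll);
      if (it != guard_seq.end() && op_seq <= it->second)
        return false;   // do not replay
      return true;      // safe to replay
    }
  };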
Fixes: #5154
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
We shouldn't hold the pipe_lock while doing the ms_verify_authorizer
upcalls.
Fix by unlocking a bit earlier, and verifying our state is still correct
in the failure path.
This regression was introduced by ecab4bb951.
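The general shape of the fix, sketched with std::mutex and stand-in
state (not the actual Pipe code):

  #include <mutex>

  std::mutex pipe_lock;
  int state = 1;                       // stand-in for the pipe state
  const int STATE_ACCEPTING = 1;

  bool ms_verify_authorizer() {        // stand-in for the (slow) upcall
    return true;
  }

  bool accept_step() {
    std::unique_lock<std::mutex> l(pipe_lock);
    // ... handshake work under the lock ...

    // Drop the lock before the upcall; it may block for a while, and
    // holding pipe_lock across it invites stalls and deadlocks.
    l.unlock();
    bool ok = ms_verify_authorizer();
    l.lock();

    // While the lock was dropped the pipe may have been stopped or
    // replaced, so re-check the state before trusting the result; this
    // is the extra verification in the failure path.
    if (state != STATE_ACCEPTING)
      return false;
    return ok;
  }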
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
This is what e213b1bc25 intended to do
but managed to bungle by using >= instead of >.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
osd: include op queue age histogram in osd_stat_t
Reviewed-by: Greg Farnum <greg@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
  rgw/rgw_rados.cc: In member function 'virtual int RGWPutObjProcessor_Atomic::handle_data(ceph::bufferlist&, off_t, void**)':
  rgw/rgw_rados.cc:648:5: warning: parameter 'ofs' set but not used [-Wunused-but-set-parameter]
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
Currently we see slow request warnings go by in the cluster log, but they
are not reflected by 'ceph health'. Use the new op queue histograms to
raise a flag there as well.
For example:
  HEALTH_WARN 59 requests are blocked > 32 sec; 2 osds have slow requests
  21 ops are blocked > 65.536 sec
  38 ops are blocked > 32.768 sec
  16 ops are blocked > 65.536 sec on osd.1
  23 ops are blocked > 32.768 sec on osd.1
  5 ops are blocked > 65.536 sec on osd.2
  15 ops are blocked > 32.768 sec on osd.2
  2 osds have slow requests
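A sketch of how counts like the ones above can be derived from a
power-of-two age histogram; the bucket layout and the 32-second
threshold are assumptions, not the actual osd_stat_t fields or config:

  #include <cstdint>
  #include <iostream>
  #include <vector>

  // Simplified power-of-two age histogram: h[i] counts ops whose age
  // falls in [2^i, 2^(i+1)) milliseconds.
  void report_slow_ops(const std::vector<uint64_t>& h,
                       double warn_threshold_sec = 32.0) {
    for (std::size_t i = 0; i < h.size(); ++i) {
      double bucket_sec = double(uint64_t(1) << i) / 1000.0;
      if (h[i] > 0 && bucket_sec > warn_threshold_sec)
        std::cout << h[i] << " ops are blocked > "
                  << bucket_sec << " sec\n";
    }
  }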
Fixes: #5505
Signed-off-by: Sage Weil <sage@inktank.com>