mon/MgrMonitor: change 'unresponsive' message to info level
Reviewed-by: David Zafman <dzafman@redhat.com>
Reviewed-by: John Spray <john.spray@redhat.com>
We generate a MGR_DOWN health warning at the appropriate points; having
this at WRN level just triggers failed teuthology runs but doesn't add much
value for the user.
Clear out teuthology whitelisting for this message.
Fixes: http://tracker.ceph.com/issues/24222
Signed-off-by: Sage Weil <sage@redhat.com>
The patch fixes a race condition between
`unregister_inflight_op` and `visit_ops_in_flight` of
`OpTracker`. When a callable passed to the latter
turns the plain reference it gets into a `TrackedOpRef`,
an about-to-terminate `TrackedOp` (with `nref == 0`)
can be resurrected (`nref++`). This results in an
extra call to `unregister_inflight_op` for the same op,
leading to e.g. a use-after-free. For more details see:
https://tracker.ceph.com/issues/24037#note-5.
The fix deals with the problem by ensuring the visitor
is never called for ops whose `nref` has dropped to zero.
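
A minimal sketch of the idea, not the actual `OpTracker` code: `Op` and
`Tracker` are hypothetical stand-ins for `TrackedOp` and `OpTracker`, and
the single mutex is an assumption made to keep the example short. The
visitor is skipped for any op whose `nref` is already zero, so it cannot
take a new reference on an op that is about to be unregistered.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <list>
#include <mutex>

// Hypothetical stand-in for TrackedOp: only the reference count matters here.
struct Op {
  std::atomic<int> nref{1};
  int id;
  explicit Op(int i) : id(i) {}
};

// Hypothetical stand-in for OpTracker's list of in-flight ops.
class Tracker {
  std::mutex lock;
  std::list<Op*> in_flight;

public:
  void register_op(Op* op) {
    std::lock_guard<std::mutex> l(lock);
    in_flight.push_back(op);
  }

  void unregister_op(Op* op) {
    std::lock_guard<std::mutex> l(lock);
    in_flight.remove(op);
  }

  // Call the visitor for every op that is still referenced. Ops with
  // nref == 0 are skipped, so the visitor never gets a chance to
  // resurrect them. Returns the number of ops actually visited.
  int visit(const std::function<void(Op&)>& visitor) {
    std::lock_guard<std::mutex> l(lock);
    int visited = 0;
    for (Op* op : in_flight) {
      if (op->nref.load() == 0) {
        continue;  // dying op: its unregister call is already on the way
      }
      visitor(*op);
      ++visited;
    }
    return visited;
  }
};

// Registers one live op and one op whose last reference was just dropped,
// then counts how many ops a reference-taking visitor gets to see.
int count_visited_with_one_dying_op() {
  Op live(1), dying(2);
  Tracker t;
  t.register_op(&live);
  t.register_op(&dying);
  dying.nref = 0;  // last reference dropped, unregister has not run yet
  int n = t.visit([](Op& op) { ++op.nref; --op.nref; });
  assert(dying.nref.load() == 0);  // the dying op was never touched
  t.unregister_op(&dying);
  t.unregister_op(&live);
  return n;
}
```

In this sketch only the live op reaches the visitor; the dying op keeps
`nref == 0` until it is unregistered exactly once.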
Fixes: http://tracker.ceph.com/issues/24037
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
rgw: use DoutPrefixProvider to add more context to log output
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Matt Benjamin <mbenjamin@redhat.com>
Reviewed-by: Abhishek Lekshmanan <abhishek@suse.com>
rgw: use partial-order bucket listing in RGWLC, add configurable processing delay
Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>
Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
rgw: Do not modify email if argument is not set
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: Matt Benjamin <mbenjamin@redhat.com>
Reviewed-by: Abhishek Lekshmanan <abhishek@suse.com>
When we merged the PR to unify the metadata labels, we forgot to switch
the order of hostname and disk in the ceph_disk_occupation metric.
Signed-off-by: Boris Ranto <branto@redhat.com>
Currently _fsync calls flush_bdev to make the data safe. But flush_bdev
flushes all devices, regardless of whether they hold data for this sync.
So add a new API, flush_bdev(std::array<bool, MAX_BDEV>& dirty_bdevs),
which flushes only the devices dirtied by this sync op.
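
A rough sketch of the per-device flush, under stated assumptions:
`FakeDevice`, `flush_dirty` and the value of `MAX_BDEV` are hypothetical
and only illustrate the dirty-flag array from the signature above; the
real implementation may track and reset the flags differently.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t MAX_BDEV = 3;  // assumed value, for illustration only

// Hypothetical device that just counts how often it was flushed.
struct FakeDevice {
  int flush_count = 0;
  void flush() { ++flush_count; }
};

// Flush only the devices whose dirty flag is set for this sync op.
// Returns the number of devices that were flushed.
int flush_dirty(std::array<FakeDevice, MAX_BDEV>& devs,
                std::array<bool, MAX_BDEV>& dirty_bdevs) {
  int flushed = 0;
  for (std::size_t i = 0; i < MAX_BDEV; ++i) {
    if (!dirty_bdevs[i]) {
      continue;  // nothing was written to this device by this sync
    }
    devs[i].flush();
    ++flushed;
  }
  return flushed;
}

// Marks only the middle device dirty and reports how many flushes happened.
int demo_flush_dirty() {
  std::array<FakeDevice, MAX_BDEV> devs{};
  std::array<bool, MAX_BDEV> dirty{{false, true, false}};
  int flushed = flush_dirty(devs, dirty);
  assert(devs[0].flush_count == 0 && devs[2].flush_count == 0);
  return flushed;
}
```

With one dirty device out of three, the sketch issues a single flush
instead of three.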
Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
If the bucket is empty or does not have weight-set weights yet, avoid
crashing when populating the parent bucket.
Fixes: http://tracker.ceph.com/issues/23386
Signed-off-by: Sage Weil <sage@redhat.com>
This avoids adjusting the oncommits without a lock after the txc is
queued on the sequencer.
This is a bit defensive since the ObjectStore caller doesn't call
flush_commit() at the same time as queue_transaction(), but that could
change in the future.
Signed-off-by: Sage Weil <sage@redhat.com>
There is a narrow race possible:
A: lookup foo
A: put on foo
A: foo --nref == 0
B: lookup foo
B: put foo
B: foo --nref == 0
B: try_remove() succeeds, removes
A: try_remove() tries to remove foo again, probably crashes
We could fix this by flagging the object in some way to indicate it was
removed (maybe clearing parent?), but then we need to be careful about
dereferencing foo to get parent from put().
Fix this by moving to a simpler model: make lookup fail if nref == 0.
This eliminates the races around put() entirely because once nref reaches
0 it never goes up again.
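
The simpler model can be sketched as follows. This is an illustration
only: `Registry`, `Node` and the string keys are hypothetical, and a
single mutex guards everything, which is an assumption that differs from
the real code. The point is that `lookup()` refuses objects with
`nref == 0`, so only the thread that dropped the last reference ever
calls `try_remove()`.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <string>

// Hypothetical reference-counted object.
struct Node {
  int nref = 0;
};

// Hypothetical registry keyed by name.
class Registry {
  std::mutex lock;
  std::map<std::string, Node> nodes;

public:
  // Create an object holding one initial reference.
  void create(const std::string& name) {
    std::lock_guard<std::mutex> l(lock);
    nodes[name].nref = 1;
  }

  // Return nullptr for unknown objects and for objects whose nref is
  // already 0: once the count reaches 0 it never goes up again.
  Node* lookup(const std::string& name) {
    std::lock_guard<std::mutex> l(lock);
    auto it = nodes.find(name);
    if (it == nodes.end() || it->second.nref == 0) {
      return nullptr;
    }
    ++it->second.nref;
    return &it->second;
  }

  // Drop one reference; returns the new count, or -1 if unknown.
  int put(const std::string& name) {
    std::lock_guard<std::mutex> l(lock);
    auto it = nodes.find(name);
    if (it == nodes.end()) {
      return -1;
    }
    return --it->second.nref;
  }

  // Remove the object if it is unreferenced; returns true on removal.
  bool try_remove(const std::string& name) {
    std::lock_guard<std::mutex> l(lock);
    auto it = nodes.find(name);
    if (it == nodes.end() || it->second.nref != 0) {
      return false;
    }
    nodes.erase(it);
    return true;
  }
};
```

In the A/B sequence above, B's lookup would now fail after A's put, so B
never reaches its own put or try_remove.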
Fixes: http://tracker.ceph.com/issues/24211
Signed-off-by: Sage Weil <sage@redhat.com>
* refs/pull/22091/head:
crush: update choose_args on bucket removal
crush: update choose_args on bucket removal, resize, or position mismatch
crush: create weight-set on demand when doing a choose-args reweight
test/cli/crushtool: use straw2 buckets for choose-args test
crush: weight_set_size -> weight_set_positions
Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
We were updating the txc state to KV_DONE and queuing the oncommits
waiters without holding any locks. This was mostly fine, *except* that
Collection|OpSequencer::flush_commit(Context *) was looking at the state
(under qlock) and also adding items to oncommits.
The flush_commit() method is only used in 2 places: osd bench, and the
PG reset_interval_flush outgoing message blocking machinery (which is
a bit ick). The first we could get rid of, but the second is hard to
remove (despite its ick factor).
The simple fix is to take qlock while updating the state value and
working with oncommits.
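
A simplified sketch of that locking pattern, not the BlueStore code
itself: `Sequencer`, `TxcState` and `finish()` are hypothetical names, and
running the waiters after dropping the lock is a design choice of this
example. Both the state change and every access to `oncommits` happen
under `qlock`, so `flush_commit()` either sees `KV_DONE` or gets its
waiter queued before the list is drained.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

enum class TxcState { QUEUED, KV_DONE };

// Hypothetical stand-in for the sequencer and its single transaction.
struct Sequencer {
  std::mutex qlock;
  TxcState state = TxcState::QUEUED;
  std::vector<std::function<void()>> oncommits;

  // Returns true if the transaction has already committed. Otherwise the
  // waiter is queued and false is returned. Both steps run under qlock.
  bool flush_commit(std::function<void()> waiter) {
    std::lock_guard<std::mutex> l(qlock);
    if (state == TxcState::KV_DONE) {
      return true;
    }
    oncommits.push_back(std::move(waiter));
    return false;
  }

  // Mark the transaction done and take ownership of the waiters while
  // holding qlock, then run them without the lock held.
  void finish() {
    std::vector<std::function<void()>> to_run;
    {
      std::lock_guard<std::mutex> l(qlock);
      state = TxcState::KV_DONE;
      to_run.swap(oncommits);
    }
    for (auto& fn : to_run) {
      fn();
    }
  }
};
```

A waiter registered before `finish()` fires exactly once; a call to
`flush_commit()` made afterwards returns true and queues nothing.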
Fixes: http://tracker.ceph.com/issues/21480
Signed-off-by: Sage Weil <sage@redhat.com>
* refs/pull/21540/head:
tests/crypto: print compile warning when NSS is unavailable.
tests/crypto: add tests for the no-bl encrypt/decrypt, part 2.
tests/crypto: add tests for the no-bl encrypt/decrypt.
auth: use OpenSSL for CryptoAESKeyHandler's no-bl encrypt/decrypt.
auth: extend CryptoKey with no-bl encrypt/decrypt.
auth: CryptoAESKeyHandler switches from NSS to OpenSSL.
auth: the outbuf of AES should be multiple of block size
auth: cache the PK11Context for CryptoAESKeyHandler