We are running MySQL on top of RBD. sysbench QPS occasionally drops to
zero with the INSERT benchmark.
Debug code captured >2s latency between PG::queue_op() and OSD::dequeue_op().
We finally found that the latency came from the following code in
OSD::ShardedOpWQ::_process():

  sdata->sdata_cond.WaitInterval(sdata->sdata_lock,
      utime_t(osd->cct->_conf->threadpool_empty_queue_max_wait, 0));

"threadpool_empty_queue_max_wait" is 2s by default.
Normally it should not sleep for 2s, since incoming IO requests wake it
up. But there is a small timing window in which the wakeup signal is
actually missed.
For example:

  msgr-worker-0 thread                 tp_osd_tp thread
  OSD::ShardedOpWQ::_enqueue           OSD::ShardedOpWQ::_process
  ---------------------------          ---------------------------
  T1:                                  sdata_op_ordering_lock.Lock()
  T2: sdata_op_ordering_lock.Lock()
                                       "queue empty"
                                       sdata_op_ordering_lock.Unlock()
      "insert op"
      sdata_op_ordering_lock.Unlock()
  T3: sdata_lock.Lock()
  T4:                                  sdata_lock.Lock()
      "send wakeup signal"
      sdata_lock.Unlock()
                                       // here the wakeup signal has no
                                       // effect, because it has not
                                       // slept yet. then, it sleeps.
                                       WaitInterval(2s)
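The same lost-wakeup pattern, reduced to a minimal standalone sketch
with std::mutex/std::condition_variable (the names op_ordering_lock,
wakeup_lock, cond, and the simplified structure are stand-ins for
illustration, not the actual Ceph types):

  #include <chrono>
  #include <condition_variable>
  #include <deque>
  #include <mutex>

  std::mutex op_ordering_lock;   // plays the role of sdata_op_ordering_lock
  std::mutex wakeup_lock;        // plays the role of sdata_lock
  std::condition_variable cond;  // plays the role of sdata_cond
  std::deque<int> ops;

  // consumer side (_process), buggy ordering
  void process_buggy() {
    std::unique_lock<std::mutex> q(op_ordering_lock);
    if (ops.empty()) {
      q.unlock();  // window opens: producer may insert + signal here
      std::unique_lock<std::mutex> w(wakeup_lock);
      // an earlier signal is lost; we sleep the full timeout
      cond.wait_for(w, std::chrono::seconds(2));
    }
  }

  // producer side (_enqueue)
  void enqueue(int op) {
    {
      std::lock_guard<std::mutex> q(op_ordering_lock);
      ops.push_back(op);
    }
    std::lock_guard<std::mutex> w(wakeup_lock);
    cond.notify_one();  // no effect if the consumer is not waiting yet
  }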
This patch unlocks sdata_op_ordering_lock with sdata_lock held in
OSD::ShardedOpWQ::_process(), and the timeline becomes:
  msgr-worker-0 thread                 tp_osd_tp thread
  OSD::ShardedOpWQ::_enqueue           OSD::ShardedOpWQ::_process
  ---------------------------          ---------------------------
  T1:                                  sdata_op_ordering_lock.Lock()
  T2: sdata_op_ordering_lock.Lock()
                                       "queue empty"
                                       sdata_lock.Lock()
  T3:                                  sdata_op_ordering_lock.Unlock()
      "insert op"
      sdata_op_ordering_lock.Unlock()
      sdata_lock.Lock()
  T4:                                  WaitInterval(2s)
                                       // -> it actually unlocks sdata_lock
      "send wakeup signal"
      sdata_lock.Unlock()
                                       // got the signal, wakes up immediately
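In the reduced sketch above, the fix corresponds to acquiring the
wakeup lock before releasing the queue lock, so the producer's
notify_one() can only run once the consumer is already inside
wait_for() (same illustrative names as before):

  // consumer side (_process), fixed ordering
  void process_fixed() {
    std::unique_lock<std::mutex> q(op_ordering_lock);
    if (ops.empty()) {
      std::unique_lock<std::mutex> w(wakeup_lock);  // take it first
      q.unlock();
      // the producer now blocks on wakeup_lock until wait_for()
      // releases it, so the signal cannot be missed
      cond.wait_for(w, std::chrono::seconds(2));
    }
  }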
With this one-line change, we avoid the occasional high latency.
Signed-off-by: Ming Lin <ming.lin@alibaba-inc.com>
Fixes the Coverity Scan Report:
CID 1412614 (#2-1 of 2): Uninitialized scalar field (UNINIT_CTOR)
7. uninit_member: Non-static class member m_do_resync is not initialized in this constructor nor in any functions that it calls.
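The usual shape of such a fix, as a hypothetical sketch (only the
member name m_do_resync comes from the report; the class itself is
invented for illustration):

  class Replayer {  // hypothetical class, for illustration only
  public:
    Replayer() : m_do_resync(false) {}  // initialize the scalar in the ctor
  private:
    bool m_do_resync;
    // equivalently, a default member initializer would do:
    //   bool m_do_resync = false;
  };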
Signed-off-by: Jos Collin <jcollin@redhat.com>
mds may try the discover several times for MDirUpdate; a rename may
kick in and cause MDCache::path_traverse() to return an error.
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
MDCache::eval_subtree_root() may trigger the scatter-gather process,
which submits a log entry. Submitting a log entry while adjusting the
subtree map is bad, because a subtree map in an intermediate state may
get used/logged.
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
When a directory tree becomes frozen, its WAIT_FROZEN contexts are
executed asynchronously. Before Migrator::export_frozen() sets the
export bounds, MDCache::try_subtree_merge_at() can merge a newly
imported subtree into the frozen directory tree. This causes problems
if there are auth pins in the newly imported subtree.

The fix is to create the subtree root immediately after the directory
tree becomes frozen. The new subtree root has dir_auth 'me, me', so it
is not mergeable.
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Fixes 2 problems:
- Do not test BlueStore on FreeBSD, since that does not work (yet).
  All erasure-code overwrite tests are executed on BlueStore OSDs,
  because erasure-code overwrites are unsafe on FileStore; see:
  http://docs.ceph.com/docs/master/rados/operations/erasure-code/#erasure-coding-with-overwrites
- The jq expression errors out with:
(version 1.5-1-g940132e-dirty)
====
jq: error (at <stdin>:232): Cannot iterate over null (null)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/usr/lib64/python2.7/json/decoder.py", line 365, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib64/python2.7/json/decoder.py", line 383, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
====
Adding a ? to the jq expression allows it to proceed on null blocks
(for illustration, `.foo[]?` produces no output where `.foo[]` would
error out on a null `.foo`).
Signed-off-by: Willem Jan Withagen <wjw@digiware.nl>
In preparation for moving this task from ceph/teuthology.git into
ceph/ceph.git.

The move is necessary because jewel-specific changes are needed, yet
teuthology does not maintain a separate branch for jewel. Also,
swift.py is a Ceph-specific task, so it makes more sense to have it in
Ceph.
Signed-off-by: Nathan Cutler <ncutler@suse.com>
there is a chance that other parts of the application load the PK11
module already and do not finalize it before calling
common_init_finish().

also, upon fork, the PK11 module resets its entire state, including
`nsc_init`, by which the PK11 module tells whether it is initialized or
not. so the behavior of NSS_InitContext() could differ before and after
fork. that's another reason to ignore the
CKR_CRYPTOKI_ALREADY_INITIALIZED error (see NSS_GetError()).
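a minimal sketch of the pattern (not the actual Ceph code; it assumes
the already-initialized condition is surfaced through the NSPR error
stack via PR_GetError(), and the init flags shown are illustrative):

  #include <nss.h>      // NSS_InitContext, NSS_INIT_* flags
  #include <pkcs11t.h>  // CKR_CRYPTOKI_ALREADY_INITIALIZED
  #include <prerror.h>  // PR_GetError

  bool init_crypto()
  {
    NSSInitContext *ctx = NSS_InitContext(
        "", "", "", "", nullptr,
        NSS_INIT_READONLY | NSS_INIT_NOCERTDB | NSS_INIT_NOMODDB);
    if (ctx != nullptr)
      return true;
    // another component (or the pre-fork parent) already initialized
    // the PK11 module; treat that specific failure as non-fatal
    return PR_GetError() == CKR_CRYPTOKI_ALREADY_INITIALIZED;
  }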
Fixes: http://tracker.ceph.com/issues/19741
Signed-off-by: Kefu Chai <kchai@redhat.com>