Commit Graph

74518 Commits

Author SHA1 Message Date
Casey Bodley
6b42352b70 Merge pull request #14624 from ceph/wip-s3a-hadoop
qa/tasks: S3A hadoop task to test s3a with Ceph

Reviewed-by: Casey Bodley <cbodley@redhat.com>
2017-06-23 13:46:05 -04:00
Ming Lin
bc68338581 osd: unlock sdata_op_ordering_lock with sdata_lock hold to avoid missing wakeup signal
We are running mysql on top of rbd. sysbench qps occasionally drops to zero
with the INSERT benchmark.

Debug code captured >2s latency between PG::queue_op() and OSD::dequeue_op().
We finally found that the latency came from the following code in OSD::ShardedOpWQ::_process():

sdata->sdata_cond.WaitInterval(sdata->sdata_lock,
      utime_t(osd->cct->_conf->threadpool_empty_queue_max_wait, 0));

"threadpool_empty_queue_max_wait" is 2s by default.

Normally, it should not sleep for 2s since the incoming IO requests will wake it up.
But there is a small timing window in which it actually misses the wakeup signal.
For example,

     msgr-worker-0 thread                    tp_osd_tp thread
     OSD::ShardedOpWQ::_enqueue              OSD::ShardedOpWQ::_process
     ---------------------------             ---------------------------
T1:                                          sdata_op_ordering_lock.Lock()
T2:  sdata_op_ordering_lock.Lock()
                                             "queue empty"
                                             sdata_op_ordering_lock.Unlock()
     "insert op"
     sdata_op_ordering_lock.Unlock()

T3:  sdata_lock.Lock()
T4:                                          sdata_lock.Lock()
     "send wakeup signal"
     sdata_lock.Unlock()
                                             // here the wakeup signal actually has no effect
                                             // because it has not slept yet.

                                             // then, it sleeps.
                                             WaitInterval(2s)

This patch unlocks sdata_op_ordering_lock while holding sdata_lock in OSD::ShardedOpWQ::_process(),
so the timeline becomes:

     msgr-worker-0 thread                    tp_osd_tp thread
     OSD::ShardedOpWQ::_enqueue              OSD::ShardedOpWQ::_process
     ---------------------------             ---------------------------
T1:                                          sdata_op_ordering_lock.Lock()
T2:  sdata_op_ordering_lock.Lock()
                                             "queue empty"
                                             sdata_lock.Lock()
T3:                                          sdata_op_ordering_lock.Unlock()
     "insert op"
     sdata_op_ordering_lock.Unlock()
     sdata_lock.Lock()

T4:                                          WaitInterval(2s) -> it actually unlocks sdata_lock
     "send wakeup signal"
     sdata_lock.Unlock()
                                             // got the signal, wakes up immediately

With this one line change, we can avoid occasional high latency.

Signed-off-by: Ming Lin <ming.lin@alibaba-inc.com>
2017-06-23 10:30:35 -07:00
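
The locking order described in bc68338581 can be sketched with standard C++ primitives. This is a minimal illustration under stated assumptions, not Ceph's code: `Shard`, `enqueue`, and `process_once` are hypothetical names, and `std::mutex` / `std::condition_variable` stand in for Ceph's own Mutex and Cond types. The point it shows is that the consumer takes the wait lock before it releases the queue lock, so a producer that inserts afterwards cannot signal until the consumer is already waiting.

```cpp
#include <chrono>
#include <condition_variable>
#include <deque>
#include <mutex>

// Hypothetical stand-in for one shard of the op queue (not Ceph's real type).
struct Shard {
  std::mutex ordering_lock;      // plays the role of sdata_op_ordering_lock
  std::mutex wait_lock;          // plays the role of sdata_lock
  std::condition_variable cond;  // plays the role of sdata_cond
  std::deque<int> queue;
};

// Producer: insert under ordering_lock, then signal under wait_lock.
void enqueue(Shard& s, int op) {
  {
    std::lock_guard<std::mutex> l(s.ordering_lock);
    s.queue.push_back(op);
  }
  std::lock_guard<std::mutex> l(s.wait_lock);
  s.cond.notify_one();
}

// Consumer: returns true and fills `out` if an op was dequeued,
// false if the queue was still empty after waiting.
bool process_once(Shard& s, int& out, std::chrono::milliseconds max_wait) {
  std::unique_lock<std::mutex> ordering(s.ordering_lock);
  if (s.queue.empty()) {
    // Take wait_lock BEFORE dropping ordering_lock. A producer that inserts
    // after this point must acquire wait_lock to signal, and it can only do
    // so once wait_for() below has released wait_lock, i.e. while we sleep.
    std::unique_lock<std::mutex> waiting(s.wait_lock);
    ordering.unlock();
    s.cond.wait_for(waiting, max_wait);
    waiting.unlock();
    ordering.lock();
    if (s.queue.empty()) {
      return false;  // timeout or spurious wakeup with nothing queued
    }
  }
  out = s.queue.front();
  s.queue.pop_front();
  return true;
}
```

In the pre-fix ordering, the consumer would call `ordering.unlock()` before acquiring `wait_lock`, which leaves a gap in which the producer can insert and signal before the consumer has started waiting.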
Yehuda Sadeh
90b5efa171 Merge pull request #15665 from oritwas/wip-rgw-reshard-old-bucket
rgw: auto reshard old buckets

Reviewed-by: Yehuda Sadeh <yehuda@redhat.com>
2017-06-23 10:20:42 -07:00
Kefu Chai
5c2774234c osdc/Objecter: release message if it is not handled
Fixes: http://tracker.ceph.com/issues/19741
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-06-24 00:56:17 +08:00
Josh Durgin
9d1f4b68a3 Merge pull request #15821 from jdurgin/wip-20302
qa/suites/powercycle/osd/tasks/radosbench: consume less space

Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-06-23 09:14:44 -07:00
John Spray
45b1a108cc Merge pull request #15154 from jcsp/wip-multimds-stable
Remove "experimental" warnings from multimds
2017-06-23 12:11:16 -04:00
Jos Collin
ef65e7aa53 tools/rbd_mirror: initialize Non-static class member m_do_resync in ImageReplayer
Fixes the Coverity Scan Report:
CID 1412614 (#2-1 of 2): Uninitialized scalar field (UNINIT_CTOR)
7. uninit_member: Non-static class member m_do_resync is not initialized in this constructor nor in any functions that it calls.

Signed-off-by: Jos Collin <jcollin@redhat.com>
2017-06-23 21:40:46 +05:30
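
One common way to address an UNINIT_CTOR report like the one in ef65e7aa53 is a default member initializer, which applies to every constructor. The sketch below is illustrative only: the class `ImageReplayerSketch` and the member `m_id` are hypothetical, and the `bool` type of `m_do_resync` is an assumption, since the commit message names the member but not its type.

```cpp
// Hypothetical class; only the member name m_do_resync comes from the commit.
class ImageReplayerSketch {
public:
  ImageReplayerSketch() = default;
  explicit ImageReplayerSketch(int id) : m_id(id) {}

  bool do_resync() const { return m_do_resync; }
  int id() const { return m_id; }

private:
  int m_id = 0;
  // Default member initializer: every constructor starts from a defined value,
  // even those that do not mention m_do_resync in their initializer list.
  bool m_do_resync = false;  // type assumed to be bool for illustration
};
```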
John Spray
9e7a12b470 doc: multimds is no longer experimental
Signed-off-by: John Spray <john.spray@redhat.com>
2017-06-23 17:07:34 +01:00
John Spray
6ec69a4547 qa: update cephtool test for multimds on by default
Signed-off-by: John Spray <john.spray@redhat.com>
2017-06-23 17:07:34 +01:00
John Spray
b6cfa35458 qa: no longer need to explicitly enable multimds
Signed-off-by: John Spray <john.spray@redhat.com>
2017-06-23 17:07:34 +01:00
John Spray
687b4e909f mds: enable multimds by default in new filesystems
Signed-off-by: John Spray <john.spray@redhat.com>
2017-06-23 17:07:34 +01:00
John Spray
377e8efb6e mon: remove experimental warning on multimds
Signed-off-by: John Spray <john.spray@redhat.com>
2017-06-23 17:07:33 +01:00
Yan, Zheng
7f5bd004b3 mds: don't call StrayManager::eval_stray() for undefined inode
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2017-06-23 17:07:33 +01:00
Yan, Zheng
58a2f98e89 mds: drop locks before waiting for export targets
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2017-06-23 17:07:33 +01:00
Yan, Zheng
7432adb36e mds: handle MDirUpdate race
The MDS may attempt discovery several times for an MDirUpdate; a rename may
kick in and cause MDCache::path_traverse() to return an error.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2017-06-23 17:07:33 +01:00
Yan, Zheng
ad2a95d98c mds: don't forge replica dirfrag
MDCache::forge_replica_dir() sets the wrong dir_auth if the forged replica
dirfrag is a subtree root.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2017-06-23 17:07:32 +01:00
Yan, Zheng
34988c3799 mds: avoid submitting log entry while adjusting subtree map
MDCache::eval_subtree_root() may trigger the scatter-gather process, which
submits a log entry. Submitting a log entry while adjusting the subtree map is
bad, because the subtree map in an intermediate state may get used/logged.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2017-06-23 17:07:32 +01:00
Yan, Zheng
4f43737df8 mds: don't mark nestlock dirty on improper inode
If the inode is a replica and has no auth subtree dirfrag, we should
not mark its nestlock dirty.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2017-06-23 17:07:32 +01:00
Yan, Zheng
944821fec1 mds: create subtree root immediately after directory tree becomes frozen
When a directory tree becomes frozen, its WAIT_FROZEN contexts are
executed asynchronously. Before Migrator::export_frozen() sets the export
bounds, MDCache::try_subtree_merge_at() can merge a newly imported
subtree into the frozen directory tree. This causes problems if there
are auth pins in the newly imported subtree.

The fix is to create the subtree root immediately after the directory tree
becomes frozen. The new subtree root has dir_auth 'me, me', so it's
not mergeable.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2017-06-23 17:07:32 +01:00
Yan, Zheng
9d3bb981dc mds: fix stray dentry replication in cache rejoin ack
To replicate a stray dentry, we need to replicate all its ancestors.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2017-06-23 17:07:31 +01:00
Yuri Weinstein
e2fd54b15c Merge pull request #15795 from myoungwon/wip-print-ignore_redirect
osd/osd_types: add flag name (IGNORE_REDIRECT)

Reviewed-by: Sage Weil <sage@redhat.com>
2017-06-23 08:52:31 -07:00
Casey Bodley
a4d01a6f13 Merge pull request #15656 from aclamk/download_err_with_comp_followup
rgw: use 64-bit offsets for compression

Reviewed-by: Casey Bodley <cbodley@redhat.com>
2017-06-23 11:40:20 -04:00
Sage Weil
eb61a97511 Merge pull request #15848 from xiexingguo/wip-fix-rmcc
src/vstart.sh: kill dead upmap option
2017-06-23 09:42:57 -05:00
Sage Weil
0848bc53d0 Merge pull request #15851 from liewegas/wip-luminous-notes
doc/release-notes: update luminous notes
2017-06-23 09:37:04 -05:00
Sage Weil
987fac2e8c doc/release-notes: 'osd crush class rename' is coming
Signed-off-by: Sage Weil <sage@redhat.com>
2017-06-23 10:36:19 -04:00
Sage Weil
7c58966e4e doc/release-notes: ceph tell <foo> help
Signed-off-by: Sage Weil <sage@redhat.com>
2017-06-23 10:36:19 -04:00
Sage Weil
8f44e6f69a doc/start/os-recommendations: update
Signed-off-by: Sage Weil <sage@redhat.com>
2017-06-23 10:36:19 -04:00
Sage Weil
a5436bd234 doc/release-notes: note debian stretch addition
Signed-off-by: Sage Weil <sage@redhat.com>
2017-06-23 10:36:19 -04:00
Sage Weil
12590d9ec1 doc/release-notes: sleep settings
Signed-off-by: Sage Weil <sage@redhat.com>
2017-06-23 10:36:19 -04:00
Sage Weil
0307c1e5c1 doc/release-notes: link to EC docs
Signed-off-by: Sage Weil <sage@redhat.com>
2017-06-23 10:36:19 -04:00
Sage Weil
7cc863b690 doc/release-notes: update RGW metadata
Signed-off-by: Sage Weil <sage@redhat.com>
2017-06-23 10:36:19 -04:00
Sage Weil
2ba6c29d33 dev/release-notes: various updates from other PR
Signed-off-by: Sage Weil <sage@redhat.com>
2017-06-23 10:36:19 -04:00
Sage Weil
43dcb5be61 doc/release-notes: notes on new CLI commands
Signed-off-by: Sage Weil <sage@redhat.com>
2017-06-23 10:36:19 -04:00
Sage Weil
11d9541a6f mon: 'mon feature list' -> 'mon feature ls'
Signed-off-by: Sage Weil <sage@redhat.com>
2017-06-23 10:36:18 -04:00
Sage Weil
829e767d49 doc/release-notes: update luminous notes
Signed-off-by: Sage Weil <sage@redhat.com>
2017-06-23 10:36:18 -04:00
Kefu Chai
b86cda7040 Merge pull request #15764 from tchaikov/wip-20342
qa/suites/upgrade/hammer-jewel-x: add luminous.yaml

Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
2017-06-23 22:26:03 +08:00
Sage Weil
f6a49b7b16 Merge pull request #15877 from wjwithagen/wip-wjw-vstart-ceph-mgr-restfull
vstart.sh: Work around mgr restfull not available
2017-06-23 08:57:19 -05:00
xie xingguo
58335f1df5 mon/OSDMonitor: slightly nice error output if set-device-class failed
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-06-23 18:37:36 +08:00
xie xingguo
ec1b974be9 mon/OSDMonitor: set result code properly if we fail to process "swap-bucket"
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-06-23 18:37:35 +08:00
xie xingguo
d6ce05f88c mon/OSDMonitor: "osd crush class rename" support
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-06-23 18:37:35 +08:00
Willem Jan Withagen
3ae960d6be ./src/vstart.sh: Work around mgr restfull not available
Signed-off-by: Willem Jan Withagen <wjw@digiware.nl>
2017-06-23 11:33:18 +02:00
Yanhu Cao
35e3b0b34a common/config_opts: drop unused opt
Signed-off-by: Yanhu Cao <gmayyyha@gmail.com>
2017-06-23 17:15:47 +08:00
Willem Jan Withagen
19892df1d8 test/osd/osd-scrub-repair.sh: Adjust for FreeBSD
Fixes 2 problems:
 -  Do not test Bluestore on FreeBSD, since that does not work (yet)
    And all erasure code overwrite tests are executed on BlueStore OSDs
    Erasure code overwrites are unsafe on Filestore, see:
    http://docs.ceph.com/docs/master/rados/operations/erasure-code/#erasure-coding-with-overwrites

 -  the JQ expression errors out with:
    (version 1.5-1-g940132e-dirty)
    ====
    jq: error (at :232): Cannot iterate over null (null)
    Traceback (most recent call last):
    File "", line 1, in
    File "/usr/lib64/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
    File "/usr/lib64/python2.7/json/decoder.py", line 365, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    File "/usr/lib64/python2.7/json/decoder.py", line 383, in raw_decode
    raise ValueError("No JSON object could be decoded")
    ValueError: No JSON object could be decoded
    ====
    Adding a ? to the jq expression allows it to proceed on null blocks.

Signed-off-by: Willem Jan Withagen <wjw@digiware.nl>
2017-06-23 10:33:43 +02:00
Kefu Chai
efc0b61ba7 auth/RotatingKeyRing: use std::move() to set secrets
the param will be thrown away anyway. see
CephxClientHandler::handle_response().

Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-06-23 15:36:14 +08:00
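
The idea in efc0b61ba7, moving a parameter that the caller discards anyway instead of copying it, can be sketched as follows. All names here (`Secrets`, `KeyRingSketch`, `set_secrets`) are hypothetical stand-ins rather than Ceph's actual types or signatures, and taking the parameter by value is an assumption made for illustration.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a bundle of rotating secrets.
struct Secrets {
  std::map<int, std::string> keys;
};

// Hypothetical keyring; not the real RotatingKeyRing interface.
class KeyRingSketch {
public:
  // The argument is a sink: the caller does not need it afterwards,
  // so its contents are moved into the member rather than copied.
  void set_secrets(Secrets s) { secrets_ = std::move(s); }

  std::size_t size() const { return secrets_.keys.size(); }

private:
  Secrets secrets_;
};
```

A caller that is done with its local copy would write `ring.set_secrets(std::move(tmp));`, so the map's nodes are transferred instead of duplicated.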
Nathan Cutler
aab3920977 Merge branch 'master' of /home/smithfarm/src/ceph/upstream/teuthology into wip-swift-task-move-master
2017-06-23 08:30:38 +02:00
Nathan Cutler
7b58ac97e9 tests: move swift.py task to qa/tasks
In preparation for moving this task from ceph/teuthology.git into ceph/ceph.git

The move is necessary because jewel-specific changes are needed, yet teuthology
does not maintain a separate branch for jewel. Also, swift.py is a
Ceph-specific task so it makes more sense to have it in Ceph.

Signed-off-by: Nathan Cutler <ncutler@suse.com>
2017-06-23 08:27:42 +02:00
Kefu Chai
ffeddb4f22 osdc/Objecter: pass vector by const reference
so we can pass temporary object to it as parameter.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-06-23 12:02:03 +08:00
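
The reasoning in ffeddb4f22 rests on a general C++ rule: a temporary can bind to a const lvalue reference but not to a non-const one. A minimal sketch, using a hypothetical function `count_args` rather than any Objecter method:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical function. Because the parameter is a const reference,
// callers may pass a temporary vector directly.
std::size_t count_args(const std::vector<std::string>& cmd) {
  return cmd.size();
}
```

With this signature, `count_args({"osd", "tree"})` compiles; if the parameter were `std::vector<std::string>&`, the same call would be rejected because the braced list creates a temporary.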
Kefu Chai
40b96745f0 mgr: enable ceph_send_command() to send pg command
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-06-23 12:02:03 +08:00
Kefu Chai
fcc3effd8b crypto: allow PK11 module to load even if it's already initialized
there is a chance that other parts of the application have already loaded
the PK11 module and do not finalize it before calling common_init_finish().

also, upon fork, the PK11 module resets its entire status including `nsc_init`,
by which the PK11 module tells whether it is initialized or not. so the behavior
of NSS_InitContext() could be different before and after fork. that's
another reason to ignore the CKR_CRYPTOKI_ALREADY_INITIALIZED error (see
NSS_GetError()).

Fixes: http://tracker.ceph.com/issues/19741
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-06-23 11:31:07 +08:00
Yanhu Cao
beee827390 mon/PaxosService: use __func__ instead of hard code function name
Signed-off-by: Yanhu Cao <gmayyyha@gmail.com>
2017-06-23 10:27:49 +08:00
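
The benefit of `__func__` named in beee827390 is that a log prefix stays correct when a function is renamed. A small sketch with a hypothetical function name (not taken from PaxosService):

```cpp
#include <string>

// Hypothetical function. __func__ is a predefined identifier that expands
// to the name of the enclosing function as a character array.
std::string propose_pending_sketch() {
  return std::string(__func__) + ": proposing";
}
```

Here the returned string begins with `propose_pending_sketch`, with no hard-coded copy of the name to fall out of date.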