mirror of
https://github.com/ceph/ceph
synced 2025-01-03 01:22:53 +00:00
75434ff70b
152 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
Matt Benjamin
|
610d66f531 |
Ceph Accelio/RDMA Transport (XioMessenger).
XioMessenger implements a Ceph Messenger provider for Accelio, a high-performance messaging transport by Mellanox. Current Accelio is layered on ibverbs, and supports InfiniBand, RoCE, and other RDMA transports. Future Accelio versions will support alternative transports (including TCP), and flexible transport selection.

config: cluster_rdma drives messenger creation
ceph_mds, ceph_mon and ceph_osd use XioMessengers for cluster communication when cluster_rdma is set.

Move XioMessenger to msg/xio.
This matches the other new Messenger locations.

test: tests for tcp and xio messengers
(Not tests only.)

buffer: add subclass for xio buffers

xio: convert to Connection::send_message interface

config: -x, --xio as aliases for client_rdma

ceph-fuse: create xio messenger if client_rdma
Find XioMessenger.h and QueueStrategy.h in msg/xio.

ceph-syn: create xio messenger if client_rdma

librados: create xio messenger if client_rdma
Find XioMessenger.h and QueueStrategy.h in msg/xio.

Restore non-abort from Xio Mon integration.

Fix xio_client send count, again.

xio: must signal cond under mutex lock

xio: dispatch strategies support ms_fast_dispatch

xio: config variable xio_port_shift
Remove set_port_shift() from XioMessenger, and just use the value from the configuration.

xio: don't depend on g_ceph_context for dout
XioMessenger now uses its own cct for all logging operations. The Accelio log function, however, still depends on a global CephContext, so we maintain an extra one, separate from g_ceph_context, in XioMessenger.cc. It is initialized on first construction and a reference is held indefinitely.

script: cephfsnew to automate pool and fs creation

Use new on_ow_msg_send_complete hook.
Replaces on_msg_delivered for one-way message style.

Prototype new xio_discon behavior.
On shutdown, XioPortal threads should not exit before Accelio finalizes all sessions.

Inline join_sessions, it needs sh_mtx held across wait loop.

Fix assert on Cond::Signal. Adds Cond2.

Avoid deadlock, xio_disconnect can deliver a session teardown event. Also Mutex2.
(Note, Mutex2 and Cond2 are replaced by standard C++ downstream.)

Restore SimpleDispatcher Timings.
The simple_client/simple_server timings are based on a ping/pong of messages between the client and server, unlike those of the xio_client/server programs, which are one-way (so their corresponding 1-way bandwidth is approx. 2x what the test reports). We assert that the results are in general comparable, because in both setups a fixed number of messages (default 50) is maintained in flight.

Wrap Accelio mempool in XioPool, add stats.
To enable stat prints, set xio_trace_mempool. Currently, prints to stdout at each 64K messages sent or received.

Restore _send_message(..).

Fix merge errors in simple_client, simple_dispatcher.

xio: fix for size in pool stats

Add in/outbound msg counters to XioPoolStats.

Pool stats are easier to read.
Pool stats are easier to read, and if enabled, print on session teardown. This is a convenient time to view stats, and with a small

Make pool stats counters atomic.

Track requests using hook ctor/dtor.
Lockless, portal thread provides atomicity.

Adapt to recent changes on Accelio for_next
* Accelio options now of opaque type
* on_msg_err with extra direction param
* RDMA behavior now governed by 2 new options: XIO_OPTNAME_MAX_INLINE_DATA, XIO_OPTNAME_MAX_INLINE_HEADER
* Separated send and recv queue depth

xio_messenger: Change xio optname queue depth msgs
* Set 16k threshold to rdma buffers instead of send
* Change xio optname for queue depth msgs: XIO_OPTNAME_SND/RCV_QUEUE_DEPTH_MSGS

xio_messenger: Protect Accelio queue depth.
(Minimal send flow control.) The guard is per xio_connection, and takes batches into account. Increment happens only if xio_send_msg succeeded, decrement in on_ow_msg_send_complete and on_msg_error. Note that we don't need atomics because counters are touched only in the correct portal thread.

Find XioMsg.h in msg/xio

Find XioMessenger.h and QueueStrategy.h in msg/xio (tests).

Adapt to 2 Accelio API changes.
1. xio_context_stop_loop takes only 1 argument
2. xio_connect() now takes a structure argument, by reference

Set CMP0046 iff CMake version >= 3

Move XioMessenger to msg/xio

xio: fix for segfault on xio_connect()

No more Mutex2, Cond2.

xio: number of portal threads is configurable

xio: only create additional portals on bind()

xio: use QueueStrategy(1) as default

xio: Messenger factory accepts ms_type "xio"

xio: use ms_type instead of client,cluster_rdma
Removes the ability to configure the client and cluster networks separately, in favor of a single global messenger type. --xio is now a command-line alias for --ms_type xio. All daemons now use the Messenger::create() factory function instead of conditionally creating XioMessengers. The OSD and Monitor classes no longer need separate messengers to deal with both tcp/rdma clients.

xio: portal binding honors ms_bind_port_min,max

xio: remove xio_port_shift
Port shifting is no longer necessary, because we won't create both tcp and xio messengers for the same service.

Use Accelio sglist helper macros.

xio: make xio buffer unshareable

xio: Nuke special_handling.

Replace GENERIC with MON (requested by Sage).

Signed-off-by: Casey Bodley <casey@cohortfs.com>
Signed-off-by: Vu Pham <vu@mellanox.com>
Signed-off-by: Matt Benjamin <matt@cohortfs.com> |
||
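The "Protect Accelio queue depth" note in the commit above describes a per-connection send guard: the counter grows only when a send succeeded, shrinks on send-complete or message-error, and uses no atomics because a single portal thread touches it. A minimal sketch of that idea follows. It is not the XioMessenger code: `SendQueueGuard`, `try_send`, and `on_done` are hypothetical names chosen for illustration, and the `send` callable stands in for the real transport call.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical per-connection guard; names are illustrative, not Ceph's.
class SendQueueGuard {
public:
  explicit SendQueueGuard(uint32_t max_depth) : max_depth_(max_depth) {}

  // True if a batch of n messages still fits under the queue depth.
  bool can_send(uint32_t n) const { return n <= max_depth_ - in_flight_; }

  // Attempt to send a batch of n messages. `send` is any callable that
  // returns true on success (a stand-in for the transport's send call).
  // The counter is incremented only if the send succeeded.
  template <typename SendFn>
  bool try_send(uint32_t n, SendFn&& send) {
    if (!can_send(n)) return false;            // would exceed queue depth
    if (!std::forward<SendFn>(send)()) return false;  // refused: no change
    in_flight_ += n;
    return true;
  }

  // Call from the send-complete or message-error path.
  void on_done(uint32_t n = 1) {
    in_flight_ -= (n < in_flight_ ? n : in_flight_);  // clamp at zero
  }

  uint32_t in_flight() const { return in_flight_; }

private:
  uint32_t max_depth_;
  // Plain integer by design: assumed to be touched by one thread only.
  uint32_t in_flight_ = 0;
};
```

With a depth of 4, a batch of 3 is accepted, a further batch of 2 is rejected until completions arrive, and a failed send leaves the counter untouched. The single-thread assumption is what makes the plain integer safe; a guard shared across threads would need atomics or a lock.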
Ali Maredia
|
0f6b9f2816 |
Combined CMake Build for Hammer
CMake Ceph Build System (Firefly)

CMake. Add tests.
Respace src/CMakeLists.txt.
CMake. Spacing cleanups.
CMake for Firefly is Triumphant

CMake for Giant
Adapt to Giant.
Fix installation for scripts and man pages
Fix CEPH_LIBDIR and CEPH_PKGLIBDIR defines
Add erasure-code libraries
uses try_compile() to detect support for -msse flags
Fix rados object classes
Propagate Casey's cls library change to src/test.

Fix CMake build for Hammer.
Try-add rados and common to librbd link.
Fix name and linkage of libec_lrc.
Rename arch/neon.c arm.c
Fix libcommon.a dependencies (some unit tests).

Authors:
Ali Maredia <ali@cohortfs.com>
Casey Bodley <casey@cohortfs.com>
Adam Emerson <aemerson@cohortfs.com>
Marcus Watts <mdw@cohortfs.com>
Matt Benjamin <matt@cohortfs.com>

Signed-off-by: Matt Benjamin <matt@cohortfs.com> |