1. initialize RDMAServerSocketImpl member
2. initialize RDMAConnectedSocketImpl member
3. pollfd::revents
Though it doesn't cause any error here, it's better to
initialize the pollfd structure's revents field to 0.
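A minimal sketch of the pattern (the helper name is illustrative, not the
actual messenger code): value-initialization zeroes every field, so revents
never starts with stack garbage.

```cpp
#include <poll.h>

// Build a fully zero-initialized pollfd. Value-initialization ({})
// zeroes all fields, including revents, so nothing indeterminate is
// read if revents is inspected before poll() has filled it in.
static struct pollfd make_pollfd(int fd, short events) {
  struct pollfd pfd = {};  // revents starts at 0
  pfd.fd = fd;
  pfd.events = events;
  return pfd;
}
```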
Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
rdma_event_channel is blocking by default: if there's no event
in the event channel, rdma_get_cm_event can block forever,
which defeats the purpose of an "asynchronous" messenger.
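The usual fix is to put the channel's fd into non-blocking mode. A sketch of
that generic pattern (shown on a plain fd; for an rdma_event_channel the same
call would be applied to the channel's fd, after which rdma_get_cm_event()
returns -1/EAGAIN instead of blocking when no event is pending):

```cpp
#include <fcntl.h>
#include <unistd.h>
#include <cerrno>

// Switch a file descriptor to non-blocking mode; reads on it then
// fail with EAGAIN instead of blocking when no data is available.
static int set_nonblocking(int fd) {
  int flags = fcntl(fd, F_GETFL);
  if (flags < 0)
    return -1;
  return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}
```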
Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
RDMAIWARPConnectedSocketImpl is derived from RDMAConnectedSocketImpl,
which already has the members "is_server", "local/peer_qpn" and "get_local_qpn".
Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
1. Log every asynchronous event type.
2. Handle IBV_EVENT_QP_LAST_WQE_REACHED logging.
The QueuePair is switched into IBV_QPS_ERR before posting the
Beacon WR. With SRQ, all the SQ WRs on that QP will be flushed
into the CQ, resulting in IBV_EVENT_QP_LAST_WQE_REACHED.
The above scenario is expected, so it need not be treated as an
error with lderr logging.
Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
1. ibv_wc::status IBV_WC_SUCCESS
keep the same logic
2. ibv_wc::status IBV_WC_WR_FLUSH_ERR
1) After the Beacon is posted into the SQ, all the outstanding RQ WRs
will be flushed into the CQ with IBV_WC_WR_FLUSH_ERR status. This is
expected and needs no special logging.
2) For the other cases that trigger IBV_WC_WR_FLUSH_ERR, track more
info such as the remote QueuePair number and the local QP state.
3. ibv_wc::status others
same logic, with more info tracked in the log
Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
1. ibv_wc::status IBV_WC_RETRY_EXC_ERR
Log possible reasons:
1) Responder ACK timeout
2) Responder QueuePair in bad state
3) Disconnected
2. ibv_wc::status IBV_WC_WR_FLUSH_ERR
1) After switching the QP into the error state, all the outstanding
SQ WRs will be flushed into the CQ with IBV_WC_WR_FLUSH_ERR status.
This is expected and needs no special logging.
2) For the other cases that trigger IBV_WC_WR_FLUSH_ERR, track more
info such as the remote QueuePair number and the local QP state.
3. ibv_wc::status others
same logic, with more info tracked in the log
Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
Currently, msg/async/rdma only works well on hardware with the
SRQ feature. This patch makes NICs without SRQ work.
Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
When the remote QP is destroyed, the QP is disconnected and the
local QP transitions into the error state. Then asynchronous
events or completion events may be triggered, and we need to get
the qpn through the RDMAConnectedSocketImpl object.
Add get_local/peer_qpn to get the qpn from the RDMAConnectedSocketImpl class.
Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
1. remove async_handler
1) async_handler is never scheduled (it should be scheduled by
center->dispatch_event_external).
2) async_handler wraps handle_async_event, which is called
in RDMADispatcher::polling.
So, all async_handler related code is removed.
2. fault() won't run to_dead(), so the commented code is removed
Signed-off-by: Roman Penyaev <rpenyaev@suse.de>
The Beacon is used to detect that the SQ WQEs are drained. There's
no need to use tx_wr_inflight to check whether the SQ WQEs have
been drained before destroying the QueuePair.
Signed-off-by: Roman Penyaev <rpenyaev@suse.de>
Switch the QueuePair to the error state, then post the Beacon WR to
the send queue. All outstanding WQEs will be flushed to the CQ.
In the CQ, check the completion queue elements to detect that the
SQ WRs have been drained before destroying the QueuePair.
We don't post another Beacon WR to the RQ if SRQ is not used/supported;
the reason is that the QueuePair may only be destroyed after all
flushed WRs have been polled from the CQ.
Refer to page 474 of the spec below:
InfiniBand(TM) Architecture Specification Volume 1, Release 1.3
spec link: https://cw.infinibandta.org/document/dl/7859
Signed-off-by: Roman Penyaev <rpenyaev@suse.de>
ibv_qp is a member of class QueuePair. QueuePair has other fields
which are needed in post_chunks_to_rq, to be checked further for
different hardware features, e.g. SRQ/iWARP/RoCE.
Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
1. Implement the 3 functions below in class QueuePair to switch QP state:
1) int modify_qp_to_error(void);
2) int modify_qp_to_rts(void);
3) int modify_qp_to_rtr(void);
2. All connection metadata are members of class QueuePair.
So, send/recv the connection metadata directly in send/recv_cm_meta, i.e.
change the send/recv_cm_meta API to drop the cm_meta_data parameter.
3. RDMAConnectedSocketImpl needs members to track peer_qpn and local_qpn.
Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
STEADY does not imply TRIM. We want to always trim as many caps as
possible.
Introduced-by: be49866a15
Fixes: https://tracker.ceph.com/issues/41835
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
ObjBencher::seq_read_bench() uses "num_objects > data.started" to make sure
we don't issue more reads than were written during ObjBencher::write_bench().
However, this does not work when op_size != object_size, as data.started is
the number of read ops issued, not the number of objects read.
This fix modifies ObjBencher::seq_read_bench() to use "num_ops > data.started" instead,
where "num_ops" is metadata saved at the end of ObjBencher::write_bench().
ObjBencher::rand_read_bench() uses "rand() % num_objects" for rand_id and
"rand_id / reads_per_object" to generate the object name.
This does not work correctly when reads_per_object != 1 (i.e. when op_size != object_size).
For example, if reads_per_object = 100 and num_objects = 2, then all generated
reads will be directed at the first object, with no reads for the second object.
This fix modifies ObjBencher::rand_read_bench() to use "rand() % num_ops" for rand_id instead,
where "num_ops" is metadata saved at the end of ObjBencher::write_bench().
This patch also modifies ObjBencher::write_bench() to save the number of write operations
issued (num_ops above), rather than the number of objects written (num_objects). We can
always derive num_objects as "(num_ops + ops_per_object - 1) / ops_per_object", where
"ops_per_object" is simply object_size / op_size.
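The arithmetic above can be sketched as small helpers (the names are
illustrative, not the actual ObjBencher code):

```cpp
#include <cstdint>

// Number of ops that fit in one object, assuming op_size divides object_size.
static uint64_t ops_per_object(uint64_t object_size, uint64_t op_size) {
  return object_size / op_size;
}

// num_objects derived from the saved num_ops, rounding up so a
// partially written last object is still counted.
static uint64_t num_objects_from_ops(uint64_t num_ops, uint64_t ops_per_obj) {
  return (num_ops + ops_per_obj - 1) / ops_per_obj;
}

// Map a rand_id drawn from [0, num_ops) to the object it falls in,
// so every written object is reachable by random reads.
static uint64_t object_for_op(uint64_t rand_id, uint64_t ops_per_obj) {
  return rand_id / ops_per_obj;
}
```

With reads_per_object = 100 and 2 objects written, num_ops = 200; drawing
rand_id from [0, 200) reaches both objects, while the old "rand() % num_objects"
(= rand() % 2) followed by division by 100 always yielded object 0.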
Signed-off-by: Albert H Chen <hselin.chen@gmail.com>
rgw: Allow admin APIs that write metadata to be executed first on the mast…
Reviewed-by: Abhishek Lekshmanan <abhishek@suse.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>
rgw: fix memory growth while deleting objects with
Reviewed-by: Matt Benjamin <mbenjamin@redhat.com>
Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>
This is a follow-up of 793308f8, where the code using python2/python3 to
determine env_list was already removed.
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
When the monitor side closes the connection, msgr calls MonClient's
ms_handle_reset, which causes reply.get_future to be called twice;
then the assert in promise::get_future fires:
  promise<T...>::get_future() noexcept {
    assert(!this->_future && this->_state && !this->_task);
    return future<T...>(this);
  }
Use shared_promise instead of promise to solve it.
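The same single-retrieval rule can be seen with std::promise, which throws
where seastar's promise asserts (this is an analogy, not the seastar code):

```cpp
#include <future>

// std::promise allows get_future() exactly once; a second call throws
// std::future_error (future_already_retrieved), which is the std
// analogue of the assert firing in seastar's promise::get_future.
static bool second_get_future_fails() {
  std::promise<int> p;
  auto f = p.get_future();     // first call: fine
  try {
    auto f2 = p.get_future();  // second call: throws
  } catch (const std::future_error&) {
    return true;
  }
  return false;
}
```

A shared-promise-style wrapper sidesteps this by handing out futures that can
be obtained more than once.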
Signed-off-by: chunmei Liu <chunmei.liu@intel.com>
* refs/pull/28560/head:
cephfs-shell: handle du's arguments elsewhere outside do_du()
cephfs-shell: reuse code
cephfs-shell: rewrite call to perror in do_du
pybind/cephfs: define variable for hexcode used in stat()
test_cephfs_shell: test cephfs-shell command at invocation
cephfs-shell: refactor do_du()
cephfs-shell: option -r is not for reverse
cephfs-shell: extend to_bytes()
test_cephfs_shell: test du with no args
test_cephfs_shell: test du with multiple paths in args
test_cephfs_shell: test behaviour of "du -r"
test_cephfs_shell: test du's output for softlinks
qa/cephfs: add convenience method lstat()
qa/cephfs: add option to make stat() unfollow symlinks
test_cephfs_shell: test du's output for hardlinks
test_cephfs_shell: test du's output for directories
test_cephfs_shell: test du's output for regular files
test_cephfs_shell: add a method to get command output
test_cephfs_shell: allow cmd as list too
test_cephfs_shell: rename and rewrite _cephfs_shell()
test_cephfs_shell: copy humanize() from cephfs-shell
cephfs-shell: print disk usage for non-directory files too
pybind/cephfs: add method that stats symlinks without following
cephfs-shell: Fix 'du' command error
Reviewed-by: Varsha Rao <varao@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
* refs/pull/29907/head:
doc: add a doc for vstart_runner.py
Reviewed-by: Varsha Rao <varao@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>