Run a mini cluster and verify that modifying the crushmap and
sending it to the cluster produces the intended effect.
Refs: http://tracker.ceph.com/issues/18943
Signed-off-by: Loic Dachary <ldachary@redhat.com>
The class qualifier is documented for devices and step take
only. Although it shows in the buckets and could be set by the user, it
would be very error prone to do so.
Refs: http://tracker.ceph.com/issues/18943
Signed-off-by: Loic Dachary <ldachary@redhat.com>
The device classes are implemented by:
- modifying the argument of step TAKE in rules
- cloning bucket trees when required by a rule step
This happens (via populate_classes):
- before compiling a rule step TAKE
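The cloning described above can be pictured with a small Python model. This is only a sketch, not the CrushWrapper code: the `Device` and `Bucket` types and the `clone_for_class` and `resolve_take` helpers are hypothetical names introduced for illustration. Only the bucket~class naming comes from this commit message.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    name: str
    device_class: str


@dataclass
class Bucket:
    name: str
    children: list = field(default_factory=list)


def clone_for_class(bucket, device_class):
    """Return a copy of `bucket` that holds only devices of `device_class`.

    Cloned buckets are named "<bucket>~<class>" (hypothetical helper).
    """
    clone = Bucket(name=f"{bucket.name}~{device_class}")
    for child in bucket.children:
        if isinstance(child, Bucket):
            # Recurse so that the whole subtree is cloned.
            clone.children.append(clone_for_class(child, device_class))
        elif child.device_class == device_class:
            clone.children.append(child)
    return clone


def resolve_take(bucket, device_class=None):
    """Pick the bucket a `step take` argument should point at."""
    if device_class is None:
        return bucket  # no class qualifier: take the original bucket
    return clone_for_class(bucket, device_class)


root = Bucket("default", [
    Bucket("host1", [Device("osd.0", "ssd"), Device("osd.1", "hdd")]),
    Bucket("host2", [Device("osd.2", "ssd")]),
])
shadow = resolve_take(root, "ssd")
```

In this model a rule that takes `default` with class `ssd` ends up pointing at `default~ssd`, while the original tree is left untouched.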
When the crush map is encoded, the device class information is stored
with it, independently from the rules and the buckets, as a map of
classes for each device & bucket and a map of classes for each rule step
TAKE.
The extra buckets created but not used by any rule do not need to be
preserved and they are removed (via cleanup_classes):
- before decompilation
- after compilation
- after decoding
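The cleanup step can be modelled roughly as a reachability pass. This is a sketch under assumptions, not the actual cleanup_classes code: the function below is a hypothetical stand-in, the map is reduced to a dict from bucket name to child names, and a generated bucket is recognised only by the ~ in its name.

```python
def cleanup_unused_clones(buckets, taken):
    """Drop generated buckets (name contains '~') that no rule reaches.

    `buckets` maps a bucket name to the names of its children;
    `taken` is the set of bucket names referenced by rule step TAKE.
    """
    keep = set()
    stack = list(taken)
    while stack:
        name = stack.pop()
        # Skip devices (not in `buckets`) and buckets already visited.
        if name in keep or name not in buckets:
            continue
        keep.add(name)
        stack.extend(buckets[name])
    return {
        name: children
        for name, children in buckets.items()
        if "~" not in name or name in keep
    }


buckets = {
    "default": ["host1"],
    "host1": ["osd.0", "osd.1"],
    "default~ssd": ["host1~ssd"],
    "host1~ssd": ["osd.0"],
    "default~hdd": ["host1~hdd"],
    "host1~hdd": ["osd.1"],
}
cleaned = cleanup_unused_clones(buckets, taken={"default~ssd"})
```

Original buckets are always kept. The `~hdd` clones are dropped because no rule takes them.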
Clients and daemons that are not aware of the device classes remain
compatible because the crushmap modified with the new buckets is fully
functional. The invalid names used for the generated buckets
(bucket~class) can be decoded with CrushWrapper::decode by any existing
client because there is no verification of the name validity during
decoding. The map can also be dumped with CrushWrapper::dump or
decompiled with CrushCompiler::decompile via ceph osd dump or crushtool.
It cannot, however, be compiled again because CrushCompiler::compile
will try to set the name with CrushWrapper::set_item_name and it will
fail with EINVAL because of the ~.
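The asymmetry between decoding and compiling can be illustrated with a toy model. This is a sketch, not the CrushWrapper implementation: `ToyCrushMap` is hypothetical, and the set of characters accepted in a name (letters, digits, `-`, `_`, `.`) is an assumption made for the example.

```python
import errno
import string

# Assumed set of valid name characters; '~' is deliberately not in it.
VALID_NAME_CHARS = set(string.ascii_letters + string.digits + "-_.")


def is_valid_name(name):
    return bool(name) and all(c in VALID_NAME_CHARS for c in name)


class ToyCrushMap:
    def __init__(self):
        self.names = {}

    def decode(self, raw_names):
        """Load names as they are stored: no validity check."""
        self.names = dict(raw_names)

    def set_item_name(self, item_id, name):
        """Return 0 on success, -EINVAL if the name is invalid."""
        if not is_valid_name(name):
            return -errno.EINVAL
        self.names[item_id] = name
        return 0


crush = ToyCrushMap()
crush.decode({-2: "default~ssd"})               # accepted as-is
rejected = crush.set_item_name(-3, "host1~ssd")  # refused because of '~'
accepted = crush.set_item_name(-3, "host1")
```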
Fixes: http://tracker.ceph.com/issues/18943
Signed-off-by: Loic Dachary <ldachary@redhat.com>
The step take of a rule may reference a bucket that is cloned
from the original one, with the device class added. In this
case both buckets are busy because they are taken by the rule.
Refs: http://tracker.ceph.com/issues/18943
Signed-off-by: Loic Dachary <ldachary@redhat.com>
Previously the Dispatcher thread polled both rx and tx events, then dispatched
these events to RDMAWorker and RDMAConnectedSocketImpl.
Tx event handling is actually a lightweight task, so it is now handled
inline. Rx events are still dispatched as before.
Another change is adding a tx cq so that rx and tx event polling are separated.
This also removes a lot of code.
Signed-off-by: Haomai Wang <haomai@xsky.com>
So it can display the same output as ceph osd crush dump, from a file
instead of a live cluster. The default format is json-pretty and can be
changed via -f or --format in the same way as the ceph cli.
Signed-off-by: Loic Dachary <ldachary@redhat.com>
Removing dead QPs is actually done during polling. If polling is busy, then
dead QPs will not be removed and the active_queue_pair counter will not be correct.
issue: 992513
Change-Id: I825e813ce0632fd01f6d29adc87e0e33a2bc13d9
Signed-off-by: DanielBar-On <danielbo@mellanox.com>
* There is a chance that some connections are still trying to authorize
themselves after the MonClient is shut down. Do not assert in this case,
but it is a sign of a bug or a bad shutdown sequence, so print a message to
dout().
* Do not use active_con->get_auth() as an alternative to `this->auth` if
it is not available, because we promote the authorized conn in
pending_cons as the active_con, and std::swap(active_con->auth, this->auth)
with the monc_lock held. So there is no point in returning
active_con->get_auth(), as it's always null.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Previously we assumed that !deleted handles were resident, but there
is an observed case where a !deleted handle is !linked. Since
we currently use safe_link mode, an is_linked() check is
available and exhaustive.
Fixes: http://tracker.ceph.com/issues/19111
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
test/store_test: add deferred test case setup to support explicit min…
Reviewed-by: Varada Kari <varada.kari@sandisk.com>
Reviewed-by: Sage Weil <sage@redhat.com>
It is possible for a reset_desc() call to clear the desc char* while
get_desc() is executing such that it will return a nullptr to the caller.
This can lead to bad results, like a crash in std::string() (which does
not like to take null).
Fix this by not clearing desc. Instead, set a separate flag to indicate
that desc should be (safely) rebuilt on the next get_desc() call.
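The flag-based approach can be sketched in Python as follows. This is a model of the idea only, not the Ceph code: the class name, the `_want_new_desc` attribute and the `build_desc` callback are hypothetical, and Python's `None` stands in for the null char*.

```python
import threading


class TrackedThing:
    def __init__(self, build_desc):
        self._build_desc = build_desc
        self._lock = threading.Lock()
        self._desc = None
        self._want_new_desc = False

    def reset_desc(self):
        # Do not clear self._desc here: a concurrent get_desc() could
        # otherwise return None. Only request a rebuild.
        self._want_new_desc = True

    def get_desc(self):
        if self._desc is None or self._want_new_desc:
            with self._lock:
                # Clear the flag before rebuilding so that a reset issued
                # during the rebuild triggers another one later.
                self._want_new_desc = False
                self._desc = self._build_desc()
        return self._desc


# Usage: the description is rebuilt lazily, only after a reset.
builds = {"count": 0}


def build():
    builds["count"] += 1
    return f"op v{builds['count']}"


op = TrackedThing(build)
first = op.get_desc()
second = op.get_desc()        # cached, no rebuild
op.reset_desc()
stale = op._desc              # still the old value, never cleared
third = op.get_desc()         # rebuilt on demand
```

Once the first description has been built, readers always see either the old value or the rebuilt one, never an empty one.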
Fixes: http://tracker.ceph.com/issues/19110
Signed-off-by: Sage Weil <sage@redhat.com>