* add an option "mon_client_hunt_parallel" for the maximum number of parallel
hunting sessions.
Fixes: http://tracker.ceph.com/issues/16091
Signed-off-by: Steven Dieffenbach <sdieffen@redhat.com>
Signed-off-by: Kefu Chai <kchai@redhat.com>
If monc's tick connects to the mon before monc.set_want_keys() is called,
monc won't ask for the key for the MDS service, and hence
build_authorizer() will fail for the MDS service. This change prepares us
for the monc-connect-to-mon-in-parallel feature.
Signed-off-by: Kefu Chai <kchai@redhat.com>
In OSD::ms_handle_reset, we clear session->con before removing any
backoffs. That means we have to check if con has been cleared after any
call to have_backoff, lest we race with ms_handle_reset and it removes the
backoffs but we don't realize our client session is disconnected.
Introduce a helper to do both these checks in a safe way, simplifying
callers while we're at it.
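
The ordering this relies on can be sketched roughly as follows. This is a
minimal illustration, not the actual OSD code: the Session and Connection
structs, the string-keyed backoff set, and the check_backoff() name are all
hypothetical stand-ins. The point is only that con is examined after the
backoff lookup, so a concurrent reset cannot go unnoticed.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <set>
#include <string>

// Hypothetical stand-ins for the OSD types; not the real Ceph classes.
struct Connection {};

struct Session {
  std::mutex lock;
  std::shared_ptr<Connection> con;  // cleared first when the session resets
  std::set<std::string> backoffs;   // objects currently backed off (simplified)

  bool have_backoff(const std::string& oid) {
    std::lock_guard<std::mutex> l(lock);
    return backoffs.count(oid) > 0;
  }

  // Mimics the reset path described above: con is cleared, then backoffs go.
  void handle_reset() {
    {
      std::lock_guard<std::mutex> l(lock);
      con.reset();
    }
    {
      std::lock_guard<std::mutex> l(lock);
      backoffs.clear();
    }
  }

  // Hypothetical helper: returns true if the op must not proceed, either
  // because a backoff covers it or because the session is disconnected.
  // con is checked *after* the backoff lookup, so a reset that removed the
  // backoffs in between is still detected.
  bool check_backoff(const std::string& oid) {
    if (have_backoff(oid)) {
      return true;
    }
    std::lock_guard<std::mutex> l(lock);
    return con == nullptr;
  }
};
```

With a helper of this shape, callers make one call instead of repeating the
two-step check themselves.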
Signed-off-by: Sage Weil <sage@redhat.com>
We may return a raw pointer that is about to get deallocated by
clear_backoffs(). Fix by returning a reference, preventing the free.
Signed-off-by: Sage Weil <sage@redhat.com>
Switch backoffs to be owned by a specific spg_t. Instead of wonky split
logic, just clear them. This is mostly just for convenience; we could
conceivably only clear the range belonging to children (just to stay
tidy--we'll never get a request in that range) but why bother.
The full pg backoffs are still defined by the range for the pg, although
it's a bit redundant--we could just as easily do [min,max). This way we
get readable hobject ranges in the messages that go by without having to
map to/from pgids.
Add Session::add_backoff() helper to keep Session internals out of PG.h.
Signed-off-by: Sage Weil <sage@redhat.com>
A backoff [range] is defined only within a specific spg_t; it does not
pass anything to children on split, or to another primary.
Signed-off-by: Sage Weil <sage@redhat.com>
Any time we are asked to calculate the target we should apply the
pool tiering parameters. The previous logic of only doing so when the
target hadn't been calculated didn't make a whole lot of sense, and broke
our update of *pi that is needed to get the correct pg_num for the target
pool. This didn't really matter for old clusters that take the raw pg,
but for luminous and beyond we need the exact spg_t which requires a
correct pg_num.
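
As a rough illustration of the intended behavior (not the Objecter code
itself), the sketch below re-applies the tiering parameters on every
calculation and then takes pg_num from the pool that was actually selected.
Pool, Target, calc_target() and the plain modulo mapping are simplified,
hypothetical stand-ins; the real code maps hashes to pgs differently.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical, simplified stand-ins; not the real Objecter or pool types.
struct Pool {
  uint32_t pg_num = 1;
  int64_t read_tier = -1;   // -1 means no overlay
  int64_t write_tier = -1;
};

struct Target {
  int64_t base_pool = -1;
  uint32_t hash = 0;
  bool is_write = false;
  bool ignore_overlay = false;
  int64_t target_pool = -1;  // result of the last calculation
  uint32_t target_pg = 0;    // result of the last calculation
};

// Recompute the target from scratch on every call. The tiering parameters
// are applied each time, not only when target_pool is still unset, so the
// pool lookup that supplies pg_num always refers to the real target pool.
bool calc_target(const std::map<int64_t, Pool>& pools, Target& t) {
  auto pi = pools.find(t.base_pool);
  if (pi == pools.end()) {
    return false;
  }
  t.target_pool = t.base_pool;
  if (!t.ignore_overlay) {
    const int64_t tier =
        t.is_write ? pi->second.write_tier : pi->second.read_tier;
    if (tier >= 0) {
      t.target_pool = tier;
    }
  }
  // Re-point pi at the target pool before reading pg_num.
  pi = pools.find(t.target_pool);
  if (pi == pools.end()) {
    return false;
  }
  // Simplified mapping (assumption): plain modulo over the target's pg_num.
  t.target_pg = t.hash % pi->second.pg_num;
  return true;
}
```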
Signed-off-by: Sage Weil <sage@redhat.com>
All callers now pass in an explicit pgid, including pg listing. Since
we resend ops on split, there is no need to do any translation here,
even for the jewel and kraken osds that can handle a full hash value.
Signed-off-by: Sage Weil <sage@redhat.com>
pg_read is only used for PG listing and hit_set_{list,get}; these
operations can't and shouldn't consider the tiering overlay.
This makes the _calc_target behavior with the explicit pgid make sense;
otherwise, what would it mean to try to read pg x.1 from pool x and get
redirected to pg y.1 in pool y?
Signed-off-by: Sage Weil <sage@redhat.com>
Things like ObjectContext and lock state that are internal to the OSD
do not need to be in osd_types and shared with other parts of the code
base.
Notably, this fixes the problem with OpRequest needing things from
osd_types.h (osd_reqid_t for starters). Others to follow.
Signed-off-by: Sage Weil <sage@redhat.com>
New clients need the actual pgid as well as the full hash (as part of the
target hobj). Old clients only use the full hash value. We need to pass
both to MOSDOp so it can encode based on the target features.
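
A minimal sketch of feature-dependent encoding, assuming a made-up feature
bit and a made-up byte layout (neither is the real MOSDOp wire format): the
message carries both values, and the encoder decides what to emit based on
what the peer advertises.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical feature bit; the real feature flag is not shown here.
constexpr uint64_t FEATURE_EXPLICIT_PGID = 1ull << 0;

// Hypothetical, simplified view of what the op carries.
struct OpTarget {
  int64_t pool = 0;
  uint32_t pg_seed = 0;    // actual pg within the pool
  int8_t shard = -1;
  uint32_t full_hash = 0;  // full object hash
};

static void put_u32(std::vector<uint8_t>& out, uint32_t v) {
  for (int i = 0; i < 4; ++i) {
    out.push_back(static_cast<uint8_t>((v >> (8 * i)) & 0xff));
  }
}

static void put_u64(std::vector<uint8_t>& out, uint64_t v) {
  for (int i = 0; i < 8; ++i) {
    out.push_back(static_cast<uint8_t>((v >> (8 * i)) & 0xff));
  }
}

// Encode according to the peer's features: peers with the feature bit get
// the explicit pgid followed by the full hash, others get the hash alone.
std::vector<uint8_t> encode_target(const OpTarget& t, uint64_t peer_features) {
  std::vector<uint8_t> out;
  if (peer_features & FEATURE_EXPLICIT_PGID) {
    put_u64(out, static_cast<uint64_t>(t.pool));
    put_u32(out, t.pg_seed);
    out.push_back(static_cast<uint8_t>(t.shard));
  }
  put_u32(out, t.full_hash);
  return out;
}
```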
Signed-off-by: Sage Weil <sage@redhat.com>
New clients will see an actual pgid as well as a full hash value in the
hobj. Old clients will continue to see a single (full) hash value.
Signed-off-by: Sage Weil <sage@redhat.com>
Note that it is only (currently) important that this value be accurate
on the current OSD since we only use this value (currently) to discard
ops sent before the split. If we are getting the history from a different
OSD in the cluster that doesn't have an up to date value it doesn't matter
because that implies a primary change and also a client resend.
Signed-off-by: Sage Weil <sage@redhat.com>
New clients will resend.
Old clients will see a last_force_op_resend (now named
last_force_op_resend_preluminous in latest code) and resend.
We know this because we require that the monitors upgrade to luminous
before the OSDs, and the new mon code sets this field on split.
Signed-off-by: Sage Weil <sage@redhat.com>
There are some useful messages at level 1. They're rare and won't affect
performance, but are helpful to see in the log.
Signed-off-by: Sage Weil <sage@redhat.com>