Commit Graph

68607 Commits

Author SHA1 Message Date
Haomai Wang
6016e1d3bc msg/async/rdma: add log to show correct destruct queuepair
Signed-off-by: Haomai Wang <haomai@xsky.com>
2017-02-14 14:30:38 +08:00
Kefu Chai
c3cca96d93 mon/MonClient: remove unnecessary helper functions
refactor _reopen_session() by removing wrapper around it.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-02-14 12:58:22 +08:00
Kefu Chai
8e019f661c mon/MonClient: remove unnecessary include
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-02-14 12:58:22 +08:00
Kefu Chai
a2eb6ae3fb mon/monclient: hunt for multiple monitors in parallel
* add an option "mon_client_hunt_parallel" for the maximum number of parallel
  hunting sessions.

Fixes: http://tracker.ceph.com/issues/16091
Signed-off-by: Steven Dieffenbach <sdieffen@redhat.com>
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-02-14 12:58:22 +08:00
Kefu Chai
8729e1eb9c mon/MonClient: mark monc_lock as mutable
so we can label the getters of MonClient with the `const` specifier.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-02-14 12:58:22 +08:00
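The `mutable` lock idea in 8729e1eb9c can be sketched as follows. This is a minimal illustration and not the MonClient code: `SketchClient`, `set_name()` and `get_name()` are hypothetical names, and `std::mutex` stands in for whatever lock type the real class uses.

```cpp
#include <mutex>
#include <string>

// Hypothetical stand-in for a client class; not the real MonClient.
class SketchClient {
 public:
  void set_name(const std::string& n) {
    std::lock_guard<std::mutex> l(lock_);
    name_ = n;
  }

  // A const getter may still take the lock because lock_ is mutable.
  std::string get_name() const {
    std::lock_guard<std::mutex> l(lock_);
    return name_;
  }

 private:
  mutable std::mutex lock_;  // locking does not change logical state
  std::string name_;
};
```

Without `mutable`, `get_name()` could not be declared `const`, because taking the lock modifies the mutex member.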
Kefu Chai
96862f8f29 mon/MonClient: use __func__ for function names
Signed-off-by: Steven Dieffenbach <sdieffen@redhat.com>
2017-02-14 12:58:22 +08:00
Kefu Chai
79f3f30112 client: move monc->set_want_keys() before monc->init()
If monc's tick connects to the mon before monc.set_want_keys() is called,
monc won't ask for the key for the MDS service, and hence will fail to
build_authorizer() for the MDS service. This change readies us for the
feature of monc-connect-to-mon-in-parallel.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-02-14 12:58:22 +08:00
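The ordering problem described in 79f3f30112 can be modelled with a small sketch. Everything here is hypothetical: `SketchMonc`, its fields, and the `SKETCH_SERVICE_MDS` bit value are illustrative assumptions, and the model simply treats `init()` as the point where the wanted keys are captured.

```cpp
#include <cstdint>

// Hypothetical model of the ordering problem; names and fields are
// illustrative only and do not mirror the real MonClient.
struct SketchMonc {
  std::uint32_t want_keys = 0;       // services we want tickets for
  std::uint32_t requested_keys = 0;  // what was asked for at connect time
  bool connected = false;

  void set_want_keys(std::uint32_t k) { want_keys = k; }

  // Models an init() after which the first connection may happen right
  // away, capturing want_keys as it is at that moment.
  void init() {
    connected = true;
    requested_keys = want_keys;
  }

  bool can_build_authorizer(std::uint32_t service) const {
    return connected && (requested_keys & service) != 0;
  }
};

constexpr std::uint32_t SKETCH_SERVICE_MDS = 0x2;  // assumed bit value
```

In this model, calling `set_want_keys()` after `init()` leaves the MDS key unrequested, which is the failure the commit avoids by swapping the two calls.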
Kefu Chai
5871a6f403 auth: AuthClientHandler::init() pass parameter by const ref
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-02-14 12:58:22 +08:00
Sage Weil
20809f0bf7 Merge pull request #13149 from liewegas/wip-list-objects
librados: remove legacy object listing API, clean up newer api

Reviewed-by: Yehuda Sadeh <yehuda@redhat.com>
2017-02-13 22:20:20 -06:00
Sage Weil
494f05ac4b Merge pull request #13409 from xiexingguo/wip-fix-throttler-name
os/bluestore: add "_" prefix for internal methods

Reviewed-by: Sage Weil <sage@redhat.com>
2017-02-13 22:16:52 -06:00
Sage Weil
b729e6288f osd: fix backoff vs reset race
In OSD::ms_handle_reset, we clear session->con before removing any
backoffs.  That means we have to check if con has been cleared after any
call to have_backoff, lest we race with ms_handle_reset and it removes the
backoffs but we don't realize our client session is disconnected.

Introduce a helper to do both these checks in a safe way, simplifying
callers while we're at it.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-02-13 23:03:53 -05:00
Sage Weil
5e885cca22 osd/Session: fix race between have_backoff() and clear_backoffs()
We may return a raw pointer that is about to get deallocated by
clear_backoffs().  Fix by returning a reference, preventing the free.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-02-13 23:03:53 -05:00
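The fix in 5e885cca22 (hand out a counted reference rather than a raw pointer) can be sketched like this. It is an assumption-laden illustration: `SketchSession` and `SketchBackoff` are hypothetical, the map key is a plain `int`, and `std::shared_ptr` is swapped in for whatever reference type the OSD code actually uses.

```cpp
#include <map>
#include <memory>
#include <mutex>

struct SketchBackoff {
  int id;
};

// Hypothetical session that owns backoffs; not the real osd/Session.
class SketchSession {
 public:
  void add_backoff(int key, int id) {
    std::lock_guard<std::mutex> l(lock_);
    backoffs_[key] = std::make_shared<SketchBackoff>(SketchBackoff{id});
  }

  // Returns a counted reference taken under the lock, so the object
  // stays alive even if clear_backoffs() runs right afterwards.
  std::shared_ptr<SketchBackoff> have_backoff(int key) const {
    std::lock_guard<std::mutex> l(lock_);
    auto it = backoffs_.find(key);
    if (it == backoffs_.end()) {
      return nullptr;
    }
    return it->second;
  }

  void clear_backoffs() {
    std::lock_guard<std::mutex> l(lock_);
    backoffs_.clear();
  }

 private:
  mutable std::mutex lock_;
  std::map<int, std::shared_ptr<SketchBackoff>> backoffs_;
};
```

A caller that holds the returned reference can keep using the backoff after a concurrent `clear_backoffs()`; with a raw pointer the same sequence would be a use-after-free.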
Sage Weil
d708041adc osd: rename backoff config options
Signed-off-by: Sage Weil <sage@redhat.com>
2017-02-13 23:03:52 -05:00
Sage Weil
5825a13403 osd: manage backoffs per-pg; drop special split logic
Switch backoffs to be owned by a specific spg_t.  Instead of wonky split
logic, just clear them.  This is mostly just for convenience; we could
conceivably only clear the range belonging to children (just to stay
tidy--we'll never get a request in that range) but why bother.

The full pg backoffs are still defined by the range for the pg, although
it's a bit redundant--we could just as easily do [min,max).  This way we
get readable hobject ranges in the messages that go by without having to
map to/from pgids.

Add Session::add_backoff() helper to keep Session internals out of PG.h.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-02-13 23:03:52 -05:00
Sage Weil
b15d12d4b3 osdc/Objecter: manage backoffs per-spg_t
A backoff [range] is defined only within a specific spg_t; it does not
pass anything to children on split, or to another primary.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-02-13 23:03:52 -05:00
Sage Weil
f06580e2ab messages/MOSDBackoff: add spg_t to message
and make it an MOSDFastDispatchOp.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-02-13 23:03:52 -05:00
Sage Weil
9adb68f5e2 osdc/Objecter: recalculate target_* on every _calc_target call
Any time we are asked to calculate the target we should apply the
pool tiering parameters.  The previous logic of only doing so when the
target hadn't been calculated didn't make a whole lot of sense, and broke
our update of *pi that is needed to get the correct pg_num for the target
pool.  This didn't really matter for old clusters that take the raw pg,
but for luminous and beyond we need the exact spg_t which requires a
correct pg_num.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-02-13 23:03:52 -05:00
Sage Weil
e9dbe483fa osdc/Objecter: simplify pgid translation
All callers now pass in an explicit pgid, including pg listing.  Since
we resend ops on split, there is no need to do any translation here,
even for the jewel and kraken osds that can handle a full hash value.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-02-13 23:03:52 -05:00
Sage Weil
a7dc052f9e osdc/Objecter: use overlay pg_pool_t for subsequent calculations
We use pi for pg_num and other values below; we need to update accordingly
if we follow the overlay.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-02-13 23:03:52 -05:00
Sage Weil
f8fcefd234 osdc/Objecter: force pg_command ops to ignore overlay
Signed-off-by: Sage Weil <sage@redhat.com>
2017-02-13 23:03:52 -05:00
Sage Weil
ef74cf71aa osdc/Objecter: force pg_read ops to ignore cache overlay
pg_read is only used for PG listing and hit_set_{list,get}; these
operations can't and shouldn't consider the tiering overlay.

This makes the _calc_target behavior with the explicit pgid make sense;
otherwise, what would it mean to try to read pg x.1 from pool x and get
redirected to pg y.1 in pool y?

Signed-off-by: Sage Weil <sage@redhat.com>
2017-02-13 23:03:51 -05:00
Sage Weil
557e73cb8a osd/OSDMap: make is_acting_osd_shard an explicit spg_t check
Ensure that the ps value is < the pool pg_num.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-02-13 23:03:51 -05:00
Sage Weil
b5f128ff55 osd/OSDMap: is_acting_osd_shard
Signed-off-by: Sage Weil <sage@redhat.com>
2017-02-13 23:03:51 -05:00
Sage Weil
c2574fd8c4 osd: drop osd_debug_drop_op_probability
This is unused and not terribly useful.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-02-13 23:03:51 -05:00
Sage Weil
70be0db4a6 osd: move internal in-memory types to osd_internal_types.h
Things like ObjectContext and lock state that are internal to the OSD
do not need to be in osd_types and shared with other parts of the code
base.

Notably, this fixes the problem with OpRequest needing things from
osd_types.h (osd_reqid_t for starters).  Others to follow.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-02-13 23:03:51 -05:00
Sage Weil
a0deb73f22 osd/PG: no need to split op waiting lists
Clients are now expected to resend on split, and there is already an
interval change.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-02-13 23:03:51 -05:00
Sage Weil
3dbfa4fd8b osd: explicitly enumerate ops we can dispatch
This prevents random messages from falling into an OpRequest and
dispatch_op().

Signed-off-by: Sage Weil <sage@redhat.com>
2017-02-13 23:03:51 -05:00
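The approach in 3dbfa4fd8b, an explicit list of dispatchable message types, might look roughly like the sketch below. The enum and function are hypothetical; the real OSD uses its own message type constants, and the set of accepted types here is only an assumption.

```cpp
// Hypothetical message types; the real OSD uses its own message constants.
enum class SketchMsgType { OsdOp, RepOp, RepOpReply, Backoff, Ping, Unknown };

// Only explicitly listed types are accepted for dispatch. Anything else
// is rejected instead of silently becoming an op request.
bool sketch_can_dispatch(SketchMsgType t) {
  switch (t) {
    case SketchMsgType::OsdOp:
    case SketchMsgType::RepOp:
    case SketchMsgType::RepOpReply:
    case SketchMsgType::Backoff:
      return true;
    default:
      return false;
  }
}
```

An allowlist like this fails closed: a newly added message type is not dispatched until someone lists it on purpose.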
Sage Weil
4ab0844887 osd: remove MOSDPGMissing
Removed in 7c414c5dab (pre-bobtail).

Signed-off-by: Sage Weil <sage@redhat.com>
2017-02-13 23:03:51 -05:00
Sage Weil
baa56a6534 osd: make all fast dispatch ops MOSDFastDispatchOp children
Define common get_spg() and get_map_epoch() methods.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-02-13 23:03:51 -05:00
Sage Weil
3fd302cf81 osdc/Objecter: populate both actual pgid and full hash in MOSDOp
New clients need the actual pgid as well as the full hash (as part of the
target hobj).  Old clients only use the full hash value.  We need to pass
both to MOSDOp so it can encode based on the target features.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-02-13 23:03:50 -05:00
Sage Weil
f6e219a4df messages/MOSDOp: take spg_t, not pg_t, and drop old ctor
Signed-off-by: Sage Weil <sage@redhat.com>
2017-02-13 23:03:50 -05:00
Sage Weil
fcfec31d91 messages/MOSDOp: new encoding w/ actual pgid separate from hobject hash
New clients will see an actual pgid as well as a full hash value in the
hobj.  Old clients will continue to see a single (full) hash value.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-02-13 23:03:50 -05:00
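Feature-dependent encoding of the kind fcfec31d91 describes can be sketched as below. This is not the MOSDOp wire format: the feature bit, the `SketchOp` fields, and the flat vector of integers are all assumptions made to show the branching only.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical feature bit and field layout, for illustration only.
constexpr std::uint64_t SKETCH_FEATURE_NEW_ENCODING = 1ull << 0;

struct SketchOp {
  std::uint32_t pg_seed;    // actual placement seed within the pool
  std::uint32_t full_hash;  // full object hash
};

// Peers with the feature get both values; older peers get only the hash.
std::vector<std::uint32_t> sketch_encode(const SketchOp& op,
                                         std::uint64_t peer_features) {
  std::vector<std::uint32_t> out;
  if (peer_features & SKETCH_FEATURE_NEW_ENCODING) {
    out.push_back(op.pg_seed);
  }
  out.push_back(op.full_hash);
  return out;
}
```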
Sage Weil
6b65397465 messages/MOSDOp: add get_raw_pg()
Many current users expect a full hash value; make that explicit.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-02-13 23:03:50 -05:00
Sage Weil
c30b3c308a messages/MOSDOp: remove unused reassert_version
Signed-off-by: Sage Weil <sage@redhat.com>
2017-02-13 23:03:50 -05:00
Sage Weil
80af2e4d32 osdc/Objecter: remove reassert_version
We never populate this since we never get an ack.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-02-13 23:03:50 -05:00
Sage Weil
a6fa7b6568 osd/OSDMap: generalize map_to_pg
So we can do this without constructing an object_locator_t.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-02-13 23:03:50 -05:00
Sage Weil
0efdd0a338 osd: make use of MOSDOp::get_hobj()
Prefer this to get_object_locator() wherever possible.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-02-13 23:03:50 -05:00
Sage Weil
e0c037e199 message/MOSDOp: build native hobject_t
Drop unneeded snapid_t snapid and object_locator_t, which just duplicate
hobject_t fields.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-02-13 23:03:49 -05:00
Sage Weil
e9cfeedb5d osd/PG: fix tracking of last_epoch_split
Note that it is only (currently) important that this value be accurate
on the current OSD since we only use this value (currently) to discard
ops sent before the split.  If we are getting the history from a different
OSD in the cluster that doesn't have an up to date value it doesn't matter
because that implies a primary change and also a client resend.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-02-13 23:03:49 -05:00
Sage Weil
36d3a29ef5 osd/PG: discard ops from before the last split
New clients will resend.

Old clients will see a last_force_op_resend (now named
last_force_op_resend_preluminous in latest code) and resend.

We know this because we require that the monitors upgrade to luminous
before the OSDs, and the new mon code sets this field on split.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-02-13 23:03:49 -05:00
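The discard rule in 36d3a29ef5 reduces to an epoch comparison, sketched here under assumptions: `sketch_epoch_t` and `sketch_should_discard()` are hypothetical names, and the strict `<` comparison is a guess at the boundary rather than something the commit message states.

```cpp
#include <cstdint>

using sketch_epoch_t = std::uint32_t;  // assumed epoch representation

// An op sent in an epoch older than the last split is dropped; the client
// is expected to resend it against the post-split mapping.
bool sketch_should_discard(sketch_epoch_t op_sent_epoch,
                           sketch_epoch_t last_epoch_split) {
  return op_sent_epoch < last_epoch_split;
}
```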
Sage Weil
6e65e2665d osdc/Objecter: resend ops on pg split if osd has CEPH_FEATURE_RESEND_ON_SPLIT
Signed-off-by: Sage Weil <sage@redhat.com>
2017-02-13 23:03:49 -05:00
xie xingguo
7cb210d98d os/bluestore: add "_" prefix for internal methods
These 4 methods are reserved for internal use only.
Prefix them with "_" to keep pace with others.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-02-14 10:22:03 +08:00
Sage Weil
379b4ca362 osd: set default debug level to 1/5
There are some useful messages at level 1. They're rare and won't affect
performance, but are helpful to see in the log.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-02-13 19:00:08 -06:00
Sage Weil
af55673668 osd: move a few critical messages to level 0
Signed-off-by: Sage Weil <sage@redhat.com>
2017-02-13 18:58:26 -06:00
Sage Weil
a2fb70ddca Merge pull request #13391 from Adirl/ibv_exp
msg/async/rdma: check if exp verbs avail

Reviewed-by: Haomai Wang <haomai@xsky.com>
2017-02-13 18:54:10 -06:00
Sage Weil
293c766143 Merge pull request #13034 from wjwithagen/wip-wjw-brag
mailmap: Willem Jan Withagen affiliation

Reviewed-by: Loic Dachary <ldachary@redhat.com>
2017-02-13 18:51:26 -06:00
Sage Weil
e5ae7bab66 Merge pull request #13209 from wjwithagen/wip-wjw-freebsd-init-ceph
init-ceph: Make init-ceph work under FreeBSD for init-system

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2017-02-13 18:50:19 -06:00
Sage Weil
625ffe8cc6 Merge pull request #13377 from wjwithagen/wip-wjw-freebsd-jemalloc
CMakeLists.txt: suppress unneeded warning about jemalloc
2017-02-13 18:47:29 -06:00
Sage Weil
b3723a1e60 Merge pull request #13360 from ktdreyer/doc-firewalld-improvements
doc: improve firewalld instructions

Reviewed-by: Sage Weil <sage@redhat.com>
2017-02-13 18:46:15 -06:00
Sage Weil
606aa6567e Merge pull request #13399 from vumrao/wip-vumrao-18919
rgw: change loglevel to 20 for 'System already converted' message

Reviewed-by: Casey Bodley <cbodley@redhat.com>
2017-02-13 18:42:15 -06:00