Commit Graph

93298 Commits

Author SHA1 Message Date
Sage Weil
16980bd12f qa/suites/rados: replace mon_seesaw.py task with a small bash script
The teuthology test did not like the change to remove 'mon addr' from
ceph.conf.  The standalone script is easier to test.

Note that it avoids mon names 'a', 'b', 'c' since MonMap::build_initial
uses those.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
f857c70c9c qa/suites/fs/upgrade: don't bind to v2 addrs
Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
caa3a82ada qa/tasks/mon_thrash: avoid 'mon addr' in mon section
Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
f3ddb1c9b8 mon/MonClient: disable ms_bind_msgr2 if NAUTILUS feature not set
Do not try to bind to v2 addresses until all of the mons will know what
we are doing and will be able to advertise those addresses.

This avoids the possibility of corner cases where we bind to one thing
but advertise something different via the various cluster maps.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
ae8b0cd62a osd/OSDMap: maintain compat addr fields
Fixes b47d9135d5 and
9fb1e521c7.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
1aceb2d04b msg/msg_types: add get_legacy_str()
Render a pre-nautilus entity_addr_t string.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
51ede31d55 mds/MDSMap.h: maintain compat addr field
This avoids breaking anyone looking at a pre-nautilus dump.
Fixes ea1481d08d.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
ac82a0b3e8 mon/MgrMap: maintain compat active_addr field
This avoids breaking anyone looking at a pre-nautilus dump.  Fixes
7f787704cd.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
5cba1fb874 mon/MonClient: reconnect to mon if its addrvec appears to have changed
This primarily kicks in if we connect to a mon's v1 address during the
initial probe and then discover that it has v2+v1.  It's a catch-all,
though, so that we'll reconnect to the (er, a) mon in any case where we
see its addresses change.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
c4ae3554e1 qa/tasks/ceph.conf.template: increase mon_mgr_mkfs_grace
The grace starts with the monmap creation stamp, and ceph.py does a lot
of work between creating that map and actually starting daemons (e.g.,
preparing all of the osd devices), leading to occasional MGR_DOWN errors.
Double the grace period.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
6f42b656d4 msg/async/ProtocolV2: fill in IP for all peer_addrs
Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
283c8d3d34 msg/async: print all addrs on debug lines
Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
c078c81031 mon/MonMap: no noname- mon name prefix when for_mkfs
Teuthology no longer puts mon addr in ceph.conf, and instead sets the
mon_host option globally.  That means a different path for
ceph-monstore-tool rebuild to regenerate the monmap, and the generated
map's names need to match teuthology's.

This change fixes the teuthology rebuild test because that test's mon
names happen to also be 'a', 'b', 'c'.  It's fragile, but it works.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
c29b1d7246 ceph-monstore-tool: print initial monmap
Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
4582de3ef3 msg/async/ProtocolV2: advertise ourselves as a v2 addr when using v2 protocol
We may have learned our address from a v1 connection, so myaddrs() is
a v1 addr like [v1:1.2.3.4:123/4392].  When we connect to someone using
msgr2, we should advertise ourselves as a v2 address, or else we risk
confusing everyone because we are a "v1" endpoint using the v2 protocol.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
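A minimal Python sketch of the idea in 4582de3ef3 above, not the actual C++ change: when the protocol in use is v2, the address we advertise is relabelled as v2 even if it was learned over a v1 connection. The `Addr` class and `advertised_addr` function are hypothetical names made up for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Addr:
    """Hypothetical stand-in for an entity address: kind is "v1" or "v2"."""
    kind: str
    ip: str
    port: int
    nonce: int

    def __str__(self):
        return f"{self.kind}:{self.ip}:{self.port}/{self.nonce}"


def advertised_addr(my_addr: Addr, protocol: str) -> Addr:
    """Return the address to advertise on a connection that uses `protocol`."""
    if protocol == "v2" and my_addr.kind != "v2":
        # Same IP, port and nonce; only the type label changes.
        return replace(my_addr, kind="v2")
    return my_addr


# The address from the commit message, learned over a v1 connection.
learned = Addr("v1", "1.2.3.4", 123, 4392)
print(advertised_addr(learned, "v2"))  # v2:1.2.3.4:123/4392
```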
Sage Weil
69408b57e0 msg/async: assert existing protocol matches current protocol
If we are (potentially) replacing a connection, assert that the protocol
version matches.  If it doesn't, something very weird is going on!

Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
87b89bd613 msg/async: add missing modelines
Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
54efc4517c mon/MonMap: add missing modeline
Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
b1aec88c9f vstart.sh: put mon addrs in mon_host, not 'mon addr'
Notably, mon addr won't take an addrvec, while everything else will.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
d151ece3af msg/async: better debug around conn map lookups and updates
Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
395947ae40 mon/MonClient: dump initial monmap at debug level 10
Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
b92be2ca9b qa/standalone/osd/osd-fast-mark-down: use v1 addr w/ simplemessenger
Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
a19b8e5b14 qa/tasks/ceph: set initial monmap features when using addrvec addrs
The --add option will only infer a bare IP to include a v2 addr if the
NAUTILUS feature is there, and that isn't normally present on a freshly
generated monmap.  Add it if we are doing addrvecs!

Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
a274e772f2 monmaptool: add --enable-all-features option
Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
241d402d7c qa/tasks/ceph: only use monmaptool --addv if addr has [,:v]
Otherwise, we want the --add path, which has the logic to infer ports,
v2+v1, etc.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
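The selection rule in 241d402d7c above might look roughly like the Python below. This is a sketch, not the code from qa/tasks/ceph.py: it assumes that "[,:v]" in the subject is a regex character class (a comma, a colon, or the letter v), and `monmaptool_add_args` is a hypothetical helper name.

```python
import re


def monmaptool_add_args(name: str, addr: str) -> list:
    """Pick the monmaptool option for one mon, given its address string."""
    # Assumption: any of ',', ':' or 'v' means the address is already
    # explicit (a port, a v1:/v2: prefix, or a full addrvec).
    if re.search(r"[,:v]", addr):
        return ["--addv", name, addr]
    # A bare IP goes through --add, which infers ports and v2+v1.
    return ["--add", name, addr]


print(monmaptool_add_args("a", "10.0.0.1"))
print(monmaptool_add_args("b", "[v2:10.0.0.2:3300,v1:10.0.0.2:6789]"))
```

Under this reading a bare IPv6 address would also match because of its colons, so the real task may well handle that case differently.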
Sage Weil
ac2430a43d qa/tasks/ceph_manager: make get_mon_status use mon addr
We don't have the 'mon addr' config property any more.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
545df766be qa/tasks/ceph: keep mon addrs in ctx namespace
Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
27b7867d83 mon/OSDMonitor: log all osd addrs on boot
Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
a1cebd00a9 msg/simple: behave when v2 and v1 addrs are present at target
Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
609c6d76bc mon/MonClient: warn if global_id changes
Shouldn't happen!

Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
50f5e328d7 msg/Connection: add warning/note on get_peer_global_id
This field isn't populated for loopback connections because the msgr
doesn't have any insight into what global_id its user has.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
d64cac170e mds/MDSDaemon: clean up handle_mds_map debug output a bit
The old wording was misleading!

Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
d980907fc4 qa/suites/rados/upgrade: debug mds
Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
9e63e63468 mds/MDSRank: improve is_stale_message to handle addrvecs
If we get a connection on a loopback from ourselves, get_source_addrs()
will have everything we bound to, but the mdsmap may only have the v1
address.  Avoid the addrvec comparison by instead comparing the
ConnectionRefs.

NOTE: this implementation is a stopgap.  We should really maintain a map
of ConnectionRefs for the current up set and compare the ConnectionRefs
directly instead of comparing addr(vecs).

Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
76ccf140ed msg/async: make loopback detect when sending to one of our many addrs
Drop the assert just because it's inefficient and not necessary.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
68913080b5 qa/suites/rados/upgrade: no aggressive pg num changes
We now run with mixed mons and old mgrs, so this won't work.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
0a67e02c14 mon/OSDMonitor: require nautilus mons for require_osd_release=nautilus
Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
deb92c9d89 mon/OSDMonitor: require mimic mons for require_osd_release=mimic
Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
576b6a77f1 qa/suites/rados/thrash-old-clients: use legacy addr syntax in ceph.conf
Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
103c0238d4 msg/async: preserve peer features when replacing a connection
The features are now stored in the protocol implementation.  When we replace
an existing connection, copy those features so that our connect_msg_reply
calculates the correct features for the session.

This fixes an issue where a 3-mon cluster, after upgrading the two followers
but not the leader, was unable to include the (luminous) leader in the
quorum because it was seeing missing features in the connect reply, because
the new mons were replacing an old instance of the connection and weren't
copying the features, and that old instance had connect_msg.features == 0.

Add some debug lines that helped (finally) identify the problem.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
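The failure described in 103c0238d4 above can be illustrated with a toy model in Python. It assumes that the older connection object survives the replacement and takes over the new session, which is a reading of the commit message rather than something it states outright. `Connection`, `replace_connection` and `reply_features` are hypothetical names, and the feature bits are arbitrary.

```python
class Connection:
    """Toy connection: remembers only the peer's feature bits."""

    def __init__(self, peer_features=0):
        self.peer_features = peer_features


def replace_connection(existing, incoming, copy_features=True):
    """Let `existing` take over the session that `incoming` negotiated."""
    if copy_features:
        # The fix: carry the peer features over to the surviving object.
        existing.peer_features = incoming.peer_features
    return existing


def reply_features(conn, local_features):
    """Features offered in the connect reply: the common subset."""
    return conn.peer_features & local_features


LOCAL = 0b1111
old = Connection(peer_features=0)        # stale instance, features == 0
new = Connection(peer_features=0b0111)   # what the peer actually sent

broken = replace_connection(Connection(0), new, copy_features=False)
fixed = replace_connection(old, new)
print(reply_features(broken, LOCAL), reply_features(fixed, LOCAL))
```

Without the copy the reply advertises no common features, which matches the symptom of the leader seeing missing features in the connect reply.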
Sage Weil
1ab352dd31 qa/tasks/ceph.py: move methods from teuthology.git into ceph.py directly; support mon bind * options
Having these live in teuthology.git is silly, since they are only consumed
by the ceph task, and it is hard to revise the behavior.

Revise the behavior by adding mon_bind_* options.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
572311b614 mon/MonMap: adjust build_initial behavior for mkfs vs probe
For the mkfs case, interpret an ambiguous port as a v2 address.  For probe,
try both.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
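A rough Python sketch of the mkfs versus probe distinction in 572311b614 above. `candidate_addrs` is a hypothetical helper, not part of MonMap, and the ordering of the two probe candidates is an assumption.

```python
def candidate_addrs(ip: str, port: int, for_mkfs: bool) -> list:
    """Expand an address whose port does not say whether it is v1 or v2."""
    v2 = f"v2:{ip}:{port}"
    v1 = f"v1:{ip}:{port}"
    if for_mkfs:
        # A new cluster: commit to a single interpretation, v2.
        return [v2]
    # Probing an existing cluster: we cannot know, so try both.
    return [v2, v1]


print(candidate_addrs("10.0.0.1", 6800, for_mkfs=True))
print(candidate_addrs("10.0.0.1", 6800, for_mkfs=False))
```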
Sage Weil
5992a61c2c mon/MonMap: improve ambiguous addr behavior
Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
b1493f0d9a qa/suites/rados/upgrade: spread mons a bit
This will mean 2/3 mons have default ports.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
fbdc1358e6 qa/rados/thrash-old-clients: keep mons on separate hosts
This ensures the mons can use default ports, ceph.conf won't have v1: or
v2: prefixes, and old clients will be happy.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
7559a47f5b qa/standalone/mon/misc.sh: tweak test to be more robust
Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
2f786f3299 qa/tasks/mon_seesaw: expect v1/v2 prefix in addr
Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
55a9c7522f osd/OSDMap: fix is_blacklisted() check to assume type ANY
Note that this still does a copy of the addr struct (as it did before).
This could be more efficient...

Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
0610e56265 mon/OSDMonitor: use ANY addr type for blacklisting
Client addresses are untyped in that they can connect to v1 or v2 server
endpoints, so blacklist them as TYPE_ANY.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:20 -06:00
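The two blacklist commits above (55a9c7522f and 0610e56265) amount to ignoring the address type on both the store and the lookup side. A minimal Python sketch under that reading follows. Addresses are modelled as plain (type, ip, port, nonce) tuples, and `as_any`, `blacklist_add` and `is_blacklisted` are hypothetical names. Whole-IP blacklist entries are left out.

```python
def as_any(addr):
    """Return a copy of `addr` with its type forced to "any"."""
    _type, ip, port, nonce = addr
    return ("any", ip, port, nonce)


def blacklist_add(blacklist: set, addr) -> None:
    """Store client addresses untyped, since a client may use v1 or v2."""
    blacklist.add(as_any(addr))


def is_blacklisted(blacklist: set, addr) -> bool:
    """Look the address up with its type normalised the same way."""
    return as_any(addr) in blacklist


bl = set()
blacklist_add(bl, ("v1", "10.0.0.9", 0, 1234))
print(is_blacklisted(bl, ("v2", "10.0.0.9", 0, 1234)))
print(is_blacklisted(bl, ("v2", "10.0.0.9", 0, 9999)))
```

The lookup builds a normalised copy of the address on each call, which mirrors the note in 55a9c7522f that the check still copies the addr struct.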
Sage Weil
5a404f9e4b mon/msg_types: TYPE_V1ORV2 -> TYPE_ANY
..and allow us to parse it.

Signed-off-by: Sage Weil <sage@redhat.com>
2018-12-22 15:53:13 -06:00