mgr/orchestrator: Unify `osd create` and `osd add`
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Juan Miguel Olmo Martínez <jolmomar@redhat.com>
Also:
* Added some more tests
* Better validation of drive groups
* Simplified `TestWriteCompletion`
Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
* refs/pull/26038/head:
mds: simplify recall warnings
mds: add extra details for cache drop output
qa: test mds_max_caps_per_client conf
mds: limit maximum number of caps held by session
mds: adapt drop cache for incremental recall
mds: recall caps incrementally
mds: adapt drop cache for incremental trim
mds: add throttle for trimming MDCache
mds: cleanup SessionMap init
mds: cleanup Session init
Reviewed-by: Zheng Yan <zyan@redhat.com>
Instead of a timeout and complicated decisions about whether the client is
releasing caps in an expeditious fashion, just use a DecayCounter that tracks
the number of caps we've recalled. This counter is decremented whenever the
client releases caps. If the counter passes a threshold, then we raise the
warning.
Similar reworking is done for the steady-state recall of client caps. Another
release DecayCounter is added so we can tell when the client is not releasing
any more caps.
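
A minimal sketch of the idea described above, not the MDS code itself: the
class below is a simplified Python stand-in for a decaying counter, and the
names `hit`, `get`, `half_life`, `should_warn` and `threshold` are assumptions
chosen for illustration. Clamping at zero is also an assumption of this sketch.

```python
import math


class DecayCounter:
    """Illustrative exponentially decaying counter (hypothetical API)."""

    def __init__(self, half_life, now=0.0):
        self.half_life = half_life  # seconds for the value to halve
        self.value = 0.0
        self.last = now

    def _decay(self, now):
        # Apply exponential decay for the time elapsed since the last update.
        elapsed = now - self.last
        if elapsed > 0:
            self.value *= math.exp(-elapsed * math.log(2) / self.half_life)
            self.last = now

    def hit(self, now, delta=1.0):
        # Positive delta: caps recalled. Negative delta: caps released.
        self._decay(now)
        self.value = max(0.0, self.value + delta)
        return self.value

    def get(self, now):
        self._decay(now)
        return self.value


def should_warn(recalled, now, threshold):
    # Warn only while the decayed count of outstanding recalls is too high.
    return recalled.get(now) > threshold
```

For example, recalling 100 caps at t=0 with a 10 second half-life leaves a
value of about 50 at t=10; a release of 20 caps at that point lowers it to
about 30, so a threshold of 25 would still trigger the warning until the
counter decays further.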
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
-filter out mons from other clusters
-fix parsing of mon name from role
Fixes: http://tracker.ceph.com/issues/38115
Signed-off-by: Casey Bodley <cbodley@redhat.com>
* When the creation of the cluster is delegated to vstart_runner.py
(--create or --create-target-only), the number of MGRs required
is calculated by the script, so tests are no longer skipped
due to an insufficient number of MGRs.
* Additionally, this issue is not reproducible anymore:
Fixes: https://tracker.ceph.com/issues/37964
* Fixed typo: TEUTHOLOFY_PY_REQS
Signed-off-by: Alfonso Martínez <almartin@redhat.com>
As with trimming, use DecayCounters to throttle the number of caps we recall,
both globally and per-session.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
`cache drop` is a long-running command that will block the asok interface
(while the tell version does not). Attempting to abort the command with ^C or
equivalents will simply cause the `ceph` command to exit but won't stop the
asok command handler from waiting for the cache drop operation to complete.
Instead, just allow the tell version.
Fixes: http://tracker.ceph.com/issues/38020
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
* refs/pull/26012/head:
qa: add test that down fs does not ERR
mon/MDSMonitor: skip offline ERR for down fs
Reviewed-by: Douglas Fuller <dfuller@redhat.com>
* refs/pull/25973/head:
qa: use simpler fs fail to bring fs down
MDSMonitor: add fs fail command
Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Douglas Fuller <dfuller@redhat.com>
If there are leftover merges at the end of the run, they can take a long
time to get through, blowing our timeouts for waiting for pgs to become
active (and to stop splitting/merging) and for scrubbing pgs. Stop all of
that at the end of the run so that we don't have to wait so long.
Signed-off-by: Sage Weil <sage@redhat.com>
We used to rely on the monmap bootstrap code to magically create a valid
monmap with named mons because our old-style ceph.conf had mon_addr
values in each mon.foo section. Instead, just feed it a real monmap
from pre-destruction.
In practice, a user can manually generate this monmap, or rename the
mons after the fact with --inject-monmap, or whatever. Out of scope
for this test, so we just do the simplest thing to make the rebuild test
work.
Signed-off-by: Sage Weil <sage@redhat.com>
Use the new config option type names (given by the cluster) in the
dashboard.
Fixes: http://tracker.ceph.com/issues/37843
Signed-off-by: Tatjana Dehler <tdehler@suse.com>
- if force-branch is set, use that
- otherwise:
  - read default-branch from client config
  - use suite branch, or ceph branch if suite branch is not defined
  - if this branch is one of the official releases (or master), prefix
    it with 'ceph-'
Try to clone the branch specified above; if that fails (the branch probably
doesn't exist) and force-branch is not set, use default-branch.
Also add an option to override ragweed repo.
Switched all force-branch from ragweed qa suite to default-branch.
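
The branch selection rules above can be sketched roughly as follows. This is
an illustration only, not the task's actual code: the function names, the
`try_clone` callback and the `OFFICIAL_BRANCHES` set are hypothetical.

```python
# Hypothetical set of official release names; the real list is not given here.
OFFICIAL_BRANCHES = {"master", "luminous", "mimic", "nautilus"}


def candidate_branch(force_branch, suite_branch, ceph_branch):
    """Pick the first branch to try, before any fallback."""
    if force_branch:
        return force_branch
    branch = suite_branch or ceph_branch
    if branch in OFFICIAL_BRANCHES:
        branch = "ceph-" + branch
    return branch


def resolve_branch(force_branch, default_branch, suite_branch, ceph_branch,
                   try_clone):
    """Return the branch that was cloned.

    try_clone(branch) is an assumed callback returning True on success.
    """
    branch = candidate_branch(force_branch, suite_branch, ceph_branch)
    if try_clone(branch):
        return branch
    # A forced branch never falls back to default-branch.
    if not force_branch and try_clone(default_branch):
        return default_branch
    raise RuntimeError("could not clone %r" % branch)
```

With suite branch `nautilus` the first candidate is `ceph-nautilus`; with a
suite branch such as `wip-foo` that does not exist in the repo, the sketch
falls back to default-branch.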
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Otherwise the Mutation for Truncate is done on obj_id of the last iteration of the previous loop.
Fixes: http://tracker.ceph.com/issues/37836
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
* refs/pull/25621/head:
mds: allow boot on read-only
mds: setup readonly mode for PurgeQueue
mds: return string_view for type str
mds: add missing locks for PurgeQueue methods
mds: delete on_error context on des
Reviewed-by: Zheng Yan <zyan@redhat.com>
* refs/pull/25009/head:
librbd: stringify locker name with get_legacy_str()
osdc/Objecter: fix list_watchers addr rendering to match legacy
test/crimson: disable unittest_seastar_messenger test
msg/msg_types: encode entity_addr_t TYPE_ANY as TYPE_LEGACY for pre-nautilus
client: make blacklist detection handle TYPE_ANY entries
mon/OSDMonitor: maintain compat output for 'blacklist ls'
client: maintain compat for {inst,addr}_str in status dump
qa/tasks/ceph_manager: compare osd flush seq #'s as ints
qa/suites/fs: make use of simple.yaml where appropriate
qa/msgr: move msgr facet into generic re-usable dir
crimson: fix monmap build for seastar
doc/start/ceph.conf: trim the sample ceph.conf file
doc/rados/operations: only describe --public-{addr,network} method for adding mons
PendingReleaseNotes: deprecate 'mon addr'
doc: fix some 'mon addr' references
doc/rados/configuration: fix some 'mon addr' references
doc/rados/configuration/network-config-ref: revise network docs somewhat
doc/rados/configuration/network-config-ref: remove totally obsolete section
qa/suites/rados: replace mon_seesaw.py task with a small bash script
qa/suites/fs/upgrade: don't bind to v2 addrs
qa/tasks/mon_thrash: avoid 'mon addr' in mon section
mon/MonClient: disable ms_bind_msgr2 if NAUTILUS feature not set
osd/OSDMap: maintain compat addr fields
msg/msg_types: add get_legacy_str()
mds/MDSMap.h: maintain compat addr field
mon/MgrMap: maintain compat active_addr field
mon/MonClient: reconnect to mon if its addrvec appears to have changed
qa/tasks/ceph.conf.template: increase mon_mgr_mkfs_grace
msg/async/ProtocolV2: fill in IP for all peer_addrs
msg/async: print all addrs on debug lines
mon/MonMap: no noname- mon name prefix when for_mkfs
ceph-monstore-tool: print initial monmap
msg/async/ProtocolV2: advertise ourselves as a v2 addr when using v2 protocol
msg/async: assert existing protocol matches current protocol
msg/async: add missing modelines
mon/MonMap: add missing modeline
vstart.sh: put mon addrs in mon_host, not 'mon addr'
msg/async: better debug around conn map lookups and updates
mon/MonClient: dump initial monmap at debug level 10
qa/standalone/osd/osd-fast-mark-down: use v1 addr w/ simplemessenger
qa/tasks/ceph: set initial monmap features when using addrvec addrs
monmaptool: add --enable-all-features option
qa/tasks/ceph: only use monmaptool --addv if addr has [,:v]
qa/tasks/ceph_manager: make get_mon_status use mon addr
qa/tasks/ceph: keep mon addrs in ctx namespace
mon/OSDMonitor: log all osd addrs on boot
msg/simple: behave when v2 and v1 addrs are present at target
mon/MonClient: warn if global_id changes
msg/Connection: add warning/note on get_peer_global_id
mds/MDSDaemon: clean up handle_mds_map debug output a bit
qa/suites/rados/upgrade: debug mds
mds/MDSRank: improve is_stale_message to handle addrvecs
msg/async: make loopback detect when sending to one of our many addrs
qa/suites/rados/upgrade: no aggressive pg num changes
mon/OSDMonitor: require nautilus mons for require_osd_release=nautilus
mon/OSDMonitor: require mimic mons for require_osd_release=mimic
qa/suites/rados/thrash-old-clients: use legacy addr syntax in ceph.conf
msg/async: preserve peer features when replacing a connection
qa/tasks/ceph.py: move methods from teuthology.git into ceph.py directly; support mon bind * options
mon/MonMap: adjust build_initial behavior for mkfs vs probe
mon/MonMap: improve ambiguous addr behavior
qa/suites/rados/upgrade: spread mons a bit
qa/rados/thrash-old-clients: keep mons on separate hosts
qa/standalone/mon/misc.sh: tweak test to be more robust
qa/tasks/mon_seesaw: expect v1/v2 prefix in addr
osd/OSDMap: fix is_blacklisted() check to assume type ANY
mon/OSDMonitor: use ANY addr type for blacklisting
mon/msg_types: TYPE_V1ORV2 -> TYPE_ANY
qa/workunits/cephtool: fix blacklist test
qa/suites/upgrade: install old version with only v1 addrs
common/options: by default, bind to both msgr v1 and v2 addresses
vstart.sh: add --msgr1, --msgr2, --msgr21 options
msg/async/ProtocolV2: be flexible with server identity check
msg/msg_types: fix entity_addrvec_t::parse() with null end arg
qa/suites/rados/basic/msgr: no msgr2 addrs in initial monmaps
qa/tasks/ceph: add 'mon_bind_addrvec' and 'mon_bind_msgr2' options
monmaptool: add --addv argument to pass in addrvec directly
qa/suites/rados/basic/msgr: do not use msgr2 with simplemessenger
qa/suites/rados/basic/msgr: async is not experimental
messages/MOSDBoot: fix compat with pre-nautilus
mon/MonMap: allow v1 or v2 to be explicitly specified along with port
msg/msg_types: allow parsing of IPs without assuming v1 vs v2
msg/msg_types: default parse to v2 addrs
msg: standardize on v1: and v2: prefixes for *all* entity_addr_t's
vstart.sh: use msgr2 by default
mon/MonMap: remove get_addr() methods
ceph-mon: adjust startup/bind/join sequence to use addrs
mon: use MonMap::get_addrs() (instead of get_addr())
mon/MonClient: change pending_cons to addrvec-based map
mon/MonMap: fix set_addr() caller, kill wrapper
mon/MonMap: remove addr-based add()
monmaptool: fix --add to do either legacy or msgr2+legacy
monmaptool: clean up iterator use a bit
mon/MonMap: handle ambiguous mon addrs by trying both legacy and msgr
mon/MonMap: take addrvec for set_initial_members
mon/MonMap: use addrvecs for test instances
mon: pass addrvec via MMonJoin
mon/MonmapMonitor: fix 'mon add' to populate addrvec
mon/MonMap: addr -> addrvec
msg/async/ProtocolV2: only update socket_addr if we learned our addr
osd: go active even if mon only accepted our v1 addr
test/msgr: add test for msgr2 protocol
msg/async/ProtocolV2: share socket_addr and all addrs during handshake
msg/async: print socket_addr for the connection
msg/async: msgr2 protocol placeholder
msg/async: move ProtocolV1 class to its own source file
msg/async: keep listen addr in ServerSocket, pass to new connections
msg/async/AsyncMessenger: fix set_addr_unknowns
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
The teuthology test did not like the change to remove 'mon addr' from
ceph.conf. The standalone script is easier to test.
Note that it avoids the mon names 'a', 'b', 'c' since MonMap::build_initial
uses those.
Signed-off-by: Sage Weil <sage@redhat.com>
The grace starts with the monmap creation stamp, and ceph.py does a lot
of work between creating that map and actually starting daemons (e.g.,
preparing all of the osd devices), leading to occasional MGR_DOWN errors.
Double the grace period.
Signed-off-by: Sage Weil <sage@redhat.com>