966 Commits

Author SHA1 Message Date
Patrick Donnelly
ac302de7b7 qa: silence read-only WRN for damage testing
Fixes: http://tracker.ceph.com/issues/37944

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2019-01-16 11:55:54 -08:00
Sage Weil
d0bf18379c Merge PR #25917 into master
* refs/pull/25917/head:
	qa/suites/rados/multimon/tasks/mon_recovery: whitelist PG_AVAILABILITY

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2019-01-12 10:25:57 -06:00
Sage Weil
c18a5d2e1c qa/tasks/rebuild_mondb: use monmap to properly name the mons
We used to rely on the monmap bootstrap code to magically create a valid
monmap with named mons because our old-style ceph.conf had mon_addr
values in each mon.foo section.  Instead, just feed it a real monmap
from pre-destruction.

In practice, a user can manually generate this monmap, or rename the
mons after the fact with --inject-monmap, or whatever.  Out of scope
for this test, so we just do the simplest thing to make the rebuild test
work.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-11 16:10:14 -06:00
Sage Weil
af435783b4 qa/suites/rados/multimon/tasks/mon_recovery: whitelist PG_AVAILABILITY
The mgr creates a pool for device health, and mons may be thrashing and
make peering slow.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-11 09:43:07 -06:00
Sage Weil
221afb0e28 Merge PR #25840 into master
* refs/pull/25840/head:
	qa/msgr: add async-v1only case

Reviewed-by: Neha Ojha <nojha@redhat.com>
2019-01-10 17:20:10 -06:00
Josh Durgin
a05f9ebaa6 Merge pull request #25816 from neha-ojha/wip-36686
osd/mon: fix upgrades for pg log hard limit

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2019-01-09 13:17:30 -08:00
Sage Weil
4c69fe2d3b qa/msgr: add async-v1only case
Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-08 13:22:39 -06:00
Casey Bodley
1b2b885518 Merge pull request #25381 from cbodley/wip-qa-rgw-cls
qa/rgw: add cls_lock/log/refcount/version tests to verify suite

Reviewed-by: Abhishek Lekshmanan <abhishek@suse.com>
2019-01-08 13:05:11 -05:00
Neha Ojha
c0da26505f qa/suites/upgrade/*-x/stress-split: set pglog_hardlimit flag
Signed-off-by: Neha Ojha <nojha@redhat.com>
2019-01-07 09:42:54 -08:00
Neha Ojha
24c3e2d669 qa/suites/upgrade/luminous-x: add pg log settings
Signed-off-by: Neha Ojha <nojha@redhat.com>
2019-01-07 09:42:54 -08:00
Yuri Weinstein
45af678d3d qa/tests: added pg log settings to mimic-x
Signed-off-by: Yuri Weinstein <yweinste@redhat.com>
Signed-off-by: Neha Ojha <nojha@redhat.com>
2019-01-07 09:42:22 -08:00
Sage Weil
1688d8fd92 qa/suites/rados/thrash-old-clients: no async-v2only
Old clients don't support the v2 protocol.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-05 12:26:56 -06:00
Sage Weil
e069c30cb3 Merge remote-tracking branch 'private/wip-mon-kv-fix' into wip-mimic-4
2019-01-04 14:03:56 -06:00
Sage Weil
251f667ef8 Merge PR #25009 into master
* refs/pull/25009/head:
	librbd: stringify locker name with get_legacy_str()
	osdc/Objecter: fix list_watchers addr rendering to match legacy
	test/crimson: disable unittest_seastar_messenger test
	msg/msg_types: encode entity_addr_t TYPE_ANY as TYPE_LEGACY for pre-nautilus
	client: make blacklist detection handle TYPE_ANY entries
	mon/OSDMonitor: maintain compat output for 'blacklist ls'
	client: maintain compat for {inst,addr}_str in status dump
	qa/tasks/ceph_manager: compare osd flush seq #'s as ints
	qa/suites/fs: make use of simple.yaml where appropriate
	qa/msgr: move msgr facet into generic re-usable dir
	crimson: fix monmap build for seastar
	doc/start/ceph.conf: trim the sample ceph.conf file
	doc/rados/operations: only describe --public-{addr,network} method for adding mons
	PendingReleaseNotes: deprecate 'mon addr'
	doc: fix some 'mon addr' references
	doc/rados/configuration: fix some 'mon addr' references
	doc/rados/configuration/network-config-ref: revise network docs somewhat
	doc/rados/configuration/network-config-ref: remove totally obsolete section
	qa/suites/rados: replace mon_seesaw.py task with a small bash script
	qa/suites/fs/upgrade: don't bind to v2 addrs
	qa/tasks/mon_thrash: avoid 'mon addr' in mon section
	mon/MonClient: disable ms_bind_msgr2 if NAUTILUS feature not set
	osd/OSDMap: maintain compat addr fields
	msg/msg_types: add get_legacy_str()
	mds/MDSMap.h: maintain compat addr field
	mon/MgrMap: maintain compat active_addr field
	mon/MonClient: reconnect to mon if its addrvec appears to have changed
	qa/tasks/ceph.conf.template: increase mon_mgr_mkfs_grace
	msg/async/ProtocolV2: fill in IP for all peer_addrs
	msg/async: print all addrs on debug lines
	mon/MonMap: no noname- mon name prefix when for_mkfs
	ceph-monstore-tool: print initial monmap
	msg/async/ProtocolV2: advertise ourselves as a v2 addr when using v2 protocol
	msg/async: assert existing protocol matches current protocol
	msg/async: add missing modelines
	mon/MonMap: add missing modeline
	vstart.sh: put mon addrs in mon_host, not 'mon addr'
	msg/async: better debug around conn map lookups and updates
	mon/MonClient: dump initial monmap at debug level 10
	qa/standalone/osd/osd-fast-mark-down: use v1 addr w/ simplemessenger
	qa/tasks/ceph: set initial monmap features with using addrvec addrs
	monmaptool: add --enable-all-features option
	qa/tasks/ceph: only use monmaptool --addv if addr has [,:v]
	qa/tasks/ceph_manager: make get_mon_status use mon addr
	qa/tasks/ceph: keep mon addrs in ctx namespace
	mon/OSDMonitor: log all osd addrs on boot
	msg/simple: behave when v2 and v1 addrs are present at target
	mon/MonClient: warn if global_id changes
	msg/Connection: add warning/note on get_peer_global_id
	mds/MDSDaemon: clean up handle_mds_map debug output a bit
	qa/suites/rados/upgrade: debug mds
	mds/MDSRank: improve is_stale_message to handle addrvecs
	msg/async: make loopback detect when sending to one of our many addrs
	qa/suites/rados/upgrade: no aggressive pg num changes
	mon/OSDMonitor: require nautilus mons for require_osd_release=nautilus
	mon/OSDMonitor: require mimic mons for require_osd_release=mimic
	qa/suites/rados/thrash-old-clients: use legacy addr syntax in ceph.conf
	msg/async: preserve peer features when replacing a connection
	qa/tasks/ceph.py: move methods from teuthology.git into ceph.py directly; support mon bind * options
	mon/MonMap: adjust build_initial behavior for mkfs vs probe
	mon/MonMap: improve ambiguous addr behavior
	qa/suites/rados/upgrade: spread mons a bit
	qa/rados/thrash-old-clients: keep mons on separate hosts
	qa/standalone/mon/misc.sh: tweak test to be more robust
	qa/tasks/mon_seesaw: expect v1/v2 prefix in addr
	osd/OSDMap: fix is_blacklisted() check to assume type ANY
	mon/OSDMonitor: use ANY addr type for blacklisting
	mon/msg_types: TYPE_V1ORV2 -> TYPE_ANY
	qa/workunits/cephtool: fix blacklist test
	qa/suites/upgrade: install old version with only v1 addrs
	common/options: by default, bind to both msgr v1 and v2 addresses
	vstart.sh: add --msgr1, --msgr2, --msgr21 options
	msg/async/ProtocolV2: be flexible with server identity check
	msg/msg_types: fix entity_addrvec_t::parse() with null end arg
	qa/suites/rados/basic/msgr: no msgr2 addrs in initial monmaps
	qa/tasks/ceph: add 'mon_bind_addrvec' and 'mon_bind_msgr2' options
	monmaptool: add --addv argument to pass in addrvec directly
	qa/suites/rados/basic/msgr: do not use msgr2 with simplemessenger
	qa/suites/rados/basic/msgr: async is not experimental
	messages/MOSDBoot: fix compat with pre-nautilus
	mon/MonMap: allow v1 or v2 to be explicitly specified along with part
	msg/msg_types: allow parsing of IPs without assuming v1 vs v2
	msg/msg_types: default parse to v2 addrs
	msg: standardize on v1: and v2: prefixes for *all* entity_addr_t's
	vstart.sh: use msgr2 by default
	mon/MonMap: remove get_addr() methods
	ceph-mon: adjust startup/bind/join sequence to use addrs
	mon: use MonMap::get_addrs() (instead of get_addr())
	mon/MonClient: change pending_cons to addrvec-based map
	mon/MonMap: fix set_addr() caller, kill wrapper
	mon/MonMap: remove addr-based add()
	monmaptool: fix --add to do either legacy or msgr2+legacy
	monmaptool: clean up iterator use a bit
	mon/MonMap: handle ambiguous mon addrs by trying both legacy and msgr
	mon/MonMap: take addrvec for set_initial_members
	mon/MonMap: use addrvecs for test instances
	mon: pass addrvec via MMonJoin
	mon/MonmapMonitor: fix 'mon add' to populate addrvec
	mon/MonMap: addr -> addrvec
	msg/async/ProtocolV2: only update socket_addr if we learned our addr
	osd: go active even if mon only accepted our v1 addr
	test/msgr: add test for msgr2 protocol
	msg/async/ProtocolV2: share socket_addr and all addrs during handshake
	msg/async: print socket_addr for the connection
	msg/async: msgr2 protocol placeholder
	msg/async: move ProtocolV1 class to its own source file
	msg/async: keep listen addr in ServerSocket, pass to new connections
	msg/async/AsyncMessenger: fix set_addr_unknowns

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2019-01-04 13:42:09 -06:00
Neha Ojha
a74129d26d qa/suites/upgrade/mimic-x: fix rhel runs
The following fragment was required for rhel on ovh:
overrides:
    ansible.cephlab:
      skip_tags: entitlements,packages,repos

Since this suite runs on smithi in our nightlies, we should not need
this.

Signed-off-by: Neha Ojha <nojha@redhat.com>
2019-01-03 13:39:30 -08:00
Sage Weil
8a3d90199d qa/suites/fs: make use of simple.yaml where appropriate
There's more needed than just ms_type=simple now.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:38 -06:00
Sage Weil
d518eb6cac qa/msgr: move msgr facet into generic re-usable dir
Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:38 -06:00
Sage Weil
16980bd12f qa/suites/rados: replace mon_seesaw.py task with a small bash script
The teuthology test did not like the change to remove 'mon addr' from
ceph.conf.  The standalone script is easier to test.

Note that it avoids mon names 'a', 'b', 'c' since the MonMap::build_initial
uses those.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
f857c70c9c qa/suites/fs/upgrade: don't bind to v2 addrs
Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
d980907fc4 qa/suites/rados/upgrade: debug mds
Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
68913080b5 qa/suites/rados/upgrade: no aggressive pg num changes
We now run with mixed mons and old mgrs, so this won't work.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
576b6a77f1 qa/suites/rados/thrash-old-clients: use legacy addr syntax in ceph.conf
Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
b1493f0d9a qa/suites/rados/upgrade: spread mons a bit
This will mean 2/3 mons have default ports.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
fbdc1358e6 qa/rados/thrash-old-clients: keep mons on separate hosts
This ensures the mons can use default ports, ceph.conf won't have v1: or
v2: prefixes, and old clients will be happy.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:31 -06:00
Sage Weil
0692d06979 qa/suites/upgrade: install old version with only v1 addrs
v1+v2 support is new in nautilus.

Signed-off-by: Sage Weil <sage@redhat.com>
2018-12-21 15:31:32 -06:00
Sage Weil
6429537bd7 qa/suites/rados/basic/msgr: no msgr2 addrs in initial monmaps
Signed-off-by: Sage Weil <sage@redhat.com>
2018-12-21 15:31:32 -06:00
Sage Weil
a58fcf9e0f qa/suites/rados/basic/msgr: do not use msgr2 with simplemessenger
Signed-off-by: Sage Weil <sage@redhat.com>
2018-12-21 15:31:32 -06:00
Sage Weil
9a5aa423e0 qa/suites/rados/basic/msgr: async is not experimental
Signed-off-by: Sage Weil <sage@redhat.com>
2018-12-21 15:31:32 -06:00
Sebastian Wagner
933b2cfc28 mgr/orchestrator: Add test orchestrator
1. To be able to run the CLI without an external orchestrator.
2. To run the CLI in Teuthology.

Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
2018-12-20 10:56:49 +01:00
Sage Weil
9f3cf00b79 Merge PR #25360 into master
* refs/pull/25360/head:
	qa/workunits/mon/pg_autoscaler: clean up pools afterwards
	qa/suites/rados/singletone/all/pg-autoscaler: whitelist health warnings
	qa/tasks/ceph: wait for splits/merges before final scrub
	mon/OSDMonitor: be tidy with target_size_ratio and pre-nautilus code
	mgr/pg_autoscaler: simplify conditions
	qa/suites/rados: add simple pg-autoscaler test
	qa/workunits/cephtool/test.sh: pg_autoscale_mode=off while testing pg_num etc
	doc/rados/operations: document autoscaler and its health warnings
	mgr/pg_autoscaler: add pg autoscaler module
	pybind/mgr/mgr_util: move format_ helpers out of status module
	mon/OSDMonitor: accept optional target_size_{bytes,ratio} to 'osd pool create'
	mon/OSDMonitor: remove max_split_count configurable
	osd/osd_types: pool_opts_t: int -> int64_t
	osd/osd_types: pool_opts: fix whitespace
	osd/osd_types: pool_opts_t: make encoding feature-dependent
	mgr/devicehealth: pg_num_min 1 for device_health_metrics pool
	mon/OSDMonitor: accept optional pg_num_min to 'osd pool create'
	mon/OSDMonitor: apply osd_pool_default_pg_autoscale_mode to new pools
	pybind/mgr/mgr_module: some accessors
	mon/MgrMonitor: enable progress module by default
	osd/osd_types: add pool pg_autoscale_mode, pg_num_min, target_size_{bytes,ratio} properties
	osdc/Objecter: revise get_latest_version locking
	os/memstore: ignore OP_COLL_SET_BITS
	qa: generalise REQUIRE_MEMSTORE
	mgr: drop GIL in get_config
	mon: add 'size' arg to `osd pool create`
	mon: use pg_num_target for checks during creation
	mgr: revise locking in getter paths
	common/options: add `mon_target_pg_per_osd`
	mgr: expose OSDMap.pool_raw_used_rate

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
2018-12-19 21:22:35 -06:00
Sage Weil
b8d45b262c qa/suites/rados/singletone/all/pg-autoscaler: whitelist health warnings
Signed-off-by: Sage Weil <sage@redhat.com>
2018-12-19 14:37:01 -06:00
Sage Weil
2cd1ca6625 qa/suites/rados: add simple pg-autoscaler test
Signed-off-by: Sage Weil <sage@redhat.com>
2018-12-18 13:30:54 -06:00
Sage Weil
09a8e5bce0 qa/suites/upgrade/mimic-x: add missing .qa
Signed-off-by: Sage Weil <sage@redhat.com>
2018-12-18 08:17:31 -06:00
Sage Weil
c7940db6b6 Merge PR #25596 into master
* refs/pull/25596/head:
	qa/suites/upgrade: fix wrt librados3

Reviewed-by: Kefu Chai <kchai@redhat.com>
2018-12-18 07:24:03 -06:00
Sage Weil
5612b6714c qa/suites/upgrade: fix wrt librados3
Signed-off-by: Sage Weil <sage@redhat.com>
2018-12-17 13:25:34 -06:00
Sage Weil
dce1623db9 qa/rados/upgrade: align thrashing with upgrade suite, don't import/export pgs
Don't import/export between versions

Fixes: http://tracker.ceph.com/issues/37665
Signed-off-by: Sage Weil <sage@redhat.com>
2018-12-14 07:04:56 -06:00
Kefu Chai
1d973c1e90 qa: downgrade librados2,librbd1 for thrash-old-clients tests
librados2 and librbd1 are installed as a dependency of qemu-kvm.
qemu-kvm is installed by ceph-cm-ansible, see [1].

In thrash-old-clients, jewel packages are installed, but yum does
not allow a downgrade unless it is requested explicitly. In this change,
we downgrade librbd1 and librados2 to address this issue.

Currently, the ceph packages shipped by CentOS/RHEL 7 are still an old
version of jewel, so this issue only kicks in when we try to install
hammer.

This change should address failures like

Command failed on smithi136 with status 1: '\n sudo yum -y install
rbd-fuse\n '

found in rados/thrash-old-clients tests.

---
[1]
3db1cbdc22 (diff-f2b05d775fedff6c5c6689f564b32f1c)

Fixes: http://tracker.ceph.com/issues/37618
Signed-off-by: Kefu Chai <kchai@redhat.com>
2018-12-13 10:49:37 +08:00
Casey Bodley
8bf1c60f6a qa/rgw: add cls_lock/log/refcount tests to verify suite
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2018-12-10 13:56:34 -05:00
Stephan Müller
19b039c28e mgr/dashboard/qa: Fix ECP creation test
The current solution fails on our CI system because some outputs can
contain additional values, and some parameters such as 'w' can vary
between environments.

It passed before only because it had been tested solely in a vstart
cluster environment.

With this commit, only the attributes we know to be present are tested.

Fixes: https://tracker.ceph.com/issues/37275
Signed-off-by: Stephan Müller <smueller@suse.com>
2018-12-10 12:37:03 +01:00
Patrick Donnelly
4432aa5f26 Merge PR #24748 into master
* refs/pull/24748/head:
	qa: use 6h timeout for pjd test

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2018-12-07 10:50:57 -08:00
Sage Weil
9ee3ce1ecd Merge PR #25345 into master
* refs/pull/25345/head:
	qa/suites: fix bluestore links
	qa/objectstore: bluestore -> bluestore-{bitmap,stupid}

Reviewed-by: Gregory Farnum <gfarnum@redhat.com>
Reviewed-by: Greg Farnum <gfarnum@redhat.com>
2018-12-06 08:42:04 -06:00
Sage Weil
269910fc8b qa/suites: fix bluestore links
Signed-off-by: Sage Weil <sage@redhat.com>
2018-12-05 10:30:14 -06:00
Kefu Chai
105ca218ee qa/suites/rados/upgrade: set require-osd-release to nautilus
* add qa/releases/nautilus.yaml so it can be reused.
* use releases/nautilus.yaml in luminous-x upgrade test, so
  test_librbd_python.sh is able to use the feature introduced in
  nautilus.

Fixes: http://tracker.ceph.com/issues/37432
Signed-off-by: Kefu Chai <kchai@redhat.com>
2018-11-29 14:35:17 +08:00
Sage Weil
706197a7c7 Merge PR #25272 into master
* refs/pull/25272/head:
	qa: add simple test-volumes.sh workunit and run it from fs/basic_functional
	vstart.sh: create default fs via 'fs volume create'
	mgr/volumes: fix oremote
	mon/MgrMonitor: enable volumes module by default
	mgr: create `volumes` module
	mgr: cleaner constructor for CommandResult
	mgr: block for latest osdmap after command execution
	mgr: add MgrModule.mon_command helper
	ceph_volume_client: enable using existing rados inst
	mon: give ceph-mgr access to 'fs' commands

Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Jan Fajerski <jfajerski@suse.com>
2018-11-28 11:34:57 -06:00
Sage Weil
43bf12e12d qa: add simple test-volumes.sh workunit and run it from fs/basic_functional
Signed-off-by: Sage Weil <sage@redhat.com>
2018-11-28 08:54:29 -06:00
Lenz Grimmer
720e0d4bfd Merge pull request #24900 from zmc/wip-minimal-health
mgr/dashboard: Replace dashboard service

Reviewed-by: Tatjana Dehler <tdehler@suse.com>
Reviewed-by: Tiago Melo <tmelo@suse.com>
2018-11-28 10:56:40 +01:00
Zack Cerza
50b7d42fe5 mgr/dashboard: Replace dashboard service
This splits out the collection of health and log data from the
/api/dashboard/health controller into /api/health/{full,minimal} and
/api/logs/all.

/health/full contains all the data (minus logs) that /dashboard/health
did, whereas /health/minimal contains only what is needed for the health
component to function. /logs/all contains exactly what the logs portion
of /dashboard/health did.

By using /health/minimal, on a vstart cluster we pull ~1.4KB of data
every 5s, where we used to pull ~6KB; those numbers would get larger
with larger clusters. Once we split out log data, that will drop to
~0.4KB.

Fixes: http://tracker.ceph.com/issues/36675

Signed-off-by: Zack Cerza <zack@redhat.com>
2018-11-27 16:08:53 -07:00
Patrick Donnelly
b76f14569d Merge PR #24886 into master
* refs/pull/24886/head:
	qa: fix delay type config name

Reviewed-by: Zheng Yan <zyan@redhat.com>
2018-11-27 13:58:26 -08:00
Sage Weil
d69e8d8de8 Merge PR #14092 into master
* refs/pull/14092/head:
	mgr/DaemonServer: fix session leak
	mon/MonClient: ignore new mon commands while stopping
	mgr/DeviceState: fix DeviceState initial refcount
	qa/suites: valgrind ceph-mgr too

Reviewed-by: Kefu Chai <kchai@redhat.com>
2018-11-16 07:11:44 -06:00
Lenz Grimmer
34a5ac0b19 Merge pull request #25084 from s0nea/wip-dashboard-add-missing-test-suites
mgr/dashboard/qa: add missing dashboard suites

Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Ricardo Marques <rimarques@suse.com>
2018-11-16 11:16:42 +01:00