4031 Commits

Author SHA1 Message Date
David Zafman
6e3f04365f test: Trap termination so we can capture logs on teuthology timeout
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-09-10 12:23:07 -07:00
vasukulkarni
10f1c4c9de
Merge pull request #23602 from smanjara/wip-test-netem
qa: Task to emulate network delay and packet drop between two given h…
2018-09-10 09:57:10 -07:00
Sage Weil
d71258495e
Merge pull request #23997 from batrick/multimds-qa-broken-symlink
qa: fix symlink
2018-09-10 09:26:12 -05:00
Sage Weil
4d2a73c7f1 Merge PR #23845 into master
* refs/pull/23845/head:
	osd/OSDMap: include age in up and in counts for ceph status
	mon/OSDMonitor: set new_last_{up,in}_change
	osd/OSDMap: store last_up_change and last_in_change
	mgr/MgrMap: include mgr age in map printer
	mon/MgrMap: track active_changed timestamp
	mon: include mon quorum age in status
	include/utime: add utimespan_str helper

Reviewed-by: John Spray <john.spray@redhat.com>
2018-09-10 07:45:58 -05:00
Patrick Donnelly
a45852f8fd
qa: fix symlink
Introduced-by: 6ac1882dc4

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2018-09-08 19:21:57 -07:00
Sage Weil
f47921f293 qa/standalone/osd/osd-backfill-stats: fixes
Grep from the primary's log, not every osd's log.

For the backfill_remapped task in particular, after the pg_temp change it
just so happens that the primary changes across the pool size change and
thus two different primaries do (some) backfill.  Fix that test to pass
the correct primary.

Other tests are unaffected as they do not (happen to) trigger a primary
change and already satisfied the (removed) check that only one OSD does
backfill.

Signed-off-by: Sage Weil <sage@redhat.com>
2018-09-07 17:11:18 -05:00
Sage Weil
09ee3f3538 Merge PR #20469 into master
* refs/pull/20469/head:
	osd/PG: remove warn on delete+merge race
	osd: base project_pg_history on is_new_interval
	osd: make project_pg_history handle concurrent osdmap publish
	osd: handle pg delete vs merge race
	osd/PG: do not purge strays in premerge state
	doc/rados/operations/placement-groups: a few minor corrections
	doc/man/8/ceph: drop enumeration of pg states
	doc/dev/placement-groups: drop old 'splitting' reference
	osd: wait for laggy pgs without osd_lock in handle_osd_map
	osd: drain peering wq in start_boot, not _committed_maps
	osd: kick split children
	osd: no osd_lock for finish_splits
	osd/osd_types: remove is_split assert
	ceph-objectstore-tool: prevent import of pg that has since merged
	qa/suites: test pg merging
	qa/tasks/thrashosds: support merging pgs too
	mon/OSDMonitor: mon_inject_pg_merge_bounce_probability
	doc/rados/operations/placement-groups: update to describe pg_num reductions too
	doc/rados/operations: remove reference to lpgs
	osd: implement pg merge
	osd/PG: implement merge_from
	osdc/Objecter: resend ops on pg merge
	osd: collect and record pg_num changes by pool
	osd: make load_pgs remove message more accurate
	osd/osd_types: pg_t: add is_merge_target()
	osd/osd_types: pg_t::is_merge -> is_merge_source
	osd/osd_types: adding or subtracting invalid stats -> invalid stats
	osd/PG: clear_ready_to_merge on_shutdown (or final merge source prep)
	osd: debug pending_creates_from_osd cleanup, don't use cbegin
	ceph-objectstore-tool: debug intervals update
	mgr/ClusterState: discard pg updates for pgs >= pg_num
	mon/OSDMonitor: fix long line
	mon/OSDMonitor: move pool created check into caller
	mon/OSDMonitor: adjust pgp_num_target down along with pg_num_target as needed
	mon/OSDMonitor: add mon_osd_max_initial_pgs to cap initial pool pgs
	osd/OSDMap: set pg[p]_num_target in build_simple*() methods
	mon/PGMap: adjust SMALLER_PGP_NUM warning to use *_target values
	mon/OSDMonitor: set CREATING flag for force-create-pg
	mon/OSDMonitor: start sending new-style pg_create2 messages
	mon/OSDMonitor: set last_force_resend_prenautilus for pg_num_pending changes
	osd: ignore pg creates when pool FLAG_CREATING is not set
	mgr: do not adjust pg_num until FLAG_CREATING removed from pool
	mon/OSDMonitor: add FLAG_CREATING on upgrade if pools still creating
	mon/OSDMonitor: prevent FLAG_CREATING from getting set pre-nautilus
	mon/OSDMonitor: disallow pg_num changes while CREATING flag is set
	mon/OSDMonitor: set POOL_CREATING flag until initial pool pgs are created
	osd/osd_types: add pg_pool_t FLAG_POOL_CREATING
	osd/osd_types: introduce last_force_resend_prenautilus
	osd/PGLog: merge_from helper
	osd: no cache agent or snap trimming during premerge
	osd: notify mon when pending PGs are ready to merge
	mgr: add simple controller to adjust pg[p]_num_actual
	mon/OSDMonitor: MOSDPGReadyToMerge to complete a pg_num change
	mon/OSDMonitor: allow pg_num to be adjusted up or down via pg[p]_num_target
	osd/osd_types: make pg merge an interval boundary
	osd/osd_types: add pg_t::is_merge() method
	osd/osd_types: add pg_num_pending to pg_pool_t
	osd: allow multiple threads to block on wait_min_pg_epoch
	osd: restructure advance_pg() call mechanism
	mon/PGMap: prune merged pgs
	mon/PGMap: track pgs by state for each pool
	osd/SnapMapper: allow split_bits to decrease (merge)
	os/bluestore: fix osr_drain before merge
	os/bluestore: allow reuse of osr from existing collection
	os/filestore: (re)implement merge
	os/filestore: add _merge_collections post-check
	os: implement merge_collection
	os/ObjectStore: add merge_collection operation to Transaction
2018-09-07 15:55:21 -05:00
Ilya Dryomov
478aca82eb
Merge pull request #23976 from idryomov/wip-cram-git-clone
qa/tasks/cram: tasks now must live in the repository

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
2018-09-07 19:57:42 +02:00
Sage Weil
6bd682f53d ceph-objectstore-tool: prevent import of pg that has since merged
We currently import a portion of the PG if it has split.  Merge is more
complicated, though, mainly because COT is operating in a mode where it
fast-forwards the PG to the latest OSDMap epoch, which means it has to
implement any transformations to the PG (split/merge) independently.
Avoid doing this for merge.

Signed-off-by: Sage Weil <sage@redhat.com>
2018-09-07 12:09:05 -05:00
Sage Weil
44de03d5e6 qa/suites: test pg merging
Signed-off-by: Sage Weil <sage@redhat.com>
2018-09-07 12:09:05 -05:00
Sage Weil
0b59b7a688 qa/tasks/thrashosds: support merging pgs too
Signed-off-by: Sage Weil <sage@redhat.com>
2018-09-07 12:09:05 -05:00
Sage Weil
4fc02a7f48 osd/OSDMap: include age in up and in counts for ceph status
Signed-off-by: Sage Weil <sage@redhat.com>
2018-09-07 09:07:50 -05:00
vasukulkarni
93748a325c
Merge pull request #23944 from ceph/wip-s3a-update-mirror
qa/tasks: update mirror link for maven
2018-09-06 14:44:29 -07:00
Ilya Dryomov
592f566b4e qa/tasks/cram: tasks now must live in the repository
Commit 0d8887652d ("qa/tasks/cram: use suite_repo repository for all
cram jobs") removed hardcoded git.ceph.com links, but as it turned out
it is still used for nightlies.  There is no good way to accommodate
the different URL schemes, so let's get rid of URLs altogether.

Fixes: https://tracker.ceph.com/issues/27211
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-09-06 22:32:39 +02:00
Ilya Dryomov
e1c89b51c8 qa/tasks/workunit: factor out overrides and refspec logic
Allow for reuse in the cram task.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-09-06 22:31:55 +02:00
Patrick Donnelly
532f2880bd
Merge PR #23673 into master
* refs/pull/23673/head:
	qa: automate distro/kernel matrix for kclient

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Tested-by: Patrick Donnelly <pdonnell@redhat.com>
2018-09-06 10:57:32 -07:00
Patrick Donnelly
6ac1882dc4
qa: automate distro/kernel matrix for kclient
It's no longer necessary to pass `-k testing` to teuthology-suite. We're also
now regularly testing RHEL 7.5 kernel in upstream testing.

This work is prep for eventually integrating kclient into fs.

Fixes: http://tracker.ceph.com/issues/26995

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2018-09-06 10:23:59 -07:00
Vasu Kulkarni
13e100259e qa/tasks: update mirror link for maven, the original mirror no longer exists
Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
2018-09-05 17:08:24 -07:00
Mykola Golub
cec53e9bd2 qa/workunits/rbd: replace usage of 'rados rmpool'
This command was dropped.

Signed-off-by: Mykola Golub <mgolub@suse.com>
2018-09-05 22:52:20 +03:00
Jason Dillaman
0f0176ed4a qa/workunits/rbd: replace usage of 'rados mkpool'
This command was dropped under commit 2c26fb0fe1, so use
'ceph osd pool create' instead.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2018-09-05 08:17:39 -04:00
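
As a rough illustration of the replacement described in commit 0f0176ed4a above, the sketch below builds the `ceph osd pool create` argument list that a test helper might run in place of the dropped `rados mkpool`. The helper names and the default `pg_num` of 8 are assumptions made for this example and are not taken from the workunit itself. The deletion helper is included only for symmetry with the dropped `rados rmpool`; the commits here do not say which command replaced it in the tests.

```python
def pool_create_cmd(name, pg_num=8):
    """Return the argv for creating a pool with the mon command.

    `pg_num=8` is an arbitrary illustrative default, not a value from the source.
    """
    return ["ceph", "osd", "pool", "create", name, str(pg_num)]


def pool_delete_cmd(name):
    """Return the argv for deleting a pool (hypothetical helper).

    The pool name is given twice and the confirmation flag is required by the CLI.
    """
    return ["ceph", "osd", "pool", "delete", name, name,
            "--yes-i-really-really-mean-it"]


if __name__ == "__main__":
    print(" ".join(pool_create_cmd("rbd_test")))
    print(" ".join(pool_delete_cmd("rbd_test")))
```

The commands are returned as lists rather than strings so they can be handed to `subprocess.run` without shell quoting concerns.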
Shilpa Jagannath
14da3f07f3 Adds an identifier to the greenlet spawned by task 'link_toggle', which can then be looked up by task 'link_recover' to call end() on it. Also fixes task cleanup during unwind.
Signed-off-by: Shilpa Jagannath <smanjara@redhat.com>
2018-09-05 16:02:23 +05:30
Shilpa Jagannath
6da212f7cf Replaced hardcoded net interface names with yaml configurables.
Signed-off-by: Shilpa Jagannath <smanjara@redhat.com>
2018-09-05 16:02:13 +05:30
Shilpa Jagannath
e7d0cdacc1 qa: Task to emulate network delay and packet drop between two given hosts
The task uses netem to emulate wide area network delay.
Provides three different configurable options.
1. standard delay: Constant delay with +/- 5ms jitter, using a normal distribution by default.
2. variable delay: Provides a delay within a given min-max range in milliseconds.
3. packet drop: Toggles packet drop and recovery at regular intervals.

Useful in simulating network delays between two clusters while testing
rgw multisite and rbd mirroring configurations.

Signed-off-by: Shilpa Jagannath <smanjara@redhat.com>
2018-09-05 16:01:11 +05:30
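
The three netem modes listed in commit e7d0cdacc1 above can be sketched as `tc` command lines. This is a minimal illustration, not the task's actual code: the function names, the use of `tc qdisc replace`, and the use of 100% loss to model "packet drop" are all assumptions made for this example.

```python
import random


def netem_delay_cmd(iface, delay_ms, jitter_ms=5, distribution="normal"):
    """Constant delay with +/- jitter (5ms and normal distribution by default)."""
    return ["tc", "qdisc", "replace", "dev", iface, "root", "netem",
            "delay", f"{delay_ms}ms", f"{jitter_ms}ms",
            "distribution", distribution]


def netem_variable_delay_cmd(iface, min_ms, max_ms, rng=random):
    """Pick one delay inside [min_ms, max_ms]; call repeatedly to vary it."""
    delay = rng.randint(min_ms, max_ms)
    return ["tc", "qdisc", "replace", "dev", iface, "root", "netem",
            "delay", f"{delay}ms"]


def netem_packet_drop_cmd(iface, drop=True):
    """Toggle packet drop: 100% loss while dropping, 0% loss to recover.

    Modelling the toggle as full loss is an assumption of this sketch.
    """
    loss = "100%" if drop else "0%"
    return ["tc", "qdisc", "replace", "dev", iface, "root", "netem",
            "loss", loss]


if __name__ == "__main__":
    # "eth0" is a placeholder; the later commit 6da212f7cf makes interface
    # names configurable in yaml rather than hardcoded.
    print(" ".join(netem_delay_cmd("eth0", 50)))
    print(" ".join(netem_variable_delay_cmd("eth0", 10, 100)))
    print(" ".join(netem_packet_drop_cmd("eth0", drop=True)))
```

Running the commands requires root on the target host; the sketch only builds the argument lists so it can be inspected without touching network configuration.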
Ilya Dryomov
a0df578139
Merge pull request #23905 from idryomov/wip-cram-suite-repo
qa/tasks/cram: use suite_repo repository for all cram jobs

Reviewed-by: Nathan Cutler <ncutler@suse.com>
2018-09-04 14:27:28 +02:00
Lenz Grimmer
82412896ff
Merge pull request #23491 from p-na/per-osd-settings
mgr/dashboard: Add support for managing individual OSD settings in the backend

Reviewed-by: Sebastian Wagner <swagner@suse.com>
Reviewed-by: Stephan Müller <smueller@suse.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>
Reviewed-by: Volker Theile <vtheile@suse.com>
2018-09-04 12:01:19 +02:00
Ilya Dryomov
0d8887652d qa/tasks/cram: use suite_repo repository for all cram jobs
Currently git.ceph.com is hardcoded for all cram jobs.  Testing
modifications is a pain: one needs to push to either ceph/ceph.git or
ceph/ceph-ci.git (depending on where the ceph branch is at, triggering
unnecessary builds in the latter case) and wait for the mirror to sync.
Runs scheduled against branches in developer's forks fail.

Move away from git.ceph.com to allow mixing branches and repositories,
similar to workunits.

Fixes: https://tracker.ceph.com/issues/27211
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-09-03 22:07:20 +02:00
Lenz Grimmer
65781a4fa2
Merge pull request #23322 from ricardoasmarques/wip-role-management-api
mgr/dashboard: Add REST API for role management

Reviewed-by: Ricardo Dias <rdias@suse.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>
Reviewed-by: Volker Theile <vtheile@suse.com>
2018-09-03 15:43:02 +02:00
Ricardo Marques
592e4f7c10 mgr/dashboard: Add REST API for role management
Fixes: https://tracker.ceph.com/issues/25138

Signed-off-by: Ricardo Marques <rimarques@suse.com>
2018-09-03 13:15:42 +01:00
Patrick Nawracay
e71466cc49 mgr/dashboard: Add support for managing individual OSD settings (backend)
Add options to mark OSDs in/out/down/reweight/lost/remove/destroy/create

Fixes: http://tracker.ceph.com/issues/24270

Signed-off-by: Patrick Nawracay <pnawracay@suse.com>
2018-09-03 12:51:04 +02:00
Sage Weil
88df536908 Merge PR #23540 into master
* refs/pull/23540/head:
	include/ceph_fs: rename old auid field
	PendingReleaseNotes: note about auid support removal
	radosgw-admin: remove -a --auth-uid arg
	rgw: remove auid member from RGWUserInfo
	auth: remove auid member from EntityAuth
	osd: remove auid session member
	mon: remove auid session member
	doc/dev/cephx_protocol: drop auid reference
	auth: remove auid args from handle_request and verify_authorizer
	mon/OSDMonitor: remove 'osd pool {get,set} <name> auid ...'
	mon/OSDMonitor: remove auid arg for 'osd lspools' and deprecate
	osd/OSDCap: remove auid from grammar
	osd/OSDCap: remove auid from is_capable() etc args
	auth: clean up cap parse error messages
	mon/AuthMonitor: raise health warning on invalid caps
	mon/AuthMonitor: drop ancient auth inc encoding compat
	messages/MPoolOp: drop auid member
	osdc/Objecter: drop change_pool_auid
	pybind/rados: drop auid arg to pool_create
	pybind/rados: drop change_auid
	rados: drop mkpool, rmpool commands
	rados: remove 'chown' command
	librados: deprecate calls that take auid
	librados: mark all auid calls deprecated
	mon/OSDMonitor: drop variable pool auid for prepare_new_pool
	mon/OSDMonitor: remove pool auid change support
	osdc/Objecter: do not pass auid to create_pool
	ceph-authtool: remove auid options
	qa/workunits/cephtool: remove auid tests

Reviewed-by: Gregory Farnum <gfarnum@redhat.com>
2018-09-01 15:53:31 -05:00
Xie Xingguo
0857124d23
Merge pull request #23663 from xiexingguo/wip-incompat-async-fixes
osd: some recovery improvements and cleanups


Reviewed-by: Sage Weil <sage@redhat.com>
2018-09-01 14:27:27 +08:00
Sage Weil
35820f4b88 mon/AuthMonitor: raise health warning on invalid caps
Raise a health warning if we have invalid (unparsable) caps in the auth
database.  Include a simple test.

Signed-off-by: Sage Weil <sage@redhat.com>
2018-08-31 15:54:58 -05:00
Sage Weil
2c26fb0fe1 rados: drop mkpool, rmpool commands
- mkpool and rmpool users should use the normal cli/mon commands

Signed-off-by: Sage Weil <sage@redhat.com>
2018-08-31 09:27:36 -05:00
Sage Weil
d213b2531f rados: remove 'chown' command
Signed-off-by: Sage Weil <sage@redhat.com>
2018-08-31 09:27:36 -05:00
Sage Weil
d6def8ba11 ceph-authtool: remove auid options
Signed-off-by: Sage Weil <sage@redhat.com>
2018-08-31 09:26:19 -05:00
Sage Weil
eaca033d17 qa/workunits/cephtool: remove auid tests
Signed-off-by: Sage Weil <sage@redhat.com>
2018-08-31 09:26:19 -05:00
Ilya Dryomov
a4df8c3562 qa: rbd_workunit_kernel_untar_build: install build dependencies
Commit f0fe0936e6 ("qa: use recent kernel to kernel build testing")
bumped the kernel to 4.17.

Fixes: http://tracker.ceph.com/issues/35074
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-08-31 14:31:03 +02:00
xie xingguo
22786cffa8 osd/PG: force auth_log_shard to be primary when appropriate
So if there are a lot of missing objects on the primary, we can
make use of auth_log_shard to restore client I/O quickly.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2018-08-31 16:29:25 +08:00
Sage Weil
85083f39b5 Merge PR #23572 into master
* refs/pull/23572/head:
	qa/standalone/osd/osd-force-create-pg: add force-create-pg test
	mon/MonCommands: fix 'osd force-create-pg'

Reviewed-by: Kefu Chai <kchai@redhat.com>
2018-08-30 08:52:44 -05:00
Noah Watkins
ea15b625f3 qa/mgr/selftest: handle always-on module fall out
We need a non-always-on module. hello doesn't work because it isn't
installed, so switch to selftest.

Signed-off-by: Noah Watkins <nwatkins@redhat.com>
2018-08-28 13:45:58 -07:00
Patrick Donnelly
3aa392ca73
Merge PR #23439 into master
* refs/pull/23439/head:
	qa: whitelist cap revoke warning
	doc: document cap revoke non-responders client eviction
	test: validate client eviction for cap revoke non-responders
	mds: add counter for tracking cap non-responding clients
	mds: evict clients that do not respond to cap revoke by MDS
	mds: pass timeout argument for fetching late clients

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Zheng Yan <zyan@redhat.com>
2018-08-25 13:04:58 -07:00
Patrick Donnelly
4367de377e
qa: whitelist cap revoke warning
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2018-08-25 12:42:26 -07:00
David Zafman
b0d2c64d6b
Merge pull request #23376 from dzafman/wip-25108
object errors found in be_select_auth_object() aren't logged the same

Reviewed-by: Kefu Chai <kchai@redhat.com>
2018-08-23 13:23:55 -07:00
Josh Durgin
cc41b51c6a
Merge pull request #23518 from dzafman/wip-25084
osd: When possible check CRC in build_push_op() so repair can eventually stop

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2018-08-23 11:39:05 -07:00
David Zafman
687f63e599 test: Update tests for error message changes
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-08-23 11:09:22 -07:00
David Zafman
b40784290f qa: Add new message to whitelist for scrub/repair tests
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-08-23 11:09:22 -07:00
David Zafman
58c4d32203 test: Verify cluster logging of scrub error messages
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-08-23 11:09:22 -07:00
David Zafman
a4f2ca3186
Merge pull request #23695 from dzafman/wip-27056
test: Use pids instead of jobspecs which were wrong

Reviewed-by: Kefu Chai <kchai@redhat.com>
2018-08-23 10:45:03 -07:00
David Zafman
bc33170310 test: Use pids instead of jobspecs which were wrong
Fixes: http://tracker.ceph.com/issues/27056

Signed-off-by: David Zafman <dzafman@redhat.com>
2018-08-22 10:57:04 -07:00
Mykola Golub
645233ec82
Merge pull request #23630 from wjwithagen/wjw-fix-rbd-ggate-kldload
test/rbd: rbd_ggate test improvements

Reviewed-by: Mykola Golub <mgolub@suse.com>
2018-08-22 14:45:07 +03:00