* refs/pull/26468/head:
qa: config recall settings to test cache drop
qa: check cache dump works without timeout
mds: add 2nd order recall throttle
mds: drive log flush and cache trim during recall
mds: avoid gather assertion when subs exist
mds: output full details for recall threshold
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
mgr/dashboard: Add support for managing RBD QoS
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Ricardo Dias <rdias@suse.com>
Reviewed-by: Stephan Müller <smueller@suse.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>
If we use the defaults, the MDS/client will recall/release everything quickly.
We want it to take time to see things like the timeout get hit.
Fixes: http://tracker.ceph.com/issues/38348
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
mgr/dashboard: Add UI to configure the telemetry mgr plugin
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Patrick Nawracay <pnawracay@suse.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>
allow a user of the orchestrator interface to express that the inventory
query should not read from any cached inventory state.
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
* refs/pull/26336/head:
qa/tasks/keystone.py: no need for notcmalloc in example
qa/suites/rgw/tempest/tasks/rgw_tempest: no need for notcmalloc
Reviewed-by: Alfredo Deza <adeza@redhat.com>
Reviewed-by: Yehuda Sadeh <yehuda@redhat.com>
mgr/orchestrator: Unify `osd create` and `osd add`
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Juan Miguel Olmo Martínez <jolmomar@redhat.com>
Also:
* Added some more tests
* Better validation of drive Groups
* Simplified `TestWriteCompletion`
Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
* refs/pull/26038/head:
mds: simplify recall warnings
mds: add extra details for cache drop output
qa: test mds_max_caps_per_client conf
mds: limit maximum number of caps held by session
mds: adapt drop cache for incremental recall
mds: recall caps incrementally
mds: adapt drop cache for incremental trim
mds: add throttle for trimming MDCache
mds: cleanup SessionMap init
mds: cleanup Session init
Reviewed-by: Zheng Yan <zyan@redhat.com>
Instead of a timeout and complicated decisions about whether the client is
releasing caps in an expeditious fashion, just use a DecayCounter that tracks
the number of caps we've recalled. This counter is decremented whenever the
client releases caps. If the counter passes a threshold, then we raise the
warning.
Similar reworking is done for the steady-state recall of client caps. Another
release DecayCounter is added so we can tell when the client is not releasing
any more caps.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
-filter out mons from other clusters
-fix parsing of mon name from role
Fixes: http://tracker.ceph.com/issues/38115
Signed-off-by: Casey Bodley <cbodley@redhat.com>
* When the creation of the cluster is delegated to vstart_runner.py
(--create or --create-target-only) the amount of MGRs required
is calculated by the script so there is no more skipped tests
due to insufficient amount of MGRs.
* Additionally, this issue is not reproducible anymore:
Fixes: https://tracker.ceph.com/issues/37964
* Fixed typo: TEUTHOLOFY_PY_REQS
Signed-off-by: Alfonso Martínez <almartin@redhat.com>
As with trimming, use DecayCounters to throttle the number of caps we recall,
both globally and per-session.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
`cache drop` is a long running command that will block the asok interface
(while the tell version does not). Attempting to abort the command with ^C or
equivalents will simply cause the `ceph` command to exit but won't stop the
asok command handler from waiting for the cache drop operation to complete.
Instead, just allow the tell version.
Fixes: http://tracker.ceph.com/issues/38020
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
* refs/pull/26012/head:
qa: add test that down fs does not ERR
mon/MDSMonitor: skip offline ERR for down fs
Reviewed-by: Douglas Fuller <dfuller@redhat.com>
* refs/pull/25973/head:
qa: use simpler fs fail to bring fs down
MDSMonitor: add fs fail command
Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Douglas Fuller <dfuller@redhat.com>
If there are leftover merges at the end of the run they can take a long
time to get through, blowing our timeout for (waiting for pgs to become
active and to stop splitting/merge) and scrubbing pgs. Stop all of that
at the end of the run so that we don't have to wait so long.
Signed-off-by: Sage Weil <sage@redhat.com>