The stress-split thrasher already had this off, but the ec variant did
not. We don't support ceph-objectstore-tool exports/imports between major
versions.
Fixes: http://tracker.ceph.com/issues/38294
Signed-off-by: Sage Weil <sage@redhat.com>
* refs/pull/26965/head:
ms/async/ProtocolV2: add ms_die_on_bug and assert rxbuf/txbuf don't get big
msg/async/ProtocolV2: do not reenable pre_auth buffering on from reset_recv_state
Reviewed-by: Ricardo Dias <rdias@suse.com>
* refs/pull/26898/head:
osd/PG: invalidate PG if merging with unexpected version
osd,mon: include more pg merge metadata in pg_pool_t
qa/standalone/osd/pg-split-merge.sh: reproduce pg merge problem with empty pgs
osd: add osd_debug_no_{acting_change,purge_strays}
Reviewed-by: Neha Ojha <nojha@redhat.com>
* refs/pull/26894/head:
qa/standalone/erasure-code/test-erasure-code: adjust test to avoid m=0
erasure-code: ensure m >= 1
mon/OSDMonitor: set ec min_size to k + min(1, m - 1)
Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
_DD is k=2 m=0, which we don't allow. Switch it to cDD.
I confess I don't fully understand why this was _DD to begin with, but
I'm pretty sure mapping is there to control the order of results so that
it can be mapped to the CRUSH rule output sanely, and the coding portion
is not relevant to the test.
Signed-off-by: Sage Weil <sage@redhat.com>
Valgrind makes the MDS slowwwww. The newish mds_heartbeat_grace config allows
us to keep sending beacons to the mons even if the internal heartbeat is slow.
This avoids the laggy messages which are useful to grep for unrelated messaging
issues.
Fixes: http://tracker.ceph.com/issues/38723
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
There is a need to introduce this new config option because the MgrModule::get_module_option() and MgrModule::get_localized_module_option() method will be refactored soon and will not support the default parameter anymore. Instead the default value must be configured in the MODULE_OPTIONS. Currently we misuse the server_port depending on if SSL is enabled or not.
Fixes: https://tracker.ceph.com/issues/38331
Signed-off-by: Volker Theile <vtheile@suse.com>
When mgr/selftest/testkey = foo and mgr/selftest/x/testkey is not set,
then get_localized() should return foo.
Signed-off-by: Sage Weil <sage@redhat.com>
The crush/builder.c crush_add_bucket method resizes the max_buckets array
but a power of 2 when it has to expand, but the code in CrushWrapper was
assuming that if the array grew the pos for the new bucket would be the
last position in the new array. This led to a situation where the
crush_choose_arg_map args array size didn't match max_buckets, and
eventually caused a crash.
Fixes: http://tracker.ceph.com/issues/38664
Signed-off-by: Sage Weil <sage@redhat.com>
If the source or target PG version is 0'0, we may silently take the max
of the source and target and still leave the PG complete. This
specifically can happen with an empty PG, as seen with bug 38655. In
theory we could encounter one of the PGs with some other last_update
that doesn't match what we expect. If that ever happens, make sure the
result is incomplete so that backfill can clean up.
Additionally check that the pool metadata for the last merge matches the
PGs at all. This could mismatch if we have an osdmap gap and are forced
to do some merge without merge info at all... in which case we should
definitely invalidate: there should be newer copies of the PG(s), and we
have no idea whether the PGs we are merging are what we want. If this is
some disaster recovery situation, an operator is always free to use
ceph-objectstore-tool to re-mark a PG complete (at their own peril!).
Fixes: http://tracker.ceph.com/issues/38655
Signed-off-by: Sage Weil <sage@redhat.com>
Bluestore caused grep crash with "grep: memory exhausted" due to
size of "block" storage.
Fixes: http://tracker.ceph.com/issues/38678
Signed-off-by: David Zafman <dzafman@redhat.com>
Simple messenger is on it's way out and it doesn't work with msgr2.
Fixes: http://tracker.ceph.com/issues/38676
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
It's now "1/2 unfound":
1/2 objects unfound (50.000%)
..presumably due to the rbd pool init creating the rbd_directory.
Signed-off-by: Sage Weil <sage@redhat.com>
- no need for the default pool size
- no initial osds or it will collide with setup_osds later
- no need for rbd pool at all
Signed-off-by: Sage Weil <sage@redhat.com>
* refs/pull/26794/head:
mon/MgrMonitor: only try to update always_on_modules if >= NAUTILUS
qa/standalone/mon/msgr-v2-transition: add some tests for enabling msgr v2
mon/MonmapMonitor: add 'ceph mon set-addrs <name> <addrvec>' command
Revert "mon/MonClient: disable ms_bind_msgr2 if NAUTILUS feature not set"
mon/OSDMonitor: use legacy_equals to compare osd addrs
msg/msg_types: make legacy_equals() symmetrical
mon/MDSMonitor: stop using get_orig_source_inst()
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
* refs/pull/26770/head:
qa/standalone/osd/osd-force-create-pg: create more pgs
qa/standalone: make sure an osd is running before create_rbd_pool
Reviewed-by: Mykola Golub <mgolub@suse.com>
We've disabled the "clean" shutdown in ceph-mgr due to
https://tracker.ceph.com/issues/38621
Until then, no valgrind leak checks!
Signed-off-by: Sage Weil <sage@redhat.com>
Also:
* Small test_orchestrator refactorization
* Improved Docstring in MgrModule.remote
* Added `raise_if_exception` that raises Exceptions
* Added `OrchestratorError` and `OrchestratorValidationError`
* `_orchestrator_wait` no longer raises anything
* `volumes` model also calls `raise_if_exception`
Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
We rename ceph_test_rados_api_tier to add _pp, so the mimic version doesn't
work. And in any case, at this stage the client host has master installed.
Signed-off-by: Sage Weil <sage@redhat.com>
'rbd pool init' now does IO. Drop the pool, or change the pool size to 1.
Fixes: http://tracker.ceph.com/issues/38585
Signed-off-by: Sage Weil <sage@redhat.com>
Add a new 'ceph orchestrator nfs update' command that will take the
NFS clustername and a new count as arguments. That will get translated
to a StatelessServiceSpec and passed to update_stateless_service.
Also, add the necessary stubs to the test_orchestrator and the CLI
QA test.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
For large clusters, we use device classes to isolate storage pools.
The existing 'osd df' output turns out to be too nosiy, say, if
you care about only single storage pool with osds possibly spanning over
all hosts.
With this change you are now being able to do 'osd df' by class (or by pool,
if you simply use classes to separate different pools), or by a specified
crush bucket name you are currently interested in, which is much more
convenient.
Some examples:
```
$ bin/ceph osd df tree
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS TYPE NAME
-1 0.05878 - 60 GiB 6.4 GiB 23 MiB 0 B 6 GiB 54 GiB 10.60 1.00 - root default
-3 0.02939 - 30 GiB 3.2 GiB 12 MiB 0 B 3 GiB 27 GiB 10.60 1.00 - host ceph11
3 aaa 0.00980 1.00000 10 GiB 1.1 GiB 3.9 MiB 0 B 1 GiB 9.0 GiB 10.60 1.00 56 up osd.3
4 bbb 0.00980 1.00000 10 GiB 1.1 GiB 3.9 MiB 0 B 1 GiB 9.0 GiB 10.60 1.00 58 up osd.4
5 ccc 0.00980 1.00000 10 GiB 1.1 GiB 3.9 MiB 0 B 1 GiB 9.0 GiB 10.60 1.00 60 up osd.5
-5 0.02939 - 30 GiB 3.2 GiB 12 MiB 0 B 3 GiB 27 GiB 10.60 1.00 - host ceph12
0 aaa 0.00980 1.00000 10 GiB 1.1 GiB 3.9 MiB 0 B 1 GiB 9.0 GiB 10.60 1.00 50 up osd.0
1 bbb 0.00980 1.00000 10 GiB 1.1 GiB 3.9 MiB 0 B 1 GiB 9.0 GiB 10.60 1.00 61 up osd.1
2 ccc 0.00980 1.00000 10 GiB 1.1 GiB 3.9 MiB 0 B 1 GiB 9.0 GiB 10.60 1.00 51 up osd.2
TOTAL 60 GiB 6.4 GiB 23 MiB 0 B 6 GiB 54 GiB 10.60
MIN/MAX VAR: 1.00/1.00 STDDEV: 0
$ bin/ceph osd df tree class aaa
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS TYPE NAME
-1 0.05878 - 20 GiB 2.1 GiB 7.8 MiB 0 B 2 GiB 18 GiB 10.60 1.00 - root default
-3 0.02939 - 10 GiB 1.1 GiB 3.9 MiB 0 B 1 GiB 9.0 GiB 10.60 1.00 - host ceph11
3 aaa 0.00980 1.00000 10 GiB 1.1 GiB 3.9 MiB 0 B 1 GiB 9.0 GiB 10.60 1.00 56 up osd.3
-5 0.02939 - 10 GiB 1.1 GiB 3.9 MiB 0 B 1 GiB 9.0 GiB 10.60 1.00 - host ceph12
0 aaa 0.00980 1.00000 10 GiB 1.1 GiB 3.9 MiB 0 B 1 GiB 9.0 GiB 10.60 1.00 50 up osd.0
TOTAL 20 GiB 2.1 GiB 7.8 MiB 0 B 2 GiB 18 GiB 10.60
MIN/MAX VAR: 1.00/1.00 STDDEV: 0
$ bin/ceph osd df tree name ceph11
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS TYPE NAME
-3 0.02939 - 30 GiB 3.2 GiB 12 MiB 0 B 3 GiB 27 GiB 10.60 1.00 - host ceph11
3 aaa 0.00980 1.00000 10 GiB 1.1 GiB 3.9 MiB 0 B 1 GiB 9.0 GiB 10.60 1.00 56 up osd.3
4 bbb 0.00980 1.00000 10 GiB 1.1 GiB 3.9 MiB 0 B 1 GiB 9.0 GiB 10.60 1.00 58 up osd.4
5 ccc 0.00980 1.00000 10 GiB 1.1 GiB 3.9 MiB 0 B 1 GiB 9.0 GiB 10.60 1.00 60 up osd.5
TOTAL 30 GiB 3.2 GiB 12 MiB 0 B 3 GiB 27 GiB 10.60
MIN/MAX VAR: 1.00/1.00 STDDEV: 0
```
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
Instead of generating three tests, each with bluestore-bitmap.yaml, it
generates four tests: one consisting of just bluestore-bitmap.yaml and
the other three without any trace of bluestore. This was introduced in
commit 711df71790 ("qa: objectstore snippets for krbd").
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
This test introduces a map gap. What *should* happen is that when there is
such a gap, we cannot import. Previously, the test didn't reliably produce
a map gap at all, and didn't check that import failed--it verified that it
passed.
Fix the test so that it reliably produces a gap *and* reports
min_last_epoch_clean to the mon so we can trim. Then verify we fail to
import, but can with --force. But remove the pg again, because if we
force an import with a map gap the osd will refuse to start.
Fixes: http://tracker.ceph.com/issues/38525
Signed-off-by: Sage Weil <sage@redhat.com>
* refs/pull/26638/head:
doc: update documentation for standby-replay
qa: update discontinous map test to use mds freezing
mon: add freeze MDS command
qa: update testing for standby-replay
mon: add setting for fs to enable standby-replay
ceph-mds: obsolete hot-standby option
fs: obsolete standby_for config options
messages/MMDSBeacon: use inline init
mds: avoid unnecessary copy of entity_addrvec_t
mds: use inline init for mds_info_t
mds: use rank from MDSMap always
mds: remove obsolete comment
qa: use SIGTERM when stopping vstart service
Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Zheng Yan <zyan@redhat.com>
These have bit-rotted and no longer work. No cycles from interested parties
available to fix.
Fixes: https://tracker.ceph.com/issues/38487
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
When run with valgrind, it takes a significant amount of time to complete.
Fixes: http://tracker.ceph.com/issues/38520
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Client unmount during test cleanup will hang if the file system was deleted.
Fixes: http://tracker.ceph.com/issues/38518
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
mgr/dashboard: Configure all mgr modules in UI
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>
This better mimics the behavior of teuthology and tests rbd-mirror
daemon's ability to handle a pool deletion.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
For those with multiple storage pools sharing the same devices,
I think it would make much more sense to offer per-pool
commands to bring pools with high priority, e.g., because they
are hosting data of more importance than others, back to normal
quickly.
Fixes: http://tracker.ceph.com/issues/38456
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
* refs/pull/26468/head:
qa: config recall settings to test cache drop
qa: check cache dump works without timeout
mds: add 2nd order recall throttle
mds: drive log flush and cache trim during recall
mds: avoid gather assertion when subs exist
mds: output full details for recall threshold
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
* refs/pull/26237/head:
Revert "qa: update test_envlibrados_for_rocksdb.sh for libradospp split"
doc/librados: explicitly mention that the C++ API is not stable
ceph.spec: force use of upgrade devtoolset-gcc under RHEL 7
librados: add symbol versioning to the C++ API
librados: add symbol versioning to the C API
librados: revert librados3/libradoscc back to librados2
Reviewed-by: Kefu Chai <kchai@redhat.com>
Currently each time NFS page is opened and NFS Ganesha is not configured
an error notification is thrown and no extra information is given.
Now the user will be redirected to an information page.
Removed the orchestrator information since it no longer applies.
Fixes: http://tracker.ceph.com/issues/38399
Signed-off-by: Tiago Melo <tmelo@suse.com>
For backwards compatibility and upgrade reasons, the librados2
API needs to be preserved and it needs to continue to be compatible
with dependent libraries like librbd1.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
The test extracts the mon addresses from the monmap, but with the
recent v2 format change it extracted an invalid address.
Fixes: http://tracker.ceph.com/issues/38385
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
* refs/pull/26485/head:
qa/suites/upgrade/luminous-x: force clone v1 format for final rbd python test
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
mgr/dashboard: Add support for managing RBD QoS
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Ricardo Dias <rdias@suse.com>
Reviewed-by: Stephan Müller <smueller@suse.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>
If we use the defaults, the MDS/client will recall/release everything quickly.
We want it to take time to see things like the timeout get hit.
Fixes: http://tracker.ceph.com/issues/38348
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
The markdown test is based on marking down a specific number of times, but
the duplicate commands from the CLI may not get absorbed/batched by the
mon, breaking the test. Override the default qa/tasks/workunit.py
behavior of sending dups.
Fixes: http://tracker.ceph.com/issues/38359
Signed-off-by: Sage Weil <sage@redhat.com>
* refs/pull/24805/head:
qa/suite: add dedup test
src/tools: fix compile error (master version issue)
src/tools: add stats (fixed objects,total objects)
src/tools: make room for cdc
src/tools: make enhacned stats and interface class
src/tools: set timelimit and add signal handler to check progress
src/tools: use the slice thing and make parallel (chunk_scrub)
src/test: add max-thread test in test_dedup_tool.sh
src/tools: use the slice thing and make parallel
src/test: add chunk-scrub test in test_dedup_tool.sh
src/tools: add chunk-scrub op in dedup tool
src/cls/cas: add has_chunk op
src/test: add test_dedup_tool.sh
src/tools: initial works for dedup tool
Reviewed-by: Sage Weil <sage@redhat.com>
* refs/pull/26455/head:
qa/suites/upgrade/mimic-x/stress-split: drop pglog_hardlimit test
qa/suites/upgrade/mimic-x/stress-split: update for msgr2
qa/suites/upgrade/mimic-x/parallel: update for msgr v2
Reviewed-by: Neha Ojha <nojha@redhat.com>
on all distributions
OpenSUSE does not automatically add the | back when setting
the corepattern. I tested this on openSUSE Leap 15.0.
Fixes: http://tracker.ceph.com/issues/38325
Signed-off-by: David Zafman <dzafman@redhat.com>
This reverts commit 5ba6286834.
Signed-off-by: David Zafman <dzafman@redhat.com>
Conflicts:
qa/run-standalone.sh (reseting core_pattern moved to function)
mgr/dashboard: Add UI to configure the telemetry mgr plugin
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Patrick Nawracay <pnawracay@suse.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>
Otherwise it's annoying because the class list changes between luminous and nautilus,
and we don't want to futz around with changing this setting during the upgrade.
The problematic classes are 'cas' (added) and 'sdk' (not enabled by default but
included by the cls/ workunit.
Signed-off-by: Sage Weil <sage@redhat.com>
The luminous version is (1) not what we want and (2) will fail because
ceph_test_rados_api_tier no longer exists in master.
Signed-off-by: Sage Weil <sage@redhat.com>
This caused qa/standalone/misc/test-ceph-helpers.sh to fail
"MGR_MODULE_DEPENDENCY 8 mgr modules have failed dependencies"
Fixes: http://tracker.ceph.com/issues/38262
Signed-off-by: David Zafman <dzafman@redhat.com>
allow a user of the orchestrator interface to express that the inventory
query should not read from any cached inventory state.
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
* refs/pull/26336/head:
qa/tasks/keystone.py: no need for notcmalloc in example
qa/suites/rgw/tempest/tasks/rgw_tempest: no need for notcmalloc
Reviewed-by: Alfredo Deza <adeza@redhat.com>
Reviewed-by: Yehuda Sadeh <yehuda@redhat.com>
* refs/pull/25977/head:
qa/suites: exclude new packages when installing old versions
rpm: add dependency on python-kubernetes module to ceph-mgr-rook package
rpm,deb: add rbd_support module to ceph-mgr
packaging: split ceph-mgr diskprediction and rook plugins into own packages
Reviewed-by: Tim Serong <tserong@suse.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
* refs/pull/26059/head:
mon/MonClient: fix keepalive with v2 auth
msg/async/ProtocolV2: reject peer_addrs of -
msg/async/ProtocolV2: clean up feature management
mon/MonClient: set up rotating_secrets, etc before msgr ready
msg/async: let client specify preferred order of modes
msg/async/ProtocolV2: include entity_name, features in reconnect
msg/async/ProtocolV2: fix write_lock usage around AckFrame
qa/suites/rados/verify/validator/valgrind: debug refs = 5
qa/standalone/ceph-helpers: fix health_ok test
auth/AuthRegistry: only complain about disabling cephx if cephx was enabled
auth/AuthRegistry: fix locking for get_supported_methods()
auth: remove AUTH_UNKNOWN weirdness, hardcoded defaults.
msg/async/ProtocolV2: remove unused get_auth_allowed_methods
osd: set up messener auth_* before setting dispatcher (and going 'ready')
mon/AuthMonitor: request max_global_id increase from peon in tick
mon: prime MgrClient only after messengers are initialized
qa/suites/rados/workloads/rados_api_tests.yaml: debug mgrc = 20 on mon
auth: document Auth{Client,Server} interfaces
auth: future-proof AUTH_MODE_* a bit in case we need to change the encoding byte
mon/MonClient: request monmap on open instead of ping
mgr/PyModuleRegistry: add details for MGR_MODULE_{DEPENDENCY,ERROR}
crimson: fix build
mon/MonClient: finsih authenticate() only after we get monmap; fix 'tell mgr'
mon: add auth_lock to protect auth_meta manipulation
ceph-mon: set up auth before binding
mon: defer initial connection auth attempts until initial quorum is formed
mon/MonClient: make MonClientPinger an AuthCleint
ceph_test_msgr: use DummyAuth
auth/DummyAuth: dummy auth server and client for test code
mon/Monitor: fix leak of auth_handler if we error out
doc/dev/cephx: re-wordwrap
doc/dev/cephx: document nautilus change to cephx
vstart.sh: fix --msgr2 option
msg/async/ProtocolV2: use shared_ptr to manage auth_meta
auth/Auth{Client,Server}: pass auth_meta in explicitly
mon/MonClient: behave if authorizer can't be built (yet)
osd: set_auth_server on client_messenger
common/ceph_context: get_moduel_type() for seastar cct
auth: make connection_secret a std::string
auth,msg/async/ProtocolV2: negotiate connection modes
auth/AuthRegistry: refactor handling of auth_*_requred options
osd,mgr,mds: remove unused authorize registries
switch monc, daemons to use new msgr2 auth frame exchange
doc/dev/msgr2: update docs to match implementation for auth frames
auth/AuthClientHandler: add build_initial_request hook
msg/Messenger: attach auth_client and/or auth_server to each Messenger
auth: introduce AuthClient and AuthServer handlers
auth: codify AUTH_MODE_AUTHORIZER
msg/Connection: track peer_id (id portion of entity_name_t) for msgr2
auth/AuthAuthorizeHandler: add get_supported_methods()
auth/AuthAuthorizeHandler: fix args for verify_authorizer()
auth: constify bufferlist arg to AuthAuthorizer::add_challenge()
auth/cephx: share all tickets and connection_secret in initial reply
msg/async,auth: add AuthConnectionMeta to Protocol
auth/AuthClientHandler: pass in session_key, connection_secret pointers
auth/AuthServiceHandler: take session_key and connection_secret as args
auth/cephx: pass more specific type into build_session_auth_info
mon/Session: separate session creation, peer ident, and registration
mon/AuthMonitor: bump max_global_id from on_active() and tick()
mon/AuthMonitor: be more careful with max_global_id
mon: only all ms_handle_authentication() if auth method says we're done
mon/AuthMonitor: fix "finished with auth" condition check
auth: clean up AuthServiceHandler::handle_request() args
auth: clean up AuthServiceHandler::start_session()
mon/AuthMonitor: drop unused op arg to assign_global_id()
msg/async: separate TAG_AUTH_REQUEST_MORE and TAG_AUTH_REPLY_MORE
msg/async: consolidate authorizer checks
msg/async: move get_auth_allowed into ProtocolV2.cc
mon/MonClient: trivial cleanup
Reviewed-by: Greg Farnum <gfarnum@redhat.com>
Stopping the osd daemon won't reliably get you HEALTH_WARN or ERR; you have
to make sure it is also marked down.
Signed-off-by: Sage Weil <sage@redhat.com>
Seeing some hangs when the mon is forwarding mgr commands (pg deep-scrub)
to the mgr. This is a buggy test (it should send it to the mgr directly)
but it is helpful to verify the mon forwarding behavior works.
Signed-off-by: Sage Weil <sage@redhat.com>
mgr/orchestrator: make use of @CLICommand
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Juan Miguel Olmo Martínez <jolmomar@redhat.com>
Reviewed-by: Noah Watkins <noahwatkins@gmail.com>
krbd was being tested with filestore, up until recently when the
default for osd_objectstore was changed to bluestore. This broke
rbd_simple_big.yaml because bluestore_block_size defaults to 10G.
Pick up the sepia setting of 90G from bluestore-bitmap.yaml.
Run fsx subsuite with both filestore and bluestore.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
failure_reason: '"2019-02-03 22:52:41.561332 osd.10 (osd.10) 790 : cluster [WRN] slow
request 30.154662 seconds old, received at 2019-02-03 22:52:11.406639: osd_op(client.56148.0:39092
8.9 8.70387d99 (undecoded) ondisk+retry+write+known_if_redirected e1372) currently
waiting for peered" in cluster log'
We're restarting OSDs, and may see slow requests in the process.
Signed-off-by: Sage Weil <sage@redhat.com>
Discard no longer guarantees zeroing, use BLKZEROOUT and "fallocate -z"
instead (blkdiscard(8) in xenial doesn't support -z).
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
mgr/orchestrator: Unify `osd create` and `osd add`
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Juan Miguel Olmo Martínez <jolmomar@redhat.com>
Also:
* Added some more tests
* Better validation of drive Groups
* Simplified `TestWriteCompletion`
Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
* refs/pull/26038/head:
mds: simplify recall warnings
mds: add extra details for cache drop output
qa: test mds_max_caps_per_client conf
mds: limit maximum number of caps held by session
mds: adapt drop cache for incremental recall
mds: recall caps incrementally
mds: adapt drop cache for incremental trim
mds: add throttle for trimming MDCache
mds: cleanup SessionMap init
mds: cleanup Session init
Reviewed-by: Zheng Yan <zyan@redhat.com>
Instead of a timeout and complicated decisions about whether the client is
releasing caps in an expeditious fashion, just use a DecayCounter that tracks
the number of caps we've recalled. This counter is decremented whenever the
client releases caps. If the counter passes a threshold, then we raise the
warning.
Similar reworking is done for the steady-state recall of client caps. Another
release DecayCounter is added so we can tell when the client is not releasing
any more caps.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
-filter out mons from other clusters
-fix parsing of mon name from role
Fixes: http://tracker.ceph.com/issues/38115
Signed-off-by: Casey Bodley <cbodley@redhat.com>
* When the creation of the cluster is delegated to vstart_runner.py
(--create or --create-target-only) the amount of MGRs required
is calculated by the script so there is no more skipped tests
due to insufficient amount of MGRs.
* Additionally, this issue is not reproducible anymore:
Fixes: https://tracker.ceph.com/issues/37964
* Fixed typo: TEUTHOLOFY_PY_REQS
Signed-off-by: Alfonso Martínez <almartin@redhat.com>
As with trimming, use DecayCounters to throttle the number of caps we recall,
both globally and per-session.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
The pool and namespace can now be specified as in a
<pool-name>[/<namespace-name>] format as positional
arguments.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>