If there are leftover merges at the end of the run, they can take a long
time to get through, blowing our timeouts for waiting for PGs to become
active and stop splitting/merging, and for scrubbing PGs. Stop all of that
at the end of the run so that we don't have to wait so long.
Signed-off-by: Sage Weil <sage@redhat.com>
We used to rely on the monmap bootstrap code to magically create a valid
monmap with named mons because our old-style ceph.conf had mon_addr
values in each mon.foo section. Instead, just feed it a real monmap
from pre-destruction.
In practice, a user can manually generate this monmap, or rename the
mons after the fact with --inject-monmap, or whatever. Out of scope
for this test, so we just do the simplest thing to make the rebuild test
work.
Signed-off-by: Sage Weil <sage@redhat.com>
Use the new config option type names (given by the cluster) in the
dashboard.
Fixes: http://tracker.ceph.com/issues/37843
Signed-off-by: Tatjana Dehler <tdehler@suse.com>
- if force-branch is set, use that
- otherwise:
  - read default-branch from client config
  - use the suite branch, or the ceph branch if the suite branch is not defined
  - if this branch is one of the official releases (or master), prefix
    it with 'ceph-'
  Try to clone the branch chosen above; if that fails (the branch probably
  doesn't exist) and force-branch is not set, use default-branch.
Also add an option to override ragweed repo.
Switched all force-branch from ragweed qa suite to default-branch.
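
The selection order described above can be sketched roughly as follows.
This is an illustration only, not the task's actual code: the names
`candidate_branches` and `OFFICIAL_RELEASES`, and the release names listed,
are hypothetical.

```python
# Hypothetical sketch of the ragweed branch selection described above.
# The set of release names is an illustrative assumption.
OFFICIAL_RELEASES = {"master", "luminous", "mimic"}


def candidate_branches(client_config, suite_branch=None, ceph_branch=None):
    """Return the branches to try cloning, in order of preference."""
    force = client_config.get("force-branch")
    if force:
        # force-branch wins outright and has no fallback.
        return [force]

    branch = suite_branch or ceph_branch
    if branch in OFFICIAL_RELEASES:
        branch = "ceph-" + branch

    # default-branch is the fallback if cloning the first candidate fails.
    ordered = []
    for name in (branch, client_config.get("default-branch")):
        if name and name not in ordered:
            ordered.append(name)
    return ordered
```

The caller would try each returned branch in turn and stop at the first
successful clone.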
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Otherwise the Mutation for Truncate is done on obj_id of the last iteration of the previous loop.
Fixes: http://tracker.ceph.com/issues/37836
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
* refs/pull/25621/head:
mds: allow boot on read-only
mds: setup readonly mode for PurgeQueue
mds: return string_view for type str
mds: add missing locks for PurgeQueue methods
mds: delete on_error context on des
Reviewed-by: Zheng Yan <zyan@redhat.com>
* refs/pull/25009/head:
librbd: stringify locker name with get_legacy_str()
osdc/Objecter: fix list_watchers addr rendering to match legacy
test/crimson: disable unittest_seastar_messenger test
msg/msg_types: encode entity_addr_t TYPE_ANY as TYPE_LEGACY for pre-nautilus
client: make blacklist detection handle TYPE_ANY entries
mon/OSDMonitor: maintain compat output for 'blacklist ls'
client: maintain compat for {inst,addr}_str in status dump
qa/tasks/ceph_manager: compare osd flush seq #'s as ints
qa/suites/fs: make use of simple.yaml where appropriate
qa/msgr: move msgr factet into generic re-usable dir
crimson: fix monmap build for seastar
doc/start/ceph.conf: trim the sample ceph.conf file
doc/rados/operations: only describe --public-{addr,network} method for adding mons
PendingReleaseNotes: deprecate 'mon addr'
doc: fix some 'mon addr' references
doc/rados/configuration: fix some 'mon addr' references
doc/rados/configuration/network-config-ref: revise network docs somewhat
doc/rados/configuration/network-config-ref: remove totally obsolete section
qa/suites/rados: replace mon_seesaw.py task with a small bash script
qa/suites/fs/upgrade: don't bind to v2 addrs
qa/tasks/mon_thrash: avoid 'mon addr' in mon section
mon/MonClient: disable ms_bind_msgr2 if NAUTILUS feature not set
osd/OSDMap: maintain compat addr fields
msg/msg_types: add get_legacy_str()
mds/MDSMap.h: maintain compat addr field
mon/MgrMap: maintain compat active_addr field
mon/MonClient: reconnect to mon if it's addrvec appears to have changed
qa/tasks/ceph.conf.template: increase mon_mgr_mkfs_grace
msg/async/ProtocolV2: fill in IP for all peer_addrs
msg/async: print all addrs on debug lines
mon/MonMap: no noname- mon name prefix when for_mkfs
ceph-monstore-tool: print initial monmap
msg/async/ProtocolV2: advertise ourselves as a v2 addr when using v2 protocol
msg/async: assert existing protocol matches current protocol
msg/async: add missing modelines
mon/MonMap: add missing modeline
vstart.sh: put mon addrs in mon_host, not 'mon addr'
msg/async: better debug around conn map lookups and updates
mon/MonClient: dump initial monmap at debug level 10
qa/standalone/osd/osd-fast-mark-down: use v1 addr w/ simplemessenger
qa/tasks/ceph: set initial monmap features with using addrvec addrs
monmaptool: add --enable-all-features option
qa/tasks/ceph: only use monmaptool --addv if addr has [,:v]
qa/tasks/ceph_manager: make get_mon_status use mon addr
qa/tasks/ceph: keep mon addrs in ctx namespace
mon/OSDMonitor: log all osd addrs on boot
msg/simple: behave when v2 and v1 addrs are present at target
mon/MonClient: warn if global_id changes
msg/Connection: add warning/note on get_peer_global_id
mds/MDSDaemon: clean up handle_mds_map debug output a bit
qa/suites/rados/upgrade: debug mds
mds/MDSRank: improve is_stale_message to handle addrvecs
msg/async: make loopback detect when sending to one of our many addrs
qa/suites/rados/upgrade: no aggressive pg num changes
mon/OSDMonitor: require nautilus mons for require_osd_release=nautilus
mon/OSDMonitor: require mimic mons for require_osd_release=mimic
qa/suites/rados/thrash-old-clients: use legacy addr syntax in ceph.conf
msg/async: preserve peer features when replacing a connection
qa/tasks/ceph.py: move methods from teuthology.git into ceph.py directly; support mon bind * options
mon/MonMap: adjust build_initial behavior for mkfs vs probe
mon/MonMap: improve ambiguous addr behavior
qa/suites/rados/upgrade: spread mons a bit
qa/rados/thrash-old-clients: keep mons on separate hosts
qa/standalone/mon/misc.sh: tweak test to be more robust
qa/tasks/mon_seesaw: expect v1/v2 prefix in addr
osd/OSDMap: fix is_blacklisted() check to assume type ANY
mon/OSDMonitor: use ANY addr type for blacklisting
mon/msg_types: TYPE_V1ORV2 -> TYPE_ANY
qa/workunits/cephtool: fix blacklist test
qa/suites/upgrade: install old version with only v1 addrs
common/options: by default, bind to both msgr v1 and v2 addresses
vstart.sh: add --msgr1, --msgr2, --msgr21 options
msg/async/ProtocolV2: be flexible with server identity check
msg/msg_types: fix entity_addrvec_t::parse() with null end arg
qa/suites/rados/basic/msgr: no msgr2 addrs in initial monmaps
qa/tasks/ceph: add 'mon_bind_addrvec' and 'mon_bind_msgr2' options
monmaptool: add --addv argument to pass in addrvec directly
qa/suites/rados/basic/msgr: do not use msgr2 with simplemessenger
qa/suites/rados/basic/msgr: async is not experimental
messages/MOSDBoot: fix compat with pre-nautilus
mon/MonMap: allow v1 or v2 to be explicitly specified along with part
msg/msg_types: allow parsing of IPs without assuming v1 vs v2
msg/msg_types: default parse to v2 addrs
msg: standarize on v1: and v2: prefixes for *all* entity_addr_t's
vstart.sh: use msgr2 by default
mon/MonMap: remove get_addr() methods
ceph-mon: adjust startup/bind/join sequence to use addrs
mon: use MonMap::get_addrs() (instead of get_addr())
mon/MonClient: change pending_cons to addrvec-based map
mon/MonMap: fix set_addr() caller, kill wrapper
mon/MonMap: remove addr-based add()
monmaptool: fix --add to do either legacy or msgr2+legacy
monmaptool: clean up iterator use a bit
mon/MonMap: handle ambiguous mon addrs by trying both legacy and msgr
mon/MonMap: take addrvec for set_initial_members
mon/MonMap: use addrvecs for test instances
mon: pass addrvec via MMonJoin
mon/MonmapMonitor: fix 'mon add' to populate addrvec
mon/MonMap: addr -> addrvec
msg/async/ProtocolV2: only update socket_addr if we learned our addr
osd: go active even if mon only accepted our v1 addr
test/msgr: add test for msgr2 protocol
msg/async/ProtocolV2: share socket_addr and all addrs during handshake
msg/async: print socket_addr for the connection
msg/async: msgr2 protocol placeholder
msg/async: move ProtocolV1 class to its own source file
msg/async: keep listen addr in ServerSocket, pass to new connections
msg/async/AsyncMessenger: fix set_addr_unknowns
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
The teuthology test did not like the change to remove 'mon addr' from
ceph.conf. The standalone script is easier to test.
Note that it avoids mon names 'a', 'b', 'c' since the MonMap::build_initial
uses those.
Signed-off-by: Sage Weil <sage@redhat.com>
The grace starts with the monmap creation stamp, and ceph.py does a lot
of work between creating that map and actually starting daemons (e.g.,
preparing all of the osd devices), leading to occasional MGR_DOWN errors.
Double the grace period.
Signed-off-by: Sage Weil <sage@redhat.com>
The --add option will only infer a bare IP to include a v2 addr if the
NAUTILUS feature is there, and that isn't normally present on a freshly
generated monmap. Add it if we are doing addrvecs!
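
As a rough sketch of the idea (not the actual ceph.py code), the monmaptool
command line could be assembled like this. The helper names and the
`bind_addrvec` parameter are hypothetical, and the `[,:v]` check is only one
reading of how an addrvec-style address might be recognised.

```python
import re


def looks_like_addrvec(addr):
    # Assumption: an address containing ',', ':' or 'v' is treated as an
    # explicit addrvec; a bare IPv4 address contains none of these.
    return re.search(r"[,:v]", addr) is not None


def monmaptool_args(mons, path, bind_addrvec=False):
    """Build a monmaptool argv for a {name: addr} mapping (sketch only)."""
    args = ["monmaptool", "--create", "--clobber"]
    if bind_addrvec:
        # Without the NAUTILUS feature, --add will not infer a v2 addr
        # from a bare IP on a freshly generated monmap.
        args.append("--enable-all-features")
    for name, addr in sorted(mons.items()):
        flag = "--addv" if looks_like_addrvec(addr) else "--add"
        args += [flag, name, addr]
    args.append(path)
    return args
```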
Signed-off-by: Sage Weil <sage@redhat.com>
Having these live in teuthology.git is silly, since they are only consumed
by the ceph task, and it is hard to revise the behavior.
Revise the behavior by adding mon_bind_* options.
Signed-off-by: Sage Weil <sage@redhat.com>
Updated health controller & test to reflect changes introduced
in 'df' payload.
Return 'total_used_raw_bytes' instead of 'total_used_bytes'
to match CLI 'bin/rados df' used/avail summary in
Landing Page (frontend component).
Do not return 'stats_by_class' to save bandwidth as they are
not needed (right now) in the dashboard.
Fixes: https://tracker.ceph.com/issues/37717
Signed-off-by: Alfonso Martínez <almartin@redhat.com>
* All pool controller methods with same default value
for stats flag.
* Stats requested explicitly by frontend service.
* Updated API tests accordingly.
Fixes: https://tracker.ceph.com/issues/36740
Signed-off-by: Alfonso Martínez <almartin@redhat.com>
1. To be able to run the cli without an external orchestrator.
2. Run the CLI in Teuthology.
Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
Move it up into CephTestCase so that mgr tests can
use it too, and pick it up in vstart_runner.py so
that these tests will work neatly there.
Signed-off-by: John Spray <john.spray@redhat.com>
The current solution fails on our CI system because some outputs can
contain more values, and some parameters such as 'w' can vary between
environments.
It worked before only because it had been tested solely in a vstart
cluster environment.
With this commit, only the attributes we know to be present are tested.
Fixes: https://tracker.ceph.com/issues/37275
Signed-off-by: Stephan Müller <smueller@suse.com>
* refs/pull/24940/head:
qa: add test for getfattr ceph.dir.pin
client: support getfattr ceph.dir.pin extended attribute
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
The change in b3e69a9609 broke the test's assumption that the endpoint
wouldn't be readable by block-manager. It doesn't look as though that's
actually problematic for the ECP controller, so just update the test to
use rgw-manager instead.
Signed-off-by: Zack Cerza <zack@redhat.com>
Some classes should still be imported directly from collections;
only OrderedDict, Iterable and Callable (in the context of the
ceph codebase) are found in collections.abc.
The current code works due to the fallback support for Python 2.
Signed-off-by: James Page <james.page@ubuntu.com>
This splits out the collection of health and log data from the
/api/dashboard/health controller into /api/health/{full,minimal} and
/api/logs/all.
/health/full contains all the data (minus logs) that /dashboard/health
did, whereas /health/minimal contains only what is needed for the health
component to function. /logs/all contains exactly what the logs portion
of /dashboard/health did.
By using /health/minimal, on a vstart cluster we pull ~1.4KB of data
every 5s, where we used to pull ~6KB; those numbers would get larger
with larger clusters. Once we split out log data, that will drop to
~0.4KB.
Fixes: http://tracker.ceph.com/issues/36675
Signed-off-by: Zack Cerza <zack@redhat.com>
* refs/pull/17526/head:
qa/tasks/ceph_manager: avoid test_map_discontinuity stall with too few up osds
Reviewed-by: Gregory Farnum <gfarnum@redhat.com>
Some tests have m=2,k=2 and this will break them. Sometimes even if we
have 5 up osds, we end up with 4 and CRUSH gets picky, so build in a
buffer and only do this if we have 6 up.
We don't have an easy way from here to see what the min up osds for healthy
is... basically this map discontinuity test just sucks.
Signed-off-by: Sage Weil <sage@redhat.com>
The behavior of `safe-to-destroy` has changed in
432f194355 (PR#24799) and the backend
needs to be adapted accordingly.
Fixes: http://tracker.ceph.com/issues/37290
Signed-off-by: Patrick Nawracay <pnawracay@suse.com>
Enabled ctx.managers to take cluster name from config in restart() method instead of default 'ceph'.
Signed-off-by: Shilpa Jagannath <smanjara@redhat.com>
Separate diskprediction local and cloud from the diskprediction plugin.
Devicehealth invokes the device prediction function according to the global
configuration option "device_failure_prediction_mode".
Signed-off-by: Rick Chen <rick.chen@prophetstor.com>
The new info endpoint will provide the frontend with the necessary
information it needs to create new profiles.
Fixes: https://tracker.ceph.com/issues/25156
Signed-off-by: Stephan Müller <smueller@suse.com>
- Fix bug in Dashboard QA unit test framework. Don't set the application type header manually, this is done by the requests library if required.
- Enhance QA unit test helper: Print the response of the API request if it fails. This should help to identify the problem more easily.
- Fix bug in the OSD controller. A parameter needs to be converted to integer.
- Take care that the params of the request object are not modified.
The issue was introduced by PR https://github.com/ceph/ceph/pull/24475. The CherryPy json_in plugin exposed the erroneous unit test helper implementation.
Fixes: https://tracker.ceph.com/issues/36708
Signed-off-by: Volker Theile <vtheile@suse.com>
Python 3.7 now shows a warning as below.
/usr/bin/ceph:128: DeprecationWarning: Using or importing the ABCs from
'collections' instead of from 'collections.abc' is deprecated, and in
3.8 it will stop working
import rados
This patch addresses that particular issue.
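
A minimal sketch of the usual import pattern that avoids this warning while
still working on Python 2 (the `describe` helper is purely illustrative and
not part of the patch):

```python
# Prefer collections.abc (Python 3.3+); fall back to collections on Python 2.
try:
    from collections.abc import Callable, Iterable
except ImportError:
    from collections import Callable, Iterable


def describe(obj):
    """Illustrative helper: classify an object using the imported ABCs."""
    if isinstance(obj, Callable):
        return "callable"
    if isinstance(obj, Iterable):
        return "iterable"
    return "other"
```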
Signed-off-by: Ganesh Maharaj Mahalingam <ganesh.mahalingam@intel.com>
This is related to http://tracker.ceph.com/issues/36453. It is far from
a complete solution, but seems like a positive move.
I tested this change by first disabling my browser cache, and then used
the /docs endpoint to query /api/dashboard/health. Before compression:
Content-Length: 60748
Time: 615ms
After:
Content-Length: 7505
Time: 92ms
Then, I logged into the dashboard as normal and reloaded the page once I
was in. Some values for the reload operation before compression:
Total page load time: 58.48s
vendor.js Content-Length: 6486025
vendor.js time: 48.09s
After:
Total page load time: 14.55s
vendor.js Content-Length: 1143178
vendor.js time: 4.50s
Signed-off-by: Zack Cerza <zack@redhat.com>
This fixes "TypeError: admin_socket() got an unexpected keyword argument
'timeout'". The value is never used.
Signed-off-by: Zack Cerza <zack@redhat.com>
If there is a workunit task associated with the same client, the two
tasks will attempt to clone the suite repo to the same directory.
Worse, if it's parallel tasks, the two clones will clobber each
other.
Fixes: http://tracker.ceph.com/issues/36542
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
* refs/pull/24292/head:
qa: add test for rctime on root inode
mds: set rctime on new system inode
mds: small refactor
Reviewed-by: Zheng Yan <zyan@redhat.com>
This makes it easier to re-run tests against a suite branch without
requiring a full ceph-ci build and repo.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Apparently 15m is not long enough for some workunits like fsstress.
Fixes: http://tracker.ceph.com/issues/36365
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
It is now commented out like it was before,
but I've added a comment explaining what happened during this test on the QA
system. The problem was that even with an increase of only 1 PG, the QA
cluster went into a cluster warning state and did not recover in time.
The QA coverage timeout is 2 minutes.
I could not reproduce this behavior with a local cluster, but I've
added a loop to wait until pgp and pg number are equal and the cluster
is in a healthy state again. This can take locally about 5 seconds.
The internal loop has a timeout of 3 minutes.
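
The wait loop is roughly of the following shape. This is a sketch under
assumptions rather than the test's actual code: `wait_until` and
`pool_settled` are hypothetical names, and the pool dict keys and health
string are illustrative.

```python
import time


def wait_until(predicate, timeout=180.0, interval=1.0,
               clock=time.monotonic, sleep=time.sleep):
    """Poll predicate() until it is true or the timeout (seconds) expires."""
    deadline = clock() + timeout
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


def pool_settled(pool, health_status):
    # Assumed field names: the pool is settled once pgp_num has caught up
    # with pg_num and the cluster reports a healthy state again.
    return (pool["pg_num"] == pool["pg_placement_num"]
            and health_status == "HEALTH_OK")
```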
Fixes: https://tracker.ceph.com/issues/36362
Signed-off-by: Stephan Müller <smueller@suse.com>
The dashboard backend can now unset all set compression arguments if the
compression mode is switched to 'unset'. In the case of 'unset' Ceph
itself will only delete the 'compression_mode' argument, not all other
set arguments. The other arguments that should be removed, too, are
added to the update arguments in order to delete all set arguments.
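
A sketch of the idea, assuming the reset values below (the commit message
does not state which values the backend actually sends, and
`expand_compression_unset` is a hypothetical name):

```python
# Assumed reset values for the remaining compression options.
COMPRESSION_RESETS = {
    "compression_algorithm": "unset",
    "compression_min_blob_size": "0",
    "compression_max_blob_size": "0",
    "compression_required_ratio": "0.0",
}


def expand_compression_unset(update_args):
    """Return a copy of update_args that also clears the other options."""
    expanded = dict(update_args)  # leave the caller's dict untouched
    if expanded.get("compression_mode") == "unset":
        expanded.update(COMPRESSION_RESETS)
    return expanded
```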
Fixes: https://tracker.ceph.com/issues/36355
Signed-off-by: Stephan Müller <smueller@suse.com>
Refactor '_get_mon_allow_pool_delete_config' method to be a little bit
more general. The method can now be used to get the value of every
config option known to the cluster.
Signed-off-by: Tatjana Dehler <tdehler@suse.com>
Otherwise a bug preventing an asok operation from completing will cause the
entire job to fail.
Fixes: http://tracker.ceph.com/issues/36335
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
When enabling a module, attempt to determine whether it is an always-on
module, and if it is, return without waiting for the active manager
daemon to restart, which it won't do for an always-on module.
Signed-off-by: Noah Watkins <nwatkins@redhat.com>
* refs/pull/21566/head:
test: add test for mds drop cache command
mds: command to trim mds cache and client caps
mds: implement journal flush as asynchronous context execution
mds: cleanup some asok commands
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
If there is a bug preventing rm from completing, the workunit will get stuck.
Fixes: http://tracker.ceph.com/issues/36184
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Otherwise QA sits forever waiting for the kclient to umount when there is a
problem.
Fixes: http://tracker.ceph.com/issues/36184
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Otherwise the command will hang if the mount is broken.
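
The general pattern, sketched here with the standard library rather than
the teuthology remote API (`run_with_timeout` is a hypothetical helper):

```python
import subprocess
import sys


def run_with_timeout(argv, timeout):
    """Run argv; return its exit code, or None if it exceeded the timeout."""
    try:
        proc = subprocess.run(argv, timeout=timeout,
                              stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child before raising, so a hung
        # command cannot block the caller forever.
        return None
    return proc.returncode


if __name__ == "__main__":
    # A command that would otherwise sleep for a minute is cut off after 1s.
    print(run_with_timeout([sys.executable, "-c",
                            "import time; time.sleep(60)"], timeout=1))
```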
Fixes: http://tracker.ceph.com/issues/36184
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
* refs/pull/23187/head:
test: make rank argument mandatory when running journal_tool
cephfs-journal-tool: make "--rank" argument mandatory
cephfs-journal-tool: pass local arg vector for Journal actions
cephfs-journal-tool: dump to per rank output file wherever necessary
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
* refs/pull/23530/head:
qa/vstart_runner: fix daemons list
PendingReleaseNotes: note multifs support in libcephfs
test/cephfs: add pybind test for mount_root
pybind/cephfs: enable passing filesystem name to mount
libcephfs: add ceph_select_filesystem
common: add doc strings to client_mds_namespace
client: allow passing fs name to mount()
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Conflicts:
PendingReleaseNotes
Two instances of fsstress clobber each other. Just build it in the local sandbox.
Fixes: http://tracker.ceph.com/issues/24177
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Specifically fixes the recurring `test_osd.py` error in the
`test_scrub` method. But this change should also prevent other issues of
the same kind, that is, issues which occur because tests do not
immediately result in a clean cluster status and aren't explicitly
programmed to wait for it.
Fixes: http://tracker.ceph.com/issues/36107
Signed-off-by: Patrick Nawracay <pnawracay@suse.com>
Also, fix a bunch of quirky journal_tool invocations that pass
"--rank" argument as the command argument rather than passing it
as function argument.
Fixes: https://tracker.ceph.com/issues/24780
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Updated integration tests to check data from new python code
Fixes: https://tracker.ceph.com/issues/24573
Signed-off-by: Alfonso Martínez <almartin@redhat.com>
This was missing a cluster name prefix that
was added at some point, and consequently
calls to iter_daemons_of_role were returning
no daemons.
This was causing e.g. TestVolumeClient.test_data_isolated
to fail when run in vstart_runner.
Signed-off-by: John Spray <john.spray@redhat.com>
This module is written by Rick Chen <rick.chen@prophetstor.com> and
provides both a built-in local predictor and a cloud mode that queries
a cloud service (provided by ProphetStor) to predict device failures.
Signed-off-by: Rick Chen <rick.chen@prophetstor.com>
Signed-off-by: Sage Weil <sage@redhat.com>
* refs/pull/20469/head:
osd/PG: remove warn on delete+merge race
osd: base project_pg_history on is_new_interval
osd: make project_pg_history handle concurrent osdmap publish
osd: handle pg delete vs merge race
osd/PG: do not purge strays in premerge state
doc/rados/operations/placement-groups: a few minor corrections
doc/man/8/ceph: drop enumeration of pg states
doc/dev/placement-groups: drop old 'splitting' reference
osd: wait for laggy pgs without osd_lock in handle_osd_map
osd: drain peering wq in start_boot, not _committed_maps
osd: kick split children
osd: no osd_lock for finish_splits
osd/osd_types: remove is_split assert
ceph-objectstore-tool: prevent import of pg that has since merged
qa/suites: test pg merging
qa/tasks/thrashosds: support merging pgs too
mon/OSDMonitor: mon_inject_pg_merge_bounce_probability
doc/rados/operations/placement-groups: update to describe pg_num reductions too
doc/rados/operations: remove reference to lpgs
osd: implement pg merge
osd/PG: implement merge_from
osdc/Objecter: resend ops on pg merge
osd: collect and record pg_num changes by pool
osd: make load_pgs remove message more accurate
osd/osd_types: pg_t: add is_merge_target()
osd/osd_types: pg_t::is_merge -> is_merge_source
osd/osd_types: adding or substracting invalid stats -> invalid stats
osd/PG: clear_ready_to_merge on_shutdown (or final merge source prep)
osd: debug pending_creates_from_osd cleanup, don't use cbegin
ceph-objectstore-tool: debug intervals update
mgr/ClusterState: discard pg updates for pgs >= pg_num
mon/OSDMonitor: fix long line
mon/OSDMonitor: move pool created check into caller
mon/OSDMonitor: adjust pgp_num_target down along with pg_num_target as needed
mon/OSDMonitor: add mon_osd_max_initial_pgs to cap initial pool pgs
osd/OSDMap: set pg[p]_num_target in build_simple*() methods
mon/PGMap: adjust SMALLER_PGP_NUM warning to use *_target values
mon/OSDMonitor: set CREATING flag for force-create-pg
mon/OSDMonitor: start sending new-style pg_create2 messages
mon/OSDMonitor: set last_force_resend_prenautilus for pg_num_pending changes
osd: ignore pg creates when pool FLAG_CREATING is not set
mgr: do not adjust pg_num until FLAG_CREATING removed from pool
mon/OSDMonitor: add FLAG_CREATING on upgrade if pools still creating
mon/OSDMonitor: prevent FLAG_CREATING from getting set pre-nautilus
mon/OSDMonitor: disallow pg_num changes while CREATING flag is set
mon/OSDMonitor: set POOL_CREATING flag until initial pool pgs are created
osd/osd_types: add pg_pool_t FLAG_POOL_CREATING
osd/osd_types: introduce last_force_resend_prenautilus
osd/PGLog: merge_from helper
osd: no cache agent or snap trimming during premerge
osd: notify mon when pending PGs are ready to merge
mgr: add simple controller to adjust pg[p]_num_actual
mon/OSDMonitor: MOSDPGReadyToMerge to complete a pg_num change
mon/OSDMonitor: allow pg_num to adjusted up or down via pg[p]_num_target
osd/osd_types: make pg merge an interval boundary
osd/osd_types: add pg_t::is_merge() method
osd/osd_types: add pg_num_pending to pg_pool_t
osd: allow multiple threads to block on wait_min_pg_epoch
osd: restructure advance_pg() call mechanism
mon/PGMap: prune merged pgs
mon/PGMap: track pgs by state for each pool
osd/SnapMapper: allow split_bits to decrease (merge)
os/bluestore: fix osr_drain before merge
os/bluestore: allow reuse of osr from existing collection
os/filestore: (re)implement merge
os/filestore: add _merge_collections post-check
os: implement merge_collection
os/ObjectStore: add merge_collection operation to Transaction
We currently import a portion of the PG if it has split. Merge is more
complicated, though, mainly because COT is operating in a mode where it
fast-forwards the PG to the latest OSDMap epoch, which means it has to
implement any transformations to the PG (split/merge) independently.
Avoid doing this for merge.
Signed-off-by: Sage Weil <sage@redhat.com>
Commit 0d8887652d ("qa/tasks/cram: use suite_repo repository for all
cram jobs") removed hardcoded git.ceph.com links, but as it turned out
it is still used for nightlies. There is no good way to accommodate
the different URL schemes, so let's get rid of URLs altogether.
Fixes: https://tracker.ceph.com/issues/27211
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Also:
- Do not print **offset** until specified
- Count missing objects correctly (used to be primary's local missing)
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
The task uses netem to emulate wide area network delay.
It provides three configurable options:
1. standard delay: constant delay with +/- 5ms jitter, using a normal distribution by default.
2. variable delay: a delay within a given min-max range in milliseconds.
3. packet drop: toggles packet drop and recovery at regular intervals.
Useful in simulating network delays between two clusters while testing
rgw multisite and rbd mirroring configurations.
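
For illustration, the three modes map onto `tc ... netem` invocations
roughly as below. The function names are hypothetical and this is not the
task's code; it only builds argument lists, which would then be run as root
on the remote host.

```python
import random


def netem_delay_cmd(iface, delay_ms, jitter_ms=5, distribution="normal"):
    """Standard delay: fixed delay with jitter drawn from a distribution."""
    return ["tc", "qdisc", "add", "dev", iface, "root", "netem",
            "delay", "%dms" % delay_ms, "%dms" % jitter_ms,
            "distribution", distribution]


def pick_variable_delay_ms(min_ms, max_ms, rng=random):
    """Variable delay: pick a delay within [min_ms, max_ms]."""
    return rng.randint(min_ms, max_ms)


def netem_loss_cmd(iface, percent=100):
    """Packet drop: drop the given percentage of packets."""
    return ["tc", "qdisc", "add", "dev", iface, "root", "netem",
            "loss", "%d%%" % percent]


def netem_clear_cmd(iface):
    """Recovery: remove the root qdisc to restore normal traffic."""
    return ["tc", "qdisc", "del", "dev", iface, "root"]
```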
Signed-off-by: Shilpa Jagannath <smanjara@redhat.com>
mgr/dashboard: Add support for managing individual OSD settings in the backend
Reviewed-by: Sebastian Wagner <swagner@suse.com>
Reviewed-by: Stephan Müller <smueller@suse.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>
Reviewed-by: Volker Theile <vtheile@suse.com>
Currently git.ceph.com is hardcoded for all cram jobs. Testing
modifications is a pain: one needs to push to either ceph/ceph.git or
ceph/ceph-ci.git (depending on where the ceph branch is at, triggering
unnecessary builds in the latter case) and wait for the mirror to sync.
Runs scheduled against branches in developer's forks fail.
Move away from git.ceph.com to allow mixing branches and repositories,
similar to workunits.
Fixes: https://tracker.ceph.com/issues/27211
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
mgr/dashboard: Add REST API for role management
Reviewed-by: Ricardo Dias <rdias@suse.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>
Reviewed-by: Volker Theile <vtheile@suse.com>
Add options to mark OSDs in/out/down/reweight/lost/remove/destroy/create
Fixes: http://tracker.ceph.com/issues/24270
Signed-off-by: Patrick Nawracay <pnawracay@suse.com>
* refs/pull/23439/head:
qa: whitelist cap revoke warning
doc: document cap revoke non-responders client eviction
test: validate client eviction for cap revoke non-responders
mds: add counter for tracking cap non-responding clients
mds: evict clients that do not respond to cap revoke by MDS
mds: pass timeout argument for fetching late clients
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Zheng Yan <zyan@redhat.com>
Enables changing (setting/unsetting) the values of dashboard settings
using the REST API.
Fixes: https://tracker.ceph.com/issues/24273
Signed-off-by: Patrick Nawracay <pnawracay@suse.com>
an ugly workaround for a python dependency conflict that's broken the
rgw/tempest suite. allows us to preserve the pinned versions of
keystone/tempest without having to maintain a fork of the keystone
repository
Fixes: http://tracker.ceph.com/issues/23659
Signed-off-by: Casey Bodley <cbodley@redhat.com>
'policy show' returns a json-encoded representation of
RGWAccessControlPolicy, while key.get_xml_acl() returns
RGWAccessControlPolicy_S3 encoded as xml. so even with '&format=xml',
the strings won't match
Signed-off-by: Casey Bodley <cbodley@redhat.com>
result.json() throws a 'JSONDecodeError: Expecting value: line 1 column 1'
for requests that return no body, such as 'user rm' 'key rm' 'subuser
rm', 'bucket unlink', etc
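
a minimal sketch of the guard, using the standard json module on the raw
response text (`parse_body` is a hypothetical helper, not the name used in
the task):

```python
import json


def parse_body(text):
    """Decode a JSON response body, returning None when there is no body."""
    if not text or not text.strip():
        # json.loads('') raises JSONDecodeError, so skip decoding entirely.
        return None
    return json.loads(text)
```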
Signed-off-by: Casey Bodley <cbodley@redhat.com>
* Assert `pg_placement_num` has the same value as `pg_num`.
* Only set `application_metadata`, if not None.
* `osd pool set` only accepts strings.
* Sync `pgp_num` with `pg_num`.
Signed-off-by: Stephan Müller <smueller@suse.com>
Avoid need for each module to expose a self-test
command: they can just implement the method,
and then get it called via the selftest module.
As well as reducing LOC, this means that the self
test commands do not clutter the interface
for end users, as they're invisible until
the selftest module is loaded.
Signed-off-by: John Spray <john.spray@redhat.com>
This is being done by passing native CPython objects
back and forth. It's safe because sub-interpreters in CPython
share memory allocation infrastructure and share the GIL.
With a view to PEP554, we limit inter-interpreter calls
to pickleable objects, so that this may be implemented
using byte-arrays in future.
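
The pickleable restriction can be illustrated with a small check. This is a
sketch only; `is_pickleable` and `roundtrip` are hypothetical helpers and
not part of the mgr module interface.

```python
import pickle


def is_pickleable(obj):
    """Return True if obj could be passed between interpreters as bytes."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


def roundtrip(obj):
    # What a future byte-array based transport would do with call arguments.
    return pickle.loads(pickle.dumps(obj))
```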
This infrastructure should enable:
- the dashboard to display the status of other modules, for
example the set of progress indicators from `progress`
- dashboard and restful to share an underlying long running
job mechanism.
Signed-off-by: John Spray <john.spray@redhat.com>
This fixes errors caused by remount done by some tests (test_recovery_pool.py)
where the fs name is not given.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
The MDS may not be on the same machine where the cluster command is run.
Fixes: http://tracker.ceph.com/issues/24858
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
* refs/pull/21885/head:
qa: update cluster log health warning message
qa: add tests for client features
mds: evict clients that lack required features
mds: cleanup MDSRank::evict_client
mds: infer client version by client metadata and connection's features
mds: introduce "ceph fs set <fs_name> min_compat_client <release_name>"
mds: tell client why it's rejected
mds: introduce cephfs' own feature bits
mds: make Server::prepare_force_open_sessions() update client metadata
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>