The ps output names daemons like 'type.foo', e.g., 'mgr.x'. Now that
the test_orchestrator impl is less bonkers this needs to be adjusted to
match reality.
Signed-off-by: Sage Weil <sage@redhat.com>
A first step to do more automatic code checks on the qa/
directory. This is useful while transitioning to python3.
Also move log_exc to the top level to avoid this type-check error:
error: Argument 1 to "log_exc" has incompatible type
"Callable[[OSDThrasher], Any]"; expected "OSDThrasher"
Signed-off-by: Thomas Bechtold <tbechtold@suse.com>
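A minimal sketch of a top-level decorator of the kind described above. Only
the names log_exc and OSDThrasher come from the commit message; the decorator
body, the logger name and the do_thrash method are assumptions made for
illustration, not the actual qa/ code.

```python
import logging
from functools import wraps

log = logging.getLogger("thrasher_sketch")  # hypothetical logger name


def log_exc(func):
    """Module-level decorator: log any exception from a method, then re-raise."""
    @wraps(func)
    def wrapper(self):
        try:
            return func(self)
        except Exception:
            log.exception("exception in %s", func.__name__)
            raise
    return wrapper


class OSDThrasher:
    @log_exc
    def do_thrash(self):  # hypothetical method used only for this sketch
        raise RuntimeError("boom")
```

Because log_exc is a plain function at module scope, a type checker sees its
first argument as the decorated callable rather than as an instance of the
class.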
Use io.BytesIO instead of cStringIO.StringIO
Use six.ensure_str whenever it needs to convert binary to str.
Signed-off-by: Kyr Shatskyy <kyrylo.shatskyy@suse.com>
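A sketch of the conversion pattern, assuming Python 3 only. It uses
bytes.decode() from the standard library where the commit uses
six.ensure_str, so that the example runs without six installed. The sample
payload is made up.

```python
import io

# Collect binary output in a BytesIO buffer (instead of cStringIO.StringIO).
buf = io.BytesIO()
buf.write(b"osd.0 up\n")
buf.write(b"osd.1 down\n")

# Convert binary to str only at the point where text is needed.
text = buf.getvalue().decode("utf-8")
lines = text.splitlines()
```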
this reverts 9639acfefe, as the test does
make sense. what fails this test is that the machinery to marshal/unmarshal
exceptions fails to handle un-picklable exceptions. the previous commit
is supposed to use a fallback to handle them.
Signed-off-by: Kefu Chai <kchai@redhat.com>
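a rough sketch of what such a fallback could look like, not the actual
implementation: marshal_exception and HoldsLock are hypothetical names, and
replacing the exception with a RuntimeError that carries its type name and
message is an assumption.

```python
import pickle
import threading


class HoldsLock(Exception):
    """Hypothetical exception that cannot be pickled (it carries a lock)."""
    def __init__(self, msg):
        super().__init__(msg)
        self.lock = threading.Lock()


def marshal_exception(exc):
    """Pickle exc; if that fails, pickle a plain stand-in exception instead."""
    try:
        return pickle.dumps(exc)
    except Exception:
        # Fallback: keep the type name and message, drop everything else.
        fallback = RuntimeError("{}: {}".format(type(exc).__name__, exc))
        return pickle.dumps(fallback)
```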
* refs/pull/33636/head:
qa: add upgrade test for volume upgrade from legacy
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Ramana Raja <rraja@redhat.com>
This tests that volumes created using the ceph_volume_client.py library
continue to be accessible/function via the Nautilus/Octopus ceph-mgr
volumes plugin.
Fixes: https://tracker.ceph.com/issues/42723
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
* refs/pull/33506/head:
client: add client_fs mount option support
Reviewed-by: Ramana Raja <rraja@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
- Display services and daemons in the cluster/services page.
- Display daemons in the cluster/hosts/host-detail page (Daemons tab).
This PR also partially addresses https://tracker.ceph.com/issues/43165:
The endpoint `/api/orchestrator/service` is removed.
Create new endpoints:
- `/api/service`: listing all services in the Ceph cluster.
- `/api/service/<service_name>/daemons`: listing daemons for a
service. e.g. daemons of OSD.
- `/api/host/<hostname>/daemons`: listing daemons of a host.
Fixes: https://tracker.ceph.com/issues/44221
Signed-off-by: Kiefer Chang <kiefer.chang@suse.com>
When thrashing, we don't want this markdown behavior. In
fact, the only teuthology test that expects this behavior is
qa/standalone/osd/osd-markdown.
This bit of yaml is shared by all thrashing tests (although the name is
a bit unfortunate).
Fixes: https://tracker.ceph.com/issues/44348
Signed-off-by: Sage Weil <sage@redhat.com>
"client_fs" is one alias for "client_mds_namespace=" and it will be
cleaner and more user-friendly to use. "client_mds_namespace="
will be kept for backwards compatibility.
Update the documents at the same time.
Fixes: https://tracker.ceph.com/issues/44212
Signed-off-by: Xiubo Li <xiubli@redhat.com>
* refs/pull/33552/head:
mgr/dashboard: Enhance user create CLI command to force password change
Reviewed-by: Stephan Müller <smueller@suse.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
* refs/pull/31200/head:
qa/cephfs: test case for auto reconnect after blacklisted
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
* refs/pull/33263/head:
qa/vstart_runner.py: make run()'s interface same as teuthology's run
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
* refs/pull/33427/head:
qa/cephfs: rewrite a bit of code xfstests_dev.py
qa/cephfs: update xfstests-dev deps for RHEL 8
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
* refs/pull/32657/head:
test: query using mds id, not rank
mgr: re-enable mds `scrub status` info in ceph status
mon: filter out ceph normal ceph entity types when dumping service metadata
mgr: filter out normal ceph services when processing service map
mgr: helper function to check if a service is a normal ceph service
Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
* refs/pull/33413/head:
test: verify purge queue w/ large number of subvolumes
test: pass timeout argument to mount::wait_for_dir_empty()
mgr/volumes: access volume in lockless mode when fetching async job
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
* refs/pull/33496/head:
mgr/cephadm: combine get_daemons_by_daemon -> get_daemons_by_service
mgr/cephadm: remove apply_mon support
mgr/cephadm: use generics for add_mon
mgr/cephadm: use _apply_service for mgrs
mgr/cephadm: refactor most daemon add methods
mgr/cephadm: refactor _update_service and all apply methods
mgr/cephadm: fix get_unique_name when name in use
Reviewed-by: Sebastian Wagner <swagner@suse.com>
Root subtree may be replicated which would open client sessions early.
Fixes: https://tracker.ceph.com/issues/43796
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Our apply method doesn't support removing mons at this point. And using
it for adding mons is just an awkward version of 'daemon add'.
Update docs and cephadm.py task accordingly.
Signed-off-by: Sage Weil <sage@redhat.com>
mgr/orchestrator: get_hosts return `HostSpec` instead of `InventoryDevice`
Reviewed-by: Juan Miguel Olmo Martínez <jolmomar@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
mgr/dashboard: Enforce password change upon first login
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Stephan Müller <smueller@suse.com>
Reviewed-by: Tiago Melo <tmelo@suse.com>
When the URL passed to "curl --silent ..." does not exist, the
resulting file will be populated with the string:
404: Not Found
If that (or something similar) happens, the file size will be
suspiciously low, like < 1000 bytes. Fail the test immediately in this
case.
Signed-off-by: Nathan Cutler <ncutler@suse.com>
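A sketch of the size check, assuming the download has already been written
to a local file. The function name check_download and the RuntimeError are
illustrative; only the 1000-byte threshold comes from the message above.

```python
import os

MIN_EXPECTED_SIZE = 1000  # bytes; threshold mentioned in the commit message


def check_download(path, min_size=MIN_EXPECTED_SIZE):
    """Fail immediately if the downloaded file is suspiciously small."""
    size = os.path.getsize(path)
    if size < min_size:
        raise RuntimeError(
            "{} is only {} bytes; the download probably failed".format(path, size))
    return size
```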
The line
rest.replace('.git/', '/')
was added to accommodate weird folks who run teuthology-suite
with an option like this:
--suite-repo https://github.com/ceph/ceph.git/
but they might just as well give the option like this:
--suite-repo https://github.com/ceph/ceph.git
Signed-off-by: Nathan Cutler <ncutler@suse.com>
The variable storing the major version number plays an important role
while updating deps, so give it a more descriptive name that is easier
to spot.
Also, add an explanation for why we have a list of deps for fedora and
remove a redundant line of code.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
The arguments accepted by teuthology's run method should be exactly the
same as those accepted by vstart_runner.py's run method, to prevent
test failures with teuthology due to a silly argument mismatch.
Teuthology's entry method run expects **kwargs, therefore so should
vstart_runner.py's run.
Fixes: https://tracker.ceph.com/issues/44117
Signed-off-by: Rishabh Dave <ridave@redhat.com>
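A minimal sketch of the interface idea. LocalRemoteSketch is a hypothetical
stand-in class, and the specific keyword names handled here (args,
check_status) are assumptions for illustration. The point is only that run()
takes **kwargs, so a caller written against teuthology does not fail on an
unexpected keyword.

```python
class LocalRemoteSketch:
    """Hypothetical stand-in for a local runner with a teuthology-like run()."""

    def run(self, **kwargs):
        # Pull out the keywords this sketch understands.
        args = kwargs.pop("args")
        check_status = kwargs.pop("check_status", True)
        # Any remaining keywords are accepted but ignored here, instead of
        # raising TypeError for an unexpected keyword argument.
        return {
            "args": args,
            "check_status": check_status,
            "ignored": sorted(kwargs),
        }
```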
We need to either fill out the list_daemons APIs in the dashboard and test
that, or redesign and reimplement the services abstractions in the
orchestration layer. Until then, disable this test.
Signed-off-by: Sage Weil <sage@redhat.com>
Introduce the following:
- A new layout component for the login pages.
- A new route called /login-change-password.
- A guard that checks if a user must change the password (ChangePasswordGuardService). If this is true, redirect to /login-change-password.
- Added LoginPasswordFormComponent (extends UserPasswordFormComponent) for the password form (looks similar to the login page).
Fixes: tracker.ceph.com/issues/24655
Signed-off-by: Volker Theile <vtheile@suse.com>
In addition to logging slow ops in mon and osd specific log files,
re-introduce logging the same information along with slow op type
details to cluster logs as well. The objective is to make debugging
slow ops easier.
Modify the log whitelisting string to "slow request" within qa suites in
order to make the search for the new warning log message within the
cluster log successful. This should not cause any issue as it's a
substring of the earlier string.
Fixes: https://tracker.ceph.com/issues/43975
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
Now angular tree is used instead of ng2-tree, as it provides a better
way to dynamically load children and a way to update all children
without losing track of everything.
The loading icon will rotate now on any fetch.
The tree will detect new directories and removed directories.
It's also now possible to select the root directory of a CephFS in order
to create snapshots of the whole FS.
Fixes: https://tracker.ceph.com/issues/42617
Signed-off-by: Stephan Müller <smueller@suse.com>
* refs/pull/33194/head:
qa: add tests for mds_join_fs cluster affinity
qa: update cluster warning message for removed MDS
doc: add section on new mds_join_fs behavior
mon/MDSMonitor: enforce mds_join_fs cluster affinity
mon/MDSMonitor: use type of info.rank or mds_rank_t
qa: accept operation on current fs status
qa: add method to enable multifs
qa: fix nested generator use
qa: manage config changes through mons
Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
This provides a generic framework for making Ceph configuration
changes in tests through the monitors rather than the asok interface or
local ceph.conf changes. Any changes are reverted during test teardown.
A future patch will convert existing tests manipulating the local
ceph.conf or admin socket.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
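A sketch of the set-then-revert idea, not the framework itself. ConfigTracker
and the run_cmd callback are hypothetical names, and reverting with
"ceph config rm" (rather than restoring a previous value) is an assumption
of this example.

```python
class ConfigTracker:
    """Record config changes made through the mons and undo them at teardown."""

    def __init__(self, run_cmd):
        self.run_cmd = run_cmd  # hypothetical callable that runs a command
        self._changed = []

    def config_set(self, section, key, value):
        self.run_cmd(["ceph", "config", "set", section, key, str(value)])
        self._changed.append((section, key))

    def teardown(self):
        # Undo in reverse order of application.
        while self._changed:
            section, key = self._changed.pop()
            self.run_cmd(["ceph", "config", "rm", section, key])
```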
- We keep ServiceDescription around unmodified (although it will need some
cleanup later)
- We add DaemonDescription, and clean out the service-related ambiguities
- Add a new list_daemons() method for Orchestrator
- Add a new 'ceph orch ps' command
- In cephadm, drop get_services(), and implement list_daemons()
- a million changes to make this work
- Adjust health alert and option names
Signed-off-by: Sage Weil <sage@redhat.com>
this test will end with a failure like
```
2020-01-30T18:15:15.870 INFO:tasks.ceph.mgr.x.smithi042.stderr:Warning: Permanently added 'smithi042.front.sepia.ceph.com,172.21.15.42' (ECDSA) to the list of known hosts.
2020-01-30T18:15:15.925 INFO:tasks.ceph.mgr.x.smithi042.stderr:Permission denied, please try again.
2020-01-30T18:15:15.932 INFO:tasks.ceph.mgr.x.smithi042.stderr:Permission denied, please try again.
2020-01-30T18:15:15.939 INFO:tasks.ceph.mgr.x.smithi042.stderr:root@smithi042.front.sepia.ceph.com: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
```
because mgr is not able to establish an ssh connection to that host as "root".
please note, the teuthology worker uses the "ubuntu" account on the
test node, and by default, "root" does not have its pubkey. and actually
`qa/tasks/cephadm.py` does push the pubkey to all the managed hosts before
testing cephadm.
since `qa/tasks/cephadm.py` is a better test for cephadm, and
suites/rados/cephadm already covers cephadm, let's just drop this one.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Give the cluster some time to recover from the unknown
PG state before checking if the OSD is safe to destroy.
Fixes: https://tracker.ceph.com/issues/43912
Signed-off-by: Tatjana Dehler <tdehler@suse.com>
* refs/pull/31633/head:
cephfs-shell: Instead of assert use stat for tests in rmdir
cephfs-shell: Add function for common rmdir test code
cephfs-shell: Add rmdir test for non empty directory
cephfs-shell: Add rmdir -p test for non empty directory
cephfs-shell: Add rmdir -p test for non existing dir
cephfs-shell: Add rmdir -p test to delete all dirs in given path
cephfs-shell: Add rmdir -p test for root directory with empty directories
cephfs-shell: Add rmdir test for valid file
cephfs-shell: Add rmdir test for invalid directory
cephfs-shell: Add rmdir test for valid directory
cephfs-shell: Fix rmdir '-p' issues
Reviewed-by: Rishabh Dave <ridave@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
This is harmless if logging is low, but adds useful info when it is turned
up.
Hunting bug https://tracker.ceph.com/issues/43914
Signed-off-by: Sage Weil <sage@redhat.com>
If we haven't scrubbed everything, we occasionally re-request scrub in case
the request was missed by the OSD (this can happen). But we were
re-requesting scrub on ALL pgs, and if they are done in a
semi-deterministic order and are slow, then we may never get to the final
ones.
Signed-off-by: Sage Weil <sage@redhat.com>
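A sketch of the selection step this implies, assuming the remedy is to
re-request scrub only for PGs that have not been scrubbed since the check
started. The function name and the shape of the input (a dict of PG id to
last scrub timestamp) are made up for illustration.

```python
def pgs_needing_scrub(last_scrub_stamps, start_stamp):
    """Return the PG ids whose last scrub is older than start_stamp."""
    return sorted(
        pgid for pgid, stamp in last_scrub_stamps.items()
        if stamp < start_stamp
    )
```

Re-requesting only this subset leaves already-scrubbed PGs alone, so slow
scrubs are not pushed back behind repeated requests for finished ones.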
Fixes:
2020-01-30T04:41:24.697 INFO:tasks.thrashosds.thrasher:fixing pg num pool None
2020-01-30T04:41:24.698 INFO:tasks.thrashosds.thrasher:Traceback (most recent call last):
File "/home/teuthworker/src/git.ceph.com_ceph-c_wip-sage-testing-2020-01-29-1034/qa/tasks/ceph_manager.py", line 1070, in wrapper
return func(self)
File "/home/teuthworker/src/git.ceph.com_ceph-c_wip-sage-testing-2020-01-29-1034/qa/tasks/ceph_manager.py", line 1200, in _do_thrash
self.choose_action()()
File "/home/teuthworker/src/git.ceph.com_ceph-c_wip-sage-testing-2020-01-29-1034/qa/tasks/ceph_manager.py", line 768, in fix_pgp_num
if self.ceph_manager.set_pool_pgpnum(pool, force):
File "/home/teuthworker/src/git.ceph.com_ceph-c_wip-sage-testing-2020-01-29-1034/qa/tasks/ceph_manager.py", line 2088, in set_pool_pgpnum
assert isinstance(pool_name, six.string_types)
AssertionError
Signed-off-by: Sage Weil <sage@redhat.com>
* refs/pull/32972/head:
python-common/ceph/deployment/translate: use 'prepare' instead of 'batch' for trivial case
qa/tasks/cephadm: pass short dev name to osd prepare
mgr/cephadm: fix detection of just-created OSDs
mgr/cephadm: properly indent raise conditions
mgr/cephadm: add warning to other orchestrators
mgr/cephadm: separate acceptance criterias for Devices
mgr/cephadm: fix typos
mgr/cephadm: move utils in test/utils.py
mgr/ssh: increase disk size to 20G
drivegroups: add support for drivegroups + tests
mgr/orch_cli: allow multiple drivegroups
drivegroups: translate disk spec to ceph-volume call
Reviewed-by: Jan Fajerski <jfajerski@suse.com>
Reviewed-by: Joshua Schmid <jschmid@suse.de>
Zap needs a full path, but create/prepare needs the VG/LV
only if it is an existing LV.
We'll make c-v more friendly later.
Signed-off-by: Sage Weil <sage@redhat.com>
* refs/pull/32750/head:
qa/suites/upgrade/*-x/stress-split: run latest python tests at end
qa/tasks/ceph_manager: kludge around /var/log/ceph permissions
mgr/pg_autoscaler: don't check anything until cluster is nautilus
qa/suites/upgrade: install python3-* as part of final upgrade step
qa/tasks/radosbench: only set object size if != block size
qa/tasks/ceph: simplify mon_health_to_clog suppression during restart
cls_hello: alias write_return_data -> writes_dont_return_data
ceph_test_cls_hello: only do returndata test on octopus+
qa: remove unnecessary package excludes in upgrades/nautilus*
qa: exclude cephadm from upgrade/nautilus-x
qa/suites/upgrade/mimic-x/parallel: fix msgr2 vs octopus ordering
qa/suites/upgrade/nautilus-x/stress-split: fix msgr2 vs octopus ordering
qa/suites/upgrade/mimic-x: fix msgr2 vs octopus ordering
qa/suites/upgrade/nautilus-x: end on octopus
qa/suites/upgrade/mimic-x: finish at octopus
qa/suites/upgrade/nautilus-x: disable TOO_FEW_PGS warning
qa/tasks/ceph: set mon_health_to_clog=false via mon config
qa/suites/upgrade/mimic-x: disable TOO_FEW_PGS warning
Reviewed-by: Kefu Chai <kchai@redhat.com>
* refs/pull/32788/head:
qa/tasks/mgr/dashboard: set pg_num to 32
mgr/pg_autoscaler: default to pg_num[_min] = 32
Reviewed-by: Sage Weil <sage@redhat.com>
The ceph.py task normally makes these permissive. But a package upgrade
can reset the permissions so that we can't read and write the temp
export files. (We put them in these dirs now because they are already
mapped out of cephadm containers to the host.)
Signed-off-by: Sage Weil <sage@redhat.com>
This is mostly pointless, except that the -O option for object size
used to be -o for pre-octopus, so passing -O breaks the upgrade tests.
Fortunately, the upgrades use the defaults, so we can just skate by here.
Signed-off-by: Sage Weil <sage@redhat.com>
* refs/pull/32787/head:
qa/tasks/cephadm: work around .git suffix on ceph_repo
qa/tasks/cephadm: learn to pull cephadm from github
Reviewed-by: Jan Fajerski <jfajerski@suse.com>
Using assert causes the tests to fail on teuthology due to this test being run
separately from the ceph cluster. Instead use stat for testing.
Signed-off-by: Varsha Rao <varao@redhat.com>
* refs/pull/31232/head:
test: test case for openfiletable MAX_ITEMS_PER_OBJ value verification
mds/OpenFileTable: match MAX_ITEMS_PER_OBJ to osd_deep_scrub_large_omap_object_key_threshold
Reviewed-by: Zheng Yan <zyan@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Rather than verify the latest OSDMap is the same as the rank's
osdmap_epoch_barrier, just use the rank's version. The OSDMap may change
out-of-band with the test startup and thus the epochs would diverge.
The file system and rank are fresh for each test so there's no reason to
care if the MDS barrier is one epoch behind the latest.
Fixes: https://tracker.ceph.com/issues/43554
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Instead of printing the (useless) traceback, just print a warning about
ignoring the failure. The traceback makes it harder to search for the
real problem in the teuthology log.
Fixes: https://tracker.ceph.com/issues/43718
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
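A minimal sketch of the logging change, with a hypothetical helper name and
logger name. The relevant difference is log.warning() with the exception
message instead of log.exception(), which would dump the full traceback.

```python
import logging

log = logging.getLogger("teardown_sketch")  # hypothetical logger name


def run_ignoring_failure(func, *args):
    """Call func; on failure, log a one-line warning and carry on."""
    try:
        return func(*args)
    except Exception as e:
        # No traceback here: it would only bury the real problem in the log.
        log.warning("ignoring failure of %s: %s", func.__name__, e)
        return None
```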
* refs/pull/32524/head:
qa/xfstests_dev: change deps for xfstests-dev on ubuntu
qa/cephfs: change deps for xfstests-dev on centos8
vstart_runnner: add sh method to LocalRemote
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
This was asserting that all PGs are active or peered, but that assertion
could fail if the concurrent workload created a new pool.
Switch to a loop that checks several times for the condition to be true.
Fixes: https://tracker.ceph.com/issues/43656
Signed-off-by: Sage Weil <sage@redhat.com>
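A sketch of the check-several-times pattern. The helper name, the number of
attempts and the delay are assumptions; the condition callable stands in for
the "all PGs are active or peered" check.

```python
import time


def wait_until(condition, attempts=5, delay=0.01):
    """Poll condition() up to `attempts` times; True as soon as it holds."""
    for _ in range(attempts):
        if condition():
            return True
        time.sleep(delay)
    return False
```

A single assertion fails on the first transient state, while the loop only
gives up if the condition stays false for every attempt.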
Rename python to python2 if Ubuntu distro release is 19 or later.
Fixes: https://tracker.ceph.com/issues/43522
Signed-off-by: Rishabh Dave <ridave@redhat.com>
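A sketch of the rename, assuming the dependencies are held in a plain list of
package names. The function name, its parameters and the sample packages in
the usage below are hypothetical.

```python
def adjust_deps(deps, distro, major_version):
    """Return deps with 'python' renamed to 'python2' on Ubuntu 19 or later."""
    if distro == "ubuntu" and major_version >= 19:
        return ["python2" if dep == "python" else dep for dep in deps]
    return list(deps)
```

For example, adjust_deps(["make", "python"], "ubuntu", 19) gives
["make", "python2"], while the same call with major version 18 leaves the
list unchanged.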
* refs/pull/32713/head:
qa/tasks/cephadm: start watching initial daemons before bootstrap
qa/tasks/cephadm: create /etc/ceph if it doesn't exist
qa/tasks/cephadm: fix log whitelist when there is no whitelist
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
This lets us see output while bootstrap is happening.
(Depends on the teuthology change to use journalctl, see
teuthology commit 4fa83040b05b604280789459f095d6f2ad1b0d01.)
Signed-off-by: Sage Weil <sage@redhat.com>