* add comment to _run_tests()
* use `os.path.commonpath()` instead of matching strings directly when
matching a given workunit spec against executables (see the sketch below)
* allow passing optional args to workunit
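A minimal sketch of the commonpath matching, assuming a spec is a
directory prefix of the executables it selects (the helper name is
illustrative, not the actual qa/tasks/workunit.py code):

    import os

    def spec_matches(spec, executable):
        # "suites" matches "suites/blogbench.sh", while a plain string
        # prefix test would also wrongly let "suite" match it
        return os.path.commonpath([spec, executable]) == spec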
Signed-off-by: Kefu Chai <kchai@redhat.com>
As it is the shell that interprets ">>" and redirects stderr to the
given file, and the shell process is launched as ubuntu:ubuntu without
using sudo, the command fails with a "Permission denied" error. To
address this issue, in this change a file with the proper privileges is
created beforehand using `install`, so the shell is able to write to it.
Also, instead of creating this file in `maybe_redirect_stderr()`, that
function now returns the command to create the log file.
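A hedged sketch of the shape this could take (the function signature
and log path handling are illustrative):

    def maybe_redirect_stderr(flavor, args, log_path):
        # the shell appending via ">>" runs as ubuntu:ubuntu without
        # sudo, so the log file must already exist and be writable by
        # that user; return the command that pre-creates it instead of
        # creating it here
        if flavor == 'crimson':
            args += ['2>>', log_path]
            return f'sudo install -b -m 666 /dev/null {log_path}'
        return None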
Signed-off-by: Kefu Chai <kchai@redhat.com>
* refs/pull/36131/head:
doc: document cephfs mirroring dev work
test: add tests for `ceph fs mirror` family of commands
mds: track filesystem mirror peers in fsmap
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
* refs/pull/36472/head:
qa/workunits/fs: add test for subvolume
mds: don't move inode with nlink > 1 to global snaprealm if it's in subvolume
mds: disallow hardlink across subvolume
mds: disallow across subvolume rename
mds: disallow creating snapshot on descendent directory of subvolume
mds: add vxattr that marks/clears subvolume flag
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
This new method should allow better control over the process launched
by the passed command, by accepting the arguments supported by
teuthology.orchestra.run.run().
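A hedged sketch of such a method on the mount object (the method body
is illustrative); the point is that callers gain access to all of
run()'s keyword arguments:

    def run_shell_payload(self, payload, **kwargs):
        # kwargs such as timeout, stdin, check_status or omit_sudo are
        # forwarded straight to teuthology.orchestra.run.run()
        return self.client_remote.run(args=['bash', '-c', payload], **kwargs)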
Signed-off-by: Rishabh Dave <ridave@redhat.com>
Allow only the characters [A-Za-z0-9-_.] in FS, volume, subvolume and
subvolume group names, and add a test for the same.
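A minimal sketch of the validation (the helper name is illustrative):

    import re

    # note the '-' placed last inside the class so it stays literal
    NAME_RE = re.compile(r'^[A-Za-z0-9_.-]+$')

    def validate_name(name):
        if not NAME_RE.match(name):
            raise ValueError(f'invalid name: {name!r}')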
Signed-off-by: Rishabh Dave <ridave@redhat.com>
Since py2 is EOL and cephadm requires py3 anyway, this patch removes
the py2 test iteration from the functional testing suite.
Signed-off-by: Paul Cuzner <pcuzner@redhat.com>
This commit adds the file paths of the ceph log directories to the
job's info.yaml log file. The motivation behind this is that, in case
of a job timeout, the logs can still be transferred to the teuthology
host before the test machines are nuked, using these ceph log
directory paths from the job's info.yaml log file.
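A hedged sketch of recording the paths; the key name 'ceph_log_dirs'
is an assumption, not necessarily the key the commit uses:

    import yaml

    def record_log_dirs(info_yaml_path, log_dirs):
        with open(info_yaml_path) as f:
            info = yaml.safe_load(f) or {}
        info['ceph_log_dirs'] = log_dirs  # e.g. ['/var/log/ceph']
        with open(info_yaml_path, 'w') as f:
            yaml.safe_dump(info, f, default_flow_style=False)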
Signed-off-by: Shraddha Agrawal <shraddha.agrawal000@gmail.com>
the test for diskprediction_cloud is never enabled, and the
cloud-based service it used is no longer reachable. Let's just remove
the dead code.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Previously, the peer uuid variable was empty, which resulted in a
failure to remove the duplicate peer.
Fixes: https://tracker.ceph.com/issues/47007
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
mgr/dashboard/api: reduce amount of daemon logs
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
"rbd snap create" now reports progress. Pass --no-progress, as in
commit b5a5fea9e2 ("test/cli-integration/rbd: tweak after snap create
started to show progress").
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
qa/tests: use "-k distro" for all suites (except krbd)
Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
* refs/pull/36351/head:
qa/tasks/cephfs/nfs: Add tests for cluster config set and reset
doc/cephfs/nfs: Update the doc about 'reset' and 'set' config interfaces
mgr/volumes/nfs: Add interface for adding user defined configuration
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
If a subvolume's mode or uid/gid values are changed after a snapshot
is taken, and a clone of a snapshot from before the change is
initiated, the clone inherits the current source subvolume's
attributes rather than the snapshot's attributes.
Fix this by using the snapshot's subvolume root attributes to create
the clone subvolume's root.
The following attributes are picked from the source subvolume snapshot:
- uid, gid, mode, data pool, pool namespace, quota
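Illustratively (the real mgr/volumes code operates on CephFS paths
through libcephfs, not the local filesystem, but the idea is the same):

    import os

    def copy_snapshot_attrs(snap_root, clone_root):
        st = os.stat(snap_root)
        os.chmod(clone_root, st.st_mode)
        os.chown(clone_root, st.st_uid, st.st_gid)
        # data pool, pool namespace and quota live in xattrs
        # (e.g. ceph.dir.layout.pool) and would be carried over similarly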
Fixes: https://tracker.ceph.com/issues/46163
Signed-off-by: Shyamsundar Ranganathan <srangana@redhat.com>
* refs/pull/36501/head:
qa: add tests for mds_min_caps_working_set
mds: add working set minimum for caps
qa: use config_set/config_get
qa: do not append file names to dirname
qa: add exception for test timeouts
Reviewed-by: Douglas Fuller <dfuller@redhat.com>
Otherwise the files generated are not actually under the
sub-directory! This corrects a confusing aspect of the test
infrastructure but doesn't actually require any changes to the tests.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
To make this easier to catch. It is still a RuntimeError so it should
not affect current tests by default.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
We should redirect stderr for the crimson flavor instead of the
default flavor. This change addresses a regression introduced by
da76f46461
Signed-off-by: Kefu Chai <kchai@redhat.com>
The API call is a task, and the response status is determined by
whether the call completes within a pre-defined duration (2 seconds).
We should also allow the status returned when the call takes longer.
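Illustratively, the test would accept both outcomes (the exact helper
names in the dashboard test suite may differ):

    def assert_task_response(status_code):
        # 200/201 when the task finishes within the 2-second window,
        # 202 Accepted when it keeps running and completes later
        assert status_code in (200, 201, 202), status_code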
Fixes: https://tracker.ceph.com/issues/46812
Signed-off-by: Kiefer Chang <kiefer.chang@suse.com>
* refs/pull/36155/head:
qa: Fix traceback during fs cleanup between tests
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Rishabh Dave <ridave@redhat.com>
* refs/pull/35944/head:
qa: defer cleaning the mountpoint's netnses and the bridge
qa/tasks/cephfs/mount.py: remove the stale netnses and bridge
qa/tasks/cephfs/mount.py: try to flush the stale ceph-brx dev info
qa/tasks/cephfs/mount.py: switch to run_shell_payload() helper
qa/tasks/cephfs/mount.py: clean up the none used code
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Rishabh Dave <ridave@redhat.com>
mgr/cephadm: fix call to cephadm for daemon restarts etc
Reviewed-by: Juan Miguel Olmo Martínez <jolmomar@redhat.com>
Reviewed-by: Michael Fritch <mfritch@suse.com>
Reviewed-by: Varsha Rao <varao@redhat.com>
* extract get_ragweed_branch() out of the download() task, for better
readability.
* use a loop to retry when the first clone fails (see the sketch below)
* drop the `raise ValueError()` clause as it never happens. We could
use an assert() here, but I don't think it is necessary anyway.
* use sh() instead of run() for better readability.
* always set ragweed_repo. Before this change this variable was
unbound if `force-branch` was set.
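A hedged sketch of the retry loop (the sh() helper and two-attempt
policy come from the description; everything else is illustrative):

    def clone_ragweed(sh, repo, branch, dest, tries=2):
        for _ in range(tries):
            try:
                sh(f'git clone -b {branch} {repo} {dest}')
                break
            except Exception:
                # the clone can fail transiently; clean up and retry
                sh(f'rm -rf {dest}')
        else:
            raise RuntimeError(f'failed to clone {repo} after {tries} tries')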
Fixes: https://tracker.ceph.com/issues/46771
Signed-off-by: Kefu Chai <kchai@redhat.com>
The 'mon_allow_pool_delete' option is set to 'True' in the 'setUp' of
'TestVolumes' and is cleared in the corresponding 'tearDown' function.
Hence, any pool deletion in parent classes such as 'CephFSTestCase'
would fail. This patch fixes that by setting the
'mon_allow_pool_delete' config option in 'CephFSTestCase' itself.
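A minimal sketch, assuming the qa CephTestCase base class and its
config_set() helper:

    class CephFSTestCase(CephTestCase):
        def setUp(self):
            super().setUp()
            # allow pool deletion for every cephfs test, not just TestVolumes
            self.config_set('mon', 'mon_allow_pool_delete', True)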
Fixes: https://tracker.ceph.com/issues/46597
Signed-off-by: Kotresh HR <khiremat@redhat.com>
The netnses may be created/deleted many times over the whole set of
test cases, so we can defer cleaning them until the last mountpoint is
unmounted or the test is exiting.
Fixes: https://tracker.ceph.com/issues/46282
Signed-off-by: Xiubo Li <xiubli@redhat.com>
mgr/dashboard: wait longer for health status to be cleared
Reviewed-by: Ni-Feng Chang <kiefer.chang@suse.com>
Reviewed-by: Volker Theile <vtheile@suse.com>
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Stephan Müller <smueller@suse.com>
We could have the following log message:
tasks.ceph:Waiting for all PGs to be active+clean and split+merged, waiting on ['2.6', '2.5', '1.0', '2.4'] to go clean and/or [] to split/merge
if the cluster has non-active+clean PGs when the "ceph" task is about
to end. But this message is a little confusing in that it lists "[]"
in it.
In this change, only the PGs actually being waited on are listed. Also
added some cleanups:
* use "else" to check whether the loop was terminated by a break (see
the sketch below)
* remove "0" from the range() call
Signed-off-by: Kefu Chai <kchai@redhat.com>
If the previous test cases failed, the netnses and bridge will be
left behind. Remove them here when new test cases begin.
Fixes: https://tracker.ceph.com/issues/45806
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Once we have run the test cases and the ceph-brx bridge is set up, the
config is saved in "/etc/sysconfig/network-scripts/ifcfg-ceph-brx" or
somewhere else, and it is kept even after the ceph-brx bridge is
removed. So the next time the ceph-brx bridge is created or added, it
will read the config from that file, and when we configure it again we
will get an error like:
"RTNETLINK answers: File exists"
Here we need to flush the old config before configuring the bridge.
Fixes: https://tracker.ceph.com/issues/45817
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Test that the osd doesn't crash when it gets a bad incremental osdmap.
Related-to: https://tracker.ceph.com/issues/46443
Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
Because of reasons, the cluster needs more time to recover from
HEALTH_WARN while changes are made by `test_pool_update_metadata`.
Let's wait several times for the cluster status to be HEALTH_OK
again.
Fixes: https://tracker.ceph.com/issues/46573
Signed-off-by: Tatjana Dehler <tdehler@suse.com>
mgr/dashboard: increase API test coverage in API controllers
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Volker Theile <vtheile@suse.com>
keystone's dependencies are installed using its tox.ini,
which in turn uses the constraints file at
https://releases.openstack.org/constraints/upper/ussuri,
which pins cliff to 3.1.0. That is not able to fulfill the requirement
of osc-lib 2.2.0, as the latter needs cliff>=3.2.0. Per
https://releases.openstack.org/ussuri/, the latest osc-lib for
ussuri is 2.0.0, and osc-lib>=2.0.0 is required by
python-openstackclient 5.2.1, so let's use that instead of the latest
one.
If we install cliff==3.1.0 along with python-openstackclient==5.2.1,
we get the following error, as the `CommandManager.add_command_group()`
method was only added to cliff in 3.2.0 (see
8477c4dbd0),
so cliff fails to work with the latest openstackclient:
2020-06-29T17:26:23.402 INFO:teuthology.orchestra.run.smithi039.stderr:'CommandManager' object has no attribute 'add_command_group'
2020-06-29T17:26:23.402 INFO:teuthology.orchestra.run.smithi039.stderr:Traceback (most recent call last):
2020-06-29T17:26:23.403 INFO:teuthology.orchestra.run.smithi039.stderr: File "/home/ubuntu/cephtest/keystone/.tox/venv/lib/python3.6/site-packages/cliff/app.py", line 264, in run
2020-06-29T17:26:23.403 INFO:teuthology.orchestra.run.smithi039.stderr: self.initialize_app(remainder)
2020-06-29T17:26:23.403 INFO:teuthology.orchestra.run.smithi039.stderr: File "/home/ubuntu/cephtest/keystone/.tox/venv/lib/python3.6/site-packages/openstackclient/shell.py", line 133, in initialize_app
2020-06-29T17:26:23.403 INFO:teuthology.orchestra.run.smithi039.stderr: super(OpenStackShell, self).initialize_app(argv)
2020-06-29T17:26:23.403 INFO:teuthology.orchestra.run.smithi039.stderr: File "/home/ubuntu/cephtest/keystone/.tox/venv/lib/python3.6/site-packages/osc_lib/shell.py", line 442, in initialize_app
2020-06-29T17:26:23.404 INFO:teuthology.orchestra.run.smithi039.stderr: self._load_plugins()
2020-06-29T17:26:23.404 INFO:teuthology.orchestra.run.smithi039.stderr: File "/home/ubuntu/cephtest/keystone/.tox/venv/lib/python3.6/site-packages/openstackclient/shell.py", line 104, in _load_plugins
2020-06-29T17:26:23.404 INFO:teuthology.orchestra.run.smithi039.stderr: self.command_manager.add_command_group(cmd_group)
2020-06-29T17:26:23.404 INFO:teuthology.orchestra.run.smithi039.stderr:AttributeError: 'CommandManager' object has no attribute 'add_command_group'
2020-06-29T17:26:23.404 INFO:teuthology.orchestra.run.smithi039.stderr:Traceback (most recent call last):
2020-06-29T17:26:23.405 INFO:teuthology.orchestra.run.smithi039.stderr: File "/home/ubuntu/cephtest/keystone/.tox/venv/lib/python3.6/site-packages/osc_lib/shell.py", line 134, in run
2020-06-29T17:26:23.405 INFO:teuthology.orchestra.run.smithi039.stderr: ret_val = super(OpenStackShell, self).run(argv)
2020-06-29T17:26:23.405 INFO:teuthology.orchestra.run.smithi039.stderr: File "/home/ubuntu/cephtest/keystone/.tox/venv/lib/python3.6/site-packages/cliff/app.py", line 264, in run
2020-06-29T17:26:23.405 INFO:teuthology.orchestra.run.smithi039.stderr: self.initialize_app(remainder)
2020-06-29T17:26:23.405 INFO:teuthology.orchestra.run.smithi039.stderr: File "/home/ubuntu/cephtest/keystone/.tox/venv/lib/python3.6/site-packages/openstackclient/shell.py", line 133, in initialize_app
2020-06-29T17:26:23.405 INFO:teuthology.orchestra.run.smithi039.stderr: super(OpenStackShell, self).initialize_app(argv)
2020-06-29T17:26:23.406 INFO:teuthology.orchestra.run.smithi039.stderr: File "/home/ubuntu/cephtest/keystone/.tox/venv/lib/python3.6/site-packages/osc_lib/shell.py", line 442, in initialize_app
2020-06-29T17:26:23.406 INFO:teuthology.orchestra.run.smithi039.stderr: self._load_plugins()
2020-06-29T17:26:23.406 INFO:teuthology.orchestra.run.smithi039.stderr: File "/home/ubuntu/cephtest/keystone/.tox/venv/lib/python3.6/site-packages/openstackclient/shell.py", line 104, in _load_plugins
2020-06-29T17:26:23.406 INFO:teuthology.orchestra.run.smithi039.stderr: self.command_manager.add_command_group(cmd_group)
2020-06-29T17:26:23.406 INFO:teuthology.orchestra.run.smithi039.stderr:AttributeError: 'CommandManager' object has no attribute 'add_command_group'
In this change the python-openstackclient version is pinned to the
latest stable, 5.2.1. A separate PR will bump up the cliff version on
the teuthology side.
Signed-off-by: Kefu Chai <kchai@redhat.com>
rpm,deb,qa,python-common,test: drop python2 support
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Sebastian Wagner <sebastian.wagner@suse.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
I have absolutely no idea why it's counting features, but
apparently it is and bumping the value to 7 makes it pass.
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
Subvolume operations throw a traceback if the volume doesn't exist.
This patch fixes that.
Fixes: https://tracker.ceph.com/issues/46496
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Addresses the name collisions of volumes, subvolumes,
clones, subvolume groups and snapshots within the tests.
Fixes: https://tracker.ceph.com/issues/43517
Signed-off-by: Kotresh HR <khiremat@redhat.com>
* refs/pull/35755/head:
mgr/volumes: Deprecate protect/unprotect CLI calls for subvolume snapshots
Reviewed-by: Ramana Raja <rraja@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Kotresh Hiremath Ravishankar <khiremat@redhat.com>
Reviewed-by: Victoria Martinez de la Cruz <vkmc@redhat.com>
Reviewed-by: Goutham Pacha Ravi <gouthamr@redhat.com>
Subvolume snapshots were required to be protected prior to cloning
them. Also, protected snapshots were not allowed to be unprotected or
removed if there were in-flight clones whose source was the snapshot
being removed.
Explicitly protecting snapshots is not required, as their removal can
be prevented based solely on the in-flight clone checks.
This commit hence deprecates the additional protect/unprotect
requirements prior to cloning a snapshot.
In addition to deprecating the above, support is added to query a
subvolume for its supported features via the info command. The feature
list is set to "clone" and "auto-protect", where the latter is useful
to decide whether protect/unprotect commands are required or not.
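For example, a client script could key off the feature list (the JSON
shape follows the description above; the volume/subvolume/snapshot
names are illustrative):

    import json
    import subprocess

    out = subprocess.check_output(
        ['ceph', 'fs', 'subvolume', 'info', 'vol', 'sub', '--format=json'])
    features = json.loads(out).get('features', [])
    if 'auto-protect' not in features:
        # older cluster: the snapshot must still be protected before cloning
        subprocess.check_call(['ceph', 'fs', 'subvolume', 'snapshot',
                               'protect', 'vol', 'sub', 'snap'])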
Fixes: https://tracker.ceph.com/issues/45371
Signed-off-by: Shyamsundar Ranganathan <srangana@redhat.com>
* refs/pull/34861/head:
test: adjust scrub control tests for optional scrub status
mgr: set `task_dirty_status` on reconnect
mds: send scrub status to ceph-mgr only when scrub is running
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Currently the Dashboard pool usage calculation does not match the
output of the 'ceph df' command.
Fixes: https://tracker.ceph.com/issues/45185
Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
Check container_image_name only if the ceph cluster image is not
pre-defined in the config. We shouldn't care about container_image_name
if cephadm or ceph already have an image defined.
Signed-off-by: Georgios Kyratsas <gkyratsas@suse.com>
Sometimes when teuthology machines are provisioned, the command
`hostname --fqdn` does not provide a fully qualified domain name but
instead just the hostname (e.g., smithi149 instead of
smithi149.front.sepia.ceph.com). This prevents the teuthology test for
rgw-orphan-list from running successfully [for example, the hostname
was for some reason misinterpreted as the bucket name in the request].
This commit checks whether the hostname derived from `hostname --fqdn`
contains any '.'s and, if it does not, appends ".front.sepia.ceph.com"
to the hostname. This is a hack, but until teuthology machines are
configured appropriately it seems to be a reasonable work-around.
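The actual test is a shell script; this minimal Python sketch just
shows the check (the domain comes from the text above):

    import socket

    hostname = socket.getfqdn()
    if '.' not in hostname:
        # `hostname --fqdn` gave a bare hostname; patch in the lab domain
        hostname += '.front.sepia.ceph.com'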
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
I haven't seen it be an issue, but I'm worried a slight difference in
ping report timing might result in flapping leaders even with the new
ignore-out-of-quorum code.
Imagine DCs A, B, C where A and B are netsplit: C might first elect A,
then get a propose from B immediately following a successful ping reply
that gives B a better score than A, and thus B gets an election win;
then A could do the same, etc.
In a default 12-hour-halflife, 2-second-ping config, the most a single
ping can change the score is 0.00002314814. Therefore a code default of
.0001 and a config default of .0005 should leave plenty of room to
prevent that in sane monitor configurations, while still responding
quickly if connections are restored.
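The arithmetic behind the quoted bound, assuming the score decays by
half of the elapsed fraction of a halflife per ping (an assumption
about the model, not lifted from the mon code):

    halflife = 12 * 60 * 60             # 43200 seconds
    ping_interval = 2                   # seconds
    max_delta = 0.5 * ping_interval / halflife
    print(max_delta)                    # 2.3148...e-05, i.e. 0.00002314814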
Plus, of course, this only applies to out-of-quorum monitors talking
to peons, so if a monitor manages to contact the leader it will be
allowed to join instantly.
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
This is super basic right now and only works for monitor daemons, as
it has to parse out their IPs from the cluster information and then
turn that into Host objects. We can extend it in the future.
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
* refs/pull/35743/head:
qa/tasks/test_nfs: Add test for cluster info
mgr/volumes/nfs: Add cluster show info command
Reviewed-by: Michael Fritch <mfritch@suse.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
fuse_mount.py. This isn't critical at all to vstart_runner.py runs, but
this patch should dramatically reduce the time taken when the command
fails with "permission denied" due to lack of superuser privileges,
since in that case the command is re-run 9 more times, each attempt
separated by a 5-second sleep.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
Volume deletion wasn't validating the mon_allow_pool_delete config
before destroying the volume metadata. Hence, when mon_allow_pool_delete
is set to false, it deleted the metadata but failed to delete the pool,
resulting in an inconsistent state. This patch validates the config
before going ahead with the deletion.
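A hedged sketch of the up-front check (the mon command plumbing is
illustrative; mgr modules have their own helpers for this):

    import json

    def pool_delete_allowed(mon_command):
        ret, out, _ = mon_command(json.dumps({
            'prefix': 'config get',
            'who': 'mon',
            'key': 'mon_allow_pool_delete',
            'format': 'json',
        }), b'')
        return ret == 0 and b'true' in out.lower()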
Fixes: https://tracker.ceph.com/issues/45662
Signed-off-by: Kotresh HR <khiremat@redhat.com>
qa/tasks: make sh() in vstart_runner.py identical with teuthology.orchestra.remote.sh
Reviewed-by: Kyr Shatskyy <kyrylo.shatskyy@suse.com>
Reviewed-by: Rishabh Dave <ridave@redhat.com>
based on the qemu task, use the immutable_object_cache task to test the parent cache
based on the rbd_fio task, use the immutable_object_cache task to test the parent cache
Signed-off-by: Feng Hualong <hualong.feng@intel.com>
* refs/pull/35420/head:
mgr/volumes: Fix pool removal on volume deletion
Reviewed-by: Ramana Raja <rraja@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Jos Collin <jcollin@redhat.com>
* refs/pull/35664/head:
qa: add omit_sudo=False for commands ran with sudo
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
tasks/cephadm.py gained RGW support very recently and
I'm now facing a dilemma:
* Either we set the upgrade start to 15.2.4 and thus
no longer upgrade from an old version, or
* Disable RGW upgrade for now.
I think doing both would be optimal, but for now, let's
disable RGW, in order to keep the coverage for everything
else.
Fixes: https://tracker.ceph.com/issues/46157
Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
Before this change, export-pinned (and ephemerally pinned) subtrees
were stuck in cache forever.
Add a qa test checking that export-pinned directories can be trimmed.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Pretty-print the output once. Use --format=json so the stdout on
teuthology is not pretty-printed, which would take hundreds of lines.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
* refs/pull/35522/head:
vstart_runner: set default values of stdout and stderr to None
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
* refs/pull/35540/head:
qa/cephfs: don't pass cmd args from run_as_user as str
qa/cephfs: refactor run_as_root() to user run_as_user()
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>