* refs/pull/34839/head:
qa/cephfs: add FUSE module before running mount -t fusectl
Reviewed-by: Zheng Yan <zyan@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
* refs/pull/34838/head:
vstart_runner: don't use namespaces by default
qa/cephfs: run nsenter commands with superuser privileges
qa/cephfs: look for mountpoint in cmdline file
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
* refs/pull/34782/head:
qa/tasks/cephfs/mount.py: remove netns name parsing in mountpoint setter
qa/tasks/vstart_runner.py: add kwargs parameter to ignore the ones it does not understand
Reviewed-by: Rishabh Dave <ridave@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
New, unwritten files fail the backtrace check during scrub.
This is not necessarily bad, so flag such failures as okay and continue
with the other entries.
Fixes: https://tracker.ceph.com/issues/43543
Signed-off-by: Milind Changire <mchangir@redhat.com>
Look for self.mountpoint in the contents of the /proc/<pid>/cmdline file
when finding the asok file for the client, so that vstart_runner.py won't
end up picking the asok file of a client that was not created in the
current run.
So far this has practically never happened, because the PIDs of newly
created processes are higher than those of previously created processes,
and the list of asok files returned by "glob.glob(asok_path)" in
find_socket() is in descending order of PIDs.
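A minimal sketch of the matching idea (not the actual vstart_runner.py
code; the "...<pid>.asok" naming is an assumption):

```python
import glob

def find_socket(asok_path, mountpoint):
    # Pick the asok file whose owning process has `mountpoint` in its
    # command line, instead of relying on glob's PID ordering.
    for asok in glob.glob(asok_path):
        pid = asok.rsplit('.', 2)[-2]   # assumes "...<pid>.asok" naming
        try:
            with open('/proc/%s/cmdline' % pid, 'rb') as f:
                if mountpoint.encode() in f.read():
                    return asok
        except IOError:
            continue                    # process already gone
    return None
```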
Signed-off-by: Rishabh Dave <ridave@redhat.com>
If the ceph-fuse client needs to flush caps and waits on that
synchronously, umount() will still return successfully; the netns
container will then be destroyed and the network will no longer be
reachable, while the ceph-fuse daemon is still stuck waiting for the
cap flush ack.
This leaves the ceph-fuse daemon stuck forever, and if the mds daemons
get restarted they will try to reconnect the clients, but the stuck
ceph-fuse daemon won't reply because it is no longer reachable.
Fixes: https://tracker.ceph.com/issues/45665
Signed-off-by: Xiubo Li <xiubli@redhat.com>
1. Add the --namespace-isolated option to the 'subvolume create' command
to create the subvolume in a separate RADOS namespace
2. Add a "pool_namespace" field to the 'subvolume info' command output,
which displays the RADOS namespace if set, else an empty string
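A usage sketch (the volume and subvolume names are illustrative):

```python
import subprocess

# Create a subvolume whose objects live in a separate RADOS namespace;
# without the flag, the subvolume shares the pool's default namespace.
subprocess.check_call(["ceph", "fs", "subvolume", "create", "vol1", "sub1",
                       "--namespace-isolated"])
```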
Fixes: https://tracker.ceph.com/issues/45289
Signed-off-by: Kotresh HR <khiremat@redhat.com>
After a subvolume is created, it can be resized using the
'subvolume create' command. This was broken; it is fixed here.
Fixes: https://tracker.ceph.com/issues/45398
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Also, change the timeout for the command "mount -t fusectl xxx xxx"
from 15 minutes to 30 seconds, since 15 minutes is far too long as per
Zheng.
Fixes: https://tracker.ceph.com/issues/45304
Signed-off-by: Rishabh Dave <ridave@redhat.com>
cephfs-shell options should reside in cephfs-shell.conf and not in
ceph.conf. Please note that, unlike before, the -c option of cephfs-shell
must take the path to cephfs-shell.conf instead of the path to ceph.conf.
This commit also updates the docs and the tests for cephfs-shell
accordingly, and renames the variable config_path in cephfs-shell to
shell_conf_path so that the two are easy to distinguish.
Fixes: https://tracker.ceph.com/issues/44127
Signed-off-by: Rishabh Dave <ridave@redhat.com>
* refs/pull/35062/head:
qa/tasks/cephfs/fuse_mount.py: retry when the admin socket is not ready
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Rishabh Dave <ridave@redhat.com>
* refs/pull/34951/head:
qa/cephfs: run() cleanup whether FS was mounted or not
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
We don't really care if this is left on during the course of the mount.
These settings don't persist across tests anyway.
Fixes: https://tracker.ceph.com/issues/45525
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
In case the mount command in mount() fails, it would still have created
the mountpoint and the network namespace for the FS's mount. Therefore,
run cleanup() and cleanup_netns() in umount() and umount_wait() even
when self.mounted is set to False.
Also, move the call to cleanup_netns() into cleanup().
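A minimal sketch of the resulting control flow (not the actual
qa/tasks/cephfs code):

```python
def umount_wait(self, force=False, require_clean=False):
    if self.mounted:
        self.umount()       # unmount only if the mount actually succeeded
    # mount() may have created the mountpoint and the network namespace
    # before failing, so tear them down unconditionally; cleanup() now
    # calls cleanup_netns() itself.
    self.cleanup()
    self.mounted = False
```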
Fixes: https://tracker.ceph.com/issues/45430
Signed-off-by: Rishabh Dave <ridave@redhat.com>
The following command is added
"ceph fs subvolume snapshot info <vol_name> <sub_name> <snap_name> [<group_name>]"
The output is in json format with the following fields:
created_at: time of creation of snapshot in the format "YYYY-MM-DD HH:MM:SS:ffffff"
data_pool: data pool the snapshot belongs to
has_pending_clones: "yes" if snapshot clone is in progress otherwise "no"
protected: "yes" if snapshot is protected otherwise "no"
size: snapshot size in bytes
Fixes: https://tracker.ceph.com/issues/45237
Signed-off-by: Kotresh HR <khiremat@redhat.com>
* refs/pull/34978/head:
qa/tasks/cephfs/mount.py: always setup the NAT rules
qa/tasks/cephfs/mount.py: fall back to use ip command to setup the bridge
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Rishabh Dave <ridave@redhat.com>
Reviewed-by: Ramana Raja <rraja@redhat.com>
If the last test failed just after the 'ceph-brx' bridge was created,
but before the NAT rules were set up in iptables, then rerunning the
test case will skip setting up both 'ceph-brx' and the NAT rules.
Deleting the NAT rules will then fail with errors like:
iptables: Bad rule (does a matching rule exist in that chain?).
Signed-off-by: Xiubo Li <xiubli@redhat.com>
CentOS/RHEL 8 have abandoned the bridge-utils package, so the brctl
command is not available, and on Ubuntu 18.04 and 20.04 nmcli is too
buggy to set up the bridge. So fall back to using the ip command to set
up the bridge stuff.
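A minimal sketch of the fallback using standard iproute2 commands (the
bridge name matches the tests; the address is illustrative):

```python
import subprocess

def setup_brx_with_ip_cmd(addr="192.168.255.254/16"):
    # Equivalent of `brctl addbr` plus address/up, via the ip command only.
    subprocess.check_call(["sudo", "ip", "link", "add", "ceph-brx",
                           "type", "bridge"])
    subprocess.check_call(["sudo", "ip", "addr", "add", addr,
                           "dev", "ceph-brx"])
    subprocess.check_call(["sudo", "ip", "link", "set", "ceph-brx", "up"])
```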
Fixes: https://tracker.ceph.com/issues/45459
Signed-off-by: Xiubo Li <xiubli@redhat.com>
self.mount_a.client_remote.sh() returns an 'str' object rather than a
StringIO object, so p.stdout.getvalue() produces an error. This commit
fixes that, and also fixes str/bytes mismatches: bytes and str were the
same object in Python 2, but that is no longer the case in Python 3.
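A self-contained illustration of both pitfalls:

```python
s = "hello"
b = b"hello"
assert s != b                    # py3: str and bytes never compare equal
assert s == b.decode("utf-8")    # decode before comparing

try:
    s.getvalue()                 # sh() returns str, not StringIO
except AttributeError as e:
    print(e)                     # 'str' object has no attribute 'getvalue'
```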
Signed-off-by: Sidharth Anupkrishnan <sanupkri@redhat.com>
The refactor intends to simplify the test and make it more readable.
It also changes how this test creates the directory -- tempfile.mkdtemp
is used instead of coreutils mkdir to avoid the bug reported here:
https://tracker.ceph.com/issues/43567.
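For reference, the tempfile pattern now used (the prefix is
illustrative):

```python
import tempfile

# Unlike `mkdir <fixed-name>`, this creates a unique directory and
# cannot collide with leftovers from a previous failed run.
testdir = tempfile.mkdtemp(prefix="test_dir_")
print(testdir)   # e.g. /tmp/test_dir_abc123
```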
Signed-off-by: Rishabh Dave <ridave@redhat.com>
INFO:teuthology.orchestra.run.smithi132:> (cd /home/ubuntu/cephtest && exec sudo chmod 1777 /home/ubuntu/cephtest/mnt.2)
INFO:teuthology.orchestra.run.smithi132.stderr:chmod: changing permissions of '/home/ubuntu/cephtest/mnt.2': Permission denied
DEBUG:teuthology.orchestra.run:got remote process result: 1
Here, just wait and retry up to 10 times.
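A minimal sketch of the retry, assuming teuthology's remote.run() and
CommandFailedError are in scope (`remote`, `mountpoint` and the delay
are illustrative context):

```python
import time

from teuthology.orchestra.run import CommandFailedError

for _ in range(10):
    try:
        remote.run(args=['sudo', 'chmod', '1777', mountpoint])
        break
    except CommandFailedError:
        time.sleep(1)   # the mountpoint may not be ready yet; retry
```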
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Fix "Error EINVAL: invalid size option '4915200.0'".
In pytho2 int/2 will get a int type result, but in python3 it maybe
a float type.
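A self-contained illustration:

```python
size = 9830400
print(size / 2)    # py3: 4915200.0 (float) -> "invalid size option '4915200.0'"
print(size // 2)   # 4915200 (int); use floor division where an int is required
```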
Fixes: https://tracker.ceph.com/issues/45247
Signed-off-by: Xiubo Li <xiubli@redhat.com>
The pybind code has now dropped the WITH_PYTHON2 option; only py3 is
supported for now.
Fixes: https://tracker.ceph.com/issues/45103
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Methods like run_shell effectively conduct a positive test on the given
command. Add methods that run the given command expecting failure and
then verify the return value and error message against the given ones.
Rewrite testcmd, testcmd_as_user and testcmd_as_root into these new
methods for negative testing, since run_shell, run_as_user and
run_as_root are the equivalents for positive testing.
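A minimal sketch of such a negative-test helper (the name and signature
are assumptions, not the actual qa/tasks API):

```python
from teuthology.orchestra.run import CommandFailedError

def negtest_shell(mount, args, retval=None, errmsg=None):
    # Run `args` expecting failure; verify exit status and error message.
    try:
        mount.run_shell(args=args)
    except CommandFailedError as e:
        if retval is not None:
            assert e.exitstatus == retval
        if errmsg is not None:
            # real code would check the captured stderr; str(e) for brevity
            assert errmsg in str(e)
        return
    raise AssertionError("command unexpectedly succeeded: %s" % (args,))
```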
Signed-off-by: Rishabh Dave <ridave@redhat.com>
LocalFuseMount and LocalKernelMount can directly inherit these methods
from CephFSMount via FuseMount and KernelMount respectively. Moving them
avoids duplication and makes these methods more accessible for reuse
via inheritance.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
"ceph fs status" json format outputs to stderr instead of
stdout. This patch fixes the same.
Fixes: https://tracker.ceph.com/issues/44962
Signed-off-by: Kotresh HR <khiremat@redhat.com>
After creating a filesystem using the 'fs new' command, the values of
the 'data' and 'metadata' keys of the data and metadata pools'
application metadata 'cephfs' should be the filesystem's name. This
didn't happen when the data or metadata pool's application metadata
'cephfs' was enabled before the pool was used in the 'fs new' command.
Fix this during the handling of the 'fs new' command by setting the
values of those keys of the pools' application metadata 'cephfs' to the
filesystem's name even when the application metadata 'cephfs' is
already enabled or set.
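A hedged check of the post-fix invariant, assuming the usual JSON shape
of `ceph osd pool application get` (pool and filesystem names are
illustrative):

```python
import json
import subprocess

app = json.loads(subprocess.check_output(
    ["ceph", "osd", "pool", "application", "get", "cephfs_data"]))
# After the fix this holds even if `ceph osd pool application enable
# cephfs_data cephfs` had been run before `ceph fs new`.
assert app["cephfs"]["data"] == "myfs"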
Fixes: https://tracker.ceph.com/issues/43761
Signed-off-by: Ramana Raja <rraja@redhat.com>
After making a RADOS pool a filesystem's data pool using the
'add_data_pool' command, the value of the 'data' key of the pool's
application metadata 'cephfs' should be the filesystem's name. This
didn't happen when the pool's application metadata 'cephfs' was
enabled before the pool was made the data pool. Fix this during the
handling of the 'add_data_pool' command by setting the value of
the 'data' key of the pool's application metadata 'cephfs' to the
filesystem's name even when the application metadata 'cephfs' is
already enabled or set.
Fixes: https://tracker.ceph.com/issues/43061
Signed-off-by: Ramana Raja <rraja@redhat.com>
The boolean self.mounted attribute conflicts with the mounted() helper,
so we hit errors like:
INFO:tasks.cephfs_test_runner:TypeError: 'bool' object is not callable
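A self-contained illustration of the clash:

```python
class Mount(object):
    def __init__(self):
        self.mounted = False    # instance attribute...

    def mounted(self):          # ...shadows this method on every instance
        return self.mounted

try:
    Mount().mounted()
except TypeError as e:
    print(e)                    # 'bool' object is not callable
```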
Fixes: https://tracker.ceph.com/issues/44044
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Usually we should wait for the mountpoint to become ready; especially
for a fuse mountpoint, it may sometimes take a few seconds.
Fixes: https://tracker.ceph.com/issues/44044
Signed-off-by: Xiubo Li <xiubli@redhat.com>
If the network doesn't respond for some reason, the 'stat' cmd will
hang until the network recovers; in the worst case it will hang
forever.
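A minimal sketch of bounding the stat with a timeout (the path and the
limit are illustrative):

```python
import subprocess

# Raises subprocess.TimeoutExpired instead of hanging with the network down.
subprocess.run(["stat", "/mnt/cephfs"], timeout=30, check=True)
```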
Fixes: https://tracker.ceph.com/issues/44044
Signed-off-by: Xiubo Li <xiubli@redhat.com>
This will isolate the network namespace for each mount point, with a
private ip address and iptables, etc.
For the kill() stuff, it will just bring the veth interface DOWN,
instead of sending an ipmi request for kernel mounts or killing the
fuse processes for fuse mounts. This avoids sending the socket FIN to
the ceph cluster.
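A minimal sketch of the kill() idea (the veth name is illustrative):

```python
import subprocess

def netns_block(veth="brx.0"):
    # Bringing the host-side veth DOWN silently drops traffic, so the
    # cluster never sees a FIN, unlike killing the mount process.
    subprocess.check_call(["sudo", "ip", "link", "set", veth, "down"])

def netns_unblock(veth="brx.0"):
    subprocess.check_call(["sudo", "ip", "link", "set", veth, "up"])
```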
Fixes: https://tracker.ceph.com/issues/44044
Signed-off-by: Xiubo Li <xiubli@redhat.com>
This will make sure the rstats of the subdir's inode and dirfrag
match.
Fixes: https://tracker.ceph.com/issues/44380
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Make sure that the dirfrag auto merging and splitting are disabled
explicitly.
Fixes: https://tracker.ceph.com/issues/44380
Signed-off-by: Xiubo Li <xiubli@redhat.com>
The ceph code sets dnfirst=2 by default instead of CEPH_NOSNAP.
This avoids the MDS daemon assert crash:
ceph_assert(in->first <= straydn->first)
Fixes: https://tracker.ceph.com/issues/44380
Signed-off-by: Xiubo Li <xiubli@redhat.com>
The following interface is added
"ceph fs subvolume info <vol_name> <sub_name> [<group_name>]"
The output is in json format with the following fields:
1. atime: access time of subvolume path in the format "YYYY-MM-DD HH:MM:SS"
2. mtime: modification time of subvolume path in the format "YYYY-MM-DD HH:MM:SS"
3. ctime: change time of subvolume path in the format "YYYY-MM-DD HH:MM:SS"
4. uid: uid of subvolume path
5. gid: gid of subvolume path
6. mode: mode of subvolume path
7. mon_addrs: list of monitor addresses
8. bytes_pcent: quota used in percentage if quota is set, else displays "undefined"
9. bytes_quota: quota size in bytes if quota is set, else displays "infinite"
10. bytes_used: current used size of the subvolume in bytes
11. created_at: time of creation of subvolume in the format "YYYY-MM-DD HH:MM:SS"
12. data_pool: data pool the subvolume belongs to
13. path: absolute path of a subvolume
14. type: subvolume type indicating whether it's clone or subvolume
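A usage sketch reading a few of the documented fields (volume and
subvolume names are illustrative):

```python
import json
import subprocess

info = json.loads(subprocess.check_output(
    ["ceph", "fs", "subvolume", "info", "vol1", "sub1"]))
print(info["path"], info["type"])   # e.g. a clone reports type "clone"
print(info["bytes_quota"])          # int, or "infinite" when no quota is set
```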
Fixes: https://tracker.ceph.com/issues/44277
Signed-off-by: Kotresh HR <khiremat@redhat.com>
this change partially reverts e46eb8348e.
xattrs could contain non-utf8 encoded data, and should be captured using
BytesIO. moreover, it will be fed to `ceph-dencoder`, which expects
binary when performing "import".
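a self-contained illustration of why BytesIO is needed here:

```python
from io import BytesIO

payload = b"\xff\xfe"            # non-utf8 bytes, as an xattr may contain
buf = BytesIO()
buf.write(payload)               # fine: bytes in, bytes out
assert buf.getvalue() == payload
# a StringIO capture would need payload.decode("utf-8") first, which
# raises UnicodeDecodeError on data like this.
```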
Signed-off-by: Kefu Chai <kchai@redhat.com>
collect the keys instead of filtering a dict,
to address the following failure:
```
2020-04-05T12:15:36.275 INFO:tasks.cephfs_test_runner:Traceback (most recent call last):
2020-04-05T12:15:36.275 INFO:tasks.cephfs_test_runner: File "/home/teuthworker/src/github.com_tchaikov_ceph_wip-qa-py3/qa/tasks/cephfs/test_strays.py", line 29, in test_files_throttle
2020-04-05T12:15:36.275 INFO:tasks.cephfs_test_runner: self._test_throttling(self.FILES_THROTTLE)
2020-04-05T12:15:36.276 INFO:tasks.cephfs_test_runner: File "/home/teuthworker/src/github.com_tchaikov_ceph_wip-qa-py3/qa/tasks/cephfs/test_strays.py", line 96, in _test_throttling
2020-04-05T12:15:36.276 INFO:tasks.cephfs_test_runner: return self._do_test_throttling(throttle_type)
2020-04-05T12:15:36.278 INFO:tasks.cephfs_test_runner: File "/home/teuthworker/src/github.com_tchaikov_ceph_wip-qa-py3/qa/tasks/cephfs/test_strays.py", line 176, in _do_test_throttling
2020-04-05T12:15:36.278 INFO:tasks.cephfs_test_runner: mds_max_purge_ops = int(self.fs.get_config("mds_max_purge_ops", 'mds'))
2020-04-05T12:15:36.279 INFO:tasks.cephfs_test_runner: File "/home/teuthworker/src/github.com_tchaikov_ceph_wip-qa-py3/qa/tasks/cephfs/filesystem.py", line 285, in get_config
2020-04-05T12:15:36.279 INFO:tasks.cephfs_test_runner: service_id = random.sample(filter(lambda i: self.mds_daemons[i].running(), self.mds_daemons), 1)[0]
2020-04-05T12:15:36.280 INFO:tasks.cephfs_test_runner: File "/home/teuthworker/src/git.ceph.com_git_teuthology_py3/virtualenv/lib/python3.5/random.py", line 311, in sample
2020-04-05T12:15:36.280 INFO:tasks.cephfs_test_runner: raise TypeError("Population must be a sequence or set. For dicts, use list(d).")
2020-04-05T12:15:36.280 INFO:tasks.cephfs_test_runner:TypeError: Population must be a sequence or set. For dicts, use list(d).
```
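a self-contained illustration of the failure and the fix:

```python
import random

daemons = {"a": 1, "b": 2, "c": 3}
# random.sample(daemons, 1)  # py3: TypeError: Population must be a
#                            # sequence or set. For dicts, use list(d).
service_id = random.sample(list(daemons), 1)[0]   # collect the keys first
print(service_id)
```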
Signed-off-by: Kefu Chai <kchai@redhat.com>
in python2, dict.values() and dict.keys() return lists. but in python3,
they return views, which cannot be indexed directly using an integer index.
there are four use cases when we access these views in python3:
1. get the first element
2. get all the elements and then *might* want to access them by index
3. get the first element assuming there is only a single element in
the view
4. iterate thru the view
in the 1st case, we cannot assume the number of elements, so to be
python3 compatible, we should use `next(iter(a_dict))` instead.
in the 2nd case, in this change, the view is materialized using
`list(a_dict)`.
in the 3rd case, we can just continue using the shorthand of
```py
(first_element,) = a_dict.keys()
```
to unpack the view. this works in both python2 and python3.
in the 4th case, the existing code works in both python2 and python3, as
both list and view can be iterated using `iter`, and `len` works as
well.
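the four patterns side by side, all valid in both python2 and python3:

```python
d = {"a": 1, "b": 2}

first = next(iter(d))        # 1. first element, without materializing
keys = list(d)               # 2. indexable copy of the keys
(only,) = {"x": 1}.keys()    # 3. unpack, asserting a single element
for k in d.keys():           # 4. plain iteration works on list and view
    print(k, d[k])
```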
Signed-off-by: Kefu Chai <kchai@redhat.com>
Normal ceph services can send task status updates to the manager.
Task status is tracked in the service map, implying that normal
ceph services have entries in the service map and in the daemon
tracking index (daemon state). But the manager prunes entries from
daemon state when it receives an updated map (fs, mon, etc...). This
causes periodic pruning of service map entries to fail for normal
ceph services (those which send task status updates), since it
expects a corresponding entry in daemon state.
Signed-off-by: Venky Shankar <vshankar@redhat.com>