Merge PR #60746 into main

* refs/pull/60746/head:
	client: skip unexpected command replies
	mgr: indicate map message is acked instead of unhandled
	osdc/Objecter: convert to ms_dispatch2 for ack
	client: indicate maps are acked not processed
	msg: add alternate statuses for ms_dispatch2 handling
	tools/cephfs_mirror: do not process maps with fast dispatch
	doc: add docs for volumes interface for charmap
	qa: add tests for subvolume charmap settings
	pybind/mgr/volumes: wire up charmap for subvol/subvolgroup
	pybind/mgr: send MDS commands through cephfs client
	pybind/cephfs: wire up mds_command2
	mgr: add module method to send notifications
	libcephfs: add mds_command2 for asynchronous commands
	mgr: excise CephFS client from mgr C++ base
	mgr: use std namespace
	doc: add docs for CephFS charmap config
	qa: add charmap tests
	qa: add helpful exceptions for attr changes
	qa: ignore libicu leaks
	client: add wrappings for charmap manipuluation of dentry names
	client: add dir_result_t::dentry::print
	win32: add libicu Windows build
	CMakeLists: add boost::locale dependency for client
	install-deps: unconditionally install boost libraries
	test/libcephfs: update root operation return values
	client: refactor all path traversals through path_walk
	test/libcephfs: test parallel creates
	test/libcephfs: add test for lookup failure after readdir
	client: init dentry shared_gen with invalid value
	client: add _lookup debugging
	client: remove redundant check
	client: dump InodeStat from mds
	mds: encode optmetadata in InodeStat sent to clients
	mds: check client features for charmap
	mds: add client feature bit for charmap
	mds: wire up vxattr for changing charmap
	mds: inherit charmap on mkdir
	mds,include: add charmap optmetadata
	mds,include: add inode_t optional metadata
	client: hide alternate_name from API
	client: move alternate_name once
	client: optimize alternate_name passing to helper
	client: relocate definition
	client: print dentry with alternate_name on dump
	client: move inode dump to print method
	mds: add debugging for encoding lease stat
	mds: make encode_lease a proper method
	mds: add fscrypt metadata for inode stat size
	client: use DentryRef for ref counting in MetaRequest
	client: add DentryRef
	client: add helper for determining if a perm check is necessary
	client: cache client_permissions config
	client: add debugging for conf changes
	client: sort configs
	client/UserPerm: add print method
	client: note mount parameters in debug log
	client: print stat mode in octal
	common: add missing op string
	include/filepath: add empty path check

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Patrick Donnelly 2025-03-03 08:31:36 -05:00
commit 048fc68c51
GPG Key ID: FA47FD0B0367D313
85 changed files with 3533 additions and 1273 deletions


@ -729,6 +729,11 @@ if(WITH_RADOSGW AND WITH_RADOSGW_LUA_PACKAGES)
list(APPEND BOOST_COMPONENTS filesystem)
endif()
if(WITH_LIBCEPHFS)
find_package(ICU REQUIRED COMPONENTS uc i18n)
list(APPEND BOOST_COMPONENTS locale)
endif()
set(Boost_USE_MULTITHREADED ON)
CMAKE_DEPENDENT_OPTION(WITH_BOOST_VALGRIND "Boost support for valgrind" OFF


@ -27,6 +27,10 @@
of the column showing the state of a group snapshot in the unformatted CLI
output is changed from 'STATUS' to 'STATE'. The state of a group snapshot
that was shown as 'ok' is now shown as 'complete', which is more descriptive.
* CephFS: Directories may now be configured with case-insensitive or
normalized directory entry names. The configuration is inheritable, so it
applies to an entire directory tree. For more information, see
https://docs.ceph.com/en/latest/cephfs/charmap/
* Based on tests performed at scale on an HDD based Ceph cluster, it was found
that scheduling with mClock was not optimal with multiple OSD shards. For
example, in the test cluster with multiple OSD node failures, the client

doc/cephfs/charmap.rst

@ -0,0 +1,187 @@
.. _charmap:
CephFS Directory Entry Name Normalization and Case Folding
==========================================================
CephFS allows configuring directory trees to **normalize** and possibly **case
fold** directory entry names. This is useful for file systems exported by
gateways like Samba, which enforce a case-insensitive view of the file system
and typically incur performance penalties when the underlying file system is
not case-insensitive.
The following virtual extended attributes control the **character mapping**
rules for directory entries:
* ``ceph.dir.casesensitive``: A boolean setting for the case sensitivity of the directory. If false, directory entry names are case folded.
* ``ceph.dir.normalization``: A string setting for the type of Unicode normalization to apply for directory entry names. Currently the normalization forms D (``nfd``), C (``nfc``), KD (``nfkd``), and KC (``nfkc``) are understood by the client.
* ``ceph.dir.encoding``: A string setting for the encoding to use and enforce for directory entry names. The default and presently only supported encoding is UTF-8 (``utf8``).
There is also a convenience virtual extended attribute that is useful for
getting the JSON encoding of the case sensitivity, normalization, and encoding
configurations:
* ``ceph.dir.charmap``: The complete character mapping configuration for a directory.
It can also be used to **remove** all settings and restore the default CephFS behavior
for directory entry names: uninterpreted bytes without ``/`` that are NUL terminated.
Note the following restrictions on manipulating any of these extended attributes:
* The directory must be empty.
* The directory must not be part of a snapshot.
New subdirectories created under a directory with a ``charmap`` configuration will
inherit (copy) the parent's configuration.
.. note:: You can remove a ``charmap`` on a subdirectory which inherited
the configuration so long as the preconditions apply: it is empty
and not part of an existing snapshot.
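
The JSON form of ``ceph.dir.charmap`` described above can be consumed programmatically. The following is a minimal sketch, assuming a Linux client with CephFS mounted through the userspace client and assuming the raw attribute value is the unescaped JSON document (the backslashes in ``getfattr`` output are added by that tool). The helper names ``parse_charmap`` and ``read_charmap`` are hypothetical and not part of any Ceph API.

```python
import errno
import json
import os


def parse_charmap(raw):
    """Decode the JSON value of ceph.dir.charmap into a dict."""
    if isinstance(raw, bytes):
        raw = raw.decode("utf-8")
    return json.loads(raw)


def read_charmap(path):
    """Return the charmap of a directory, or None if it has none.

    Hypothetical helper: relies on os.getxattr, which is Linux-only.
    """
    try:
        return parse_charmap(os.getxattr(path, "ceph.dir.charmap"))
    except OSError as e:
        # A missing attribute ("No such attribute") is reported as ENODATA.
        if e.errno == errno.ENODATA:
            return None
        raise
```

Returning ``None`` for a missing attribute mirrors the documented meaning: no attribute means no character mapping for that directory.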
Normalization
-------------
The ``ceph.dir.normalization`` attribute accepts the following normalization forms:
* **nfd**: Form D (Canonical Decomposition)
* **nfc**: Form C (Canonical Decomposition, followed by Canonical Composition)
* **nfkd**: Form KD (Compatibility Decomposition)
* **nfkc**: Form KC (Compatibility Decomposition, followed by Canonical Composition)
The default normalization for a character mapping configuration is ``nfd``.
.. note:: For more information about Unicode normalization forms, please see `Unicode normalization standard documents`_.
Whenever a directory entry name is generated during path traversal or lookup,
the client will apply the normalization to the name before submitting any
operation to the MDS. On the MDS side, the directory entry names which
are stored are only these normalized names.
For example, to set the normalization on a directory:
::
$ setfattr -n ceph.dir.normalization -v "" foo/
$ getfattr -n ceph.dir.charmap foo/
# file: foo/
ceph.dir.charmap="{\"casesensitive\":true,\"normalization\":\"nfd\",\"encoding\":\"utf8\"}"
$ getfattr -n ceph.dir.normalization foo/
# file: foo/
ceph.dir.normalization="nfd"
.. note:: Setting the empty string will cause the MDS to pick the default normalization.
All character mapping configurations must have a normalization enabled. Removing the normalization
will cause the default to be restored:
::
$ setfattr -n ceph.dir.normalization -v nfc foo/
$ getfattr -n ceph.dir.normalization foo/
# file: foo/
ceph.dir.normalization="nfc"
$ setfattr -x ceph.dir.normalization foo/
$ getfattr -n ceph.dir.normalization foo/
# file: foo/
ceph.dir.normalization="nfd"
To remove normalization on a directory, you must remove the ``ceph.dir.charmap``
configuration.
.. note:: The MDS maintains an ``alternate_name`` metadata (also used for
encryption) for directory entries which allows the client to persist the
original un-normalized name used by the application. The MDS does not
interpret this metadata in any way; it's only used by clients to reconstruct
the original name of the directory entry.
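
To illustrate what normalization does to a name, here is a small sketch using Python's standard ``unicodedata`` module. It is only an approximation for illustration: the client in this change links against Boost.Locale and ICU, and the function name ``normalize_name`` is hypothetical.

```python
import unicodedata

# Map the values accepted by ceph.dir.normalization to Unicode form names.
FORMS = {"nfd": "NFD", "nfc": "NFC", "nfkd": "NFKD", "nfkc": "NFKC"}


def normalize_name(name, form="nfd"):
    """Return the normalized directory entry name for the given form."""
    return unicodedata.normalize(FORMS[form], name)


original = "Gr\u00fc\u00dfen"        # "Grüßen" with a precomposed u-umlaut
stored = normalize_name(original)    # u-umlaut becomes "u" + U+0308

print(len(original.encode("utf-8")), len(stored.encode("utf-8")))
```

The original name occupies 8 bytes in UTF-8 while its NFD form occupies 9, which is why the ``alternate_name`` is needed to hand the application back the exact bytes it used.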
Case Folding
------------
The ``ceph.dir.casesensitive`` attribute accepts a boolean value. By
default, names are case-sensitive (as normal in a POSIX file system). Setting
this value to false will make the directory (and its children)
case-insensitive.
Case folding requires that names are also normalized. By default, after setting
a directory to be case-insensitive, the ``charmap`` will be:
::
$ setfattr -n ceph.dir.casesensitive -v 0 foo/
$ getfattr -n ceph.dir.casesensitive foo/
# file: foo/
ceph.dir.casesensitive="0"
$ getfattr -n ceph.dir.charmap foo/
# file: foo/
ceph.dir.charmap="{\"casesensitive\":false,\"normalization\":\"nfd\",\"encoding\":\"utf8\"}"
Note that setting the case sensitivity on a directory will cause the default
normalization to be selected.
.. note:: Normalization is applied before case folding. The directory entry name used
by the MDS is the case folded and normalized name.
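
The ordering described in the note (normalize first, then case fold) can be sketched with the Python standard library. This is an illustration under the assumption that ``str.casefold`` behaves like the client's case folding for this example; the real client uses Boost.Locale and ICU, and ``fold_name`` is a hypothetical name.

```python
import unicodedata


def fold_name(name, casesensitive=False, form="NFD"):
    """Normalize a name, then case fold it unless the directory is case-sensitive."""
    normalized = unicodedata.normalize(form, name)
    return normalized if casesensitive else normalized.casefold()


# "Grüßen": the u-umlaut decomposes and the sharp s folds to "ss".
folded = fold_name("Gr\u00fc\u00dfen")
print(folded.encode("utf-8").hex())
```

Two names that differ only by case, such as ``README`` and ``readme``, map to the same stored name and therefore refer to the same directory entry.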
Removing Character Mapping
--------------------------
If a directory is empty and not part of a snapshot, the ``charmap`` can be
removed:
::
$ setfattr -x ceph.dir.charmap foo/
One can confirm that this restores the normal CephFS behavior:
::
$ getfattr -n ceph.dir.charmap foo/
foo/: ceph.dir.charmap: No such attribute
If the attribute does not exist, then there is no character mapping for the
directory. Note that a (future) child or parent directory may have a charmap
configuration but it will have no effect on this directory. A charmap
configuration is only inherited at directory creation.
.. note:: The default charmap includes normalization that cannot be disabled.
The only way to turn off this functionality is by removing
this ``charmap`` virtual extended attribute.
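
The inheritance rule (a copy is taken only at directory creation) can be modeled with a toy in-memory tree. This is a conceptual sketch only: the ``Dir`` class is hypothetical, it ignores the snapshot precondition, and it does not reflect how the MDS stores metadata.

```python
import copy
import errno
import os


class Dir:
    """Toy directory that copies its parent's charmap when created."""

    def __init__(self, charmap=None):
        self.charmap = charmap
        self.children = {}

    def mkdir(self, name):
        # The child receives a copy; later changes are independent.
        child = Dir(copy.deepcopy(self.charmap))
        self.children[name] = child
        return child

    def remove_charmap(self):
        # Mirrors the "directory must be empty" restriction.
        if self.children:
            raise OSError(errno.ENOTEMPTY, os.strerror(errno.ENOTEMPTY))
        self.charmap = None


parent = Dir({"casesensitive": False, "normalization": "nfd", "encoding": "utf8"})
child = parent.mkdir("sub")
child.remove_charmap()  # allowed: the child is empty
```

After this sequence the child has no charmap while the parent keeps its own, and the parent's charmap can no longer be removed because it is not empty.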
Restricting Incompatible Client Access
--------------------------------------
The MDS protects access to directory trees with a ``charmap`` via a new client
feature bit. The MDS will not allow a client that does not understand the
``charmap`` feature to modify a directory with a ``charmap`` configuration
except to unlink files or remove subdirectories.
You can also require that all clients understand the ``charmap`` feature
to use the file system at all:
.. prompt:: bash #
ceph fs required_client_features <fs_name> add charmap
.. note:: The kernel driver does not understand the ``charmap`` feature
and probably will not because existing kernel libraries have
opinionated case folding and normalization forms. For this reason,
adding ``charmap`` to the required client features is not
recommended.
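
The access rule for clients lacking the feature can be summarized as a small decision function. This is a hedged sketch rather than the MDS implementation: the operation names and ``op_permitted`` are hypothetical, and the bit number 22 is taken from the qa test included in this change.

```python
CEPHFS_FEATURE_CHARMAP = 22  # bit number as used by the qa test in this change

# Hypothetical operation names, for illustration only.
READ_OPS = {"lookup", "readdir", "getattr"}
REMOVAL_OPS = {"unlink", "rmdir"}


def op_permitted(op, client_features, dir_has_charmap):
    """Decide whether a client may run op on a directory."""
    if not dir_has_charmap or CEPHFS_FEATURE_CHARMAP in client_features:
        return True
    # Clients without the feature may still read and remove entries.
    return op in READ_OPS or op in REMOVAL_OPS


legacy_client = set(range(CEPHFS_FEATURE_CHARMAP))  # every bit below charmap
print(op_permitted("mkdir", legacy_client, True))
print(op_permitted("unlink", legacy_client, True))
```

Allowing removals lets an older client clean up a tree it cannot otherwise modify, without risking the creation of names that bypass normalization.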
Permissions
-----------
As with other CephFS virtual extended attributes, a client may only set the
``charmap`` configuration on a directory with the **p** MDS auth cap. Viewing
the configuration does not require this cap.
.. _Unicode normalization standard documents: https://unicode.org/reports/tr15/


@ -1017,6 +1017,121 @@ This enables distributed subtree partitioning policy for the "csi" subvolume
group. This will cause every subvolume within the group to be automatically
pinned to one of the available ranks on the file system.
Normalization and Case Sensitivity
----------------------------------
The subvolumegroup and subvolume interfaces have a porcelain layer API to
manipulate the ``ceph.dir.charmap`` configurations (see also :ref:`charmap`).
Configuring the charmap
~~~~~~~~~~~~~~~~~~~~~~~
To configure the charmap, for a subvolumegroup:
.. prompt:: bash #
ceph fs subvolumegroup charmap set <vol_name> <group_name> <setting> <value>
Or for a subvolume:
.. prompt:: bash #
ceph fs subvolume charmap set <vol_name> <subvol> <--group_name=name> <setting> <value>
For example:
.. prompt:: bash #
ceph fs subvolumegroup charmap set vol csi normalization nfd
outputs:
::
{"casesensitive":true,"normalization":"nfd","encoding":"utf8"}
Reading the charmap
~~~~~~~~~~~~~~~~~~~
To read the configuration, for a subvolumegroup:
.. prompt:: bash #
ceph fs subvolumegroup charmap get <vol_name> <group_name> <setting>
Or for a subvolume:
.. prompt:: bash #
ceph fs subvolume charmap get <vol_name> <subvol> <--group_name=name> <setting>
For example:
.. prompt:: bash #
ceph fs subvolume charmap get vol subvol --group_name=csi casesensitive
outputs:
::
0
To read the full ``charmap``, for a subvolumegroup:
.. prompt:: bash #
ceph fs subvolumegroup charmap get <vol_name> <group_name>
Or for a subvolume:
.. prompt:: bash #
ceph fs subvolume charmap get <vol_name> <subvol> <--group_name=name>
For example:
.. prompt:: bash #
ceph fs subvolumegroup charmap get vol csi
outputs:
::
{"casesensitive":false,"normalization":"nfd","encoding":"utf8"}
Removing the charmap
~~~~~~~~~~~~~~~~~~~~
To remove the configuration, for a subvolumegroup:
.. prompt:: bash #
ceph fs subvolumegroup charmap rm <vol_name> <group_name>
Or for a subvolume:
.. prompt:: bash #
ceph fs subvolume charmap rm <vol_name> <subvol> <--group_name=name>
For example:
.. prompt:: bash #
ceph fs subvolumegroup charmap rm vol csi
outputs:
::
{}
.. note:: A charmap can only be removed when a subvolumegroup or subvolume is empty.
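
For scripting, the command forms shown in this section can be assembled into argument lists. The sketch below only builds the argument vector and does not run anything; ``charmap_argv`` is a hypothetical helper, and the argument order follows the synopsis lines above.

```python
def charmap_argv(kind, action, vol_name, name, group_name=None,
                 setting=None, value=None):
    """Build the argv for a 'ceph fs <kind> charmap <action>' command."""
    if kind not in ("subvolumegroup", "subvolume"):
        raise ValueError(f"unknown kind: {kind}")
    if action not in ("set", "get", "rm"):
        raise ValueError(f"unknown action: {action}")
    argv = ["ceph", "fs", kind, "charmap", action, vol_name, name]
    # Only subvolumes take an optional group; a group is named positionally.
    if kind == "subvolume" and group_name is not None:
        argv.append(f"--group_name={group_name}")
    if action == "set":
        if setting is None or value is None:
            raise ValueError("set requires a setting and a value")
        argv += [setting, str(value)]
    elif action == "get" and setting is not None:
        argv.append(setting)
    return argv


print(" ".join(charmap_argv("subvolumegroup", "set", "vol", "csi",
                            setting="normalization", value="nfd")))
```

The resulting list can be passed to ``subprocess.run`` on a host with the ``ceph`` CLI and suitable credentials.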
Subvolume quiesce
-----------------


@ -145,6 +145,7 @@ CephFS Concepts
Distributed Metadata Cache <mdcache>
Dynamic Metadata Management in CephFS <dynamic-metadata-management>
CephFS IO Path <cephfs-io-path>
Case Sensitivity and Normalization <charmap>
LazyIO <lazyio>
Directory fragmentation <dirfrags>
Multiple active MDS daemons <multimds>


@ -191,11 +191,6 @@ function install_boost_on_ubuntu {
grep -e 'libboost[0-9].[0-9]\+-dev' |
cut -d' ' -f2 |
cut -d'.' -f1,2)
if test -n "$installed_ver"; then
if echo "$installed_ver" | grep -q "^$boost_ver"; then
return
fi
fi
local codename=$1
local project=libboost
local sha1=55f34507d322314fb0294629b7c0bb406de07aec
@ -212,6 +207,7 @@ function install_boost_on_ubuntu {
ceph-libboost-date-time${boost_ver}-dev \
ceph-libboost-filesystem${boost_ver}-dev \
ceph-libboost-iostreams${boost_ver}-dev \
ceph-libboost-locale${boost_ver}-dev \
ceph-libboost-program-options${boost_ver}-dev \
ceph-libboost-python${boost_ver}-dev \
ceph-libboost-random${boost_ver}-dev \


@ -0,0 +1,9 @@
# charmap is only supported by the userspace Client at this time
teuthology:
postmerge:
- if not is_fuse() then reject() end
tasks:
- cephfs_test_runner:
fail_on_skip: false
modules:
- tasks.cephfs.test_dir_charmap


@ -23,6 +23,14 @@ log = logging.getLogger(__name__)
UMOUNT_TIMEOUT = 300
class NoSuchAttributeError(SystemError):
pass
class InvalidArgumentError(SystemError):
pass
class DirectoryNotEmptyError(SystemError):
pass
class OperationNotPermittedError(SystemError):
pass
class CephFSMountBase(object):
def __init__(self, ctx, test_dir, client_id, client_remote,
@ -1564,7 +1572,23 @@ class CephFSMountBase(object):
# gives you [''] instead of []
return []
def removexattr(self, path, key, **kwargs):
def _convert_attr_error(self, p, e):
stderr = p.stderr.getvalue()
log.error("attr: %s", stderr)
if "No such attribute" in stderr:
raise NoSuchAttributeError()
elif "Invalid" in stderr:
raise InvalidArgumentError()
elif "Permission" in stderr:
raise PermissionError()
elif "Directory not empty" in stderr:
raise DirectoryNotEmptyError()
elif "Operation not permitted" in stderr:
raise OperationNotPermittedError()
else:
raise e
def removexattr(self, path, key, helpfulexception=False, **kwargs):
"""
Wrap setfattr removal.
@ -1576,9 +1600,21 @@ class CephFSMountBase(object):
if kwargs.pop('sudo', False):
kwargs['args'].insert(0, 'sudo')
kwargs['omit_sudo'] = False
self.run_shell(**kwargs)
wait = kwargs.setdefault('wait', True)
if wait:
kwargs['wait'] = False
p = self.run_shell(**kwargs)
try:
if wait:
p.wait()
except CommandFailedError as e:
if helpfulexception:
return self._convert_attr_error(p, e)
else:
raise
return p
def setfattr(self, path, key, val, **kwargs):
def setfattr(self, path, key, val, helpfulexception=False, **kwargs):
"""
Wrap setfattr.
@ -1591,9 +1627,22 @@ class CephFSMountBase(object):
if kwargs.pop('sudo', False):
kwargs['args'].insert(0, 'sudo')
kwargs['omit_sudo'] = False
return self.run_shell(**kwargs)
wait = kwargs.setdefault('wait', True)
if wait:
kwargs['wait'] = False
p = self.run_shell(**kwargs)
try:
if wait:
p.wait()
except CommandFailedError as e:
if helpfulexception:
return self._convert_attr_error(p, e)
else:
raise
return p
def getfattr(self, path, attr, **kwargs):
def getfattr(self, path, attr, helpfulexception=False, **kwargs):
"""
Wrap getfattr: return the values of a named xattr on one file, or
None if the attribute is not found.
@ -1604,16 +1653,23 @@ class CephFSMountBase(object):
if kwargs.pop('sudo', False):
kwargs['args'].insert(0, 'sudo')
kwargs['omit_sudo'] = False
kwargs['wait'] = False
wait = kwargs.setdefault('wait', True)
if wait:
kwargs['wait'] = False
p = self.run_shell(**kwargs)
try:
p.wait()
except CommandFailedError as e:
if e.exitstatus == 1 and "No such attribute" in p.stderr.getvalue():
return None
if wait:
p.wait()
else:
raise
return p
except CommandFailedError as e:
if helpfulexception:
return self._convert_attr_error(p, e)
else:
if e.exitstatus == 1 and "No such attribute" in p.stderr.getvalue():
return None
else:
raise
return str(p.stdout.getvalue())
def df(self):


@ -0,0 +1,370 @@
import base64
import json
from logging import getLogger
from teuthology.exceptions import CommandFailedError
from tasks.cephfs.cephfs_test_case import CephFSTestCase
from tasks.cephfs.mount import NoSuchAttributeError, InvalidArgumentError, DirectoryNotEmptyError
log = getLogger(__name__)
class CharMapMixin:
def check_cs(self, path, **kwargs):
what = kwargs
what.setdefault("casesensitive", True)
what.setdefault("normalization", "nfd")
what.setdefault("encoding", "utf8")
v = self.mount_a.getfattr(path, "ceph.dir.charmap", helpfulexception=True)
J = json.loads(v)
log.debug("cs = %s", v)
self.assertEqual(what, J)
class TestCharMapVxattr(CephFSTestCase, CharMapMixin):
CLIENTS_REQUIRED = 1
MDSS_REQUIRED = 1
def test_cs_get_charmap_none(self):
"""
That getvxattr for a charmap fails if not present in Inode.
"""
self.mount_a.run_shell_payload("mkdir foo/")
try:
self.check_cs("foo")
except NoSuchAttributeError:
pass
else:
self.fail("should raise error")
def test_cs_get_charmap_set(self):
"""
That setvxattr fails for charmap.
"""
self.mount_a.run_shell_payload("mkdir foo/")
try:
self.mount_a.setfattr("foo/", "ceph.dir.charmap", "0", helpfulexception=True)
except InvalidArgumentError:
pass
else:
self.fail("should raise error")
def test_cs_set_charmap_inherited(self):
"""
That charmap is inherited.
"""
self.mount_a.run_shell_payload("mkdir foo/")
self.mount_a.setfattr("foo/", "ceph.dir.casesensitive", "0")
self.mount_a.run_shell_payload("mkdir foo/bar")
self.check_cs("foo/bar/", casesensitive=False)
def test_cs_get_charmap_none_rm(self):
"""
That rmvxattr actually removes the metadata from the Inode.
"""
self.mount_a.run_shell_payload("mkdir foo/")
self.mount_a.setfattr("foo/", "ceph.dir.casesensitive", "0")
self.mount_a.removexattr("foo/", "ceph.dir.charmap")
try:
self.check_cs("foo")
except NoSuchAttributeError:
pass
else:
self.fail("should raise error")
def test_cs_get_charmap_none_dup(self):
"""
That rmvxattr is idempotent.
"""
self.mount_a.run_shell_payload("mkdir foo/")
self.mount_a.setfattr("foo/", "ceph.dir.casesensitive", "0")
self.mount_a.removexattr("foo/", "ceph.dir.charmap")
self.mount_a.removexattr("foo/", "ceph.dir.charmap")
def test_cs_set_encoding_valid(self):
"""
That we can set ceph.dir.encoding and check it.
"""
self.mount_a.run_shell_payload("mkdir foo/")
self.mount_a.setfattr("foo/", "ceph.dir.encoding", "utf16")
self.check_cs("foo", encoding="utf16")
self.mount_a.setfattr("foo/", "ceph.dir.encoding", "utf8")
self.check_cs("foo", encoding="utf8")
def test_cs_set_encoding_garbage(self):
"""
That a garbage encoding is accepted but prevents creating any dentries.
"""
self.mount_a.run_shell_payload("mkdir foo/")
self.mount_a.setfattr("foo/", "ceph.dir.encoding", "garbage")
self.check_cs("foo", encoding="garbage")
try:
p = self.mount_a.run_shell_payload("mkdir foo/test", wait=False)
p.wait()
except CommandFailedError:
stderr = p.stderr.getvalue()
self.assertIn("Permission denied", stderr)
else:
self.fail("should fail")
def test_cs_rm_encoding(self):
"""
That removing the encoding without any other charmappings will restore access.
"""
self.mount_a.run_shell_payload("mkdir foo/")
self.mount_a.setfattr("foo/", "ceph.dir.encoding", "garbage")
self.mount_a.removexattr("foo/", "ceph.dir.encoding")
self.check_cs("foo")
self.mount_a.run_shell_payload("mkdir foo/test")
def test_cs_set_insensitive_valid(self):
"""
That we can set ceph.dir.casesensitive and check it.
"""
self.mount_a.run_shell_payload("mkdir foo/")
self.mount_a.setfattr("foo/", "ceph.dir.casesensitive", "0")
self.check_cs("foo", casesensitive=False)
self.mount_a.setfattr("foo/", "ceph.dir.casesensitive", "1")
self.check_cs("foo")
def test_cs_set_insensitive_garbage(self):
"""
That setting ceph.dir.casesensitive to garbage is rejected (should be bool).
"""
self.mount_a.run_shell_payload("mkdir foo/")
try:
self.mount_a.setfattr("foo/", "ceph.dir.casesensitive", "abc", helpfulexception=True)
except InvalidArgumentError:
pass
else:
self.fail("should fail")
try:
self.check_cs("foo")
except NoSuchAttributeError:
pass
else:
self.fail("should raise error")
def test_cs_rm_insensitive(self):
"""
That we can remove ceph.dir.casesensitive and restore the default.
"""
self.mount_a.run_shell_payload("mkdir foo/")
self.mount_a.setfattr("foo/", "ceph.dir.casesensitive", "0")
self.mount_a.removexattr("foo/", "ceph.dir.casesensitive")
self.check_cs("foo")
def test_cs_set_normalization(self):
"""
That we can set ceph.dir.normalization and check it.
"""
self.mount_a.run_shell_payload("mkdir foo/")
self.mount_a.setfattr("foo/", "ceph.dir.normalization", "nfc")
self.check_cs("foo", normalization="nfc")
self.mount_a.setfattr("foo/", "ceph.dir.normalization", "nfd")
self.check_cs("foo", normalization="nfd")
def test_cs_set_normalization_garbage(self):
"""
That a garbage normalization is accepted but prevents creating any dentries.
"""
self.mount_a.run_shell_payload("mkdir foo/")
self.mount_a.setfattr("foo/", "ceph.dir.normalization", "abc")
self.check_cs("foo", normalization="abc")
try:
p = self.mount_a.run_shell_payload("mkdir foo/test", wait=False)
p.wait()
except CommandFailedError:
stderr = p.stderr.getvalue()
self.assertIn("Permission denied", stderr)
else:
self.fail("should fail")
def test_cs_feature_bit(self):
"""
That the CEPHFS_FEATURE_CHARMAP feature bit enforces access.
"""
self.mount_a.run_shell_payload("mkdir foo/")
self.mount_a.setfattr("foo/", "ceph.dir.casesensitive", "0")
self.check_cs("foo", casesensitive=False)
self.mount_a.run_shell_payload("dd if=/dev/urandom of=foo/Test1 bs=4k count=1")
CEPHFS_FEATURE_CHARMAP = 22
# all but CEPHFS_FEATURE_CHARMAP
features = ",".join([str(i) for i in range(CEPHFS_FEATURE_CHARMAP)])
mntargs = [f"--client_debug_inject_features={features}"]
self.mount_a.remount(mntargs=mntargs)
self.check_cs("foo", casesensitive=False)
cmds = [
"mkdir foo/test2",
"ln -s . foo/test2",
"ln foo/Test1 foo/test2",
"dd if=/dev/urandom of=foo/test2 bs=4k count=1",
"mv foo/Test1 foo/Test2",
]
for cmd in cmds:
try:
p = self.mount_a.run_shell_payload(cmd, wait=False)
p.wait()
except CommandFailedError:
stderr = p.stderr.getvalue()
self.assertIn("Operation not permitted", stderr)
else:
self.fail("should fail")
okay_cmds = [
"ls foo/",
"stat foo/test1",
"rm foo/test1",
]
for cmd in okay_cmds:
try:
p = self.mount_a.run_shell_payload(cmd, wait=False)
p.wait()
except CommandFailedError:
stderr = p.stderr.getvalue()
self.fail("command failed:\n%s" % stderr)
def test_cs_remount(self):
"""
That a remount continues to see the charmap.
"""
self.mount_a.run_shell_payload("mkdir foo/")
self.mount_a.setfattr("foo/", "ceph.dir.casesensitive", "0")
self.check_cs("foo", casesensitive=False)
self.mount_a.umount_wait()
self.mount_a.mount()
self.check_cs("foo", casesensitive=False)
def test_cs_not_empty_set_insensitive(self):
"""
That setting a charmap fails for a non-empty directory.
"""
attrs = {
"ceph.dir.casesensitive": "0",
"ceph.dir.normalization": "nfc",
"ceph.dir.encoding": "utf8",
}
self.mount_a.run_shell_payload("mkdir -p foo/dir")
for attr, v in attrs.items():
try:
self.mount_a.setfattr("foo/", attr, v, helpfulexception=True)
except DirectoryNotEmptyError:
pass
else:
self.fail("should fail")
try:
self.check_cs("foo")
except NoSuchAttributeError:
pass
else:
self.fail("should fail")
class TestNormalization(CephFSTestCase, CharMapMixin):
"""
Test charmap normalization.
"""
def test_normalization(self):
"""
That a normalization works for a conventional example.
"""
dname = "Grüßen"
self.mount_a.run_shell_payload("mkdir foo/")
self.mount_a.setfattr("foo/", "ceph.dir.normalization", "nfd") # default
self.mount_a.run_shell_payload(f"mkdir foo/{dname}")
c = self.fs.read_cache("foo", depth=0)
self.assertEqual(len(c), 1)
frags = c[0]['dirfrags']
self.assertEqual(len(frags), 1)
frag = frags[0]
dentries = frag['dentries']
self.assertEqual(len(dentries), 1)
dentry = dentries[0]
# ü to u + u0308
self.assertEqual(dentry['path'], "foo/Gru\u0308\u00dfen")
altn = dentry['alternate_name']
altn_bin = base64.b64decode(altn)
expected = bytes([0x47, 0x72, 0xc3, 0xbc, 0xc3, 0x9f, 0x65, 0x6e]) # 8 not 9 chars
self.assertIn(expected, altn_bin)
class TestEncoding(CephFSTestCase, CharMapMixin):
"""
Test charmap encoding.
"""
def test_encoding(self):
"""
That an encoding-only charmap still normalizes.
"""
# N.B.: you cannot disable normalization. Setting to empty string
# restores default.
dname = "Grüßen"
self.mount_a.run_shell_payload("mkdir foo/")
self.mount_a.setfattr("foo/", "ceph.dir.encoding", "utf8")
self.mount_a.run_shell_payload(f"mkdir foo/{dname}")
c = self.fs.read_cache("foo", depth=0)
self.assertEqual(len(c), 1)
frags = c[0]['dirfrags']
self.assertEqual(len(frags), 1)
frag = frags[0]
dentries = frag['dentries']
self.assertEqual(len(dentries), 1)
dentry = dentries[0]
self.assertEqual(dentry['path'], "foo/Gru\u0308\u00dfen")
altn = dentry['alternate_name']
altn_bin = base64.b64decode(altn)
expected = bytes([0x47, 0x72, 0xc3, 0xbc, 0xc3, 0x9f, 0x65, 0x6e]) # 8 not 9 chars
self.assertIn(expected, altn_bin)
class TestCaseFolding(CephFSTestCase, CharMapMixin):
"""
Test charmap case folding.
"""
def test_casefolding(self):
"""
That a case folding works for a conventional example.
"""
dname = "Grüßen"
self.mount_a.run_shell_payload("mkdir foo/")
self.mount_a.setfattr("foo/", "ceph.dir.casesensitive", "0")
self.mount_a.run_shell_payload(f"mkdir foo/{dname}")
c = self.fs.read_cache("foo", depth=0)
self.assertEqual(len(c), 1)
frags = c[0]['dirfrags']
self.assertEqual(len(frags), 1)
frag = frags[0]
dentries = frag['dentries']
self.assertEqual(len(dentries), 1)
dentry = dentries[0]
path = dentry['path'].encode('utf-8')
# Grüßen to Gru \u0308 ssen
# foo/gru\u0308ssen
expected = bytes([0x66, 0x6f, 0x6f, 0x2f, 0x67, 0x72, 0x75, 0xcc, 0x88, 0x73, 0x73, 0x65, 0x6e])
self.assertEqual(path, expected)
# Grüßen
altn = dentry['alternate_name']
altn_bin = base64.b64decode(altn)
expected = base64.b64decode("R3LDvMOfZW4=") # 8 chars, not 9
self.assertIn(expected, altn_bin)


@ -1861,6 +1861,32 @@ class TestSubvolumeGroups(TestVolumesHelper):
# verify trash dir is clean
self._wait_for_trash_empty()
def test_subvolumegroup_charmap(self):
attrs = {
"normalization": "nfc",
"encoding": "utf8",
"casesensitive": False,
}
group = "foo"
self._fs_cmd("subvolumegroup", "create", self.volname, group)
for setting, value in attrs.items():
self._fs_cmd("subvolumegroup", "charmap", "set", self.volname, "--group_name", group, setting, str(value))
v = self._fs_cmd("subvolumegroup", "charmap", "get", self.volname, "--group_name", group)
v = json.loads(v)
self.assertEqual(v, attrs)
def test_subvolumegroup_charmap_rm(self):
group = "foo"
self._fs_cmd("subvolumegroup", "create", self.volname, group)
self._fs_cmd("subvolumegroup", "charmap", "set", self.volname, "--group_name", group, "normalization", "nfc")
self._fs_cmd("subvolumegroup", "charmap", "rm", self.volname, "--group_name", group)
try:
self._fs_cmd("subvolumegroup", "charmap", "get", self.volname, "--group_name", group)
except CommandFailedError:
pass # ENODATA
else:
self.fail("should fail")
def test_subvolume_group_rm_force(self):
# test removing non-existing subvolume group with --force
group = self._gen_subvol_grp_name()
@ -2052,6 +2078,42 @@ class TestSubvolumes(TestVolumesHelper):
# remove group
self._fs_cmd("subvolumegroup", "rm", self.volname, group)
def test_subvolume_charmap_inherited(self):
subvolume = self._gen_subvol_name()
group = self._gen_subvol_grp_name()
self._fs_cmd("subvolumegroup", "create", self.volname, group)
self._fs_cmd("subvolumegroup", "charmap", "set", self.volname, "--group_name", group, "casesensitive", "0")
self._fs_cmd("subvolume", "create", self.volname, subvolume, "--group_name", group)
v = self._fs_cmd("subvolume", "charmap", "get", self.volname, "--group_name", group, subvolume)
v = json.loads(v)
self.assertEqual(v['casesensitive'], False)
def test_subvolume_charmap(self):
subvolume = self._gen_subvol_name()
attrs = {
"normalization": "nfkd",
"encoding": "utf8",
"casesensitive": False,
}
self._fs_cmd("subvolume", "create", self.volname, subvolume)
for setting, value in attrs.items():
self._fs_cmd("subvolume", "charmap", "set", self.volname, subvolume, setting, str(value))
v = self._fs_cmd("subvolume", "charmap", "get", self.volname, subvolume)
v = json.loads(v)
self.assertEqual(v, attrs)
def test_subvolume_charmap_rm(self):
subvolume = self._gen_subvol_name()
self._fs_cmd("subvolume", "create", self.volname, subvolume)
self._fs_cmd("subvolume", "charmap", "set", self.volname, subvolume, "normalization", "nfkc")
self._fs_cmd("subvolume", "charmap", "rm", self.volname, subvolume)
try:
self._fs_cmd("subvolume", "charmap", "get", self.volname, subvolume)
except CommandFailedError:
pass # ENODATA
else:
self.fail("should fail")
def test_subvolume_create_idempotence(self):
# create subvolume
subvolume = self._gen_subvol_name()


@ -1115,3 +1115,11 @@
fun:_dl_init
...
}
{
client icu leaks
Memcheck:Leak
match-leak-kinds: reachable
fun:malloc
obj:/usr/lib64/libicuuc.so.67.1
...
}


@ -12,4 +12,8 @@ set(libclient_srcs
add_library(client STATIC ${libclient_srcs})
target_link_libraries(client
legacy-option-headers
osdc)
osdc
Boost::locale
ICU::uc
ICU::i18n
)

File diff suppressed because it is too large


@ -39,11 +39,13 @@
#include "osdc/ObjectCacher.h"
#include "RWRef.h"
#include "DentryRef.h"
#include "InodeRef.h"
#include "MetaSession.h"
#include "UserPerm.h"
#include <fstream>
#include <locale>
#include <map>
#include <memory>
#include <set>
@ -159,6 +161,7 @@ struct dir_result_t {
explicit dentry(int64_t o) : offset(o) {}
dentry(int64_t o, std::string n, std::string an, InodeRef in) :
offset(o), name(std::move(n)), alternate_name(std::move(an)), inode(std::move(in)) {}
void print(std::ostream& os) const;
};
struct dentry_off_lt {
bool operator()(const dentry& d, int64_t off) const {
@ -167,7 +170,7 @@ struct dir_result_t {
};
explicit dir_result_t(Inode *in, const UserPerm& perms, int fd);
explicit dir_result_t(InodeRef in, const UserPerm& perms, int fd);
static uint64_t make_fpos(unsigned h, unsigned l, bool hash) {
@ -269,11 +272,6 @@ public:
typedef int (*add_dirent_cb_t)(void *p, struct dirent *de, struct ceph_statx *stx, off_t off, Inode *in);
struct walk_dentry_result {
InodeRef in;
std::string alternate_name;
};
class CommandHook : public AdminSocketHook {
public:
explicit CommandHook(Client *client);
@ -399,15 +397,22 @@ public:
void seekdir(dir_result_t *dirp, loff_t offset);
int may_delete(const char *relpath, const UserPerm& perms);
int link(const char *existing, const char *newname, const UserPerm& perm, std::string alternate_name="");
int link(const char *oldpath, const char *newpath, const UserPerm& perm) {
return do_link(oldpath, newpath, perm);
}
int unlink(const char *path, const UserPerm& perm);
int unlinkat(int dirfd, const char *relpath, int flags, const UserPerm& perm);
int rename(const char *from, const char *to, const UserPerm& perm, std::string alternate_name="");
int rename(const char *from, const char *to, const UserPerm& perm) {
return do_rename(from, to, perm);
}
// dirs
int mkdir(const char *path, mode_t mode, const UserPerm& perm, std::string alternate_name="");
int mkdirat(int dirfd, const char *relpath, mode_t mode, const UserPerm& perm,
std::string alternate_name="");
int mkdir(const char *path, mode_t mode, const UserPerm& perm) {
return mkdirat(CEPHFS_AT_FDCWD, path, mode, perm);
}
int mkdirat(int dirfd, const char *path, mode_t mode, const UserPerm& perm) {
return do_mkdirat(dirfd, path, mode, perm);
}
int mkdirs(const char *path, mode_t mode, const UserPerm& perms);
int rmdir(const char *path, const UserPerm& perms);
@ -415,12 +420,12 @@ public:
int readlink(const char *path, char *buf, loff_t size, const UserPerm& perms);
int readlinkat(int dirfd, const char *relpath, char *buf, loff_t size, const UserPerm& perms);
int symlink(const char *existing, const char *newname, const UserPerm& perms, std::string alternate_name="");
int symlinkat(const char *target, int dirfd, const char *relpath, const UserPerm& perms,
std::string alternate_name="");
// path traversal for high-level interface
int walk(std::string_view path, struct walk_dentry_result* result, const UserPerm& perms, bool followsym=true);
int symlink(const char *target, const char *linkpath, const UserPerm& perms) {
return symlinkat(target, CEPHFS_AT_FDCWD, linkpath, perms);
}
int symlinkat(const char *target, int dirfd, const char *linkpath, const UserPerm& perms) {
return do_symlinkat(target, dirfd, linkpath, perms);
}
// inode stuff
unsigned statx_to_mask(unsigned int flags, unsigned int want);
@ -464,21 +469,21 @@ public:
// file ops
int mknod(const char *path, mode_t mode, const UserPerm& perms, dev_t rdev=0);
int create_and_open(int dirfd, const char *relpath, int flags, const UserPerm& perms,
mode_t mode, int stripe_unit, int stripe_count, int object_size,
const char *data_pool, std::string alternate_name);
int open(const char *path, int flags, const UserPerm& perms, mode_t mode=0, std::string alternate_name="") {
return open(path, flags, perms, mode, 0, 0, 0, NULL, alternate_name);
int open(const char *path, int flags, const UserPerm& perms, mode_t mode=0) {
return open(path, flags, perms, mode, 0, 0, 0, NULL);
}
int open(const char *path, int flags, const UserPerm& perms,
mode_t mode, int stripe_unit, int stripe_count, int object_size,
const char *data_pool, std::string alternate_name="");
int openat(int dirfd, const char *relpath, int flags, const UserPerm& perms,
const char *data_pool) {
return openat(CEPHFS_AT_FDCWD, path, flags, perms, mode, stripe_unit, stripe_count, object_size, data_pool);
}
int openat(int dirfd, const char *path, int flags, const UserPerm& perms, mode_t mode=0) {
return openat(dirfd, path, flags, perms, mode, 0, 0, 0, NULL);
}
int openat(int dirfd, const char *path, int flags, const UserPerm& perms,
mode_t mode, int stripe_unit, int stripe_count,
int object_size, const char *data_pool, std::string alternate_name);
int openat(int dirfd, const char *path, int flags, const UserPerm& perms, mode_t mode=0,
std::string alternate_name="") {
return openat(dirfd, path, flags, perms, mode, 0, 0, 0, NULL, alternate_name);
int object_size, const char *data_pool) {
return do_openat(dirfd, path, flags, perms, mode, stripe_unit, stripe_count, object_size, data_pool);
}
int lookup_hash(inodeno_t ino, inodeno_t dirino, const char *name,
@ -562,7 +567,7 @@ public:
int mds_check_access(std::string& path, const UserPerm& perms, int mask);
// Inode permission checking
int inode_permission(Inode *in, const UserPerm& perms, unsigned want);
int inode_permission(const InodeRef& in, const UserPerm& perms, unsigned want);
// expose caps
int get_caps_issued(int fd);
@ -933,6 +938,13 @@ public:
return std::make_pair(opened_inodes, inode_map.size());
}
void set_is_fuse() {
is_fuse = true;
}
bool get_fuse_default_permissions() const {
return fuse_default_permissions;
}
/* timer_lock for 'timer' */
ceph::mutex timer_lock = ceph::make_mutex("Client::timer_lock");
SafeTimer timer;
@ -945,20 +957,33 @@ public:
std::unique_ptr<PerfCounters> logger;
std::unique_ptr<MDSMap> mdsmap;
bool fuse_default_permissions;
bool _collect_and_send_global_metrics;
protected:
struct walk_dentry_result {
DentryRef dn;
InodeRef target;
InodeRef diri;
std::string dname;
std::string alternate_name;
filepath getpath() const;
void print(std::ostream& os) const;
};
std::list<ceph::condition_variable*> waiting_for_reclaim;
/* Flags for check_caps() */
static const unsigned CHECK_CAPS_NODELAY = 0x1;
static const unsigned CHECK_CAPS_SYNCHRONOUS = 0x2;
void check_caps(Inode *in, unsigned flags);
void check_caps(const InodeRef& in, unsigned flags);
bool _wrap_name(const Inode& diri, std::string& dname, std::string& alternate_name);
std::string _unwrap_name(const Inode& diri, const std::string& dname, const std::string& alternate_name);
void set_cap_epoch_barrier(epoch_t e);
void handle_command_reply(const MConstRef<MCommandReply>& m);
bool handle_command_reply(const MConstRef<MCommandReply>& m);
int fetch_fsmap(bool user);
int resolve_mds(
const std::string &mds_spec,
@ -982,6 +1007,9 @@ protected:
void dump_mds_requests(Formatter *f);
void dump_mds_sessions(Formatter *f, bool cap_dump=false);
// path traversal for testing, looking at walk_dentry_result
int walk(std::string_view path, struct walk_dentry_result* result, const UserPerm& perms, bool followsym=true);
int make_request(MetaRequest *req, const UserPerm& perms,
InodeRef *ptarget = 0, bool *pcreated = 0,
mds_rank_t use_mds=-1, bufferlist *pdirbl=0,
@ -1010,10 +1038,24 @@ protected:
void handle_client_reply(const MConstRef<MClientReply>& reply);
bool is_dir_operation(MetaRequest *request);
int path_walk(const filepath& fp, struct walk_dentry_result* result, const UserPerm& perms, bool followsym=true, int mask=0,
InodeRef dirinode=nullptr);
int path_walk(const filepath& fp, InodeRef *end, const UserPerm& perms,
bool followsym=true, int mask=0, InodeRef dirinode=nullptr);
int create_and_open(int dirfd, const char *relpath, int flags, const UserPerm& perms,
mode_t mode, int stripe_unit, int stripe_count, int object_size,
const char *data_pool, std::string alternate_name);
int do_mkdirat(int dirfd, const char *relpath, mode_t mode, const UserPerm& perm, std::string alternate_name="");
int do_rename(const char *from, const char *to, const UserPerm& perm, std::string alternate_name="");
int do_link(const char *existing, const char *newname, const UserPerm& perm, std::string alternate_name="");
int do_symlinkat(const char *target, int dirfd, const char *linkpath, const UserPerm& perms, std::string alternate_name="");
int do_openat(int dirfd, const char *path, int flags, const UserPerm& perms, mode_t mode, int stripe_unit, int stripe_count, int object_size, const char *data_pool, std::string alternate_name="");
struct PathWalk_ExtraOptions {
bool followsym = true;
unsigned int mask = 0;
bool is_rename = false;
bool require_target = true;
};
int path_walk(InodeRef dirinode, const filepath& fp, struct walk_dentry_result* result, const UserPerm& perms, const PathWalk_ExtraOptions& extra_options);
int path_walk(InodeRef dirinode, const filepath& fp, InodeRef *end, const UserPerm& perms, const PathWalk_ExtraOptions& extra_options);
// fake inode number for 32-bits ino_t
void _assign_faked_ino(Inode *in);
@ -1033,7 +1075,7 @@ protected:
void invalidate_snaprealm_and_children(SnapRealm *realm);
void refresh_snapdir_attrs(Inode *in, Inode *diri);
Inode *open_snapdir(Inode *diri);
InodeRef open_snapdir(const InodeRef& diri);
int get_fd() {
int fd = free_fd_set.range_start();
@ -1093,12 +1135,12 @@ protected:
void unlink(Dentry *dn, bool keepdir, bool keepdentry);
int fill_stat(Inode *in, struct stat *st, frag_info_t *dirstat=0, nest_info_t *rstat=0);
int fill_stat(InodeRef& in, struct stat *st, frag_info_t *dirstat=0, nest_info_t *rstat=0) {
int fill_stat(const InodeRef& in, struct stat *st, frag_info_t *dirstat=0, nest_info_t *rstat=0) {
return fill_stat(in.get(), st, dirstat, rstat);
}
void fill_statx(Inode *in, unsigned int mask, struct ceph_statx *stx);
void fill_statx(InodeRef& in, unsigned int mask, struct ceph_statx *stx) {
void fill_statx(const InodeRef& in, unsigned int mask, struct ceph_statx *stx) {
return fill_statx(in.get(), mask, stx);
}
@ -1110,7 +1152,7 @@ protected:
void trim_dentry(Dentry *dn);
void trim_caps(MetaSession *s, uint64_t max);
void _invalidate_kernel_dcache();
void _trim_negative_child_dentries(InodeRef& in);
void _trim_negative_child_dentries(const InodeRef& in);
void dump_inode(Formatter *f, Inode *in, set<Inode*>& did, bool disconnected);
void dump_cache(Formatter *f); // debug
@ -1120,7 +1162,7 @@ protected:
void dump_status(Formatter *f); // debug
bool ms_dispatch2(const MessageRef& m) override;
Dispatcher::dispatch_result_t ms_dispatch2(const MessageRef& m) override;
void ms_handle_connect(Connection *con) override;
bool ms_handle_reset(Connection *con) override;
@ -1650,64 +1692,56 @@ private:
// internal interface
// call these with client_lock held!
int _do_lookup(Inode *dir, const std::string& name, int mask, InodeRef *target,
int _do_lookup(const InodeRef& dir, const std::string& name, int mask, InodeRef *target,
const UserPerm& perms);
int _lookup(Inode *dir, const std::string& dname, int mask, InodeRef *target,
const UserPerm& perm, std::string* alternate_name=nullptr,
bool is_rename=false);
int _lookup(const InodeRef& dir, const std::string& name, std::string& alternate_name,
int mask, InodeRef *target, const UserPerm& perm, bool is_rename=false);
int _link(Inode *in, Inode *dir, const char *name, const UserPerm& perm, std::string alternate_name,
InodeRef *inp = 0);
int _link(Inode *diri_from, const char* path_from, Inode* diri_to, const char* path_to, const UserPerm& perm, std::string alternate_name);
int _unlink(Inode *dir, const char *name, const UserPerm& perm);
int _rename(Inode *olddir, const char *oname, Inode *ndir, const char *nname, const UserPerm& perm, std::string alternate_name);
int _mkdir(Inode *dir, const char *name, mode_t mode, const UserPerm& perm,
InodeRef *inp = 0, const std::map<std::string, std::string> &metadata={},
std::string alternate_name="");
int _rmdir(Inode *dir, const char *name, const UserPerm& perms);
int _rmdir(Inode *dir, const char *name, const UserPerm& perms, bool check_perms=true);
int _symlink(Inode *dir, const char *name, const char *target,
const UserPerm& perms, std::string alternate_name, InodeRef *inp = 0);
int _mknod(Inode *dir, const char *name, mode_t mode, dev_t rdev,
const UserPerm& perms, InodeRef *inp = 0);
bool make_absolute_path_string(Inode *in, std::string& path);
bool make_absolute_path_string(const InodeRef& in, std::string& path);
int _do_setattr(Inode *in, struct ceph_statx *stx, int mask,
const UserPerm& perms, InodeRef *inp,
std::vector<uint8_t>* aux=nullptr);
void stat_to_statx(struct stat *st, struct ceph_statx *stx);
int __setattrx(Inode *in, struct ceph_statx *stx, int mask,
const UserPerm& perms, InodeRef *inp = 0);
int _setattrx(InodeRef &in, struct ceph_statx *stx, int mask,
int _setattrx(const InodeRef &in, struct ceph_statx *stx, int mask,
const UserPerm& perms);
int _setattr(InodeRef &in, struct stat *attr, int mask,
int _setattr(const InodeRef &in, struct stat *attr, int mask,
const UserPerm& perms);
int _ll_setattrx(Inode *in, struct ceph_statx *stx, int mask,
const UserPerm& perms, InodeRef *inp = 0);
int _getattr(Inode *in, int mask, const UserPerm& perms, bool force=false);
int _getattr(InodeRef &in, int mask, const UserPerm& perms, bool force=false) {
return _getattr(in.get(), mask, perms, force);
}
int _readlink(Inode *in, char *buf, size_t size);
int _getattr(const InodeRef& in, int mask, const UserPerm& perms, bool force=false);
int _readlink(const InodeRef& diri, const char* relpath, char *buf, size_t size, const UserPerm& perms);
int _getxattr(Inode *in, const char *name, void *value, size_t len,
const UserPerm& perms);
int _getxattr(InodeRef &in, const char *name, void *value, size_t len,
int _getxattr(const InodeRef &in, const char *name, void *value, size_t len,
const UserPerm& perms);
int _getvxattr(Inode *in, const UserPerm& perms, const char *attr_name,
ssize_t size, void *value, mds_rank_t rank);
int _listxattr(Inode *in, char *names, size_t len, const UserPerm& perms);
int _do_setxattr(Inode *in, const char *name, const void *value, size_t len,
int flags, const UserPerm& perms);
int _setxattr(Inode *in, const char *name, const void *value, size_t len,
int flags, const UserPerm& perms);
int _setxattr(InodeRef &in, const char *name, const void *value, size_t len,
int _setxattr(const InodeRef &in, const char *name, const void *value, size_t len,
int flags, const UserPerm& perms);
int _setxattr_check_data_pool(std::string& name, std::string& value, const OSDMap *osdmap);
void _setxattr_maybe_wait_for_osdmap(const char *name, const void *value, size_t len);
int _removexattr(Inode *in, const char *nm, const UserPerm& perms);
int _removexattr(InodeRef &in, const char *nm, const UserPerm& perms);
int _open(Inode *in, int flags, mode_t mode, Fh **fhp,
int _open(const InodeRef& in, int flags, mode_t mode, Fh **fhp,
const UserPerm& perms);
int _renew_caps(Inode *in);
int _create(Inode *in, const char *name, int flags, mode_t mode, InodeRef *inp,
int _create(const walk_dentry_result& wdr, int flags, mode_t mode, InodeRef *inp,
Fh **fhp, int stripe_unit, int stripe_count, int object_size,
const char *data_pool, bool *created, const UserPerm &perms,
std::string alternate_name);
@ -1742,19 +1776,19 @@ private:
int _flock(Fh *fh, int cmd, uint64_t owner);
int _lazyio(Fh *fh, int enable);
Dentry *get_or_create(Inode *dir, const char* name);
Dentry *get_or_create(Inode *dir, const std::string& name);
int xattr_permission(Inode *in, const char *name, unsigned want,
const UserPerm& perms);
int may_setattr(Inode *in, struct ceph_statx *stx, int mask,
int may_setattr(const InodeRef& in, struct ceph_statx *stx, int mask,
const UserPerm& perms);
int may_open(Inode *in, int flags, const UserPerm& perms);
int may_lookup(Inode *dir, const UserPerm& perms);
int may_create(Inode *dir, const UserPerm& perms);
int may_delete(Inode *dir, const char *name, const UserPerm& perms);
int may_hardlink(Inode *in, const UserPerm& perms);
int may_open(const InodeRef& in, int flags, const UserPerm& perms);
int may_lookup(const InodeRef& dir, const UserPerm& perms);
int may_create(const InodeRef& dir, const UserPerm& perms);
int may_delete(const walk_dentry_result& wdr, const UserPerm& perms, bool check_perms=true);
int may_hardlink(const InodeRef& in, const UserPerm& perms);
int _getattr_for_perm(Inode *in, const UserPerm& perms);
int _getattr_for_perm(const InodeRef& in, const UserPerm& perms);
vinodeno_t _get_vino(Inode *in);
@ -1810,10 +1844,10 @@ private:
void _release_filelocks(Fh *fh);
void _update_lock_state(struct flock *fl, uint64_t owner, ceph_lock_state_t *lock_state);
int _posix_acl_create(Inode *dir, mode_t *mode, bufferlist& xattrs_bl,
int _posix_acl_create(const InodeRef& dir, mode_t *mode, bufferlist& xattrs_bl,
const UserPerm& perms);
int _posix_acl_chmod(Inode *in, mode_t mode, const UserPerm& perms);
int _posix_acl_permission(Inode *in, const UserPerm& perms, unsigned want);
int _posix_acl_chmod(const InodeRef& in, mode_t mode, const UserPerm& perms);
int _posix_acl_permission(const InodeRef& in, const UserPerm& perms, unsigned want);
mds_rank_t _get_random_up_mds() const;
@ -1830,6 +1864,10 @@ private:
void update_io_stat_read(utime_t latency);
void update_io_stat_write(utime_t latency);
bool should_check_perms() const {
return (is_fuse && !fuse_default_permissions) || (!is_fuse && client_permissions);
}
uint32_t deleg_timeout = 0;
client_switch_interrupt_callback_t switch_interrupt_cb = nullptr;
@ -1956,6 +1994,12 @@ private:
std::vector<MDSCapAuth> cap_auths;
feature_bitset_t myfeatures;
bool is_fuse = false;
bool client_permissions;
bool fuse_default_permissions;
std::locale m_locale;
};
/**

@ -9,6 +9,7 @@
#include "Inode.h"
#include "common/Formatter.h"
#include "common/strescape.h"
void Dentry::dump(Formatter *f) const
{
@ -27,7 +28,33 @@ void Dentry::dump(Formatter *f) const
f->dump_int("cap_shared_gen", cap_shared_gen);
}
std::ostream &operator<<(std::ostream &oss, const Dentry &dn)
void Dentry::print(std::ostream& os) const
{
return oss << dn.dir->parent_inode->vino() << "[\"" << dn.name << "\"]";
os << dir->parent_inode->vino();
os << "[";
os << "\"" << binstrprint(name) << "\"";
if (!alternate_name.empty()) {
os << " altn=\"" << binstrprint(alternate_name, 16) << "\"";
}
os << " ref=" << ref;
if (inode) {
os << " ino=" << inode->vino();
} else {
os << " ino=nil";
}
os << " csg=" << cap_shared_gen;
if (is_renaming) {
os << " is_renaming=true";
}
os << "]";
}
void intrusive_ptr_add_ref(Dentry* dn)
{
dn->get();
}
void intrusive_ptr_release(Dentry* dn)
{
dn->put();
}
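
Note on the hunk above: `Dentry` now exposes `print(std::ostream&) const` in place of a friend `operator<<`. The sketch below shows one way a single generic `operator<<` could pick up any type with such a member. `has_print_member`, `DemoDentry`, and `demo_to_string` are hypothetical names for illustration; the helper Ceph actually uses is not visible in this diff.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical trait: true when T has `print(std::ostream&) const`.
template <typename T, typename = void>
struct has_print_member : std::false_type {};

template <typename T>
struct has_print_member<
    T, std::void_t<decltype(std::declval<const T&>().print(
           std::declval<std::ostream&>()))>> : std::true_type {};

// One generic streaming operator for every printable type.
template <typename T>
std::enable_if_t<has_print_member<T>::value, std::ostream&>
operator<<(std::ostream& os, const T& t) {
  t.print(os);
  return os;
}

// Simplified stand-in for Dentry, keeping only a few of the printed fields.
struct DemoDentry {
  std::string name;
  std::string alternate_name;
  int cap_shared_gen = -1;

  void print(std::ostream& os) const {
    os << "[\"" << name << "\"";
    if (!alternate_name.empty()) {
      os << " altn=\"" << alternate_name << "\"";
    }
    os << " csg=" << cap_shared_gen << "]";
  }
};

// Helper that streams a DemoDentry through the generic operator.
std::string demo_to_string(const DemoDentry& dn) {
  std::ostringstream oss;
  oss << dn;
  return oss.str();
}
```

Keeping the formatting in a member function lets the type print private state without a friend declaration per class.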

@ -94,7 +94,7 @@ public:
}
void dump(Formatter *f) const;
friend std::ostream &operator<<(std::ostream &oss, const Dentry &Dentry);
void print(std::ostream&) const;
Dir *dir;
const std::string name;
@ -105,7 +105,7 @@ public:
utime_t lease_ttl;
uint64_t lease_gen = 0;
ceph_seq_t lease_seq = 0;
int cap_shared_gen = 0;
int cap_shared_gen = -1;
std::string alternate_name;
bool is_renaming = false;

src/client/DentryRef.h (new file)

@ -0,0 +1,23 @@
// -*- mode:C++; tab-width:8; c-basic-offset:2; indent-tabs-mode:t -*-
// vim: ts=8 sw=2 smarttab
/*
* Ceph - scalable distributed file system
*
* Copyright (C) 2024 IBM, Inc.
*
* This is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License version 2.1, as published by the Free Software
* Foundation. See file COPYING.
*
*/
#ifndef CEPH_CLIENT_DENTRYREF_H
#define CEPH_CLIENT_DENTRYREF_H
#include <boost/intrusive_ptr.hpp>
class Dentry;
void intrusive_ptr_add_ref(Dentry *in);
void intrusive_ptr_release(Dentry *in);
typedef boost::intrusive_ptr<Dentry> DentryRef;
#endif

@ -42,26 +42,26 @@ Inode::~Inode()
}
}
ostream& operator<<(ostream &out, const Inode &in)
void Inode::print(std::ostream& out) const
{
out << in.vino() << "("
<< "faked_ino=" << in.faked_ino
<< " nref=" << in.get_nref()
<< " ll_ref=" << in.ll_ref
<< " cap_refs=" << in.cap_refs
<< " open=" << in.open_by_mode
<< " mode=" << oct << in.mode << dec
<< " size=" << in.size << "/" << in.max_size
<< " nlink=" << in.nlink
<< " btime=" << in.btime
<< " mtime=" << in.mtime
<< " ctime=" << in.ctime
<< " change_attr=" << in.change_attr
<< " caps=" << ccap_string(in.caps_issued());
if (!in.caps.empty()) {
out << vino() << "("
<< "faked_ino=" << faked_ino
<< " nref=" << get_nref()
<< " ll_ref=" << ll_ref
<< " cap_refs=" << cap_refs
<< " open=" << open_by_mode
<< " mode=" << oct << mode << dec
<< " size=" << size << "/" << max_size
<< " nlink=" << nlink
<< " btime=" << btime
<< " mtime=" << mtime
<< " ctime=" << ctime
<< " change_attr=" << change_attr
<< " caps=" << ccap_string(caps_issued());
if (!caps.empty()) {
out << "(";
bool first = true;
for (const auto &pair : in.caps) {
for (const auto &pair : caps) {
if (!first)
out << ',';
out << pair.first << '=' << ccap_string(pair.second.issued);
@ -69,28 +69,31 @@ ostream& operator<<(ostream &out, const Inode &in)
}
out << ")";
}
if (in.dirty_caps)
out << " dirty_caps=" << ccap_string(in.dirty_caps);
if (in.flushing_caps)
out << " flushing_caps=" << ccap_string(in.flushing_caps);
if (dirty_caps)
out << " dirty_caps=" << ccap_string(dirty_caps);
if (flushing_caps)
out << " flushing_caps=" << ccap_string(flushing_caps);
if (in.flags & I_COMPLETE)
if (flags & I_COMPLETE)
out << " COMPLETE";
if (in.is_file())
out << " " << in.oset;
if (is_file())
out << " " << oset;
if (!in.dentries.empty())
out << " parents=" << in.dentries;
if (!dentries.empty())
out << " parents=" << dentries;
if (in.is_dir() && in.has_dir_layout())
if (is_dir() && has_dir_layout())
out << " has_dir_layout";
if (in.quota.is_enabled())
out << " " << in.quota;
if (quota.is_enabled())
out << " " << quota;
out << ' ' << &in << ")";
return out;
if (optmetadata.size() > 0) {
out << " " << optmetadata;
}
out << ' ' << this << ")";
}

@ -15,6 +15,8 @@
#include "mds/mdstypes.h" // hrm
#include "include/cephfs/types.h"
#include "messages/MClientReply.h"
#include "osdc/ObjectCacher.h"
#include "InodeRef.h"
@ -164,6 +166,10 @@ struct Inode : RefCountedObject {
std::vector<uint8_t> fscrypt_auth;
std::vector<uint8_t> fscrypt_file;
decltype(InodeStat::optmetadata) optmetadata;
using optkind_t = decltype(InodeStat::optmetadata)::optkind_t;
bool is_fscrypt_enabled() {
return !!fscrypt_auth.size();
}
@ -329,6 +335,15 @@ struct Inode : RefCountedObject {
void rm_fh(Fh *f) {fhs.erase(f);}
void set_async_err(int r);
void dump(Formatter *f) const;
void print(std::ostream&) const;
bool has_charmap() const {
return optmetadata.has_opt(optkind_t::CHARMAP);
}
auto& get_charmap() const {
auto& opt = optmetadata.get_opt(optkind_t::CHARMAP);
return opt.template get_meta< charmap_md_t >();
}
void break_all_delegs() { break_deleg(false); };
@ -360,6 +375,4 @@ private:
};
std::ostream& operator<<(std::ostream &out, const Inode &in);
#endif
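
Note on `has_charmap()` and `get_charmap()`: both look up an entry by kind in the inode's `optmetadata` container. The sketch below is a much-reduced model of that idea, assuming a vector kept sorted by kind and a `std::variant` payload. All `Demo*` names are hypothetical; the allocator templates and the encode/decode paths of the real types are omitted. The defaults `"nfd"` and `"utf8"` are the ones given for `charmap_md_t` in this PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <variant>
#include <vector>

struct DemoUnknown {};

struct DemoCharmap {
  bool casesensitive = true;
  std::string normalization = "nfd";
  std::string encoding = "utf8";
};

enum class DemoKind : std::uint64_t { UNKNOWN = 0, CHARMAP = 1 };

// One optional-metadata entry: a kind tag plus the matching payload.
struct DemoOpt {
  DemoKind kind;
  std::variant<DemoUnknown, DemoCharmap> meta;

  explicit DemoOpt(DemoKind k) : kind(k) {
    if (k == DemoKind::CHARMAP) {
      meta = DemoCharmap{};
    }
  }
};

// Container of entries, kept sorted by kind so lookups can binary-search.
class DemoOptMetadata {
 public:
  bool has_opt(DemoKind k) const {
    return std::any_of(opts_.begin(), opts_.end(),
                       [k](const DemoOpt& o) { return o.kind == k; });
  }

  DemoOpt& get_or_create_opt(DemoKind k) {
    auto it = std::lower_bound(
        opts_.begin(), opts_.end(), k,
        [](const DemoOpt& o, DemoKind key) { return o.kind < key; });
    if (it == opts_.end() || it->kind != k) {
      it = opts_.emplace(it, k);  // insert at the sorted position
    }
    return *it;
  }

  std::size_t size() const { return opts_.size(); }

 private:
  std::vector<DemoOpt> opts_;
};
```

A caller would check `has_opt` first, as `Inode::has_charmap()` does, before reading the payload with `std::get`.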

@ -56,28 +56,18 @@ void MetaRequest::dump(Formatter *f) const
f->dump_unsigned("owner_gid", head.owner_gid);
}
MetaRequest::~MetaRequest()
{
if (_dentry)
_dentry->put();
if (_old_dentry)
_old_dentry->put();
}
void MetaRequest::set_dentry(Dentry *d) {
ceph_assert(_dentry == NULL);
_dentry = d;
_dentry->get();
void MetaRequest::set_dentry(DentryRef dn) {
ceph_assert(_dentry.get() == NULL);
_dentry = std::move(dn);
}
Dentry *MetaRequest::dentry() {
return _dentry;
return _dentry.get();
}
void MetaRequest::set_old_dentry(Dentry *d) {
ceph_assert(_old_dentry == NULL);
_old_dentry = d;
_old_dentry->get();
void MetaRequest::set_old_dentry(DentryRef dn) {
ceph_assert(_old_dentry.get() == NULL);
_old_dentry = std::move(dn);
}
Dentry *MetaRequest::old_dentry() {
return _old_dentry;
return _old_dentry.get();
}
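
Note on the hunk above: `MetaRequest` stops calling `get()`/`put()` by hand and holds a `DentryRef` (`boost::intrusive_ptr<Dentry>`), which is why the destructor can become `= default`. The sketch below uses `MiniRef`, a hypothetical and deliberately minimal stand-in for `boost::intrusive_ptr`, to show how the two free functions drive the count. `DemoNode` and `DemoRequest` are likewise illustrative, and unlike a real refcounted object `DemoNode` does not delete itself at zero.

```cpp
#include <cassert>
#include <utility>

// Illustrative refcounted object; only tracks the count.
struct DemoNode {
  int ref = 0;
  void get() { ++ref; }
  void put() { --ref; }
};

// Hooks with the same names boost::intrusive_ptr looks up.
void intrusive_ptr_add_ref(DemoNode* n) { n->get(); }
void intrusive_ptr_release(DemoNode* n) { n->put(); }

// Minimal stand-in for boost::intrusive_ptr<T> (an assumption, not Boost).
template <typename T>
class MiniRef {
 public:
  MiniRef() = default;
  explicit MiniRef(T* p) : p_(p) {
    if (p_) intrusive_ptr_add_ref(p_);
  }
  MiniRef(const MiniRef& o) : p_(o.p_) {
    if (p_) intrusive_ptr_add_ref(p_);
  }
  MiniRef(MiniRef&& o) noexcept : p_(o.p_) { o.p_ = nullptr; }
  MiniRef& operator=(MiniRef o) noexcept {
    std::swap(p_, o.p_);
    return *this;
  }
  ~MiniRef() {
    if (p_) intrusive_ptr_release(p_);
  }
  T* get() const { return p_; }

 private:
  T* p_ = nullptr;
};

// Request that owns a reference for as long as it lives.
struct DemoRequest {
  MiniRef<DemoNode> dentry;
  void set_dentry(MiniRef<DemoNode> dn) {
    assert(dentry.get() == nullptr);
    dentry = std::move(dn);  // moving transfers the reference, no extra get()
  }
};

// Returns the count observed while the request is alive.
int demo_refs_while_request_alive(DemoNode& node) {
  DemoRequest req;
  req.set_dentry(MiniRef<DemoNode>(&node));
  return node.ref;
}
```

Passing the handle by value and moving it into the member means exactly one reference is taken per request, and it is dropped automatically when the request is destroyed.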

@ -9,20 +9,20 @@
#include "include/xlist.h"
#include "include/filepath.h"
#include "mds/mdstypes.h"
#include "DentryRef.h"
#include "InodeRef.h"
#include "UserPerm.h"
#include "messages/MClientRequest.h"
#include "messages/MClientReply.h"
class Dentry;
class dir_result_t;
struct MetaRequest {
private:
InodeRef _inode, _old_inode, _other_inode;
Dentry *_dentry = NULL; //associated with path
Dentry *_old_dentry = NULL; //associated with path2
DentryRef _dentry; //associated with path
DentryRef _old_dentry; //associated with path2
int abort_rc = 0;
public:
ceph::coarse_mono_time created = ceph::coarse_mono_clock::zero();
@ -83,7 +83,7 @@ public:
head.owner_uid = -1;
head.owner_gid = -1;
}
~MetaRequest();
~MetaRequest() = default;
/**
* Prematurely terminate the request, such that callers
@ -115,14 +115,17 @@ public:
void set_inode(Inode *in) {
_inode = in;
}
void set_inode(InodeRef in) {
_inode = std::move(in);
}
Inode *inode() {
return _inode.get();
}
void take_inode(InodeRef *out) {
out->swap(_inode);
}
void set_old_inode(Inode *in) {
_old_inode = in;
void set_old_inode(InodeRef in) {
_old_inode = std::move(in);
}
Inode *old_inode() {
return _old_inode.get();
@ -130,8 +133,8 @@ public:
void take_old_inode(InodeRef *out) {
out->swap(_old_inode);
}
void set_other_inode(Inode *in) {
_other_inode = in;
void set_other_inode(InodeRef in) {
_other_inode = std::move(in);
}
Inode *other_inode() {
return _other_inode.get();
@ -139,9 +142,9 @@ public:
void take_other_inode(InodeRef *out) {
out->swap(_other_inode);
}
void set_dentry(Dentry *d);
void set_dentry(DentryRef d);
Dentry *dentry();
void set_old_dentry(Dentry *d);
void set_old_dentry(DentryRef d);
Dentry *old_dentry();
MetaRequest* get() {

@ -88,6 +88,12 @@ public:
gids = o.gids;
alloced_gids = false;
}
void print(std::ostream& os) const {
os
<< "UserPerm(uid=" << m_uid
<< " gid=" << m_gid
<< ")";
}
};
#endif

@ -1295,7 +1295,7 @@ static void do_init(void *data, fuse_conn_info *conn)
conn->want |= FUSE_CAP_SPLICE_MOVE;
#if !defined(__APPLE__)
if (!client->fuse_default_permissions && client->ll_handle_umask()) {
if (!client->get_fuse_default_permissions() && client->ll_handle_umask()) {
// apply umask in userspace if posix acl is enabled
if(conn->capable & FUSE_CAP_DONT_MASK)
conn->want |= FUSE_CAP_DONT_MASK;
@ -1783,6 +1783,7 @@ fuse_req_t CephFuse::Handle::get_fuse_req()
CephFuse::CephFuse(Client *c, int fd) : _handle(new CephFuse::Handle(c, fd))
{
c->set_is_fuse();
}
CephFuse::~CephFuse()
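
Note on `set_is_fuse()`: the flag it sets feeds `Client::should_check_perms()` in Client.h. The free function below restates that predicate so its four interesting cases can be checked in isolation. `demo_should_check_perms` is a hypothetical name. The reading that FUSE mounts follow `fuse_default_permissions` while library users follow `client_permissions` is taken from the expression itself, not from documentation in this diff.

```cpp
#include <cassert>

// Same boolean expression as Client::should_check_perms(), with the
// member variables turned into parameters for illustration.
bool demo_should_check_perms(bool is_fuse,
                             bool fuse_default_permissions,
                             bool client_permissions) {
  return (is_fuse && !fuse_default_permissions) ||
         (!is_fuse && client_permissions);
}
```

Under FUSE the `client_permissions` argument has no effect on the result, and outside FUSE `fuse_default_permissions` has none.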

@ -292,6 +292,7 @@ const char *ceph_mds_op_name(int op)
case CEPH_MDS_OP_LOOKUPNAME: return "lookupname";
case CEPH_MDS_OP_GETATTR: return "getattr";
case CEPH_MDS_OP_DUMMY: return "dummy";
case CEPH_MDS_OP_GETVXATTR: return "getvxattr";
case CEPH_MDS_OP_SETXATTR: return "setxattr";
case CEPH_MDS_OP_SETATTR: return "setattr";
case CEPH_MDS_OP_RMXATTR: return "rmxattr";

@ -312,7 +312,8 @@ options:
default: true
services:
- mds_client
with_legacy: true
flags:
- runtime
- name: client_dirsize_rbytes
type: bool
level: advanced

@ -319,6 +319,17 @@ int ceph_mount(struct ceph_mount_info *cmount, const char *root);
*/
int64_t ceph_get_fs_cid(struct ceph_mount_info *cmount);
typedef void (*libcephfs_c_completion_t)(int rc, const void* out, size_t outlen, const void* outs, size_t outslen, void* ud);
int ceph_mds_command2(struct ceph_mount_info *cmount,
const char *mds_spec,
const char **cmd,
size_t cmdlen,
const char *inbuf, size_t inbuflen,
int one_shot,
libcephfs_c_completion_t c,
void* ud);
/**
* Execute a management command remotely on an MDS.
*
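
Note on `ceph_mds_command2`: the call reports its result through a `libcephfs_c_completion_t` callback plus an opaque `ud` pointer. The sketch below shows how a caller might write such a callback and recover its own state from `ud`. Only the typedef is copied from the header above. `DemoResult`, `demo_on_complete`, and `demo_fake_mds_command` are hypothetical, and the fake command completes immediately instead of contacting an MDS. Treating `out` as the output buffer and `outs` as a status string is an assumption, as is copying both inside the callback because their lifetime afterwards is not stated here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Copied from the libcephfs.h hunk above.
typedef void (*libcephfs_c_completion_t)(int rc, const void* out, size_t outlen,
                                         const void* outs, size_t outslen,
                                         void* ud);

// Caller-owned state reached through the `ud` pointer.
struct DemoResult {
  bool done = false;
  int rc = 0;
  std::string out;
  std::string outs;
};

// Completion callback: copies the buffers and records the return code.
void demo_on_complete(int rc, const void* out, size_t outlen,
                      const void* outs, size_t outslen, void* ud) {
  auto* res = static_cast<DemoResult*>(ud);
  res->rc = rc;
  if (out != nullptr && outlen > 0) {
    res->out.assign(static_cast<const char*>(out), outlen);
  }
  if (outs != nullptr && outslen > 0) {
    res->outs.assign(static_cast<const char*>(outs), outslen);
  }
  res->done = true;
}

// Hypothetical stand-in for an asynchronous command: it invokes the
// completion right away with canned data. Not the real libcephfs call.
int demo_fake_mds_command(libcephfs_c_completion_t c, void* ud) {
  static const char out[] = "{}";
  static const char outs[] = "ok";
  c(0, out, sizeof(out) - 1, outs, sizeof(outs) - 1, ud);
  return 0;
}
```

With the real call the callback may run later and on another thread, so a production `DemoResult` would need synchronization that this sketch leaves out.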

@ -243,6 +243,93 @@ inline bool operator<(const vinodeno_t &l, const vinodeno_t &r) {
(l.ino == r.ino && l.snapid < r.snapid);
}
template<template<typename> class Allocator>
class charmap_md_t {
public:
static constexpr int STRUCT_V = 1;
static constexpr int COMPAT_V = 1;
using str_t = std::basic_string<char, std::char_traits<char>, Allocator<char>>;
charmap_md_t() = default;
charmap_md_t(auto const& cimd) {
casesensitive = cimd.casesensitive;
normalization = cimd.normalization;
encoding = cimd.encoding;
}
charmap_md_t<Allocator>& operator=(auto const& other) {
casesensitive = other.is_casesensitive();
normalization = other.get_normalization();
encoding = other.get_encoding();
return *this;
}
void encode(ceph::buffer::list& bl, uint64_t features) const {
ENCODE_START(STRUCT_V, COMPAT_V, bl);
ceph::encode(casesensitive, bl);
ceph::encode(normalization, bl);
ceph::encode(encoding, bl);
ENCODE_FINISH(bl);
}
void decode(ceph::buffer::list::const_iterator& p) {
DECODE_START(STRUCT_V, p);
ceph::decode(casesensitive, p);
ceph::decode(normalization, p);
ceph::decode(encoding, p);
DECODE_FINISH(p);
}
void print(std::ostream& os) const {
os << "charmap_md_t(s=" << casesensitive << " f=" << normalization << " e=" << encoding << ")";
}
std::string_view get_normalization() const {
return std::string_view(normalization);
}
std::string_view get_encoding() const {
return std::string_view(encoding);
}
void set_normalization(std::string_view sv) {
normalization = sv;
}
void set_encoding(std::string_view sv) {
encoding = sv;
}
void mark_casesensitive() {
casesensitive = true;
}
void mark_caseinsensitive() {
casesensitive = false;
}
bool is_casesensitive() const {
return casesensitive;
}
void dump(ceph::Formatter* f) const {
f->dump_bool("casesensitive", casesensitive);
f->dump_string("normalization", normalization);
f->dump_string("encoding", encoding);
}
constexpr std::string_view get_default_normalization() const {
return DEFAULT_NORMALIZATION;
}
constexpr std::string_view get_default_encoding() const {
return DEFAULT_ENCODING;
}
private:
static constexpr std::string_view DEFAULT_NORMALIZATION = "nfd";
static constexpr std::string_view DEFAULT_ENCODING = "utf8";
bool casesensitive = true;
str_t normalization{DEFAULT_NORMALIZATION};
str_t encoding{DEFAULT_ENCODING};
};
typedef enum {
QUOTA_MAX_FILES,
QUOTA_MAX_BYTES,
@ -383,6 +470,232 @@ enum {
DAMAGE_FRAGTREE // fragtree -- repair by searching
};
template<template<typename> class Allocator>
class unknown_md_t {
public:
void encode(ceph::buffer::list& bl, uint64_t features) const {
encode_nohead(payload, bl);
}
void decode(ceph::buffer::list::const_iterator& p) {
bufferlist bl;
DECODE_UNKNOWN(bl, p);
auto blp = bl.cbegin();
blp.copy(blp.get_remaining(), payload);
}
void print(std::ostream& os) const {
os << "unknown_md_t(len=" << payload.size() << ")";
}
void dump(ceph::Formatter* f) const {
f->dump_bool("length", payload.length());
}
private:
std::vector<uint8_t,Allocator<uint8_t>> payload;
};
template<template<typename> class Allocator>
struct optmetadata_server_t {
using opts = std::variant<
unknown_md_t<Allocator>,
charmap_md_t<Allocator>
>;
enum kind_t : uint64_t {
UNKNOWN,
CHARMAP,
_MAX
};
};
template<template<typename> class Allocator>
struct optmetadata_client_t {
using opts = std::variant<
unknown_md_t<Allocator>,
charmap_md_t<Allocator>
>;
enum kind_t : uint64_t {
UNKNOWN,
CHARMAP,
_MAX
};
};
template<typename... Ts>
void defconstruct_type(std::variant<Ts...>& v, std::size_t i)
{
constexpr auto N = sizeof...(Ts);
static const std::array<std::variant<Ts...>, N> lookup = {Ts{}...};
v = lookup[i];
}
template<typename M, template<typename> class Allocator>
struct optmetadata_singleton {
using optmetadata_t = typename M::opts;
using kind_t = typename M::kind_t;
optmetadata_singleton(kind_t kind = kind_t::UNKNOWN)
{
u64kind = (uint64_t)kind;
defconstruct_type(optmetadata, get_kind());
}
auto get_kind() const {
constexpr auto optsmax = std::variant_size_v<optmetadata_t>;
static_assert(kind_t::_MAX == optsmax);
static_assert(kind_t::UNKNOWN == 0);
if (u64kind > optsmax) {
return kind_t::UNKNOWN;
} else {
return (kind_t)u64kind;
}
}
template<template< template<typename> class > class T>
auto& get_meta() {
return std::get< T<Allocator> >(optmetadata);
}
template<template< template<typename> class > class T>
auto& get_meta() const {
return std::get< T<Allocator> >(optmetadata);
}
void print(std::ostream& os) const {
os << "(k=" << u64kind << " m=";
std::visit([&os](auto& o) { o.print(os); }, optmetadata);
os << ")";
}
void dump(ceph::Formatter* f) const {
f->dump_int("kind", u64kind);
f->dump_object("metadata", optmetadata);
}
void encode(ceph::buffer::list& bl, uint64_t features) const {
// no versioning, use optmetadata
ceph::encode(u64kind, bl);
std::visit([&bl, features](auto& o) { o.encode(bl, features); }, optmetadata);
}
void decode(ceph::buffer::list::const_iterator& p) {
ceph::decode(u64kind, p);
*this = optmetadata_singleton((kind_t)u64kind);
std::visit([&p](auto& o) { o.decode(p); }, optmetadata);
}
bool operator<(const optmetadata_singleton& other) const {
return u64kind < other.u64kind;
}
private:
uint64_t u64kind = 0;
optmetadata_t optmetadata;
};
template<typename Singleton, template<typename> class Allocator>
struct optmetadata_multiton {
static constexpr int STRUCT_V = 1;
static constexpr int COMPAT_V = 1;
using optkind_t = typename Singleton::kind_t;
using optvec_t = std::vector<Singleton,Allocator<Singleton>>;
void encode(ceph::buffer::list& bl, uint64_t features) const {
// no versioning, use payload
ENCODE_START(STRUCT_V, COMPAT_V, bl);
ceph::encode(opts, bl);
ENCODE_FINISH(bl);
}
void decode(ceph::buffer::list::const_iterator& p) {
DECODE_START(STRUCT_V, p);
ceph::decode(opts, p);
DECODE_FINISH(p);
}
void print(std::ostream& os) const {
os << "optm(len=" << opts.size() << " " << opts << ")";
}
void dump(ceph::Formatter* f) const {
f->dump_bool("length", opts.size());
f->open_array_section("opts");
for (auto& opt : opts) {
f->dump_object("opt", opt);
}
f->dump_object("opts", opts);
}
bool has_opt(optkind_t kind) const {
auto f = [kind](auto& o) {
return o.get_kind() == kind;
};
auto it = std::find_if(opts.begin(), opts.end(), std::move(f));
return it != opts.end();
}
auto& get_opt(optkind_t kind) const {
auto f = [kind](auto& o) {
return o.get_kind() == kind;
};
auto it = std::find_if(opts.begin(), opts.end(), std::move(f));
return *it;
}
auto& get_opt(optkind_t kind) {
auto f = [kind](auto& o) {
return o.get_kind() == kind;
};
auto it = std::find_if(opts.begin(), opts.end(), std::move(f));
return *it;
}
auto& get_or_create_opt(optkind_t kind) {
auto f = [kind](auto& o) {
return o.get_kind() == kind;
};
if (auto it = std::find_if(opts.begin(), opts.end(), std::move(f)); it != opts.end()) {
return *it;
}
auto it = std::lower_bound(opts.begin(), opts.end(), kind);
it = opts.emplace(it, kind);
return *it;
}
void del_opt(optkind_t kind) {
auto f = [kind](auto& o) {
return o.get_kind() == kind;
};
auto it = std::remove_if(opts.begin(), opts.end(), std::move(f));
opts.erase(it, opts.end());
}
auto size() const {
return opts.size();
}
private:
optvec_t opts;
};
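The multiton above keeps at most one entry per kind and inserts new entries at their `std::lower_bound` position, so `opts` stays ordered by kind. A minimal standalone sketch of that pattern follows. `Opt` and `OptSet` are hypothetical stand-ins for the real singleton/multiton templates, and the sketch uses a single binary search where the code above does a linear `std::find_if` followed by `std::lower_bound`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for optmetadata_singleton: an entry keyed by kind.
struct Opt {
  std::uint64_t kind;
  int payload;
};

// Hypothetical stand-in for optmetadata_multiton: entries sorted by kind.
class OptSet {
public:
  bool has(std::uint64_t kind) const {
    return std::any_of(opts.begin(), opts.end(),
                       [kind](const Opt& o) { return o.kind == kind; });
  }
  // Return the entry for `kind`, inserting it at its sorted position if absent.
  Opt& get_or_create(std::uint64_t kind) {
    auto it = std::lower_bound(
        opts.begin(), opts.end(), kind,
        [](const Opt& o, std::uint64_t k) { return o.kind < k; });
    if (it != opts.end() && it->kind == kind) {
      return *it;
    }
    return *opts.insert(it, Opt{kind, 0});
  }
  // Erase-remove idiom, as in del_opt above.
  void del(std::uint64_t kind) {
    opts.erase(std::remove_if(opts.begin(), opts.end(),
                              [kind](const Opt& o) { return o.kind == kind; }),
               opts.end());
  }
  std::size_t size() const { return opts.size(); }
  const std::vector<Opt>& entries() const { return opts; }

private:
  std::vector<Opt> opts;
};
```

Calling `get_or_create(5)` and then `get_or_create(2)` leaves the entry for kind 2 first, which is what makes the binary search valid on later calls.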
template<typename T, template<typename> class Allocator>
static inline void encode(optmetadata_singleton<T, Allocator> const& o, ::ceph::buffer::list& bl, uint64_t features=0)
{
ENCODE_DUMP_PRE();
o.encode(bl, features);
ENCODE_DUMP_POST(cl);
}
template<typename T, template<typename> class Allocator>
static inline void decode(optmetadata_singleton<T, Allocator>& o, ::ceph::buffer::list::const_iterator& p)
{
o.decode(p);
}
template<typename Singleton, template<typename> class Allocator>
static inline void encode(optmetadata_multiton<Singleton,Allocator> const& o, ::ceph::buffer::list& bl, uint64_t features=0)
{
ENCODE_DUMP_PRE();
o.encode(bl, features);
ENCODE_DUMP_POST(cl);
}
template<typename Singleton, template<typename> class Allocator>
static inline void decode(optmetadata_multiton<Singleton,Allocator>& o, ::ceph::buffer::list::const_iterator& p)
{
o.decode(p);
}
template<template<typename> class Allocator = std::allocator>
struct inode_t {
/**
@ -390,6 +703,7 @@ struct inode_t {
* Do not forget to add any new fields to the compare() function.
* ***************
*/
using optmetadata_singleton_server_t = optmetadata_singleton<optmetadata_server_t<Allocator>,Allocator>;
using client_range_map = std::map<client_t,client_writeable_range_t,std::less<client_t>,Allocator<std::pair<const client_t,client_writeable_range_t>>>;
static const uint8_t F_EPHEMERAL_DISTRIBUTED_PIN = (1<<0);
@ -506,6 +820,25 @@ struct inode_t {
return get_flag(F_QUIESCE_BLOCK);
}
bool has_charmap() const {
return optmetadata.has_opt(optmetadata_singleton_server_t::kind_t::CHARMAP);
}
auto& get_charmap() const {
auto& opt = optmetadata.get_opt(optmetadata_singleton_server_t::kind_t::CHARMAP);
return opt.template get_meta< charmap_md_t >();
}
auto& get_charmap() {
auto& opt = optmetadata.get_opt(optmetadata_singleton_server_t::kind_t::CHARMAP);
return opt.template get_meta< charmap_md_t >();
}
auto& set_charmap() {
auto& opt = optmetadata.get_or_create_opt(optmetadata_singleton_server_t::kind_t::CHARMAP);
return opt.template get_meta< charmap_md_t >();
}
void del_charmap() {
optmetadata.del_opt(optmetadata_singleton_server_t::kind_t::CHARMAP);
}
void encode(ceph::buffer::list &bl, uint64_t features) const;
void decode(ceph::buffer::list::const_iterator& bl);
void dump(ceph::Formatter *f) const;
@ -608,6 +941,8 @@ struct inode_t {
std::vector<uint8_t,Allocator<uint8_t>> fscrypt_file;
std::vector<uint8_t,Allocator<uint8_t>> fscrypt_last_block;
optmetadata_multiton<optmetadata_singleton_server_t,Allocator> optmetadata;
private:
bool older_is_consistent(const inode_t &other) const;
};
@ -616,7 +951,7 @@ private:
template<template<typename> class Allocator>
void inode_t<Allocator>::encode(ceph::buffer::list &bl, uint64_t features) const
{
ENCODE_START(19, 6, bl);
ENCODE_START(20, 6, bl);
encode(ino, bl);
encode(rdev, bl);
@ -676,6 +1011,8 @@ void inode_t<Allocator>::encode(ceph::buffer::list &bl, uint64_t features) const
encode(fscrypt_file, bl);
encode(fscrypt_last_block, bl);
encode(optmetadata, bl, features);
ENCODE_FINISH(bl);
}
@ -793,6 +1130,11 @@ void inode_t<Allocator>::decode(ceph::buffer::list::const_iterator &p)
if (struct_v >= 19) {
decode(fscrypt_last_block, p);
}
if (struct_v >= 20) {
decode(optmetadata, p);
}
DECODE_FINISH(p);
}
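The hunks above bump the `inode_t` encoding to version 20 and read `optmetadata` only when `struct_v >= 20`, so a newer decoder can still read what an older encoder wrote. The same idea in a toy form, assuming a hypothetical `Rec` type and a one-byte version prefix (the real encoding also carries a compat version and a length, which this sketch omits):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Hypothetical record: `extra` was added in version 2 of the encoding.
struct Rec {
  std::uint8_t base = 0;
  std::uint8_t extra = 0;  // keeps its default when decoding version 1
};

// Encode at a chosen version; version 1 omits the newer field.
inline Bytes encode_rec(const Rec& r, std::uint8_t version) {
  Bytes out{version, r.base};
  if (version >= 2) {
    out.push_back(r.extra);
  }
  return out;
}

// Decode any known version: newer fields are read only if the writer sent them.
inline Rec decode_rec(const Bytes& in) {
  if (in.size() < 2) {
    throw std::runtime_error("truncated record");
  }
  const std::uint8_t struct_v = in[0];
  Rec r;
  r.base = in[1];
  if (struct_v >= 2) {
    if (in.size() < 3) {
      throw std::runtime_error("truncated record");
    }
    r.extra = in[2];
  }
  return r;
}
```

Decoding a version 1 buffer leaves `extra` at its default, mirroring how an `inode_t` decoded from a pre-20 encoding ends up with empty `optmetadata`.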
@ -944,6 +1286,7 @@ void inode_t<Allocator>::generate_test_instances(std::list<inode_t*>& ls)
template<template<typename> class Allocator>
int inode_t<Allocator>::compare(const inode_t<Allocator> &other, bool *divergent) const
{
// TODO: fscrypt / optmetadata: https://tracker.ceph.com/issues/70188
ceph_assert(ino == other.ino);
*divergent = false;
if (version == other.version) {


@ -1507,6 +1507,24 @@ decode(std::array<T, N>& v, bufferlist::const_iterator& p)
unsigned struct_end = bl.get_off() + struct_len; \
do {
#define DECODE_UNKNOWN(payload, bl) \
do { \
__u8 struct_v, struct_compat; \
using ::ceph::decode; \
decode(struct_v, bl); \
decode(struct_compat, bl); \
__u32 struct_len; \
decode(struct_len, bl); \
if (struct_len > bl.get_remaining()) \
throw ::ceph::buffer::malformed_input(DECODE_ERR_PAST(__PRETTY_FUNCTION__)); \
payload.clear(); \
using ::ceph::encode; \
encode(struct_v, payload); \
encode(struct_compat, payload); \
encode(struct_len, payload); \
bl.copy(struct_len, payload); \
} while (0)
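`DECODE_UNKNOWN` reads the envelope header, bounds-checks the length, and copies header plus body into `payload` without interpreting the body. A rough standalone equivalent, assuming a plain byte vector in place of a bufferlist, a 32-bit little-endian length field, and a hypothetical helper name `capture_unknown`:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Copy one envelope (version, compat, 32-bit little-endian length, body)
// from `in` at `pos` into `payload` without interpreting the body, then
// advance `pos` past it. Hypothetical helper for illustration only.
inline void capture_unknown(const Bytes& in, std::size_t& pos, Bytes& payload) {
  constexpr std::size_t header_len = 1 + 1 + 4;
  if (pos > in.size() || in.size() - pos < header_len) {
    throw std::runtime_error("truncated header");
  }
  std::uint32_t len = 0;
  for (int i = 0; i < 4; ++i) {
    len |= static_cast<std::uint32_t>(in[pos + 2 + i]) << (8 * i);
  }
  // Same check as the macro: the body must fit in what remains.
  if (len > in.size() - pos - header_len) {
    throw std::runtime_error("length runs past end of buffer");
  }
  const auto first = in.begin() + static_cast<std::ptrdiff_t>(pos);
  payload.assign(first, first + static_cast<std::ptrdiff_t>(header_len + len));
  pos += header_len + len;
}
```

Because the header is preserved alongside the body, the captured bytes can later be written back out unchanged or handed to a decoder that does understand them.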
/* BEWARE: any change to this macro MUST be also reflected in the duplicative
* DECODE_START_LEGACY_COMPAT_LEN! */
#define __DECODE_START_LEGACY_COMPAT_LEN(v, compatv, lenv, skip_v, bl) \


@ -98,7 +98,7 @@ class filepath {
ino = b;
}
void set_path(std::string_view s) {
if (s[0] == '/') {
if (!s.empty() && s[0] == '/') {
path = s.substr(1);
ino = 1;
} else {
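The `set_path` change adds an emptiness check because `s[0]` on an empty `std::string_view` is undefined behaviour. A small sketch of the guarded form, using a hypothetical `ParsedPath` result type; the relative-path branch (ino 0, path copied as-is) is an assumption, since the hunk cuts off before it:

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <string_view>

// Hypothetical result type: ino 1 marks a path rooted at "/".
struct ParsedPath {
  std::uint64_t ino;
  std::string path;
};

inline ParsedPath parse_path(std::string_view s) {
  // Test emptiness first: indexing an empty string_view is undefined.
  if (!s.empty() && s[0] == '/') {
    return ParsedPath{1, std::string(s.substr(1))};
  }
  // Assumed behaviour for relative (or empty) input.
  return ParsedPath{0, std::string(s)};
}
```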


@ -517,6 +517,55 @@ extern "C" int ceph_set_mount_timeout(struct ceph_mount_info *cmount, uint32_t t
return ceph_conf_set(cmount, "client_mount_timeout", timeout_str.c_str());
}
class CommandCContext : public Context {
public:
CommandCContext(libcephfs_c_completion_t c, void* ud) : c(c), ud(ud) {}
void finish(int rc) {
c(rc, outbl.c_str(), outbl.length(), outs.c_str(), outs.size(), ud);
}
libcephfs_c_completion_t c;
void* ud;
bufferlist outbl;
std::string outs;
};
extern "C" int ceph_mds_command2(struct ceph_mount_info *cmount,
const char *mds_spec,
const char **cmd,
size_t cmdlen,
const char *inbuf, size_t inbuflen,
int one_shot,
libcephfs_c_completion_t c,
void* ud)
{
bufferlist inbl;
std::vector<string> cmdv;
if (!cmount->is_initialized()) {
return -ENOTCONN;
}
// Construct inputs
for (size_t i = 0; i < cmdlen; ++i) {
cmdv.push_back(cmd[i]);
}
inbl.append(inbuf, inbuflen);
// Issue remote command
auto* ctx = new CommandCContext(c, ud);
int r = cmount->get_client()->mds_command(mds_spec, cmdv, inbl, &ctx->outbl, &ctx->outs, ctx, one_shot);
if (r != 0) {
delete ctx;
return r;
}
return 0;
}
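`ceph_mds_command2` allocates a context that owns the output buffers, frees it itself if submission fails, and otherwise leaves it to the completion path. The ownership hand-off can be sketched as below. `FakeBackend`, `CompletionCtx`, and `send_async` are hypothetical names, and the sketch uses `std::unique_ptr` where the code above uses raw `new`/`delete`; it also assumes the completion path is responsible for freeing the context once it has fired.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>

// C-style completion signature, loosely modelled on the callback above.
using completion_t = void (*)(int rc, const char* out, std::size_t outlen,
                              void* ud);

// Owns the output buffer until the completion fires.
struct CompletionCtx {
  completion_t cb;
  void* ud;
  std::string out;
  void finish(int rc) { cb(rc, out.data(), out.size(), ud); }
};

// Hypothetical asynchronous backend: returns nonzero on immediate failure,
// otherwise stores a job that fills the output and completes the context.
struct FakeBackend {
  bool fail = false;
  std::function<void()> pending;
  int submit(CompletionCtx* ctx) {
    if (fail) {
      return -1;
    }
    pending = [ctx] {
      ctx->out = "ok";
      ctx->finish(0);
      delete ctx;  // the completion path owns the context
    };
    return 0;
  }
};

inline int send_async(FakeBackend& be, completion_t cb, void* ud) {
  auto ctx = std::make_unique<CompletionCtx>(CompletionCtx{cb, ud, {}});
  int r = be.submit(ctx.get());
  if (r != 0) {
    return r;  // unique_ptr frees ctx; the callback never runs
  }
  ctx.release();  // ownership now belongs to the pending completion
  return 0;
}
```

The point of the pattern is that exactly one side frees the context: the caller on synchronous failure, the completion otherwise.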
extern "C" int ceph_mds_command(struct ceph_mount_info *cmount,
const char *mds_spec,
const char **cmd,


@ -106,7 +106,7 @@ void Beacon::init(const MDSMap &mdsmap)
});
}
bool Beacon::ms_dispatch2(const ref_t<Message>& m)
Dispatcher::dispatch_result_t Beacon::ms_dispatch2(const ref_t<Message>& m)
{
dout(25) << __func__ << ": processing " << m << dendl;
if (m->get_type() == MSG_MDS_BEACON) {


@ -53,7 +53,7 @@ public:
void init(const MDSMap &mdsmap);
void shutdown();
bool ms_dispatch2(const ref_t<Message> &m) override;
Dispatcher::dispatch_result_t ms_dispatch2(const ref_t<Message> &m) override;
void ms_handle_connect(Connection *c) override {}
bool ms_handle_reset(Connection *c) override {return false;}
void ms_handle_remote_reset(Connection *c) override {}


@ -315,6 +315,9 @@ ostream& operator<<(ostream& out, const CInode& in)
if (in.get_inode()->get_quiesce_block()) {
out << " qblock";
}
if (in.get_inode()->optmetadata.size() > 0) {
out << " " << in.get_inode()->optmetadata;
}
out << " " << &in;
out << "]";
@ -2161,7 +2164,7 @@ void CInode::decode_lock_iflock(bufferlist::const_iterator& p)
void CInode::encode_lock_ipolicy(bufferlist& bl)
{
ENCODE_START(3, 1, bl);
ENCODE_START(4, 1, bl);
if (is_dir()) {
encode(get_inode()->version, bl);
encode(get_inode()->ctime, bl);
@ -2170,8 +2173,10 @@ void CInode::encode_lock_ipolicy(bufferlist& bl)
encode(get_inode()->export_pin, bl);
encode(get_inode()->flags, bl);
encode(get_inode()->export_ephemeral_random_pin, bl);
encode(get_inode()->optmetadata, bl);
} else {
encode(get_inode()->flags, bl);
encode(get_inode()->optmetadata, bl);
}
ENCODE_FINISH(bl);
}
@ -2180,7 +2185,7 @@ void CInode::decode_lock_ipolicy(bufferlist::const_iterator& p)
{
ceph_assert(!is_auth());
auto _inode = allocate_inode(*get_inode());
DECODE_START(3, p);
DECODE_START(4, p);
if (is_dir()) {
decode(_inode->version, p);
utime_t tm;
@ -2194,10 +2199,16 @@ void CInode::decode_lock_ipolicy(bufferlist::const_iterator& p)
decode(_inode->flags, p);
decode(_inode->export_ephemeral_random_pin, p);
}
if (struct_v >= 4) {
decode(_inode->optmetadata, p);
}
} else {
if (struct_v >= 3) {
decode(_inode->flags, p);
}
if (struct_v >= 4) {
decode(_inode->optmetadata, p);
}
}
DECODE_FINISH(p);
@ -3945,7 +3956,24 @@ int CInode::encode_inodestat(bufferlist& bl, Session *session,
} else {
xattr_version = 0;
}
bufferlist optmdbl;
{
decltype(InodeStat::optmetadata) optmetadata;
using kind_t = decltype(optmetadata)::optkind_t;
auto* csp = get_charmap();
if (csp) {
dout(25) << *csp << dendl;
auto& opt = optmetadata.get_or_create_opt(kind_t::CHARMAP);
auto& cs = opt.template get_meta< charmap_md_t >();
cs = *csp;
dout(25) << "cs now " << cs << dendl;
}
encode(optmetadata, optmdbl);
}
// do we have room?
if (max_bytes) {
unsigned bytes =
@ -3956,7 +3984,11 @@ int CInode::encode_inodestat(bufferlist& bl, Session *session,
8 + 8 + 8 + 8 + 8 + sizeof(struct ceph_timespec) + // dirstat.nfiles ~ rstat.rctime
sizeof(__u32) + sizeof(__u32) * 2 * dirfragtree._splits.size() + // dirfragtree
sizeof(__u32) + symlink.length() + // symlink
sizeof(struct ceph_dir_layout); // dir_layout
sizeof(struct ceph_dir_layout) // dir_layout
+ 4 + file_i->fscrypt_auth.size() // len + data
+ 4 + file_i->fscrypt_file.size() // len + data
+ optmdbl.length()
;
if (xattr_version) {
bytes += sizeof(__u32) + sizeof(__u32); // xattr buffer len + number entries
@ -4110,7 +4142,7 @@ int CInode::encode_inodestat(bufferlist& bl, Session *session,
* note: encoding matches MClientReply::InodeStat
*/
if (session->info.has_feature(CEPHFS_FEATURE_REPLY_ENCODING)) {
ENCODE_START(7, 1, bl);
ENCODE_START(8, 1, bl);
encode(std::tuple{
oi->ino,
snapid,
@ -4162,6 +4194,8 @@ int CInode::encode_inodestat(bufferlist& bl, Session *session,
encode(!file_i->fscrypt_auth.empty(), bl);
encode(file_i->fscrypt_auth, bl);
encode(file_i->fscrypt_file, bl);
encode_nohead(optmdbl, bl);
// encode inodestat
ENCODE_FINISH(bl);
}
else {
@ -5538,6 +5572,16 @@ void CInode::set_export_pin(mds_rank_t rank)
maybe_export_pin(true);
}
charmap_md_t<mempool::mds_co::pool_allocator> const* CInode::get_charmap() const
{
dout(25) << __func__ << ": " << *this << dendl;
auto const& pi = get_projected_inode();
if (pi->has_charmap()) {
return &pi->get_charmap();
}
return nullptr;
}
mds_rank_t CInode::get_export_pin(bool inherit) const
{
auto&& balancer = mdcache->mds->balancer;


@ -1013,6 +1013,8 @@ class CInode : public MDSCacheObject, public InodeStoreBase, public Counter<CIno
return !projected_parent.empty();
}
charmap_md_t<mempool::mds_co::pool_allocator> const* get_charmap() const;
mds_rank_t get_export_pin(bool inherit=true) const;
void check_pin_policy(mds_rank_t target);
void set_export_pin(mds_rank_t rank);


@ -4564,6 +4564,7 @@ void Locker::encode_lease(bufferlist& bl, const session_info_t& info,
const LeaseStat& ls)
{
if (info.has_feature(CEPHFS_FEATURE_REPLY_ENCODING)) {
dout(25) << "encode lease reply encoding: " << ls << dendl;
ENCODE_START(2, 1, bl);
encode(ls.mask, bl);
encode(ls.duration_ms, bl);
@ -4572,6 +4573,7 @@ void Locker::encode_lease(bufferlist& bl, const session_info_t& info,
ENCODE_FINISH(bl);
}
else {
dout(25) << "encode lease NO reply encoding: " << ls << dendl;
encode(ls.mask, bl);
encode(ls.duration_ms, bl);
encode(ls.seq, bl);


@ -195,7 +195,7 @@ public:
void issue_client_lease(CDentry *dn, CInode *in, const MDRequestRef &mdr, utime_t now, bufferlist &bl);
void revoke_client_leases(SimpleLock *lock);
static void encode_lease(bufferlist& bl, const session_info_t& info, const LeaseStat& ls);
void encode_lease(bufferlist& bl, const session_info_t& info, const LeaseStat& ls);
protected:
void send_lock_message(SimpleLock *lock, int msg);


@ -992,7 +992,7 @@ void MDSDaemon::respawn()
bool MDSDaemon::ms_dispatch2(const ref_t<Message> &m)
Dispatcher::dispatch_result_t MDSDaemon::ms_dispatch2(const ref_t<Message> &m)
{
dout(25) << __func__ << ": processing " << m << dendl;
std::lock_guard l(mds_lock);


@ -145,7 +145,7 @@ class MDSDaemon : public Dispatcher {
class MDSSocketHook *asok_hook = nullptr;
private:
bool ms_dispatch2(const ref_t<Message> &m) override;
Dispatcher::dispatch_result_t ms_dispatch2(const ref_t<Message> &m) override;
bool ms_handle_fast_authentication(Connection *con) override;
void ms_handle_accept(Connection *con) override;
void ms_handle_connect(Connection *con) override;


@ -131,7 +131,7 @@ void MetricAggregator::shutdown() {
}
}
bool MetricAggregator::ms_dispatch2(const ref_t<Message> &m) {
Dispatcher::dispatch_result_t MetricAggregator::ms_dispatch2(const ref_t<Message> &m) {
dout(25) << " processing " << m << dendl;
if (m->get_type() == MSG_MDS_METRICS &&
m->get_connection()->get_peer_type() == CEPH_ENTITY_TYPE_MDS) {


@ -35,7 +35,7 @@ public:
void notify_mdsmap(const MDSMap &mdsmap);
bool ms_dispatch2(const ref_t<Message> &m) override;
Dispatcher::dispatch_result_t ms_dispatch2(const ref_t<Message> &m) override;
void ms_handle_connect(Connection *c) override {
}


@ -24,7 +24,7 @@ MetricsHandler::MetricsHandler(CephContext *cct, MDSRank *mds)
mds(mds) {
}
bool MetricsHandler::ms_dispatch2(const ref_t<Message> &m) {
Dispatcher::dispatch_result_t MetricsHandler::ms_dispatch2(const ref_t<Message> &m) {
if (m->get_type() == CEPH_MSG_CLIENT_METRICS &&
m->get_connection()->get_peer_type() == CEPH_ENTITY_TYPE_CLIENT) {
handle_client_metrics(ref_cast<MClientMetrics>(m));


@ -37,7 +37,7 @@ class MetricsHandler : public Dispatcher {
public:
MetricsHandler(CephContext *cct, MDSRank *mds);
bool ms_dispatch2(const ref_t<Message> &m) override;
Dispatcher::dispatch_result_t ms_dispatch2(const ref_t<Message> &m) override;
void ms_handle_connect(Connection *c) override {
}


@ -4742,6 +4742,23 @@ bool Server::is_valid_layout(file_layout_t *layout)
return true;
}
bool Server::can_handle_charmap(const MDRequestRef& mdr, CDentry* dn)
{
CDir *dir = dn->get_dir();
CInode *diri = dir->get_inode();
if (auto* csp = diri->get_charmap()) {
dout(20) << __func__ << ": with " << *csp << dendl;
auto& client_metadata = mdr->session->info.client_metadata;
bool allowed = client_metadata.features.test(CEPHFS_FEATURE_CHARMAP);
if (!allowed) {
dout(5) << " client cannot handle charmap" << dendl;
respond_to_request(mdr, -EPERM);
return false;
}
}
return true;
}
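`can_handle_charmap` rejects a create-style request with `-EPERM` when the parent directory carries a charmap and the client session has not advertised `CEPHFS_FEATURE_CHARMAP`. Reduced to its decision logic, with a hypothetical `Session` holding a `std::bitset` in place of the real session metadata:

```cpp
#include <bitset>
#include <cassert>
#include <cerrno>
#include <cstddef>

// Bit 22 matches CEPHFS_FEATURE_CHARMAP in this change; the Session type
// and the helper below are hypothetical.
constexpr std::size_t FEATURE_CHARMAP = 22;
constexpr std::size_t FEATURE_COUNT = 23;

struct Session {
  std::bitset<FEATURE_COUNT> features;
};

// Return 0 if the request may proceed, or -EPERM when the parent directory
// has a charmap and the client did not advertise support for it.
inline int check_charmap(const Session& s, bool parent_has_charmap) {
  if (parent_has_charmap && !s.features.test(FEATURE_CHARMAP)) {
    return -EPERM;
  }
  return 0;
}
```

Directories without a charmap are unaffected, so older clients keep working everywhere except under directories that opt in.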
void Server::handle_client_openc(const MDRequestRef& mdr)
{
const cref_t<MClientRequest> &req = mdr->client_request;
@ -4771,6 +4788,10 @@ void Server::handle_client_openc(const MDRequestRef& mdr)
ceph_assert(dnl->is_null());
if (!can_handle_charmap(mdr, dn)) {
return;
}
if (req->get_alternate_name().size() > alternate_name_max) {
dout(10) << " alternate_name longer than " << alternate_name_max << dendl;
respond_to_request(mdr, -ENAMETOOLONG);
@ -5718,12 +5739,11 @@ void Server::handle_client_setlayout(const MDRequestRef& mdr)
journal_and_reply(mdr, cur, 0, le, new C_MDS_inode_update_finish(this, mdr, cur));
}
bool Server::xlock_policylock(const MDRequestRef& mdr, CInode *in, bool want_layout, bool xlock_snaplock)
bool Server::xlock_policylock(const MDRequestRef& mdr, CInode *in, bool want_layout, bool xlock_snaplock, MutationImpl::LockOpVec lov)
{
if (mdr->locking_state & MutationImpl::ALL_LOCKED)
return true;
MutationImpl::LockOpVec lov;
lov.add_xlock(&in->policylock);
if (xlock_snaplock)
lov.add_xlock(&in->snaplock);
@ -6513,6 +6533,136 @@ void Server::handle_client_setvxattr(const MDRequestRef& mdr, CInode *cur)
auto pi = cur->project_inode(mdr);
cur->setxattr_ephemeral_dist(val);
pip = pi.inode.get();
} else if (name == "ceph.dir.charmap"sv) {
// inheritance / InodeStat
if (!cur->is_dir() || cur->is_root()) {
respond_to_request(mdr, -EINVAL);
return;
}
dout(25) << "not root, is dir" << dendl;
MutationImpl::LockOpVec lov;
lov.add_rdlock(&cur->filelock); // to verify it's empty
if (!xlock_policylock(mdr, cur, false, false, std::move(lov)))
return;
if (_dir_is_nonempty(mdr, cur)) {
respond_to_request(mdr, -ENOTEMPTY);
return;
}
if (cur->snaprealm && cur->snaprealm->srnode.snaps.size()) {
respond_to_request(mdr, -ENOTEMPTY);
return;
}
if (is_rmxattr) {
if (!cur->get_projected_inode()->has_charmap()) {
respond_to_request(mdr, 0);
return;
}
auto pi = cur->project_inode(mdr);
pip = pi.inode.get();
dout(20) << "deleting charmap metadata" << dendl;
pip->del_charmap();
} else {
respond_to_request(mdr, -EINVAL);
return;
}
} else if (name == "ceph.dir.casesensitive"sv) {
if (is_rmxattr) {
value = "1";
}
MutationImpl::LockOpVec lov;
lov.add_rdlock(&cur->filelock); // to verify it's empty
if (!xlock_policylock(mdr, cur, false, false, std::move(lov)))
return;
if (_dir_is_nonempty(mdr, cur)) {
respond_to_request(mdr, -ENOTEMPTY);
return;
}
if (cur->snaprealm && cur->snaprealm->srnode.snaps.size()) {
respond_to_request(mdr, -ENOTEMPTY);
return;
}
bool val;
try {
val = boost::lexical_cast<bool>(value);
} catch (boost::bad_lexical_cast const&) {
dout(10) << "bad vxattr value, unable to parse bool for " << name << dendl;
respond_to_request(mdr, -EINVAL);
return;
}
auto pi = cur->project_inode(mdr);
pip = pi.inode.get();
auto& c = pip->set_charmap();
if (val) {
c.mark_casesensitive();
dout(20) << "marking case sensitive: " << c << dendl;
} else {
c.mark_caseinsensitive();
dout(20) << "marking case insensitive: " << c << dendl;
}
} else if (name == "ceph.dir.normalization"sv) {
if (is_rmxattr) {
value = "";
}
MutationImpl::LockOpVec lov;
lov.add_rdlock(&cur->filelock); // to verify it's empty
if (!xlock_policylock(mdr, cur, false, false, std::move(lov)))
return;
if (_dir_is_nonempty(mdr, cur)) {
respond_to_request(mdr, -ENOTEMPTY);
return;
}
if (cur->snaprealm && cur->snaprealm->srnode.snaps.size()) {
respond_to_request(mdr, -ENOTEMPTY);
return;
}
auto pi = cur->project_inode(mdr);
pip = pi.inode.get();
auto& c = pip->set_charmap();
if (value.size() > 0) {
c.set_normalization(value);
} else {
c.set_normalization(c.get_default_normalization());
}
dout(20) << "set normalization: " << c << dendl;
} else if (name == "ceph.dir.encoding"sv) {
if (is_rmxattr) {
value = "";
}
MutationImpl::LockOpVec lov;
lov.add_rdlock(&cur->filelock); // to verify it's empty
if (!xlock_policylock(mdr, cur, false, false, std::move(lov)))
return;
if (_dir_is_nonempty(mdr, cur)) {
respond_to_request(mdr, -ENOTEMPTY);
return;
}
if (cur->snaprealm && cur->snaprealm->srnode.snaps.size()) {
respond_to_request(mdr, -ENOTEMPTY);
return;
}
auto pi = cur->project_inode(mdr);
pip = pi.inode.get();
auto& c = pip->set_charmap();
if (value.size() > 0) {
c.set_encoding(value);
} else {
c.set_encoding(c.get_default_encoding());
}
dout(20) << "set encoding: " << c << dendl;
} else {
dout(10) << " unknown vxattr " << name << dendl;
respond_to_request(mdr, -EINVAL);
@ -7016,6 +7166,41 @@ void Server::handle_client_getvxattr(const MDRequestRef& mdr)
}
} else if (xattr_name == "ceph.quiesce.block"sv) {
*css << cur->get_projected_inode()->get_quiesce_block();
} else if (xattr_name == "ceph.dir.charmap"sv) {
auto&& pip = cur->get_projected_inode();
if (!pip->has_charmap()) {
r = -ENODATA;
} else {
auto& c = pip->get_charmap();
JSONFormatter f;
f.dump_object("charmap", c);
f.flush(*css);
}
} else if (xattr_name == "ceph.dir.casesensitive"sv) {
auto&& pip = cur->get_projected_inode();
if (!pip->has_charmap()) {
r = -ENODATA;
} else {
auto& c = pip->get_charmap();
*css << c.is_casesensitive();
}
} else if (xattr_name == "ceph.dir.encoding"sv) {
auto&& pip = cur->get_projected_inode();
if (!pip->has_charmap()) {
r = -ENODATA;
} else {
auto& c = pip->get_charmap();
*css << c.get_encoding();
}
} else if (xattr_name == "ceph.dir.normalization"sv) {
auto&& pip = cur->get_projected_inode();
if (!pip->has_charmap()) {
r = -ENODATA;
} else {
auto& c = pip->get_charmap();
*css << c.get_normalization();
}
} else if (xattr_name.substr(0, 12) == "ceph.dir.pin"sv) {
if (xattr_name == "ceph.dir.pin"sv) {
*css << cur->get_projected_inode()->export_pin;
@ -7230,6 +7415,11 @@ void Server::handle_client_mkdir(const MDRequestRef& mdr)
return;
ceph_assert(dn->get_projected_linkage()->is_null());
if (!can_handle_charmap(mdr, dn)) {
return;
}
if (req->get_alternate_name().size() > alternate_name_max) {
dout(10) << " alternate_name longer than " << alternate_name_max << dendl;
respond_to_request(mdr, -ENAMETOOLONG);
@ -7247,11 +7437,16 @@ void Server::handle_client_mkdir(const MDRequestRef& mdr)
// it's a directory.
dn->push_projected_linkage(newi);
auto _inode = newi->_get_inode();
auto* _inode = newi->_get_inode();
_inode->version = dn->pre_dirty();
_inode->rstat.rsubdirs = 1;
_inode->accounted_rstat = _inode->rstat;
_inode->update_backtrace();
if (auto* csp = diri->get_charmap()) {
dout(20) << " with " << *csp << dendl;
auto& c = _inode->set_charmap();
c = *csp;
}
snapid_t follows = mdcache->get_global_snaprealm()->get_newest_seq();
SnapRealm *realm = dn->get_dir()->inode->find_snaprealm();
@ -7322,6 +7517,11 @@ void Server::handle_client_symlink(const MDRequestRef& mdr)
return;
ceph_assert(dn->get_projected_linkage()->is_null());
if (!can_handle_charmap(mdr, dn)) {
return;
}
if (req->get_alternate_name().size() > alternate_name_max) {
dout(10) << " alternate_name longer than " << alternate_name_max << dendl;
respond_to_request(mdr, -ENAMETOOLONG);
@ -7422,6 +7622,11 @@ void Server::handle_client_link(const MDRequestRef& mdr)
}
ceph_assert(destdn->get_projected_linkage()->is_null());
if (!can_handle_charmap(mdr, destdn)) {
return;
}
if (req->get_alternate_name().size() > alternate_name_max) {
dout(10) << " alternate_name longer than " << alternate_name_max << dendl;
respond_to_request(mdr, -ENAMETOOLONG);
@ -8839,6 +9044,10 @@ void Server::handle_client_rename(const MDRequestRef& mdr)
CInode *srci = srcdnl->get_inode();
dout(10) << " srci " << *srci << dendl;
if (!can_handle_charmap(mdr, destdn)) {
return;
}
// -- some sanity checks --
if (destdn == srcdn) {
dout(7) << "rename src=dest, noop" << dendl;


@ -235,7 +235,8 @@ public:
void handle_client_file_readlock(const MDRequestRef& mdr);
bool xlock_policylock(const MDRequestRef& mdr, CInode *in,
bool want_layout=false, bool xlock_snaplock=false);
bool want_layout=false, bool xlock_snaplock=false,
MutationImpl::LockOpVec lov={});
CInode* try_get_auth_inode(const MDRequestRef& mdr, inodeno_t ino);
void handle_client_setattr(const MDRequestRef& mdr);
void handle_client_setlayout(const MDRequestRef& mdr);
@ -263,6 +264,8 @@ public:
// check layout
bool is_valid_layout(file_layout_t *layout);
bool can_handle_charmap(const MDRequestRef& mdr, CDentry* dn);
// open
void handle_client_open(const MDRequestRef& mdr);
void handle_client_openc(const MDRequestRef& mdr); // O_CREAT variant.
@ -467,11 +470,15 @@ private:
xattr_name == "ceph.dir.subvolume" ||
xattr_name == "ceph.dir.pin" ||
xattr_name == "ceph.dir.pin.random" ||
xattr_name == "ceph.dir.pin.distributed";
xattr_name == "ceph.dir.pin.distributed" ||
xattr_name == "ceph.dir.charmap"sv ||
xattr_name == "ceph.dir.normalization"sv ||
xattr_name == "ceph.dir.encoding"sv ||
xattr_name == "ceph.dir.casesensitive"sv;
}
static bool is_ceph_dir_vxattr(std::string_view xattr_name) {
return (xattr_name == "ceph.dir.layout" ||
return xattr_name == "ceph.dir.layout" ||
xattr_name == "ceph.dir.layout.json" ||
xattr_name == "ceph.dir.layout.object_size" ||
xattr_name == "ceph.dir.layout.stripe_unit" ||
@ -482,7 +489,11 @@ private:
xattr_name == "ceph.dir.layout.pool_namespace" ||
xattr_name == "ceph.dir.pin" ||
xattr_name == "ceph.dir.pin.random" ||
xattr_name == "ceph.dir.pin.distributed");
xattr_name == "ceph.dir.pin.distributed" ||
xattr_name == "ceph.dir.charmap"sv ||
xattr_name == "ceph.dir.normalization"sv ||
xattr_name == "ceph.dir.encoding"sv ||
xattr_name == "ceph.dir.casesensitive"sv;
}
static bool is_ceph_file_vxattr(std::string_view xattr_name) {


@ -33,6 +33,7 @@ static const std::array feature_names
"new_snaprealm_info",
"has_owner_uidgid",
"client_mds_auth_caps",
"charmap",
};
static_assert(feature_names.size() == CEPHFS_FEATURE_MAX + 1);
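The `static_assert` above ties the name table to `CEPHFS_FEATURE_MAX`, so adding a feature id without a name (or the reverse) fails at compile time. A self-contained sketch of the same guard, with a hypothetical three-entry table and lookup helper:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical feature ids; the highest id doubles as the maximum.
constexpr std::size_t FEATURE_A = 0;
constexpr std::size_t FEATURE_B = 1;
constexpr std::size_t FEATURE_C = 2;
constexpr std::size_t FEATURE_MAX = 2;

// std::array deduces its length from the initializer list (C++17).
constexpr std::array feature_names{
  std::string_view{"a"},
  std::string_view{"b"},
  std::string_view{"c"},
};

// Fails to compile if an id is added without a matching name.
static_assert(feature_names.size() == FEATURE_MAX + 1);

constexpr std::string_view feature_name(std::size_t id) {
  return id <= FEATURE_MAX ? feature_names[id] : std::string_view{"unknown"};
}
```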


@ -51,7 +51,8 @@ namespace ceph {
#define CEPHFS_FEATURE_NEW_SNAPREALM_INFO 19
#define CEPHFS_FEATURE_HAS_OWNER_UIDGID 20
#define CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK 21
#define CEPHFS_FEATURE_MAX 21
#define CEPHFS_FEATURE_CHARMAP 22
#define CEPHFS_FEATURE_MAX 22
#define CEPHFS_FEATURES_ALL { \
0, 1, 2, 3, 4, \
@ -73,7 +74,8 @@ namespace ceph {
CEPHFS_FEATURE_32BITS_RETRY_FWD, \
CEPHFS_FEATURE_NEW_SNAPREALM_INFO, \
CEPHFS_FEATURE_HAS_OWNER_UIDGID, \
CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK \
CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK, \
CEPHFS_FEATURE_CHARMAP, \
}
#define CEPHFS_METRIC_FEATURES_ALL { \


@ -115,6 +115,8 @@ struct DirStat {
};
struct InodeStat {
using optmetadata_singleton_client_t = optmetadata_singleton<optmetadata_client_t<std::allocator>,std::allocator>;
vinodeno_t vino;
uint32_t rdev = 0;
version_t version = 0;
@ -149,12 +151,18 @@ struct InodeStat {
std::vector<uint8_t> fscrypt_auth;
std::vector<uint8_t> fscrypt_file;
optmetadata_multiton<optmetadata_singleton_client_t,std::allocator> optmetadata;
public:
InodeStat() {}
InodeStat(ceph::buffer::list::const_iterator& p, const uint64_t features) {
decode(p, features);
}
void print(std::ostream& os) const {
os << "InodeStat(... " << optmetadata << ")";
}
void decode(ceph::buffer::list::const_iterator &p, const uint64_t features) {
using ceph::decode;
if (features == (uint64_t)-1) {
@ -221,6 +229,9 @@ struct InodeStat {
decode(fscrypt_auth, p);
decode(fscrypt_file, p);
}
if (struct_v >= 8) {
decode(optmetadata, p);
}
DECODE_FINISH(p);
}
else {


@ -54,11 +54,11 @@ ActivePyModules::ActivePyModules(
DaemonStateIndex &ds, ClusterState &cs,
MonClient &mc, LogChannelRef clog_,
LogChannelRef audit_clog_, Objecter &objecter_,
Client &client_, Finisher &f, DaemonServer &server,
Finisher &f, DaemonServer &server,
PyModuleRegistry &pmr)
: module_config(module_config_), daemon_state(ds), cluster_state(cs),
monc(mc), clog(clog_), audit_clog(audit_clog_), objecter(objecter_),
client(client_), finisher(f),
finisher(f),
cmd_finisher(g_ceph_context, "cmd_finisher", "cmdfin"),
server(server), py_module_registry(pmr)
{
@ -246,7 +246,7 @@ PyObject *ActivePyModules::get_python(const std::string &what)
} else if (what == "modified_config_options") {
without_gil_t no_gil;
auto all_daemons = daemon_state.get_all();
set<string> names;
std::set<string> names;
for (auto& [key, daemon] : all_daemons) {
std::lock_guard l(daemon->lock);
for (auto& [name, valmap] : daemon->config) {
@ -783,7 +783,7 @@ std::map<std::string, std::string> ActivePyModules::get_services() const
void ActivePyModules::update_kv_data(
const std::string prefix,
bool incremental,
const map<std::string, std::optional<bufferlist>, std::less<>>& data)
const std::map<std::string, std::optional<bufferlist>, std::less<>>& data)
{
std::lock_guard l(lock);
bool do_config = false;
@ -1119,7 +1119,7 @@ PyObject *ActivePyModules::get_foreign_config(
std::map<std::string,std::string,std::less<>> config;
cluster_state.with_osdmap([&](const OSDMap &osdmap) {
map<string,string> crush_location;
std::map<string,string> crush_location;
string device_class;
if (entity.is_osd()) {
osdmap.crush->get_full_location(who, &crush_location);
@ -1278,7 +1278,7 @@ void ActivePyModules::set_device_wear_level(const std::string& devid,
float wear_level)
{
// update mgr state
map<string,string> meta;
std::map<string,string> meta;
daemon_state.with_device(
devid,
[wear_level, &meta] (DeviceState& dev) {


@ -21,7 +21,6 @@
#include "PyFormatter.h"
#include "osdc/Objecter.h"
#include "client/Client.h"
#include "common/LogClient.h"
#include "mon/MgrMap.h"
#include "mon/MonCommand.h"
@ -33,6 +32,10 @@
#include "ClusterState.h"
#include "OSDPerfMetricTypes.h"
#include <map>
#include <set>
#include <string>
class health_check_map_t;
class DaemonServer;
class MgrSession;
@ -54,7 +57,6 @@ class ActivePyModules
MonClient &monc;
LogChannelRef clog, audit_clog;
Objecter &objecter;
Client &client;
Finisher &finisher;
TTLCache<std::string, PyObject*> ttl_cache;
public:
@ -73,7 +75,7 @@ public:
std::map<std::string, std::string> store_data,
bool mon_provides_kv_sub,
DaemonStateIndex &ds, ClusterState &cs, MonClient &mc,
LogChannelRef clog_, LogChannelRef audit_clog_, Objecter &objecter_, Client &client_,
LogChannelRef clog_, LogChannelRef audit_clog_, Objecter &objecter_,
Finisher &f, DaemonServer &server, PyModuleRegistry &pmr);
~ActivePyModules();
@ -81,7 +83,6 @@ public:
// FIXME: wrap for send_command?
MonClient &get_monc() {return monc;}
Objecter &get_objecter() {return objecter;}
Client &get_client() {return client;}
PyObject *cacheable_get_python(const std::string &what);
PyObject *get_python(const std::string &what);
PyObject *get_server_python(const std::string &hostname);
@ -179,7 +180,7 @@ public:
void update_kv_data(
const std::string prefix,
bool incremental,
const map<std::string, std::optional<bufferlist>, std::less<>>& data);
const std::map<std::string, std::optional<bufferlist>, std::less<>>& data);
void _refresh_config_map();
// Public so that MonCommandCompletion can use it


@ -207,21 +207,11 @@ ceph_send_command(BaseMgrModule *self, PyObject *args, PyObject *kwargs)
f->queue(command_c);
});
} else if (std::string(type) == "mds") {
int r = self->py_modules->get_client().mds_command(
name,
{cmd_json},
inbuf,
&command_c->outbl,
&command_c->outs,
new C_OnFinisher(command_c, &self->py_modules->cmd_finisher),
one_shot);
if (r != 0) {
string msg("failed to send command to mds: ");
msg.append(cpp_strerror(r));
PyEval_RestoreThread(tstate);
PyErr_SetString(PyExc_RuntimeError, msg.c_str());
return nullptr;
}
string msg("cannot send command to mds via this interface: ");
msg.append(cpp_strerror(-ENOSYS));
PyEval_RestoreThread(tstate);
PyErr_SetString(PyExc_RuntimeError, msg.c_str());
return nullptr;
} else if (std::string(type) == "pg") {
pg_t pgid;
if (!pgid.parse(name)) {
@ -372,6 +362,20 @@ ceph_set_health_checks(BaseMgrModule *self, PyObject *args)
Py_RETURN_NONE;
}
static PyObject*
ceph_notify_all(BaseMgrModule *self, PyObject *args)
{
char *type = nullptr;
char *id = nullptr;
if (!PyArg_ParseTuple(args, "ss:ceph_notify_all", &type, &id)) {
return nullptr;
}
without_gil([&] {
self->py_modules->notify_all(type, id);
});
Py_RETURN_NONE;
}
static PyObject*
ceph_state_get(BaseMgrModule *self, PyObject *args)
@ -1432,6 +1436,9 @@ PyMethodDef BaseMgrModule_methods[] = {
{"_ceph_get", (PyCFunction)ceph_state_get, METH_VARARGS,
"Get a cluster object"},
{"_ceph_notify_all", (PyCFunction)ceph_notify_all, METH_VARARGS,
"notify all modules"},
{"_ceph_get_server", (PyCFunction)ceph_get_server, METH_VARARGS,
"Get a server object"},


@ -44,7 +44,7 @@ if(WITH_MGR)
target_include_directories(ceph-mgr PRIVATE
$<TARGET_PROPERTY:RocksDB::RocksDB,INTERFACE_INCLUDE_DIRECTORIES>)
target_link_libraries(ceph-mgr
osdc client heap_profiler
osdc heap_profiler
global-static ceph-common
Boost::python${MGR_PYTHON_VERSION_MAJOR}${MGR_PYTHON_VERSION_MINOR}
Python3::Python


@ -51,6 +51,11 @@
#include <iomanip>
#include <list>
#include <map>
#include <string>
#include <vector>
#define dout_context g_ceph_context
#define dout_subsys ceph_subsys_mgr
#undef dout_prefix
@ -343,7 +348,7 @@ bool DaemonServer::ms_handle_refused(Connection *con)
return false;
}
bool DaemonServer::ms_dispatch2(const ref_t<Message>& m)
Dispatcher::dispatch_result_t DaemonServer::ms_dispatch2(const ref_t<Message>& m)
{
// Note that we do *not* take ::lock here, in order to avoid
// serializing all message handling. It's up to each handler
@ -815,14 +820,14 @@ bool DaemonServer::handle_report(const ref_t<MMgrReport>& m)
void DaemonServer::_generate_command_map(
cmdmap_t& cmdmap,
map<string,string> &param_str_map)
std::map<string,string> &param_str_map)
{
for (auto p = cmdmap.begin();
p != cmdmap.end(); ++p) {
if (p->first == "prefix")
continue;
if (p->first == "caps") {
vector<string> cv;
std::vector<string> cv;
if (cmd_getval(cmdmap, "caps", cv) &&
cv.size() % 2 == 0) {
for (unsigned i = 0; i < cv.size(); i += 2) {
@ -856,7 +861,7 @@ bool DaemonServer::_allowed_command(
const string &module,
const string &prefix,
const cmdmap_t& cmdmap,
const map<string,string>& param_str_map,
const std::map<string,string>& param_str_map,
const MonCommand *this_cmd) {
if (s->entity_name.is_mon()) {
@ -1003,7 +1008,7 @@ void DaemonServer::log_access_denied(
}
void DaemonServer::_check_offlines_pgs(
const set<int>& osds,
const std::set<int>& osds,
const OSDMap& osdmap,
const PGMap& pgmap,
offline_pg_report *report)
@ -1013,7 +1018,7 @@ void DaemonServer::_check_offlines_pgs(
report->osds = osds;
for (const auto& q : pgmap.pg_stat) {
set<int32_t> pg_acting; // net acting sets (with no missing if degraded)
std::set<int32_t> pg_acting; // net acting sets (with no missing if degraded)
bool found = false;
if (q.second.state == 0) {
report->unknown.insert(q.first);
@ -1075,7 +1080,7 @@ void DaemonServer::_check_offlines_pgs(
}
void DaemonServer::_maximize_ok_to_stop_set(
const set<int>& orig_osds,
const std::set<int>& orig_osds,
unsigned max,
const OSDMap& osdmap,
const PGMap& pgmap,
@ -1093,9 +1098,9 @@ void DaemonServer::_maximize_ok_to_stop_set(
// semi-arbitrarily start with the first osd in the set
offline_pg_report report;
set<int> osds = orig_osds;
std::set<int> osds = orig_osds;
int parent = *osds.begin();
set<int> children;
std::set<int> children;
while (true) {
// identify the next parent
@ -1161,7 +1166,7 @@ bool DaemonServer::_handle_command(
session->inst.name = m->get_source();
}
map<string,string> param_str_map;
std::map<string,string> param_str_map;
std::stringstream ss;
int r = 0;
@ -1376,7 +1381,7 @@ bool DaemonServer::_handle_command(
}
for (auto& con : p->second) {
assert(HAVE_FEATURE(con->get_features(), SERVER_OCTOPUS));
vector<spg_t> pgs = { spgid };
std::vector<spg_t> pgs = { spgid };
con->send_message(new MOSDScrub2(monc->get_fsid(),
epoch,
pgs,
@ -1392,10 +1397,10 @@ bool DaemonServer::_handle_command(
prefix == "osd repair") {
string whostr;
cmd_getval(cmdctx->cmdmap, "who", whostr);
vector<string> pvec;
std::vector<string> pvec;
get_str_vec(prefix, pvec);
set<int> osds;
std::set<int> osds;
if (whostr == "*" || whostr == "all" || whostr == "any") {
cluster_state.with_osdmap([&](const OSDMap& osdmap) {
for (int i = 0; i < osdmap.get_max_osd(); i++)
@ -1421,9 +1426,9 @@ bool DaemonServer::_handle_command(
return true;
}
}
set<int> sent_osds, failed_osds;
std::set<int> sent_osds, failed_osds;
for (auto osd : osds) {
vector<spg_t> spgs;
std::vector<spg_t> spgs;
epoch_t epoch;
cluster_state.with_osdmap_and_pgmap([&](const OSDMap& osdmap, const PGMap& pgmap) {
epoch = osdmap.get_epoch();
@ -1469,7 +1474,7 @@ bool DaemonServer::_handle_command(
} else if (prefix == "osd pool scrub" ||
prefix == "osd pool deep-scrub" ||
prefix == "osd pool repair") {
vector<string> pool_names;
std::vector<string> pool_names;
cmd_getval(cmdctx->cmdmap, "who", pool_names);
if (pool_names.empty()) {
ss << "must specify one or more pool names";
@ -1477,8 +1482,8 @@ bool DaemonServer::_handle_command(
return true;
}
epoch_t epoch;
map<int32_t, vector<pg_t>> pgs_by_primary; // legacy
map<int32_t, vector<spg_t>> spgs_by_primary;
std::map<int32_t, std::vector<pg_t>> pgs_by_primary; // legacy
std::map<int32_t, std::vector<spg_t>> spgs_by_primary;
cluster_state.with_osdmap([&](const OSDMap& osdmap) {
epoch = osdmap.get_epoch();
for (auto& pool_name : pool_names) {
@ -1533,8 +1538,8 @@ bool DaemonServer::_handle_command(
prefix == "osd test-reweight-by-pg" ||
prefix == "osd test-reweight-by-utilization";
int64_t oload = cmd_getval_or<int64_t>(cmdctx->cmdmap, "oload", 120);
set<int64_t> pools;
vector<string> poolnames;
std::set<int64_t> pools;
std::vector<string> poolnames;
cmd_getval(cmdctx->cmdmap, "pools", poolnames);
cluster_state.with_osdmap([&](const OSDMap& osdmap) {
for (const auto& poolname : poolnames) {
@ -1686,10 +1691,10 @@ bool DaemonServer::_handle_command(
} else if (prefix == "osd safe-to-destroy" ||
prefix == "osd destroy" ||
prefix == "osd purge") {
set<int> osds;
std::set<int> osds;
int r = 0;
if (prefix == "osd safe-to-destroy") {
vector<string> ids;
std::vector<string> ids;
cmd_getval(cmdctx->cmdmap, "ids", ids);
cluster_state.with_osdmap([&](const OSDMap& osdmap) {
r = osdmap.parse_osd_id_list(ids, &osds, &ss);
@ -1711,7 +1716,7 @@ bool DaemonServer::_handle_command(
cmdctx->reply(r, ss);
return true;
}
set<int> active_osds, missing_stats, stored_pgs, safe_to_destroy;
std::set<int> active_osds, missing_stats, stored_pgs, safe_to_destroy;
int affected_pgs = 0;
cluster_state.with_osdmap_and_pgmap([&](const OSDMap& osdmap, const PGMap& pg_map) {
if (pg_map.num_pg_unknown > 0) {
@ -1846,9 +1851,9 @@ bool DaemonServer::_handle_command(
monc->start_mon_command({cmd}, {}, nullptr, &on_finish->outs, on_finish);
return true;
} else if (prefix == "osd ok-to-stop") {
vector<string> ids;
std::vector<string> ids;
cmd_getval(cmdctx->cmdmap, "ids", ids);
set<int> osds;
std::set<int> osds;
int64_t max = 1;
cmd_getval(cmdctx->cmdmap, "max", max);
int r;
@ -1898,11 +1903,11 @@ bool DaemonServer::_handle_command(
prefix == "osd pool force-backfill" ||
prefix == "osd pool cancel-force-recovery" ||
prefix == "osd pool cancel-force-backfill") {
vector<string> vs;
std::vector<string> vs;
get_str_vec(prefix, vs);
auto& granularity = vs.front();
auto& forceop = vs.back();
vector<pg_t> pgs;
std::vector<pg_t> pgs;
// figure out actual op just once
int actual_op = 0;
@ -1916,10 +1921,10 @@ bool DaemonServer::_handle_command(
actual_op = OFR_RECOVERY | OFR_CANCEL;
}
set<pg_t> candidates; // deduped
std::set<pg_t> candidates; // deduped
if (granularity == "pg") {
// convert pg names to pgs, discard any invalid ones while at it
vector<string> pgids;
std::vector<string> pgids;
cmd_getval(cmdctx->cmdmap, "pgid", pgids);
for (auto& i : pgids) {
pg_t pgid;
@ -1932,7 +1937,7 @@ bool DaemonServer::_handle_command(
}
} else {
// per pool
vector<string> pool_names;
std::vector<string> pool_names;
cmd_getval(cmdctx->cmdmap, "who", pool_names);
if (pool_names.empty()) {
ss << "must specify one or more pool names";
@ -2027,7 +2032,7 @@ bool DaemonServer::_handle_command(
// message per distinct OSD
cluster_state.with_osdmap([&](const OSDMap& osdmap) {
// group pgs to process by osd
map<int, vector<spg_t>> osdpgs;
std::map<int, std::vector<spg_t>> osdpgs;
for (auto& pgid : pgs) {
int primary;
spg_t spg;
@ -2273,7 +2278,7 @@ bool DaemonServer::_handle_command(
cmdctx->reply(r, ss);
return true;
} else if (prefix == "device ls") {
set<string> devids;
std::set<string> devids;
TextTable tbl;
if (f) {
f->open_array_section("devices");
@ -2372,7 +2377,7 @@ bool DaemonServer::_handle_command(
} else if (prefix == "device ls-by-host") {
string host;
cmd_getval(cmdctx->cmdmap, "host", host);
set<string> devids;
std::set<string> devids;
daemon_state.list_devids_by_server(host, &devids);
if (f) {
f->open_array_section("devices");
@ -2461,7 +2466,7 @@ bool DaemonServer::_handle_command(
r = -EINVAL;
cmdctx->reply(r, ss);
} else {
map<string,string> meta;
std::map<string,string> meta;
daemon_state.with_device_create(
devid,
[from, to, &meta] (DeviceState& dev) {
@ -2487,7 +2492,7 @@ bool DaemonServer::_handle_command(
} else if (prefix == "device rm-life-expectancy") {
string devid;
cmd_getval(cmdctx->cmdmap, "devid", devid);
map<string,string> meta;
std::map<string,string> meta;
if (daemon_state.with_device_write(devid, [&meta] (DeviceState& dev) {
dev.rm_life_expectancy();
meta = dev.metadata;
@ -2742,7 +2747,7 @@ void DaemonServer::send_report()
});
});
map<daemon_metric, unique_ptr<DaemonHealthMetricCollector>> accumulated;
std::map<daemon_metric, unique_ptr<DaemonHealthMetricCollector>> accumulated;
for (auto service : {"osd", "mon"} ) {
auto daemons = daemon_state.get_by_service(service);
for (const auto& [key,state] : daemons) {
@ -2783,9 +2788,9 @@ void DaemonServer::adjust_pgs()
double max_misplaced = g_conf().get_val<double>("target_max_misplaced_ratio");
bool aggro = g_conf().get_val<bool>("mgr_debug_aggressive_pg_num_changes");
map<string,unsigned> pg_num_to_set;
map<string,unsigned> pgp_num_to_set;
set<pg_t> upmaps_to_clear;
std::map<string,unsigned> pg_num_to_set;
std::map<string,unsigned> pgp_num_to_set;
std::set<pg_t> upmaps_to_clear;
cluster_state.with_osdmap_and_pgmap([&](const OSDMap& osdmap, const PGMap& pg_map) {
unsigned creating_or_unknown = 0;
for (auto& i : pg_map.num_pg_by_state) {
@ -2855,7 +2860,7 @@ void DaemonServer::adjust_pgs()
<< dendl;
ok = false;
}
vector<int32_t> source_acting;
std::vector<int32_t> source_acting;
for (auto &merge_participant : {merge_source, merge_target}) {
bool is_merge_source = merge_participant == merge_source;
if (osdmap.have_pg_upmaps(merge_participant)) {
@ -3147,7 +3152,7 @@ void DaemonServer::got_service_map()
void DaemonServer::got_mgr_map()
{
std::lock_guard l(lock);
set<std::string> have;
std::set<std::string> have;
cluster_state.with_mgrmap([&](const MgrMap& mgrmap) {
auto md_update = [&] (DaemonKey key) {
std::ostringstream oss;
@ -3283,8 +3288,8 @@ bool DaemonServer::asok_command(
even those get stuck. Please enable \"mgr_enable_op_tracker\", and the tracker \
will start to track new ops received afterwards.";
set<string> filters;
vector<string> filter_str;
std::set<string> filters;
std::vector<string> filter_str;
if (cmd_getval(cmdmap, "filterstr", filter_str)) {
copy(filter_str.begin(), filter_str.end(),
inserter(filters, filters.end()));


@ -20,6 +20,7 @@
#include <set>
#include <string>
#include <unordered_map>
#include <vector>
#include "common/ceph_mutex.h"
#include "common/LogClient.h"
@ -52,10 +53,10 @@ struct MDSPerfMetricQuery;
struct offline_pg_report {
set<int> osds;
set<pg_t> ok, not_ok, unknown;
set<pg_t> ok_become_degraded, ok_become_more_degraded; // ok
set<pg_t> bad_no_pool, bad_already_inactive, bad_become_inactive; // not ok
std::set<int> osds;
std::set<pg_t> ok, not_ok, unknown;
std::set<pg_t> ok_become_degraded, ok_become_more_degraded; // ok
std::set<pg_t> bad_no_pool, bad_already_inactive, bad_become_inactive; // not ok
bool ok_to_stop() const {
return not_ok.empty() && unknown.empty();
@ -183,7 +184,7 @@ private:
const PGMap& pgmap,
offline_pg_report *report);
void _maximize_ok_to_stop_set(
const set<int>& orig_osds,
const std::set<int>& orig_osds,
unsigned max,
const OSDMap& osdmap,
const PGMap& pgmap,
@ -274,7 +275,7 @@ public:
LogChannelRef auditcl);
~DaemonServer() override;
bool ms_dispatch2(const ceph::ref_t<Message>& m) override;
Dispatcher::dispatch_result_t ms_dispatch2(const ceph::ref_t<Message>& m) override;
bool ms_handle_fast_authentication(Connection *con) override;
void ms_handle_accept(Connection *con) override;
bool ms_handle_reset(Connection *con) override;


@ -14,7 +14,6 @@
#include <Python.h>
#include "osdc/Objecter.h"
#include "client/Client.h"
#include "common/errno.h"
#include "mon/MonClient.h"
#include "include/stringify.h"
@ -26,15 +25,17 @@
# include "include/libcephsqlite.h"
#endif
#include "mgr/MgrContext.h"
#include "DaemonServer.h"
#include "messages/MMgrDigest.h"
#include "mds/FSMap.h"
#include "messages/MCommand.h"
#include "messages/MCommandReply.h"
#include "messages/MLog.h"
#include "messages/MServiceMap.h"
#include "messages/MFSMap.h"
#include "messages/MKVData.h"
#include "messages/MLog.h"
#include "messages/MMgrDigest.h"
#include "messages/MServiceMap.h"
#include "MgrContext.h"
#include "DaemonServer.h"
#include "PyModule.h"
#include "Mgr.h"
@ -54,10 +55,9 @@ using std::string;
Mgr::Mgr(MonClient *monc_, const MgrMap& mgrmap,
PyModuleRegistry *py_module_registry_,
Messenger *clientm_, Objecter *objecter_,
Client* client_, LogChannelRef clog_, LogChannelRef audit_clog_) :
LogChannelRef clog_, LogChannelRef audit_clog_) :
monc(monc_),
objecter(objecter_),
client(client_),
client_messenger(clientm_),
finisher(g_ceph_context, "Mgr", "mgr-fin"),
digest_received(false),
@ -115,7 +115,7 @@ void MetadataUpdate::finish(int r)
if (daemon_state.exists(key)) {
DaemonStatePtr state = daemon_state.get(key);
map<string,string> m;
std::map<string,string> m;
{
std::lock_guard l(state->lock);
state->hostname = daemon_meta.at("hostname").get_str();
@ -143,7 +143,7 @@ void MetadataUpdate::finish(int r)
}
daemon_meta.erase("hostname");
map<string,string> m;
std::map<string,string> m;
for (const auto &[key, val] : daemon_meta) {
m.emplace(key, val.get_str());
}
@ -332,7 +332,7 @@ void Mgr::init()
++p) {
string devid = p->first.substr(7);
dout(10) << " updating " << devid << dendl;
map<string,string> meta;
std::map<string,string> meta;
ostringstream ss;
int r = get_json_str_map(p->second, ss, &meta, false);
if (r < 0) {
@ -351,7 +351,7 @@ void Mgr::init()
py_module_registry->active_start(
daemon_state, cluster_state,
pre_init_store, mon_allows_kv_sub,
*monc, clog, audit_clog, *objecter, *client,
*monc, clog, audit_clog, *objecter,
finisher, server);
cluster_state.final_init();
@ -451,7 +451,7 @@ void Mgr::load_all_metadata()
daemon_meta.erase("name");
daemon_meta.erase("hostname");
map<string,string> m;
std::map<string,string> m;
for (const auto &[key, val] : daemon_meta) {
m.emplace(key, val.get_str());
}
@ -476,7 +476,7 @@ void Mgr::load_all_metadata()
osd_metadata.erase("id");
osd_metadata.erase("hostname");
map<string,string> m;
std::map<string,string> m;
for (const auto &i : osd_metadata) {
m[i.first] = i.second.get_str();
}
@ -587,7 +587,7 @@ void Mgr::handle_mon_map()
daemon_state.cull("mon", names_exist);
}
bool Mgr::ms_dispatch2(const ref_t<Message>& m)
Dispatcher::dispatch_result_t Mgr::ms_dispatch2(const ref_t<Message>& m)
{
dout(10) << *m << dendl;
std::lock_guard l(lock);
@ -595,16 +595,16 @@ bool Mgr::ms_dispatch2(const ref_t<Message>& m)
switch (m->get_type()) {
case MSG_MGR_DIGEST:
handle_mgr_digest(ref_cast<MMgrDigest>(m));
break;
return Dispatcher::HANDLED();
case CEPH_MSG_MON_MAP:
/* MonClient passthrough of MonMap to us */
handle_mon_map(); /* use monc's monmap */
py_module_registry->notify_all("mon_map", "");
break;
return Dispatcher::ACKNOWLEDGED();
case CEPH_MSG_FS_MAP:
handle_fs_map(ref_cast<MFSMap>(m));
py_module_registry->notify_all("fs_map", "");
return false; // I shall let this pass through for Client
return Dispatcher::ACKNOWLEDGED();
case CEPH_MSG_OSD_MAP:
handle_osd_map();
py_module_registry->notify_all("osd_map", "");
@ -612,14 +612,14 @@ bool Mgr::ms_dispatch2(const ref_t<Message>& m)
// Continuous subscribe, so that we can generate notifications
// for our MgrPyModules
objecter->maybe_request_map();
break;
return Dispatcher::ACKNOWLEDGED();
case MSG_SERVICE_MAP:
handle_service_map(ref_cast<MServiceMap>(m));
//no users: py_module_registry->notify_all("service_map", "");
break;
return Dispatcher::ACKNOWLEDGED();
case MSG_LOG:
handle_log(ref_cast<MLog>(m));
break;
return Dispatcher::HANDLED();
case MSG_KV_DATA:
{
auto msg = ref_cast<MKVData>(m);
@ -656,12 +656,10 @@ bool Mgr::ms_dispatch2(const ref_t<Message>& m)
}
}
}
break;
return Dispatcher::HANDLED();
default:
return false;
return Dispatcher::UNHANDLED();
}
return true;
}
@ -742,7 +740,7 @@ bool Mgr::got_mgr_map(const MgrMap& m)
std::lock_guard l(lock);
dout(10) << m << dendl;
set<string> old_modules;
std::set<string> old_modules;
cluster_state.with_mgrmap([&](const MgrMap& m) {
old_modules = m.modules;
});


@ -17,31 +17,28 @@
// Python.h comes first because otherwise it clobbers ceph's assert
#include <Python.h>
#include "mds/FSMap.h"
#include "messages/MFSMap.h"
#include "msg/Messenger.h"
#include "auth/Auth.h"
#include "common/Finisher.h"
#include "mon/MgrMap.h"
#include "msg/Dispatcher.h"
#include "msg/Messenger.h"
#include "DaemonServer.h"
#include "PyModuleRegistry.h"
#include "DaemonState.h"
#include "ClusterState.h"
#include "DaemonServer.h"
#include "DaemonState.h"
#include "PyModuleRegistry.h"
class MCommand;
class MMgrDigest;
class MLog;
class MServiceMap;
class Objecter;
class Client;
class MFSMap;
class Mgr : public AdminSocketHook {
protected:
MonClient *monc;
Objecter *objecter;
Client *client;
Messenger *client_messenger;
mutable ceph::mutex lock = ceph::make_mutex("Mgr::lock");
@ -74,7 +71,7 @@ public:
Mgr(MonClient *monc_, const MgrMap& mgrmap,
PyModuleRegistry *py_module_registry_,
Messenger *clientm_, Objecter *objecter_,
Client *client_, LogChannelRef clog_, LogChannelRef audit_clog_);
LogChannelRef clog_, LogChannelRef audit_clog_);
~Mgr();
bool is_initialized() const {return initialized;}
@ -91,7 +88,7 @@ public:
bool got_mgr_map(const MgrMap& m);
bool ms_dispatch2(const ceph::ref_t<Message>& m);
Dispatcher::dispatch_result_t ms_dispatch2(const ceph::ref_t<Message>& m);
void background_init(Context *completion);


@ -98,7 +98,7 @@ void MgrClient::shutdown()
}
}
bool MgrClient::ms_dispatch2(const ref_t<Message>& m)
Dispatcher::dispatch_result_t MgrClient::ms_dispatch2(const ref_t<Message>& m)
{
std::lock_guard l(lock);


@ -116,7 +116,7 @@ public:
void set_mgr_optional(bool optional_) {mgr_optional = optional_;}
bool ms_dispatch2(const ceph::ref_t<Message>& m) override;
Dispatcher::dispatch_result_t ms_dispatch2(const ceph::ref_t<Message>& m) override;
bool ms_handle_reset(Connection *con) override;
void ms_handle_remote_reset(Connection *con) override {}
bool ms_handle_refused(Connection *con) override;


@ -52,7 +52,6 @@ MgrStandby::MgrStandby(int argc, const char **argv) :
"mgr",
Messenger::get_random_nonce())),
objecter{g_ceph_context, client_messenger.get(), &monc, poolctx},
client{client_messenger.get(), &monc, &objecter},
mgrc(g_ceph_context, client_messenger.get(), &monc.monmap),
log_client(g_ceph_context, client_messenger.get(), &monc.monmap, LogClient::NO_FLAGS),
clog(log_client.create_channel(CLOG_CHANNEL_CLUSTER)),
@ -131,7 +130,6 @@ int MgrStandby::init()
// Initialize Messenger
client_messenger->add_dispatcher_tail(this);
client_messenger->add_dispatcher_head(&objecter);
client_messenger->add_dispatcher_tail(&client);
client_messenger->start();
poolctx.start(2);
@ -198,7 +196,6 @@ int MgrStandby::init()
objecter.set_client_incarnation(0);
objecter.init();
objecter.start();
client.init();
timer.init();
py_module_registry.init();
@ -369,7 +366,7 @@ void MgrStandby::handle_mgr_map(ref_t<MMgrMap> mmap)
dout(1) << "Activating!" << dendl;
active_mgr.reset(new Mgr(&monc, map, &py_module_registry,
client_messenger.get(), &objecter,
&client, clog, audit_clog));
clog, audit_clog));
active_mgr->background_init(new LambdaContext(
[this](int r){
// Advertise our active-ness ASAP instead of waiting for
@ -405,26 +402,24 @@ void MgrStandby::handle_mgr_map(ref_t<MMgrMap> mmap)
}
}
bool MgrStandby::ms_dispatch2(const ref_t<Message>& m)
Dispatcher::dispatch_result_t MgrStandby::ms_dispatch2(const ref_t<Message>& m)
{
std::lock_guard l(lock);
dout(10) << state_str() << " " << *m << dendl;
Dispatcher::dispatch_result_t r;
if (m->get_type() == MSG_MGR_MAP) {
handle_mgr_map(ref_cast<MMgrMap>(m));
r = Dispatcher::ACKNOWLEDGED();
}
bool handled = false;
if (active_mgr) {
auto am = active_mgr;
lock.unlock();
handled = am->ms_dispatch2(m);
r = am->ms_dispatch2(m);
lock.lock();
}
if (m->get_type() == MSG_MGR_MAP) {
// let this pass through for mgrc
handled = false;
}
return handled;
return r;
}
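The control flow of the rewritten `MgrStandby::ms_dispatch2` depends on what a default-constructed `dispatch_result_t` holds. The following is a minimal sketch, assuming standalone copies of the tag structs; `standby_dispatch` and its boolean parameters are hypothetical reductions of the real checks (`m->get_type() == MSG_MGR_MAP` and `active_mgr`), not code from this PR.

```cpp
#include <cassert>
#include <variant>

// Standalone stand-ins for the tag types declared in Dispatcher.
struct HANDLED {};
struct UNHANDLED {};
struct ACKNOWLEDGED {};
using dispatch_result_t = std::variant<bool, HANDLED, UNHANDLED, ACKNOWLEDGED>;

// Hypothetical reduction of the function body shown above.
inline dispatch_result_t standby_dispatch(bool is_mgr_map,
                                          bool has_active_mgr,
                                          dispatch_result_t active_result) {
  dispatch_result_t r;        // default-constructed: holds bool{false}
  if (is_mgr_map) {
    r = ACKNOWLEDGED{};       // the standby processed the map itself
  }
  if (has_active_mgr) {
    r = active_result;        // replaces whatever was stored before
  }
  return r;
}
```

Calling `standby_dispatch(false, false, HANDLED{})` yields a variant holding `false`, which the messenger folds to `UNHANDLED`. With both flags set, the active manager's result replaces the earlier `ACKNOWLEDGED` value.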


@ -21,7 +21,6 @@
#include "common/Timer.h"
#include "common/LogClient.h"
#include "client/Client.h"
#include "mon/MonClient.h"
#include "osdc/Objecter.h"
#include "PyModuleRegistry.h"
@ -44,7 +43,6 @@ protected:
MonClient monc;
std::unique_ptr<Messenger> client_messenger;
Objecter objecter;
Client client;
MgrClient mgrc;
@ -73,7 +71,7 @@ public:
MgrStandby(int argc, const char **argv);
~MgrStandby() override;
bool ms_dispatch2(const ceph::ref_t<Message>& m) override;
Dispatcher::dispatch_result_t ms_dispatch2(const ceph::ref_t<Message>& m) override;
bool ms_handle_reset(Connection *con) override { return false; }
void ms_handle_remote_reset(Connection *con) override {}
bool ms_handle_refused(Connection *con) override;


@ -211,7 +211,7 @@ void PyModuleRegistry::active_start(
const std::map<std::string, std::string> &kv_store,
bool mon_provides_kv_sub,
MonClient &mc, LogChannelRef clog_, LogChannelRef audit_clog_,
Objecter &objecter_, Client &client_, Finisher &f,
Objecter &objecter_, Finisher &f,
DaemonServer &server)
{
std::lock_guard locker(lock);
@ -234,7 +234,7 @@ void PyModuleRegistry::active_start(
module_config,
kv_store, mon_provides_kv_sub,
ds, cs, mc,
clog_, audit_clog_, objecter_, client_, f, server,
clog_, audit_clog_, objecter_, f, server,
*this));
for (const auto &i : modules) {


@ -17,16 +17,17 @@
// First because it includes Python.h
#include "PyModule.h"
#include <string>
#include <map>
#include <set>
#include <memory>
#include "common/LogClient.h"
#include "ActivePyModules.h"
#include "StandbyPyModules.h"
#include <map>
#include <memory>
#include <set>
#include <string>
#include <vector>
class MgrSession;
/**
@ -70,7 +71,7 @@ public:
void update_kv_data(
const std::string prefix,
bool incremental,
const map<std::string, std::optional<bufferlist>, std::less<>>& data) {
const std::map<std::string, std::optional<bufferlist>, std::less<>>& data) {
ceph_assert(active_modules);
active_modules->update_kv_data(prefix, incremental, data);
}
@ -114,7 +115,7 @@ public:
const std::map<std::string, std::string> &kv_store,
bool mon_provides_kv_sub,
MonClient &mc, LogChannelRef clog_, LogChannelRef audit_clog_,
Objecter &objecter_, Client &client_, Finisher &f,
Objecter &objecter_, Finisher &f,
DaemonServer &server);
void standby_start(MonClient &mc, Finisher &f);
@ -165,7 +166,7 @@ public:
*/
void get_health_checks(health_check_map_t *checks);
void get_progress_events(map<std::string,ProgressEvent> *events) {
void get_progress_events(std::map<std::string,ProgressEvent> *events) {
if (active_modules) {
active_modules->get_progress_events(events);
}


@ -22,6 +22,8 @@
#include "include/common_fwd.h"
#include "msg/MessageRef.h"
#include <variant>
class Messenger;
class Connection;
class CryptoKey;
@ -124,7 +126,24 @@ public:
}
/* ms_dispatch2 because otherwise the child must define both */
virtual bool ms_dispatch2(const MessageRef &m) {
struct HANDLED {};
struct UNHANDLED {};
struct ACKNOWLEDGED {};
typedef std::variant<bool, HANDLED, UNHANDLED, ACKNOWLEDGED> dispatch_result_t;
static inline dispatch_result_t fold_dispatch_result(dispatch_result_t r) {
if (std::holds_alternative<bool>(r)) {
if (std::get<bool>(r)) {
return HANDLED();
} else {
return UNHANDLED();
}
} else {
return r;
}
}
virtual dispatch_result_t ms_dispatch2(const MessageRef &m) {
/* allow old style dispatch handling that expects a Message * with a floating ref */
MessageRef mr(m);
if (ms_dispatch(mr.get())) {
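The result type added in this hunk can be tried in isolation. This is a minimal sketch, assuming free-standing copies of the tag structs rather than members of `Dispatcher`; `legacy_handler` is a hypothetical stand-in for a dispatcher that still returns `bool`.

```cpp
#include <cassert>
#include <variant>

// Free-standing copies of the tag types from the hunk above.
struct HANDLED {};
struct UNHANDLED {};
struct ACKNOWLEDGED {};
using dispatch_result_t = std::variant<bool, HANDLED, UNHANDLED, ACKNOWLEDGED>;

// Same logic as fold_dispatch_result: a legacy bool collapses to a tag,
// any other alternative passes through unchanged.
inline dispatch_result_t fold_dispatch_result(dispatch_result_t r) {
  if (std::holds_alternative<bool>(r)) {
    if (std::get<bool>(r)) {
      return HANDLED{};
    }
    return UNHANDLED{};
  }
  return r;
}

// Hypothetical old-style handler: the bool converts implicitly to the variant.
inline dispatch_result_t legacy_handler(int type) {
  return type == 1;
}
```

Because `bool` stays in the variant, dispatchers that have not been converted keep compiling; only the caller needs to fold the result before inspecting it.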


@ -736,11 +736,17 @@ public:
*/
void ms_deliver_dispatch(const ceph::ref_t<Message> &m) {
m->set_dispatch_stamp(ceph_clock_now());
bool acked = false;
for ([[maybe_unused]] const auto& [priority, dispatcher] : dispatchers) {
if (dispatcher->ms_dispatch2(m)) {
auto r = Dispatcher::fold_dispatch_result(dispatcher->ms_dispatch2(m));
if (std::holds_alternative<Dispatcher::HANDLED>(r)) {
return;
} else if (std::holds_alternative<Dispatcher::ACKNOWLEDGED>(r)) {
acked = true;
}
}
if (acked)
return;
lsubdout(cct, ms, 0) << "ms_deliver_dispatch: unhandled message " << m << " " << *m << " from "
<< m->get_source_inst() << dendl;
ceph_assert(!cct->_conf->ms_die_on_unhandled_msg);
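The delivery loop in this hunk stops at the first `HANDLED` result, keeps going after `ACKNOWLEDGED`, and only reports an unhandled message when no dispatcher acknowledged it. Below is a reduced model of that control flow. `handler_t`, `Outcome`, `Delivery` and `deliver` are hypothetical names, and an `int` stands in for the message.

```cpp
#include <cassert>
#include <functional>
#include <variant>
#include <vector>

struct HANDLED {};
struct UNHANDLED {};
struct ACKNOWLEDGED {};
using dispatch_result_t = std::variant<bool, HANDLED, UNHANDLED, ACKNOWLEDGED>;
using handler_t = std::function<dispatch_result_t(int)>;  // int replaces Message

enum class Outcome { Consumed, Acked, Unhandled };
struct Delivery {
  Outcome outcome;  // what the loop concluded
  int visited;      // how many dispatchers saw the message
};

inline Delivery deliver(const std::vector<handler_t>& dispatchers, int msg) {
  bool acked = false;
  int visited = 0;
  for (const auto& dispatch : dispatchers) {
    ++visited;
    dispatch_result_t r = dispatch(msg);
    if (const bool* legacy = std::get_if<bool>(&r)) {
      // Legacy bool: true behaves like HANDLED, false like UNHANDLED.
      if (*legacy) return {Outcome::Consumed, visited};
      continue;
    }
    if (std::holds_alternative<HANDLED>(r)) {
      return {Outcome::Consumed, visited};  // stop: nobody else sees it
    }
    if (std::holds_alternative<ACKNOWLEDGED>(r)) {
      acked = true;  // remember, but let later dispatchers look too
    }
  }
  return {acked ? Outcome::Acked : Outcome::Unhandled, visited};
}
```

In this model an `Acked` outcome corresponds to the early `return` that skips the "unhandled message" log line, and `Unhandled` corresponds to falling through to it.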


@ -428,7 +428,7 @@ void NVMeofGwMonitorClient::handle_nvmeof_gw_map(ceph::ref_t<MNVMeofGwMap> nmap)
map = new_map;
}
bool NVMeofGwMonitorClient::ms_dispatch2(const ref_t<Message>& m)
Dispatcher::dispatch_result_t NVMeofGwMonitorClient::ms_dispatch2(const ref_t<Message>& m)
{
std::lock_guard l(lock);
dout(10) << "got map type " << m->get_type() << dendl;


@ -74,7 +74,7 @@ public:
~NVMeofGwMonitorClient() override;
// Dispatcher interface
bool ms_dispatch2(const ceph::ref_t<Message>& m) override;
Dispatcher::dispatch_result_t ms_dispatch2(const ceph::ref_t<Message>& m) override;
bool ms_handle_reset(Connection *con) override { return false; }
void ms_handle_remote_reset(Connection *con) override {}
bool ms_handle_refused(Connection *con) override { return false; };

View File

@ -1000,52 +1000,61 @@ void Objecter::_do_watch_notify(boost::intrusive_ptr<LingerOp> info,
info->finished_async();
}
bool Objecter::ms_dispatch(Message *m)
Dispatcher::dispatch_result_t Objecter::ms_dispatch2(const MessageRef &m)
{
ldout(cct, 10) << __func__ << " " << cct << " " << *m << dendl;
switch (m->get_type()) {
// these we exclusively handle
case CEPH_MSG_OSD_OPREPLY:
handle_osd_op_reply(static_cast<MOSDOpReply*>(m));
return true;
m->get(); /* ref to be consumed */
handle_osd_op_reply(ref_cast<MOSDOpReply>(m).get());
return Dispatcher::HANDLED();
case CEPH_MSG_OSD_BACKOFF:
handle_osd_backoff(static_cast<MOSDBackoff*>(m));
return true;
m->get(); /* ref to be consumed */
handle_osd_backoff(ref_cast<MOSDBackoff>(m).get());
return Dispatcher::HANDLED();
case CEPH_MSG_WATCH_NOTIFY:
handle_watch_notify(static_cast<MWatchNotify*>(m));
m->put();
return true;
/* ref not consumed! */
handle_watch_notify(ref_cast<MWatchNotify>(m).get());
return Dispatcher::HANDLED();
case MSG_COMMAND_REPLY:
if (m->get_source().type() == CEPH_ENTITY_TYPE_OSD) {
handle_command_reply(static_cast<MCommandReply*>(m));
return true;
m->get(); /* ref to be consumed */
handle_command_reply(ref_cast<MCommandReply>(m).get());
return Dispatcher::HANDLED();
} else {
return false;
return Dispatcher::UNHANDLED();
}
case MSG_GETPOOLSTATSREPLY:
handle_get_pool_stats_reply(static_cast<MGetPoolStatsReply*>(m));
return true;
m->get(); /* ref to be consumed */
handle_get_pool_stats_reply(ref_cast<MGetPoolStatsReply>(m).get());
return Dispatcher::HANDLED();
case CEPH_MSG_POOLOP_REPLY:
handle_pool_op_reply(static_cast<MPoolOpReply*>(m));
return true;
m->get(); /* ref to be consumed */
handle_pool_op_reply(ref_cast<MPoolOpReply>(m).get());
return Dispatcher::HANDLED();
case CEPH_MSG_STATFS_REPLY:
handle_fs_stats_reply(static_cast<MStatfsReply*>(m));
return true;
m->get(); /* ref to be consumed */
handle_fs_stats_reply(ref_cast<MStatfsReply>(m).get());
return Dispatcher::HANDLED();
// these we give others a chance to inspect
// MDS, OSD
case CEPH_MSG_OSD_MAP:
handle_osd_map(static_cast<MOSDMap*>(m));
return false;
/* ref not consumed! */
handle_osd_map(ref_cast<MOSDMap>(m).get());
return Dispatcher::ACKNOWLEDGED();
default:
return Dispatcher::UNHANDLED();
}
return false;
}
void Objecter::_scan_requests(
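The `m->get()` calls in the converted `Objecter::ms_dispatch2` exist because several handlers still take a raw pointer and drop one reference themselves, while `handle_watch_notify` and `handle_osd_map` only borrow the message. The sketch below illustrates that hand-off with a toy reference count. `Msg`, `MsgRef`, `consuming_handler`, `borrowing_handler` and `dispatch` are hypothetical stand-ins, not Ceph's `RefCountedObject` or `MessageRef`, and the toy `put()` never frees anything.

```cpp
#include <cassert>

// Toy intrusive reference count (assumption: a real put() frees at zero).
struct Msg {
  int nref = 0;
  void get() { ++nref; }
  void put() { --nref; }
};

// Minimal RAII handle that owns exactly one reference.
class MsgRef {
  Msg* p;
public:
  explicit MsgRef(Msg* m) : p(m) { p->get(); }
  MsgRef(const MsgRef& o) : p(o.p) { p->get(); }
  MsgRef& operator=(const MsgRef&) = delete;
  ~MsgRef() { p->put(); }
  Msg* get() const { return p; }
  Msg* operator->() const { return p; }
};

// Old-style handler: consumes one reference on the raw pointer.
inline void consuming_handler(Msg* m) { m->put(); }
// Handler that only inspects the message and leaves the count alone.
inline void borrowing_handler(Msg*) {}

inline void dispatch(const MsgRef& m, bool handler_consumes) {
  if (handler_consumes) {
    m->get();                    // extra ref for the handler to consume
    consuming_handler(m.get());
  } else {
    borrowing_handler(m.get());  // ref not consumed
  }
}
```

Either path leaves the caller's `MsgRef` holding exactly one reference, which is why the converted function no longer needs the explicit `m->put()` that the old `ms_fast_dispatch` performed.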


@ -2730,7 +2730,8 @@ private:
// messages
public:
bool ms_dispatch(Message *m) override;
Dispatcher::dispatch_result_t ms_dispatch2(const MessageRef &m) override;
bool ms_can_fast_dispatch_any() const override {
return true;
}
@ -2743,10 +2744,8 @@ private:
return false;
}
}
void ms_fast_dispatch(Message *m) override {
if (!ms_dispatch(m)) {
m->put();
}
void ms_fast_dispatch2(const MessageRef& m) override {
[[maybe_unused]] auto s = ms_dispatch2(m);
}
void handle_osd_op_reply(class MOSDOpReply *m);


@ -71,6 +71,10 @@ cdef extern from "cephfs/libcephfs.h" nogil:
int ceph_setattrx(ceph_mount_info *cmount, const char *relpath, statx *stx, int mask, int flags)
int ceph_fsetattrx(ceph_mount_info *cmount, int fd, statx *stx, int mask)
ctypedef void (*libcephfs_c_completion_t)(int rc, const void* out, size_t outlen, const void* outs, size_t outslen, void* ud) nogil
int ceph_mds_command2(ceph_mount_info* cmount, const char* mds_spec, const char** cmd, size_t cmdlen, const char* inbuf, size_t inbuflen, int one_shot, libcephfs_c_completion_t c, void* ud)
int ceph_mds_command(ceph_mount_info *cmount, const char *mds_spec, const char **cmd, size_t cmdlen,
const char *inbuf, size_t inbuflen, char **outbuf, size_t *outbuflen,
char **outs, size_t *outslen)


@ -111,6 +111,13 @@ cdef extern from "Python.h":
int _PyBytes_Resize(PyObject **string, Py_ssize_t newsize) except -1
void PyEval_InitThreads()
cdef void completion_callback(int rc, const void* out, size_t outlen, const void* outs, size_t outslen, void* ud) nogil:
# This GIL awkwardness is due to incompatible types with function pointers defined with mds_command2:
with gil:
pyout = (<unsigned char*>out)[:outlen]
pyouts = (<unsigned char*>outs)[:outslen]
(<object>ud).complete(rc, pyout, pyouts)
ref.Py_DECREF(<object>ud)
class Error(Exception):
def get_error_code(self):
@ -2280,6 +2287,49 @@ cdef class LibCephFS(object):
raise make_ex(ret, "error in rename {} to {}".format(src.decode(
'utf-8'), dst.decode('utf-8')))
def mds_command2(self, result, mds_spec, args, input_data=None, one_shot=False):
"""
:param: result: a completion object with a complete method accepting an integer rc, bytes output, and bytes error output
:param: mds_spec: the identity of one or more MDS to send the command to (e.g. "*" or "fsname:0")
:param: args: the JSON-encoded MDS command
:param: input_data: optional input data to the command
:param: one_shot: optional boolean indicating if the command should only be tried/sent once
:returns: None once the command is (or will be) sent; otherwise an exception is raised
"""
if input_data is None:
input_data = ""
mds_spec = cstr(mds_spec, 'mds_spec')
args = cstr(args, 'args')
input_data = cstr(input_data, 'input_data')
cdef:
char *_mds_spec = opt_str(mds_spec)
char **_cmd = to_bytes_array([args])
size_t _cmdlen = 1
char *_inbuf = input_data
size_t _inbuf_len = len(input_data)
int _one_shot = one_shot
try:
with nogil:
ret = ceph_mds_command2(self.cluster, _mds_spec,
<const char **>_cmd, _cmdlen,
<const char*>_inbuf, _inbuf_len,
_one_shot,
completion_callback,
<void*>result)
if ret == 0:
ref.Py_INCREF(result)
else:
raise make_ex(ret, "error in mds_command2")
finally:
free(_cmd)
def mds_command(self, mds_spec, args, input_data):
"""
:returns: 3-tuple of output status int, output status string, output data


@ -95,6 +95,9 @@ cdef nogil:
pass
int ceph_fsetattrx(ceph_mount_info *cmount, int fd, statx *stx, int mask):
pass
ctypedef void (*libcephfs_c_completion_t)(int rc, const void* out, size_t outlen, const void* outs, size_t outslen, void* ud) nogil
int ceph_mds_command2(ceph_mount_info* cmount, const char* mds_spec, const char** cmd, size_t cmdlen, const char* inbuf, size_t inbuflen, int one_shot, libcephfs_c_completion_t c, void* ud):
pass
int ceph_mds_command(ceph_mount_info *cmount, const char *mds_spec, const char **cmd, size_t cmdlen,
const char *inbuf, size_t inbuflen, char **outbuf, size_t *outbuflen,
char **outs, size_t *outslen):


@ -69,6 +69,7 @@ class BaseMgrModule(object):
def _ceph_cluster_log(self, channel: str, priority: int, message: str) -> None: ...
def _ceph_get_context(self) -> object: ...
def _ceph_get(self, data_name: str) -> Any: ...
def _ceph_notify_all(self, what: str, tag: str) -> None: ...
def _ceph_get_server(self, hostname: Optional[str]) -> Union[ServerInfoT,
List[ServerInfoT]]: ...
def _ceph_get_perf_schema(self, svc_type: str, svc_name: str) -> Dict[str, Any]: ...


@ -24,6 +24,7 @@ if TYPE_CHECKING:
else:
from typing_extensions import Literal
import cephfs
import inspect
import logging
import errno
@ -1108,6 +1109,8 @@ class MgrModule(ceph_module.BaseMgrModule, MgrModuleLoggingMixin):
# Keep a librados instance for those that need it.
self._rados: Optional[rados.Rados] = None
self._cephfs: Optional[cephfs.LibCephFS] = None
# this does not change over the lifetime of an active mgr
self._mgr_ips: Optional[str] = None
@ -1850,6 +1853,18 @@ class MgrModule(ceph_module.BaseMgrModule, MgrModuleLoggingMixin):
return (rc, stdout, stderr)
class _CommandResultWrapper:
def __init__(self, module: 'MgrModule', tag: Optional[str], result: CommandResult):
if tag is None:
tag = ""
self.module = module
self.tag = tag
self.result = result
def complete(self, r: int, outb: bytes, outs: bytes) -> None:
self.result.complete(r, outb.decode('utf-8'), outs.decode('utf-8'))
self.module._ceph_notify_all("command", self.tag)
def send_command(
self,
result: CommandResult,
@ -1883,7 +1898,13 @@ class MgrModule(ceph_module.BaseMgrModule, MgrModuleLoggingMixin):
:param bool one_shot: a keyword-only param to make the command abort
with EPIPE when the target resets or refuses to reconnect
"""
self._ceph_send_command(result, svc_type, svc_id, command, tag, inbuf, one_shot=one_shot)
if svc_type == "mds":
wrapped_result = self._CommandResultWrapper(self, tag, result)
self.log.info(f"do mds_command: mds.{svc_id} {command}")
self.cephfs.mds_command2(wrapped_result, svc_id, command, inbuf, one_shot=one_shot)
else:
self._ceph_send_command(result, svc_type, svc_id, command, tag, inbuf, one_shot=one_shot)
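A minimal stand-alone sketch of the wrapper idea in this hunk, assuming nothing about the real mgr API: `FakeResult`, `FakeModule`, and `ResultWrapper` are hypothetical stand-ins. It only illustrates how a completion that delivers raw byte buffers can be adapted to a result object that expects strings, with a notification sent under the command's tag afterwards.

```python
from typing import List, Optional, Tuple


class FakeResult:
    """Hypothetical stand-in for CommandResult: stores the decoded reply."""

    def __init__(self) -> None:
        self.reply: Optional[Tuple[int, str, str]] = None

    def complete(self, r: int, outb: str, outs: str) -> None:
        self.reply = (r, outb, outs)


class FakeModule:
    """Hypothetical stand-in for the mgr module: records notifications."""

    def __init__(self) -> None:
        self.notifications: List[Tuple[str, str]] = []

    def notify_all(self, what: str, tag: str) -> None:
        self.notifications.append((what, tag))


class ResultWrapper:
    """Decode the byte buffers from the completion, then notify under the tag."""

    def __init__(self, module: FakeModule, tag: Optional[str],
                 result: FakeResult) -> None:
        self.module = module
        # A missing tag is normalized to the empty string, as in the hunk above.
        self.tag = "" if tag is None else tag
        self.result = result

    def complete(self, r: int, outb: bytes, outs: bytes) -> None:
        self.result.complete(r, outb.decode("utf-8"), outs.decode("utf-8"))
        self.module.notify_all("command", self.tag)


module = FakeModule()
result = FakeResult()
wrapper = ResultWrapper(module, None, result)
# Simulate the asynchronous completion firing with raw byte buffers.
wrapper.complete(0, b'{"ok": true}', b"")
```

The caller keeps its original result object and never sees the wrapper, so the same waiting logic can serve both dispatch paths.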
def tool_exec(
self,
@ -2332,6 +2353,19 @@ class MgrModule(ceph_module.BaseMgrModule, MgrModuleLoggingMixin):
self._ceph_register_client(None, self._rados.get_addrs(), False)
return self._rados
@property
def cephfs(self) -> cephfs.LibCephFS:
"""
An (unmounted) cephfs instance to be shared by any classes within this
mgr module that want one.
"""
if self._cephfs:
return self._cephfs
self._cephfs = cephfs.LibCephFS(rados_inst=self.rados)
self._cephfs.init()
return self._cephfs
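The property above follows a create-on-first-use pattern. A minimal sketch of that pattern, with a hypothetical `FakeHandle` in place of the real client object (none of these names come from the PR):

```python
class FakeHandle:
    """Hypothetical stand-in for an expensive-to-create client handle."""

    created = 0  # counts how many handles were ever constructed

    def __init__(self) -> None:
        FakeHandle.created += 1
        self.initialized = False

    def init(self) -> None:
        self.initialized = True


class Module:
    def __init__(self) -> None:
        self._handle = None

    @property
    def handle(self) -> FakeHandle:
        # Construct and initialize once, then hand back the cached instance.
        if self._handle is None:
            self._handle = FakeHandle()
            self._handle.init()
        return self._handle


m = Module()
first = m.handle
second = m.handle
```

Like any unsynchronized lazy initializer, this sketch assumes the first access is not raced by another thread.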
@staticmethod
def can_run() -> Tuple[bool, str]:
"""


@ -0,0 +1,58 @@
import errno
import cephfs
from ..exception import VolumeException
from ceph.utils import strtobool
_charmap_type = {
"casesensitive": lambda x: int(strtobool(x)),
"normalization": lambda x: str(x),
"encoding": lambda x: str(x),
}
def charmap_set(fs, path, setting, value):
"""
Set a charmap setting on a directory and return the resulting charmap.
"""
if setting not in _charmap_type:
raise VolumeException(-errno.EINVAL, "charmap setting invalid")
try:
value = _charmap_type[setting](value)
except ValueError:
raise VolumeException(-errno.EINVAL, f"charmap value wrong type: {setting}")
try:
fs.setxattr(path, f"ceph.dir.{setting}", str(value).encode('utf-8'), 0)
except cephfs.Error as e:
raise VolumeException(-e.args[0], e.args[1])
try:
return fs.getxattr(path, "ceph.dir.charmap").decode('utf-8')
except cephfs.Error as e:
raise VolumeException(-e.args[0], e.args[1])
def charmap_rm(fs, path):
"""
Remove a charmap on a directory.
"""
try:
fs.removexattr(path, "ceph.dir.charmap", 0)
except cephfs.Error as e:
raise VolumeException(-e.args[0], e.args[1])
def charmap_get(fs, path, setting):
"""
Get a single charmap setting, or the whole charmap, from a directory.
"""
if setting not in _charmap_type and setting != 'charmap':
raise VolumeException(-errno.EINVAL, "charmap setting invalid")
try:
return fs.getxattr(path, f"ceph.dir.{setting}").decode('utf-8')
except cephfs.Error as e:
raise VolumeException(-e.args[0], e.args[1])
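The value coercion performed by `charmap_set` can be tried in isolation. The sketch below is not the module itself: `strtobool` is a hypothetical stand-in whose accepted spellings are an assumption about `ceph.utils.strtobool`, and `coerce` is an illustrative helper that returns the xattr name and encoded value instead of calling `setxattr`.

```python
def strtobool(value: str) -> bool:
    """Hypothetical stand-in for ceph.utils.strtobool (assumed semantics)."""
    truthy = {"y", "yes", "t", "true", "on", "1"}
    falsy = {"n", "no", "f", "false", "off", "0"}
    v = str(value).lower()
    if v in truthy:
        return True
    if v in falsy:
        return False
    raise ValueError(f"invalid truth value {value!r}")


# Same shape as the _charmap_type table: setting name -> converter.
CHARMAP_TYPES = {
    "casesensitive": lambda x: int(strtobool(x)),
    "normalization": str,
    "encoding": str,
}


def coerce(setting: str, value: str):
    """Return (xattr name, encoded value) or raise ValueError."""
    if setting not in CHARMAP_TYPES:
        raise ValueError("charmap setting invalid")
    typed = CHARMAP_TYPES[setting](value)
    return f"ceph.dir.{setting}", str(typed).encode("utf-8")


name, payload = coerce("casesensitive", "false")
```

Here `"false"` is converted to the integer 0 before encoding, so the value written to the xattr is `b"0"` rather than the user's original spelling.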


@ -6,6 +6,7 @@ from contextlib import contextmanager
import cephfs
from .snapshot_util import mksnap, rmsnap
from .charmap_util import charmap_get, charmap_set, charmap_rm
from .pin_util import pin
from .template import GroupTemplate
from ..fs_util import listdir, listsnaps, get_ancestor_xattr, create_base_dir, has_subdir
@ -78,6 +79,15 @@ class Group(GroupTemplate):
def pin(self, pin_type, pin_setting):
return pin(self.fs, self.path, pin_type, pin_setting)
def charmap_set(self, setting, value):
return charmap_set(self.fs, self.path, setting, value)
def charmap_rm(self):
return charmap_rm(self.fs, self.path)
def charmap_get(self, setting):
return charmap_get(self.fs, self.path, setting)
def create_snapshot(self, snapname):
snappath = os.path.join(self.path,
self.vol_spec.snapshot_dir_prefix.encode('utf-8'),


@ -40,6 +40,7 @@ class SubvolumeOpType(Enum):
CREATE = 'create'
REMOVE = 'rm'
REMOVE_FORCE = 'rm-force'
CHARMAP = 'charmap'
PIN = 'pin'
LIST = 'ls'
GETPATH = 'getpath'


@ -9,6 +9,7 @@ from pathlib import Path
import cephfs
from ..charmap_util import charmap_get, charmap_set, charmap_rm
from ..pin_util import pin
from .subvolume_attrs import SubvolumeTypes
from .metadata_manager import MetadataManager
@ -342,6 +343,15 @@ class SubvolumeBase(object):
def pin(self, pin_type, pin_setting):
return pin(self.fs, self.base_path, pin_type, pin_setting)
def charmap_set(self, setting, value):
return charmap_set(self.fs, self.path, setting, value)
def charmap_rm(self):
return charmap_rm(self.fs, self.path)
def charmap_get(self, setting):
return charmap_get(self.fs, self.path, setting)
def init_config(self, version, subvolume_type,
subvolume_path, subvolume_state):
self.metadata_mgr.init(version, subvolume_type.value,


@ -423,6 +423,58 @@ class VolumeClient(CephfsClient["Module"]):
ret = self.volume_exception_to_retval(ve)
return ret
def subvolume_charmap_set(self, **kwargs):
ret = 0, "", ""
volname = kwargs['vol_name']
subvolname = kwargs['sub_name']
setting = kwargs['setting']
value = kwargs['value']
groupname = kwargs['group_name']
try:
with open_volume(self, volname) as fs_handle:
with open_group(fs_handle, self.volspec, groupname) as group:
with open_subvol(self.mgr, fs_handle, self.volspec, group, subvolname, SubvolumeOpType.CHARMAP) as subvolume:
v = subvolume.charmap_set(setting, value)
ret = 0, v, ""
except VolumeException as ve:
ret = self.volume_exception_to_retval(ve)
return ret
def subvolume_charmap_rm(self, **kwargs):
ret = 0, "", ""
volname = kwargs['vol_name']
subvolname = kwargs['sub_name']
groupname = kwargs['group_name']
try:
with open_volume(self, volname) as fs_handle:
with open_group(fs_handle, self.volspec, groupname) as group:
with open_subvol(self.mgr, fs_handle, self.volspec, group, subvolname, SubvolumeOpType.CHARMAP) as subvolume:
subvolume.charmap_rm()
ret = 0, json.dumps({}), ""
except VolumeException as ve:
ret = self.volume_exception_to_retval(ve)
return ret
def subvolume_charmap_get(self, **kwargs):
ret = 0, "", ""
volname = kwargs['vol_name']
subvolname = kwargs['sub_name']
setting = kwargs['setting']
groupname = kwargs['group_name']
try:
with open_volume(self, volname) as fs_handle:
with open_group(fs_handle, self.volspec, groupname) as group:
with open_subvol(self.mgr, fs_handle, self.volspec, group, subvolname, SubvolumeOpType.CHARMAP) as subvolume:
v = subvolume.charmap_get(setting)
ret = 0, v, ""
except VolumeException as ve:
ret = self.volume_exception_to_retval(ve)
return ret
def subvolume_getpath(self, **kwargs):
ret = None
volname = kwargs['vol_name']
@ -1131,6 +1183,51 @@ class VolumeClient(CephfsClient["Module"]):
ret = self.volume_exception_to_retval(ve)
return ret
def subvolume_group_charmap_set(self, **kwargs):
ret = 0, "", ""
volname = kwargs['vol_name']
groupname = kwargs['group_name']
setting = kwargs['setting']
value = kwargs['value']
try:
with open_volume(self, volname) as fs_handle:
with open_group(fs_handle, self.volspec, groupname) as group:
v = group.charmap_set(setting, value)
ret = 0, v, ""
except VolumeException as ve:
ret = self.volume_exception_to_retval(ve)
return ret
def subvolume_group_charmap_rm(self, **kwargs):
ret = 0, "", ""
volname = kwargs['vol_name']
groupname = kwargs['group_name']
try:
with open_volume(self, volname) as fs_handle:
with open_group(fs_handle, self.volspec, groupname) as group:
group.charmap_rm()
ret = 0, json.dumps({}), ""
except VolumeException as ve:
ret = self.volume_exception_to_retval(ve)
return ret
def subvolume_group_charmap_get(self, **kwargs):
ret = 0, "", ""
volname = kwargs['vol_name']
groupname = kwargs['group_name']
setting = kwargs['setting']
try:
with open_volume(self, volname) as fs_handle:
with open_group(fs_handle, self.volspec, groupname) as group:
v = group.charmap_get(setting)
ret = 0, v, ""
except VolumeException as ve:
ret = self.volume_exception_to_retval(ve)
return ret
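Each of the `VolumeClient` methods above returns a 3-tuple and turns a raised exception into that same shape. A minimal sketch of the convention, with hypothetical names throughout: `VolumeError` stands in for `VolumeException`, and the `(status, stdout, stderr)` layout of the failure tuple is an assumption about `volume_exception_to_retval`, which is not shown in this diff.

```python
import errno


class VolumeError(Exception):
    """Hypothetical stand-in for VolumeException: negative errno plus message."""

    def __init__(self, error_code: int, error_message: str) -> None:
        super().__init__(error_code, error_message)
        self.errno = error_code
        self.error_str = error_message


def to_retval(exc: VolumeError):
    # Assumed layout: (status, stdout, stderr).
    return exc.errno, "", exc.error_str


def run(operation):
    """Run an operation and always return a (status, stdout, stderr) tuple."""
    ret = 0, "", ""
    try:
        ret = 0, operation(), ""
    except VolumeError as ve:
        ret = to_retval(ve)
    return ret


def failing():
    raise VolumeError(-errno.EINVAL, "charmap setting invalid")


ok = run(lambda: '{"casesensitive": false}')
bad = run(failing)
```

Returning a tuple on every path means the command layer never has to catch exceptions from this level.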
def subvolume_group_exists(self, **kwargs):
volname = kwargs['vol_name']
ret = 0, "", ""


@ -331,6 +331,31 @@ class Module(orchestrator.OrchestratorClientMixin, MgrModule):
'desc': "Set MDS pinning policy for subvolumegroup",
'perm': 'rw'
},
{
'cmd': 'fs subvolumegroup charmap set'
' name=vol_name,type=CephString'
' name=group_name,type=CephString,req=true'
' name=setting,type=CephChoices,strings=casesensitive|normalization|encoding'
' name=value,type=CephString,req=true',
'desc': "Set charmap settings for subvolumegroup",
'perm': 'rw'
},
{
'cmd': 'fs subvolumegroup charmap rm'
' name=vol_name,type=CephString'
' name=group_name,type=CephString,req=true',
'desc': "Remove charmap settings for subvolumegroup",
'perm': 'rw'
},
{
'cmd': 'fs subvolumegroup charmap get'
' name=vol_name,type=CephString'
' name=group_name,type=CephString,req=true'
' name=setting,type=CephChoices,strings=casesensitive|normalization|encoding,req=false',
'desc': "Get charmap settings for subvolumegroup",
'perm': 'rw'
},
{
'cmd': 'fs subvolumegroup snapshot ls '
'name=vol_name,type=CephString '
@ -459,6 +484,33 @@ class Module(orchestrator.OrchestratorClientMixin, MgrModule):
'desc': "Set MDS pinning policy for subvolume",
'perm': 'rw'
},
{
'cmd': 'fs subvolume charmap set'
' name=vol_name,type=CephString'
' name=sub_name,type=CephString'
' name=setting,type=CephChoices,strings=casesensitive|normalization|encoding'
' name=value,type=CephString,req=true'
' name=group_name,type=CephString,req=false',
'desc': "Set charmap settings for subvolume",
'perm': 'rw'
},
{
'cmd': 'fs subvolume charmap rm'
' name=vol_name,type=CephString'
' name=sub_name,type=CephString'
' name=group_name,type=CephString,req=false',
'desc': "Remove charmap settings for subvolume",
'perm': 'rw'
},
{
'cmd': 'fs subvolume charmap get'
' name=vol_name,type=CephString'
' name=sub_name,type=CephString'
' name=setting,type=CephChoices,strings=casesensitive|normalization|encoding,req=false'
' name=group_name,type=CephString,req=false',
'desc': "Get charmap settings for subvolume",
'perm': 'rw'
},
{
'cmd': 'fs subvolume snapshot protect '
'name=vol_name,type=CephString '
@ -820,6 +872,23 @@ class Module(orchestrator.OrchestratorClientMixin, MgrModule):
group_name=cmd['group_name'], pin_type=cmd['pin_type'],
pin_setting=cmd['pin_setting'])
@mgr_cmd_wrap
def _cmd_fs_subvolumegroup_charmap_set(self, inbuf, cmd):
return self.vc.subvolume_group_charmap_set(vol_name=cmd['vol_name'],
group_name=cmd['group_name'],
setting=cmd['setting'],
value=cmd['value'])
@mgr_cmd_wrap
def _cmd_fs_subvolumegroup_charmap_rm(self, inbuf, cmd):
return self.vc.subvolume_group_charmap_rm(vol_name=cmd['vol_name'],
group_name=cmd['group_name'])
@mgr_cmd_wrap
def _cmd_fs_subvolumegroup_charmap_get(self, inbuf, cmd):
return self.vc.subvolume_group_charmap_get(vol_name=cmd['vol_name'],
group_name=cmd['group_name'],
setting=cmd.get('setting', 'charmap'))
@mgr_cmd_wrap
def _cmd_fs_subvolumegroup_snapshot_create(self, inbuf, cmd):
return self.vc.create_subvolume_group_snapshot(vol_name=cmd['vol_name'],
@ -912,6 +981,27 @@ class Module(orchestrator.OrchestratorClientMixin, MgrModule):
pin_setting=cmd['pin_setting'],
group_name=cmd.get('group_name', None))
@mgr_cmd_wrap
def _cmd_fs_subvolume_charmap_set(self, inbuf, cmd):
return self.vc.subvolume_charmap_set(vol_name=cmd['vol_name'],
sub_name=cmd['sub_name'],
setting=cmd['setting'],
value=cmd['value'],
group_name=cmd.get('group_name', None))
@mgr_cmd_wrap
def _cmd_fs_subvolume_charmap_rm(self, inbuf, cmd):
return self.vc.subvolume_charmap_rm(vol_name=cmd['vol_name'],
sub_name=cmd['sub_name'],
group_name=cmd.get('group_name', None))
@mgr_cmd_wrap
def _cmd_fs_subvolume_charmap_get(self, inbuf, cmd):
return self.vc.subvolume_charmap_get(vol_name=cmd['vol_name'],
sub_name=cmd['sub_name'],
setting=cmd.get('setting', 'charmap'),
group_name=cmd.get('group_name', None))
@mgr_cmd_wrap
def _cmd_fs_subvolume_snapshot_protect(self, inbuf, cmd):
return self.vc.protect_subvolume_snapshot(vol_name=cmd['vol_name'], sub_name=cmd['sub_name'],


@ -34,6 +34,8 @@ namespace ca = ceph::async;
class ClientScaffold : public Client {
public:
using Client::walk_dentry_result;
ClientScaffold(Messenger *m, MonClient *mc, Objecter *objecter_) : Client(m, mc, objecter_) {}
virtual ~ClientScaffold()
{ }
@ -87,6 +89,31 @@ public:
wait_on_list(waiting_for_reclaim);
return session->reclaim_state == MetaSession::RECLAIM_FAIL ? true : false;
}
/* Expose alternate_name for testing. There is no need to use virtual
* methods as we will call these only from the Derived class.
*/
int symlinkat(const char *target, int dirfd, const char *linkpath, const UserPerm& perms, std::string alternate_name) {
return do_symlinkat(target, dirfd, linkpath, perms, std::move(alternate_name));
}
int mkdirat(int dirfd, const char *path, mode_t mode, const UserPerm& perm, std::string alternate_name) {
return do_mkdirat(dirfd, path, mode, perm, std::move(alternate_name));
}
int rename(const char *from, const char *to, const UserPerm& perm, std::string alternate_name) {
return do_rename(from, to, perm, std::move(alternate_name));
}
int link(const char *oldpath, const char *newpath, const UserPerm& perm, std::string alternate_name) {
return do_link(oldpath, newpath, perm, std::move(alternate_name));
}
int openat(int dirfd, const char *path, int flags, const UserPerm& perms,
mode_t mode, int stripe_unit, int stripe_count,
int object_size, const char *data_pool, std::string alternate_name) {
return do_openat(dirfd, path, flags, perms, mode, stripe_unit, stripe_count, object_size, data_pool, std::move(alternate_name));
}
int walk(std::string_view path, struct walk_dentry_result* result, const UserPerm& perms, bool followsym=true) {
return Client::walk(path, result, perms, followsym);
}
};
class TestClient : public ::testing::Test {
@ -141,6 +168,7 @@ public:
delete messenger;
messenger = nullptr;
}
// TODO expose altname versions
protected:
static inline ceph::async::io_context_pool icp;
static inline UserPerm myperm{0,0};


@ -24,7 +24,7 @@
TEST_F(TestClient, AlternateNameRemount) {
auto altname = std::string("foo");
auto dir = fmt::format("{}_{}", ::testing::UnitTest::GetInstance()->current_test_info()->name(), getpid());
ASSERT_EQ(0, client->mkdir(dir.c_str(), 0777, myperm, altname));
ASSERT_EQ(0, client->mkdirat(CEPHFS_AT_FDCWD, dir.c_str(), 0777, myperm, altname));
client->unmount();
TearDown();
@ -32,7 +32,7 @@ TEST_F(TestClient, AlternateNameRemount) {
client->mount("/", myperm, true);
{
Client::walk_dentry_result wdr;
ClientScaffold::walk_dentry_result wdr;
ASSERT_EQ(0, client->walk(dir.c_str(), &wdr, myperm));
ASSERT_EQ(wdr.alternate_name, altname);
}
@ -43,10 +43,10 @@ TEST_F(TestClient, AlternateNameRemount) {
TEST_F(TestClient, AlternateNameMkdir) {
auto dir = fmt::format("{}_{}", ::testing::UnitTest::GetInstance()->current_test_info()->name(), getpid());
ASSERT_EQ(0, client->mkdir(dir.c_str(), 0777, myperm, "foo"));
ASSERT_EQ(0, client->mkdirat(CEPHFS_AT_FDCWD, dir.c_str(), 0777, myperm, "foo"));
{
Client::walk_dentry_result wdr;
ClientScaffold::walk_dentry_result wdr;
ASSERT_EQ(0, client->walk(dir.c_str(), &wdr, myperm));
ASSERT_EQ(wdr.alternate_name, "foo");
}
@ -57,10 +57,10 @@ TEST_F(TestClient, AlternateNameMkdir) {
TEST_F(TestClient, AlternateNameLong) {
auto altname = std::string(4096+1024, '-');
auto dir = fmt::format("{}_{}", ::testing::UnitTest::GetInstance()->current_test_info()->name(), getpid());
ASSERT_EQ(0, client->mkdir(dir.c_str(), 0777, myperm, altname));
ASSERT_EQ(0, client->mkdirat(CEPHFS_AT_FDCWD, dir.c_str(), 0777, myperm, altname));
{
Client::walk_dentry_result wdr;
ClientScaffold::walk_dentry_result wdr;
ASSERT_EQ(0, client->walk(dir.c_str(), &wdr, myperm));
ASSERT_EQ(wdr.alternate_name, altname);
}
@ -71,13 +71,13 @@ TEST_F(TestClient, AlternateNameLong) {
TEST_F(TestClient, AlternateNameCreat) {
auto altname = std::string("foo");
auto file = fmt::format("{}_{}", ::testing::UnitTest::GetInstance()->current_test_info()->name(), getpid());
int fd = client->open(file.c_str(), O_CREAT|O_WRONLY, myperm, 0777, altname);
int fd = client->openat(CEPHFS_AT_FDCWD, file.c_str(), O_CREAT|O_WRONLY, myperm, 0777, 0, 0, 0, nullptr, altname);
ASSERT_LE(0, fd);
ASSERT_EQ(3, client->write(fd, "baz", 3, 0));
ASSERT_EQ(0, client->close(fd));
{
Client::walk_dentry_result wdr;
ClientScaffold::walk_dentry_result wdr;
ASSERT_EQ(0, client->walk(file, &wdr, myperm));
ASSERT_EQ(wdr.alternate_name, altname);
}
@ -86,17 +86,17 @@ TEST_F(TestClient, AlternateNameCreat) {
TEST_F(TestClient, AlternateNameSymlink) {
auto altname = std::string("foo");
auto file = fmt::format("{}_{}", ::testing::UnitTest::GetInstance()->current_test_info()->name(), getpid());
int fd = client->open(file.c_str(), O_CREAT|O_WRONLY, myperm, 0777, altname);
int fd = client->openat(CEPHFS_AT_FDCWD, file.c_str(), O_CREAT|O_WRONLY, myperm, 0777, 0, 0, 0, nullptr, altname);
ASSERT_LE(0, fd);
ASSERT_EQ(3, client->write(fd, "baz", 3, 0));
ASSERT_EQ(0, client->close(fd));
auto file2 = file+"2";
auto altname2 = altname+"2";
ASSERT_EQ(0, client->symlink(file.c_str(), file2.c_str(), myperm, altname2));
ASSERT_EQ(0, client->symlinkat(file.c_str(), CEPHFS_AT_FDCWD, file2.c_str(), myperm, altname2));
{
Client::walk_dentry_result wdr;
ClientScaffold::walk_dentry_result wdr;
ASSERT_EQ(0, client->walk(file2, &wdr, myperm, false));
ASSERT_EQ(wdr.alternate_name, altname2);
ASSERT_EQ(0, client->walk(file, &wdr, myperm));
@ -107,7 +107,7 @@ TEST_F(TestClient, AlternateNameSymlink) {
TEST_F(TestClient, AlternateNameRename) {
auto altname = std::string("foo");
auto file = fmt::format("{}_{}", ::testing::UnitTest::GetInstance()->current_test_info()->name(), getpid());
int fd = client->open(file.c_str(), O_CREAT|O_WRONLY, myperm, 0777, altname);
int fd = client->openat(CEPHFS_AT_FDCWD, file.c_str(), O_CREAT|O_WRONLY, myperm, 0777, 0, 0, 0, nullptr, altname);
ASSERT_LE(0, fd);
ASSERT_EQ(3, client->write(fd, "baz", 3, 0));
ASSERT_EQ(0, client->close(fd));
@ -118,7 +118,7 @@ TEST_F(TestClient, AlternateNameRename) {
ASSERT_EQ(0, client->rename(file.c_str(), file2.c_str(), myperm, altname2));
{
Client::walk_dentry_result wdr;
ClientScaffold::walk_dentry_result wdr;
ASSERT_EQ(0, client->walk(file2, &wdr, myperm));
ASSERT_EQ(wdr.alternate_name, altname2);
}
@ -127,7 +127,7 @@ TEST_F(TestClient, AlternateNameRename) {
TEST_F(TestClient, AlternateNameRenameExistMatch) {
auto altname = std::string("foo");
auto file = fmt::format("{}_{}", ::testing::UnitTest::GetInstance()->current_test_info()->name(), getpid());
int fd = client->open(file.c_str(), O_CREAT|O_WRONLY, myperm, 0777, altname);
int fd = client->openat(CEPHFS_AT_FDCWD, file.c_str(), O_CREAT|O_WRONLY, myperm, 0777, 0, 0, 0, nullptr, altname);
ASSERT_LE(0, fd);
ASSERT_EQ(3, client->write(fd, "baz", 3, 0));
ASSERT_EQ(0, client->close(fd));
@ -135,7 +135,7 @@ TEST_F(TestClient, AlternateNameRenameExistMatch) {
auto file2 = file+"2";
auto altname2 = altname+"2";
fd = client->open(file2.c_str(), O_CREAT|O_WRONLY, myperm, 0777, altname2);
fd = client->openat(CEPHFS_AT_FDCWD, file2.c_str(), O_CREAT|O_WRONLY, myperm, 0777, 0, 0, 0, nullptr, altname2);
ASSERT_LE(0, fd);
ASSERT_EQ(3, client->write(fd, "baz", 3, 0));
ASSERT_EQ(0, client->close(fd));
@ -143,7 +143,7 @@ TEST_F(TestClient, AlternateNameRenameExistMatch) {
ASSERT_EQ(0, client->rename(file.c_str(), file2.c_str(), myperm, altname2));
{
Client::walk_dentry_result wdr;
ClientScaffold::walk_dentry_result wdr;
ASSERT_EQ(0, client->walk(file2, &wdr, myperm));
ASSERT_EQ(wdr.alternate_name, altname2);
}
@ -152,7 +152,7 @@ TEST_F(TestClient, AlternateNameRenameExistMatch) {
TEST_F(TestClient, AlternateNameRenameExistMisMatch) {
auto altname = std::string("foo");
auto file = fmt::format("{}_{}", ::testing::UnitTest::GetInstance()->current_test_info()->name(), getpid());
int fd = client->open(file.c_str(), O_CREAT|O_WRONLY, myperm, 0777, altname);
int fd = client->openat(CEPHFS_AT_FDCWD, file.c_str(), O_CREAT|O_WRONLY, myperm, 0777, 0, 0, 0, nullptr, altname);
ASSERT_LE(0, fd);
ASSERT_EQ(3, client->write(fd, "baz", 3, 0));
ASSERT_EQ(0, client->close(fd));
@ -160,7 +160,7 @@ TEST_F(TestClient, AlternateNameRenameExistMisMatch) {
auto file2 = file+"2";
auto altname2 = altname+"2";
fd = client->open(file2.c_str(), O_CREAT|O_WRONLY, myperm, 0777, altname+"mismatch");
fd = client->openat(CEPHFS_AT_FDCWD, file2.c_str(), O_CREAT|O_WRONLY, myperm, 0777, 0, 0, 0, nullptr, altname+"mismatch");
ASSERT_LE(0, fd);
ASSERT_EQ(3, client->write(fd, "baz", 3, 0));
ASSERT_EQ(0, client->close(fd));
@ -168,7 +168,7 @@ TEST_F(TestClient, AlternateNameRenameExistMisMatch) {
ASSERT_EQ(-EINVAL, client->rename(file.c_str(), file2.c_str(), myperm, altname2));
{
Client::walk_dentry_result wdr;
ClientScaffold::walk_dentry_result wdr;
ASSERT_EQ(0, client->walk(file2, &wdr, myperm));
ASSERT_EQ(wdr.alternate_name, altname+"mismatch");
}
@ -177,7 +177,7 @@ TEST_F(TestClient, AlternateNameRenameExistMisMatch) {
TEST_F(TestClient, AlternateNameLink) {
auto altname = std::string("foo");
auto file = fmt::format("{}_{}", ::testing::UnitTest::GetInstance()->current_test_info()->name(), getpid());
int fd = client->open(file.c_str(), O_CREAT|O_WRONLY, myperm, 0777, altname);
int fd = client->openat(CEPHFS_AT_FDCWD, file.c_str(), O_CREAT|O_WRONLY, myperm, 0777, 0, 0, 0, nullptr, altname);
ASSERT_LE(0, fd);
ASSERT_EQ(3, client->write(fd, "baz", 3, 0));
ASSERT_EQ(0, client->close(fd));
@ -188,7 +188,7 @@ TEST_F(TestClient, AlternateNameLink) {
ASSERT_EQ(0, client->link(file.c_str(), file2.c_str(), myperm, altname2));
{
Client::walk_dentry_result wdr;
ClientScaffold::walk_dentry_result wdr;
ASSERT_EQ(0, client->walk(file2, &wdr, myperm));
ASSERT_EQ(wdr.alternate_name, altname2);
ASSERT_EQ(0, client->walk(file, &wdr, myperm));


@ -41,10 +41,23 @@
#include <map>
#include <vector>
#include <thread>
#include <random>
#include <regex>
using namespace std;
static std::string generate_random_string(int length = 20) {
std::random_device rd;
std::mt19937 gen(rd());
std::uniform_int_distribution<> distrib('a', 'z');
std::string str;
for (int i = 0; i < length; ++i) {
str += static_cast<char>(distrib(gen));
}
return str;
}
TEST(LibCephFS, OpenEmptyComponent) {
pid_t mypid = getpid();
@ -2047,30 +2060,34 @@ TEST(LibCephFS, SetSize) {
TEST(LibCephFS, OperationsOnRoot)
{
UserPerm *rootcred = ceph_userperm_new(0, 0, 0, NULL);
ASSERT_TRUE(rootcred);
struct ceph_mount_info *cmount;
ASSERT_EQ(ceph_create(&cmount, NULL), 0);
ASSERT_EQ(ceph_conf_read_file(cmount, NULL), 0);
ASSERT_EQ(0, ceph_create(&cmount, NULL));
ASSERT_EQ(0, ceph_conf_read_file(cmount, NULL));
ASSERT_EQ(0, ceph_conf_parse_env(cmount, NULL));
ASSERT_EQ(ceph_mount(cmount, "/"), 0);
ASSERT_EQ(0, ceph_init(cmount));
ASSERT_EQ(0, ceph_mount_perms_set(cmount, rootcred));
ASSERT_EQ(0, ceph_mount(cmount, "/"));
char dirname[32];
sprintf(dirname, "/somedir%x", getpid());
ASSERT_EQ(ceph_mkdir(cmount, dirname, 0755), 0);
ASSERT_EQ(ceph_rmdir(cmount, "/"), -EBUSY);
ASSERT_EQ(ceph_rmdir(cmount, "/"), -EINVAL);
ASSERT_EQ(ceph_link(cmount, "/", "/"), -EEXIST);
ASSERT_EQ(ceph_link(cmount, dirname, "/"), -EEXIST);
ASSERT_EQ(ceph_link(cmount, "nonExisitingDir", "/"), -ENOENT);
ASSERT_EQ(ceph_unlink(cmount, "/"), -EISDIR);
ASSERT_EQ(ceph_unlink(cmount, "/"), -EINVAL);
ASSERT_EQ(ceph_rename(cmount, "/", "/"), -EBUSY);
ASSERT_EQ(ceph_rename(cmount, dirname, "/"), -EBUSY);
ASSERT_EQ(ceph_rename(cmount, "nonExistingDir", "/"), -EBUSY);
ASSERT_EQ(ceph_rename(cmount, "/", dirname), -EBUSY);
ASSERT_EQ(ceph_rename(cmount, "/", "nonExistingDir"), -EBUSY);
ASSERT_EQ(ceph_rename(cmount, "/", "/"), -EINVAL);
ASSERT_EQ(ceph_rename(cmount, dirname, "/"), -EINVAL);
ASSERT_EQ(ceph_rename(cmount, "nonExistingDir", "/"), -ENOENT);
ASSERT_EQ(ceph_rename(cmount, "/", dirname), -EINVAL);
ASSERT_EQ(ceph_rename(cmount, "/", "nonExistingDir"), -EINVAL);
ASSERT_EQ(ceph_mkdir(cmount, "/", 0777), -EEXIST);
@ -2081,6 +2098,7 @@ TEST(LibCephFS, OperationsOnRoot)
ASSERT_EQ(ceph_symlink(cmount, "nonExistingDir", "/"), -EEXIST);
ceph_shutdown(cmount);
ceph_userperm_destroy(rootcred);
}
// no rlimits on Windows
@ -2132,6 +2150,109 @@ TEST(LibCephFS, ShutdownRace)
}
#endif
TEST(LibCephFS, CreateParallel)
{
char fname[4096] = "dir/file_parallel";
std::mutex lock; // for memory barrier
std::array<int, 256> fds;
std::array<std::thread, fds.size()> threads;
std::fill(fds.begin(), fds.end(), -1);
struct ceph_mount_info *cmount;
ASSERT_EQ(0, ceph_create(&cmount, NULL));
ASSERT_EQ(0, ceph_conf_read_file(cmount, NULL));
ASSERT_EQ(0, ceph_conf_parse_env(cmount, NULL));
ASSERT_EQ(0, ceph_mount(cmount, "/"));
ASSERT_EQ(0, ceph_mkdirs(cmount, "dir/dummy", 0777));
ASSERT_EQ(0, ceph_unmount(cmount));
ASSERT_EQ(0, ceph_mount(cmount, "/"));
{
struct stat buf;
ASSERT_EQ(0, ceph_stat(cmount, "dir", &buf));
}
for (size_t i = 0; i < fds.size(); ++i) {
auto l = [cmount,&fname,&fds,&lock](int i) {
int fd = ceph_open(cmount, fname, O_CREAT|O_WRONLY, 0777);
std::lock_guard locker(lock);
fds[i] = fd;
};
threads[i] = std::thread(std::move(l), i);
}
for (size_t i = 0; i < fds.size(); ++i) {
threads[i].join();
}
std::lock_guard locker(lock);
for (size_t i = 0; i < fds.size(); ++i) {
ASSERT_GE(fds[i], 0);
}
ASSERT_EQ(0, ceph_unlink(cmount, fname));
ASSERT_EQ(0, ceph_rmdir(cmount, "dir/dummy"));
ASSERT_EQ(0, ceph_rmdir(cmount, "dir"));
ceph_shutdown(cmount);
}
TEST(LibCephFS, CreateExclParallel)
{
char fname[4096] = "dir/file_parallel";
std::mutex lock; // for memory barrier
std::array<int, 256> fds;
std::array<std::thread, fds.size()> threads;
std::fill(fds.begin(), fds.end(), -1);
struct ceph_mount_info *cmount;
ASSERT_EQ(0, ceph_create(&cmount, NULL));
ASSERT_EQ(0, ceph_conf_read_file(cmount, NULL));
ASSERT_EQ(0, ceph_conf_parse_env(cmount, NULL));
ASSERT_EQ(0, ceph_mount(cmount, "/"));
ASSERT_EQ(0, ceph_mkdirs(cmount, "dir/dummy", 0777));
ASSERT_EQ(0, ceph_unmount(cmount));
ASSERT_EQ(0, ceph_mount(cmount, "/"));
{
struct stat buf;
ASSERT_EQ(0, ceph_stat(cmount, "dir", &buf));
}
for (size_t i = 0; i < fds.size(); ++i) {
auto l = [cmount,&fname,&fds,&lock](int i) {
int fd = ceph_open(cmount, fname, O_CREAT|O_EXCL|O_WRONLY, 0777);
std::lock_guard locker(lock);
fds[i] = fd;
};
threads[i] = std::thread(std::move(l), i);
}
for (size_t i = 0; i < fds.size(); ++i) {
threads[i].join();
}
std::lock_guard locker(lock);
int found = -1;
for (size_t i = 0; i < fds.size(); ++i) {
if (fds[i] >= 0) {
ASSERT_EQ(found, -1);
found = i;
} else {
ASSERT_EQ(fds[i], -EEXIST);
}
}
ASSERT_EQ(0, ceph_unlink(cmount, fname));
ASSERT_EQ(0, ceph_rmdir(cmount, "dir/dummy"));
ASSERT_EQ(0, ceph_rmdir(cmount, "dir"));
ceph_shutdown(cmount);
}
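The invariant checked by `CreateExclParallel` (exactly one exclusive create wins, every other attempt sees `EEXIST`) can be illustrated without a cluster. The sketch below is an analogue, not the test itself: it is written in Python, runs against a local temporary directory rather than CephFS, and `exclusive_create_race` is a hypothetical helper name.

```python
import os
import tempfile
import threading


def exclusive_create_race(path: str, workers: int = 16):
    """Have several threads race to create `path` with O_CREAT|O_EXCL."""
    results = [None] * workers

    def attempt(i: int) -> None:
        try:
            results[i] = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
        except FileExistsError:
            # Marks an attempt that lost the race (EEXIST).
            results[i] = -1

    threads = [threading.Thread(target=attempt, args=(i,)) for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "file_parallel")
    results = exclusive_create_race(target)
    winners = [fd for fd in results if fd >= 0]
    for fd in winners:
        os.close(fd)
```

Each thread writes only its own slot in `results`, and the slots are read after every thread has been joined, so the sketch needs no extra locking.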
static void get_current_time_utimbuf(struct utimbuf *utb)
{
utime_t t = ceph_clock_now();
@ -3638,6 +3759,94 @@ TEST(LibCephFS, SnapdirAttrs) {
ceph_shutdown(cmount);
}
TEST(LibCephFS, SnapDirLookup) {
struct ceph_mount_info *cmount;
ASSERT_EQ(ceph_create(&cmount, NULL), 0);
ASSERT_EQ(ceph_conf_read_file(cmount, NULL), 0);
ASSERT_EQ(0, ceph_conf_parse_env(cmount, NULL));
ASSERT_EQ(0, ceph_mount(cmount, NULL));
auto top = generate_random_string();
ASSERT_EQ(ceph_mkdir(cmount, top.c_str(), 0777), 0);
ASSERT_EQ(0, ceph_chdir(cmount, top.c_str()));
ASSERT_EQ(ceph_mkdir(cmount, "foo", 0777), 0);
ASSERT_EQ(ceph_mkdir(cmount, "foo/bar", 0777), 0);
ASSERT_EQ(ceph_mkdir(cmount, "foo/.snap/snap", 0777), 0);
ASSERT_EQ(0, ceph_unmount(cmount));
ASSERT_EQ(0, ceph_mount(cmount, NULL));
ASSERT_EQ(0, ceph_chdir(cmount, top.c_str()));
ASSERT_EQ(0, ceph_chdir(cmount, "foo/"));
{
struct ceph_dir_result *cdr = NULL;
struct dirent* de = nullptr;
ASSERT_EQ(0, ceph_opendir(cmount, ".", &cdr));
while((de = ceph_readdir(cmount, cdr))) {
continue;
}
ASSERT_EQ(0, ceph_closedir(cmount, cdr));
}
{
struct stat buf;
ASSERT_EQ(0, ceph_stat(cmount, "bar", &buf));
}
ASSERT_EQ(0, ceph_unmount(cmount));
ASSERT_EQ(0, ceph_mount(cmount, NULL));
ASSERT_EQ(0, ceph_chdir(cmount, top.c_str()));
ASSERT_EQ(0, ceph_chdir(cmount, "foo/"));
{
struct ceph_dir_result *cdr = NULL;
struct dirent* de = nullptr;
ASSERT_EQ(0, ceph_opendir(cmount, ".snap/", &cdr));
while((de = ceph_readdir(cmount, cdr))) {
continue;
}
ASSERT_EQ(0, ceph_closedir(cmount, cdr));
}
{
struct stat buf;
ASSERT_EQ(0, ceph_stat(cmount, ".snap/snap/bar", &buf));
}
ASSERT_EQ(0, ceph_unmount(cmount));
ASSERT_EQ(0, ceph_mount(cmount, NULL));
ASSERT_EQ(0, ceph_chdir(cmount, top.c_str()));
ASSERT_EQ(0, ceph_chdir(cmount, "foo/"));
{
struct ceph_dir_result *cdr = NULL;
struct dirent* de = nullptr;
ASSERT_EQ(0, ceph_opendir(cmount, ".snap/snap/", &cdr));
while((de = ceph_readdir(cmount, cdr))) {
continue;
}
ASSERT_EQ(0, ceph_closedir(cmount, cdr));
}
{
struct stat buf;
ASSERT_EQ(0, ceph_stat(cmount, ".snap/snap/bar", &buf));
}
ASSERT_EQ(0, ceph_unmount(cmount));
ASSERT_EQ(0, ceph_mount(cmount, NULL));
ASSERT_EQ(0, ceph_chdir(cmount, top.c_str()));
ASSERT_EQ(ceph_rmdir(cmount, "foo/.snap/snap"), 0);
ASSERT_EQ(ceph_rmdir(cmount, "foo/bar"), 0);
ASSERT_EQ(ceph_rmdir(cmount, "foo"), 0);
ASSERT_EQ(0, ceph_unmount(cmount));
ceph_shutdown(cmount);
}
TEST(LibCephFS, SnapdirAttrsOnSnapCreate) {
struct ceph_mount_info *cmount;
ASSERT_EQ(ceph_create(&cmount, NULL), 0);
@ -3793,3 +4002,47 @@ TEST(LibCephFS, SnapdirAttrsOnSnapRename) {
ASSERT_EQ(0, ceph_unmount(cmount));
ceph_shutdown(cmount);
}
TEST(LibCephFS, SubdirLookupAfterReaddir_ll) {
struct ceph_mount_info *cmount;
Inode *root, *subdir = NULL;
struct ceph_statx stx;
ASSERT_EQ(ceph_create(&cmount, NULL), 0);
ASSERT_EQ(ceph_conf_read_file(cmount, NULL), 0);
ASSERT_EQ(0, ceph_conf_parse_env(cmount, NULL));
ASSERT_EQ(0, ceph_mount(cmount, NULL));
ceph_rmdir(cmount, "foo/bar");
ceph_rmdir(cmount, "foo");
ASSERT_EQ(ceph_mkdir(cmount, "foo", 0777), 0);
ASSERT_EQ(ceph_mkdir(cmount, "foo/bar", 0777), 0);
ASSERT_EQ(0, ceph_unmount(cmount));
ASSERT_EQ(0, ceph_mount(cmount, NULL));
UserPerm *perms = ceph_mount_perms(cmount);
{
ASSERT_EQ(0, ceph_ll_walk(cmount, ".", &root, &stx, CEPH_STATX_INO, 0, perms));
ASSERT_EQ(0, ceph_ll_lookup(cmount, root, "foo/bar", &subdir, &stx, CEPH_STATX_INO, 0, perms));
}
{
struct ceph_dir_result *cdr = NULL;
struct dirent *de = NULL;
ASSERT_EQ(0, ceph_ll_opendir(cmount, root, &cdr, perms));
while((de = ceph_readdir(cmount, cdr))) {
continue;
}
ASSERT_EQ(0, ceph_ll_releasedir(cmount, cdr));
}
{
ASSERT_EQ(0, ceph_ll_lookup(cmount, root, "foo/bar", &subdir, &stx, CEPH_STATX_INO, 0, perms));
}
ASSERT_EQ(0, ceph_unmount(cmount));
ceph_shutdown(cmount);
}


@ -31,16 +31,7 @@ ClusterWatcher::ClusterWatcher(CephContext *cct, MonClient *monc, ServiceDaemon
ClusterWatcher::~ClusterWatcher() {
}
bool ClusterWatcher::ms_can_fast_dispatch2(const cref_t<Message> &m) const {
return m->get_type() == CEPH_MSG_FS_MAP;
}
void ClusterWatcher::ms_fast_dispatch2(const ref_t<Message> &m) {
bool handled = ms_dispatch2(m);
ceph_assert(handled);
}
bool ClusterWatcher::ms_dispatch2(const ref_t<Message> &m) {
Dispatcher::dispatch_result_t ClusterWatcher::ms_dispatch2(const ref_t<Message> &m) {
if (m->get_type() == CEPH_MSG_FS_MAP) {
if (m->get_connection()->get_peer_type() == CEPH_ENTITY_TYPE_MON) {
handle_fsmap(ref_cast<MFSMap>(m));


@ -38,12 +38,7 @@ public:
Listener &listener);
~ClusterWatcher();
bool ms_can_fast_dispatch_any() const override {
return true;
}
bool ms_can_fast_dispatch2(const cref_t<Message> &m) const override;
void ms_fast_dispatch2(const ref_t<Message> &m) override;
bool ms_dispatch2(const ref_t<Message> &m) override;
Dispatcher::dispatch_result_t ms_dispatch2(const ref_t<Message> &m) override;
void ms_handle_connect(Connection *c) override {
}


@ -114,8 +114,10 @@ wnbdSrcDir="${depsSrcDir}/wnbd"
wnbdLibDir="${depsToolsetDir}/wnbd/lib"
dokanSrcDir="${depsSrcDir}/dokany"
dokanLibDir="${depsToolsetDir}/dokany/lib"
libicuSrcDir="${depsSrcDir}/icu"
libicuLibDir="${depsToolsetDir}/libicu"
depsDirs="$lz4Dir;$sslDir;$boostDir;$zlibDir;$backtraceDir;$snappyDir"
depsDirs="$lz4Dir;$sslDir;$boostDir;$zlibDir;$backtraceDir;$snappyDir;$libicuLibDir"
depsDirs+=";$winLibDir"
# Cmake recommends using CMAKE_PREFIX_PATH instead of link_directories.


@ -41,6 +41,11 @@ dokanTag="v2.0.5.1000"
dokanSrcDir="${depsSrcDir}/dokany"
dokanLibDir="${depsToolsetDir}/dokany/lib"
libicuUrl="https://github.com/unicode-org/icu"
libicuTag="release-76-1"
libicuSrcDir="${depsSrcDir}/icu"
libicuLibDir="${depsToolsetDir}/libicu"
mingwLlvmUrl="https://github.com/mstorsjo/llvm-mingw/releases/download/20230320/llvm-mingw-20230320-ucrt-ubuntu-18.04-x86_64.tar.xz"
mingwLlvmSha256Sum="bc367753dea829d219be32e2e64e2d15d03158ce8e700ae5210ca3d78e6a07ea"
mingwLlvmDir="${DEPS_DIR}/mingw-llvm"
@ -357,5 +362,27 @@ $MINGW_DLLTOOL -d $dokanSrcDir/dokan/dokan.def \
# sys/public.h without the "sys" prefix.
cp $dokanSrcDir/sys/public.h $dokanSrcDir/dokan
echo "Building libicu."
cd $depsSrcDir
if [[ ! -d $libicuSrcDir ]]; then
git clone --branch $libicuTag --depth 1 $libicuUrl
cd $libicuSrcDir
fi
mkdir -p $libicuSrcDir/build-windows
mkdir -p $libicuSrcDir/build-linux
cd $libicuSrcDir/build-linux
../icu4c/source/configure
_make
cd $libicuSrcDir/build-windows
../icu4c/source/configure \
--enable-static \
--host=${MINGW_BASE} \
--with-cross-build=$PWD/../build-linux \
--prefix=$libicuLibDir
_make
_make install
echo "Finished building Ceph dependencies."
touch $depsToolsetDir/completed