Upstart script for mapping / unmapping rbd devices based on the /etc/ceph/rbdmap file.
It does not mount or unmount filesystems; that part should be handled by the _netdev option in fstab.
Signed-off-by: Laurent Barbe <laurent@ksperis.com>
Add a configuration variable to override compatible acting set handling.
Later we'll check in the osdmap that all OSDs are updated to use the new acting sets.
Fixes: #6990
Signed-off-by: David Zafman <david.zafman@inktank.com>
Primarily useful to run scripts from qa/workunits as part of make check.
vstart_wrapper.sh starts a vstart.sh cluster, runs the command given as
its argument and tears down the cluster when it completes.
The vstart_wrapped_tests.sh script contains the list of scripts that
need vstart_wrapper.sh to run. It would not be necessary if automake
allowed passing arguments to test scripts. It also adds markers to the
output to make it easier to search, because the output can be very verbose.
This wrapper is kept simple and will probably evolve into something more
sophisticated depending on the scripts being added to
vstart_wrapped_tests.sh. There are numerous options, ranging from
parsing the yaml from ceph-qa-suite to figure out the cluster
configuration, to converting the same yaml into a puppet manifest that is
applied locally, or even driving OpenStack instances to avoid messing
with the local machine. But this would probably be overkill at this
point.
Signed-off-by: Loic Dachary <loic@dachary.org>
A recent coverity run found two "defects" in rbd.cc:
** CID 1138367: Time of check time of use (TOCTOU)
/rbd.cc: 2024 in do_kernel_rm(const char *)()
2019 const char *fname = "/sys/bus/rbd/remove_single_major";
2020 if (stat(fname, &sbuf)) {
2021 fname = "/sys/bus/rbd/remove";
2022 }
2023
2024 int fd = open(fname, O_WRONLY);
2025 if (fd < 0) {
** CID 1138368: Time of check time of use (TOCTOU)
/rbd.cc: 1735 in do_kernel_add(const char *, const char *, const char *)()
same as above, s/remove/add
There is nothing racy going on here, and this is not an instance of
TOCTOU, but, instead of silencing coverity with annotations, redo
this with two open() calls.
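
A minimal sketch of the two-open() approach, for illustration only. The
helper name open_first_available() is hypothetical and this is not the
code that went into rbd.cc; it only shows how the stat() check can be
dropped by trying the preferred path first and falling back when it does
not exist.

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: try `preferred` first and fall back to `fallback`
// only if `preferred` does not exist. No stat() is needed, so there is no
// separate check step for a static analyzer to flag.
// Returns an open file descriptor, or -errno on failure.
int open_first_available(const char *preferred, const char *fallback)
{
  int fd = open(preferred, O_WRONLY);
  if (fd < 0 && errno == ENOENT)
    fd = open(fallback, O_WRONLY);
  return fd < 0 ? -errno : fd;
}
```

A caller would pass "/sys/bus/rbd/remove_single_major" as the preferred
path and "/sys/bus/rbd/remove" as the fallback, and close() the returned
descriptor when done.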
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
killall fails to kill all OSDs when called as a one-liner. Replace it with a
loop using pkill that retries until there are no more processes to kill by
the required name.
Signed-off-by: Loic Dachary <loic@dachary.org>
Instead of removing them only in the current directory. Leftovers
prevent running make check-coverage properly because lcov fails
when stumbling on old .gcno files with
lcov -d . -c -i -o check-coverage_base_full.lcov
Processing os/BtrfsFileStoreBackend.gcno
geninfo: ERROR: ceph/src/os/BtrfsFileStoreBackend.gcno: reached
unexpected end of file
Signed-off-by: Loic Dachary <loic@dachary.org>
We were leaking the static leader_supported_mon_commands. Move this into
the class so that we can clean up in the destructor.
Rename get_command_descriptions -> format_command_descriptions.
Fixes: #7009
Signed-off-by: Sage Weil <sage@inktank.com>
In FileJournal::_check_disk_write_cache(), use pclose() instead of
fclose() to close a stream created by popen().
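
For reference, a small sketch of the popen()/pclose() pairing. The helper
read_command_output() is a made-up name for illustration and is not part
of FileJournal.

```cpp
#include <cassert>
#include <stdio.h>
#include <string>

// Hypothetical helper: run `cmd` through the shell, return everything it
// wrote to stdout, and store the value returned by pclose() in *status
// (-1 if popen() itself failed).
std::string read_command_output(const char *cmd, int *status)
{
  std::string out;
  FILE *fp = popen(cmd, "r");
  if (!fp) {
    *status = -1;
    return out;
  }
  char buf[256];
  while (fgets(buf, sizeof(buf), fp))
    out += buf;
  // pclose() is the documented counterpart of popen(): it closes the
  // stream, waits for the child and returns its wait status.
  *status = pclose(fp);
  return out;
}
```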
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
For the purposes of FileJournal::_check_disk_write_cache(), use
get_linux_version(), which is based on uname(2), instead of parsing the
contents of /proc/version.
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
get_linux_version() returns the version of the currently running kernel,
encoded as an int, and is contained in common/linux_version.[ch].
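
The idea can be sketched roughly as follows. The function names below are
hypothetical, and the packing (major << 16, minor << 8, patch) is an
assumption modelled on the kernel's KERNEL_VERSION macro; the actual
helper in common/linux_version.[ch] may differ.

```cpp
#include <cassert>
#include <cstdio>
#include <sys/utsname.h>

// Pack a release string such as "3.12.5-generic" into a single int.
// Assumed encoding: (major << 16) + (minor << 8) + patch.
// Returns 0 if at least major and minor cannot be parsed.
int encode_linux_version(const char *release)
{
  int a = 0, b = 0, c = 0;
  if (sscanf(release, "%d.%d.%d", &a, &b, &c) < 2)
    return 0;
  return (a << 16) + (b << 8) + c;
}

// Version of the running kernel as reported by uname(2); 0 on error.
int running_linux_version()
{
  struct utsname ubuf;
  if (uname(&ubuf) != 0)
    return 0;
  return encode_linux_version(ubuf.release);
}
```

With such an encoding, version checks become plain integer comparisons
against encode_linux_version("2.6.33") or similar.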
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Break up AC_CHECK_HEADERS macro into one header-file per line so it's
easier to read and make changes.
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Don't add new caps to stale session when importing inodes. Don't
touch session when importing caps because it confuses the stale
session detection.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
It's wrong to erase open_ino_info_t after finishing contexts, because
MDCache::open_ino() can be called again when finishing contexts.
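
A simplified sketch of the ordering issue, using made-up types (Cache,
OpenInoInfo) rather than the real MDCache structures: the entry is
detached and erased before the waiters run, so a waiter that calls
open_ino() again gets a fresh entry that is not erased behind its back.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>
#include <vector>

using Callback = std::function<void(int)>;

// Stand-in for open_ino_info_t: just the list of waiting contexts.
struct OpenInoInfo {
  std::vector<Callback> waiters;
};

struct Cache {
  std::map<uint64_t, OpenInoInfo> opening;

  void open_ino(uint64_t ino, Callback cb) {
    opening[ino].waiters.push_back(std::move(cb));
  }

  void finish(uint64_t ino, int ret) {
    auto it = opening.find(ino);
    if (it == opening.end())
      return;
    // Move the waiters out and erase the entry first. Erasing after the
    // loop would also destroy any entry re-created by a callback.
    std::vector<Callback> waiters = std::move(it->second.waiters);
    opening.erase(it);
    for (auto &cb : waiters)
      cb(ret);
  }
};
```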
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Introduce a new flag in the cap import message. If the client finds the
flag set, it releases the exporter's caps (sends a release to the exporter).
This saves the cap export message and a "mds to mds" message.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
When importing a subtree, the importer sends cap import messages to clients
before the import subtree operation is considered successful. If the
exporter crashes before the EExport event is journalled, the importer needs to
re-export client caps. This confuses clients, and makes them lose track of
auth caps.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
For a rename operation that changes an inode's authority, if the master mds
of the operation crashes, the inode's original auth mds sends export
messages to clients when it receives the master mds' resolve ack
message. The client can't rely on the export message to add caps for
the master mds and then reconnect the caps when the master mds enters
the reconnect stage, because the client may receive the export message
after receiving the mdsmap that claims the master mds is in the
reconnect stage.
The fix is to include cap exports in the resolve message, so the master mds
can send import messages to clients when it enters the rejoin stage.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
When exporting inodes with client caps, the importer sends cap import
messages to clients and the exporter sends cap export messages to clients.
A client can receive these two messages in any order. If a client first
receives the cap import message, it adds the imported caps, but the caps
from the exporter are still considered valid. This can compromise
consistency. If the MDS crashes while importing caps, clients may only
receive cap export messages and never receive cap import messages.
These clients don't know which MDS is the cap importer, so they can't
send cap reconnect when the MDS recovers.
We can handle the above issues by including the counterpart's information in
cap import/export messages. If a client first receives the cap import
message, it adds the imported caps, then removes the exporter's
caps. If a client first receives the cap export message, it removes the
exported caps, then adds caps for the importer.
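
The client-side handling described above can be sketched as follows. This
is a deliberately simplified model with hypothetical names (ClientCaps,
handle_import, handle_export): caps are reduced to "which MDS ranks do we
hold caps from", and cap ids, sequence numbers and issued bits are ignored.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>

// Client-side view: for each inode, the set of MDS ranks we hold caps from.
struct ClientCaps {
  std::map<uint64_t, std::set<int>> caps;

  // Import message from `importer`, naming `exporter` as its counterpart:
  // add the imported caps, then drop the exporter's caps.
  void handle_import(uint64_t ino, int importer, int exporter) {
    caps[ino].insert(importer);
    caps[ino].erase(exporter);
  }

  // Export message from `exporter`, naming `importer` as its counterpart:
  // drop the exported caps, then add caps for the importer.
  void handle_export(uint64_t ino, int exporter, int importer) {
    caps[ino].erase(exporter);
    caps[ino].insert(importer);
  }
};
```

Because each message names its counterpart, both delivery orders end in
the same state, and a client that only ever sees the export message still
learns which MDS to reconnect to.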
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Use the MMDSSlaveRequest::OP_FINISH slave request to send information
about rename imported caps back to the exporter. This is preparation
for including the counterpart's information in cap import/export messages.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Use the cache rejoin ack message to send information about rejoin imported
caps back to the exporter. Also move the code that exports reconnect
caps to MDCache::handle_cache_rejoin_ack().
This is preparation for including the counterpart's information in cap
import/export messages.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Introduce a new class Capability::Import and use it to send information
of imported caps back to the exporter. This is preparation for including
counterpart's information in cap import/export message.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Following sequence of events can happen when exporting inodes:
- client sends open file request to mds.0
- mds.0 handles the request and sends inode stat back to the client
- mds.0 exports the inode to mds.1
- mds.1 sends cap import message to the client
- mds.0 sends cap export message to the client
- client receives the cap import message from mds.1, but the client
still doesn't have corresponding inode in the cache. So the client
releases the imported caps.
- client receives the open file reply from mds.0
- client receives the cap export message from mds.0.
After the end of these events, the client doesn't have any cap for
the opened file.
To fix the message ordering issue, this patch introduces a new session
operation FLUSHMSG. Before exporting caps, we send a FLUSHMSG session
message to the client and wait for the acknowledgment. When receiving the
FLUSHMSG_ACK message from the client, we are sure that the client has received
all messages sent previously.
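
A rough sketch of the exporter-side bookkeeping this implies. FlushGate
and its methods are hypothetical names for illustration; the real code
works on MDS sessions and contexts rather than std::function.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>

// Defers an export until the client acknowledges the flush request that
// was sent ahead of it.
class FlushGate {
  uint64_t last_tid = 0;
  std::map<uint64_t, std::function<void()>> pending;  // flush id -> deferred export

public:
  // Called when a FLUSHMSG-style request is sent; returns its id.
  uint64_t send_flush(std::function<void()> do_export) {
    pending[++last_tid] = std::move(do_export);
    return last_tid;
  }

  // Called when the matching ack arrives. Since the session delivers
  // messages in order, everything sent before the flush has reached the
  // client by now, so the export can proceed.
  void handle_flush_ack(uint64_t tid) {
    auto it = pending.find(tid);
    if (it == pending.end())
      return;  // unknown or duplicate ack
    std::function<void()> do_export = std::move(it->second);
    pending.erase(it);
    do_export();
  }
};
```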
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
For case:
- client voluntarily releases some caps through cap update message
- mds shares the new max by sending cap grant message
- mds receives the cap update message
If the mds doesn't increase the cap sequence when sharing the max size,
it can't determine whether the cap update message was sent before or after
the client received the cap grant message that updates the max size.
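
To illustrate why the sequence bump matters, a minimal sketch with a
made-up Cap struct. It assumes the client echoes the last sequence number
it has seen in its cap update message.

```cpp
#include <cassert>
#include <cstdint>

// MDS-side state for one capability (illustrative only).
struct Cap {
  uint32_t last_sent_seq = 0;  // seq of the most recent message sent to the client
  uint64_t max_size = 0;

  // Share a new max size. Bumping the seq makes this grant
  // distinguishable from whatever was sent before it.
  uint32_t send_grant(uint64_t new_max) {
    max_size = new_max;
    return ++last_sent_seq;
  }

  // `client_seq` is the seq echoed by the client in its cap update.
  // True if the client had already seen our latest message when it
  // sent the update.
  bool update_is_current(uint32_t client_seq) const {
    return client_seq == last_sent_seq;
  }
};
```

Without the increment in send_grant(), an update sent before the grant
and one sent after it would carry the same seq and be indistinguishable.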
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Encode the inode version in the auth mds' lock messages, so that the version
of replica inodes gets updated. This is important because the client
uses the inode version in the mds reply to check whether the cached inode is
already up-to-date. It skips updating the inode if it thinks the
inode is already up-to-date.
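
The client-side check can be pictured like this. CachedInode and
update_from_reply() are hypothetical, and the exact comparison used by
the real client is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Client-side cached copy of an inode (illustrative only).
struct CachedInode {
  uint64_t version = 0;
  uint64_t size = 0;
};

// Apply the inode stat from an mds reply. Returns false, leaving the
// cache untouched, when the reply's version is not newer than the cached
// one. A replica mds that replies with a stale version therefore has its
// (possibly newer) fields ignored.
bool update_from_reply(CachedInode &in, uint64_t reply_version, uint64_t reply_size)
{
  if (reply_version <= in.version)
    return false;
  in.version = reply_version;
  in.size = reply_size;
  return true;
}
```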
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
If the MDS receives a client request but finds there is an existing
slave request, it's possible that another MDS forwarded the request
to us, but the MMDSSlaveRequest::OP_FINISH message arrives after
the client request.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>