30229 Commits

Author SHA1 Message Date
Loic Dachary
202d1f7b8d Merge pull request #964 from apeters1971/wip-arch-sse2
ARCH: add variable for sse2 register

Reviewed-by: Loic Dachary <loic@dachary.org>
2013-12-18 09:57:18 -08:00
Andreas Peters
9414970362 ARCH: adding SSE2 flag to arch-test 2013-12-18 18:05:17 +01:00
Sage Weil
fd5f40269e Merge pull request #965 from ksperis/rbdmap.upstart
upstart: add rbdmap script

Reviewed-by: Sage Weil <sage@inktank.com>
2013-12-18 08:57:29 -08:00
Loic Dachary
d76188d760 Merge pull request #952 from kri5/master
vstart: Update apache conf to run against apache 2.4

Reviewed-by: Loic Dachary <loic@dachary.org>
2013-12-18 07:00:42 -08:00
Christophe Courtaut
30f8aa1d7c vstart: Update apache conf to run against apache 2.4
Signed-off-by: Christophe Courtaut <christophe.courtaut@gmail.com>
2013-12-18 15:54:53 +01:00
Laurent Barbe
b86d450ad4 upstart: add rbdmap script
Upstart script for mapping / unmapping rbd device based on /etc/ceph/rbdmap file.
It does not mount or unmount the filesystem; that part should be handled by the _netdev option in fstab.

Signed-off-by: Laurent Barbe <laurent@ksperis.com>
2013-12-18 14:20:24 +01:00
Andreas Peters
e4537d31d9 ARCH: add variable for sse2 register 2013-12-18 11:16:52 +01:00
Sage Weil
0d217cf9e9 qa/workunits/cephtool/test.sh: clean up our client.xx.keyring
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-17 18:18:46 -08:00
David Zafman
434dce1ffe Merge pull request #960 from ceph/wip-6990
Add backward compatible acting set until all OSDs updated

Reviewed-by: Samuel Just <sam.just@inktank.com>
2013-12-17 11:46:48 -08:00
David Zafman
19cff890eb Add backward compatible acting set until all OSDs updated
Add configuration variable to override compatible acting set handling.
Later we'll check the osdmap that all OSDs are updated to use new acting sets.

Fixes: #6990

Signed-off-by: David Zafman <david.zafman@inktank.com>
2013-12-17 11:43:40 -08:00
Sage Weil
2e5a461e4a Merge pull request #953 from dachary/wip-qa-suite
use qa/workunits/cephtool/test.sh as a unittest

Reviewed-by: Sage Weil <sage@inktank.com>
2013-12-17 10:46:49 -08:00
Alexandre Oliva
f5d32a33d2 mds: drop unused find_ino_dir
Remove all traces of find_ino_dir; it is no longer used.

Signed-off-by: Alexandre Oliva <oliva@gnu.org>
2013-12-17 09:03:37 -08:00
Alexandre Oliva
c60a3644eb Fix typo in #undef in ceph-dencoder
Signed-off-by: Alexandre Oliva <oliva@gnu.org>
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-17 09:02:55 -08:00
Loic Dachary
9e456555ff qa: add ../qa/workunits/cephtool/test.sh to unittests
Signed-off-by: Loic Dachary <loic@dachary.org>
2013-12-17 17:53:02 +01:00
Sage Weil
526e2528cb Merge pull request #957 from ceph/wip-rbd-coverity
rbd: make coverity happy

Reviewed-by: Sage Weil <sage@inktank.com>
2013-12-17 08:51:32 -08:00
Loic Dachary
c1eb55c6b0 qa: vstart wrapper helper for unittests
Primarily useful to run scripts from qa/workunits as part of make check.

vstart_wrapper.sh starts a vstart.sh cluster, runs the command given as
argument and tears down the cluster when it completes.

The vstart_wrapped_tests.sh script contains the list of scripts that
need the vstart_wrapper.sh to run. It would not be necessary if automake
allowed passing arguments to test scripts. It also adds markers to the
output to facilitate searching the output because it can be very verbose.

This wrapper is kept simple and will probably evolve into something more
sophisticated depending on the scripts being added to
vstart_wrapped_tests.sh. There are numerous options, ranging from
parsing the yaml from ceph-qa-suite to figure out the cluster
configuration to converting the same yaml into a puppet manifest that is
applied locally or even driving OpenStack instances to avoid messing
with the local machine. But this would probably be overkill at this
point.

Signed-off-by: Loic Dachary <loic@dachary.org>
2013-12-17 17:46:23 +01:00
Ilya Dryomov
0edbda204b rbd: make coverity happy
A recent coverity run found two "defects" in rbd.cc:

** CID 1138367:  Time of check time of use  (TOCTOU)
/rbd.cc: 2024 in do_kernel_rm(const char *)()

2019   const char *fname = "/sys/bus/rbd/remove_single_major";
2020   if (stat(fname, &sbuf)) {
2021     fname = "/sys/bus/rbd/remove";
2022   }
2023
2024   int fd = open(fname, O_WRONLY);
2025   if (fd < 0) {

** CID 1138368:  Time of check time of use  (TOCTOU)
/rbd.cc: 1735 in do_kernel_add(const char *, const char *, const char *)()

same as above, s/remove/add

There is nothing racy going on here, and this is not an instance of
TOCTOU, but, instead of silencing coverity with annotations, redo
this with two open() calls.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
2013-12-17 17:42:30 +02:00
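The "two open() calls" approach described in 0edbda204b above can be sketched as follows. This is a minimal illustration and not the code from rbd.cc: the helper name `open_with_fallback` and the choice to fall back only on ENOENT are assumptions made for the example.

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper (not from rbd.cc): try the preferred path first and
// fall back to the second path only if the first one does not exist.
// Returns an open file descriptor, or -errno on failure.
int open_with_fallback(const char *preferred, const char *fallback) {
  assert(preferred && fallback);
  int fd = open(preferred, O_WRONLY);
  if (fd < 0 && errno == ENOENT) {
    // No stat() beforehand: the result of open() itself tells us whether
    // the preferred file exists, so there is no separate check to flag.
    fd = open(fallback, O_WRONLY);
  }
  return fd < 0 ? -errno : fd;
}
```

With the paths quoted in the commit message, a caller would pass "/sys/bus/rbd/remove_single_major" as the preferred path and "/sys/bus/rbd/remove" as the fallback.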
Loic Dachary
d93881f3c8 vstart/stop: use pkill instead of killall
killall fails to kill all OSDs when called as a one-liner. Replace it with a
loop using pkill that retries until there are no more processes to kill by
the required name.

Signed-off-by: Loic Dachary <loic@dachary.org>
2013-12-17 13:48:18 +01:00
Loic Dachary
ae56cef396 qa: recursively remove .gcno and .gcda
Instead of removing them only in the current directory. Leftovers
prevent running make check-coverage properly because lcov fails
when stumbling on old .gcno files with

lcov -d . -c -i -o check-coverage_base_full.lcov
Processing os/BtrfsFileStoreBackend.gcno
geninfo: ERROR: ceph/src/os/BtrfsFileStoreBackend.gcno: reached
         unexpected end of file

Signed-off-by: Loic Dachary <loic@dachary.org>
2013-12-17 13:47:48 +01:00
Sage Weil
6f431200e3 ceph_test_rados_api_tier: fix HitSetTrim vs split, too
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-16 17:10:48 -08:00
Sage Weil
00f436c144 Merge pull request #904 from ceph/wip-mds-cluster2
Wip mds cluster2

Reviewed-by: Sage Weil <sage@inktank.com>
2013-12-16 17:03:27 -08:00
Sage Weil
c5bccfef88 ceph_test_rados_api_tier: fix HitSetRead test race with split
Recalculate the hash on each iteration in case we are racing with split.

Fixes: #7013
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-16 16:52:35 -08:00
Sage Weil
94da54ff95 Merge pull request #954 from ceph/wip-7009
mon: move supported_commands fields, methods into Monitor, and fix leak

Reviewed-by: Greg Farnum <greg@inktank.com>
2013-12-16 16:31:39 -08:00
Sage Weil
7e618c937b mon: move supported_commands fields, methods into Monitor, and fix leak
We were leaking the static leader_supported_mon_commands.  Move this into
the class so that we can clean up in the destructor.

Rename get_command_descriptions -> format_command_descriptions.

Fixes: #7009
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-16 16:09:44 -08:00
Sage Weil
1597d4e9f5 Merge pull request #951 from ceph/wip-linux-version
common: introduce get_linux_version()

Reviewed-by: Sage Weil <sage@inktank.com>
2013-12-16 09:27:43 -08:00
Ilya Dryomov
824b3d8e84 FileJournal: use pclose() to close a popen() stream
In FileJournal::_check_disk_write_cache(), use pclose() instead of
fclose() to close a stream, created by popen().

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
2013-12-16 18:57:22 +02:00
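The popen()/pclose() pairing from 824b3d8e84 above can be illustrated with a small sketch. The helper name `run_and_capture` is hypothetical and is not taken from FileJournal; the example only shows that a stream created by popen() is closed with pclose().

```cpp
#include <cassert>
#include <stdio.h>
#include <string>

// Hypothetical helper: run a shell command, collect its standard output,
// and return the status reported by pclose() (-1 if popen() fails).
int run_and_capture(const char *cmd, std::string *out) {
  assert(cmd && out);
  FILE *fp = popen(cmd, "r");
  if (!fp)
    return -1;
  char buf[256];
  while (fgets(buf, sizeof(buf), fp))
    out->append(buf);
  // A popen() stream must be closed with pclose(), which also waits for
  // the child process; fclose() is the wrong call for such a stream.
  return pclose(fp);
}
```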
Ilya Dryomov
6696ab6479 FileJournal: switch to get_linux_version()
For the purposes of FileJournal::_check_disk_write_cache(), use
get_linux_version(), which is based on uname(2), instead of parsing the
contents of /proc/version.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
2013-12-16 18:57:22 +02:00
Ilya Dryomov
fcf6e9878b common: introduce get_linux_version()
get_linux_version() returns a version of the currently running kernel,
encoded as an int, and is contained in common/linux_version.[ch].

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
2013-12-16 18:57:21 +02:00
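A version lookup based on uname(2), as described in fcf6e9878b above, might look roughly like the sketch below. The function names and the integer encoding (modelled on the kernel's KERNEL_VERSION macro) are assumptions for illustration; the commit message does not spell out the actual encoding used in common/linux_version.[ch].

```cpp
#include <cassert>
#include <cstdio>
#include <sys/utsname.h>

// Assumed encoding, modelled on the kernel's KERNEL_VERSION(a, b, c) macro.
constexpr int encode_version(int a, int b, int c) {
  return (a << 16) + (b << 8) + c;
}

// Parse a release string such as "3.11.0-14-generic".
// Returns 0 if the string does not start with at least "major.minor".
int parse_release(const char *release) {
  assert(release);
  int a = 0, b = 0, c = 0;
  int n = std::sscanf(release, "%d.%d.%d", &a, &b, &c);
  if (n < 2)
    return 0;
  return encode_version(a, b, n == 3 ? c : 0);
}

// Version of the running kernel as reported by uname(2), or 0 on error.
int running_kernel_version() {
  struct utsname u;
  if (uname(&u) != 0)
    return 0;
  return parse_release(u.release);
}
```

Encoding the version as a single int lets callers compare versions with ordinary integer comparisons, for example `running_kernel_version() >= encode_version(2, 6, 33)`.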
Ilya Dryomov
a2babe27e8 configure: break up AC_CHECK_HEADERS into one header-file per line
Break up AC_CHECK_HEADERS macro into one header-file per line so it's
easier to read and make changes.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
2013-12-16 18:57:21 +02:00
Yan, Zheng
4526d13a9d mds: fix stale session handling for multiple mds
Don't add new caps to stale session when importing inodes. Don't
touch session when importing caps because it confuses the stale
session detection.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2013-12-16 14:24:52 +08:00
Yan, Zheng
43f7268f5d mds: properly set dirty flag when journalling import
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2013-12-16 14:24:52 +08:00
Yan, Zheng
802df76f68 mds: properly update mdsdir's authority during recovery
dirfrag of mdsdir doesn't inherit its parent inode's authority.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2013-12-16 14:24:52 +08:00
Yan, Zheng
b6d1d8f186 mds: finish opening sessions even if import aborted
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2013-12-16 14:24:52 +08:00
Yan, Zheng
80005f1ece mds: fix discover path race
When C_MDC_RetryDiscoverPath is executed, we may have already become
the auth mds of base.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2013-12-16 14:24:50 +08:00
Sage Weil
58d68995c4 Merge pull request #947 from dachary/wip-6824
mon: set ceph osd (down|out|in|rm) error code on failure

Reviewed-by: Sage Weil <sage@inktank.com>
2013-12-15 21:16:48 -08:00
Yan, Zheng
5fdcc568c6 mds: fix bug in MDCache::open_ino_finish
It's wrong to erase open_ino_info_t after finishing contexts, because
MDCache::open_ino() can be called again when finishing contexts.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2013-12-16 12:15:25 +08:00
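The ordering problem described in 5fdcc568c6 above is a general re-entrancy hazard: if a finisher callback registers a new request for the same key, erasing the bookkeeping entry after running the callbacks would also erase the new request. A minimal sketch of the safe ordering, using hypothetical names (`OpenTable`, `pending`) that do not come from MDCache:

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a table of in-progress "open" requests.
struct OpenTable {
  using Callback = std::function<void()>;
  std::map<int, std::vector<Callback>> pending;

  void open(int ino, Callback cb) { pending[ino].push_back(std::move(cb)); }

  void finish(int ino) {
    auto it = pending.find(ino);
    if (it == pending.end())
      return;
    // Detach the waiters and erase the entry BEFORE running them, so a
    // callback that calls open(ino, ...) again creates a fresh entry
    // that is not wiped out afterwards.
    std::vector<Callback> waiters = std::move(it->second);
    pending.erase(it);
    for (auto &cb : waiters)
      cb();
  }
};

// Usage: a finisher that re-opens the same ino keeps its new request.
void demo_reentrant_finish() {
  OpenTable t;
  int calls = 0;
  t.open(7, [&] {
    ++calls;
    t.open(7, [&] { ++calls; });  // re-entrant open() from a finisher
  });
  t.finish(7);
  assert(calls == 1);
  assert(t.pending.count(7) == 1);  // the new request survived
}
```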
Yan, Zheng
71d1eb374a mds: add CEPH_FEATURE_EXPORT_PEER and bump the protocol version
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2013-12-16 12:15:25 +08:00
Yan, Zheng
d0b744a1d6 client: handle session flush message
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2013-12-16 12:15:25 +08:00
Yan, Zheng
05b192faab mds: simplify how to export non-auth caps
Introduce a new flag in cap import message. If client finds the flag
is set, it releases exporter's caps (send release to the exporter).
This saves the cap export message and a "mds to mds" message.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2013-12-16 12:15:25 +08:00
Yan, Zheng
9dc52ff04b mds: send cap import messages to clients after importing subtree succeeds
When importing subtree, the importer sends cap import messages to clients
before the import subtree operation is considered successful. If the
exporter crashes before EExport event is journalled, the importer needs to
re-export client caps. This confuses clients, and makes them lose track of
auth caps.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2013-12-16 12:15:25 +08:00
Yan, Zheng
6a565881f6 mds: re-send cap exports in resolve message.
For a rename operation that changes an inode's authority, if the master mds
of the operation crashes, the inode's original auth mds sends export
messages to clients when it receives the master mds' resolve ack
message. A client can't rely on the export message to add caps for
the master mds and then reconnect the caps when the master mds enters
the reconnect stage, because the client may receive the export message after
receiving the mdsmap that claims the master mds is in the reconnect stage.

The fix is to include cap exports in the resolve message, so the master mds
can send import messages to clients when it enters the rejoin stage.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2013-12-16 12:15:25 +08:00
Yan, Zheng
4fdeb00df2 mds: include counterpart's information in cap import/export messages
When exporting inodes with client caps, the importer sends cap import
messages to clients and the exporter sends cap export messages to clients.
A client can receive these two messages in any order. If a client first
receives the cap import message, it adds the imported caps, but the caps
from the exporter are still considered valid. This can compromise
consistency. If an MDS crashes while importing caps, clients may receive
only cap export messages and no cap import messages.
These clients don't know which MDS is the cap importer, so they can't
send cap reconnect when the MDS recovers.

We can handle the above issues by including the counterpart's information in
cap import/export messages. If a client first receives the cap import
message, it adds the imported caps, then removes the exporter's
caps. If a client first receives the cap export message, it removes the
exported caps, then adds caps for the importer.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2013-12-16 12:15:25 +08:00
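The client-side handling described in 4fdeb00df2 above can be modelled with a toy sketch. It assumes a heavily simplified client that only tracks which MDS ranks hold caps for one inode; the type and method names are hypothetical, and real cap handling involves sequence numbers and cap bits that are left out here.

```cpp
#include <cassert>
#include <set>

// Simplified client-side view of one inode: the set of MDS ranks that
// currently hold caps for it. All names here are hypothetical.
struct InodeCaps {
  std::set<int> holders;

  // Cap import from `importer`; the message names `exporter` as counterpart.
  void handle_import(int importer, int exporter) {
    holders.insert(importer);  // add the imported caps
    holders.erase(exporter);   // then drop the exporter's caps
  }

  // Cap export from `exporter`; the message names `importer` as counterpart.
  void handle_export(int exporter, int importer) {
    holders.erase(exporter);   // remove the exported caps
    holders.insert(importer);  // then add caps for the importer
  }
};

// Usage: both delivery orders end with only the importer (mds.1) holding caps.
void demo_either_order() {
  InodeCaps import_first;
  import_first.holders.insert(0);
  import_first.handle_import(1, 0);
  import_first.handle_export(0, 1);

  InodeCaps export_first;
  export_first.holders.insert(0);
  export_first.handle_export(0, 1);
  export_first.handle_import(1, 0);

  assert(import_first.holders == export_first.holders);
  assert(import_first.holders.size() == 1 && import_first.holders.count(1) == 1);
}
```

Because each message carries the counterpart, a client that only ever sees the export message still learns which MDS to reconnect to.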
Yan, Zheng
ef902ee0b9 mds: send info of imported caps back to the exporter (rename)
Use MMDSSlaveRequest::OP_FINISH slave request to send information
of rename imported caps back to the exporter. This is preparation
for including counterpart's information in cap import/export message.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2013-12-16 12:15:25 +08:00
Yan, Zheng
85171fd6c2 mds: send info of imported caps back to the exporter (cache rejoin)
Use cache rejoin ack message to send information of rejoin imported
caps back to the exporter. Also move the code that exports reconnect
caps to MDCache::handle_cache_rejoin_ack()

This is preparation for including counterpart's information in cap
import/export message.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2013-12-16 12:15:24 +08:00
Yan, Zheng
ff8b9ac358 mds: send info of imported caps back to the exporter (export dir)
Introduce a new class Capability::Import and use it to send information
of imported caps back to the exporter. This is preparation for including
counterpart's information in cap import/export message.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2013-12-16 12:15:24 +08:00
Yan, Zheng
d00ec7915c mds: flush session messages before exporting caps
The following sequence of events can happen when exporting inodes:

- client sends open file request to mds.0
- mds.0 handles the request and sends inode stat back to the client
- mds.0 exports the inode to mds.1
- mds.1 sends cap import message to the client
- mds.0 sends cap export message to the client
- client receives the cap import message from mds.1, but the client
  still doesn't have corresponding inode in the cache. So the client
  releases the imported caps.
- client receives the open file reply from mds.0
- client receives the cap export message from mds.0.

After the end of these events, the client doesn't have any cap for
the opened file.

To fix the message ordering issue, this patch introduces a new session
operation FLUSHMSG. Before exporting caps, we send a FLUSHMSG session
message to the client and wait for the acknowledgment. When receiving the
FLUSHMSG_ACK message from the client, we are sure that the client has
received all messages sent previously.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2013-12-16 12:15:24 +08:00
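The FLUSHMSG idea from d00ec7915c above is a flush barrier on an ordered channel: once the client acknowledges the flush, everything sent before it must already have been processed. The toy model below assumes in-order delivery within one session and uses hypothetical names throughout; it is not the MDS session code.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <vector>

// Toy model of one ordered MDS-to-client session. Names are hypothetical.
struct Session {
  std::deque<std::string> in_flight;   // sent by the MDS, not yet delivered
  std::vector<std::string> delivered;  // processed by the client, in order
  bool flush_acked = false;

  void send(const std::string &msg) { in_flight.push_back(msg); }

  // Client side: process the oldest message; acknowledge a flush request.
  void deliver_one() {
    if (in_flight.empty())
      return;
    std::string msg = in_flight.front();
    in_flight.pop_front();
    if (msg == "FLUSHMSG")
      flush_acked = true;  // stands in for sending FLUSHMSG_ACK
    else
      delivered.push_back(msg);
  }
};

// MDS side: send the barrier and wait for the ack before exporting caps.
// Returns true once it is safe to export.
bool flush_before_export(Session &s) {
  s.send("FLUSHMSG");
  while (!s.flush_acked && !s.in_flight.empty())
    s.deliver_one();
  return s.flush_acked;
}

// Usage: the open reply is guaranteed to be processed before the export.
void demo_flush_barrier() {
  Session s;
  s.send("open reply");
  assert(flush_before_export(s));
  assert(s.delivered.size() == 1 && s.delivered[0] == "open reply");
}
```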
Yan, Zheng
77515b7a3c mds: increase cap sequence when sharing max size
For case:
 - client voluntarily releases some caps through cap update message
 - mds shares the new max by sending cap grant message
 - mds receives the cap update message

If mds doesn't increase the cap sequence when sharing the max size,
it can't determine if the cap update message was sent before or after
client received the cap grant message that updates max size.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2013-12-16 12:15:24 +08:00
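The reasoning in 77515b7a3c above can be sketched with a sequence counter. The struct and function names are hypothetical, and the sketch assumes that a cap update echoes the last sequence number the client has seen; it only illustrates why bumping the sequence makes the two cases distinguishable.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical MDS-side record for one client cap.
struct Cap {
  uint32_t seq = 0;
  uint64_t max_size = 0;
};

// Share a new max size with the client and bump the sequence number.
// Returns the sequence number carried by the cap grant message.
uint32_t share_max_size(Cap &cap, uint64_t new_max) {
  cap.max_size = new_max;
  return ++cap.seq;
}

// Assumption: a cap update echoes the last sequence number the client saw.
// If it is older than the current one, the update was sent before the
// client received the latest grant.
bool sent_after_latest_grant(const Cap &cap, uint32_t seq_in_update) {
  return seq_in_update == cap.seq;
}

// Usage: the same update is classified differently depending on its seq.
void demo_cap_seq() {
  Cap cap;
  uint32_t seen_by_client = cap.seq;            // before the grant
  uint32_t grant_seq = share_max_size(cap, 4096);
  assert(!sent_after_latest_grant(cap, seen_by_client));
  assert(sent_after_latest_grant(cap, grant_seq));
}
```

Without the increment in `share_max_size`, both updates would carry the same sequence number and the MDS could not tell them apart.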
Yan, Zheng
65259796ae mds: include inode version in auth mds' lock messages
Encode inode version in auth mds' lock messages, so that the version
of replica inodes gets updated. This is important because the client
uses the inode version in the mds reply to check if the cached inode is
already up-to-date. It skips updating the inode if it thinks the
inode is already up-to-date.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2013-12-16 12:15:24 +08:00
Yan, Zheng
f134c77267 mds: avoid allocating MDRequest::More when cleanup request
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2013-12-16 12:15:24 +08:00
Yan, Zheng
e6c4d32e64 mds: waiting for slave request to finish
If the MDS receives a client request but finds there is an existing
slave request, it's possible that another MDS forwarded the request
to us, but the MMDSSlaveRequest::OP_FINISH message arrives after
the client request.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2013-12-16 12:15:24 +08:00