25386 Commits

Author SHA1 Message Date
Sage Weil
1ddea41fc5 Merge pull request #217 from alram/master
Fix: use absolute path with udev

Reviewed-by: Sage Weil <sage@inktank.com>
2013-04-15 20:32:46 -07:00
Alexandre Marangone
785b25f53d Fix: use absolute path with udev
Avoids the following: udevd[61613]: failed to execute '/lib/udev/bash'
'bash -c 'while [ ! -e /dev/mapper/....

Signed-off-by: Alexandre Marangone <alexandre.marangone@inktank.com>
2013-04-15 15:57:00 -07:00
Josh Durgin
98de67d424 qa: add workunit for running qemu-iotests
This uses the old stand-alone qemu-iotests repo so it works with the
version of qemu in Ubuntu 12.04. The tests depend tightly on qemu
version, so to use later tests we'd need to install corresponding
versions of qemu.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2013-04-12 17:59:35 -07:00
Greg Farnum
6b98162f2b mds: output error number when failing to load an MDSTable
Signed-off-by: Greg Farnum <greg@inktank.com>
2013-04-12 13:14:17 -07:00
Gary Lowell
ae71b576a7 init-radosgw.sysv: New radosgw init file for rpm based systems
Added init-radosgw.sysv file for rpm based systems, added it to
the tarball list in the makefile, and updated the specfile to
install it.  Also added a dependency on ceph since it uses
utility routines from that package (on Debian systems these are
packaged in ceph-common).  Incorporated review comments from
Alex. (Bug #4571)

Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
Reviewed-by: Alexandre Marangone <alexandre.marangone@inktank.com>
2013-04-11 23:02:08 -07:00
Sam Lang
d777b8e66b Merge pull request #213 from ceph/wip-sessionmap-4644
mds: fix session_info_t decoding

Reviewed-by: Sam Lang <sam.lang@inktank.com>
2013-04-11 09:08:04 -07:00
Gregory Farnum
e32849c4ee Merge pull request #212 from ceph/wip-4451
2013-04-11 08:45:06 -07:00
Sam Lang
4977f3eab0 mds: Delay export on missing inodes for reconnect
The reconnect caps sent by the client on reconnect may not have
inodes found in the inode cache until after clientreplay (when
the client creates a new file, for example). Currently, we send an
export for that cap to the client if we don't see an inode in the cache
and path_is_mine() returns false (for example, if the client didn't
send a path because the file was already unlinked).
Instead, we want to delay handling of the reconnect cap until
clientreplay completes.

This patch modifies handle_client_reconnect() so that we don't assume
the cap isn't ours if we don't have an inode for it, but instead delay
recovery for later. An export cap message is only sent if the inode exists
and the cap isn't ours (non-auth) during reconnect. If any remaining
recovered caps exist in the recovered list once the mds goes active, we
send export messages at that point.

Also, after removing the path_is_mine check,
MDCache::parallel_fetch_traverse_dir() needs to skip non-auth dirfrags.

Fixes #4451.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2013-04-11 10:25:46 -05:00
Sam Lang
3a1cf53c30 client: Unify session close handling
If mds failure causes client reconnect while the
client is unmounting, the client will send a session
close request to the mds even if there are outstanding
inodes in the cache waiting to receive flush_acks.   This
causes the mds to send back a session close message and
the client closes the connection, so that when the mds tries
to send flush acks back to the client, they get dropped, resulting
in the client hanging on unmount.  The pattern for this bug is:

1. mds restart
2. client sends session open request
3. client unmount sets unmounting flag and waits for flush_acks
4. mds sends session open reply
5. client sends session close request (because it's unmounting)
6. mds sends session close, client closes connection
7. mds tries to send flush_acks, but drops them because the connection
is gone

This patch unifies the session close handling so that the client
only sends a session close in unmount once all flush acks have been
received.  If the mds restarts during session close, the reconnect
logic will kick the session close waiter so that session close requests
are re-sent for session close replies not yet received.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-04-11 10:25:46 -05:00
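The ordering described in 3a1cf53c30 can be sketched as a toy model. This is not Ceph's client code: the class and method names (`ToySession`, `begin_unmount`, `handle_flush_ack`) are hypothetical, and the sketch assumes only what the message states, namely that the session close is sent during unmount only once all flush acks have been received.

```python
class ToySession:
    """Toy model: defer the session close until all flush acks arrive."""

    def __init__(self, pending_flushes):
        # Inodes still waiting for a flush_ack (hypothetical representation).
        self.pending_flushes = set(pending_flushes)
        self.unmounting = False
        self.sent = []  # messages "sent" to the mds, kept for inspection

    def begin_unmount(self):
        self.unmounting = True
        self._maybe_close()

    def handle_flush_ack(self, inode):
        self.pending_flushes.discard(inode)
        self._maybe_close()

    def _maybe_close(self):
        # Send the close only when unmounting AND nothing is left to be acked,
        # and never send it twice.
        if (self.unmounting
                and not self.pending_flushes
                and "session_close" not in self.sent):
            self.sent.append("session_close")


session = ToySession(pending_flushes={1, 2})
session.begin_unmount()
session.handle_flush_ack(1)
closed_early = "session_close" in session.sent
session.handle_flush_ack(2)
closed_after_acks = "session_close" in session.sent
```

In this model the close request is emitted from a single place, so neither the unmount path nor the ack path can send it while flushes are outstanding.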
Samuel Just
a3298713bb OSD: make pg upgrade logging quiet
Fixes: #4701
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2013-04-10 14:24:23 -07:00
Samuel Just
ac720a091d Merge branch 'wip_4654' into next
Fixes: #wip_4654
Reviewed-by: Greg Farnum <greg@inktank.com>
2013-04-10 14:00:13 -07:00
Alex Elder
351d9b270f rbd qa/workunits: add rbd read data test
This adds a new test script for validating that data read from a mapped
rbd image is what it's expected to be.

See the content of the file for a bit more explanation.

Signed-off-by: Alex Elder <elder@inktank.com>
2013-04-10 15:54:13 -05:00
caleb miles
bb8d1c9897 rgw_admin: Create keys for a new user by default.
Create a new key pair for new users or when --gen-access-key is specified.

Signed-off-by: caleb miles <caleb.miles@inktank.com>
2013-04-10 15:49:01 -04:00
Samuel Just
170d4a3d79 FileJournal: start_seq is seq+1 if journalq.empty()
This is also the same as journaled_seq + 1 for writeahead
journaling, but not for parallel journaling.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-04-10 12:48:02 -07:00
Samuel Just
90c256d757 FileJournal: fix off by one error in committed_thru
journalq.front().first is the sequence number of the entry
at journalq.front().second.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-04-10 12:48:02 -07:00
Samuel Just
a4fa0a8200 Journal: commits may not include all journaled seqs
At one point, a commit had to drain the FileStore op
queue.  This is no longer the case.  Consequently, the
journal may have to wait more than one commit for the
filestore to create a stable commit point at a particular
sequence.  Handling this requires two changes:

1) We cannot transition to FULL_WAIT until we receive
a commit_start on a seq >= journaled_seq.
2) We cannot remove the journal completion plug until we get
a committed_thru on a seq >= header.start_seq at least as
new as the oldest committed item in the journal.  If on
replay, the journal does not include fs_op_seq, we ignore
it, which is fine since we won't have reported those
entries committed!

Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-04-10 12:48:01 -07:00
Samuel Just
13474b089b Journal: pass the sequence number to commit_start
A subsequent patch will need to see the committing seq.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-04-10 12:47:23 -07:00
Yan, Zheng
a1d9cbe5af mds: fix session_info_t decoding
commit 0bcf2ac081 changes session_info_t's format, but there is
a typo in the code that decodes old format. We also need to
handle struct_v == 1, which had the same encoding but without
the size guards (which is all handled by DECODE_START_LEGACY_COMPAT).

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Greg Farnum <greg@inktank.com>
2013-04-10 12:46:30 -07:00
Josh Durgin
4b656730ff test_stress_watch: remove bogus asserts
There's no reason to check the duration of a watch. The notify will
timeout after 30s on the OSD, but there's no guarantee the client will
see that in any bounded time. This test is really meant as a stress
test of the OSDs anyway, not of the clients, so just remove asserts
about operation duration.

Fixes: #4591
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Sam Just <sam.just@inktank.com>
2013-04-10 11:36:36 -07:00
Josh Durgin
3888a12385 test: update rbd formatted-output for progress changes
Progress output now goes to stderr instead of stdout.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2013-04-10 10:43:13 -07:00
Greg Farnum
8eb5465c10 Merge branch 'wip-journaler-4618' into next
Reviewed-by: Sam Lang <sam.lang@inktank.com>
2013-04-09 16:00:41 -07:00
Greg Farnum
95374c628b config: fix osd_client_message_cap comment
Signed-off-by: Greg Farnum <greg@inktank.com>
2013-04-09 12:11:27 -07:00
Greg Farnum
cecbb4d88a Merge remote-tracking branch 'origin/wip-osd-throttle2' into next
Reviewed-by: Greg Farnum <greg@inktank.com>
2013-04-09 12:11:15 -07:00
Samuel Just
a48739d9ab FileJournal: clarify meaning of start_seq and fix initialization
Second guessing the first sequence number from the FileStore
was silly and broke tests which had the temerity to start at
1 instead of 2...

Fixes: #4687
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2013-04-09 10:46:03 -07:00
Greg Farnum
88ab8413de Revert "global: call config observers on global_init (and start logging!)"
This reverts commit a309177466. This commit
includes calls that involve Mutexes, Lockers, and lockdep -- which isn't
yet set up, so things break horribly. A more subtle approach is required.

Signed-off-by: Greg Farnum <greg@inktank.com>
2013-04-08 18:20:53 -07:00
Dan Mick
be801f6c50 mon: Use _daemon version of argparse functions
Allow argparse functions to fail if no argument given by using
special versions that avoid the default CLI behavior of "cerr/exit"

Fixes: #4678
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-04-08 17:49:15 -07:00
Dan Mick
c76bbc2e6d ceph_argparse: add _daemon versions of argparse calls
mon needs to call argparse for a couple of -- options, and the
argparse_witharg routines were attempting to cerr/exit on missing
arguments.  This is appropriate for the CLI usage, but not the daemon
usage.  Add a 'cli' flag that can be set false for the daemon usage
(and cause the parsing routine to return false instead of exit).

The daemon's parsing code is due for a rewrite soon.

Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-04-08 17:49:15 -07:00
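The 'cli' flag idea from c76bbc2e6d can be illustrated with a small sketch. It is not the ceph_argparse API: `parse_witharg` here is a hypothetical Python stand-in and `--some-option` is a made-up option name. It only shows the behavior the message describes, where a missing argument exits in CLI mode but returns false in daemon mode.

```python
import sys


def parse_witharg(args, option, cli=True):
    """Return (found, value) for `option`.

    With cli=True a missing value prints an error and exits, which suits a
    command-line tool. With cli=False the caller gets (False, None) instead,
    so a daemon can decide how to handle the error itself.
    """
    if option not in args:
        return False, None
    idx = args.index(option)
    if idx + 1 >= len(args):
        if cli:
            print(f"Option {option} requires an argument.", file=sys.stderr)
            sys.exit(1)
        return False, None
    return True, args[idx + 1]


ok_result = parse_witharg(["--some-option", "value"], "--some-option", cli=False)
missing_result = parse_witharg(["--some-option"], "--some-option", cli=False)
```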
Samuel Just
d7b7acefc8 Pipe: call discard_requeued_up_to under pipe_lock
Fixes: #4627
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2013-04-08 17:02:45 -07:00
Gregory Farnum
1a3890a59f Merge pull request #202 from ceph/wip-log-boot
Fixes #4676.

Reviewed-by: Greg Farnum <greg@inktank.com>
2013-04-08 15:53:30 -07:00
Greg Farnum
4cb18b5a6f journaler: remove the unused prefetch_from member variable
Signed-off-by: Greg Farnum <greg@inktank.com>
2013-04-08 14:09:23 -07:00
Gregory Farnum
9c2d017d52 Merge pull request #206 from ceph/wip-4660
mds: Keep LogSegment ref for openc backtrace

Reviewed-by: Greg Farnum <greg@inktank.com>
2013-04-08 11:18:53 -07:00
Sam Lang
3cdc61eca2 mds: Keep LogSegment ref for openc backtrace
The MDRequest is destroyed once the client reply is sent, but
we need the reference to the LogSegment for updating the backtrace, so
store a temporary ref to the LogSegment for later.

Fixes #4660.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-04-08 12:45:54 -05:00
Greg Farnum
edc9ddfde9 mds: fix journaler to set temp_fetch_len appropriately and read the requested amount
The _prefetch() function interprets temp_fetch_len as the amount of
data we need from read_pos, which is the beginning
of read_buf. So by setting it to the amount *more* we needed, we were
getting stuck forever if we actually hit this condition. Fix it by
setting temp_fetch_len based on the amount of data we need in aggregate.

Furthermore, we were previously rounding *down* the requested amount in
order to read only full log segments. Round up instead!

Fixes #4618

Signed-off-by: Greg Farnum <greg@inktank.com>
2013-04-08 09:10:35 -07:00
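Both problems described in edc9ddfde9 come down to simple arithmetic, shown here as a toy model rather than the Journaler code. The helper names, the 4096-byte segment size, and the byte counts are assumptions chosen for illustration.

```python
def bytes_to_fetch(fetch_len_from_read_pos, buffered):
    """Extra bytes to read when the length is measured from read_pos."""
    return max(0, fetch_len_from_read_pos - buffered)


def round_down(n, period):
    return (n // period) * period


def round_up(n, period):
    return ((n + period - 1) // period) * period


buffered = 3000                         # bytes already in the read buffer
needed_total = 5000                     # bytes needed, counted from read_pos
needed_more = needed_total - buffered   # the amount *more* we need

# Passing only the extra amount looks already satisfied, so nothing is read.
stuck = bytes_to_fetch(needed_more, buffered)
# Passing the aggregate amount requests the missing bytes.
progress = bytes_to_fetch(needed_total, buffered)

period = 4096                           # assumed segment size
too_small = round_down(needed_total, period)   # less than what is needed
enough = round_up(needed_total, period)        # covers what is needed
```

Rounding down can leave the request short of the amount actually required, while rounding up always covers it and still lands on a segment boundary.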
Sage Weil
a309177466 global: call config observers on global_init (and start logging!)
Currently we don't start logging on daemon startup unless the log_file
parameter was adjusted by ceph.conf.  Instead, we should call all config
observers so that the logging subsystem is fully configured and we log
even prior to the daemonize and common_init_finish (when we call observers
again).  This fixes logging for the initial period before we daemonize.
For some of the daemons (osd, mon), this includes significant work.  It
also fixes the problem where users don't see the 'ceph version ...' banner
on daemon start.

Backport: bobtail
Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-07 09:07:21 -07:00
Sage Weil
1e7ddd9e9f global: flush log before stopping/starting on daemonize
Ensure that we push log data out before we restart logging.  This may not
be strictly necessary, but it avoids a whole class of possible pitfalls.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-07 09:07:21 -07:00
Sage Weil
f5ba0fbbe7 mon: make 'osd crush move ...' idempotent
If we don't need to move the item, return success.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-06 13:54:43 -07:00
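Idempotence in the sense of f5ba0fbbe7 means repeating the same move succeeds without changing anything. A minimal sketch, assuming a plain dictionary in place of the CRUSH map and a hypothetical `move_item` helper:

```python
def move_item(locations, item, new_parent):
    """Move `item` under `new_parent`; return True on success."""
    if item not in locations:
        return False            # unknown item: a real error
    if locations[item] == new_parent:
        return True             # already in place: nothing to do, still success
    locations[item] = new_parent
    return True


locations = {"osd.0": "host-a"}
first = move_item(locations, "osd.0", "host-b")
second = move_item(locations, "osd.0", "host-b")   # repeat of the same request
```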
Sage Weil
628e9ae26d librbd: fix DiffIterateStress again
- fix seed
- the array indices are points in time; no need to subtract one from i!
- pick a random seed and print it to stdout

I ran this with several different seeds without failure, so I am confident
we are in good shape.  And if we ever get a future failure, we'll have the
seed to reproduce.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-06 09:37:52 -07:00
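The "pick a random seed and print it" practice from 628e9ae26d can be sketched as follows. This is a generic illustration, not the DiffIterateStress test; `make_rng` is a hypothetical helper.

```python
import random
import time


def make_rng(seed=None):
    """Create a seeded generator and print the seed for later reproduction."""
    if seed is None:
        seed = int(time.time())
    print(f"seed = {seed}")
    return seed, random.Random(seed)


seed, rng = make_rng()
first_run = [rng.randrange(100) for _ in range(5)]

# Re-running with the printed seed replays the same sequence.
_, replay = make_rng(seed)
second_run = [replay.randrange(100) for _ in range(5)]
```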
Sage Weil
aca0aea1bf osd: throttle client messages by count, not just by bytes
This lets us put a cap on outstanding client IOs.  This is particularly
important for clients issuing lots of small IOs.

Fixes: #4579
Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-06 08:17:01 -07:00
Sage Weil
f7070e9568 msgr: add second per-message throttler to message policy
We already have a throttler that lets us limit the amount of memory
consumed by messages from a given source.  Currently this is based only
on the size of the message payload.  Add a second throttler that limits
the number of messages so that we can effectively throttle small requests
as well.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-06 08:17:01 -07:00
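The effect of pairing a byte throttler with a message-count throttler can be shown with a toy model. It is not the messenger's Throttle class: `ToyThrottle` and `MessagePolicy` are hypothetical names, and this version rejects a message where a real throttler would typically make the sender wait.

```python
class ToyThrottle:
    """Counter with an upper limit (non-blocking simplification)."""

    def __init__(self, limit):
        self.limit = limit
        self.current = 0

    def try_get(self, amount):
        if self.current + amount > self.limit:
            return False
        self.current += amount
        return True

    def put(self, amount):
        self.current -= amount


class MessagePolicy:
    """Admit a message only if both the byte and the count budgets allow it."""

    def __init__(self, max_bytes, max_messages):
        self.bytes = ToyThrottle(max_bytes)
        self.messages = ToyThrottle(max_messages)

    def admit(self, payload_size):
        if not self.messages.try_get(1):
            return False
        if not self.bytes.try_get(payload_size):
            self.messages.put(1)    # undo the count reservation
            return False
        return True

    def release(self, payload_size):
        self.bytes.put(payload_size)
        self.messages.put(1)


policy = MessagePolicy(max_bytes=1_000_000, max_messages=3)
# Five tiny messages: the byte budget is nowhere near full,
# but the count budget stops the fourth and fifth.
admitted = [policy.admit(16) for _ in range(5)]
policy.release(16)
admitted_after_release = policy.admit(16)
```

With a byte limit alone, all five small messages would have been admitted, which is the gap the second throttler closes.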
Sage Weil
79b71441f8 librbd: fix DiffIterateStress test
If we write to an interval that didn't previously exist and then discard
it so that it again doesn't exist, all during the same interval, then we
should not include it in the 'written' set (or exists set, obviously).

Similarly, when we go to look at a merged diff, we can ignore extents
that were written (and possibly zeroed) if they neither existed before nor
after.

Bump up the iteration count to get more confidence that this is actually
correct.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-05 22:28:38 -07:00
Yehuda Sadeh
b083dece36 rgw: translate object marker to raw format
Fixes: #4600
Object marker should be treated as an object, so that name is formatted
correctly when getting the raw oid.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2013-04-05 10:27:40 -07:00
caleb miles
be6961bd28 Allow creation of buckets starting with underscore in RGW
Signed-off-by: caleb miles <caleb.miles@inktank.com>
2013-04-05 10:26:29 -07:00
Gary Lowell
debce05510 Merge pull request #198 from dalgaaf/wip-da-spec
Fix some install and rpm SPEC issues

Reviewed-by: Gary Lowell <gary.lowell@inktank.com>
2013-04-05 09:46:27 -07:00
Sage Weil
1f2d5bba5b Merge remote-tracking branch 'gh/next'
2013-04-04 22:22:43 -07:00
Samuel Just
6ef9d87a02 FileJournal: introduce start_seq header entry
FileStore::header_t::start_seq now encodes the op seq which may be
written at FileStore::header_t::start.  This way, FileStore::open()
can pass a valid sequence number to read_entry for validation.
Otherwise, read_entry has no way of knowing whether a failure of a
read at header.start indicates that the journal was empty, or that
the entry is corrupt.  With start_seq, read_entry can assume
corruption if start_seq <= committed_up_to.

Fixes: #4527
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-04-04 12:48:44 -07:00
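The rule stated in 6ef9d87a02 can be written as a one-line check. This sketch is not the FileJournal code: `classify_failed_read` is a hypothetical helper, and treating the remaining case as an empty journal is an assumption drawn from the message's description of the two possibilities.

```python
def classify_failed_read(start_seq, committed_up_to):
    """Interpret a failed read at header.start using start_seq."""
    if start_seq <= committed_up_to:
        # An entry with this seq was committed, so it should be readable.
        return "corrupt"
    # Assumption: nothing at or after start_seq was committed, so the
    # journal is taken to be empty.
    return "empty"


committed_case = classify_failed_read(start_seq=10, committed_up_to=12)
uncommitted_case = classify_failed_read(start_seq=13, committed_up_to=12)
```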
Samuel Just
f12a5ed546 FileJournal: fill in committed_up_to for old headers
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-04-04 12:48:36 -07:00
Danny Al-Gaaf
e5cecd7656 debian/ceph-test.install: add installed but not packaged files
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2013-04-04 18:38:11 +02:00
Danny Al-Gaaf
a3a658dc53 ceph.spec.in: add installed but not packaged files to ceph-test
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2013-04-04 18:30:40 +02:00
Danny Al-Gaaf
8cf3319f5d ceph.spec.in: remove some twice created directories
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2013-04-04 18:27:13 +02:00
Danny Al-Gaaf
6bc14889be ceph.spec.in: fix udev rules.d files handling
Move 50-rbd.rules into the ceph base package since the related
ceph-rbdnamer binary is part of this package. Use correct install
pattern.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2013-04-04 18:23:40 +02:00