Commit Graph

23200 Commits

Author SHA1 Message Date
Samuel Just
44625d4460 config_opts.h: default osd_recovery_delay_start to 0
This setting was intended to prevent recovery from overwhelming peering traffic
by delaying the recovery_wq until osd_recovery_delay_start seconds after pgs
stop being added to it.  This should be less necessary now that recovery
messages are sent with strictly lower priority than peering messages.

Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Gregory Farnum <greg@inktank.com>
2013-01-10 11:10:04 -08:00
Gregory Farnum
eb997e25e0 Merge pull request #31 from chrisglass/expose_cluster_stats_to_python
Added python wrapper to rados_cluster_stat
2013-01-10 10:09:25 -08:00
Chris Glass
797b3db39b Added python wrapper to rados_cluster_stat
The new get_cluster_stats() method on the rados.Rados object calls
the rados_cluster_stat() function in the librados library.

Signed-off-by: Christopher Glass <christopher.glass@canonical.com>
2013-01-10 14:43:49 +01:00
Dan Mick
00898c1860 rbd: allow copy of zero-length images. Includes simple test.
Fixes: #3765
Signed-off-by: Dan Mick <dan.mick@inktank.com>
2013-01-09 16:11:51 -08:00
Dan Mick
1c3d6840d1 doc/install/debian.rst: fix typo in link ref; broke doc build
Signed-off-by: Dan Mick <dan.mick@inktank.com>
2013-01-09 16:10:36 -08:00
Dan Mick
133e4e3473 Merge branch 'next'
Want to get various rbd-related fixes together for upgrade testing
2013-01-09 15:11:36 -08:00
Samuel Just
48f1394683 ReplicatedPG: increment scrubber.errors rather than errors
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-01-09 14:40:36 -08:00
Filippos Giannakos
62e721a91c librados: add aio stat tests
Implement simple write-stat test, and a write-stat-remove-stat test cycle.

Signed-off-by: Filippos Giannakos <philipgian@grnet.gr>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-01-08 19:39:19 -08:00
Filippos Giannakos
879578c1d1 librados: implement aio_stat
Implement aio stat and also export this functionality to the C API.

Signed-off-by: Filippos Giannakos <philipgian@grnet.gr>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-01-08 19:38:43 -08:00
Sage Weil
5b12b514b0 osd: make missing head non-fatal during scrub
If we encounter a scrub without a preceding head, warn instead of
crashing.  Note that this is still something we can't repair.

See #3705.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
2013-01-08 18:32:38 -08:00
Sylvain Munaut
e1da85f286 rgw: Fix crash when FastCGI frontend doesn't set SCRIPT_URI
Fixes: #3735
Signed-off-by: caleb miles <caleb.miles@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-01-08 18:29:27 -08:00
caleb miles
eba314a811 rgw: fix handler leak in handle_request
Fixes: #3682
Signed-off-by: caleb miles <caleb.miles@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-01-08 18:28:31 -08:00
Dan Mick
4483285c9f librbd: Allow get_lock_info to fail
If the lock class isn't present, EOPNOTSUPP is returned for lock calls
on newer OSDs, but sadly EIO on older; we need to treat both as
acceptable failures for RBD images.  rados lock list will still fail.

Fixes #3744.

Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-01-08 18:25:46 -08:00
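The error handling described in the commit above can be sketched roughly as follows. This is an illustration only, not the librbd code: the helper names `lock_failure_is_acceptable` and `classify_lock_query` are hypothetical, and the sketch assumes the usual convention of negative errno return values.

```cpp
#include <cerrno>

// Hypothetical helper (not from librbd): true when a negative errno-style
// result from a lock query just means "the lock class is not available".
// Per the commit message, newer OSDs report EOPNOTSUPP and older ones EIO.
bool lock_failure_is_acceptable(int r) {
  return r == -EOPNOTSUPP || r == -EIO;
}

// Hypothetical wrapper: maps the raw query result to the result the caller
// should see, and records whether lock information is actually available.
int classify_lock_query(int query_result, bool *have_lock_info) {
  if (query_result >= 0) {
    *have_lock_info = true;
    return 0;
  }
  if (lock_failure_is_acceptable(query_result)) {
    *have_lock_info = false;  // carry on without lock information
    return 0;
  }
  *have_lock_info = false;
  return query_result;        // any other error is still a real failure
}
```

With this shape, opening an image would carry on when the query returns -EOPNOTSUPP or -EIO, while an error such as -ENOENT would still propagate.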
Sage Weil
77ddf2760e doc/release-notes: v0.48.3argonaut
Signed-off-by: Sage Weil <sage@inktank.com>
2013-01-08 18:21:12 -08:00
Sage Weil
f07921bebc doc/install: new URLs for argonaut vs bobtail
Also restructure the document a bit to make the choice of packages more
clear.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-01-07 20:51:04 -08:00
Sage Weil
72674ad447 doc/release-notes: v0.56.1
Signed-off-by: Sage Weil <sage@inktank.com>
2013-01-07 20:46:31 -08:00
Noah Watkins
1b194b2564 Merge branch 'wip-stripe-gran'
Reviewed-by: Greg Farnum <greg@inktank.com>
2013-01-07 16:14:16 -08:00
Noah Watkins
26e8438a55 test: enforce -ENOTCONN contract in libcephfs
Tests all relevant calls for -ENOTCONN when used with an unmounted
ceph_mount_info param.

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
2013-01-07 15:50:36 -08:00
Noah Watkins
5c58aa96e4 libcephfs: return -ENOTCONN when call unmounted
Adds -ENOTCONN return value for stat, fchmod, fchown, lchown.

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
2013-01-07 15:49:52 -08:00
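A minimal sketch of the -ENOTCONN contract from the commit above, assuming a simplified handle type. `MountSketch` and `sketch_fchmod` are hypothetical stand-ins for ceph_mount_info and a libcephfs call; the real signatures are not reproduced here.

```cpp
#include <cerrno>

// Hypothetical stand-in for ceph_mount_info; it only tracks mount state.
struct MountSketch {
  bool mounted = false;
};

// Hypothetical fchmod-style wrapper: refuse to touch the client when the
// handle has not been mounted, and report -ENOTCONN instead.
int sketch_fchmod(const MountSketch &cmount, int fd, int mode) {
  if (!cmount.mounted)
    return -ENOTCONN;
  (void)fd;    // unused in this sketch
  (void)mode;  // unused in this sketch
  return 0;    // a real implementation would forward to the client here
}
```

The same guard would sit at the top of each call the commit lists (stat, fchmod, fchown, lchown).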
Samuel Just
f83fcf63a9 PG: set DEGRADED in Active AdvMap handler based on pool size
Otherwise, if the acting set does not change, the pg might
not show up as degraded if the pool size now exceeds the
acting set size.

Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-01-07 15:16:09 -08:00
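The condition described in the commit above can be written as a small predicate. This is a simplified sketch, not the PG code: `is_degraded_sketch` is a hypothetical name, and the acting set is modelled as a plain vector of OSD ids.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical predicate: a PG counts as degraded when the pool asks for
// more replicas than the acting set currently provides. Re-evaluating this
// on every map change catches a pool size increase even when the acting
// set itself is unchanged.
bool is_degraded_sketch(const std::vector<int> &acting, std::size_t pool_size) {
  return acting.size() < pool_size;
}
```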
Noah Watkins
c41210934c libcephfs: clarify interface return value
Document that ceph_get_stripe_unit_granularity may return an error code
(e.g. -ENOTCONN). The interface requires a mount, but currently we
return a compile-time constant. Other error codes may be possible in the
future.

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
2013-01-07 15:04:33 -08:00
Sage Weil
c8f8c7e652 Merge branch 'next'
2013-01-07 13:12:33 -08:00
Sage Weil
1b39b31678 Merge branch 'wip-3678-b' into next
Reviewed-by: Greg Farnum <greg@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
2013-01-07 13:04:13 -08:00
Sage Weil
d16ad9263d msg/Pipe: prepare Message data for wire under pipe_lock
We cannot trust the Message bufferlists or other structures to be
stable without pipe_lock, as another Pipe may claim and modify the sent
list items while we are writing to the socket.

Related to #3678.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-01-07 13:02:58 -08:00
Sage Weil
40706afc66 msgr: update Message envelope in encode, not write_message
Fill out the Message header, footer, and calculate CRCs during
encoding, not write_message().  This removes most modifications from
Pipe::write_message().

Signed-off-by: Sage Weil <sage@inktank.com>
2013-01-07 13:02:58 -08:00
Sage Weil
62586884af osdc/Objecter: fix linger_ops iterator invalidation on pool deletion
The call to check_linger_pool_dne() may unregister the linger request,
invalidating the iterator.  To avoid this, increment the iterator at
the top of the loop.

This mirrors the fix in 4bf9078286 for
regular non-linger ops.

Fixes: #3734
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2013-01-07 12:58:39 -08:00
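The iterator handling described in the commit above follows a common C++ pattern, sketched here with a std::map. The names `OpMap`, `check_pool_dne_sketch` and `scan_ops` are hypothetical stand-ins and not the Objecter code; the sketch relies only on the standard guarantee that erasing from a std::map invalidates iterators to the erased element alone.

```cpp
#include <cstddef>
#include <map>

// Hypothetical registry of ops: op id -> id of the pool the op targets.
using OpMap = std::map<int, int>;

// Stand-in for a check that may unregister (erase) the op it inspects.
void check_pool_dne_sketch(OpMap &ops, int op_id, int deleted_pool) {
  auto it = ops.find(op_id);
  if (it != ops.end() && it->second == deleted_pool)
    ops.erase(it);
}

// Visits every op exactly once, even though the check may erase entries.
std::size_t scan_ops(OpMap &ops, int deleted_pool) {
  std::size_t visited = 0;
  for (auto p = ops.begin(); p != ops.end(); ) {
    int op_id = p->first;
    ++p;  // advance at the top of the loop, before the entry can be erased
    check_pool_dne_sketch(ops, op_id, deleted_pool);
    ++visited;
  }
  return visited;
}
```

Had the increment stayed in the for-statement, `++p` would run on an iterator that the check might already have erased, which is undefined behaviour.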
David Zafman
4c9f4c3ca8 ceph-fuse: rename ceph_ll_* to fuse_ll_*
To not conflict with future linuxbox pull for nfs-ganesha.

Signed-off-by: David Zafman <david.zafman@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-01-07 10:34:16 -08:00
Sage Weil
4cfc4903c6 msg/Pipe: encode message inside pipe_lock
This modifies bufferlists in the Message struct, and it is possible
for multiple instances of the Pipe to get references on the Message;
make sure they don't modify those bufferlists concurrently.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-01-06 20:38:28 -08:00
Sage Weil
a058f16113 msg/Pipe: associate sending msgs to con inside lock
Associate a sending message with the connection inside the pipe_lock.
This way if a racing thread tries to steal these messages it will
be sure to reset the con pointer *after* we do, such that the con
pointer is valid in encode_payload() (and later).

This may be part of #3678.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-01-06 20:38:25 -08:00
Sage Weil
2a1eb466d3 msg/Pipe: fix msg leak in requeue_sent()
The sent list owns a reference to each message.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-01-06 20:38:22 -08:00
Sage Weil
ce49968938 os/FileJournal: include limits.h
Needed for IOV_MAX.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-01-05 20:54:09 -08:00
Noah Watkins
e9efa33253 java: add stripe unit granularity tests
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
2013-01-05 11:17:58 -08:00
Noah Watkins
ececcf57b8 java: update javadoc comments
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
2013-01-05 11:12:25 -08:00
Noah Watkins
cdd138daa5 java: fix whitespace
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
2013-01-05 11:10:53 -08:00
Joe Buck
6954bf3392 java: add support for get_stripe_unit_granularity
Signed-off-by: Joe Buck <jbbuck@gmail.com>
Reviewed-by: Noah Watkins <noahwatkins@gmail.com>
2013-01-05 11:08:31 -08:00
Noah Watkins
abcda95b75 libcephfs: expose stripe unit granularity
Assists clients in choosing layout parameters.

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
2013-01-05 11:08:31 -08:00
Sage Weil
415294c0f9 Merge branch 'next'
2013-01-04 20:48:12 -08:00
Sage Weil
988a521735 osd: special case CALL op to not have RD bit effects
In commit 20496b8d2b we treat a CALL as
different from a normal "read", but we did not adjust the behavior
determined by the RD bit in the op.  We tried to fix that in
91e941aef9, but changing the op code breaks
compatibility, so that was reverted.

Instead, special-case CALL in the helper--the only point in the code that
actually checks for the RD bit.  (And fix one lingering user to use that
helper appropriately.)

Fixes: #3731
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
2013-01-04 20:46:56 -08:00
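The approach in the commit above, leaving the op code untouched and special-casing CALL inside the one helper that tests the RD bit, might look roughly like this. All constants and names below are hypothetical values chosen for illustration; they are not the real Ceph op codes.

```cpp
#include <cstdint>

// Hypothetical encoding: a mode bit shared by every read-class op code.
constexpr std::uint32_t SKETCH_MODE_RD = 0x1000;
constexpr std::uint32_t SKETCH_OP_READ = SKETCH_MODE_RD | 1;
constexpr std::uint32_t SKETCH_OP_CALL = SKETCH_MODE_RD | 2;
constexpr std::uint32_t SKETCH_OP_WRITE = 0x2000 | 1;

// Hypothetical helper: the single place that interprets the RD bit. CALL
// keeps its wire value, so old clients and servers stay compatible, but it
// is excluded here from read-specific handling.
bool op_has_rd_effects(std::uint32_t op) {
  if (op == SKETCH_OP_CALL)
    return false;
  return (op & SKETCH_MODE_RD) != 0;
}
```

Keeping the special case in one helper means callers never test the bit directly, which is why the commit also converts the remaining direct user.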
Sage Weil
d3abd0fe0b Revert "OSD: remove RD flag from CALL ops"
This reverts commit 91e941aef9.

We cannot change this op code without breaking compatibility
with old code (client and server).  We'll have to special case
this op code instead.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
2013-01-04 20:46:48 -08:00
Noah Watkins
3a9408742a libcephfs: delete client after messenger shutdown
Prevents race between messages being dispatched to the client after the
client has been freed.

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-01-04 19:51:52 -08:00
Dan Mick
0978dc4963 rbd: Don't call ProgressContext's finish() if there's an error.
do_copy was different from the others; call pc.fail() on error and
do not call pc.finish().

Fixes: #3729
Signed-off-by: Dan Mick <dan.mick@inktank.com>
2013-01-04 18:02:55 -08:00
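The control flow the commit above describes, calling fail() on error and finish() only on success, can be sketched as follows. `ProgressSketch` and `do_copy_sketch` are hypothetical stand-ins; the real ProgressContext interface is not reproduced here.

```cpp
// Hypothetical stand-in for a progress reporter; it only records which
// terminal call was made so the control flow can be observed.
struct ProgressSketch {
  bool finished = false;
  bool failed = false;
  void finish() { finished = true; }
  void fail() { failed = true; }
};

// Hypothetical copy wrapper: copy_result is the negative-errno result of
// the underlying copy. Exactly one of fail() or finish() is called.
int do_copy_sketch(int copy_result, ProgressSketch &pc) {
  if (copy_result < 0) {
    pc.fail();  // report the failure and skip finish()
    return copy_result;
  }
  pc.finish();
  return 0;
}
```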
Samuel Just
e89b6ade63 ReplicatedPG: remove old-head optimization from push_to_replica
This optimization allowed the primary to push a clone as a single push in the
case that the head object on the replica is old and happens to be at the same
version as the clone.  In general, using head in clone_subsets is tricky since
we might be writing to head during the push.  calc_clone_subsets does not
consider head (probably for this reason).  Handling the clone from head case
properly would require blocking writes on head in the interim which is probably
a bad trade-off anyway.

Because the old-head optimization only comes into play if the replica's state
happens to fall on the last write to head prior to the snap that caused the
clone in question, it's not worth the complexity.

Fixes: #3698
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-01-04 13:44:18 -08:00
Josh Durgin
6a3d475cf0 Merge remote branch 'origin/wip-rbd-watch'
Reviewed-by: Dan Mick <dan.mick@inktank.com>
2013-01-04 13:37:36 -08:00
Sage Weil
28d59d374b os/FileStore: fix non-btrfs op_seq commit order
The op_seq file is the starting point for journal replay.  For stable btrfs
commit mode, which is using a snapshot as a reference, we should write this
file before we take the snap.  We normally ignore current/ contents anyway.

On non-btrfs file systems, however, we should only write this file *after*
we do a full sync, and we should then fsync(2) it before we continue
(and potentially trim anything from the journal).

This fixes a serious bug that could cause data loss and corruption after
a power loss event.  For a 'kill -9' or crash, however, there was little
risk, since the writes were still captured by the host's cache.

Fixes: #3721
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
2013-01-03 17:15:07 -08:00
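The ordering the commit above requires on non-btrfs file systems (full sync first, then write op_seq, then fsync it) can be sketched with plain POSIX calls. This is an illustration under assumptions, not the FileStore code: `commit_op_seq_sketch`, the file path and the text encoding of the sequence number are all hypothetical.

```cpp
#include <cerrno>
#include <cstdint>
#include <string>

#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: persist a sequence number only after all earlier
// writes have been flushed. Returns 0 on success or a negative errno.
int commit_op_seq_sketch(const std::string &path, std::uint64_t seq) {
  ::sync();  // step 1: full sync, so the data op_seq refers to is on disk

  // Step 2: write the new sequence number.
  int fd = ::open(path.c_str(), O_WRONLY | O_CREAT | O_TRUNC, 0644);
  if (fd < 0)
    return -errno;
  const std::string buf = std::to_string(seq) + "\n";
  int r = 0;
  ssize_t n = ::write(fd, buf.data(), buf.size());
  if (n < 0)
    r = -errno;
  else if (static_cast<std::size_t>(n) != buf.size())
    r = -EIO;  // treat a short write as an error in this sketch
  else if (::fsync(fd) < 0)  // step 3: make op_seq itself durable
    r = -errno;
  ::close(fd);
  return r;  // only after a 0 result would it be safe to trim the journal
}
```

If the file were written before the sync, a power loss could leave an op_seq on disk that points past data that never reached the disk, which is the failure the commit describes.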
John Wilkins
f1e0305f0d doc: Removed the --without-tcmalloc flag until further advised.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-01-03 16:13:13 -08:00
Sage Weil
19df20867d Merge pull request #30 from rca/master
Minor clarification in docs.
2013-01-03 16:07:59 -08:00
John Wilkins
88af7d182a doc: Added defaults for PGs, links to recommended settings, and updated note on splitting.
Fixes: #3555

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-01-03 14:51:33 -08:00
Samuel Just
4ae4dce5c5 OSD: for old osds, dispatch peering messages immediately
Normally, we batch up peering messages until the end of
process_peering_events to allow us to combine many notifies, etc
to the same osd into the same message.  However, old osds assume
that the activation message (log or info) will be _dispatched
before the first sub_op_modify of the interval.  Thus, for those
peers, we need to send the peering messages before we drop the
pg lock, lest we issue a client repop from another thread before
the activation message is sent.

Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-01-03 14:18:00 -08:00
John Wilkins
73bc8ffc90 doc: Added comments on --without-tcmalloc option when building Ceph.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-01-03 13:30:14 -08:00
rca
37b57cdf0f Update doc/rados/configuration/filesystem-recommendations.rst
Clarified when it's necessary to use the setting:

filestore xattr use omap = true
2013-01-03 13:30:01 -08:00