22928 Commits

Author SHA1 Message Date
John Wilkins
5e95510380 doc: Added REST Gateway link to 5-minute Quick Start.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2012-12-19 14:00:55 -08:00
John Wilkins
c2b231e416 doc: Updated the 5-minute Quick Start for Bobtail.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2012-12-19 13:52:20 -08:00
John Wilkins
f596cee7fc doc: Updated Block Device Quick Start for Bobtail.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2012-12-19 13:47:11 -08:00
John Wilkins
60b2857dee doc: Updated CephFS Quick Start for Bobtail.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2012-12-19 13:46:28 -08:00
John Wilkins
d17bd3840f doc: Added authentication and mkcephfs settings for Bobtail.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2012-12-19 13:45:34 -08:00
John Wilkins
cd5c82db9b doc: Added javascript code block tag.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2012-12-19 13:36:17 -08:00
Yehuda Sadeh
04e7a5ca13 rgw: configurable exit timeout
Fixes: #3638

rgw exit timeout secs : number of seconds to wait for process
to exit cleanly before forcing exit. If set to 0, it will wait
indefinitely.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-12-18 14:12:29 -08:00
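
The timeout semantics described in 04e7a5ca13 (wait N seconds for a clean exit, with 0 meaning wait indefinitely) could be sketched roughly as follows. This is only an illustration in Python, not the rgw code; the function name `wait_for_clean_exit` and the use of a `threading.Event` are assumptions.

```python
import threading


def wait_for_clean_exit(done, timeout_secs):
    """Wait for `done` to be set.

    Returns True if the clean exit completed in time, False if the caller
    should go ahead and force the exit.  `done` is a threading.Event that
    the (hypothetical) shutdown path sets when it finishes.
    """
    # A configured value of 0 means "wait indefinitely", which
    # Event.wait() expresses as timeout=None.
    timeout = None if timeout_secs == 0 else timeout_secs
    return done.wait(timeout)
```

A caller would force the exit only when this returns False.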
Sage Weil
21c47c6a89 osd: debug EMSGSIZE / OSD_WRITETOOBIG
Signed-off-by: Sage Weil <sage@inktank.com>
2012-12-17 14:39:59 -08:00
Josh Durgin
e8b8531ee0 doc: fix typo in config file
The option is host, not hostname

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2012-12-17 07:57:34 -08:00
Sage Weil
601a6c936d Merge remote-tracking branch 'gh/next'
2012-12-14 17:08:35 -08:00
Greg Farnum
1ec70aa0dd qa: add a workunit for fsync-tester
It turns out that our suites don't exercise fsync, at least not very much
(I couldn't find it in all the places I looked for it). This tester
was written by Ted Ts'o and updated by Chris Mason; I just made it
work on a smaller dataset (256MB) because 8GB against a small cluster takes
more time than we want to wait.

Signed-off-by: Greg Farnum <greg@inktank.com>
2012-12-14 15:24:36 -08:00
Noah Watkins
286dcbeb55 test: remove underscores from cephfs test names
Google Test documentation strongly suggests avoiding underscores in
unit test names to avoid accidental conflicts with their macro naming
scheme.

http://code.google.com/p/googletest/wiki/FAQ#Why_should_not_test_case_names_and_test_names_contain_underscore

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
2012-12-14 15:31:00 -08:00
Sage Weil
79db5a40c7 Merge branch 'wip_watch' into next
2012-12-14 14:32:44 -08:00
Sam Lang
a7de975d93 lockdep: Decrease lockdep backtrace skip by 1
Skipping the top 4 (it starts at 0) calls in the
backtrace actually skips the call that does the lock.
Skip 3 instead.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2012-12-14 14:17:40 -08:00
Sage Weil
641b077f9e mkcephfs: fix == -> =
Another bashism.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-12-14 14:20:00 -08:00
Alex Elder
bf01b7b2e2 map-unmap.sh: use udevadm settle for synchronization
This script was heuristically using short sleep commands in order to
give udev activity time to complete.

There's a command "udevadm settle" which actually looks at the udev
queue and waits until its processing is done.  Much, much better.

This rearranges the get_id function a bit too, breaking it into one
function that gets the id and another that loops back and tries
again after a short delay in the event the get_id fails.

Signed-off-by: Alex Elder <elder@inktank.com>
2012-12-14 15:58:39 -06:00
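
The split described in bf01b7b2e2 (one function that fetches the id, another that retries after a short delay when the fetch fails) follows a common retry pattern. A minimal sketch in Python, assuming a hypothetical `get_id` callable that raises `LookupError` while the device id is not yet available; the real script is shell and may differ in detail:

```python
import time


def get_id_with_retry(get_id, attempts=5, delay=0.5):
    """Call get_id() until it succeeds or the attempts run out.

    `get_id`, `attempts`, and `delay` are illustrative names and defaults,
    not taken from map-unmap.sh.
    """
    for attempt in range(attempts):
        try:
            return get_id()
        except LookupError:
            # Re-raise on the final attempt; otherwise pause and try again.
            if attempt == attempts - 1:
                raise
            time.sleep(delay)
```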
Sage Weil
c728171b91 Merge branch 'wip-upstart' into next
Reviewed-by: Greg Farnum <greg@inktank.com>
2012-12-14 13:51:19 -08:00
Sage Weil
02aca6830a ceph-disk-activate: mark dir as upstart-managed
Mark the directory so that upstart will manage the daemon.  Eventually,
this should be generalized to allow ceph-disk-* usage with other init
systems.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-12-14 13:49:26 -08:00
Sage Weil
96f40b146b upstart: make starter jobs consistent
Signed-off-by: Sage Weil <sage@inktank.com>
2012-12-14 13:49:26 -08:00
Sage Weil
e597482f29 upstart: only start when 'upstart' file exists in daemon dir
We need to distinguish between daemons managed by upstart and sysvinit
(and, eventually, systemd).  Only start daemons when 'upstart' is present.

Note that sysvinit will only start daemons when the 'host = ...' line is
in ceph.conf, so there is a similar "opt-in".

Signed-off-by: Sage Weil <sage@inktank.com>
2012-12-14 13:49:26 -08:00
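
The opt-in rule from e597482f29 amounts to checking for a marker file in the daemon directory. A small sketch in Python, assuming only what the message states (a file named 'upstart' in the daemon dir); the helper name is hypothetical and the real check lives in the upstart job files:

```python
import os


def managed_by_upstart(daemon_dir):
    """Return True if the daemon directory carries the 'upstart' marker."""
    return os.path.exists(os.path.join(daemon_dir, "upstart"))
```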
Samuel Just
6ab7db6717 ReplicatedPG: use default priority for Backfill messages
Backfill messages modify the stats on the replica and therefore
must be sent with the same priority as sub_op_modify to ensure
ordering.  Using recovery_op_priority caused the following
sequence:

1) Primary(1) sends MOSDPGBackfill FINISH with updated stats (v1)
2) Primary(1) sends SubOp modify for new client op with stats (v2)
3) Replica(2) receives SubOp with stats (v2)
4) Replica(2) receives MOSDPGBackfill FINISH with stats (v1)
5) Replica(2) responds and Primary(1) resets pgtemp making
    Replica(2) Primary(2)
6) PG stats on Primary(2) several ops old.

Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2012-12-14 13:38:51 -08:00
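
The reordering in steps 1-4 of 6ab7db6717 can be reproduced with a toy priority queue. The sketch below is a simplified model in Python, not the OSD's actual queueing code; the numeric priorities and message labels are made up for illustration.

```python
import heapq
import itertools


def delivery_order(messages):
    """Return message labels in the order a priority queue delivers them.

    `messages` is a list of (priority, label) pairs in send order.  Higher
    priority is delivered first; equal priorities keep their send order.
    """
    counter = itertools.count()  # tie-breaker that preserves send order
    heap = []
    for priority, label in messages:
        heapq.heappush(heap, (-priority, next(counter), label))
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]
```

With a lower priority on the backfill message, the later sub-op overtakes it; with equal priorities, send order is preserved, which is the property the commit relies on.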
Samuel Just
7e1335691d ReplicatedPG: do not use priority from client op
There are internal ordering requirements which may be sensitive
to assigned priority.  We don't want a mix of priorities from
old clients with priorities from new clients causing trouble.

Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2012-12-14 13:38:37 -08:00
Sam Lang
b63940caa1 Merge branch 'wip-3610' into next
2012-12-14 09:00:24 -10:00
Greg Farnum
8d73f3e946 Fix comment in sample.ceph.conf
Signed-off-by: Greg Farnum <greg@inktank.com>
2012-12-14 09:53:38 -08:00
Samuel Just
9f0510249d crush-map.rst: add info about multiple crush hierarchies
Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-12-14 08:21:34 -08:00
Sam Lang
f16e571757 client: Add config option to inject sleep for tick
Testing the tick delay with a fork/suspend is causing
corruption in the lockdep code.  This approach uses
a config option to sleep the tick thread for a number
of seconds, avoiding the entire fork/suspend mess.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2012-12-13 17:49:43 -10:00
Josh Durgin
8cf367cb79 rbd.py: check for new librbd methods before use
This way attempting to use format 2 images works when you upgrade the
python bindings before librbd, and attempting to use functions
that librbd does not have results in more understandable errors.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2012-12-13 17:43:23 -08:00
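
The idea in 8cf367cb79 (probe for a library function before calling it, and fail with a clear error otherwise) might look like the following sketch. The names `FunctionNotSupported` and `call_if_available` are hypothetical and are not claimed to match rbd.py. The probe uses `getattr` with a default, which also works for `ctypes` library objects, since a missing symbol raises `AttributeError` there.

```python
class FunctionNotSupported(Exception):
    """Hypothetical error for a call the loaded library does not export."""


def call_if_available(lib, name, *args):
    """Call lib.<name>(*args) if present, else raise a readable error."""
    func = getattr(lib, name, None)
    if func is None:
        raise FunctionNotSupported(
            "installed library does not provide %s()" % name)
    return func(*args)
```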
Sage Weil
c9894ff0e5 osd: up != acting okay on mkpg
This can happen when:

 - mon sends create pg
 - it gets created
 - osd remaps the pg to a different osd
     but osd does not update pg status to the mon
 - mkpg resent to the new osd

or something along those lines.  It seems unusual, but in the end who
really cares why the mon doesn't know about the pg creation yet.

Note that this check was added in the initial commit where acting/up was
added; there is no specific condition of concern we are protecting against.

Instead, ignore the message.  We'll get a query soon anyway.

This 'fixes' #3614.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
2012-12-13 16:27:09 -08:00
Joao Eduardo Luis
e3ed28eb2d mon: OSDMonitor: don't allow creation of pools with > 65535 pgs
There are some limitations to the number of possible pg's per pool, and
by allowing the 'osd pool create' command to succeed, we were making room
for some anomalous behavior.

Fixes: #3617

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2012-12-13 15:38:55 -08:00
Dan Mick
8103414a45 rbd: handle images disappearing while in ls -l
rbd.list() returns a list of names, but nothing stops them from
going away before rbd.open(); check for ENOENT and ignore if that
happens; warn on other errors

Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2012-12-13 14:46:53 -08:00
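
The race handled in 8103414a45 (an image listed a moment ago may be gone by the time it is opened) can be sketched as below. This is an illustration only: `open_image` is a hypothetical callable, and `OSError` with `errno.ENOENT` stands in for whatever exception the real bindings raise.

```python
import errno
import sys


def long_listing(names, open_image):
    """Collect (name, info) rows, skipping images that vanished meanwhile."""
    rows = []
    for name in names:
        try:
            info = open_image(name)
        except OSError as err:
            # ENOENT: the image disappeared after the listing; ignore it.
            # Any other error gets a warning, and the listing continues.
            if err.errno != errno.ENOENT:
                sys.stderr.write("warning: %s: %s\n" % (name, err))
            continue
        rows.append((name, info))
    return rows
```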
caleb miles
24523913e3 rgw_op: enforce minimum part size in multi-part uploads
Signed-off-by: caleb miles <caleb.miles@inktank.com>
2012-12-13 14:05:56 -08:00
Sage Weil
aa2214c36b mds: document EXCL -> (MIX or SYNC) transition decision
Previously (in 26f6a8e48ae575f17c850e28e969d55bceefbc0f), for reasons that
are somewhat obscured by passage of time, we did

+      if ((other_wanted & (CEPH_CAP_GRD|CEPH_CAP_GWR)) ||

But then we noticed that the loner may want to RD/WR and we are losing the
loner status for some other reason.  So just recently in
b48dfeba3f we changed it to

+      if (((other_wanted|loner_wanted) & (CEPH_CAP_GRD|CEPH_CAP_GWR)) ||

Then we noticed that a non-loner wanting to read and a loner wanting to
read (i.e., no writers!) would lead to MIX, even when we want SYNC.
So in 07b36992da we changed to

+      if (((other_wanted|loner_wanted) & CEPH_CAP_GWR) ||

This appears to be correct.  The possible choices (wrt caps wanted):

loner  other   want
R      R       SYNC
R      R|W     MIX
R      W       MIX
R|W    R       MIX
R|W    R|W     MIX
R|W    W       MIX
W      R       MIX
W      R|W     MIX
W      W       MIX

Which means any writer -> we want MIX.  We only want SYNC when there is
nobody who wants to write.  Because you can't write in SYNC.  Which in
retrospect seems obvious.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-12-13 12:48:04 -08:00
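
The table in aa2214c36b reduces to a single test: if either side wants to write, choose MIX, otherwise SYNC. A small Python sketch of that rule follows; the bit values `CAP_RD` and `CAP_WR` are illustrative stand-ins, not the real CEPH_CAP_GRD / CEPH_CAP_GWR constants.

```python
CAP_RD = 1  # illustrative bit for "wants read"
CAP_WR = 2  # illustrative bit for "wants write"


def wanted_state(loner_wanted, other_wanted):
    """Return "MIX" if anyone wants to write, else "SYNC"."""
    if (loner_wanted | other_wanted) & CAP_WR:
        return "MIX"
    return "SYNC"
```

Only the R/R row yields SYNC; every row with a writer on either side yields MIX.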
Samuel Just
97cc55d599 OSD: put connection in disconnect_session_watches as well as the session
obc->watchers now has a ref to the connection as well.  This piece of
disconnect_session_watches essentially parallels remove_watcher and
should generally do the same thing.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-12-13 10:52:52 -08:00
Samuel Just
f2c083efd5 OSD: disconnect_session_watches obc might not be valid after we relock
If disconnect_session_watches races with watch removal, the session
might no longer have a valid obc ref.  In that case, move on to
the next obc.

Note, there is no danger of any obcs being *added* to the session
since the session/connection at this point is dead.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-12-13 10:52:52 -08:00
Greg Farnum
c17d628b52 clarify/correct some of sample.ceph.conf
Signed-off-by: Greg Farnum <greg@inktank.com>
2012-12-13 10:39:19 -08:00
Josh Durgin
83ee85b840 Merge remote branch 'origin/next'
2012-12-13 08:30:22 -08:00
Josh Durgin
e6dd0681d1 qa: echo commands run by rbd map-unmap workunit
It's hard to figure out what failed without this.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2012-12-13 08:29:10 -08:00
Sage Weil
975003bf67 auth: guard decode_decrypt with try block
This will catch buffer decoding errors (maybe the block is empty) and
return an error string.

May fix (or possibly paper over) #3459.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2012-12-12 22:01:03 -08:00
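
The guard described in 975003bf67 (catch a decoding failure, such as an empty block, and return an error string instead of letting the exception escape) can be illustrated in Python. The length-prefixed format and both function names here are assumptions made for the example; they do not reflect Ceph's actual encoding.

```python
import struct


def decode_block(blob):
    """Decode a hypothetical block: 4-byte little-endian length + payload."""
    (length,) = struct.unpack_from("<I", blob, 0)
    payload = blob[4:4 + length]
    if len(payload) != length:
        raise ValueError("truncated payload")
    return payload


def decode_guarded(blob):
    """Return (payload, "") on success or (None, error_string) on failure."""
    try:
        return decode_block(blob), ""
    except (struct.error, ValueError) as err:
        return None, "error decoding block for decryption: %s" % err
```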
Sage Weil
448db47965 mount.fuse.ceph: strip out noauto option
mount -a uses this, but also passes it to mount.fuse.ceph, and libceph
complains:

fuse: unknown option `noauto'

Signed-off-by: Sage Weil <sage@inktank.com>
2012-12-12 21:14:38 -08:00
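
Stripping one option from a comma-separated mount option string, as 448db47965 does for noauto, can be sketched like this. The actual helper is a shell script; this Python version and the name `strip_noauto` are only illustrative.

```python
def strip_noauto(options):
    """Drop 'noauto' from a comma-separated mount option string."""
    kept = [opt for opt in options.split(",") if opt and opt != "noauto"]
    return ",".join(kept)
```

For example, "rw,noauto,id=admin" becomes "rw,id=admin".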
Sage Weil
ae100cfdbc mount.fuse.ceph: add ceph-fuse mount helper
Signed-off-by: Sage Weil <sage@inktank.com>
2012-12-12 21:14:38 -08:00
Dan Mick
ac92e4d6bd /etc/init.d/ceph: fs_type assignment syntax error
This handles the remainder of 3581; it's a lot like the problem in
mkcephfs, but it isn't mkcephfs.

Fixes: #3581
Signed-off-by: Dan Mick <dan.mick@inktank.com>
2012-12-12 19:40:16 -08:00
Sam Lang
4605fddcf6 filestore: Don't keep checking for syncfs if found
Valgrind outputs a warning for unrecognized system calls,
and does so for the syscall(__SYS_syncfs,...) and
syscall(__NR_syncfs, ...) calls.  This patch avoids making
those calls (and the warning, when run in valgrind) if the
syncfs libc call is available.

INFO:teuthology.task.ceph.osd.1.err:--10568-- WARNING: unhandled syscall: 306
INFO:teuthology.task.ceph.osd.1.err:--10568-- You may be able to write your own handler.
INFO:teuthology.task.ceph.osd.1.err:--10568-- Read the file README_MISSING_SYSCALL_OR_IOCTL.
INFO:teuthology.task.ceph.osd.1.err:--10568-- Nevertheless we consider this a bug.  Please report
INFO:teuthology.task.ceph.osd.1.err:--10568-- it at http://valgrind.org/support/bug_reports.html.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2012-12-12 15:17:48 -10:00
Samuel Just
dba096073a OSD: pg might be removed during disconnect_session_watches
We don't hold the osd_lock between the session->watches traversal
and the obc checks.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-12-12 15:30:04 -08:00
Samuel Just
047aecd90f PG,ReplicatedPG: handle_watch_timeout must not write during scrub/degraded
Currently, handle_watch_timeout will gladly write to an object while
that object is degraded or is being scrubbed.  Now, we queue a
callback to be called on scrub completion or finish_degraded_object
to recall handle_watch_timeout.  The callback mechanism assumes that
registered callbacks expect to be called with the pg lock -- and no
other locks -- already held.

The callback will release the obc and pg refs unconditionally.  Thus,
we need to replace the unconnected_watchers pointer with NULL to
ensure that unregister_unconnected_watcher fails to cancel the
event and does not release the resources a second time.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-12-12 15:30:00 -08:00
Samuel Just
0dfe6c84f0 ReplicatedPG:, remove_notify, put session after con
Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-12-12 14:51:34 -08:00
Samuel Just
695bb3b0e2 ReplicatedPG: only put if we cancel evt in unregister_unconnected_watcher
If we fail to cancel the callback, the callback will fire and
release those resources.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-12-12 14:50:18 -08:00
Samuel Just
fdf66b6a8d ReplicatedPG: watchers must grab Connection ref as well
Session refs are not really valid on their own; the
corresponding Connection must remain live for at least
as long as the Session.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-12-12 14:50:15 -08:00
John Wilkins
5f55b38827 doc: Updated per comments in the mailing list.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2012-12-12 14:38:22 -08:00
Yehuda Sadeh
9d714560ee docs: better documentation of new rgw feature
Document rgw_extended_http_attrs config option.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-12-12 13:49:55 -08:00
Yehuda Sadeh
3a95d97648 rgw: configurable list of object attributes
Fixes: #3535
New object attributes are now configurable. A list
can be specified via the 'rgw extended http attrs'
config param.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-12-12 13:45:21 -08:00