Commit Graph

22890 Commits

Author SHA1 Message Date
Samuel Just
9f0510249d crush-map.rst: add info about multiple crush hierarchies
Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-12-14 08:21:34 -08:00
caleb miles
24523913e3 rgw_op: enforce minimum part size in multi-part uploads
Signed-off-by: caleb miles <caleb.miles@inktank.com>
2012-12-13 14:05:56 -08:00
Sage Weil
aa2214c36b mds: document EXCL -> (MIX or SYNC) transition decision
Previously (in 26f6a8e48ae575f17c850e28e969d55bceefbc0f), for reasons that
are somewhat obscured by passage of time, we did

+      if ((other_wanted & (CEPH_CAP_GRD|CEPH_CAP_GWR)) ||

But then we noticed that the loner may want to RD/WR and we are losing the
loner status for some other reason.  So just recently in
b48dfeba3f we changed it to

+      if (((other_wanted|loner_wanted) & (CEPH_CAP_GRD|CEPH_CAP_GWR)) ||

Then we noticed that a non-loner wanting to read and a loner wanting to
read (i.e., no writers!) would lead to MIX, even when we want SYNC.
So in 07b36992da we changed to

+      if (((other_wanted|loner_wanted) & CEPH_CAP_GWR) ||

This appears to be correct.  The possible choices (wrt caps wanted):

loner  other   want
R      R       SYNC
R      R|W     MIX
R      W       MIX
R|W    R       MIX
R|W    R|W     MIX
R|W    W       MIX
W      R       MIX
W      R|W     MIX
W      W       MIX

Which means any writer -> we want MIX.  We only want SYNC when there is
nobody who wants to write.  Because you can't write in SYNC.  Which in
retrospect seems obvious.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-12-13 12:48:04 -08:00
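The decision table in the commit above reduces to a single bit test. The following is a minimal sketch of that rule only, not the MDS code itself: the bit values and the function name `next_lock_state` are hypothetical, and the real condition has further `||` clauses that the commit message does not show.

```cpp
#include <cstdint>
#include <string>

// Hypothetical bit values for illustration only; the real CEPH_CAP_GRD and
// CEPH_CAP_GWR constants live in the Ceph headers and are not reproduced here.
constexpr std::uint32_t CAP_GRD = 1u << 0;  // client wants to read
constexpr std::uint32_t CAP_GWR = 1u << 1;  // client wants to write

// Sketch of the final rule: if either the loner or any other client wants to
// write, the target state is MIX; with readers only, it is SYNC.
std::string next_lock_state(std::uint32_t loner_wanted,
                            std::uint32_t other_wanted) {
  if ((other_wanted | loner_wanted) & CAP_GWR) {
    return "MIX";
  }
  return "SYNC";
}
```

Calling `next_lock_state(CAP_GRD, CAP_GRD)` corresponds to the first row of the table (two readers), and every other row has a writer on at least one side.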
Josh Durgin
83ee85b840 Merge remote branch 'origin/next'
2012-12-13 08:30:22 -08:00
Josh Durgin
e6dd0681d1 qa: echo commands run by rbd map-unmap workunit
It's hard to figure out what failed without this.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2012-12-13 08:29:10 -08:00
Sage Weil
975003bf67 auth: guard decode_decrypt with try block
This will catch buffer decoding errors (maybe the block is empty) and
return an error string.

May fix (or possibly paper over) #3459.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2012-12-12 22:01:03 -08:00
Sage Weil
448db47965 mount.fuse.ceph: strip out noauto option
mount -a uses this, but also passes it to mount.fuse.ceph, and libceph
complains:

fuse: unknown option `noauto'

Signed-off-by: Sage Weil <sage@inktank.com>
2012-12-12 21:14:38 -08:00
Sage Weil
ae100cfdbc mount.fuse.ceph: add ceph-fuse mount helper
Signed-off-by: Sage Weil <sage@inktank.com>
2012-12-12 21:14:38 -08:00
Dan Mick
ac92e4d6bd /etc/init.d/ceph: fs_type assignment syntax error
This handles the remainder of 3581; it's a lot like the problem in
mkcephfs, but it isn't mkcephfs.

Fixes: #3581
Signed-off-by: Dan Mick <dan.mick@inktank.com>
2012-12-12 19:40:16 -08:00
Sam Lang
4605fddcf6 filestore: Don't keep checking for syncfs if found
Valgrind outputs a warning for unrecognized system calls,
and does so for the syscall(__SYS_syncfs,...) and
syscall(__NR_syncfs, ...) calls.  This patch avoids making
those calls (and the warning, when run in valgrind) if the
syncfs libc call is available.

INFO:teuthology.task.ceph.osd.1.err:--10568-- WARNING: unhandled syscall: 306
INFO:teuthology.task.ceph.osd.1.err:--10568-- You may be able to write your own handler.
INFO:teuthology.task.ceph.osd.1.err:--10568-- Read the file README_MISSING_SYSCALL_OR_IOCTL.
INFO:teuthology.task.ceph.osd.1.err:--10568-- Nevertheless we consider this a bug.  Please report
INFO:teuthology.task.ceph.osd.1.err:--10568-- it at http://valgrind.org/support/bug_reports.html.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2012-12-12 15:17:48 -10:00
John Wilkins
5f55b38827 doc: Updated per comments in the mailing list.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2012-12-12 14:38:22 -08:00
Yehuda Sadeh
9d714560ee docs: better documentation of new rgw feature
Document rgw_extended_http_attrs config option.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-12-12 13:49:55 -08:00
Yehuda Sadeh
3a95d97648 rgw: configurable list of object attributes
Fixes: #3535
New object attributes are now configurable. A list
can be specified via the 'rgw extended http attrs'
config param.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-12-12 13:45:21 -08:00
Yehuda Sadeh
8708724557 rgw: option to provide alternative s3 put obj success code
Fixes: #3529
Added a new option: rgw_s3_success_create_obj_status.
Expected values are 0, 200, 201, 204. A value of 0
will skip the special handling altogether. Any value
other than those listed will default to 200.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-12-12 13:45:21 -08:00
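The value handling described for rgw_s3_success_create_obj_status can be sketched as a small normalization step. This is an illustration under assumptions, not the rgw implementation: the function name `effective_put_status` is hypothetical, and a return value of 0 is used here to mean "skip the special handling".

```cpp
// Sketch of the documented rule: 0, 200, 201 and 204 are accepted as-is
// (0 meaning "no special handling"); anything else falls back to 200.
int effective_put_status(int configured) {
  switch (configured) {
    case 0:
    case 200:
    case 201:
    case 204:
      return configured;
    default:
      return 200;
  }
}
```

For instance, a configured value of 204 is kept, while an unsupported value such as 302 would be treated as 200.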
Yehuda Sadeh
bece012caa doc: document swift compatibility
Add a table that specifies swift features compatibility

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-12-12 13:08:05 -08:00
Yehuda Sadeh
88229a49d9 docs: add rgw POST object as supported feature
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-12-12 13:08:04 -08:00
Yehuda Sadeh
54618afab2 docs: fix spacing in radosgw config-ref
Needed to add an extra empty line between header and properties.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-12-11 17:49:34 -08:00
Josh Durgin
8e6a53531b qa: exclude some more xfstests
These worked on a newer kernel, but I forgot I had not updated it for the final image.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2012-12-11 17:18:14 -08:00
Josh Durgin
f0d85b73ef Merge branch 'next'
2012-12-11 16:47:41 -08:00
Sage Weil
84f90a09ab Merge branch 'next'
2012-12-11 17:16:19 -08:00
Sage Weil
caea0cbf9f os/JournalingObjectStore: un-break op quiescing during journal replay
Commit d9dce4e927 broke journal replay
because the commit thread may try to do a commit, and the ops are not
being applied via the normal work queue.  Add back in a simpler form of the
old op quiescing (simpler because there is a single thread doing the
replay).

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
2012-12-11 17:15:56 -08:00
Yehuda Sadeh
6a8a58dc4b doc: document swift compatibility
Add a table that specifies swift features compatibility

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-12-11 17:07:37 -08:00
Yehuda Sadeh
cf28e7872e docs: add rgw POST object as supported feature
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-12-11 17:07:35 -08:00
Josh Durgin
326dd34726 Merge remote branch 'origin/wip-double-notify' into next
Reviewed-by: Sage Weil <sage.weil@inktank.com>
2012-12-11 16:39:59 -08:00
Josh Durgin
3950182268 st_rados_watch: tolerate extra notifies
With retries, it's possible for notifies to be received more than once
when they are resent to different OSDs, since the OSDs only track them
in memory.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2012-12-11 15:44:32 -08:00
Yehuda Sadeh
29307d3b32 mds: shutdown cleanly if can't authenticate
Fixes: #3590
This was triggered when we tried to run mds with cephx enabled
against a mon without cephx support. We didn't handle the
returned error at all, so this one fixes it. It also makes
sure that we don't continue initialization until rotating
keys are in place (as the osd does).

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-12-11 15:08:58 -08:00
Sage Weil
6007088c53 Merge remote-tracking branch 'gh/wip-conf' into next
Reviewed-by: Greg Farnum <greg@inktank.com>
2012-12-11 15:07:38 -08:00
Josh Durgin
c3107009f6 objecter: don't use new tid when retrying notifies
Watches update the on-disk state in the OSD, and aren't idempotent,
so refreshing them must be treated as a separate transaction by the OSD.
Notifies are just in-memory state, and resending them will result in
acceptable behavior:

- if it's the same osd, the resent op will be recognized as a duplicate
- if it's a different osd, a new notify will be triggered since the new osd
  can't tell whether the original notify was received by any watchers

Using a new tid for each resend can cause some unnecessary extra work,
as the first case turns into the second.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2012-12-11 11:13:17 -08:00
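The effect of reusing the tid on a resend can be illustrated with a toy model of in-memory duplicate detection. Everything here is hypothetical (the class `FakeOsd` and its methods are not Ceph APIs); it only shows why a resend with the same tid is cheap on the same OSD, while a fresh tid makes the resend look like a new notify.

```cpp
#include <cstdint>
#include <set>

// Toy stand-in for an OSD that remembers notify tids in memory only.
class FakeOsd {
 public:
  // Returns true if this tid triggers a new notify, false if it is
  // recognized as a duplicate of one already seen.
  bool handle_notify(std::uint64_t tid) {
    if (!seen_tids_.insert(tid).second) {
      return false;  // duplicate: nothing more to do
    }
    ++notifies_triggered_;
    return true;
  }

  int notifies_triggered() const { return notifies_triggered_; }

 private:
  std::set<std::uint64_t> seen_tids_;
  int notifies_triggered_ = 0;
};
```

Resending tid 7 to the same `FakeOsd` triggers one notify in total; sending 7 and then 8 for the same logical notify triggers two, which is the extra work the commit avoids.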
Yan, Zheng
9a40ef0131 mds: fix journaling issue regarding rstat accounting
Rename operation can call predirty_journal_parents() several times.
So a directory fragment's rstat can also be modified several times.
But only the first modification is journaled because EMetaBlob::add_dir()
does not update existing dirlump.

For example: when handling 'mv a/b/c a/c', Server::_rename_prepare may
first decrease directory a and b's nested files count by one, then
increase directory a's nested files count by one.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2012-12-11 09:06:52 -08:00
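The failure mode described above, where only the first modification of a directory is recorded, can be reproduced with a plain map. This is a sketch with hypothetical names (`Journal`, `add_dir_first_wins`, `add_dir_latest_wins`); it does not reflect the actual EMetaBlob data structures or the fix that was applied.

```cpp
#include <map>
#include <string>

// Directory name -> journaled nested file count (illustrative only).
using Journal = std::map<std::string, long>;

// Mimics the described behaviour: if the directory already has an entry,
// the new value is silently dropped (std::map::insert does not overwrite).
void add_dir_first_wins(Journal& journal, const std::string& dir, long nfiles) {
  journal.insert({dir, nfiles});
}

// Alternative that always records the most recent value.
void add_dir_latest_wins(Journal& journal, const std::string& dir, long nfiles) {
  journal[dir] = nfiles;
}
```

With a starting count of 10 for directory a, the rename in the example first records 9 and then 10. The first-wins variant leaves 9 in the journal even though the in-memory count is back to 10.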
Danny Al-Gaaf
b9d717cd34 fix build of unittest_formatter
Add CRYPTO_CXXFLAGS to unittest_formatter_CXXFLAGS to find pk11pub.h to
be included in src/common/ceph_crypto.h.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2012-12-11 08:44:11 -08:00
Danny Al-Gaaf
be372765b2 include/atomic.h: add stdlib.h for size_t
Include missing stdlib.h needed for size_t.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2012-12-11 08:44:09 -08:00
Samuel Just
bcf1461c7e Merge remote-tracking branch 'upstream/wip_split2' into next
Reviewed-by: Greg Farnum <greg@inktank.com>
2012-12-10 22:00:36 -08:00
Samuel Just
1699b7dc5e OSD: get_or_create_pg doesn't need an op passed in
Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-12-10 19:03:55 -08:00
Samuel Just
6a4fa89afa LFNIndex: fix move_subdir comments
Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-12-10 17:45:02 -08:00
Samuel Just
fdcdca7d68 HashIndex: fix typo in reset_attr documentation
Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-12-10 17:40:10 -08:00
Samuel Just
7eac96827e HashIndex: init exists in col_split_level and reset_attr
Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-12-10 17:39:13 -08:00
Samuel Just
12673c24f2 PrioritizedQueue: increment ret when removing items from list
Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-12-10 17:31:44 -08:00
Samuel Just
80cca214b9 PrioritizedQueue: move if check out of loop in filter_list_pairs
Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-12-10 17:30:59 -08:00
Sage Weil
331c25046e Merge remote-tracking branch 'gh/next'
2012-12-10 17:08:26 -08:00
Sage Weil
a50c7d3b2f config: do not always print config file missing errors
Do not generate errors each time we fail to open a config file; only
generate one at the end if a search path was specified and none were
usable, right before we (already) exit.  This avoids spamming stderr
about each path we tried in the search list before we found a good one.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-12-10 16:41:19 -08:00
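The reporting policy in the commit above (stay quiet per path, complain once at the end) might look roughly like the following. This is a sketch, not the Ceph config code: `SearchResult`, `find_config`, the `can_open` callback and the error text are all made up for illustration.

```cpp
#include <functional>
#include <string>
#include <vector>

struct SearchResult {
  std::string chosen;  // first usable path, empty if none
  std::string error;   // single summary error, empty on success
};

// Try each path in order. No error is produced for individual misses;
// one error is produced only if a search path was given and nothing in it
// was usable.
SearchResult find_config(
    const std::vector<std::string>& paths,
    const std::function<bool(const std::string&)>& can_open) {
  SearchResult result;
  for (const auto& path : paths) {
    if (can_open(path)) {
      result.chosen = path;
      return result;
    }
    // Deliberately silent here: an earlier miss is not worth reporting
    // if a later path works.
  }
  if (!paths.empty()) {
    result.error = "no usable config file found in search path";
  }
  return result;
}
```

Passing the open check as a callback keeps the sketch independent of the filesystem; a real caller would presumably attempt to open and parse each file instead.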
Sage Weil
6fb9a5580e config: always complain about config parse errors
Complain about config parsing errors even when it is the default
config file.

We may also want to fail instead of continuing, but that is a separate
issue.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-12-10 14:44:16 -08:00
Sage Weil
e4d0aeace1 Merge remote-tracking branch 'gh/wip-filestore2' into next
Reviewed-by: Sam Just <sam.just@inktank.com>
2012-12-10 14:34:07 -08:00
John Wilkins
2e7cba7bca doc: fixed indent in python example.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2012-12-10 14:14:09 -08:00
Samuel Just
788992bbf5 config_opts.h: adjust recovery defaults
osd max backfills: 5 was too low for a default, 10
 seems to work better in testing.  The message
 priority system should minimize disruption of
 push and pull operations anyway.

osd recovery max chunk: 1MB was too small for a
 default.  8MB is reasonable for a single push
 and will allow us to recover an rbd block in
 one push rather than 4, reducing client io
 latency during log-based recovery.

osd recovery op priority: 10 rather than 30 will
 further reduce the client io latency impact of
 push and pull operations.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-12-10 13:53:10 -08:00
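The "one push rather than 4" figure follows from a ceiling division. The sketch below assumes a 4 MB rbd block, which is what the numbers in the message imply, and treats MB as 1024 * 1024 bytes; `pushes_needed` is a hypothetical helper, not a Ceph function.

```cpp
#include <cstdint>

constexpr std::uint64_t MB = 1024 * 1024;

// Number of pushes needed to recover one object when each push carries at
// most max_chunk_bytes: ceil(object_bytes / max_chunk_bytes).
std::uint64_t pushes_needed(std::uint64_t object_bytes,
                            std::uint64_t max_chunk_bytes) {
  return (object_bytes + max_chunk_bytes - 1) / max_chunk_bytes;
}
```

Under these assumptions, `pushes_needed(4 * MB, 1 * MB)` gives 4 with the old default and `pushes_needed(4 * MB, 8 * MB)` gives 1 with the new one.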
Sage Weil
45865285e7 Merge remote-tracking branch 'gh/wip-3559' into next
Reviewed-by: Sage Weil <sage@inktank.com>
2012-12-10 12:55:14 -08:00
John Wilkins
f4be3c8d98 doc: Added sudo to ceph -k command.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2012-12-10 10:51:24 -08:00
John Wilkins
3709519519 doc: Fixed typo.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2012-12-10 10:24:32 -08:00
Yehuda Sadeh
47c81a3baa Makefile.am: add missing flags to some tests targets
Add CRYPTO_CXXFLAGS to some targets. This is required when
building --with-nss.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-12-10 10:19:02 -08:00
Sage Weil
333b3f43b5 mon: fix leak of pool op reply data
We pass a pointer because it is an optional argument, but we shouldn't
put the bufferlist on the heap or else we have to manage its life
cycle, and that's fragile (and previously broken).

Signed-off-by: Sage Weil <sage@inktank.com>
2012-12-08 21:44:54 -08:00
Sage Weil
f66fe7783e os/JournalingObjectStore: simplify op_submitting sanity check
A list is overkill; just use a seq and make sure it increments to ensure
the op_submit_finish calls are in order.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-12-08 09:32:47 -08:00
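An ordering check based on a sequence number, as the commit above describes, could be sketched as follows. The class and method names are hypothetical and the real JournalingObjectStore code may differ; the point is only that two counters are enough to detect an out-of-order finish.

```cpp
#include <cstdint>

// Hands out increasing sequence numbers on start() and verifies that
// finish() is called in exactly the same order.
class SubmitOrderCheck {
 public:
  // Assign the next sequence number to a newly submitted op.
  std::uint64_t start() { return ++last_started_; }

  // Returns true if `op` is the next one expected to finish; an
  // out-of-order call returns false and leaves the state unchanged.
  bool finish(std::uint64_t op) {
    if (op != last_finished_ + 1) {
      return false;
    }
    last_finished_ = op;
    return true;
  }

 private:
  std::uint64_t last_started_ = 0;
  std::uint64_t last_finished_ = 0;
};
```

In real code the failed check would more likely be an assertion than a boolean return; the return value is used here so the behaviour can be observed without aborting.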