26805 Commits

Author SHA1 Message Date
Dan Mick
68b5fa9b61 ceph-fuse: older libfuses don't support FUSE_IOCTL_COMPAT
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-06-12 09:43:12 -07:00
Peter Wienemann
1577e203f0 ceph-create-keys: Make sure directories for admin and bootstrap keys exist
Signed-off-by: Peter Wienemann <wienemann@physik.uni-bonn.de>
2013-06-12 08:40:25 -07:00
Samuel Just
256afa072d store_test: create_collection prior to split
Fixes: #5310
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: David Zafman <david.zafman@inktank.com>
2013-06-11 16:45:17 -07:00
Sage Weil
1a9415a015 mon: adjust trim defaults
User testing has shown that smaller values yield better results; see #4917.
Jim's testing has had good results with even more aggressive trimming, but I
would like to do more validation yet before changing defaults.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-11 16:30:41 -07:00
John Wilkins
5f0007e6a9 doc: Reworked the landing page.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-06-11 15:32:23 -07:00
John Wilkins
dc6cadc34a doc: Added a hostname resolution section for local host execution.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-06-11 14:46:35 -07:00
John Wilkins
f6c51b486d doc: Added some tips and re-organized to simplify the process.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-06-11 14:46:12 -07:00
Sage Weil
9b012e234a client: set issue_seq (not seq) in cap release
We have regularly been observing a stall where the MDS is blocked waiting
for a cap revocation (Ls, in our case) and never gets a reply.  We finally
tracked down the sequence:

 - mds issues cap seq 1 to client
 - mds does revocation (seq 2)
 - client replies
 - much time goes by
 - client trims inode from cache, sends release with seq == 2
 - mds ignores release because its issue_seq is 1
 - mds later tries to revoke other caps
 - client discards message because it doesn't have the inode in cache

The problem is simply that we are using seq instead of issue_seq in the
cap release message.  Note that the other release call site in
encode_inode_release() is correct.  That one is much more commonly
triggered by short tests, as compared to this case where the inode needs to
get pushed out of the client cache.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2013-06-11 13:56:45 -07:00
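
The sequence described in commit 9b012e234a above can be replayed with a small toy model. This is only a sketch of the bookkeeping as the commit message describes it, not the Ceph client or MDS code: the classes ToyMDS and ToyClient, their methods, and the replay() helper are all hypothetical names introduced for illustration.

```python
# Toy model of the sequence in commit 9b012e234a. Every name here is
# hypothetical; this is not the Ceph client or MDS code.

class ToyMDS:
    def __init__(self):
        self.seq = 0          # bumped on every cap message (issue or revoke)
        self.issue_seq = 0    # seq of the most recent issue only
        self.client_has_cap = False

    def issue(self):
        self.seq += 1
        self.issue_seq = self.seq
        self.client_has_cap = True
        return self.seq

    def revoke(self):
        self.seq += 1
        return self.seq

    def handle_release(self, release_seq):
        # A release naming anything other than the current issue is ignored.
        if release_seq != self.issue_seq:
            return False
        self.client_has_cap = False
        return True


class ToyClient:
    def __init__(self):
        self.seq = 0
        self.issue_seq = 0

    def on_issue(self, seq):
        self.seq = seq
        self.issue_seq = seq

    def on_revoke(self, seq):
        self.seq = seq        # issue_seq is deliberately left unchanged

    def build_release(self, use_issue_seq):
        return self.issue_seq if use_issue_seq else self.seq


def replay(use_issue_seq):
    """Replay issue -> revoke -> trim/release; return True if the MDS accepts."""
    mds, client = ToyMDS(), ToyClient()
    client.on_issue(mds.issue())      # mds issues cap, seq 1
    client.on_revoke(mds.revoke())    # mds revokes, seq 2
    return mds.handle_release(client.build_release(use_issue_seq))


print(replay(use_issue_seq=False))  # buggy path: release carries seq
print(replay(use_issue_seq=True))   # fixed path: release carries issue_seq
```

In this model the release that carries seq is ignored by the MDS, while the release that carries issue_seq is accepted, which mirrors the stall and the fix described in the commit message.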
John Wilkins
c7fb7a3f46 doc: Added some Java S3 API troubleshooting entries.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-06-11 12:12:46 -07:00
John Wilkins
6c557d569d doc: Added install ceph-common instruction.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-06-11 12:11:51 -07:00
John Wilkins
5543f19c25 doc: Added install ceph-common instruction.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-06-11 12:11:26 -07:00
John Wilkins
3f3ad61fae doc: Fixed :term" syntax.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-06-11 12:10:52 -07:00
Gary Lowell
0948624f3e ceph-create-keys: Remove unused caps parameter on bootstrap_key()
The caps parameter was removed except for one place.

Signed-off-by: Gary Lowell  <gary.lowell@inktank.com>
2013-06-11 08:25:36 -07:00
Sage Weil
4682636f96 Merge branch 'next'
2013-06-10 23:22:49 -07:00
Sage Weil
3f2017fb17 osd: fix con -> session ref change after hb reset
set_priv() expects to be given a reference to own; take one.  This fixes
various crashes after we see a hb connection reset.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-10 23:22:26 -07:00
Sage Weil
a378c4d11e common/admin_socket: fix leak of new m_getdescs_hook
Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-10 21:56:13 -07:00
Sage Weil
6bab425342 common/cmdparse: no need to use (and leak to) the heap
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
2013-06-10 21:56:03 -07:00
Dan Mick
5c945cd1b8 CrushWrapper: dump tunables along with crush map
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-06-10 18:25:39 -07:00
Dan Mick
0e0e896e8c ceph: --keyring must be passed to parse_argv, which means not argparse
If argparse gets its hands on it, it's not available for parse_argv()
and is therefore ignored.

Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-06-10 18:09:05 -07:00
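
The effect described in commit 0e0e896e8c above can be seen with the standard argparse module alone. The sketch below is not the ceph CLI: the split_args() helper and the sample argument list are made up for illustration. It only shows that an option declared to argparse is consumed by it, while an undeclared one is passed through in the leftover list that a later parser (such as parse_argv() in the commit) could still see.

```python
import argparse


def split_args(argv, declare_keyring):
    """Return the arguments argparse leaves behind for a later parser.

    Hypothetical helper: 'declare_keyring' toggles whether argparse is
    told about --keyring at all.
    """
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--verbose", action="store_true")
    if declare_keyring:
        parser.add_argument("--keyring")
    # parse_known_args() returns (namespace, list_of_unrecognised_args).
    _parsed, remaining = parser.parse_known_args(argv)
    return remaining


argv = ["--keyring", "/tmp/example.keyring", "--verbose", "status"]

# argparse owns --keyring: it is swallowed and never reaches the leftovers.
print(split_args(argv, declare_keyring=True))

# argparse does not know --keyring: it stays available for the next parser.
print(split_args(argv, declare_keyring=False))
```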
Samuel Just
8190b439b7 OSD: create collection in handle_pg_create before _create_lock_pg
Fixes: #5270
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-06-10 17:16:09 -07:00
Dan Mick
af92b9a4d9 Objecter: fail osd_command if OSD is down
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-06-10 17:06:04 -07:00
Dan Mick
a741aa073a mon: send "osd create" output to stdout; tests rely on it
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-06-10 17:06:04 -07:00
athanatos
01944ab900 Merge pull request #349 from dachary/wip-5213
unit tests for PGLog::merge_log

Reviewed-by: Sam Just <sam.just@inktank.com>
2013-06-10 16:08:46 -07:00
Sage Weil
0fe4bc09cc Merge pull request #350 from ceph/wip-osd-scrub-chunk
Reviewed-by: Samuel Just <sam.just@inktank.com>
2013-06-10 09:50:52 -07:00
Loic Dachary
8f141c451b unit tests for PGLog::rewind_divergent_log
The tests cover 100% of the LOC of rewind_divergent_log. There are
three situations:

 * throw an assert because the data is inconsistent

 * special case when the entire log is divergent

 * regular workflow where all divergent entries are run to
   merge_old_entry

http://tracker.ceph.com/issues/5213 refs #5213

Signed-off-by: Loic Dachary <loic@dachary.org>
2013-06-10 14:08:51 +02:00
Loic Dachary
04e89a4012 unit tests for PGLog::merge_log
The tests cover 100% of the LOC of merge_log. They are broken down
into 7 cases to enumerate all the situations it must address. Each case
is isolated in an independent code block where the conditions are
reproduced. Where possible and sensible to read, a code block covers
as many lines as possible. For instance:

  The log entry (1,3) deletes the object x9 but the olog entry (2,3)
  modifies it and is authoritative : the log entry (1,3) is divergent.

is the only test case covering a dozen "if" statements and half a
dozen "while/for" loops. It covers all the lines but it would be
useful to create other scenarios in the future.

Each test is made of a comment describing the test case, the
definition of the data structures to create the desired conditions, a
sequence of EXPECT_* checking that they are met, a single call to
merge_log and another sequence of EXPECT_* ( ordered to be easy to
compare with the first sequence ) checking all the desired side
effects.

The TestPGLog.cc file was untabified to improve the display of ascii
art when it is output as part of a diff.

http://tracker.ceph.com/issues/5213 refs #5213

Signed-off-by: Loic Dachary <loic@dachary.org>
2013-06-10 14:08:51 +02:00
Sage Weil
6ce23541d9 messages/MMonProbe: fix uninit vars (again)
Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-08 21:39:05 -07:00
Sage Weil
10bfa8350c osdc/Objecter: clear osd session command ops xlist on close
Clear the command ops list, just as we do the ops and linger_ops xlists.
This fixes a crash like this on shutdown:

2013-06-07 23:06:21.089275 7fc7e8655700 10 client.4124.objecter close_session for osd.0
2013-06-07 23:06:21.089279 7fc7e8655700  1 -- 10.3.64.22:0/1026843 mark_down 0x7fc7e0001260 -- 0x7fc7e0001000
./include/xlist.h: In function 'xlist<T>::~xlist() [with T = Objecter::CommandOp*]' thread 7fc7e8655700 time 2013-06-07 23:06:21.089401
./include/xlist.h: 69: FAILED assert(_size == 0)
 ceph version 0.63-553-ge8300d0 (e8300d0afb)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x80) [0x7fc7ec111f14]
 2: (xlist<Objecter::CommandOp*>::~xlist()+0x36) [0x7fc7ec09f95e]
 3: (Objecter::OSDSession::~OSDSession()+0x1d) [0x7fc7ec09e451]
 4: (Objecter::close_session(Objecter::OSDSession*)+0x1fc) [0x7fc7ec08a146]
 5: (Objecter::handle_osd_map(MOSDMap*)+0xe68) [0x7fc7ec087864]
 6: (librados::RadosClient::_dispatch(Message*)+0x84) [0x7fc7ec0615f0]
 7: (librados::RadosClient::ms_dispatch(Message*)+0x16b) [0x7fc7ec0613c1]
 8: (Messenger::ms_deliver_dispatch(Message*)+0x8c) [0x7fc7ec21d0f6]
 9: (DispatchQueue::entry()+0x52f) [0x7fc7ec21c653]
 10: (DispatchQueue::DispatchThread::entry()+0x1c) [0x7fc7ec2bdcc2]
 11: (Thread::_entry_func(void*)+0x23) [0x7fc7ec34d8cd]
 12: (()+0x6b50) [0x7fc7eb2feb50]
 13: (clone()+0x6d) [0x7fc7eab2c6dd]

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-08 21:38:43 -07:00
Sage Weil
81a786e9e5 librados: fix pg command test
Stat a bunch of (non-existent) random objects in the pool to ensure the
pg exists on the OSD before we assert that we get a 0 from querying it.

Although it is somewhat tempting to make the pg commands block until the
pg exists, that defeats much of the value of the command as a diagnostic
tool as it could block indefinitely instead of informing the admin/dev
that "the pg isn't there yet".

In any case, this fixes the api test failure.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-08 21:38:18 -07:00
Dan Mick
00eaf97db0 librados.h: Fix up some doxygen problems
Signed-off-by: Dan Mick <dan.mick@inktank.com>
2013-06-07 22:58:20 -07:00
Sage Weil
e8300d0afb mds: fix filelock eval_gather
Broken by a08d620456

Reported-by: Yan, Zheng <yan.zheng@intel.com>
Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-07 22:14:01 -07:00
Dan Mick
2b4157a71b .gitignore: add 'ceph', now a generated file
2013-06-07 21:47:11 -07:00
Dan Mick
359f456a70 ceph: old daemons output to outs and outbuf, combine
When talking to old daemons, if a command succeeds, there may be
output on outs, outbuf, or both; combine them if there's no error,
and clear outs so it's not treated as stderr fodder.

Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-06-07 17:29:03 -07:00
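
The rule stated in commit 359f456a70 above (on success, merge the two output channels and clear outs) can be sketched as follows. The function name combine_old_output(), the use of plain strings, and the order of concatenation are assumptions made for illustration; the commit message does not specify them.

```python
def combine_old_output(ret, outbuf, outs):
    """Hypothetical helper mirroring the rule in the commit message.

    ret    -- command return code (0 means success)
    outbuf -- data output from the daemon
    outs   -- status string, normally reported on stderr
    """
    if ret == 0 and outs:
        # Success: fold outs into outbuf (ordering is an assumption here)
        # and clear outs so it is not treated as an error message.
        outbuf = outbuf + outs
        outs = ""
    return ret, outbuf, outs


# On success both channels end up in outbuf; on failure nothing is touched.
print(combine_old_output(0, "data\n", "status\n"))
print(combine_old_output(-22, "", "invalid argument\n"))
```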
Dan Mick
b3f38f3ed8 ceph: handle old OSDs as command destinations, fix status part of -w
For osd tell or pg <pgid> commands, the CLI sends the command directly
to the OSD; if the OSDs are still old, the command needs to be sent
in 'plain' (non-JSON) form.  Also, the 'ceph status' from -w needs to
handle failure/fallback-to-old-command.

Refactor the guts of json_command() into send_command(), and call it
from json_command() and where needed for old-style commands.

Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-06-07 17:28:30 -07:00
Gregory Farnum
05d1d027b0 Merge pull request #352 from ceph/wip-4832
mds: do not double-queue file recovery in eval_gather
Reviewed-by: Greg Farnum <greg@inktank.com>
2013-06-07 17:24:01 -07:00
Dan Mick
11e1afd84c ceph: add -v for version. Makefile processes ceph_ver.h
Signed-off-by: Dan Mick <dan.mick@inktank.com>
2013-06-07 17:20:27 -07:00
Sage Weil
5e5bd66518 Merge pull request #343 from dalgaaf/wip-da-SCA-cppcheck
Reviewed-by: Sage Weil <sage@inktank.com>
2013-06-07 17:12:10 -07:00
Sage Weil
fde536fa5d osd: make scrub chunk size tunable
It was hard-coded at 5.  Make it range from 5-15 by default, for now.

We should still keep this smallish since this range is locked for the
duration of the scrub on this chunk.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-07 16:10:53 -07:00
Samuel Just
637e0eaddc rados: --num-objects will now cause bench to stop after that many objects
Reviewed-by: David Zafman <david.zafman@inktank.com>
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-06-07 15:59:40 -07:00
Samuel Just
0bc731ea93 test_filestore_idempotent: use obj name from source coll add
Fixes: #5240
Reviewed-by: David Zafman <david.zafman@inktank.com>
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-06-07 15:59:40 -07:00
Sage Weil
7e09507779 Merge remote-tracking branch 'gh/next'
Conflicts:
	src/messages/MMonProbe.h
2013-06-07 14:23:48 -07:00
Yehuda Sadeh
ad3934e335 rgw: handle deep uri resources
In case of deep uri resources (ones created beyond a single level
of hierarchy, e.g. auth/v1.0) we want to create a new empty
handler for the path if no handler exists. E.g., for
auth/v1.0 we need to have a handler for 'auth', otherwise
the default S3 handler will be used, which we don't want.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2013-06-07 13:52:25 -07:00
Yehuda Sadeh
8d55b87f95 rgw: fix get_resource_mgr() to correctly identify resource
Fixes: #5262
The original test was not comparing the correct string, so it ended up
only checking whether the resource was a substring of the uri.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2013-06-07 13:52:14 -07:00
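
The kind of mismatch described in commit 8d55b87f95 above can be illustrated by contrasting a substring check with a check on whole path segments. This is a generic sketch, not the rgw implementation of get_resource_mgr(): both function names and the sample URIs are hypothetical.

```python
def matches_loose(uri, resource):
    # Substring test: accepts any URI that merely contains the resource name.
    return resource in uri


def matches_strict(uri, resource):
    # Segment test: the resource must be the URI itself or a leading
    # path component followed by '/'.
    uri = uri.strip("/")
    resource = resource.strip("/")
    return uri == resource or uri.startswith(resource + "/")


for uri in ("/auth/v1.0", "/authors/list", "/bucket/auth"):
    print(uri, matches_loose(uri, "auth"), matches_strict(uri, "auth"))
```

The loose check accepts all three sample URIs, while the strict check accepts only the one whose leading segment is exactly 'auth'.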
Yehuda Sadeh
9a0a9c205b rgw: add 'cors' to the list of sub-resources
Fixes: #5261
Backport: cuttlefish
Add 'cors' to the list of sub-resources, otherwise auth signing
is wrong.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2013-06-07 13:51:42 -07:00
Dan Mick
f4f6758bdb Merge branch 'wip-ceph-cli'
Reviewed-by: Sage Weil <sage@inktank.com>
2013-06-07 11:05:03 -07:00
Sage Weil
0b036ecddb osd: do not include logbl in scrub map
This is a potentially huge object/file, usually prefixed by a zeroed region
on disk, that is not used by scrub at all.  It dates back to
f51348dc8b (2008) and the original version of
scrub.

This *might* fix #4179.  It is not a leak per se, but I observed 1GB
scrub messages going over the wire.  Maybe the allocations are causing
fragmentation, or the sub_op queues are growing.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
2013-06-07 10:00:34 -07:00
John Wilkins
dea8c2d188 doc: Updated for glossary terms and added indexing.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-06-07 09:51:05 -07:00
John Wilkins
8e24328d88 doc: Added indexing and did a bit of cleanup.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-06-07 09:50:03 -07:00
Sage Weil
a08d620456 mds: do not double-queue file recovery in eval_gather
This fixes a specific case of double-queuing seen in #4832:

 - client goes stale, inode marked NEEDSRECOVER
 - eval does sync, queued, -> RECOVERING
 - client resumes
 - client goes stale (again), inode marked NEEDSRECOVER
 - eval_gather queues *again*

Note that a cursory look at the recovery code makes me think this needs
a much more serious overhaul.  In particular, I don't think we should
be triggering recovery when transitioning *from* a stable state, but
explicitly when we are flagged, or when gathering.  We should probably
also hold a wrlock over the recovery period and remove the force_wrlock
kludge from the final size check.  Opened ticket #5268.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-06 21:38:56 -07:00
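
The double-queuing case listed in commit a08d620456 above can be modelled with a queue that remembers which inodes are already being recovered. This is a generic guard pattern under assumed names (ToyRecoveryQueue, queue_recovery, finish_recovery); it is not the MDS code and may differ from how the actual fix is implemented.

```python
class ToyRecoveryQueue:
    """Hypothetical recovery queue that refuses to queue an inode twice."""

    def __init__(self):
        self.queue = []          # inodes waiting for recovery, in order
        self.recovering = set()  # inodes queued or in progress

    def queue_recovery(self, ino):
        if ino in self.recovering:
            return False         # already queued: do not queue again
        self.recovering.add(ino)
        self.queue.append(ino)
        return True

    def finish_recovery(self, ino):
        self.recovering.discard(ino)
        if ino in self.queue:
            self.queue.remove(ino)


q = ToyRecoveryQueue()
print(q.queue_recovery(42))  # client goes stale, eval queues recovery
print(q.queue_recovery(42))  # client goes stale again, eval_gather tries again
print(q.queue)
```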
Dan Mick
3ac6ffe802 Merge branch 'wip-ceph-cli' into master
Conflicts:
	src/include/rados/librados.h
	src/librados/librados.cc
	src/osdc/Objecter.cc
	src/pybind/rados.py

Required modifications to:
	src/osd/OSD.cc

Signed-off-by: Dan Mick <dan.mick@inktank.com>
2013-06-06 20:08:15 -07:00