Commit Graph

26669 Commits

Author SHA1 Message Date
Dan Mick
a3767010a8 ceph, mon/OSDMonitor: fix up osd crush commands for <osd.N> or <N>
The new parsing code had been trying to allow flexibility for the
'old form' commands (where id could be different from N in osd.N),
but also accept 'new form' commands.  The new rule is that where
there's an OSD specified in the osd crush command, it is of type
CephOsdName, which can be an id *or* 'osd.<id>', but not both.

Pass CephOsdName as int64_t 'id' for convenience in mon code

Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-06-12 21:56:16 -07:00
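
The CephOsdName rule described in a3767010a8 above (a bare id or 'osd.<id>', normalized to an integer id) can be sketched roughly as follows. This is only an illustration in Python; the function name parse_osd_name is hypothetical and is not taken from the Ceph source, which does this validation in its own command-parsing code.

```python
def parse_osd_name(value: str) -> int:
    """Return the numeric OSD id for either '<id>' or 'osd.<id>'.

    Hypothetical helper: the name and error handling are assumptions,
    not the actual CephOsdName implementation.
    """
    prefix = "osd."
    # Strip the optional 'osd.' prefix; what remains must be a plain number.
    digits = value[len(prefix):] if value.startswith(prefix) else value
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"not a valid OSD name: {value!r}")
    return int(digits)
```

With this shape, both spellings reach the monitor-side code as the same integer, which matches the "pass as int64_t 'id'" note in the commit message.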
Sage Weil
8808ca57c6 osdc/Objecter: fix handling for osd_command dne/down cases
Generalize the map check machinery that the pool dne check uses to also
get the latest map for OSD down/dne checks.  This is better semantics, but
more importantly it fixes the more immediate bug of returning the error code
to the caller from the osd_command -> _submit_command (that is ignored by
pretty much any caller) and then never triggering the callback.

Fixes: #5331
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-06-12 18:13:12 -07:00
Sage Weil
11d5c7a23b ceph: only use readline when in interactive mode
A mere

  import readline

line is dumping this to stdout on CentOS 6.3:

  00000000  1b 5b 3f 31 30 33 34 68  .[?1034h

That confuses non-terminals that read from stdout, so only import it when we
are in interactive mode.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
2013-06-12 17:13:39 -07:00
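
A minimal sketch of the idea in 11d5c7a23b above: defer the readline import until the tool knows it is running interactively, so that nothing is written to stdout in scripted use. The function name maybe_enable_readline and the way the caller decides on "interactive" are assumptions for illustration, not the actual code in the ceph script.

```python
def maybe_enable_readline(interactive: bool) -> bool:
    """Import readline only for interactive sessions.

    Returns True if readline was loaded. Hypothetical helper; how the
    caller determines 'interactive' is left open here.
    """
    if not interactive:
        # Non-interactive: never touch readline, so no terminal escape
        # sequence can end up in output that another program reads.
        return False
    try:
        import readline  # noqa: F401  (imported for its effect on input())
    except ImportError:
        # readline is not available on every platform.
        return False
    return True
```

A caller might pass something like "no command was given on the command line" as the interactive flag; that choice is an assumption here.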
Sage Weil
862148d5fb mon: fix read of format_version out of leveldb
The get_version(string, string) is the wrong method; it combines the two
args into a key that is nested inside prefix (so it's prefix/a/b), but we
want prefix/format_version.  Add a method to grab an int for this
particular combo and use that.

This fixes an infinite loop when we actually trigger this code.

Bug introduced by f43c974571.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-06-12 17:13:14 -07:00
Gary Lowell
35ac835fc8 Merge branch 'next' 2013-06-12 15:00:05 -07:00
Dan Mick
e5184ea950 ceph: make life easier on developers by handling in-tree runs
If <path-to-ceph> contains pybind and .libs:
- prepend <path-to-ceph>/pybind to PYTHONPATH
- append <path-to-ceph>/.libs to LD_LIBRARY_PATH if not already there
  and exec self so it takes effect

Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-06-12 14:05:34 -07:00
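
The in-tree handling listed in e5184ea950 above can be sketched as a function that computes the adjusted environment and reports whether a re-exec is needed. The function name in_tree_env and its return shape are hypothetical; only the pybind and .libs directory names and the prepend/append behavior come from the commit message.

```python
import os


def in_tree_env(ceph_dir, environ):
    """Return (new_env, needs_reexec) for a run from a source tree.

    Hypothetical sketch: ceph_dir stands for <path-to-ceph>.
    """
    env = dict(environ)
    pybind = os.path.join(ceph_dir, "pybind")
    libs = os.path.join(ceph_dir, ".libs")
    # Only act when the directory looks like a built source tree.
    if not (os.path.isdir(pybind) and os.path.isdir(libs)):
        return env, False

    # Prepend pybind so the in-tree Python bindings win.
    old_pp = env.get("PYTHONPATH", "")
    env["PYTHONPATH"] = pybind + (os.pathsep + old_pp if old_pp else "")

    # Append .libs only if it is not already present.
    old_ld = env.get("LD_LIBRARY_PATH", "")
    parts = old_ld.split(os.pathsep) if old_ld else []
    if libs in parts:
        return env, False
    env["LD_LIBRARY_PATH"] = os.pathsep.join(parts + [libs])
    return env, True
```

The re-exec matters because the dynamic loader reads LD_LIBRARY_PATH when a process starts, so changing it in a running process does not affect library lookup for that process. A caller could re-run itself with the new environment (for example via os.execvpe) when needs_reexec is True.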
Sage Weil
701943a278 qa/workunits/cephtool/test.sh: look for 'ceph log' via -w, not in log file
'ceph-conf ...' doesn't give you final/default values, only what is in the
conf file.  Use -w output to test this instead.

Fixes: #5327
Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-12 14:00:24 -07:00
Sage Weil
b70f5658c4 ceph: flush stdout on watch print
Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-12 14:00:04 -07:00
Sage Weil
b89b6ceeb0 Merge pull request #357 from atwardowski/patch-1
Usage log and ops log are disabled by defaults since 0.56
2013-06-12 13:50:15 -07:00
atwardowski
299f6a6609 Usage log and ops log are disabled by defaults since 0.56
http://ceph.com/docs/next/release-notes/#v0-56-bobtail
2013-06-12 17:48:44 -03:00
Sage Weil
de1723834c mon: fix 'pg dump_stuck' stuckops type
It's a list.

Fixes: #5332
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
2013-06-12 13:39:30 -07:00
Sage Weil
b284e25fe7 Merge remote-tracking branch 'gh/wip_5238'
Reviewed-by: Sage Weil <sage@inktank.com>
2013-06-12 13:31:22 -07:00
Sage Weil
afa16b4817 qa: multiple_rsync.sh: more output
Trying to track down this failure:

2013-06-12T06:11:13.430 INFO:teuthology.task.workunit.client.0.err:+ rsync -auv --exclude local/ /usr/ usr.2
2013-06-12T06:11:13.430 INFO:teuthology.task.workunit.client.0.err:+ tee a
2013-06-12T06:11:13.527 INFO:teuthology.task.workunit.client.0.out:sending incremental file list
2013-06-12T06:11:46.206 INFO:teuthology.task.workunit.client.0.out:
2013-06-12T06:11:46.208 INFO:teuthology.task.workunit.client.0.out:sent 1689627 bytes  received 8302 bytes  50684.45 bytes/sec
2013-06-12T06:11:46.208 INFO:teuthology.task.workunit.client.0.out:total size is 3274130495  speedup is 1928.31
2013-06-12T06:11:46.209 INFO:teuthology.task.workunit.client.0.err:+ wc -l a
2013-06-12T06:11:46.209 INFO:teuthology.task.workunit.client.0.err:+ grep 4
2013-06-12T06:11:46.211 INFO:teuthology.task.workunit:Stopping misc on client.0...

...and am perplexed!

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-12 13:26:03 -07:00
Gary Lowell
42e06c12db v0.64 2013-06-12 09:54:06 -07:00
Dan Mick
68b5fa9b61 ceph-fuse: older libfuses don't support FUSE_IOCTL_COMPAT
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-06-12 09:43:12 -07:00
Peter Wienemann
1577e203f0 ceph-create-keys: Make sure directories for admin and bootstrap keys exist
Signed-off-by: Peter Wienemann <wienemann@physik.uni-bonn.de>
2013-06-12 08:40:25 -07:00
Samuel Just
256afa072d store_test: create_collection prior to split
Fixes: #5310
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: David Zafman <david.zafman@inktank.com>
2013-06-11 16:45:17 -07:00
Sage Weil
1a9415a015 mon: adjust trim defaults
User testing has shown that smaller values yield better results; see #4917.
Jim's testing has had good results with even more aggressive trimming, but I
would like to do more validation yet before changing defaults.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-11 16:30:41 -07:00
John Wilkins
5f0007e6a9 doc: Reworked the landing page.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-06-11 15:32:23 -07:00
John Wilkins
dc6cadc34a doc: Added a hostname resolution section for local host execution.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-06-11 14:46:35 -07:00
John Wilkins
f6c51b486d doc: Added some tips and re-organized to simplify the process.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-06-11 14:46:12 -07:00
Sage Weil
9b012e234a client: set issue_seq (not seq) in cap release
We have regularly been observing a stall where the MDS is blocked waiting
for a cap revocation (Ls, in our case) and never gets a reply.  We finally
tracked down the sequence:

 - mds issues cap seq 1 to client
 - mds does revocation (seq 2)
 - client replies
 - much time goes by
 - client trims inode from cache, sends release with seq == 2
 - mds ignores release because its issue_seq is 1
 - mds later tries to revoke other caps
 - client discards message because it doesn't have the inode in cache

The problem is simply that we are using seq instead of issue_seq in the
cap release message.  Note that the other release call site in
encode_inode_release() is correct.  That one is much more commonly
triggered by short tests, as compared to this case where the inode needs to
get pushed out of the client cache.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2013-06-11 13:56:45 -07:00
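
The sequence described in 9b012e234a above can be replayed with a toy model. This is a deliberately simplified sketch, not the MDS or client code: the class ToyMds and its methods are hypothetical, and the only behavior modeled is that a revocation advances seq but not issue_seq, and that a release is honored only when it carries the current issue_seq.

```python
class ToyMds:
    """Toy model of one cap on the MDS side (hypothetical)."""

    def __init__(self):
        self.seq = 0
        self.issue_seq = 0
        self.client_has_cap = False

    def issue(self):
        # Issuing a cap advances seq and records it as the issue_seq.
        self.seq += 1
        self.issue_seq = self.seq
        self.client_has_cap = True
        return self.seq

    def revoke(self):
        # A revocation advances seq but leaves issue_seq unchanged.
        self.seq += 1
        return self.seq

    def handle_release(self, release_seq):
        # A release is ignored unless it matches the current issue_seq.
        if release_seq != self.issue_seq:
            return False
        self.client_has_cap = False
        return True


mds = ToyMds()
client_issue_seq = mds.issue()   # client learns issue_seq == 1
client_seq = mds.revoke()        # client's latest seq is now 2

buggy = mds.handle_release(client_seq)        # sends seq: ignored
fixed = mds.handle_release(client_issue_seq)  # sends issue_seq: accepted
```

In this model the buggy release leaves the MDS believing the client still holds the cap, which is the state that leads to the stalled revocation in the commit message.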
John Wilkins
c7fb7a3f46 doc: Added some Java S3 API troubleshooting entries.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-06-11 12:12:46 -07:00
John Wilkins
6c557d569d doc: Added install ceph-common instruction.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-06-11 12:11:51 -07:00
John Wilkins
5543f19c25 doc: Added install ceph-common instruction.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-06-11 12:11:26 -07:00
John Wilkins
3f3ad61fae doc: Fixed :term" syntax.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-06-11 12:10:52 -07:00
Gary Lowell
0948624f3e ceph-create-keys: Remove unused caps parameter on bootstrap_key()
The caps parameter was removed except for one place.

Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
2013-06-11 08:25:36 -07:00
Sage Weil
4682636f96 Merge branch 'next' 2013-06-10 23:22:49 -07:00
Sage Weil
3f2017fb17 osd: fix con -> session ref change after hb reset
set_priv() expects to be given a reference to own; take one.  This fixes
various crashes after we see a hb connection reset.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-10 23:22:26 -07:00
Sage Weil
a378c4d11e common/admin_socket: fix leak of new m_getdescs_hook
Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-10 21:56:13 -07:00
Sage Weil
6bab425342 common/cmdparse: no need to use (and leak to) the heap
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
2013-06-10 21:56:03 -07:00
Dan Mick
5c945cd1b8 CrushWrapper: dump tunables along with crush map
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-06-10 18:25:39 -07:00
Dan Mick
0e0e896e8c ceph: --keyring must be passed to parse_argv, which means not argparse
If argparse gets its hands on it, it's not available for parse_argv()
and is therefore ignored.

Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-06-10 18:09:05 -07:00
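
One way to keep an option such as --keyring out of argparse's hands, as 0e0e896e8c above requires, is to not declare it and to collect unrecognized arguments with parse_known_args(). The sketch below shows that general technique only; the function name split_args is hypothetical, and whether the ceph script uses exactly this approach is an assumption.

```python
import argparse


def split_args(argv):
    """Parse the options argparse owns; return the rest untouched.

    Hypothetical sketch: --keyring is deliberately NOT declared here, so
    it stays in the leftover list for a later parser such as parse_argv().
    """
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("-v", dest="version", action="store_true")
    known, remaining = parser.parse_known_args(argv)
    return known, remaining


known, remaining = split_args(["--keyring", "/tmp/k", "-v", "status"])
```

Here known.version is set from -v, while --keyring and its value remain in the leftover list in their original order, ready to be handed to the next parser.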
Samuel Just
8190b439b7 OSD: create collection in handle_pg_create before _create_lock_pg
Fixes: #5270
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-06-10 17:16:09 -07:00
Dan Mick
af92b9a4d9 Objecter: fail osd_command if OSD is down
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-06-10 17:06:04 -07:00
Dan Mick
a741aa073a mon: send "osd create" output to stdout; tests rely on it
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-06-10 17:06:04 -07:00
athanatos
01944ab900 Merge pull request #349 from dachary/wip-5213
unit tests for PGLog::merge_log

Reviewed-by: Sam Just <sam.just@inktank.com>
2013-06-10 16:08:46 -07:00
Sage Weil
0fe4bc09cc Merge pull request #350 from ceph/wip-osd-scrub-chunk
Reviewed-by: Samuel Just <sam.just@inktank.com>
2013-06-10 09:50:52 -07:00
Loic Dachary
8f141c451b unit tests for PGLog::rewind_divergent_log
The tests cover 100% of the LOC of rewind_divergent_log. There are
three situations:

 * throw an assert because the data is inconsistent

 * special case when the entire log is divergent

 * regular workflow where all divergent entries are passed to
   merge_old_entry

http://tracker.ceph.com/issues/5213 refs #5213

Signed-off-by: Loic Dachary <loic@dachary.org>
2013-06-10 14:08:51 +02:00
Loic Dachary
04e89a4012 unit tests for PGLog::merge_log
The tests cover 100% of the LOC of merge_log. They are broken down
into 7 cases to enumerate all the situations it must address. Each case
is isolated in an independent code block where the conditions are
reproduced. Where possible and sensible to read, a code block covers
as many lines as possible. For instance:

  The log entry (1,3) deletes the object x9 but the olog entry (2,3)
  modifies it and is authoritative : the log entry (1,3) is divergent.

is the only test case covering a dozen "if" statements and half a
dozen "while/for" loops. It covers all the lines but it would be
useful to create other scenarios in the future.

Each test is made of a comment describing the test case, the
definition of the data structures to create the desired conditions, a
sequence of EXPECT_* checking that they are met, a single call to
merge_log and another sequence of EXPECT_* (ordered to be easy to
compare with the first sequence) checking all the desired side
effects.

The TestPGLog.cc file was untabified to improve the display of ascii
art when it is output as part of a diff.

http://tracker.ceph.com/issues/5213 refs #5213

Signed-off-by: Loic Dachary <loic@dachary.org>
2013-06-10 14:08:51 +02:00
Sage Weil
6ce23541d9 messages/MMonProbe: fix uninit vars (again)
Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-08 21:39:05 -07:00
Sage Weil
10bfa8350c osdc/Objecter: clear osd session command ops xlist on close
Clear the command ops list, just as we do the ops and linger_ops xlists.
This fixes a crash like this on shutdown:

2013-06-07 23:06:21.089275 7fc7e8655700 10 client.4124.objecter close_session for osd.0
2013-06-07 23:06:21.089279 7fc7e8655700  1 -- 10.3.64.22:0/1026843 mark_down 0x7fc7e0001260 -- 0x7fc7e0001000
./include/xlist.h: In function 'xlist<T>::~xlist() [with T = Objecter::CommandOp*]' thread 7fc7e8655700 time 2013-06-07 23:06:21.089401
./include/xlist.h: 69: FAILED assert(_size == 0)
 ceph version 0.63-553-ge8300d0 (e8300d0afb)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x80) [0x7fc7ec111f14]
 2: (xlist<Objecter::CommandOp*>::~xlist()+0x36) [0x7fc7ec09f95e]
 3: (Objecter::OSDSession::~OSDSession()+0x1d) [0x7fc7ec09e451]
 4: (Objecter::close_session(Objecter::OSDSession*)+0x1fc) [0x7fc7ec08a146]
 5: (Objecter::handle_osd_map(MOSDMap*)+0xe68) [0x7fc7ec087864]
 6: (librados::RadosClient::_dispatch(Message*)+0x84) [0x7fc7ec0615f0]
 7: (librados::RadosClient::ms_dispatch(Message*)+0x16b) [0x7fc7ec0613c1]
 8: (Messenger::ms_deliver_dispatch(Message*)+0x8c) [0x7fc7ec21d0f6]
 9: (DispatchQueue::entry()+0x52f) [0x7fc7ec21c653]
 10: (DispatchQueue::DispatchThread::entry()+0x1c) [0x7fc7ec2bdcc2]
 11: (Thread::_entry_func(void*)+0x23) [0x7fc7ec34d8cd]
 12: (()+0x6b50) [0x7fc7eb2feb50]
 13: (clone()+0x6d) [0x7fc7eab2c6dd]

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-08 21:38:43 -07:00
Sage Weil
81a786e9e5 librados: fix pg command test
Stat a bunch of (non-existent) random objects in the pool to ensure the
pg exists on the OSD before we assert that we get a 0 from querying it.

Although it is somewhat tempting to make the pg commands block until the
pg exists, that defeats much of the value of the command as a diagnostic
tool as it could block indefinitely instead of informing the admin/dev
that "the pg isn't there yet".

In any case, this fixes the api test failure.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-08 21:38:18 -07:00
Dan Mick
00eaf97db0 librados.h: Fix up some doxygen problems
Signed-off-by: Dan Mick <dan.mick@inktank.com>
2013-06-07 22:58:20 -07:00
Sage Weil
e8300d0afb mds: fix filelock eval_gather
Broken by a08d620456

Reported-by: Yan, Zheng <yan.zheng@intel.com>
Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-07 22:14:01 -07:00
Dan Mick
2b4157a71b .gitignore: add 'ceph', now a generated file 2013-06-07 21:47:11 -07:00
Dan Mick
359f456a70 ceph: old daemons output to outs and outbuf, combine
When talking to old daemons, if a command succeeds, there may be
output on outs, outbuf, or both; combine them if there's no error,
and clear outs so it's not treated as stderr fodder.

Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-06-07 17:29:03 -07:00
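
The combining rule in 359f456a70 above can be sketched as a small function. The name merge_old_daemon_output is hypothetical, and the order in which the two strings are joined is an assumption; the commit message only says they are combined on success and that outs is cleared.

```python
def merge_old_daemon_output(ret, outbuf, outs):
    """Return (outbuf, outs) as the CLI should treat them.

    Hypothetical sketch. On success (ret == 0), whatever an old daemon
    put in outs is folded into outbuf, and outs is cleared so it is not
    printed as an error message. On failure both are left unchanged.
    """
    if ret == 0:
        outbuf = outbuf + outs  # join order is an assumption
        outs = ""
    return outbuf, outs
```

On a nonzero return code the function leaves outs alone, so it can still be reported as the error text.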
Dan Mick
b3f38f3ed8 ceph: handle old OSDs as command destinations, fix status part of -w
For osd tell or pg <pgid> commands, the CLI sends the command directly
to the OSD; if the OSDs are still old, the command needs to be sent
in 'plain' (non-JSON) form.  Also, the 'ceph status' from -w needs to
handle failure/fallback-to-old-command.

Refactor the guts of json_command() into send_command(), and call it
from json_command() and where needed for old-style commands.

Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-06-07 17:28:30 -07:00
Gregory Farnum
05d1d027b0 Merge pull request #352 from ceph/wip-4832
mds: do not double-queue file recovery in eval_gather
Reviewed-by: Greg Farnum <greg@inktank.com>
2013-06-07 17:24:01 -07:00
Dan Mick
11e1afd84c ceph: add -v for version. Makefile processes ceph_ver.h
Signed-off-by: Dan Mick <dan.mick@inktank.com>
2013-06-07 17:20:27 -07:00