Options -v, --verbose, --concise didn't have helpstrings
Option --completion doesn't quite work yet, and should be hidden anyway
Signed-off-by: Dan Mick <dan.mick@inktank.com>
If we add an item that already exists in a particular position, we should
update instead of inserting it; the CrushWrapper methods are not
idempotent.
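
The update-instead-of-insert idea can be sketched as follows. This is a toy model, not the CrushWrapper API: the dict-based bucket and the names insert_item, update_item, and add_or_update_item are assumptions made for illustration.

```python
# Toy stand-in for a CRUSH bucket: maps item id -> weight.
# NOT the CrushWrapper API; every name here is hypothetical.

def insert_item(bucket, item, weight):
    """Non-idempotent insert: fails if the item is already present."""
    if item in bucket:
        raise KeyError(f"item {item} already exists")
    bucket[item] = weight


def update_item(bucket, item, weight):
    """Update the weight of an item that is already present."""
    if item not in bucket:
        raise KeyError(f"item {item} does not exist")
    bucket[item] = weight


def add_or_update_item(bucket, item, weight):
    """Idempotent wrapper: update when present, insert otherwise."""
    if item in bucket:
        update_item(bucket, item, weight)
        return "updated"
    insert_item(bucket, item, weight)
    return "inserted"


bucket = {}
first = add_or_update_item(bucket, 0, 1.0)   # item is new: inserted
second = add_or_update_item(bucket, 0, 2.0)  # item exists: updated in place
```

Calling the wrapper twice for the same item leaves a single entry with the latest weight, which is the behavior a repeated add needs.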
Signed-off-by: Sage Weil <sage@inktank.com>
If we abort while waiting, we clean up incorrectly (we switch the state value
incorrectly, and also fail to clean up the initialized objecter).
Instead, skip this wait; it's useless!
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
mon/MonmapMonitor.cc: In member function 'bool MonmapMonitor::preprocess_command(MMonCommand*)':
mon/MonmapMonitor.cc:273:2: warning: label 'out' defined but not used [-Wunused-label]
Signed-off-by: Sage Weil <sage@inktank.com>
It doesn't work. The commands the ceph cli sends are vector<string>, and
the mon expects json.
Leave the MDS one in place since ceph-mds still takes strings.
Signed-off-by: Sage Weil <sage@inktank.com>
The new parsing code had been trying to allow flexibility for the
'old form' commands (where id could be different from N in osd.N),
but also accept 'new form' commands. The new rule is that where
there's an OSD specified in the osd crush command, it is of type
CephOsdName, which can be an id *or* 'osd.<id>', but not both.
Pass CephOsdName as int64_t 'id' for convenience in mon code
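
A minimal sketch of the naming rule, assuming a hypothetical helper called parse_osd_name (this is not the mon's actual validator): either a bare id or 'osd.<id>' is accepted, and both resolve to the same integer id.

```python
def parse_osd_name(value):
    """Return the numeric OSD id for either '<id>' or 'osd.<id>'.

    Hypothetical helper for illustration only.
    """
    text = str(value)
    if text.startswith("osd."):
        text = text[len("osd."):]
    # Accept plain ASCII decimal digits only.
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"not a valid OSD name: {value!r}")
    return int(text)


by_id = parse_osd_name("3")
by_name = parse_osd_name("osd.3")
```

Both spellings produce the same integer, so downstream code only ever has to deal with a single 'id' value.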
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
This implementation is limited: we direct our command by reopening
a session with the specific monitor. If there is more than one of these
queued we will fail to reach either.
Signed-off-by: Sage Weil <sage@inktank.com>
Send the command to each target. Do this in series, for now. Error out if
any one fails.
Later, we should do them in parallel.
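
The serial behavior described above could look roughly like this. The send callable and its (return_code, output) result are assumptions for the sketch, not the real client interface.

```python
def send_to_targets(targets, command, send):
    """Send `command` to each target in order; stop at the first failure.

    `send(target, command)` is a hypothetical callable returning
    (return_code, output); a nonzero return code means failure.
    Returns (return_code, failed_target, outputs_collected_so_far).
    """
    outputs = []
    for target in targets:
        ret, out = send(target, command)
        if ret != 0:
            # Error out immediately; later targets are not contacted.
            return ret, target, outputs
        outputs.append(out)
    return 0, None, outputs
```

Doing the sends in series keeps error handling trivial: the first nonzero return code ends the loop and identifies the target that failed.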
Signed-off-by: Sage Weil <sage@inktank.com>
Generalize the map check machinery that the pool dne check uses to also
get the latest map for OSD down/dne checks. This is better semantics, but
more importantly it fixes the more immediate bug of returning the error code
to the caller from the osd_command -> _submit_command (that is ignored by
pretty much any caller) and then never triggering the callback.
Fixes: #5331
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
A mere

    import readline

line is dumping this to stdout on CentOS 6.3:

    00000000 1b 5b 3f 31 30 33 34 68 .[?1034h

That confuses non-terminals that read from stdout, so only import when we
are in the interactive mode.
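
A minimal sketch of the deferred import, assuming a hypothetical helper load_readline and a boolean flag that the caller sets when it enters interactive mode:

```python
def load_readline(interactive):
    """Import readline only when running interactively.

    Returns the module, or None in non-interactive mode or when
    readline is not available on the platform.
    """
    if not interactive:
        # Never touch readline here, so nothing is written to stdout.
        return None
    try:
        import readline
    except ImportError:
        return None
    return readline
```

Because the import statement sits inside the function, a non-interactive run never executes it and stdout stays clean for whatever is reading it.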
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
The get_version(string, string) is the wrong method; it combines the two
args into a key that is nested inside prefix (so it's prefix/a/b), but we
want prefix/format_version. Add a method to grab an int for this
particular combo and use that.
This fixes an infinite loop when we actually trigger this code.
Bug introduced by f43c974571.
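
The key mix-up can be illustrated with a toy key/value store. The class, the "svc" prefix, and the get_int name are assumptions for illustration; only the prefix/a/b versus prefix/format_version key shapes come from the description above.

```python
class ToyStore:
    """Toy key/value store with a fixed service prefix (hypothetical)."""

    def __init__(self, prefix, data):
        self.prefix = prefix
        self.data = data

    def get_version(self, a, b):
        # Two-argument form: the key is nested as prefix/a/b.
        return self.data.get(f"{self.prefix}/{a}/{b}", 0)

    def get_int(self, name):
        # Single-name form: the key is prefix/name.
        return self.data.get(f"{self.prefix}/{name}", 0)


store = ToyStore("svc", {"svc/format_version": 1})
wrong = store.get_version("format", "version")  # looks up svc/format/version
right = store.get_int("format_version")         # looks up svc/format_version
```

The two-argument lookup builds a key that was never written, so it falls back to the default instead of returning the stored format version.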
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
If <path-to-ceph> contains pybind and .libs:
- prepend <path-to-ceph>/pybind to PYTHONPATH
- append <path-to-ceph>/.libs to LD_LIBRARY_PATH if not already there
and exec self so it takes effect
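
A sketch of the environment adjustment, written as a pure function so it can be checked without actually re-executing. The function name, the (env, needs_exec) return shape, and the choice to re-exec only when LD_LIBRARY_PATH changed are assumptions, not necessarily what the script does.

```python
import os


def dev_environment(ceph_path, environ):
    """Return (new_environ, needs_exec) for a source-tree checkout.

    Hypothetical helper: `ceph_path` is the directory to probe and
    `environ` is a mapping such as os.environ.
    """
    pybind = os.path.join(ceph_path, "pybind")
    libs = os.path.join(ceph_path, ".libs")
    env = dict(environ)
    if not (os.path.isdir(pybind) and os.path.isdir(libs)):
        return env, False  # not a source tree; leave everything alone

    # Prepend pybind to PYTHONPATH (dropping any earlier copy of it).
    py_parts = [p for p in env.get("PYTHONPATH", "").split(os.pathsep)
                if p and p != pybind]
    env["PYTHONPATH"] = os.pathsep.join([pybind] + py_parts)

    # Append .libs to LD_LIBRARY_PATH only if it is not already there.
    ld_parts = [p for p in env.get("LD_LIBRARY_PATH", "").split(os.pathsep)
                if p]
    needs_exec = libs not in ld_parts
    if needs_exec:
        env["LD_LIBRARY_PATH"] = os.pathsep.join(ld_parts + [libs])
    return env, needs_exec
```

When needs_exec is true, a caller could re-run itself with os.execvpe(sys.executable, [sys.executable] + sys.argv, env). The "already there" check is what stops the script from re-executing forever.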
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
'ceph-conf ...' doesn't give you final/default values, only what is in the
conf file. Use -w output to test this instead.
Fixes: #5327
Signed-off-by: Sage Weil <sage@inktank.com>
Trying to track down this failure:
2013-06-12T06:11:13.430 INFO:teuthology.task.workunit.client.0.err:+ rsync -auv --exclude local/ /usr/ usr.2
2013-06-12T06:11:13.430 INFO:teuthology.task.workunit.client.0.err:+ tee a
2013-06-12T06:11:13.527 INFO:teuthology.task.workunit.client.0.out:sending incremental file list
2013-06-12T06:11:46.206 INFO:teuthology.task.workunit.client.0.out:
2013-06-12T06:11:46.208 INFO:teuthology.task.workunit.client.0.out:sent 1689627 bytes received 8302 bytes 50684.45 bytes/sec
2013-06-12T06:11:46.208 INFO:teuthology.task.workunit.client.0.out:total size is 3274130495 speedup is 1928.31
2013-06-12T06:11:46.209 INFO:teuthology.task.workunit.client.0.err:+ wc -l a
2013-06-12T06:11:46.209 INFO:teuthology.task.workunit.client.0.err:+ grep 4
2013-06-12T06:11:46.211 INFO:teuthology.task.workunit:Stopping misc on client.0...
...and am perplexed!
Signed-off-by: Sage Weil <sage@inktank.com>
User testing has shown that smaller values yield better results; see #4917.
Jim's testing has had good results with even more aggressive trimming, but I
would like to do more validation yet before changing defaults.
Signed-off-by: Sage Weil <sage@inktank.com>
We have regularly been observing a stall where the MDS is blocked waiting
for a cap revocation (Ls, in our case) and never gets a reply. We finally
tracked down the sequence:
- mds issues cap seq 1 to client
- mds does revocation (seq 2)
- client replies
- much time goes by
- client trims inode from cache, sends release with seq == 2
- mds ignores release because its issue_seq is 1
- mds later tries to revoke other caps
- client discards message because it doesn't have the inode in cache
The problem is simply that we are using seq instead of issue_seq in the
cap release message. Note that the other release call site in
encode_inode_release() is correct. That one is much more commonly
triggered by short tests, as compared to this case where the inode needs to
get pushed out of the client cache.
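
The sequence above can be replayed with a toy model. Everything in it is a simplification made for illustration: the ToyMDS class, the message dicts, and the rule that a release is honored only when its sequence number equals the MDS's issue_seq are assumptions, not the real MDS or client code.

```python
class ToyMDS:
    """Toy model of one capability's sequence numbers (hypothetical)."""

    def __init__(self):
        self.seq = 0        # bumped on every cap message sent
        self.issue_seq = 0  # seq at the time caps were last issued
        self.cap_held = False

    def issue(self):
        self.seq += 1
        self.issue_seq = self.seq
        self.cap_held = True
        return {"seq": self.seq, "issue_seq": self.issue_seq}

    def revoke(self):
        self.seq += 1  # issue_seq is unchanged by a revocation
        return {"seq": self.seq, "issue_seq": self.issue_seq}

    def handle_release(self, release_seq):
        # Assumed rule: ignore releases that do not match issue_seq.
        if release_seq != self.issue_seq:
            return False
        self.cap_held = False
        return True


mds = ToyMDS()
client = mds.issue()            # mds issues cap, seq 1
client = mds.revoke()           # mds revokes, seq 2; client replies
buggy = mds.handle_release(client["seq"])        # release sent with seq == 2
fixed = mds.handle_release(client["issue_seq"])  # release sent with issue_seq == 1
```

In this model the release carrying seq is ignored, so the MDS still believes the client holds the cap, while the release carrying issue_seq is accepted.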
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
set_priv() expects to be given a reference to own; take one. This fixes
various crashes after we see a hb connection reset.
Signed-off-by: Sage Weil <sage@inktank.com>