When we get a ping reply, remove the peer from the failure_queue
and send a still alive message if the peer is in the failure_pending
map.
Otherwise, the monitor could slowly accumulate sporadic failure reports
leading to an osd being incorrectly marked out.
This bug may have been contributing to the wrongly-marked-down
thrashing observed on some systems.
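The bookkeeping described above can be sketched roughly as follows. This is a minimal Python model, not the OSD code: the class `HeartbeatTracker`, its method names, and the `sent` list standing in for messages to the monitor are all hypothetical. Only `failure_queue` and `failure_pending` are names taken from the message.

```python
class HeartbeatTracker:
    """Hypothetical model of the failure bookkeeping described above."""

    def __init__(self):
        self.failure_queue = set()   # peers we plan to report as failed
        self.failure_pending = {}    # peer -> report already sent to the monitor
        self.sent = []               # stand-in for messages sent to the monitor

    def note_missed_ping(self, peer):
        # A missed heartbeat queues the peer for a failure report.
        self.failure_queue.add(peer)

    def flush_failures(self):
        # Send the queued reports and remember them as pending.
        for peer in sorted(self.failure_queue):
            report = ("failure", peer)
            self.sent.append(report)
            self.failure_pending[peer] = report
        self.failure_queue.clear()

    def handle_ping_reply(self, peer):
        # The peer answered: drop any report that has not gone out yet ...
        self.failure_queue.discard(peer)
        # ... and retract one that already has.
        if self.failure_pending.pop(peer, None) is not None:
            self.sent.append(("still_alive", peer))
```

In this model a peer that replies before the flush is never reported, and a peer that replies after the flush gets exactly one retraction, so stale reports cannot pile up.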
Signed-off-by: Samuel Just <sam.just@inktank.com>
If the osd receiving the log has divergent entries, it will
also have a "divergent" stat structure. In general, it suffices
to simply trust the stat structure shipped with the authoritative
log and info since merge_log is only used to merge an authoritative
log.
Probably fixes #2769.
In cases like #2769, this bug can result in a primary with a stat
structure which double counts an operation: once for the
divergent operation, and once for the replay. It turned up
in a regression suite run as a scrub stat mismatch.
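The double counting can be illustrated with a toy model. Everything here is an assumption made for illustration: `PGState`, the single `num_objects` counter standing in for the stat structure, and the `trust_auth_stats` switch are hypothetical. Only the name `merge_log` comes from the message, and the real function does far more than this.

```python
from dataclasses import dataclass, field


@dataclass
class PGState:
    """Toy stand-in for a log plus its stat structure."""
    log: list = field(default_factory=list)  # operation ids, oldest first
    num_objects: int = 0                      # one counter in place of real stats


def merge_log(local, auth, trust_auth_stats=True):
    """Adopt the authoritative log; optionally adopt its stats as well."""
    local.log = list(auth.log)
    if trust_auth_stats:
        local.num_objects = auth.num_objects


def replay(state, op):
    """Apply an operation that creates one object."""
    state.log.append(op)
    state.num_objects += 1


auth = PGState(log=["op1", "op2"], num_objects=2)

# The receiving osd also applied a divergent "op3" and counted it.
fixed = PGState(log=["op1", "op2", "op3"], num_objects=3)
merge_log(fixed, auth)                 # stats reset to the authoritative 2
replay(fixed, "op3")                   # replay counts op3 once

buggy = PGState(log=["op1", "op2", "op3"], num_objects=3)
merge_log(buggy, auth, trust_auth_stats=False)
replay(buggy, "op3")                   # op3 is now counted twice
```

With the authoritative stats adopted, the counter ends at 3. When the divergent stats are kept, it ends at 4, which is the kind of mismatch a scrub would flag.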
Signed-off-by: Samuel Just <sam.just@inktank.com>
Consider the following sequence:
1. issue, apply repop
2. replicas and primary commit
Here, repop->waitfor_(ack|disk) are empty, so we mark
repop->done and remove_repop.
3. interval change, repops still in queue are marked aborted
4. activate, last_update_applied = last_update
5. the repop from step 1 enters apply_repop, is not aborted,
and finds that last_update_applied has passed it by.
Fixes #2749
Signed-off-by: Samuel Just <sam.just@inktank.com>
test/test_librbd.cc: In member function ‘virtual void LibRBD_TestClone_Test::TestBody()’:
warning: test/test_librbd.cc:1040:111: format ‘%ld’ expects argument of type ‘long int’, but argument 2 has type ‘uint64_t {aka long long unsigned int}’ [-Wformat]
warning: test/test_librbd.cc:1040:111: format ‘%ld’ expects argument of type ‘long int’, but argument 3 has type ‘uint64_t {aka long long unsigned int}’ [-Wformat]
warning: test/test_librbd.cc:1040:111: format ‘%ld’ expects argument of type ‘long int’, but argument 4 has type ‘int64_t {aka long long int}’ [-Wformat]
Signed-off-by: Sage Weil <sage@inktank.com>
This used to be conditional on config having osd_crush_location set,
but with that, minimal configuration left the OSD completely out of
the crush map, and prevented the OSD from starting properly.
Note: Ceph does not currently let this mechanism automatically move
hosts to another location in the CRUSH hierarchy. This means if you
let this run with defaults, setting osd_crush_location later will not
take effect. Set up your config file (or Chef environment) fully
before starting the OSDs the first time.
Signed-off-by: Tommi Virtanen <tv@inktank.com>
Issue #2776. Allow the removal of multiple objects in a single
rados tool command:
# rados -p pool rm obj1 [obj2 [...]]
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Bad linebreaks, wrapping, stringification, missing doc for bench args
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
Bug #2772. This fixes an issue that was introduced when we
added the 'rados cp' command. The -t param was already used
for rados bench. With this change the only way to specify
a target pool is using --target-pool.
Though this problem is post-argonaut, the 'rados cp' command
has been backported, so we need this fix there too.
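The option clash can be reproduced in miniature with Python's argparse. This is only a sketch of the command-line shape: the real rados tool is C++ and parses its arguments differently, and the parser name, the `dest` names, and the reading of -t as a concurrency count for bench are assumptions made for this example.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the option layout after the fix."""
    p = argparse.ArgumentParser(prog="rados-sketch")
    p.add_argument("-p", "--pool")
    # -t stays with bench (assumed here to mean concurrent operations).
    p.add_argument("-t", dest="concurrent_ops", type=int, default=16)
    # The target pool is only reachable through the long option.
    p.add_argument("--target-pool", dest="target_pool")
    p.add_argument("command")
    p.add_argument("args", nargs="*")
    return p


def short_option_conflicts():
    """Show that reusing -t for the target pool is rejected outright."""
    try:
        build_parser().add_argument("-t", dest="target_pool")
    except argparse.ArgumentError:
        return True
    return False


cp = build_parser().parse_args(
    ["-p", "src", "--target-pool", "dst", "cp", "obj1", "obj1-copy"])
bench = build_parser().parse_args(["-p", "src", "-t", "4", "bench", "10", "write"])
```

argparse refuses the duplicate short option at definition time. A hand-rolled parser may instead let one meaning silently shadow the other, which is why keeping a single meaning per short option is the safer layout.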
Backport: argonaut
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Sum the quantized weights for each bucket, and check that for overflow.
This could change the results of a compile marginally if the map is using
non-divisible weight values that quantize funny. The old code might
calculate a bucket sum that is not the actual sum of the quantized weights.
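A small numeric sketch of why the order of operations matters. The 16.16 fixed-point scale and the 32-bit limit are assumptions made for illustration, as are the function names. Summing first and quantizing afterwards is shown as one way a mismatch could arise, not as a description of what the old code did.

```python
SCALE = 0x10000           # assumed 16.16 fixed-point scale
MAX_WEIGHT = 0xFFFFFFFF   # assumed 32-bit unsigned limit for a bucket weight


def quantize(weight):
    """Convert a decimal weight to fixed point, truncating the remainder."""
    return int(weight * SCALE)


def bucket_weight(item_weights):
    """Sum the already-quantized item weights and check that sum for overflow."""
    total = sum(quantize(w) for w in item_weights)
    if total > MAX_WEIGHT:
        raise OverflowError("bucket weight %d exceeds %d" % (total, MAX_WEIGHT))
    return total


items = [0.1, 0.1, 0.1]
sum_of_quantized = bucket_weight(items)   # 3 * 6553
quantized_sum = quantize(sum(items))      # quantizing the decimal total instead
```

Here `sum_of_quantized` is 19659 while `quantized_sum` is 19660: each 0.1 loses a fraction when truncated, so only the first value matches what the items actually carry.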
Signed-off-by: Sage Weil <sage@inktank.com>
Disallow setting OSD weights to a value over 10,000 and cap bucket weight
at 10,000,000 in a CRUSH map. Addresses issue #2101.
Signed-off-by: caleb miles <caleb.miles@inktank.com>
This is leftover from when we built a libcrush.so. We can re-add when we
start doing that again.
Reported-by: Laszlo Boszormenyi <gcs@debian.hu>
Signed-off-by: Sage Weil <sage@inktank.com>
* simple helper to translate name to id
* verify sub type is valid in caller
* assert sub type is valid in method
* simplify iterator usage
Among other things, this gets rid of this noise in the logs:
2012-07-10 20:51:42.617152 7facb23f1700 1 mon.a@1(peon).log v310 check_sub sub monmap not log type
Signed-off-by: Sage Weil <sage@inktank.com>
This was creating a new cluster connection/session per iteration, and
along with it a few service threads and sockets and so forth.
Unfortunately, librados leaks like a sieve, starting with CephContext
and ceph::crypto::init(). See #845 and #2067.
Signed-off-by: Sage Weil <sage@inktank.com>
pinfo.stats might be wrong if we did log-based recovery on the
backfilled portion in addition to continuing backfill.
bug #2750
Signed-off-by: Samuel Just <sam.just@inktank.com>
When we are signaling the cond to indicate that a notify is complete,
take the appropriate lock. This removes the possibility of a race
that loses our signal. (Hitting that race would be very difficult given
that there are network round trips involved, but this makes the lock/cond
usage "correct.")
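The pattern can be sketched with Python's threading primitives. `NotifyWaiter` and its members are hypothetical names, and this is a model of the locking discipline rather than the librados code. Python's `Condition` refuses to notify without the lock held, whereas pthreads-style condition variables allow it, and that is where a lost wakeup can creep in.

```python
import threading


class NotifyWaiter:
    """Hypothetical completion flag guarded by a lock and condition."""

    def __init__(self):
        self.lock = threading.Lock()
        self.cond = threading.Condition(self.lock)
        self.done = False

    def complete(self):
        # Set the flag and signal while holding the lock, so a waiter is
        # either already blocked or will see done == True before blocking.
        with self.lock:
            self.done = True
            self.cond.notify_all()

    def wait(self, timeout=None):
        # Re-check the flag under the same lock; returns False on timeout.
        with self.lock:
            return self.cond.wait_for(lambda: self.done, timeout=timeout)


waiter = NotifyWaiter()
signaller = threading.Thread(target=waiter.complete)
signaller.start()
completed = waiter.wait(timeout=5)
signaller.join()
```

Without the lock around the flag update and the signal, the waiter could test the flag, find it unset, and then block just after the signal had already fired.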
Signed-off-by: Sage Weil <sage@inktank.com>
Earlier, this passed a single -t, which is overridden when stdin is not a
tty, so it had no effect.
Signed-off-by: Tommi Virtanen <tv@inktank.com>
Split out new parent info into separate retrieval methods;
structure packing on rbd_image_info_t was becoming a problem.
Deprecate old parent fields in favor of new ones.
Signed-off-by: Dan Mick <dan.mick@inktank.com>