When a thread is calling AsyncConnection::handle_write, another thread may
replace the connection and requeue all messages. Because we removed the
write_lock protection for handle_write callers, the sent list can race with
out_q.
Fixes: http://tracker.ceph.com/issues/20093
Signed-off-by: Haomai Wang <haomai@xsky.com>
This finisher thread has a lot of callbacks which can hold PGRefs. Make
sure we drain them out before checking that all the PGs have finished
and have no outstanding references.
Moving this should be safe; we've already stopped the op thread et al
and the only things still running are the OSDService's objecter_finisher,
recovery_request_timer, and snap_sleep_timer (which has definitely been emptied
by the time we get here as it's synchronously cleared out on PG shutdown).
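
For illustration only, a minimal sketch of why the drain has to come first.
SimpleFinisher, PG, and no_outstanding_refs are hypothetical names, and
std::shared_ptr stands in for Ceph's PGRef; this is not the OSD shutdown code.

```cpp
#include <deque>
#include <functional>
#include <memory>
#include <utility>

struct PG {};                       // hypothetical stand-in for a placement group
using PGRef = std::shared_ptr<PG>;  // assumption: stands in for Ceph's PGRef

// Hypothetical finisher: holds queued callbacks until drained.
class SimpleFinisher {
 public:
  void queue(std::function<void()> cb) { pending_.push_back(std::move(cb)); }

  // Run and destroy every queued callback, releasing whatever it captured.
  void drain() {
    while (!pending_.empty()) {
      std::function<void()> cb = std::move(pending_.front());
      pending_.pop_front();
      cb();
    }  // cb is destroyed here, dropping any PGRef it held
  }

 private:
  std::deque<std::function<void()>> pending_;
};

// True when the caller's handle is the only remaining reference.
inline bool no_outstanding_refs(const PGRef& pg) { return pg.use_count() == 1; }
```

A callback that captured a PGRef keeps the count above one until drain()
returns, so a reference check made before the drain would report a leak that
is not really there.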
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
1. normalize arg parsing for "bucket limit check"
1.1 s/buckets/bucket/
2. avoid dividing by num_shards when it is 0
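
A sketch of the guard in item 2, under assumptions: objects_per_shard is a
hypothetical helper, and treating num_shards == 0 as a single unsharded index
is an assumption, not necessarily what the radosgw-admin change does.

```cpp
#include <cstdint>

// Hypothetical helper: average number of objects per bucket index shard.
// Assumption: a shard count of 0 means one unsharded index, so the whole
// object count is reported instead of dividing by zero.
inline std::uint64_t objects_per_shard(std::uint64_t num_objects,
                                       std::uint32_t num_shards) {
  if (num_shards == 0) {
    return num_objects;
  }
  return num_objects / num_shards;
}
```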
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
On the leader we cancel the mapping job in encode_pending. On a peon,
we don't cancel it at all! It is surprising this didn't already cause
problems, but with the PGtempMap it pretty reliably crashes with a
largish map.
Fixes: http://tracker.ceph.com/issues/20067
Signed-off-by: Sage Weil <sage@redhat.com>
Add a description of max_file_size to the CephFS admin docs.
Thanks to John Spray <jspray@redhat.com> on ceph-users for this
information.
Signed-off-by: Ken Dreyer <kdreyer@redhat.com>
We were failing to exit various wait states which held PGRefs. Error!
Fixes: http://tracker.ceph.com/issues/19931
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
On large clusters, these large caches can be problematic (as maps get big).
We've seen good results with extremely small caches (10s of maps). Make
a more modest reduction.
Signed-off-by: Sage Weil <sage@redhat.com>
The third test (increasing osd_map_max_advance)
was triggering a warning from the 4th case (which
it didn't before).
Signed-off-by: Sage Weil <sage@redhat.com>
This ensures we cache data for recent osdmaps if we need to for
the benefit of laggy clients... even if (in bluestore's case)
bluestore_default_buffered_reads = false (it's true by default). This
should mitigate any tail latency/work even if the osdmap cache size is too
small.
Signed-off-by: Sage Weil <sage@redhat.com>
Add perfcounters so we can see whether we are missing osdmaps in the
cache. This will let us tell whether, given a workload or environment,
our osdmap cache might be too small.
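
As a rough sketch of the idea, a small LRU cache that counts hits and misses.
CountingMapCache and its methods are hypothetical; the plain integer counters
stand in for perfcounters and do not use Ceph's PerfCounters API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for an osdmap cache: epoch -> encoded map, LRU eviction.
class CountingMapCache {
 public:
  explicit CountingMapCache(std::size_t capacity) : capacity_(capacity) {
    assert(capacity_ > 0);
  }

  // Insert or overwrite an entry, evicting the least recently used one if full.
  void put(std::uint32_t epoch, std::string data) {
    auto it = index_.find(epoch);
    if (it != index_.end()) {
      lru_.erase(it->second);
      index_.erase(it);
    }
    lru_.emplace_front(epoch, std::move(data));
    index_[epoch] = lru_.begin();
    if (lru_.size() > capacity_) {
      index_.erase(lru_.back().first);
      lru_.pop_back();
    }
  }

  // Look up an epoch; bumps the hit or miss counter and refreshes recency.
  bool get(std::uint32_t epoch, std::string* out) {
    auto it = index_.find(epoch);
    if (it == index_.end()) {
      ++misses_;
      return false;
    }
    ++hits_;
    lru_.splice(lru_.begin(), lru_, it->second);  // list iterators stay valid
    *out = it->second->second;
    return true;
  }

  std::uint64_t hits() const { return hits_; }
  std::uint64_t misses() const { return misses_; }

 private:
  using Entry = std::pair<std::uint32_t, std::string>;
  std::size_t capacity_;
  std::list<Entry> lru_;
  std::unordered_map<std::uint32_t, std::list<Entry>::iterator> index_;
  std::uint64_t hits_ = 0;
  std::uint64_t misses_ = 0;
};
```

A miss count that keeps climbing relative to hits under a given workload
would suggest the cache is too small for it.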
Signed-off-by: Sage Weil <sage@redhat.com>
We enable osd_debug_misdirected_ops in QA, but this is wasted effort on
a production cluster. In particular, it means that an idle client that
sends an op to the wrong OSD based on an old map will require that OSD to
load that old map into memory to decide whether to print a warning... all
on the off-chance that the client is buggy.
Signed-off-by: Sage Weil <sage@redhat.com>
There is no good reason anyone would want this turned on.
Introduced in 923e7f5ce5 (post-kraken), but
backported to kraken and jewel (10.2.6).
Signed-off-by: Sage Weil <sage@redhat.com>