When fast reads are enabled, it's possible for the ordering of a shard
read to not be enforced with respect to writes that come after because
the read completes on the primary before all shards reply. This can lead
to an ENOENT on the non-primary, and an ERR message in the cluster log,
even though everything is fine. (The reply will go back to the primary
with the error but it will be ignored since the read has completed.)
Suppress the error message so we don't see these ERR messages in the
cluster log during the normal course of events.
Fixes: http://tracker.ceph.com/issues/26972
Signed-off-by: Sage Weil <sage@redhat.com>
* refs/pull/24220/head:
test/objectstore: set pool for fsck test
Reviewed-by: Jianpeng Ma <jianpeng.ma@intel.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Two instances of fsstress clobber each other. Just build it in the local sandbox.
Fixes: http://tracker.ceph.com/issues/24177
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
If CEPH_DEBUG_MUTEX is defined, use the [recursive_]mutex_debug classes
that implement lockdep and a bunch of other random debug checks. Also
typedef ceph::condition_variable to std::condition_variable_debug, which
adds additional assertions and debug checks.
If CEPH_DEBUG_MUTEX is not defined, then use the bare-bones C++ std::mutex
primitives... or as close as we can get to them.
Since the [recursive_]mutex_debug classes take a string argument for the
lockdep piece, define factory functions ceph::make_[recursive_]mutex that
either pass arguments to the debug implementations or toss them out.
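
A minimal sketch of the factory idea, assuming C++17. The `sketch`
namespace and the simplified `mutex_debug` class are hypothetical
stand-ins for illustration, not the real Ceph classes:

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <utility>

namespace sketch {

#ifdef CEPH_DEBUG_MUTEX
// Hypothetical stand-in for a debug mutex: it keeps the name that a
// lockdep-style checker would use and tracks whether it is locked.
class mutex_debug {
public:
  mutex_debug(std::string name) : name_(std::move(name)) {}
  void lock() { m_.lock(); locked_ = true; }
  bool try_lock() {
    if (!m_.try_lock()) return false;
    locked_ = true;
    return true;
  }
  void unlock() {
    assert(locked_);  // debug check: unlocking an unlocked mutex is a bug
    locked_ = false;
    m_.unlock();
  }
  const std::string& name() const { return name_; }
private:
  std::mutex m_;
  std::string name_;
  bool locked_ = false;
};
using mutex = mutex_debug;

// Debug build: forward the arguments to the debug implementation.
template <typename... Args>
mutex make_mutex(Args&&... args) {
  return {std::forward<Args>(args)...};
}
#else
using mutex = std::mutex;

// Non-debug build: accept the same arguments, but toss them out.
template <typename... Args>
mutex make_mutex(Args&&...) {
  return {};
}
#endif

}  // namespace sketch
```

Callers write `sketch::mutex m = sketch::make_mutex("my_lock");` in both
configurations; only the debug build keeps the name.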
Signed-off-by: Sage Weil <sage@redhat.com>
I don't see any purpose for this, and it prevents us from knowing whether
the mutex is recursive when _will_lock() is called.
Signed-off-by: Sage Weil <sage@redhat.com>
This means that BucketTrimManager will track active buckets based on
local changes, rather than changes in remote datalogs or error repos.
Fixes: http://tracker.ceph.com/issues/36034
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Specifically fixes the recurring `test_osd.py` error in the
`test_scrub` method. This change should also prevent other issues of
the same kind, i.e. issues that occur when a test does not immediately
result in a clean cluster status and is not explicitly programmed to
wait for it.
Fixes: http://tracker.ceph.com/issues/36107
Signed-off-by: Patrick Nawracay <pnawracay@suse.com>
* refs/pull/23985/head:
ceph-objectstore-tool: add back pool dne check
qa/suites/rados/singleton/reg11184: remove old test
ceph-objectstore-tool: import pg at original epoch
osd: handle null pg slot on startup
ceph-objectstore-tool: drop support for ancient export files
osd: avoid dropping osd_lock when pg osdmaps are not laggy
qa/standalone/osd/pg-merge.sh: add merge vs pg import test
Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
* refs/pull/24064/head:
osd: simplify init of fabricated pg
osd/PG: inherit pg history from merge source, if necessary
osd/osd_types: increasing pg_num_pending is also an interval change
osd: cancel pg merge if PGs are undersized
mon/OSDMonitor: handle ready_to_merge message that cancels the merge
osd/PG: only signal ready_to_merge if we have all replicas
osd/PG: move all mark_clean-ish activity into try_mark_clean()
osd/PG: use last_epoch_clean from ReadyToMerge point in time for fabricated history
osd: send last_epoch_clean when indicating PG is ready to merge
osd/osd_types: rename pg_num_pending_dec_epoch -> pg_num_dec_last_epoch_clean
osd,mon: stop setting pg_num_pending_dec_epoch
Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
/ceph/src/osd/PG.cc: In member function 'void
PG::choose_async_recovery_ec(const std::map<pg_shard_t, pg_info_t>&,
const pg_info_t&, std::vector<int>*, std::set<pg_shard_t>*) const':
/ceph/src/osd/PG.cc:1572:32: warning: comparison of integer expressions
of different signedness: 'long int' and 'long unsigned int'
[-Wsign-compare]
if (approx_missing_objects > cct->_conf.get_val<uint64_t>(
~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
"osd_async_recovery_min_cost")) {
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/ceph/src/osd/PG.cc: In member function 'void
PG::choose_async_recovery_replicated(const std::map<pg_shard_t,
pg_info_t>&, const pg_info_t&, std::vector<int>*, std::set<pg_shard_t>*)
const':
/ceph/src/osd/PG.cc:1625:33: warning: comparison of integer expressions
of different signedness: 'long int' and 'long unsigned int'
[-Wsign-compare]
if (approx_missing_objects > cct->_conf.get_val<uint64_t>(
~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
"osd_async_recovery_min_cost")) {
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
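
For illustration only, one possible way to compare a signed count with
an unsigned threshold without tripping -Wsign-compare. The helper name
`exceeds_threshold` is hypothetical, and this is not necessarily the
change made in PG.cc:

```cpp
#include <cassert>
#include <cstdint>

// Compare a signed value against an unsigned threshold without the
// implicit signed-to-unsigned conversion that -Wsign-compare flags.
inline bool exceeds_threshold(int64_t approx_missing, uint64_t min_cost) {
  if (approx_missing < 0) {
    return false;  // a negative count can never exceed an unsigned threshold
  }
  return static_cast<uint64_t>(approx_missing) > min_cost;
}

// Small usage example covering the boundary cases.
inline void exceeds_threshold_example() {
  assert(exceeds_threshold(101, 100));
  assert(!exceeds_threshold(100, 100));
  assert(!exceeds_threshold(-1, 100));
}
```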
Signed-off-by: Kefu Chai <kchai@redhat.com>
Since 0bd2546eac, we check the pool id of each object when performing
fsck to ensure we are looking at the right collection, but the test still
uses the pool id set by the constructor of hobject_t. So all objects
created in that test belong to POOL_META, while the collection is
created with a pool id of `555`. Hence the test fails.
Fixes: http://tracker.ceph.com/issues/36099
Signed-off-by: Kefu Chai <kchai@redhat.com>
Also, fix a bunch of quirky journal_tool invocations that pass the
"--rank" argument as part of the command arguments rather than as a
function argument.
Fixes: https://tracker.ceph.com/issues/24780
Signed-off-by: Venky Shankar <vshankar@redhat.com>
... and do not silently act on the default filesystem.
Force users to specify the filesystem name and rank.
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Some helper functions clobber the passed-in arg vector. This
causes operations on all but the first rank to fail, as the
operations for the first rank tamper with the arg vector.
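
A minimal sketch of the failure mode and one way around it: give each
rank its own copy of the arguments. `consume_command` and
`run_for_ranks` are hypothetical names for illustration and are not the
tool's real helpers:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper that consumes the argument it parses, mimicking a
// function that clobbers the vector passed to it.
inline std::string consume_command(std::vector<std::string>& args) {
  assert(!args.empty());
  std::string cmd = args.front();
  args.erase(args.begin());
  return cmd;
}

// Run the same command for every rank. Each iteration works on its own
// copy, so the caller's vector stays intact for the next rank.
inline std::vector<std::string> run_for_ranks(
    const std::vector<std::string>& args, int num_ranks) {
  std::vector<std::string> seen;
  for (int rank = 0; rank < num_ranks; ++rank) {
    std::vector<std::string> rank_args = args;  // per-rank copy
    seen.push_back(consume_command(rank_args) + "@" + std::to_string(rank));
  }
  return seen;
}
```

If `consume_command` were handed the shared vector instead, the second
rank would see the arguments already shifted by the first.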
Signed-off-by: Venky Shankar <vshankar@redhat.com>
cephfs-journal-tool supports operations on all ranks. Operations
such as dump/export do not write to distinct filenames, so the
data dumped or exported for one rank is overwritten by the next.
With this change (and further commits), for operations on all
ranks, dump/export write to distinct filenames suffixed by the
mds rank (.0, .1, etc.). For operations on a single rank, or if
only a single rank exists, the passed-in filename is used as is.
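
The naming rule can be sketched as follows. The function name
`output_path_for_rank` and its parameters are assumptions made for
illustration, not the tool's actual interface:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Pick the output filename for one rank. The rank suffix is only added
// when the operation targets all ranks and more than one rank exists.
inline std::string output_path_for_rank(const std::string& path,
                                        int rank,
                                        bool all_ranks,
                                        std::size_t num_ranks) {
  assert(rank >= 0);
  if (all_ranks && num_ranks > 1) {
    return path + "." + std::to_string(rank);
  }
  return path;  // single rank: use the passed-in filename as is
}
```

For example, with two ranks and an all-ranks export to `journal.bin`,
this yields `journal.bin.0` and `journal.bin.1`.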
Signed-off-by: Venky Shankar <vshankar@redhat.com>