When journal_zero_on_create is set to true, osd mkfs fails while zeroing the journal.
The journal is opened with O_DIRECT, so the buffer must be aligned to the block size.
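
A minimal sketch of the alignment requirement, not the actual fix: it assumes a
POSIX system and uses hypothetical helper names (make_zeroed_aligned,
is_aligned). It allocates a zero-filled buffer whose address and length are both
multiples of the block size, which is what O_DIRECT writes generally need.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <memory>
#include <stdlib.h>  // posix_memalign (POSIX)

// Hypothetical helpers for illustration; these names are not from Ceph.
struct FreeDeleter {
  void operator()(void* p) const { free(p); }
};
using AlignedBuf = std::unique_ptr<char, FreeDeleter>;

// Allocate `len` zeroed bytes aligned to `block_size`.
// posix_memalign requires block_size to be a power of two and a
// multiple of sizeof(void*); 512 and 4096 both satisfy that.
inline AlignedBuf make_zeroed_aligned(std::size_t block_size, std::size_t len) {
  assert(block_size != 0 && len % block_size == 0);  // length must be block-aligned too
  void* p = nullptr;
  if (posix_memalign(&p, block_size, len) != 0)
    return AlignedBuf(nullptr);  // allocation failed
  std::memset(p, 0, len);
  return AlignedBuf(static_cast<char*>(p));
}

// True if the address is a multiple of block_size.
inline bool is_aligned(const void* p, std::size_t block_size) {
  return reinterpret_cast<std::uintptr_t>(p) % block_size == 0;
}
```

A buffer from make_zeroed_aligned(4096, 8192) can then be passed to write() on a
file descriptor opened with O_DIRECT, whereas a plain stack or malloc buffer may
be rejected with EINVAL.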
Backport: giant, firefly, dumpling
Signed-off-by: Xie Rui <875016668@qq.com>
Reviewed-by: Sage Weil <sage@redhat.com>
The latest_monmap that we stash is only used locally; the encoded bl is never shared. That means we should just use CEPH_FEATURES_ALL all of the time.
Fixes: #5203
Backport: giant, firefly
Signed-off-by: Xie Rui <875016668@qq.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Joao Eduardo Luis <joao@redhat.com>
Previously this was set in the hashbang, which meant
that "./quota.sh" was OK, but "sh ./quota.sh" would
just run through ignoring errors.
Signed-off-by: John Spray <john.spray@redhat.com>
AIO requests that are waiting on the image lock should be flushed
during all existing RBD flush scenarios. A few flush cases were
missed in the original implementation.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
The new RBD exclusive lock feature should be treated as a
feature that is only applied when the image is opened in
R/W mode.
Older clients will need to handle the updated
cls_rbd::get_features method in order to properly determine
the incompatible features for an image depending on the
current mode.
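
A rough sketch of the mode-dependent check, assuming hypothetical feature bits
and a hypothetical incompatible_features() helper (this is not the cls_rbd
code): a feature that only matters for R/W access is masked out before comparing
against what the client supports when the image is opened read-only.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical feature bits, for illustration only.
constexpr std::uint64_t FEATURE_BASE = 1ULL << 0;  // required in every mode
constexpr std::uint64_t FEATURE_LOCK = 1ULL << 2;  // stand-in for the exclusive lock

// Returns the image features that the client cannot handle in the given mode.
// A zero result means the image can be opened.
inline std::uint64_t incompatible_features(std::uint64_t image_features,
                                           std::uint64_t client_supported,
                                           bool read_only) {
  std::uint64_t required = image_features;
  if (read_only)
    required &= ~FEATURE_LOCK;  // lock is only applied for R/W opens
  return required & ~client_supported;
}
```

Under these assumptions, a client that only understands FEATURE_BASE can still
open a lock-enabled image read-only, but not read-write.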
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
The new unit tests cover the modifications made to integrate
the internal librbd functionality with the new ImageWatcher.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Directly unit test the new ImageWatcher class to complement
the existing librbd integration tests of exclusive lock
handling.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Unit tests need access to the private symbols of librbd no
longer exported from librbd.so. A new librbd_internal
convenience library was created to allow access.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Operations that update the image now require the exclusive lock
if the feature is enabled. AIO write and discard operations will
automatically request the exclusive lock from the current leader
to support live-migration.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
As a follow-on to 49d114f1ff,
increment the "extra" version field so clients can easily
determine if they have a version of librados that properly
translates C API operation flags.
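
A sketch of how a client might compare version triples, assuming it has already
obtained major, minor, and extra from librados (for example via rados_version()).
The Version struct, the at_least() helper, and the numbers in the usage note are
made up for illustration; the real threshold is whatever this change sets.

```cpp
#include <cassert>
#include <tuple>

// Hypothetical holder for the three version fields.
struct Version {
  int major_no;
  int minor_no;
  int extra_no;
};

// Lexicographic comparison: major first, then minor, then extra.
inline bool at_least(const Version& have, const Version& want) {
  return std::tie(have.major_no, have.minor_no, have.extra_no) >=
         std::tie(want.major_no, want.minor_no, want.extra_no);
}
```

For example, with a made-up threshold of {0, 69, 1}, a library reporting
{0, 69, 0} would be treated as lacking the flag translation.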
Signed-off-by: Matthew Richards <mattjrichards@gmail.com>
The new watch/notify handler replaces the existing header
update watch/notify handler and adds support for managing
image exclusive lock leadership.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Note that this will only get used if the kernel is new enough; if it is
older than 3.5 the option will get disabled and extsize will not be used
even if the option is set to true.
This partially reverts 01cd3cdc72.
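
The kernel-version gating described above could look roughly like this. It is a
sketch only: kernel_at_least() and effective_extsize() are hypothetical names,
and the release string is assumed to be in the form reported by uname -r.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Parse "major.minor" from a release string such as "3.2.0-23-generic"
// and compare it against the wanted version.
inline bool kernel_at_least(const std::string& release, int want_major, int want_minor) {
  int major_no = 0, minor_no = 0;
  if (std::sscanf(release.c_str(), "%d.%d", &major_no, &minor_no) != 2)
    return false;  // unparsable release string: conservatively treat as too old
  return major_no > want_major ||
         (major_no == want_major && minor_no >= want_minor);
}

// The option only takes effect when it is enabled AND the kernel is >= 3.5.
inline bool effective_extsize(bool configured, const std::string& release) {
  return configured && kernel_at_least(release, 3, 5);
}
```

With this shape, setting the option to true on a 3.2 kernel still yields false,
which matches the behavior the note describes.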
Fixes: #9956
Signed-off-by: Sage Weil <sage@redhat.com>
Old kernels have an XFS bug that exposes uninitialized data when the
extsize hint is set and only partially written. This is fixed by Linux
commit aff3a9edb7080f69f07fe76a8bd089b3dfa4cb5d, documented in XFS bug
http://oss.sgi.com/bugzilla/show_bug.cgi?id=874, and tested by XFS
test xfs/229 to prevent regressions.
Notably, the original bug affects kernel 3.2, which is widely deployed with
Ubuntu 12.04 (precise).
Backport: giant, firefly
Signed-off-by: Sage Weil <sage@redhat.com>
This is for use in CephFS disaster recovery. When
the metadata pool has been forcibly reset to a single-MDS
metadata tree, we would like to reset the MDSMap to match.
Signed-off-by: John Spray <john.spray@redhat.com>
Avoid taking the PG lock for a canceled read op (if we are lucky). Recheck
after the lock is taken for good measure.
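
The check, lock, recheck pattern can be sketched as follows. The PG and ReadOp
types here are simplified, hypothetical stand-ins, not the OSD classes.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Simplified stand-ins for illustration only.
struct ReadOp {
  std::atomic<bool> canceled{false};
};

struct PG {
  std::mutex lock;
  int handled = 0;  // number of ops actually processed under the lock
};

// Returns true if the op was processed, false if it was dropped as canceled.
inline bool handle_read(PG& pg, ReadOp& op) {
  // Fast path: if the op is already canceled, skip the lock entirely.
  if (op.canceled.load())
    return false;

  std::lock_guard<std::mutex> guard(pg.lock);

  // Recheck: the op may have been canceled while we waited for the lock.
  if (op.canceled.load())
    return false;

  ++pg.handled;
  return true;
}
```

The first check is only an optimization; correctness rests on the second check,
which runs with the lock held.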
Signed-off-by: Sage Weil <sage@redhat.com>
We can't use the synchronous completion callbacks (in fast dispatch
context) to do the proxy read completion work.
Signed-off-by: Sage Weil <sage@redhat.com>
Cancel and requeue proxy reads in the following cases:
1) on_shutdown
2) on_change
3) background promotion is done
Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
Conflicts:
src/osd/ReplicatedPG.cc
src/osd/ReplicatedPG.h
If we are not write ordered, continue with cache checks so that we can
(among other things) proxy reads while promoting.
Note that this may reorder reads for clients, but we've decided that's okay.
Signed-off-by: Sage Weil <sage@redhat.com>