We were somewhat inadvertently returning a data payload for write
operations. This was a side-effect of the OpContext::ops field being a
reference to MOSDOp::ops: the return data would end up there, and then
the MOSDOpReply ctor would copy it.
Fix this by breaking the ref, and making the do_op() logic also claim
return result data for error values (so that errors can return data to the
caller).
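A minimal sketch (generic, not the actual OSD code) of why holding a
copy instead of a reference keeps per-op output data out of the
caller's vector:

    #include <string>
    #include <vector>

    struct Op {
      std::string outdata;     // per-op return payload
    };

    struct CtxByRef {
      std::vector<Op>& ops;    // shares storage with the message's ops
    };

    struct CtxByCopy {
      std::vector<Op> ops;     // private copy; outdata stays local
      explicit CtxByCopy(const std::vector<Op>& src) : ops(src) {}
    };

    int main() {
      std::vector<Op> msg_ops(1);

      CtxByRef by_ref{msg_ops};
      by_ref.ops[0].outdata = "ends up in the reply";   // visible via msg_ops

      CtxByCopy by_copy(msg_ops);
      by_copy.ops[0].outdata = "stays in the context";  // msg_ops unchanged
      return 0;
    }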
Signed-off-by: Sage Weil <sage@inktank.com>
common/Preforker.h: In member function 'void Preforker::daemonize()':
common/Preforker.h:97:40: warning: ignoring return value of 'ssize_t write(int, const void*, size_t)', declared with attribute warn_unused_result [-Wunused-result]
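One common way to silence this kind of warning (illustrative only, not
necessarily the exact change made here) is to capture and check
write()'s return value:

    #include <unistd.h>
    #include <cstdio>

    static void notify_parent(int fd) {
      char c = 0;
      ssize_t r = write(fd, &c, 1);  // checked, so -Wunused-result is satisfied
      if (r < 0)
        perror("write");             // report (or otherwise handle) the failure
    }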
Signed-off-by: Sage Weil <sage@inktank.com>
Fix by restructuring the code to hoist the common logic so that there is
only one place where admin_socket is actually called.
Signed-off-by: Dan Mick <dan.mick@inktank.com>
'osd crush set' should only be used to update items that already exist in
the map, whereas 'osd crush add' should be able to both add and update
items. Since at that point we are effectively adding a new item to the
crush map, use 'add' instead of 'set'.
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Bootstrap doesn't use or need this; the crush update happens when the osd
starts up (see init-ceph or upstart/ceph-osd.conf).
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
The std::copy construct leaves a trailing separator character, which breaks
parsing for booleans (among other things) and probably mangles everything
else too.
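For illustration (a generic sketch, not the exact code touched here),
std::copy with an ostream_iterator appends the delimiter after every
element, including the last; joining by hand avoids the trailing
separator:

    #include <algorithm>
    #include <iostream>
    #include <iterator>
    #include <sstream>
    #include <string>
    #include <vector>

    int main() {
      std::vector<std::string> args = {"true", "1"};

      // std::copy emits the delimiter after every element, including the last.
      std::ostringstream bad;
      std::copy(args.begin(), args.end(),
                std::ostream_iterator<std::string>(bad, " "));
      std::cout << "[" << bad.str() << "]\n";   // [true 1 ]  <- trailing space

      // Joining manually emits separators only between elements.
      std::ostringstream good;
      for (size_t i = 0; i < args.size(); ++i) {
        if (i) good << " ";
        good << args[i];
      }
      std::cout << "[" << good.str() << "]\n";  // [true 1]
      return 0;
    }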
Backport: dumpling
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
The std::copy construct leaves a trailing separator character, which breaks
parsing for booleans (among other things) and probably mangles everything
else too.
Backport: dumpling
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
- Add note to docs indicating that CephFS is not recommended for
production datasets.
- Add note to docs indicating that running CephFS with multiple MDS
servers is not currently recommended.
This fixes issue #5797 (http://tracker.ceph.com/issues/5797).
Signed-off-by: Scott Devoid <devoid@anl.gov>
This patch adds ZFS parallel journal support. It uses libzfs, provided by
zfsonlinux, to access ZFS functionality. To enable ZFS parallel journal
support, compile ceph with:
./configure --with-libzfs LIBZFS_CFLAGS="-I<libzfs header> -I<libspl header>"
make
Add the following line to the osd section of ceph.conf:
filestore zfs_snap = 1
Note: ZFS (no matter whether the parallel journal is enabled or not) does
not support direct IO. To use it as the backend FS for an OSD, you need to
add the following lines to the osd section of ceph.conf:
journal aio = 0
journal dio = 0
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
This patch defines class FileStoreBackend and uses it to abstract
filesystem functionality.
Fiemap()- and syncfs()-related code is moved into class
GenericFileStoreBackend.
All btrfs-specific code is moved into class BtrfsFileStoreBackend.
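A minimal sketch of the resulting class shape (method names are
illustrative, not the exact interface):

    class FileStoreBackend {                    // abstract filesystem interface
    public:
      virtual ~FileStoreBackend() {}
      virtual int detect_features() = 0;
      virtual int syncfs() = 0;
      virtual bool has_fiemap() const = 0;
    };

    class GenericFileStoreBackend : public FileStoreBackend {
    public:
      int detect_features() { return 0; }       // probe fiemap/syncfs support
      int syncfs() { return 0; }                // generic syncfs(2)/sync(2) path
      bool has_fiemap() const { return false; }
    };

    class BtrfsFileStoreBackend : public GenericFileStoreBackend {
    public:
      int detect_features() { return 0; }       // btrfs snapshot/clone specifics
    };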
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
This patch changes fsx.sh to pull a better fsx.c from the xfstests site
to support the hole punching test.
Signed-off-by: Yunchuan Wen <yunchuanwen@ubuntukylin.com>
Signed-off-by: Li Wang <liwang@ubuntukylin.com>
Check new parameters and check that rados_id is not None again to
catch the empty string.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Sage Weil <sage.weil@inktank.com>
Previously it had no name parameter, so the default would be used by
old clients. However, if an old client set rados_id, a new check that
both rados_id and name are set would result in an error. Fix this by
only applying the default names after the check, and add tests of this
behavior.
This was introduced in 783b7ec847,
so it does not affect cuttlefish.
Fixes: #5970
Reported-by: Michael Morgan <mmorgan@dca.net>
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Sage Weil <sage.weil@inktank.com>
For aio flush, we register a wait on the most recent write. The write
completion code, however, was *only* waking a waiter if it was waiting
on that write, without regard to previous writes (completed or not).
For example, we might have 6 and 7 outstanding and wait on 7. If they
finish in order all is well, but if 7 finishes first we do the flush
completion early. Similarly, if we
- start 6
- start 7
- finish 7
- flush; wait on 7
- finish 6
we can hang forever.
Fix by having the aio write completion handler also fire any flush
completions that are prior to the oldest pending write.
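A simplified sketch of the pattern (generic, not the librados code):
each flush waiter records the tid of the most recent write at the time
of the flush, and the write completion path fires every waiter whose
tid precedes the oldest write that is still pending:

    #include <cstdint>
    #include <functional>
    #include <limits>
    #include <map>
    #include <set>

    struct AioFlushTracker {
      std::set<uint64_t> pending_writes;                            // in-flight write tids
      std::multimap<uint64_t, std::function<void()>> flush_waiters; // wait tid -> callback

      void start_write(uint64_t tid) { pending_writes.insert(tid); }

      // aio_flush_async(): register a wait on the most recent write tid.
      void register_flush(uint64_t last_write_tid, std::function<void()> cb) {
        flush_waiters.emplace(last_write_tid, std::move(cb));
      }

      // Write completion: fire every flush waiter whose tid is below the
      // oldest write that is still pending (all of them if none remain).
      void finish_write(uint64_t tid) {
        pending_writes.erase(tid);
        const uint64_t oldest_pending = pending_writes.empty()
            ? std::numeric_limits<uint64_t>::max()
            : *pending_writes.begin();
        auto it = flush_waiters.begin();
        while (it != flush_waiters.end() && it->first < oldest_pending) {
          it->second();                  // all writes up to this tid have finished
          it = flush_waiters.erase(it);
        }
      }
    };

With the scenario above, the waiter registered on tid 7 fires only once
write 6 has also completed, regardless of completion order.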
Refs: #5919
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Tested-by: Oliver Francke <Oliver.Francke@filoo.de>
Add an already-locked helper so that C_Aio{Safe,Complete} can
increment the reference count when their caller holds the
lock. C_AioCompleteAndSafe's caller is not holding the lock, so call
regular get() to ensure no racing updates can occur.
This eliminates all direct manipulations of AioCompletionImpl->ref,
and makes the necessary locking clear.
The only place C_AioCompleteAndSafe is used is in handling
aio_flush_async(), where the racy ref count update could cause a missing
completion.
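A generic sketch of the idea (names are illustrative, not the librados
implementation): one helper assumes the caller already holds the lock,
the other takes it, so the reference count is never touched directly:

    #include <mutex>

    struct RefCountedImpl {
      std::mutex lock;
      int ref = 1;

      // Caller already holds `lock`; just bump the count.
      void get_already_locked() {
        ++ref;
      }

      // Caller does not hold the lock; take it so updates cannot race.
      void get() {
        std::lock_guard<std::mutex> l(lock);
        ++ref;
      }

      void put() {
        lock.lock();
        int remaining = --ref;
        lock.unlock();
        if (remaining == 0)
          delete this;
      }
    };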
Refs: #5919
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Tested-by: Oliver Francke <Oliver.Francke@filoo.de>
Do not assert before the loop waiting for the thread to complete the
expected side effect. The whole point of the loop is to make sure
there is no window of opportunity for a race condition, and asserting
before it means taking a useless risk. If the test is run enough times,
the race will happen.
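A generic sketch of the pattern (illustrative, not the test in
question): poll until the other thread has produced the side effect,
and only assert afterwards:

    #include <atomic>
    #include <cassert>
    #include <chrono>
    #include <thread>

    int main() {
      std::atomic<bool> side_effect(false);

      std::thread t([&] {
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
        side_effect = true;                 // the expected side effect
      });

      // Wait for the side effect first; asserting before this loop would race
      // with the thread and fail intermittently.
      while (!side_effect)
        std::this_thread::sleep_for(std::chrono::milliseconds(1));

      assert(side_effect);                  // now safe to assert
      t.join();
      return 0;
    }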
Signed-off-by: Loic Dachary <loic@dachary.org>