The extent map retrieved from the fiemap might have been truncated
while reading the extents. Therefore, the map needs to be re-encoded
in the response instead of directly copied.
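A minimal sketch of why the map has to be re-encoded, assuming the extent map is an offset-to-length std::map and the read stopped short of the requested range. clip_extents and its parameters are hypothetical names for illustration, not the actual OSD code:

```cpp
#include <algorithm>
#include <cstdint>
#include <map>

// Hypothetical helper: keep only the parts of each extent that lie below
// 'end', the offset at which the read actually stopped.
std::map<uint64_t, uint64_t> clip_extents(
    const std::map<uint64_t, uint64_t>& extents, uint64_t end) {
  std::map<uint64_t, uint64_t> out;
  for (const auto& e : extents) {
    if (e.first >= end) {
      break;  // this extent and all later ones start past the data read
    }
    // Shorten an extent that straddles 'end'.
    out[e.first] = std::min<uint64_t>(e.second, end - e.first);
  }
  return out;
}
```

Once the map has been clipped like this, a copy that was encoded before the read no longer matches it, so the response has to be encoded from the clipped map.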
Fixes: #12904
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
client/fuse_ll.cc now includes <fuse.h> and <fuse_lowlevel.h>
instead of <fuse/fuse.h> and <fuse/fuse_lowlevel.h>, so we need to add
the fuse directory to the FUSE_INCLUDE_DIRS variable.
Using find_path() with just fuse.h was finding /usr/include/fuse.h
instead of the one in /usr/include/fuse/. Looking for fuse_common.h and
fuse_lowlevel.h first causes it to generate the correct
FUSE_INCLUDE_DIRS=/usr/include/fuse
Fixes: #12909
Signed-off-by: Casey Bodley <cbodley@redhat.com>
os/newstore/NewStore.cc: In member function 'int NewStore::_zero(NewStore::TransContext*, NewStore::CollectionRef&, const ghobject_t&, uint64_t, size_t)':
os/newstore/NewStore.cc:3693:32: warning: ignoring return value of 'int ftruncate(int, __off64_t)', declared with attribute warn_unused_result [-Wunused-result]
::ftruncate(fd, f.length);
^
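One way to silence this warning is to check the return value and propagate the error. A minimal sketch, assuming a negative-errno return convention; checked_ftruncate is a hypothetical wrapper, not the code that was committed:

```cpp
#include <cerrno>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical wrapper: returns 0 on success or a negative errno value,
// so the result of ftruncate() is never silently dropped.
int checked_ftruncate(int fd, off_t length) {
  int r = ::ftruncate(fd, length);
  if (r < 0) {
    return -errno;
  }
  return 0;
}
```

A caller would then return or log the error, for example by passing the negative value up from _zero().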
Signed-off-by: Sage Weil <sage@redhat.com>
It appears that db->submit_transaction() will block if there is a sync
commit that is in progress instead of simply queueing the new txn for
later. To work around this, submit these to the backend in the
kv_sync_thread prior to the synchronous submit_transaction_sync().
Signed-off-by: Sage Weil <sage@redhat.com>
Even a no-op ftruncate can block in the kernel. Prior to this change I
could frequently see ftruncate wait for an aio completion on the same
file.
Signed-off-by: Sage Weil <sage@redhat.com>
An append is expensive in terms of latency (write, fdatasync, kv commit),
while a wal write is just the kv commit and the write and fdatasync are
async. For small IOs doing the wal may improve performance.
Signed-off-by: Sage Weil <sage@redhat.com>
The read of all the overlays can be delayed until applying the wal. If
we are doing async wal apply, this can reduce write op latency by
eliminating unnecessary reads in the write code path.
Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
There is a deadlock in NewStore when newstore_sync_transaction = true.
With sync_transaction set to true, the txc state machine goes all the way
from STATE_IO_DONE to STATE_FINISHING in the same thread while holding osr->qlock.
The deadlock occurs in _txc_finish and _osr_reap_done when they try to
lock osr->qlock again.
Since _txc_finish can be called with (in sync transaction mode) or without
(in async transaction mode) the qlock held, fix this by making the qlock
PTHREAD_MUTEX_RECURSIVE so that it can be acquired recursively.
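A minimal sketch of what a recursive mutex looks like with plain POSIX threads. init_recursive_mutex is a hypothetical helper for illustration; how the qlock itself is constructed in NewStore is not shown here:

```cpp
#include <pthread.h>

// Hypothetical helper: initialize a mutex that the owning thread may lock
// again without deadlocking. Returns 0 on success or an error number.
int init_recursive_mutex(pthread_mutex_t* m) {
  pthread_mutexattr_t attr;
  int r = pthread_mutexattr_init(&attr);
  if (r != 0) {
    return r;
  }
  r = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
  if (r == 0) {
    r = pthread_mutex_init(m, &attr);
  }
  pthread_mutexattr_destroy(&attr);
  return r;
}
```

With such a mutex, a second pthread_mutex_lock() from the thread that already holds it succeeds, and the mutex is released once it has been unlocked the same number of times.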
Signed-off-by: Xiaoxi Chen <xiaoxi.chen@intel.com>
The data of the later contiguous overlays should be appended with
claim_append() to 'op->data' instead of 'bl'.
Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
There is a race condition here: if the flush_commit() call
happens after _txc_finish_kv and before the next state, the context
is pushed to on_commits but nothing will ever handle it, since
we have already passed _txc_finish_kv. The bug can be easily reproduced
by putting a sleep(5) after _txc_finish_kv and then running
ceph-osd -i 0 --mkfs.
Fix this by returning true directly when state >= STATE_KV_DONE (instead
of > as in the previous code). The data is already persisted in
STATE_KV_DONE, so this is safe.
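The boundary condition can be sketched as follows. The enum is a hypothetical subset of the txc states, ordered as the messages in this log imply, and already_committed is an illustrative name rather than the real flush_commit() code:

```cpp
// Hypothetical subset of the txc states, in the order implied by the text.
enum TxcState {
  STATE_IO_DONE,
  STATE_KV_DONE,
  STATE_FINISHING
};

// True when the transaction's data is already persisted, so the caller does
// not need to queue a context on on_commits. Using >= rather than > treats
// STATE_KV_DONE itself as committed.
bool already_committed(TxcState s) {
  return s >= STATE_KV_DONE;
}
```

With the old comparison (>), a transaction sitting in STATE_KV_DONE would have its context queued after the point where queued contexts are handled.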
Signed-off-by: Xiaoxi Chen <xiaoxi.chen@intel.com>
When writing all the overlays, there is no need to duplicate the data in
the WAL. Instead, we can reference the overlays in the WAL and remove
them after committing them to the fs. On replay, we can get
the data from the referenced overlays. This saves a
write and a deletion for each piece of overlay data in the db.
Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>