Commit Graph

44255 Commits

Author SHA1 Message Date
Loic Dachary
e8ba262723 Merge pull request #5722 from cxwshawn/vs-fix
vstart.sh: add --mon_num --osd_num --mds_num --rgw_port option

Reviewed-by: Loic Dachary <ldachary@redhat.com>
2015-09-02 02:11:20 +02:00
Loic Dachary
adf3a9e3df Merge pull request #5693 from tchaikov/wip-12730
common/SubProcess: silence compiler warnings

Reviewed-by: Loic Dachary <ldachary@redhat.com>
2015-09-02 02:05:27 +02:00
Loic Dachary
385cb96b8a Merge pull request #5643 from dreamhost/wip-make-check-makeopt
make-check: support MAKEOPTS overrides.

Reviewed-by: Loic Dachary <ldachary@redhat.com>
2015-09-02 01:56:01 +02:00
Loic Dachary
ede98fea66 Merge pull request #5299 from hjwsm1989/pgmonitor-const
mon: added const to dump_* functions in PGMonitor

Reviewed-by: Loic Dachary <ldachary@redhat.com>
2015-09-02 01:03:14 +02:00
Loic Dachary
4d23d794b0 Merge pull request #5156 from rubenk/fix-indentation
Fix indentation

Reviewed-by: Loic Dachary <ldachary@redhat.com>
2015-09-02 00:51:58 +02:00
Loic Dachary
5ecf3b06cd Merge pull request #5275 from tchaikov/wip-12287
pybind/ceph_argparse: do not choke on non-ascii prefix

Reviewed-by: Loic Dachary <ldachary@redhat.com>
2015-09-02 00:49:23 +02:00
Loic Dachary
539acac876 Merge pull request #5702 from Sandy4999/wip-doc-sandy
doc:radosgw: correct typos of the command removing a subuser

Reviewed-by: Abhishek Lekshmanan <abhishek.lekshmanan@ril.com>
2015-09-01 23:57:53 +02:00
Sage Weil
7492c55971 Merge pull request #5747 from ceph/wip-user
fix ceph-disk

Reviewed-by: Loic Dachary <ldachary@redhat.com>
2015-09-01 16:11:21 -04:00
Loic Dachary
b477fabe97 Merge pull request #5742 from dachary/wip-user
tests: ceph-disk: dmcrypt simplification

Reviewed-by: Sage Weil <sage@redhat.com>
2015-09-01 21:15:21 +02:00
Loic Dachary
8209800e2b Merge pull request #5746 from ceph/wip-fix-doc-build
doc: fix the code-block in ruby.rst

Reviewed-by: Loic Dachary <ldachary@redhat.com>
2015-09-01 21:04:38 +02:00
Kefu Chai
cbe85ec126 doc: fix the code-block in ruby.rst
* and add the link to library homepage in the section titles

Signed-off-by: Kefu Chai <kchai@redhat.com>
2015-09-02 02:51:05 +08:00
Sage Weil
37462359a3 Merge pull request #5578 from ceph/wip-newstore
osd: newstore (experimental)
2015-09-01 13:48:06 -04:00
Sage Weil
b7c5bd12b4 ceph_test_keyvaluedb: add simple commit latency benchmark
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:44 -04:00
Sage Weil
05d79b66cf os/newstore: update todo
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:44 -04:00
Sage Weil
eab4d53b74 do_autogen.sh: build static rocksdb by default
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:44 -04:00
Sage Weil
caf28fe9a5 rocksdb: update alt dist rule
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:44 -04:00
Sage Weil
9d1582d71f ceph_test_objectstore: make OMapIterator test work with FileStore
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:44 -04:00
Sage Weil
1fa2ef2347 ceph_test_objectstore: enable newstore tests
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:44 -04:00
Sage Weil
9050486306 rocksdb: update to 3.11.2
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:44 -04:00
Sage Weil
0d463ffdec os/RocksDBStore: make other rmkey match
No need for Slice() here; it can take a string.

Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:44 -04:00
Sage Weil
d6b0e53c54 os/RocksDBStore: fix rmkey()
This took way too long to debug!

Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:43 -04:00
Sage Weil
522f8509ad ceph_test_keyvaluedb: some simple KeyValueDB unit tests
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:43 -04:00
Sage Weil
f3ddb75e3e os/newstore: fix end bound on collection_list
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:43 -04:00
Sage Weil
c37b06d0fb os/newstore: flush object before doing omap reads
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:43 -04:00
Sage Weil
faca5d0044 os/newstore: add 'newstore backend options' to pass options to e.g. rocksdb
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:43 -04:00
Sage Weil
094a190fd7 os/newstore: change escaping chars
# is lowest besides space and !, except for " (which would be too
confusing).

Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:43 -04:00
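The ordering claim in 094a190fd7 can be checked against ASCII code points. The snippet below is a minimal sketch, not NewStore code: `lowest_escape_candidate` is a hypothetical helper that returns the lowest printable ASCII character once space, `!`, and `"` are ruled out.

```python
# The four lowest printable ASCII characters, in code-point order.
lowest = [chr(c) for c in range(0x20, 0x24)]

def lowest_escape_candidate(excluded):
    """Return the lowest printable ASCII char not in `excluded` (hypothetical helper)."""
    for code in range(0x20, 0x7F):
        ch = chr(code)
        if ch not in excluded:
            return ch
    raise ValueError("no candidate left")

# Space and '!' sort lower, and '"' is skipped as too confusing,
# which leaves '#' as the lowest usable escape character.
choice = lowest_escape_candidate({" ", "!", '"'})
print(lowest, choice)
```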
Sage Weil
79799ca109 os/newstore: trim overlay when zeroing extent
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:43 -04:00
Sage Weil
15382c50d8 os/newstore: tolerate null pnext to collection_list()
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:43 -04:00
Sage Weil
a1f0bdb0bd os/newstore: fix collection range for temp objects
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:43 -04:00
Xiaoxi Chen
404cdd286d os/newstore: Implement fiemap
For simplicity we ignore holes inside an fragment now.

Signed-off-by: Xiaoxi Chen <xiaoxi.chen@intel.com>
2015-09-01 13:39:43 -04:00
Sage Weil
8ad6b9dfc3 os/newstore: make sync/async submit_transaction optional
It seems doing this synchronously may be better for SSDs?

Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:43 -04:00
Sage Weil
35821d3aa9 os/newstore: renamed TransContext::fds -> sync_items
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:43 -04:00
Sage Weil
92979d750b os/newstore: queue kv transactions in kv_sync_thread
It appears that db->submit_transaction() will block if there is a sync
commit that is in progress instead of simply queueing the new txn for
later.  To work around this, submit these to the backend in the
kv_sync_thread prior to the synchronous submit_transaction_sync().

Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:42 -04:00
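The workaround described in 92979d750b can be sketched roughly as follows. This is an illustration only: `FakeDB`, `KVSyncThread`, and the `"sync-marker"` transaction are hypothetical stand-ins, and the real code is C++ inside NewStore. The point is that callers only queue work, and a single thread submits the queued transactions before issuing the synchronous commit.

```python
import threading

class FakeDB:
    """Stand-in for the key/value backend; records calls in order."""
    def __init__(self):
        self.log = []

    def submit_transaction(self, txn):
        self.log.append(("submit", txn))

    def submit_transaction_sync(self, txn):
        self.log.append(("sync", txn))

class KVSyncThread:
    """Callers queue transactions; only this thread talks to the backend."""
    def __init__(self, db):
        self.db = db
        self.cond = threading.Condition()
        self.pending = []
        self.stop = False
        self.thread = threading.Thread(target=self._run)
        self.thread.start()

    def queue(self, txn):
        with self.cond:
            self.pending.append(txn)
            self.cond.notify()

    def shutdown(self):
        with self.cond:
            self.stop = True
            self.cond.notify()
        self.thread.join()

    def _run(self):
        while True:
            with self.cond:
                while not self.pending and not self.stop:
                    self.cond.wait()
                if not self.pending and self.stop:
                    return
                batch, self.pending = self.pending, []
            # Submit everything queued so far, then do one synchronous commit.
            for txn in batch:
                self.db.submit_transaction(txn)
            self.db.submit_transaction_sync("sync-marker")

db = FakeDB()
kv = KVSyncThread(db)
for i in range(3):
    kv.queue(i)
kv.shutdown()
submitted = [txn for kind, txn in db.log if kind == "submit"]
```

Because the queue is drained under the lock and submitted outside it, callers never wait on the backend while a sync commit is in flight.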
Sage Weil
22a6a9f768 os/newstore: process multiple aio completions at a time
This isn't affecting things for a slow disk, but it will matter for faster
backends.

Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:42 -04:00
Sage Weil
9c2eb28589 os/newstore: clean up kv commit debug output
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:42 -04:00
Sage Weil
90e7f5e648 os/newstore: only ftruncate if i_size is incorrect
Even a no-op ftruncate can block in the kernel.  Prior to this change I
could frequently see ftruncate wait for an aio completion on the same
file.

Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:42 -04:00
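The idea behind 90e7f5e648 is to check the file size first and skip the truncate when it is already correct. A minimal sketch using the Python standard library, assuming a hypothetical helper name `truncate_if_needed` (the actual change is in NewStore's C++ code):

```python
import os
import tempfile

def truncate_if_needed(fd, expected_size):
    """Call ftruncate only when the current size differs; return True if it did."""
    if os.fstat(fd).st_size == expected_size:
        return False  # size already correct, so skip the syscall
    os.ftruncate(fd, expected_size)
    return True

with tempfile.TemporaryFile() as f:
    f.write(b"x" * 4096)
    f.flush()
    fd = f.fileno()
    first = truncate_if_needed(fd, 4096)   # size matches, nothing to do
    second = truncate_if_needed(fd, 1024)  # size differs, truncates
    final_size = os.fstat(fd).st_size
```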
Sage Weil
4c1552001a Revert "os/newstore: avoid sync append for small ios"
This reverts commit 69baab2f7e.

This is slower.  :(
2015-09-01 13:39:42 -04:00
Sage Weil
e89b2474b7 os/newstore: avoid sync append for small ios
An append is expensive in terms of latency (write, fdatasync, kv commit),
while a wal write is just the kv commit and the write and fdatasync are
async.  For small IOs doing the wal may improve performance.

Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:42 -04:00
Sage Weil
668c277715 rocksdb: fallocate_with_keep_size = false
This improves my 4k random writes on hdd by about 25%.

Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:42 -04:00
Sage Weil
08f3efb474 Revert "os/NewStore: data_map shouldn't be empty when writing all overlays"
This reverts commit 0d9cce462f.

We may want to write an overlay if the object is new and the write is small, to defer the cost
of the fsync.
2015-09-01 13:39:42 -04:00
Zhiqiang Wang
02d0ef8fe0 os/NewStore: delay the read of all the overlays until wal applying
The read of all the overlays can be delayed until applying the wal. If
we are doing async wal apply, this can reduce write op latency by
eliminating unnecessary reads in the write code path.

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
2015-09-01 13:39:42 -04:00
Xiaoxi Chen
e3abf245ba os/newstore: fix deadlock when newstore_sync_transaction=true
There is a deadlock issue in Newstore when newstore_sync_transaction = true.
With sync_transaction set to true, the txc state machine will go all the way down
from STATE_IO_DONE to STATE_FINISHING in the same thread, while holding the osr->qlock().
The deadlock is caused in _txc_finish and _osr_reap_done, when trying to
lock osr->qlock again.

Since _txc_finish can be called with (in sync transaction mode) or without
(in async transaction mode) the qlock held, fix this by making the qlock
PTHREAD_MUTEX_RECURSIVE, so that the qlock can be acquired recursively.

Signed-off-by: Xiaoxi Chen <xiaoxi.chen@intel.com>
2015-09-01 13:39:42 -04:00
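The deadlock and the fix in e3abf245ba can be illustrated with a small sketch. `Sequencer`, `submit_sync`, and `finish` are hypothetical stand-ins for the osr and the txc state machine. Python's `threading.RLock` plays the role of a `PTHREAD_MUTEX_RECURSIVE` mutex here.

```python
import threading

class Sequencer:
    """Toy stand-in for the osr: finish() always takes qlock."""
    def __init__(self, lock):
        self.qlock = lock
        self.finished = []

    def finish(self, txc):
        # Reached either directly (async mode) or from submit_sync (sync mode).
        with self.qlock:
            self.finished.append(txc)

    def submit_sync(self, txc):
        # Sync mode: the whole state machine runs in this thread with qlock held.
        with self.qlock:
            self.finish(txc)

# A plain lock cannot be re-acquired by its holder; a blocking acquire here
# would hang forever, so probe with a non-blocking attempt instead.
plain = threading.Lock()
plain.acquire()
would_deadlock = not plain.acquire(blocking=False)
plain.release()

# With a recursive lock both call paths work.
osr = Sequencer(threading.RLock())
osr.submit_sync("txc-1")  # qlock taken twice by the same thread
osr.finish("txc-2")       # qlock taken once
```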
Zhiqiang Wang
cdc652ebbe os/NewStore: fix the append of the later overlays when doing combination
The data of the later contiguous overlays should be appended via claim_append to
'op->data', instead of 'bl'.

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
2015-09-01 13:39:41 -04:00
Xiaoxi Chen
36ed3dd20a os/Newstore: flush_commit return true on STATE_KV_DONE
There is a race condition here: if the flush_commit() call
happens after _txc_finish_kv and before the next state, the context
is pushed to on_commits but nothing will handle it, since
we have already passed _txc_finish_kv. This bug can be easily reproduced
by putting a sleep(5) after _txc_finish_kv and triggering it with
ceph-osd -i 0 --mkfs.

Fix this bug by returning true directly when state >= STATE_KV_DONE (instead
of > as in the previous code). The data is already persisted in STATE_KV_DONE, so
it's safe to do this.

Signed-off-by: Xiaoxi Chen <xiaoxi.chen@intel.com>
2015-09-01 13:39:41 -04:00
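The off-by-one in 36ed3dd20a comes down to `>` versus `>=` on the state comparison. A minimal sketch, assuming a reduced, hypothetical `State` enum and simplified `flush_commit` functions (the real state machine has more states and lives in NewStore's C++ code):

```python
from enum import IntEnum

class State(IntEnum):
    # Hypothetical subset of the txc states; only the ordering matters here.
    PREPARE = 0
    IO_DONE = 1
    KV_DONE = 2
    FINISHING = 3

def flush_commit_old(state, on_commits, ctx):
    """Previous behavior: a txc sitting exactly in KV_DONE still queues ctx."""
    if state > State.KV_DONE:
        return True
    on_commits.append(ctx)  # nobody runs this once _txc_finish_kv has passed
    return False

def flush_commit_fixed(state, on_commits, ctx):
    """Fixed behavior: KV_DONE already counts as committed."""
    if state >= State.KV_DONE:
        return True
    on_commits.append(ctx)
    return False

old_queue, new_queue = [], []
old_result = flush_commit_old(State.KV_DONE, old_queue, "ctx")
new_result = flush_commit_fixed(State.KV_DONE, new_queue, "ctx")
```

In the old version the context lands in `old_queue` and is never run; in the fixed version nothing is queued and the caller is told the commit is already flushed.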
Zhiqiang Wang
e02e743857 os/NewStore: avoid dup the data of the overlays in the WAL
When writing all the overlays, there is no need to dup the data in the WAL.
Instead, we can reference the overlays in the WAL, and remove these
overlays after committing them to the fs. When replaying, we can get
the data from the referenced overlays. This way, we save a
write and a deletion for each piece of overlay data in the db.

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
2015-09-01 13:39:41 -04:00
Sage Weil
6399f1d060 os/newstore: fix multiple aio case
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:41 -04:00
Sage Weil
2a7393a446 os/newstore: more conservative default for aio queue depth
There appears to be a kernel aio bug when the queue depth is small.

Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:41 -04:00
Xiaoxi Chen
37da4292b3 os/newstore: close fd after writing with O_DIRECT
Fix bug in 2b4c60e0a5

Signed-off-by: Xiaoxi Chen <xiaoxi.chen@intel.com>
2015-09-01 13:39:41 -04:00
Zhiqiang Wang
65055a0207 os/NewStore: need to increase the wal op length when combining overlays
Need to add the length of the combining overlays to the length of the
wal op.

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
2015-09-01 13:39:41 -04:00
Xiaoxi Chen
df239f0f62 os/Newstore:Fix collection_list_range
We need to rule out hobject_t::max before calling get_object_key
(which calls get_filestore_key_u32 and hits an assert failure)

Signed-off-by: Xiaoxi Chen <xiaoxi.chen@intel.com>
2015-09-01 13:39:41 -04:00