Commit Graph

44176 Commits

Author SHA1 Message Date
Sage Weil
2317e446c5 os/newstore: use aio for wal writes, too
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:39 -04:00
Sage Weil
e580a82729 os/newstore: a few comments about wal
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:39 -04:00
Sage Weil
5d8e14653d os/newstore: combined O_DSYNC with O_DIRECT
This avoids the need for an explicit fdatasync when doing O_DIRECT.

Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:39 -04:00
Sage Weil
b7a53b5874 os/newstore: basic aio support
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:39 -04:00
Sage Weil
ba0d8d7fdd os/Newstore: add newstore_db_path option
The load of Keyvalue DB is heavy, allow user to put
DB to a seperate(fast) device.

Signed-off-by: Xiaoxi Chen <xiaoxi.chen@intel.com>
2015-09-01 13:39:39 -04:00
Sage Weil
143d48570f os/newstore: throttle wal work
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:39 -04:00
Sage Weil
efe218b4aa os/newstore: show # o_direct buffers in debug output
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:38 -04:00
Sage Weil
7e1af1e616 os/newstore: use a threadpool for applying wal events
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:38 -04:00
Sage Weil
dfd389e66a os/newstore: rebuild buffers to be page-aligned for O_DIRECT
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:38 -04:00
Sage Weil
552d95213b ceph_test_objectstore: fix omap test cleanup
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:38 -04:00
Sage Weil
04f55d8d18 os/newstore: use fdatasync instead of fsync
On XFS at least, fdatasync is sufficient to make data readable.

Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:38 -04:00
Sage Weil
1321b880cc os/newstore: update todo
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:38 -04:00
Xiaoxi Chen
65877832f8 os/Newstore: Check onode.omap_head in valid() and next()
The db iter will be set to KeyValueDB::Iterator() if onode.omap_head
not present. In that case if we touch the db iter we will get a segmentation
fault.

Prevent to touch the db iter when onode.omap_head is invalid(equals to 0).

Signed-off-by: Xiaoxi Chen <xiaoxi.chen@intel.com>
2015-09-01 13:39:38 -04:00
Xiaoxi Chen
1a97fd6cb7 Use .str() to output a stringstream.
a nit.

Signed-off-by: Xiaoxi Chen <xiaoxi.chen@intel.com>
2015-09-01 13:39:38 -04:00
Xiaoxi Chen
9d0e925566 os/Newstore: Allow gap in _do_write append mode
We can allow some gap so we only need to ensure
onode.size <= offset.

Signed-off-by: Xiaoxi Chen <xiaoxi.chen@intel.com>
2015-09-01 13:39:38 -04:00
Xiaoxi Chen
5e9c64b4dd Implement get_omap_iterator
implemented get_omap_iterator

Signed-off-by: Xiaoxi Chen <xiaoxi.chen@intel.com>
2015-09-01 13:39:38 -04:00
Xiaoxi Chen
c86410239b os/KeyValueDB: Add raw_key() interface for IteratorImpl
raw_key() is useful to split out the prefix.

Signed-off-by: Xiaoxi Chen <xiaoxi.chen@intel.com>
2015-09-01 13:39:38 -04:00
Xiaoxi Chen
b595aac4e1 test/store_test Add get_omap_iterator test cases
omap iterator test cases include:
  iter aganist omap
  lower_bound
  upper_bound

Signed-off-by: Xiaoxi Chen <xiaoxi.chen@intel.com>
2015-09-01 13:39:38 -04:00
Sage Weil
ca9bc6327d os/newstore: drop sync()
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:37 -04:00
Sage Weil
d57547f103 os/newstore: drop sync()
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:37 -04:00
Sage Weil
205344d32d os/newstore: drop flush
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:37 -04:00
Sage Weil
f93856f71a os/newstore: drop sync_and_flush
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:37 -04:00
Sage Weil
28bc4ee76e os/newstore: use FS::zero()
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:37 -04:00
Sage Weil
c67c9a2bee os/newstore: use O_DIRECT is write is page-aligned
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:37 -04:00
Sage Weil
5539a75efb os/newstore: pass flags to _{open,create}_fid
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:37 -04:00
Sage Weil
48f639beec os/newstore: drop unused FragmentHandle
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:37 -04:00
Sage Weil
93fa4f1e30 os/newstore: do not call completions from kv thread
Reads may call wait_wal() holding user locks, and so we cannot block
progress on WAL completion/flushing by calling callbacks that may take
user locks.

Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:37 -04:00
Sage Weil
86a3f7dd51 os/newstore: let wal cleanup kv txn get batched
No need to trigger another sync kv commit here; just let the next KV
commit catch it.

We could possibly do a bit better here by not waking up the kv thread at
all...

Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:37 -04:00
Sage Weil
ec21f578a7 os/newstore: fix off-by-one on overlay_max_length
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:37 -04:00
Sage Weil
f9a7fd4e4c os/newstore: use lower_bound for finding overlay extents in map
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:36 -04:00
Sage Weil
66aae98277 os/newstore: use overlay even if it is a new object or append
This avoids the fsync for small writes.

Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:36 -04:00
Xiaoxi Chen
0981428123 os/Newstore:Change assert in get_onode
db->get will return negtive when key is not found.

Signed-off-by: Xiaoxi Chen <xiaoxi.chen@intel.com>
2015-09-01 13:39:36 -04:00
Sage Weil
97bda73ebf os/newstore: open by handle
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:36 -04:00
Sage Weil
8f2c2bff30 os/newstore: use fs abstaction layer
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:36 -04:00
Xiaoxi Chen
ef420baf1c os/newstore: cap fid_max below newstore_max_dir_size
Prevent fid_max over the max_dir_size when preallocation.

Signed-off-by: Xiaoxi Chen <xiaoxi.chen@intel.com>
2015-09-01 13:39:36 -04:00
Sage Weil
59cd761bca os/newstore: keep smallish overlay extents in kv db
If we have a small overwrite, keep the extent in the key/value database.
Only write it back to the file/fragment later, and when we do, write them
all at once.

Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:36 -04:00
Sage Weil
2af1e37d7d os/newstore: assigned unique nid to each new object
Use this as the key for omap (omap_head), but keep the omap_head field
so that we can tell when no omap data is present.

Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:36 -04:00
Sage Weil
713c69884e os/newstore: consolite collection_list to a single implementation
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:36 -04:00
Xiaoxi Chen
a4d2a53cf6 Clear removed_collections after reap
Previous code forgot to clear the removed_collections queues
after reaped the collections in _reap_collection.

Signed-off-by: Xiaoxi Chen <xiaoxi.chen@intel.com>
2015-09-01 13:39:36 -04:00
Sage Weil
d8351a8d9e os/newstore: ref count OpSequencer
Our OpSequencer may live longer than the ObjectStore::Sequencer interface
object does.

Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:35 -04:00
Sage Weil
fbf3d5528f os/newstore: send complete overwrite to a new fid
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:35 -04:00
Sage Weil
db87e423b6 os/newstore: clone omap
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:35 -04:00
Sage Weil
d0a4bbaf69 newstore: initial version
This includes a bunch of new ceph_test_objectstore tests, and a ton of fixes
to existing tests so that objects actually live inside the collections they
are written to.

Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:39:35 -04:00
Sage Weil
10c0bfeadd vstart.sh: debug newstore 2015-09-01 13:39:18 -04:00
Sage Weil
be93b09fd2 Revert "os/Makefile.am: add os/fs/XFS.cc"
This reverts commit 32331ede41.

Doh, this is in a conditional below.
2015-09-01 13:38:03 -04:00
David Zafman
78e784d202 Merge pull request #5173 from ceph/wip-12000-12200
Fast read for erasure coding pool and erasure code error handling

Error handling
Reviewed-by: Loic Dachary <ldachary@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Fast Read
Reviewed-by: David Zafman <dzafman@redhat.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
2015-09-01 10:25:08 -07:00
Sage Weil
32331ede41 os/Makefile.am: add os/fs/XFS.cc
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-01 13:20:03 -04:00
Kefu Chai
82acd5e9ee Merge pull request #5695 from tchaikov/wip-12012
osd: translate sparse_read to read for ecpool

Reviewed-by: Sage Weil <sage@redhat.com>
2015-09-01 19:00:46 +08:00
Kefu Chai
076bad955d ceph_test_rados_api_aio: add a test for aio_sparse_read
Signed-off-by: Kefu Chai <kchai@redhat.com>
2015-09-01 13:52:37 +08:00
Kefu Chai
4d4920610e ceph_test_rados_api_io: add tests for sparse_read
Signed-off-by: Kefu Chai <kchai@redhat.com>
2015-09-01 13:49:22 +08:00