38319 Commits

Author SHA1 Message Date
Jason Dillaman
4d3b49e9d6 rbd: ensure aio_write buffer isn't invalidated during image import
The buffer provided to aio_write shouldn't be invalidated until
after aio_write has indicated that the operation has completed.

Fixes: #10590
Backport: giant
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2015-01-21 12:23:16 -08:00
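
The buffer-lifetime rule in this commit can be illustrated with a small simulation. This is only a sketch: `FakeAioImage` and `import_image` are hypothetical names, not librbd API, and the "asynchronous" write is modelled by deferring the copy until `flush()` is called. Each write gets its own buffer, which is released only from the completion callback.

```python
class FakeAioImage:
    """Hypothetical stand-in for an image that accepts asynchronous writes."""

    def __init__(self, size):
        self.data = bytearray(size)
        self._pending = []  # (offset, buffer view, callback) not yet applied

    def aio_write(self, offset, buf, on_complete):
        # Like a zero-copy async API, keep a reference to the caller's buffer
        # instead of copying it; the bytes are read only when the write runs.
        self._pending.append((offset, memoryview(buf), on_complete))

    def flush(self):
        # Apply queued writes, then signal completion to each caller.
        pending, self._pending = self._pending, []
        for offset, view, on_complete in pending:
            self.data[offset:offset + len(view)] = view
            view.release()
            on_complete()


def import_image(image, chunks):
    """Write each chunk from its own buffer, released only on completion."""
    in_flight = {}  # write id -> buffer kept alive until its callback fires
    offset = 0
    for write_id, chunk in enumerate(chunks):
        buf = bytearray(chunk)  # fresh buffer per write, never reused early
        in_flight[write_id] = buf
        image.aio_write(offset, buf,
                        lambda wid=write_id: in_flight.pop(wid))
        offset += len(buf)
    image.flush()
    assert not in_flight  # every completion has released its buffer
    return offset
```

Reusing a single buffer across iterations would let a later chunk overwrite bytes that an earlier, still-pending write has not yet consumed, which is the class of bug the commit message describes.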
Jason Dillaman
3487683321 Merge pull request #3426 from jdurgin/wip-10592
qa: disable automatic locking for manual locking test

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
2015-01-21 14:59:57 -05:00
Sage Weil
35badbc66c Merge pull request #3427 from jdurgin/wip-cram
test: fix rbd cli tests for new feature bit

Reviewed-by: Sage Weil <sage@redhat.com>
2015-01-20 19:28:51 -08:00
Josh Durgin
fe93f73aac test: fix rbd cli tests for new feature bit
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2015-01-20 17:13:10 -08:00
Josh Durgin
946958c13c qa: disable automatic locking for manual locking test
Automatic locking hides the ESHUTDOWN from the caller, which is how
this test detects that blacklisting works.

Fixes: #10592
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2015-01-20 15:56:12 -08:00
Yehuda Sadeh
6613358ddc Revert "Merge remote-tracking branch 'origin/wip-bi-sharding-3' into next"
This reverts commit f79d8f24e9, reversing
changes made to 896c8899ac.
2015-01-19 09:26:00 -08:00
Yehuda Sadeh
f79d8f24e9 Merge remote-tracking branch 'origin/wip-bi-sharding-3' into next
2015-01-19 09:14:32 -08:00
Josh Durgin
896c8899ac Merge remote-tracking branch 'origin/wip-10271' into next
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2015-01-16 14:33:59 -08:00
Yehuda Sadeh
dbaa1420c4 rgw: bilog marker related fixes
Fix the way we parse the marker. Instead of specifying whether it's a
sharded or not sharded bucket, we pass a shard_id. If the string itself
points to a single shard, we'll use the passed shard_id, otherwise we'll
parse the string and determine the shard id by that. In this way when
referencing a single shard we can get the marker with either shard id
specified or not. This works with the non-shard case too.
Adjust the bilog listing function, set it to work with the new
interface. It was broken before, and there are multiple fixes to it.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-16 09:12:54 -08:00
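
The marker-parsing rule described in this commit might look roughly like the sketch below. The function name and the exact string layout are assumptions: it takes `shard_id#marker` as the per-shard form (the form mentioned in a later commit in this list) and, hypothetically, a comma as the separator between shards. The real rgw parser may differ.

```python
def parse_bilog_marker(marker, shard_id):
    """Return {shard_id: marker} for a plain or shard-qualified marker string.

    Assumed formats (hypothetical): "marker" for a single shard, or
    "0#m0,1#m1" for several shards.
    """
    if "#" not in marker:
        # Plain marker: it refers to one shard, so trust the caller's shard_id.
        return {shard_id: marker}
    result = {}
    for entry in marker.split(","):
        # Shard-qualified entry: the shard id comes from the string itself.
        sid, _, value = entry.partition("#")
        result[int(sid)] = value
    return result
```

With this shape, a single shard can be referenced either as a bare marker plus a `shard_id` argument or as `3#marker`, and both yield the same mapping.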
Sage Weil
bed740bbd0 Merge pull request #3342 from ceph/wip-10311
rgw: only keep track for cleanup of rados objects that were written

Reviewed-by: Ray Lv <xiangyulv@gmail.com>
2015-01-15 21:45:56 -08:00
Sage Weil
80473f6385 os/FileJournal: Fix journal write fail, align for direct io
When journal_zero_on_create is set to true, osd mkfs fails while zeroing the journal.
The journal is opened with O_DIRECT, so the buffer must be aligned to the block size.

Backport: giant, firefly, dumpling
Signed-off-by: Xie Rui <875016668@qq.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2015-01-15 11:20:18 -08:00
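
As an illustration of the alignment requirement, here is a minimal sketch that rounds a length up to a block multiple and obtains a zero-filled, page-aligned buffer. `BLOCK_SIZE` and the helper names are assumptions made for the example; the actual fix lives in the C++ FileJournal code and is not reproduced here.

```python
import ctypes
import mmap

BLOCK_SIZE = 4096  # assumed block size; real code would query the device


def align_up(n, alignment=BLOCK_SIZE):
    """Round n up to the next multiple of alignment (a power of two)."""
    return (n + alignment - 1) & ~(alignment - 1)


def aligned_zero_buffer(length):
    """Return a zero-filled buffer whose address and size are block-aligned.

    Anonymous mmap regions start on a page boundary, which satisfies the
    alignment as long as the page size is a multiple of BLOCK_SIZE.
    """
    return mmap.mmap(-1, align_up(length))


def buffer_address(buf):
    """Address of the buffer's first byte, for checking alignment."""
    return ctypes.addressof(ctypes.c_char.from_buffer(buf))
```

A write issued through a descriptor opened with O_DIRECT generally needs the buffer address, the length, and the file offset to all be multiples of the block size, which is why both the start address and the rounded-up length matter here.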
Jerry7X
cc0dba5261 mon: encode stashed monmap with all features
The latest_monmap that we stash is only used locally; the encoded bl is never shared. This means we should just use CEPH_FEATURES_ALL all of the time.

Fixes: #5203
Backport: giant, firefly
Signed-off-by: Xie Rui <875016668@qq.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Joao Eduardo Luis <joao@redhat.com>
2015-01-15 11:13:17 -08:00
Yehuda Sadeh
f3a57ee6a6 rgw: wait for completion only if no completion available
In a bucket aio operation, wait for completions only if there are no
completions available. Otherwise we might wait forever, as everything
is already complete.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-14 11:47:18 -08:00
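
The "wait only if nothing is available" logic can be sketched as follows. `CompletionManager` is a hypothetical class using a standard-library queue, not the rgw implementation. The point is only the ordering: drain finished operations first, and block only when none were ready and work is still outstanding.

```python
import queue


class CompletionManager:
    """Hypothetical tracker for in-flight async operations."""

    def __init__(self):
        self._done = queue.Queue()  # finished op ids, filled by callbacks
        self.pending = 0            # ops issued but not yet collected

    def issue(self, op_id):
        # Record that one more operation is in flight.
        self.pending += 1

    def on_complete(self, op_id):
        # Called by the I/O layer when an operation finishes.
        self._done.put(op_id)

    def collect(self, timeout=None):
        """Return finished op ids, blocking only when none are ready yet."""
        finished = []
        # First take everything that has already completed, without blocking.
        while True:
            try:
                finished.append(self._done.get_nowait())
            except queue.Empty:
                break
        # Block only if nothing was available and work is still outstanding.
        if not finished and self.pending > 0:
            finished.append(self._done.get(timeout=timeout))
        self.pending -= len(finished)
        return finished
```

Blocking unconditionally after everything has already completed is the "wait forever" case the commit message warns about.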
Sage Weil
5c8ee3388f Merge remote-tracking branch 'gh/next'
2015-01-14 08:57:33 -08:00
Yehuda Sadeh
a78a93e5a1 rgw: bi list, update marker only if result not empty
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:33 -08:00
Yehuda Sadeh
24aec123fb rgw: fix memory leak
We were iterating over both completion_objs and completions, assuming
that they follow each other. They don't. While at it, index completions
by id, so that we can update the completed objs correctly.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:32 -08:00
Yehuda Sadeh
33dc07cc7a rgw: initialize RGWBucketInfo::num_shards
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:32 -08:00
Yehuda Sadeh
d19a49d332 cls_rgw: call ioctx->aio_operate() under lock
close a race window

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:32 -08:00
Yehuda Sadeh
9d17bd0a6e rgw: fix linkage following rebase
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:32 -08:00
Yehuda Sadeh
f060dd6561 rgw: update calls to handle bucket sharding
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:32 -08:00
Yehuda Sadeh
0b5d80337e rgw: only keep track for cleanup of rados objects that were written
Fixes: #10311

We're keeping track of rados objects that we've written so that we could
clean them up if needed. Earlier we weren't too accurate about it and
were also setting the head object that is yet to be written. This now
only applies to the tail data, and is a bit clearer.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:31 -08:00
Yehuda Sadeh
ce0ed6bb19 test: fix test_cls_rgw
Adjust to new api calls.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:28 -08:00
Yehuda Sadeh
c07af12c29 cls_rgw: remove incorrect function declaration
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:27 -08:00
Yehuda Sadeh
37a11862d2 rgw: max shards configuration is part of the zone config
The zone config params are set in the region configuration. Also,
there's a ceph.conf configurable (rgw_override_bucket_index_max_shards)
for overriding this per rgw.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:27 -08:00
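
One plausible reading of the override behaviour is sketched below. The function name is hypothetical, and treating a value of 0 as "no override" is an assumption rather than something stated in the commit message.

```python
def effective_max_shards(zone_max_shards, override_max_shards=0):
    """Pick the bucket index shard count for this rgw instance.

    zone_max_shards: value from the zone config (set in the region config).
    override_max_shards: per-rgw ceph.conf value; 0 is assumed to mean
    "no override".
    """
    if override_max_shards > 0:
        return override_max_shards
    return zone_max_shards
```

Under that assumption, a gateway with `rgw_override_bucket_index_max_shards = 16` in its ceph.conf section would use 16 shards regardless of the zone setting.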
Yehuda Sadeh
710166593c rgw: pass num shards on bucket initialization
Need to pass the actual number of shards that will be used for this
specific bucket. A bucket may be created by applying metadata from a
different zone, so the number of shards might be different.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:27 -08:00
Yehuda Sadeh
9536f74ac6 rgw: write multi shard markers on replica_log appropriately
When getting a list of shard_id#marker, iterate through the shards and
write each as needed.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:27 -08:00
Yehuda Sadeh
15703cf9f9 cls_rgw: extend shards marker api
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:27 -08:00
Yehuda Sadeh
0d1f97ff31 rgw, cls_rgw: keep shard ids with oids
Instead of just having the list of oids, keep the shard ids together, so
that we can know on which shard the operation happened.
Bucket markers are just using the shard numeric id, instead of the
bucket instance shard id. This makes it easier to parse the markers
appropriately.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:26 -08:00
Yehuda Sadeh
6b1c4a0bc2 cls_rgw: clean up CLSRGWConcurrentIO
Class is no longer a template, and keeps a map of oids by shard_id. Call
issue_op() using both shard_id and oids. Shard id is used for mapping
the results in the derived classes.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:26 -08:00
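
The shape of such a base class, a map of oids keyed by shard id plus an `issue_op()` hook, might be sketched like this. The class names, the `max_aio` parameter, and the in-memory `store` are assumptions, and a thread pool is swapped in for the librados asynchronous I/O that the real CLSRGWConcurrentIO drives.

```python
from concurrent.futures import ThreadPoolExecutor


class ConcurrentShardIO:
    """Hypothetical base class: run one operation per bucket index shard."""

    def __init__(self, oids_by_shard, max_aio=4):
        self.oids_by_shard = dict(oids_by_shard)  # shard_id -> oid
        self.max_aio = max_aio                    # cap on parallel operations

    def issue_op(self, shard_id, oid):
        # Derived classes implement the per-shard operation.
        raise NotImplementedError

    def run(self):
        """Issue the op on every shard and return results keyed by shard_id."""
        with ThreadPoolExecutor(max_workers=self.max_aio) as pool:
            futures = {
                shard_id: pool.submit(self.issue_op, shard_id, oid)
                for shard_id, oid in self.oids_by_shard.items()
            }
            return {sid: f.result() for sid, f in futures.items()}


class CountEntries(ConcurrentShardIO):
    """Example derived class; the in-memory 'store' stands in for RADOS."""

    def __init__(self, oids_by_shard, store):
        super().__init__(oids_by_shard)
        self.store = store

    def issue_op(self, shard_id, oid):
        return len(self.store[oid])
```

Keying the results by shard id is what lets a derived class map each result back to the shard it came from.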
Yehuda Sadeh
a063cbac8d rgw: modify bucket instance shard marker ids
Instead of having the markers prefixed by the oids, use the bucket
instance id.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:26 -08:00
Yehuda Sadeh
0d9c2d348e rgw: bucket replica log, handle shard ids
bucket replica log can now save entries by the specified shard id

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:26 -08:00
Yehuda Sadeh
a2c3680bd1 cls_rgw: list bi log should not return marker entry
The marker should serve as a lower bound, but should not be
returned.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:26 -08:00
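
An exclusive lower bound of this kind can be sketched over a sorted list of keys. The function name and the convention that an empty marker means "start from the beginning" are assumptions for the example.

```python
from bisect import bisect_right


def list_after_marker(sorted_keys, marker, max_entries):
    """Return up to max_entries keys strictly greater than marker."""
    # bisect_right skips past an entry equal to the marker, so the marker
    # itself is never part of the result.
    start = bisect_right(sorted_keys, marker) if marker else 0
    return sorted_keys[start:start + max_entries]
```

Passing the last key of one page as the marker for the next call then yields consecutive pages with no duplicated entry.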
Yehuda Sadeh
d52a8b10a5 rgw: bucket_index_shard_hash_type fixes
Add initializations and JSON encode/decode.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:26 -08:00
Yehuda Sadeh
44bc63b166 rgw: decode the req_state bucket instance id if needed
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:25 -08:00
Yehuda Sadeh
d31e84ea94 rgw: improve bucket sharding hashing
Amplify small source changes.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:25 -08:00
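
The commit message does not spell out the hashing change, so the following is only one way "amplifying small source changes" could be done. `zlib.crc32` stands in for whatever string hash rgw uses, and both the mixing step and the prime are illustrative choices.

```python
import zlib

MIX_PRIME = 7877  # illustrative prime, assumed to exceed any shard count


def shard_for_key(key, num_shards):
    """Map an object key to a shard index in [0, num_shards)."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    h = zlib.crc32(key.encode("utf-8"))  # stand-in for the real string hash
    # Fold the low byte into the high byte so a small change in the low
    # bits also changes the high bits of the value.
    mixed = h ^ ((h & 0xFF) << 24)
    # Reduce modulo a prime first so the high bits influence the result
    # even when num_shards is a power of two.
    return mixed % MIX_PRIME % num_shards
```

Without the prime reduction, a power-of-two shard count would look only at the low bits of the hash and the mixing step would have no effect.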
Yehuda Sadeh
a33ca59e26 rgw: data changes log, log info by bucket shard id
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:25 -08:00
Yehuda Sadeh
87934706fc rgw: use new BucketShard structure for index manipulation calls
Instead of recalculating the hash every call, do it once, and pass this
structure around. Also, will be used for logging changes into the data
log.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:25 -08:00
Yehuda Sadeh
381f68acde rgw: bi log list/trim can get specific bucket shard
bucket shard can be specified on the bucket instance param. It can be
added like this: <bucket-instance>[:shard-id]

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:25 -08:00
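
Parsing the `<bucket-instance>[:shard-id]` form could look like the sketch below. It assumes, hypothetically, that a bucket instance is itself written as `name:instance-id`, so that a third colon-separated field is the shard id. The function name and the example identifiers are made up.

```python
def parse_bucket_instance(spec):
    """Split "<bucket-instance>[:shard-id]" into (instance, shard_id or None).

    Assumes (hypothetically) that a bucket instance is written as
    "name:instance-id", so a third colon-separated field is the shard id.
    """
    parts = spec.split(":")
    if len(parts) == 3:
        return ":".join(parts[:2]), int(parts[2])
    if len(parts) == 2:
        return spec, None
    raise ValueError(f"unrecognized bucket instance: {spec!r}")
```

When no shard id is given, the caller gets `None` and can decide whether to operate on every shard or on an unsharded index.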
Guang Yang
8a04c0a61b Fix the multipart uploads functional test failures due to bucket index sharding.
Signed-off-by: Guang Yang <yguang@yahoo-inc.com>
2015-01-13 19:21:24 -08:00
Guang Yang
231fa0e670 Fix get_bucket_instance_info, only build the oid if it is empty.
Signed-off-by: Guang Yang <yguang@yahoo-inc.com>
2015-01-13 19:21:24 -08:00
Guang Yang
9e45a7cd36 Adjust bi log trim implementation to work with multiple bucket shards.
Signed-off-by: Guang Yang (yguang@yahoo-inc.com)
2015-01-13 19:21:24 -08:00
Guang Yang
f9b280ea89 Adjust bi log listing to work with multiple bucket shards.
Signed-off-by: Guang Yang (yguang@yahoo-inc.com)
2015-01-13 19:21:24 -08:00
Yehuda Sadeh
47665b23df cls_rgw, rgw: switch different ops to new concurrent infrastructure
Make all the relevant ops use the CLSRGWConcurrentIO infrastructure,
which simplifies things.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:24 -08:00
Yehuda Sadeh
30d0a49c84 rgw: generalize container type for concurrent IO base class
Turned the ConcurrentIO class into a template, so that we can use the
different kinds of containers that the different operations need.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:23 -08:00
Yehuda Sadeh
04441f2878 cls_rgw, rgw: create base class for common bucket shard operations
Instead of copy pasting the same code all over again, create a base
class for the needed concurrent IO operations.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:23 -08:00
Guang Yang
9c5acd67c4 Adjust bucket stats/index checking/index rebuild/tag timeout implementation to work with multiple shards.
Signed-off-by: Guang Yang (yguang@yahoo-inc.com)
2015-01-13 19:21:23 -08:00
Guang Yang
56feee792e Adjust bucket listing to work with multiple shards.
Signed-off-by: Guang Yang (yguang@yahoo-inc.com)
2015-01-13 19:21:23 -08:00
Guang Yang
751fd07bec Adjust rgw bucket prepare/complete OP to work with multiple bucket index shards.
Signed-off-by: Guang Yang (yguang@yahoo-inc.com)
2015-01-13 19:21:22 -08:00
Guang Yang
5d004d3eac Implement sharding for bucket creation.
Signed-off-by: Guang Yang (yguang@yahoo-inc.com)
2015-01-13 19:21:22 -08:00
Guang Yang
90a3920c44 Add a new field to bucket info indicating the number of shards of this bucket and make it configurable.
Signed-off-by: Guang Yang (yguang@yahoo-inc.com)
2015-01-13 19:21:22 -08:00