Commit Graph

38535 Commits

Author SHA1 Message Date
Sage Weil
5c8ee3388f Merge remote-tracking branch 'gh/next' 2015-01-14 08:57:33 -08:00
John Spray
9daeaec5c6 mds: handle heartbeat_reset during shutdown
Because any thread might grab mds_lock and call heartbeat_reset
immediately after a call to suicide() completes, this needs
to be handled as a special case where we tolerate MDS::hb having
already been destroyed.

Fixes: #10382
Signed-off-by: John Spray <john.spray@redhat.com>
2015-01-14 12:00:17 +00:00
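
The shutdown case described in the commit above can be illustrated with a small sketch. It is not the MDS code: the `Daemon` class, its `hb` dictionary and the boolean return value are hypothetical stand-ins, and the sketch only assumes that the heartbeat handle is destroyed under the same lock that `heartbeat_reset` callers take.

```python
import threading


class Daemon:
    """Hypothetical stand-in for a daemon that owns a heartbeat handle."""

    def __init__(self):
        self.lock = threading.Lock()
        self.hb = {"resets": 0}  # stand-in for the heartbeat handle

    def suicide(self):
        # Shutdown destroys the handle while holding the lock.
        with self.lock:
            self.hb = None

    def heartbeat_reset(self):
        # Any thread may get here right after suicide() released the lock,
        # so a missing handle is tolerated instead of dereferenced.
        with self.lock:
            if self.hb is None:
                return False
            self.hb["resets"] += 1
            return True
```

A caller that loses the race simply gets `False` back and does nothing further.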
Zhiqiang Wang
fc5cb3cf2e osd/ReplicatedPG: remove unnecessary parameters
In functions can_skip_promote and do_cache_redirect.

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
2015-01-14 14:26:03 +08:00
Zhiqiang Wang
78b2cf0327 osd: force promotion for watch/notify ops
Watch/notify ops can't be proxied.

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
2015-01-14 14:25:10 +08:00
Zhiqiang Wang
c8bef13c2c osd/OpRequest: add osd op flag CEPH_OSD_RMW_FLAG_PROMOTE
This flag indicates the osd op needs to force a promotion when there is
a cache tier.

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
2015-01-14 14:18:40 +08:00
Yehuda Sadeh
a78a93e5a1 rgw: bi list, update marker only if result not empty
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:33 -08:00
Yehuda Sadeh
24aec123fb rgw: fix memory leak
We were iterating over both completion_objs and completions, assuming that
they follow each other. They don't. While at it, index completions
by id, so that we can update the completed objs correctly.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:32 -08:00
Yehuda Sadeh
33dc07cc7a rgw: initialize RGWBucketInfo::num_shards
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:32 -08:00
Yehuda Sadeh
d19a49d332 cls_rgw: call ioctx->aio_operate() under lock
close a race window

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:32 -08:00
Yehuda Sadeh
9d17bd0a6e rgw: fix linkage following rebase
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:32 -08:00
Yehuda Sadeh
f060dd6561 rgw: update calls to handle bucket sharding
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:32 -08:00
Yehuda Sadeh
0b5d80337e rgw: only keep track for cleanup of rados objects that were written
Fixes: #10311

We're keeping track of rados objects that we've written so that we could
clean them up if needed. Earlier we weren't too accurate about it and
were also setting the head object that is yet to be written. This now
only applies to the tail data, and is a bit clearer.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:31 -08:00
Yehuda Sadeh
ce0ed6bb19 test: fix test_cls_rgw
Adjust to new api calls.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:28 -08:00
Yehuda Sadeh
c07af12c29 cls_rgw: remove incorrect function declaration
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:27 -08:00
Yehuda Sadeh
37a11862d2 rgw: max shards configuration is part of the zone config
The zone config params are set in the region configuration. Also,
there's a ceph.conf configurable (rgw_override_bucket_index_max_shards)
for overriding this per rgw.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:27 -08:00
Yehuda Sadeh
710166593c rgw: pass num shards on bucket initialization
Need to pass the actual num shards that are going to be used for this
specific bucket. A bucket may be created by applying metadata from
a different zone, so num shards might be different.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:27 -08:00
Yehuda Sadeh
9536f74ac6 rgw: write multi shard markers on replica_log appropriately
When getting a list of shard_id#marker, iterate through the shards and
write each as needed.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:27 -08:00
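
As a rough illustration of iterating over `shard_id#marker` entries, the sketch below splits such a list into a per-shard mapping. The commit does not say how the entries are separated, so the comma separator is an assumption, and `parse_shard_markers` is a hypothetical helper rather than an rgw function.

```python
def parse_shard_markers(spec, sep=","):
    """Split 'shard_id#marker' entries into a {shard_id: marker} dict.

    The separator between entries is an assumption; adjust as needed.
    """
    markers = {}
    for item in spec.split(sep):
        item = item.strip()
        if not item:
            continue
        shard, found, marker = item.partition("#")
        if not found:
            raise ValueError("missing '#' in entry: %r" % item)
        markers[int(shard)] = marker  # int() rejects non-numeric shard ids
    return markers
```

Each key of the result can then be written to its own shard's log position.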
Yehuda Sadeh
15703cf9f9 cls_rgw: extend shards marker api
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:27 -08:00
Yehuda Sadeh
0d1f97ff31 rgw, cls_rgw: keep shard ids with oids
Instead of just having the list of oids, keep the shard ids together, so
that we can know on which shard the operation happened.
Bucket markers are just using the shard numeric id, instead of the
bucket instance shard id. This makes it easier to parse the markers
appropriately.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:26 -08:00
Yehuda Sadeh
6b1c4a0bc2 cls_rgw: clean up CLSRGWConcurrentIO
Class is no longer a template, and keeps a map of oids by shard_id. Call
issue_op() using both shard_id and oids. Shard id is used for mapping
the results in the derived classes.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:26 -08:00
Yehuda Sadeh
a063cbac8d rgw: modify bucket instance shard marker ids
Instead of having the markers prefixed by the oids, use the bucket
instance id.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:26 -08:00
Yehuda Sadeh
0d9c2d348e rgw: bucket replica log, handle shard ids
bucket replica log can now save entries by the specified shard id

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:26 -08:00
Yehuda Sadeh
a2c3680bd1 cls_rgw: list bi log should not return marker entry
The marker should serve as a lower bound, but should not be
returned.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:26 -08:00
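
The "lower bound that is not returned" behaviour amounts to an exclusive lower bound on a sorted key space. A minimal sketch follows, assuming the log keys are available as a sorted list of strings. `list_after` is a hypothetical name and not part of cls_rgw.

```python
import bisect


def list_after(sorted_keys, marker, max_entries):
    """Return up to max_entries keys strictly greater than marker."""
    # bisect_right skips past an entry equal to the marker, so the marker
    # itself is never part of the result.
    start = bisect.bisect_right(sorted_keys, marker)
    return sorted_keys[start:start + max_entries]
```

Passing the last key of one batch as the marker for the next call then yields no duplicates.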
Yehuda Sadeh
d52a8b10a5 rgw: bucket_index_shard_hash_type fixes
Add initializations and JSON encode/decode.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:26 -08:00
Yehuda Sadeh
44bc63b166 rgw: decode the req_state bucket instance id if needed
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:25 -08:00
Yehuda Sadeh
d31e84ea94 rgw: improve bucket sharding hashing
Amplify small source changes.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:25 -08:00
Yehuda Sadeh
a33ca59e26 rgw: data changes log, log info by bucket shard id
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:25 -08:00
Yehuda Sadeh
87934706fc rgw: use new BucketShard structure for index manipulation calls
Instead of recalculating the hash on every call, do it once and pass this
structure around. It will also be used for logging changes into the data
log.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:25 -08:00
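
The idea of computing the shard once and passing the result around can be sketched as follows. This is an illustration only: the field names, the `shard_for` helper and the use of CRC32 as the hash are assumptions, not the actual rgw `BucketShard` definition or hash function.

```python
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class BucketShard:
    """Hypothetical value object: a bucket plus its resolved shard id."""
    bucket: str
    shard_id: int


def shard_for(bucket, obj_key, num_shards):
    """Hash the object key once and remember which shard it maps to."""
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    # CRC32 is only a stand-in for whatever hash the real code uses.
    digest = zlib.crc32(obj_key.encode("utf-8"))
    return BucketShard(bucket, digest % num_shards)
```

Index and logging calls can then take the `BucketShard` value instead of rehashing the key each time.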
Yehuda Sadeh
381f68acde rgw: bi log list/trim can get specific bucket shard
bucket shard can be specified on the bucket instance param. It can be
added like this: <bucket-instance>[:shard-id]

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:25 -08:00
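
A parameter of the form `<bucket-instance>[:shard-id]` could be split along these lines. The sketch assumes the shard id, when present, is a purely numeric suffix after the last colon, and that everything before it is the bucket instance string. `parse_bucket_instance` is a hypothetical helper, and the real parser may resolve ambiguous inputs differently.

```python
def parse_bucket_instance(spec):
    """Split '<bucket-instance>[:shard-id]' into (instance, shard_id).

    shard_id is None when no numeric suffix is present.
    """
    instance, sep, tail = spec.rpartition(":")
    if sep and instance and tail.isascii() and tail.isdigit():
        return instance, int(tail)
    # No colon, or a non-numeric last component: treat it all as the instance.
    return spec, None
```

Under these assumptions an instance string whose own last component is numeric would be read as carrying a shard id, so callers need to know which form they are passing.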
Guang Yang
8a04c0a61b Fix the multipart uploads functional test failures due to bucket index sharding.
Signed-off-by: Guang Yang <yguang@yahoo-inc.com>
2015-01-13 19:21:24 -08:00
Guang Yang
231fa0e670 Fix get_bucket_instance_info, only build the oid if it is empty.
Signed-off-by: Guang Yang <yguang@yahoo-inc.com>
2015-01-13 19:21:24 -08:00
Guang Yang
9e45a7cd36 Adjust bi log trim implementation to work with multiple bucket shards.
Signed-off-by: Guang Yang (yguang@yahoo-inc.com)
2015-01-13 19:21:24 -08:00
Guang Yang
f9b280ea89 Adjust bi log listing to work with multiple bucket shards.
Signed-off-by: Guang Yang (yguang@yahoo-inc.com)
2015-01-13 19:21:24 -08:00
Yehuda Sadeh
47665b23df cls_rgw, rgw: switch different ops to new concurrent infrastructure
Make all the relevant ops use the CLSRGWConcurrentIO infrastructure,
which simplifies things.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:24 -08:00
Yehuda Sadeh
30d0a49c84 rgw: generalize container type for concurrent IO base class
Turned the ConcurrentIO class into a template, so that we can use the different
kinds of containers needed for the different operations.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:23 -08:00
Yehuda Sadeh
04441f2878 cls_rgw, rgw: create base class for common bucket shard operations
Instead of copy pasting the same code all over again, create a base
class for the needed concurrent IO operations.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:23 -08:00
Guang Yang
9c5acd67c4 Adjust bucket stats/index checking/index rebuild/tag timeout implementation to work with multiple shards.
Signed-off-by: Guang Yang (yguang@yahoo-inc.com)
2015-01-13 19:21:23 -08:00
Guang Yang
56feee792e Adjust bucket listing to work with multiple shards.
Signed-off-by: Guang Yang (yguang@yahoo-inc.com)
2015-01-13 19:21:23 -08:00
Guang Yang
751fd07bec Adjust rgw bucket prepare/complete OP to work with multiple bucket index shards.
Signed-off-by: Guang Yang (yguang@yahoo-inc.com)
2015-01-13 19:21:22 -08:00
Guang Yang
5d004d3eac Implement sharding for bucket creation.
Signed-off-by: Guang Yang (yguang@yahoo-inc.com)
2015-01-13 19:21:22 -08:00
Guang Yang
90a3920c44 Add a new field to bucket info indicating the number of shards of this bucket and make it configurable.
Signed-off-by: Guang Yang (yguang@yahoo-inc.com)
2015-01-13 19:21:22 -08:00
Sage Weil
364b86813f mon/Paxos: consolidate finish_round()
Signed-off-by: Sage Weil <sage@redhat.com>
2015-01-13 14:51:22 -08:00
Sage Weil
67a90dd75c mon: accumulate a single pending transaction and propose it all at once
Previously we would queue lots of distinct encoded Transactions from various
callers, usually one per PaxosService.  These would be sent through paxos
one at a time.

If there is a completed transaction there is no reason to delay; it is
more efficient to push it through immediately.  Since we will propose
anything pending right when we finish, there is minimal opportunity for
other work to get done.

Instead, accumulate everything in a single MonitorDBStore::Transaction and
propose all pending changes all at once.  Encode at propose time and
expose the Transaction to the callers so they can add their changes.

Signed-off-by: Sage Weil <sage@redhat.com>
2015-01-13 14:51:04 -08:00
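
The batching approach in the commit above (one shared pending transaction, encoded only when proposed) can be sketched in a few lines. This is a simplified model, not the monitor code: `PendingProposer`, its list-of-operations transaction and the JSON encoding are all stand-ins chosen for illustration.

```python
import json


class PendingProposer:
    """Collects changes from many callers and proposes them as one unit."""

    def __init__(self):
        self.pending = []   # the single shared pending transaction
        self.proposed = []  # encoded blobs, one per proposal round

    def add(self, op, key, value=None):
        # Callers append their changes directly to the shared transaction.
        self.pending.append([op, key, value])

    def propose_pending(self):
        # Encode only at propose time, then start a fresh transaction.
        if not self.pending:
            return None
        blob = json.dumps(self.pending)
        self.pending = []
        self.proposed.append(blob)
        return blob
```

Several services can call `add()` between rounds, and a single `propose_pending()` call then carries all of their changes together.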
Sage Weil
d15958631b PendingReleaseNotes: make a note about librados flag changes
Signed-off-by: Sage Weil <sage@redhat.com>
2015-01-13 12:23:37 -08:00
Sage Weil
5a1fd855df Merge pull request #3360 from mattrichards/bump_rados_version
librados: bump rados version number

Reviewed-by: Sage Weil <sage@redhat.com>
2015-01-13 12:18:04 -08:00
Jenkins
725d66098c 0.91 2015-01-13 12:10:22 -08:00
Josh Durgin
6f8b54ca29 Merge pull request #2697 from ceph/wip-8900
RBD image watcher and new exclusive lock handling

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2015-01-13 11:17:29 -08:00
Samuel Just
b8ce73f253 Merge pull request #3254 from trociny/feature-10036
osd: osd tree to show primary-affinity value

Reviewed-by: Loic Dachary <ldachary@redhat.com>
2015-01-13 10:56:29 -08:00
Samuel Just
6c4a523c36 Merge pull request #3281 from ceph/wip-10441-b
osd: fix watch ordering bug 10441 option b

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2015-01-13 10:55:29 -08:00
Samuel Just
00c30dd0d4 Merge pull request #3290 from ceph/wip-da-SCA-20150102
Coverity and SCA fixes

Reviewed-by: Sage Weil <sage@redhat.com>
2015-01-13 10:54:45 -08:00