Commit Graph

27632 Commits

Author SHA1 Message Date
Sage Weil
1f5e6c2254 mon: no need to reset sync state on restart
If we are in or forcing a sync, we can leave these there until the sync
completes.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-09 11:05:48 -07:00
Sage Weil
bbfb5b41b6 mon: drop single-use is_sync_on_going() check
Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-09 11:05:48 -07:00
Sage Weil
a4d0ccf68c mon: rev the internal mon protocol
This captures the new sync.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-09 11:05:48 -07:00
Sage Weil
9fc4e4f337 mon/MonitorDBStore: drop unused single prefix synchronizer
Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-09 11:05:48 -07:00
Sage Weil
45907dc1ba mon: add --force-sync startup option
Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-09 11:05:48 -07:00
Sage Weil
af3b49f606 mon/Paxos: move consistent check into Paxos::init()
Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-09 11:05:48 -07:00
Sage Weil
ccceeee57b mon/Paxos: remove unnecessary trim enable/disable
The sync no longer cares if we trim Paxos versions as we go, as long as we
don't trim so fast that we fall behind between GET_CHUNK messages, which
we can consider a tuning problem.

Remove this extra complexity!

Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-09 11:05:48 -07:00
Sage Weil
aa33bc88aa mon/Paxos: config min paxos txns to keep separately
We were using paxos_max_join_drift to control the minimum number of
paxos transactions to keep around.  Instead, make this explicit, and
separate from the join drift.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-09 11:05:47 -07:00
Sage Weil
da0aff28ab mon: implement a simpler sync
The previous sync implementation was highly stateful and very complex.
This made it very hard to understand and to debug, and there were bugs
still lurking in the timeout code (at least).

Replace it with something much simpler:

 - sync providers are almost stateless.  they keep an iterator, identified
   by a unique cookie, that times out in a simple way.
 - sync requesters sync from whomever they fancy.  namely anyone with newer
   committed paxos state.

There are a few extra fields that might allow sync continuation later, but
this is complex and not necessary at this point.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-09 11:05:47 -07:00
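Note on da0aff28ab: the design described in that message (a provider that keeps only an iterator per cookie, with a simple timeout, and a requester that picks any peer with newer committed state) can be illustrated with a minimal sketch. This is not the Ceph implementation. The names SyncProvider, get_chunk, pick_provider, and the dict standing in for the monitor store are all hypothetical.

```python
import itertools
import time


class SyncProvider:
    """Hypothetical, nearly stateless provider: one iterator per cookie."""

    def __init__(self, store, timeout=30.0, clock=time.monotonic):
        self.store = store        # dict standing in for the monitor store (assumption)
        self.timeout = timeout    # seconds an idle iterator is kept
        self.clock = clock
        self.sessions = {}        # cookie -> [iterator, last_used]
        self._cookies = itertools.count(1)

    def start(self):
        """Open an iterator over a sorted snapshot and return its cookie."""
        cookie = next(self._cookies)
        snapshot = sorted(self.store.items())
        self.sessions[cookie] = [iter(snapshot), self.clock()]
        return cookie

    def get_chunk(self, cookie, max_keys=2):
        """Return (chunk, done); raise KeyError for an unknown or expired cookie."""
        self._expire()
        session = self.sessions[cookie]
        chunk = list(itertools.islice(session[0], max_keys))
        session[1] = self.clock()
        done = len(chunk) < max_keys
        if done:
            del self.sessions[cookie]
        return chunk, done

    def _expire(self):
        """Drop every iterator that has been idle longer than the timeout."""
        now = self.clock()
        stale = [c for c, (_, used) in self.sessions.items()
                 if now - used > self.timeout]
        for cookie in stale:
            del self.sessions[cookie]


def pick_provider(my_last_committed, peers):
    """Return the name of any peer with newer committed state, or None."""
    for name, last_committed in peers.items():
        if last_committed > my_last_committed:
            return name
    return None
```

A requester in this sketch would call start() once and then get_chunk() with the returned cookie until done is true. If the cookie has timed out, it simply starts over.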
Sage Weil
f326c4dcef mon/PGMonitor: cleanup: use const strings for pgmap prefixes
Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-09 10:42:16 -07:00
Yehuda Sadeh
5faa4ac1e5 rgw: warn on the lack of curl_multi_wait()
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2013-07-09 09:19:16 -07:00
Yehuda Sadeh
76e79266a0 rgw: fix args parsing
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2013-07-09 00:35:00 -07:00
David Zafman
3c89a19c92 os: Add missing pool to hobject_t::dump() and hobject_t::decode()
Signed-off-by: David Zafman <david.zafman@inktank.com>
2013-07-09 00:06:35 -07:00
David Zafman
b1b188a5fd os: Remove unused hobject_t::set_filestore_key()
Signed-off-by: David Zafman <david.zafman@inktank.com>
2013-07-09 00:06:32 -07:00
David Zafman
72c27a302e librados, osdc: Refactor IoCtxImpl to use operate()/operate_read()
Add ObjectOperation::write() that includes len instead of using bufferlist length
Have selfmanaged_snap_rollback_object() use mutate()

Signed-off-by: David Zafman <david.zafman@inktank.com>
2013-07-09 00:06:32 -07:00
David Zafman
9dd60a634c TestRados: Output error for improper usage instead of Floating Point Exception
Signed-off-by: David Zafman <david.zafman@inktank.com>
2013-07-09 00:06:32 -07:00
David Zafman
30c951cc36 osd: Fix object_locator_t::get_pool() return type
Signed-off-by: David Zafman <david.zafman@inktank.com>
2013-07-09 00:06:00 -07:00
David Zafman
7efbf5da3b librados: Fix lock names
Signed-off-by: David Zafman <david.zafman@inktank.com>
2013-07-09 00:05:57 -07:00
David Zafman
3931bfa6cd psim.cc: Fix comment on how to create .ceph_osdmap
Signed-off-by: David Zafman <david.zafman@inktank.com>
2013-07-09 00:05:57 -07:00
David Zafman
313b7a1f45 os: Code conformance in LFNIndex.cc
Signed-off-by: David Zafman <david.zafman@inktank.com>
2013-07-09 00:05:43 -07:00
Yehuda Sadeh
395262cd82 rgw: call appropriate curl calls for waiting on sockets
If libcurl supports curl_multi_wait() then use it, otherwise
use select() and force a timeout, even if it has been disabled.
Otherwise we may wait forever for events that we can't wait for
as select() only uses fds < 1024.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2013-07-09 00:04:01 -07:00
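Note on 395262cd82: the fallback path described there (use select(), but always with a timeout, even when the caller disabled it) can be sketched as follows. This uses Python's select module on a socket pair rather than libcurl. The function name wait_readable and the forced_timeout parameter are hypothetical and only illustrate the "never block forever" rule.

```python
import select


def wait_readable(socks, timeout=None, forced_timeout=1.0):
    """Wait for any socket to become readable, never blocking indefinitely.

    A timeout of None means the caller disabled the timeout; in that case
    the forced value is used instead, so the call always returns.
    """
    if timeout is None:
        timeout = forced_timeout
    readable, _, _ = select.select(socks, [], [], timeout)
    return readable
```

With a forced timeout the caller gets control back periodically and can retry, which matters when some descriptors cannot be watched by select() at all, as the commit message notes for fds of 1024 and above.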
Yehuda Sadeh
73c2a3dcd3 configure.ac: detect whether libcurl supports curl_multi_wait()
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2013-07-08 23:43:00 -07:00
Gary Lowell
d4ca36b317 Merge branch 'next'
2013-07-08 23:17:50 -07:00
Sage Weil
8ff62ae42e Merge remote-tracking branch 'gh/next'
2013-07-08 22:17:51 -07:00
Sage Weil
d08b6d6df7 mon/PaxosService: prevent reads until initial service commit is done
Do not process reads (or, by PaxosService::dispatch() implication, writes)
until we have committed the initial service state.  This avoids things like
EPERM due to missing keys when we race with mon creation, triggered by
teuthology tests doing their health check after startup.

Fixes: #5515
Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-07-08 22:17:41 -07:00
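Note on d08b6d6df7: the gating described there can be pictured as a dispatch path that refuses to process anything until the first service commit has happened. The sketch below is illustrative only. The class and method names are hypothetical, and queueing the deferred requests for later replay is an assumption, not something the commit message states.

```python
class PaxosServiceSketch:
    """Hypothetical service that defers requests until its initial commit."""

    def __init__(self):
        self.committed_initial = False
        self.waiting = []   # requests deferred until the first commit (assumption)
        self.handled = []   # requests that were actually processed

    def dispatch(self, request):
        """Process a request, or defer it if no initial state is committed yet."""
        if not self.committed_initial:
            self.waiting.append(request)
            return False
        self.handled.append(request)
        return True

    def on_initial_commit(self):
        """Mark the initial state as committed and replay deferred requests."""
        self.committed_initial = True
        pending, self.waiting = self.waiting, []
        for request in pending:
            self.dispatch(request)
```

Because writes go through the same dispatch() entry point in this sketch, they are held back together with reads, which mirrors the "by implication" remark in the message.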
Sage Weil
63fe8635ae mon/PaxosService: unwind should_trim()
Inline the single-caller helper.  This will help us in a moment...

Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-08 21:44:05 -07:00
Sage Weil
d600dc9321 mon/PaxosService: unwind service_should_trim() helper
Nobody overloads it; put it inline in should_trim().

Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-08 21:41:55 -07:00
Sage Weil
6aa023048a mon/MDSMonitor: remove unnecessary service_should_trim()
We never set_trim_to(), so this is unnecessary.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-08 21:41:34 -07:00
Sage Weil
b71a00966c mon/OSDMonitor: remove dup service_should_trim() implementation
This matches the parent.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-08 21:40:36 -07:00
Sage Weil
39b71c5826 mon/PaxosService: trim periodically instead of via propose_pending
We want to trim old states even if there is no update activity.  For
example, if a long-running rebalance finishes, all osdmap updates will
stop and we won't trim out old maps to free space.

Instead, trim at the same time as tick().  Remove the trim during
propose_pending() to force all trims through this path and avoid
introducing a new and rarely-exercised behavior.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-07-08 21:38:11 -07:00
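Note on 39b71c5826: the change moves trimming from the proposal path to the periodic tick, so old versions are dropped even when nothing new is proposed. A minimal sketch of that shape follows. TrimmingService and min_keep are hypothetical names, and the list of version numbers stands in for the stored states.

```python
class TrimmingService:
    """Hypothetical service that trims only from tick(), never on propose."""

    def __init__(self, min_keep):
        self.min_keep = min_keep   # number of most recent versions to retain
        self.versions = []         # stored versions, oldest first

    def propose_pending(self, version):
        """Record a new version; no trimming happens on this path."""
        self.versions.append(version)

    def tick(self):
        """Trim the oldest versions beyond min_keep; return how many were dropped."""
        excess = len(self.versions) - self.min_keep
        if excess > 0:
            del self.versions[:excess]
            return excess
        return 0
```

Keeping a single trim path means the behavior during idle periods is the same code that runs under load.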
Sage Weil
2f8ff2de17 mon/PaxosService: reorder definitions
Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-08 21:33:37 -07:00
Sage Weil
50ffe324e3 mon/PaxosService: uninline should_trim()
Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-08 21:33:22 -07:00
John Wilkins
f1b4398dd2 Merge branch 'master' of https://github.com/ceph/ceph
2013-07-08 18:11:57 -07:00
John Wilkins
5edc1ff7ea doc: Added Ceph Object Storage installation instructions for CentOS/RHEL 6.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-07-08 18:11:25 -07:00
Sage Weil
ca54efd68e mon: sync all service prefixes, including pgmap_*
This was just recently broken with the merge of the pgmap changes.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-08 17:49:49 -07:00
Sage Weil
b536935f77 mon/MonitorDBStore: expose get_chunk_tx()
Allow users to get the transaction unencoded.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-08 17:49:49 -07:00
Sage Weil
43fa7aabf1 mon/OSDMonitor: fix base case for loading full osdmap
Right after cluster creation, first_committed is 1 and latest stashed is 0,
but we don't have the initial full map yet.  Thereafter, we do (because we
write it with trim).  Fixes afd6c7d824.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-07-08 17:47:11 -07:00
Samuel Just
0e93dd93e5 Merge branch 'wip-small-object-recovery'
Conflicts:
	src/include/ceph_features.h

Reviewed-by: Sage Weil <sage@inktank.com>
Fixes: #5278
2013-07-08 16:53:17 -07:00
Samuel Just
ad65de40ff ReplicatedPG: send compound messages to enlightened peers
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-07-08 16:43:32 -07:00
Samuel Just
ae1b2e97f5 ReplicatedPG: add handlers for MOSDPG(Push|Pull|PushReply)
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-07-08 16:43:32 -07:00
Samuel Just
c0bd831ace OSD: add handlers for MOSDPG(Push|PushReply|Pull)
MOSDPG(Push|PushReply|Pull|SubOp|SubOpReply) need the
same thing checked prior to queueing the op, so they
share a templated handler.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-07-08 16:43:31 -07:00
Samuel Just
264dbf3f9e messages/,osd_types: add messages for Push, PushReply, Pull
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-07-08 16:43:31 -07:00
Samuel Just
c56f16d4dc ReplicatedPG: split handle_pull out of sub_op_pull
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-07-08 16:43:31 -07:00
Samuel Just
175c0777ed ReplicatedPG: split handle_push_reply out of sub_op_push_reply
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-07-08 16:43:31 -07:00
Samuel Just
54e5f6423a ReplicatedPG: send pulls en masse in recover_primary
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-07-08 16:43:31 -07:00
Samuel Just
c41d4dc4bb ReplicatedPG: send pushes en masse in recover_replicas, recover_backfill
This way, the pushes might be later merged into a smaller number of
messages.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-07-08 16:43:31 -07:00
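Note on c41d4dc4bb: collecting pushes first and sending them together is what makes it possible to merge them into fewer messages later. One way to picture the merge is a grouping by destination peer. The function batch_pushes and the (peer, object id) tuples are hypothetical and are not the PushOp structures used in the tree.

```python
from collections import defaultdict


def batch_pushes(pushes):
    """Group (peer, object_id) pairs into one list of object ids per peer.

    Each resulting list stands in for a single compound message to that peer.
    """
    by_peer = defaultdict(list)
    for peer, object_id in pushes:
        by_peer[peer].append(object_id)
    return dict(by_peer)
```

In this sketch three pushes to two peers become two messages instead of three, and the order of objects per peer is preserved.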
Samuel Just
eec86b8d3c OSD: convert handle_push to use PushOp
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-07-08 16:43:31 -07:00
Samuel Just
a4984328be ReplicatedPG: pass a PushOp into handle_pull_response
This is the first step toward packaging multiple
pushes/pulls into a single message.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-07-08 16:43:30 -07:00
Samuel Just
82cb922e89 ReplicatedPG: split send_push into build_push_op and send_push_op
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-07-08 16:43:30 -07:00
Samuel Just
31e19a64b0 ReplicatedPG: _committed_pushed_object don't pass op
Add a separate callback to handle marking the event and
the stats.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-07-08 16:43:30 -07:00