27487 Commits

Author SHA1 Message Date
Samuel Just
175c0777ed ReplicatedPG: split handle_push_reply out of sub_op_push_reply
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-07-08 16:43:31 -07:00
Samuel Just
54e5f6423a ReplicatedPG: send pulls en masse in recover_primary
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-07-08 16:43:31 -07:00
Samuel Just
c41d4dc4bb ReplicatedPG: send pushes en masse in recover_replicas, recover_backfill
This way, the pushes might be later merged into a smaller number of
messages.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-07-08 16:43:31 -07:00
Samuel Just
eec86b8d3c OSD: convert handle_push to use PushOp
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-07-08 16:43:31 -07:00
Samuel Just
a4984328be ReplicatedPG: pass a PushOp into handle_pull_response
This is the first step toward packaging multiple
pushes/pulls into a single message.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-07-08 16:43:30 -07:00
Samuel Just
82cb922e89 ReplicatedPG: split send_push into build_push_op and send_push_op
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-07-08 16:43:30 -07:00
Samuel Just
31e19a64b0 ReplicatedPG: _committed_pushed_object don't pass op
Add a separate callback to handle marking the event and
the stats.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-07-08 16:43:30 -07:00
Samuel Just
0f51b60cba ReplicatedPG: submit_push_data must take recovery_info as non-const
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-07-08 16:43:30 -07:00
Gary Lowell
b6b48dbefa v0.66 2013-07-08 15:45:00 -07:00
Sage Weil
a9906641a1 mon: implement simple 'scrub' command
Compare all keys within the sync'ed prefixes across members of the quorum
and compare the key counts and CRC for inconsistencies.

Currently this is a one-shot inefficient hammer.  We'll want to make this
work in chunks before it is usable in production environments.

Protect with a feature bit to avoid sending MMonScrub to mons who can't
decode it.
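
The comparison described above can be pictured roughly as follows. This is only an illustrative sketch, not the monitor's code: the `scrub_summary` and `find_inconsistent` helpers, the dict standing in for each mon's key/value store, and the prefix names are all hypothetical, and `zlib.crc32` is swapped in for whatever CRC the mon actually computes.

```python
import zlib


def scrub_summary(store, prefixes):
    """Return {prefix: (key_count, crc)} for one mon's store.

    `store` is a hypothetical dict mapping (prefix, key) -> bytes value.
    """
    summary = {}
    for prefix in prefixes:
        count = 0
        crc = 0
        # Walk keys in sorted order so every mon folds them identically.
        for full_key in sorted(k for k in store if k[0] == prefix):
            count += 1
            crc = zlib.crc32(full_key[1].encode(), crc)
            crc = zlib.crc32(store[full_key], crc)
        summary[prefix] = (count, crc)
    return summary


def find_inconsistent(summaries):
    """Given {mon_name: summary}, return the prefixes on which mons disagree."""
    names = sorted(summaries)
    reference = summaries[names[0]]
    bad = set()
    for name in names[1:]:
        for prefix, value in summaries[name].items():
            if reference.get(prefix) != value:
                bad.add(prefix)
    return sorted(bad)
```

Summarizing the whole prefix in one pass mirrors the "one-shot" nature noted above; a chunked variant would fold a bounded key range per round instead.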

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2013-07-08 15:34:32 -07:00
Sage Weil
afd6c7d824 mon: fix osdmap stash, trim to retain complete history of full maps
The current interaction between sync and stashing full osdmaps only on
active mons means that a sync can result in an incomplete osdmap_full
history:

 - mon.c starts a full sync
 - during sync, active osdmap service should_stash_full() is true and
   includes a full in the txn
 - mon.c sync finishes
 - mon.c update_from_paxos gets "latest" stashed that it got from the
   paxos txn
 - mon.c does *not* walk to previous inc maps to complete its collection
   of full maps.

To fix this, we disable the periodic/random stash of full maps by the
osdmap service.

This introduces a new problem: we must have at least one full map (the first
one) in order for a mon that just synced to build its full collection.
Extend the encode_trim() process to allow the osdmap service to include
the oldest full map with the trim txn.  This is more complex than just
writing the full maps in the txn, but cheaper--we only write the full
map at trim time.
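
One way to see why a single oldest full map is enough: every later full map can be rebuilt by applying the incrementals in order. A toy sketch, assuming hypothetical dict-based maps and incrementals (the real OSDMap encoding and the encode_trim() transaction are not modeled here):

```python
def rebuild_full_maps(oldest_epoch, oldest_full, incrementals):
    """Rebuild every full map from the oldest one plus incrementals.

    `oldest_full` is a dict standing in for the full map at `oldest_epoch`.
    `incrementals` maps epoch -> dict of changes, where a value of None
    means "remove this key". Both representations are hypothetical.
    """
    full_maps = {oldest_epoch: dict(oldest_full)}
    epoch = oldest_epoch
    # Stop at the first gap: without a contiguous chain we cannot go on.
    while epoch + 1 in incrementals:
        current = dict(full_maps[epoch])
        for key, value in incrementals[epoch + 1].items():
            if value is None:
                current.pop(key, None)
            else:
                current[key] = value
        epoch += 1
        full_maps[epoch] = current
    return full_maps
```

Without the full map at the oldest epoch there is nothing to seed the loop, which is the gap the trim-time write is meant to close.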

This *might* be related to previous bugs where the full osdmap was
missing, or cases where leveldb keys seemed to 'disappear'.

Fixes: #5512
Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2013-07-08 15:04:59 -07:00
Gary Lowell
dd1e6d45b9 Revert "Makefile: fix ceph_sbindir"
This reverts commit 352f362567.

Reverting this commit because it causes problems with the debian build, and
reopening #5492.   The root problem appears to be lack of support by GNU
autotools for installing into both /sbin and /usr/sbin using the standard
location variables.

Signed-off-by: Gary Lowell  <gary.lowell@inktank.com>
2013-07-08 14:49:16 -07:00
Yehuda Sadeh
f07d21672f rgw: fix bucket link
Bucket link was assuming the bucket head object was holding the
bucket acl, which is not true anymore.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2013-07-08 14:02:45 -07:00
Sage Weil
b55e455ed4 Merge pull request #395 from kri5/wip-vstart-documentation
doc: Add a page to document vstart.sh script
2013-07-08 13:35:46 -07:00
Christophe Courtaut
eec903a6ec doc: Fix env variables in vstart.sh documentation
Signed-off-by: Christophe Courtaut <christophe.courtaut@gmail.com>
2013-07-08 22:19:21 +02:00
Sage Weil
69a5544543 osd/osd_types: fix pg_stat_t::dump for last_epoch_clean
Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-08 12:55:20 -07:00
Gregory Farnum
3cbf657303 Merge pull request #403 from ceph/wip-olazy
merge: O_LAZY flag removal

Reviewed-by: Greg Farnum <greg@inktank.com>
2013-07-08 12:45:45 -07:00
Yehuda Sadeh
9f8bfb4b22 Merge pull request #397 from kri5/wip-5478
rgw: Add explicit messages in radosgw init script

Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
2013-07-08 12:23:36 -07:00
Yehuda Sadeh
784ce25842 Merge pull request #406 from kri5/wip-3074
rgw: Add --help support to radosgw

Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
2013-07-08 11:43:13 -07:00
Sage Weil
94afedf02d client: remove O_LAZY
The once-upon-a-time unique O_LAZY value I chose forever ago is now
O_NOATIME, which means that some clients are choosing relaxed
consistency without meaning to.

It is highly unlikely that a real O_LAZY will ever exist, and we can
select it in the ceph case with the ioctl or libcephfs call, so drop
any support for doing this via open(2) flags.
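
The hazard is easy to reproduce in miniature: once two flag names share a bit, a test for one also fires on the other. A small sketch, assuming a Linux host where Python exposes `os.O_NOATIME`; the `O_LAZY` constant and the `wants_lazy_io` helper below are hypothetical stand-ins, not the client's actual definitions.

```python
import os

# Use the platform's value when available; the fallback is the
# asm-generic Linux value and is only an assumption elsewhere.
O_NOATIME = getattr(os, "O_NOATIME", 0o1000000)

# Hypothetical: a private flag that happens to share O_NOATIME's bit.
O_LAZY = O_NOATIME


def wants_lazy_io(flags):
    """Return True if the open(2)-style flags have the O_LAZY bit set."""
    return bool(flags & O_LAZY)


# A caller that only asked to skip atime updates...
flags = os.O_RDONLY | O_NOATIME
# ...cannot be told apart from one that asked for relaxed consistency.
print(wants_lazy_io(flags))
```

Selecting lazy I/O through a dedicated call rather than a flag bit avoids this kind of aliasing entirely.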

Update doc/lazy_posix.txt file re: lazy io.

Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2013-07-08 11:24:48 -07:00
Sage Weil
e9d19b38c8 common/crc32c: skip cpu detection incantation on not x86_64
On i386 this fails to build with

common/crc32c-intel.c: In function 'ceph_have_crc32c_intel':
error: common/crc32c-intel.c:79:9: PIC register clobbered by 'ebx' in 'asm'

ARM had more to complain about.

Not sure where this test came from, but it is clearly not meant for
anything other than x86_64.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
2013-07-08 10:54:53 -07:00
athanatos
0471b719c2 Merge pull request #407 from dachary/wip-5487
unit tests for ObjectContext read/write locks

Reviewed-by: Samuel Just <sam.just@inktank.com>
2013-07-08 10:44:43 -07:00
Sage Weil
956fafc7f2 qa/workunits/rbd/simple_big.sh: don't ENOSPC every time
Set the count on the initial dd so we don't always ENOSPC.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-08 10:14:08 -07:00
Sage Weil
d423cf8c4f qa/workunits/rbd/kernel.sh: move modprobe up
Needs to happen before cleanup.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-08 09:58:16 -07:00
Sage Weil
672f51be3a qa/workunits/fs/test_o_trunc.sh: fix .sh to match new bin location
To match 83f308962c.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-08 09:56:29 -07:00
Sage Weil
eb567bebb3 Merge pull request #389 from ceph/wip-5492
Makefile: fix ceph_sbindir

Reviewed-by: Mark Kirkwood <mark.kirkwood@catalyst.net.nz>
2013-07-08 09:37:31 -07:00
Loic Dachary
7b7f752c69 unit tests for ObjectContext read/write locks
unit tests for the ObjectContext methods ondisk_write_lock,
ondisk_write_unlock, ondisk_read_lock and ondisk_read_unlock.

A class derived from ::testing::Test is created with two sub-classes (
Thread_read_lock & Thread_write_lock ) to provide a separate thread
that can block with cond.Wait(). usleep(3) is used in the main thread
to wait for the expected side effect with increasing delays ( up to
MAX_DELAY ).
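
The "wait with increasing delays" idea can be sketched as below. This is not the test's code: the `wait_for` helper, the doubling schedule, and the `MAX_DELAY` value are assumptions chosen for illustration, and a `threading.Event` stands in for the condition variable the blocked thread waits on.

```python
import threading
import time

MAX_DELAY = 1.0  # seconds; hypothetical bound on the longest single sleep


def wait_for(predicate, max_delay=MAX_DELAY):
    """Poll `predicate` with doubling sleeps until it holds or we give up."""
    delay = 0.001
    while delay < max_delay:
        if predicate():
            return True
        time.sleep(delay)
        delay *= 2
    return predicate()


# Demo: a helper thread produces a side effect, then blocks.
state = {"locked": False}
gate = threading.Event()


def worker():
    state["locked"] = True  # side effect the main thread waits for
    gate.wait()             # then block until released


thread = threading.Thread(target=worker)
thread.start()
observed = wait_for(lambda: state["locked"])
gate.set()
thread.join()
print("side effect observed:", observed)
```

Polling with growing sleeps keeps the fast path quick while still tolerating a slow or loaded test machine.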

http://tracker.ceph.com/issues/5487 refs #5487

Signed-off-by: Loic Dachary <loic@dachary.org>
2013-07-08 16:45:12 +02:00
Christophe Courtaut
6f1653a6d5 rgw: Add --help support to radosgw
http://tracker.ceph.com/issues/3074 fixes #3074

This patch adds support for the --help option.
For now, it displays usage for the generic options used in radosgw.

Signed-off-by: Christophe Courtaut <christophe.courtaut@gmail.com>
2013-07-08 11:40:20 +02:00
Sage Weil
8bc50626c5 Merge branch 'next' 2013-07-07 21:20:34 -07:00
Sage Weil
85a1d6cc5d mon: remove bad assert about monmap version
It is possible to start a sync when our newest monmap is 0.  Usually we see
e0 from probe, but that isn't always published as part of the very first
paxos transaction due to the way PaxosService::_active generates its
first initial commit.

In any case, having e0 here is harmless.

Fixes: #5509
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-07-07 21:19:41 -07:00
Sage Weil
3f5a96236b qa: write a somewhat <1tb image
1TB is enough to fill up 6 plana osds.  And it takes forever.  Write less.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-05 11:24:06 -07:00
Sage Weil
54aa797acd qa/workunits/rbd/kernel.sh: modprobe rbd
Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-05 11:20:43 -07:00
Sage Weil
83f308962c qa: move test_o_trunc.sh into fs dir
Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-05 11:17:29 -07:00
Sage Weil
507a4ec87b qa: move fs test binary into workunits dir so teuthology can build it
Teuthology does a make in the workunits dir, so move this in there.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-05 11:16:08 -07:00
Sage Weil
a84e6d1824 mds/MDSTable: gracefully suicide on EBLACKLIST
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2013-07-05 11:04:37 -07:00
Christophe Courtaut
8b4cb8f372 rgw: Add explicit messages in radosgw init script
http://tracker.ceph.com/issues/5478 fixes #5478

Signed-off-by: Christophe Courtaut <christophe.courtaut@gmail.com>
2013-07-05 14:41:04 +02:00
Yehuda Sadeh
d09ce3df2e rgw: fix rgw_remove_bucket()
The function was referring to the bucket info object directly, instead of
going through helper functions, which is now a must.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2013-07-04 23:44:02 -07:00
Christophe Courtaut
a793e203fd doc: Add a page to document vstart.sh script
Signed-off-by: Christophe Courtaut <christophe.courtaut@gmail.com>
2013-07-04 14:08:41 +02:00
Sage Weil
c14847c304 .gitignore: cls_test_*
Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-03 22:04:10 -07:00
Sage Weil
22227cd1c1 qa: add O_TRUNC test
From: Yan, Zheng <yan.zheng@intel.com>

Simple reproducer for #5453, modified to run for a finite number of
iterations.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-03 21:53:48 -07:00
Sage Weil
46b7fc2ed9 radosgw-admin: fix cli test
Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-03 21:43:37 -07:00
Yehuda Sadeh
a0b1be99ea rgw: fix type encoding
due to bad merge

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2013-07-03 18:08:47 -07:00
Yehuda Sadeh
d30fc4bdd1 Merge remote-tracking branch 'origin/master' into wip-rgw-geo
Conflicts:
	src/Makefile.am
	src/include/rados/librados.hpp
	src/rgw/rgw_admin.cc
	src/rgw/rgw_bucket.cc
	src/rgw/rgw_common.cc
	src/rgw/rgw_common.h
	src/rgw/rgw_json_enc.cc
	src/rgw/rgw_main.cc
	src/rgw/rgw_op.h
	src/rgw/rgw_rados.cc
	src/rgw/rgw_tools.cc
	src/rgw/rgw_user.cc
	src/rgw/rgw_user.h

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2013-07-03 17:09:56 -07:00
Sage Weil
71ebfe7e1a mon/Paxos: make 'paxos trim disabled max versions' much much larger
108000 is about 30 hours if paxos is going full-bore (1 proposal/second).
That ought to be pretty safe.  Otherwise, we start trimming too soon and a
slow sync will just have to restart when it finishes.

Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-07-03 16:56:06 -07:00
Sage Weil
ab93696e30 mon: be less chatty about discarding messages
Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-03 16:23:56 -07:00
Sage Weil
e8b42a6998 osd/OSDMap: handle case where some new osds have hb_front and others don't
Do not assume that because at least one OSD has an hb_front addr that they
all do, or else we will end up assigning garbage here and later thinking
it is an addr (or, more precisely, != entity_addr_t()).

Fixes: #5460
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: David Zafman <david.zafman@inktank.com>
2013-07-03 15:37:16 -07:00
Sage Weil
81343f1df4 osd: clear hb_front if it was previously non-NULL and is now NULL
If we have a real addr for hb_front for a given osd and then a new map
has the osd coming up without an hb_front, we need to clear the addr
field.

Also, improve the debug output in add_heartbeat_peer() so we can tell if
we have no connection or a connection to a blank addr.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: David Zafman <david.zafman@inktank.com>
2013-07-03 15:37:05 -07:00
John Wilkins
e960e1bb6a Merge branch 'master' of https://github.com/ceph/ceph 2013-07-03 15:27:26 -07:00
John Wilkins
e0da832a5e doc: Added write caps. Required for auto-creating pools.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-07-03 15:26:52 -07:00
Sage Weil
01d3e09482 osd: fix race when queuing recovery ops
Previously we would sample how many ops to start under the lock, drop it,
and start that many.  This is racy because multiple threads can jump in
and we start too many ops.  Instead, claim as many slots as we can and
release them back later if we do not end up using them.

Take care to re-wake the work-queue since we are releasing more resources
for wq use.
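
The claim-then-release pattern described above might look roughly like this. It is a sketch under stated assumptions, not the OSD's implementation: the `RecoverySlots` class, the `start_recovery` function, and the `start_ops` callback are hypothetical names, and the work-queue wakeup is only noted in a comment.

```python
import threading


class RecoverySlots:
    """Bounded pool of recovery-op slots (hypothetical helper)."""

    def __init__(self, max_active):
        self._lock = threading.Lock()
        self._max_active = max_active
        self._active = 0

    def claim(self, wanted):
        """Atomically reserve up to `wanted` slots; return how many we got."""
        with self._lock:
            granted = max(0, min(wanted, self._max_active - self._active))
            self._active += granted
            return granted

    def release(self, count):
        """Give back slots that were claimed but not used, or have finished."""
        with self._lock:
            self._active -= count
            assert self._active >= 0

    @property
    def active(self):
        with self._lock:
            return self._active


def start_recovery(slots, wanted, start_ops):
    """Claim slots first, start ops outside the lock, return the surplus."""
    claimed = slots.claim(wanted)
    started = start_ops(claimed)  # may start fewer ops than claimed
    if started < claimed:
        slots.release(claimed - started)
        # The real code would also re-wake the work queue at this point.
    return started
```

Because the reservation happens under the lock, two threads can never both act on the same sampled headroom, which is the race the older sample-then-start approach allowed.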

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
2013-07-03 15:24:20 -07:00