27006 Commits

Author SHA1 Message Date
Greg Farnum
8f0fbc22c8 Merge branch 'next' 2013-06-27 15:23:00 -07:00
Greg Farnum
9e604ee694 ceph-disk: s/else if/elif/
Signed-off-by: Greg Farnum <greg@inktank.com>
Reviewed-by: Joao Luis <joao.luis@inktank.com>
(cherry picked from commit bd8255a750de08c1b8ee5e9c9a0a1b9b16171462)
2013-06-27 15:21:44 -07:00
João Eduardo Luís
d83006064b Merge pull request #372 from ceph/wip-mon-pgmap
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-06-27 15:09:50 -07:00
Sage Weil
68c013d790 Merge remote-tracking branch 'gh/next' 2013-06-26 22:19:32 -07:00
Sage Weil
61a0436035 Merge pull request #378 from ceph/wip-init-rbd
Reviewed-by: Sage Weil <sage@inktank.com>
2013-06-26 22:15:11 -07:00
Sage Weil
cd7510f26c qa/workunits/misc/multiple_rsync: put tee output in /tmp
2013-06-25T10:29:15.811 INFO:teuthology.task.workunit.client.0.err:+ rsync -auv --exclude local/ /usr/ usr.2
2013-06-25T10:29:15.811 INFO:teuthology.task.workunit.client.0.err:+ tee a
2013-06-25T10:29:15.902 INFO:teuthology.task.workunit.client.0.out:sending incremental file list
2013-06-25T10:29:48.738 INFO:teuthology.task.workunit.client.0.out:
2013-06-25T10:29:48.740 INFO:teuthology.task.workunit.client.0.out:sent 1449972 bytes  received 7477 bytes  43505.94 bytes/sec
2013-06-25T10:29:48.740 INFO:teuthology.task.workunit.client.0.out:total size is 3205268241  speedup is 2199.23
2013-06-25T10:29:48.740 INFO:teuthology.task.workunit.client.0.err:+ hexdump -C a
2013-06-25T10:29:48.741 INFO:teuthology.task.workunit.client.0.out:00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
2013-06-25T10:29:48.741 INFO:teuthology.task.workunit.client.0.out:00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 0a 73  |...............s|
2013-06-25T10:29:48.742 INFO:teuthology.task.workunit.client.0.out:00000020  65 6e 74 20 31 34 34 39  39 37 32 20 62 79 74 65  |ent 1449972 byte|
2013-06-25T10:29:48.742 INFO:teuthology.task.workunit.client.0.out:00000030  73 20 20 72 65 63 65 69  76 65 64 20 37 34 37 37  |s  received 7477|
2013-06-25T10:29:48.742 INFO:teuthology.task.workunit.client.0.out:00000040  20 62 79 74 65 73 20 20  34 33 35 30 35 2e 39 34  | bytes  43505.94|
2013-06-25T10:29:48.742 INFO:teuthology.task.workunit.client.0.out:00000050  20 62 79 74 65 73 2f 73  65 63 0a 74 6f 74 61 6c  | bytes/sec.total|
2013-06-25T10:29:48.742 INFO:teuthology.task.workunit.client.0.out:00000060  20 73 69 7a 65 20 69 73  20 33 32 30 35 32 36 38  | size is 3205268|
2013-06-25T10:29:48.742 INFO:teuthology.task.workunit.client.0.out:00000070  32 34 31 20 20 73 70 65  65 64 75 70 20 69 73 20  |241  speedup is |
2013-06-25T10:29:48.743 INFO:teuthology.task.workunit.client.0.out:00000080  32 31 39 39 2e 32 33 0a                           |2199.23.|
2013-06-25T10:29:48.743 INFO:teuthology.task.workunit.client.0.out:00000088

This passes consistently when the output is in /tmp, but fails after a few
iterations when on cephfs+kclient.  Avoid the bug with this test.

See: #5453

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-26 22:11:07 -07:00
Yehuda Sadeh
e1f9fe58d2 rgw: fix radosgw-admin buckets list
Fixes: #5455
Backport: cuttlefish
This commit fixes a regression, where radosgw-admin buckets list
operation wasn't returning any data.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-06-26 21:30:49 -07:00
David Zafman
fe6633172e Handle non-existent front interface in maps from older MONs
Fix OSDService::get_con_osd_hb() to not try to get_connection() without front interface
Fix OSD::handle_osd_map() to check for missing front interface

Fixes: #5460

Signed-off-by: David Zafman <david.zafman@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-06-26 21:06:59 -07:00
Sage Weil
867ead91e4 qa/workunits/rbd/simple_1tb: add simple rbd read/write test on large image
Motivated by #5454.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-26 20:41:58 -07:00
Sage Weil
8a17f33b14 ceph-disk: do not mount over an osd directly in /var/lib/ceph/osd/$cluster-$id
If we see a 'ready' file in the target OSD dir, do not mount our device
on top of it.

Among other things, this prevents ceph-disk activate on stray disks from
stepping on teuthology osds.
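
As a rough illustration of the check described above (not the actual ceph-disk code), the guard could be sketched as follows. The function name `should_mount` is hypothetical; only the 'ready' marker file comes from the commit message.

```python
import os


def should_mount(osd_dir):
    """Return False when osd_dir already holds a live OSD.

    Hypothetical helper: the presence of a 'ready' file in the target
    directory is treated as a sign that an OSD lives there directly,
    so a device must not be mounted on top of it.
    """
    return not os.path.exists(os.path.join(osd_dir, "ready"))
```

A caller would test `should_mount(path)` before mounting and skip the device when it returns False.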

Fixes: #5445
Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-26 18:28:01 -07:00
Sage Weil
986185ca02 mon/PGMonitor: avoid duplicating map_pg_create() effort on same maps
If we have an election and refresh, but the osdmap does not change, there
is no need to recalculate the pg create maps.  However, if we register new
creating pgs, we do... when the last_pg_scan update gets pulled out of
paxos (i.e., on both leader and peon mons).

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-26 17:34:39 -07:00
Dan Mick
ca55c3416e cephtool/test.sh: add case for auth add with no caps
Test case for failure in #5467.  Supplying new auth info overwrites.

Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-06-26 17:09:49 -07:00
Dan Mick
bfed2d60a5 MonCommands.h: auth add doesn't require caps (it can use -i <file>)
This was a regression from the old behavior introduced by the
CLI rewrite.

Fixes: #5467
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-06-26 16:18:47 -07:00
Dan Mick
d1d902846d Merge branch 'next' 2013-06-26 12:39:15 -07:00
Dan Mick
71f3e56d4b Makefile.am: fix libglobal.la race with ceph_test_cors
ceph_test_cors had libglobal.la in its _LDFLAGS macro definition;
it should have been in _LDADD.  Moreover, things using libglobal.la
ought to be using LIBGLOBAL_LDA to add it to _LDADD.  Fix them all.

Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-06-26 12:28:09 -07:00
Sage Weil
e635c47851 mon/PGMonitor: use post_paxos_update, not init, to refresh from osdmap
We do two things here:
 - make init a one-time unconditional init method, which is what the
   health service expects/needs.
 - switch PGMonitor::init to be post_paxos_update() which is called after
   the other services update, which is what PGMonitor really needs.

This is a new version of the fix originally in commit
a2fe013794 (and those around it).  That is,
this re-fixes a problem where osds do not see pg creates from their
subscribe due to map_pg_creates() not getting called.

Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-26 06:55:02 -07:00
Sage Weil
131686980f mon/PaxosService: add post_paxos_update() hook
Some services need to update internal state based on other services'
state, and thus need to be run after everyone has pulled their info out of
paxos.

Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-26 06:55:02 -07:00
Sage Weil
ea1f316e5d mon: do not reopen MonitorDBStore during startup
leveldb doesn't seem to like this when it races with an internal compaction
attempt (see below).  Instead, let the store get opened by the ceph_mon
caller, and pull a bit of the logic into the caller to make the flow a
little easier to follow.

    -2> 2013-06-25 17:49:25.184490 7f4d439f8780 10 needs_conversion
    -1> 2013-06-25 17:49:25.184495 7f4d4065c700  5 asok(0x13b1460) entry start
     0> 2013-06-25 17:49:25.316908 7f4d3fe5b700 -1 *** Caught signal (Segmentation fault) **
 in thread 7f4d3fe5b700

 ceph version 0.64-667-g089cba8 (089cba8fc0e8ae8aef9a3111cba7342ecd0f8314)
 1: ceph-mon() [0x649f0a]
 2: (()+0xfcb0) [0x7f4d435dccb0]
 3: (leveldb::Table::BlockReader(void*, leveldb::ReadOptions const&, leveldb::Slice const&)+0x154) [0x806e54]
 4: ceph-mon() [0x808840]
 5: ceph-mon() [0x808b39]
 6: ceph-mon() [0x806540]
 7: (leveldb::DBImpl::DoCompactionWork(leveldb::DBImpl::CompactionState*)+0xdd) [0x7f363d]
 8: (leveldb::DBImpl::BackgroundCompaction()+0x2c0) [0x7f4210]
 9: (leveldb::DBImpl::BackgroundCall()+0x68) [0x7f4cc8]
 10: ceph-mon() [0x80b3af]
 11: (()+0x7e9a) [0x7f4d435d4e9a]
 12: (clone()+0x6d) [0x7f4d4196bccd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-26 06:55:02 -07:00
Sage Weil
516445bebc mon/Paxos: simplify trim()
Collapse all the trim methods into a single simple method.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-26 06:55:02 -07:00
Sage Weil
b8d04a2a8b mon/PaxosService: rename scrub
Make the name match the one in Paxos.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-26 06:55:02 -07:00
Sage Weil
ac63b2e095 mon/Paxos: clean up removal of pre-conversion paxos states
Use a helper, independent of trim machinery, and call on leader, too.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-26 06:55:02 -07:00
Sage Weil
d2f3811814 mon/Paxos: update first_committed only from paxos
Do not touch the in-memory first_committed until the trim commits.  This
avoids any possible confusion due to races and keeps commit() as similar
to store_state() as possible.

Similarly, do not touch first_committed from store_state.  We should
*only* pull it out of the kv store.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-26 06:55:02 -07:00
Sage Weil
290ccde1dc mon/Paxos: set first_committed on first commit
Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-26 06:55:01 -07:00
Gary Lowell
5511daf345 doc: public network statement needed on new monitors.
When using ceph-deploy to create a new monitor on a host that is not
in the initial set of hosts defined by the ceph-deploy new command,
a "public network" statement needs to be added to the ceph.conf file.
Fixes #5195.

Signed-off-by: Gary Lowell  <gary.lowell@inktank.com>
2013-06-26 06:27:17 -07:00
Sage Weil
fe365339b9 mon/Paxos: never write first_committed except during trim
The trimming is handled by proposing transactions.  Do not confuse matters
by writing (incorrect) first_committed values at any other point.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-25 21:25:04 -07:00
Sage Weil
e93730b7ff mon: enable leveldb cache by default
512 MB sounds reasonable to me.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-25 21:25:04 -07:00
Sage Weil
ad9c294850 mon/Paxos: assert that the store gives us back what we just wrote
In bug #5424 I observed leveldb failing internally and then returning
bad info.  We then hit a random/confusing assert.  Try to detect this
earlier by verifying that a get of a just-written last_committed gives
us back the right thing.
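
A minimal sketch of this read-after-write check, assuming a dict-like store; `put_and_verify` and the `store` mapping are illustrative names, not the monitor's real API:

```python
def put_and_verify(store, key, value):
    """Write value under key, then read it back and fail fast on mismatch.

    'store' is assumed to be any mapping-like object (a plain dict works);
    it stands in for the monitor's key/value store.
    """
    store[key] = value
    readback = store.get(key)
    # Detect a misbehaving backend here, rather than at a later,
    # unrelated assert.
    assert readback == value, "store returned %r, expected %r" % (readback, value)
    return readback
```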

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-06-25 21:25:04 -07:00
Sage Weil
11e0325372 mon/Paxos: drop unnecessary last_committed loads
Drop (apparently) ad-hoc refreshes of last_committed from the store.
These are unnecessary and confusing.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-25 21:25:04 -07:00
Sage Weil
d31ed95064 mon/PaxosService: allow paxos service writes while paxos is updating
In commit f985de28f8 I mistakenly made
is_writeable() false while paxos was updating due to a misread of
Paxos::propose_new_value() (I didn't see that it would queue).
This is problematic because it narrows the window during which each service
is writeable for no reason.

Allow service to be writeable both when paxos is active and updating.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-25 21:25:04 -07:00
Sage Weil
2d2aa00ed3 mon/PGMonitor: store PGMap directly in store, bypassing PaxosService stash_full
Instead of encoding incrementals and periodically dumping the whole encoded
PGMap, store everything in a range of keys, and update them
between versions using transactions.  The per-version values are now
breadcrumbs indicating which keys were dirtied so they can be refreshed
via update_from_paxos().

This has several benefits:
 - we avoid ever encoding the entire PGMap
 - we avoid dumping that blob into leveldb keys
 - we limit the amount of data living in forward-moving keys, which leveldb
   has a hard time compacting away
 - pgmap data instead lives over a fixed range of keys, which leveldb
   excels at
 - we only keep the latest copy of the PGMap (which is all we care about)
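
The scheme can be illustrated with a toy in-memory model. This is only a sketch under assumptions: the class names `KeyRangeStore` and `Follower`, the key names, and the use of `None` to mean deletion are all hypothetical and do not mirror the PGMonitor code.

```python
class KeyRangeStore:
    """Toy store: latest values in a fixed key range, plus per-version breadcrumbs."""

    def __init__(self):
        self.kv = {}             # fixed range of keys, latest value only
        self.dirty = {}          # version -> keys dirtied by that version
        self.last_committed = 0

    def commit(self, updates):
        """Apply one transaction; a value of None removes the key."""
        version = self.last_committed + 1
        for key, value in updates.items():
            if value is None:
                self.kv.pop(key, None)
            else:
                self.kv[key] = value
        # The per-version record is only a breadcrumb, not the data itself.
        self.dirty[version] = sorted(updates)
        self.last_committed = version
        return version


class Follower:
    """Keeps an in-memory copy current by refreshing only the dirtied keys."""

    def __init__(self, store):
        self.store = store
        self.version = 0
        self.cache = {}

    def update_from_store(self):
        while self.version < self.store.last_committed:
            self.version += 1
            for key in self.store.dirty[self.version]:
                if key in self.store.kv:
                    self.cache[key] = self.store.kv[key]
                else:
                    self.cache.pop(key, None)
```

Each commit rewrites values in place, so only the small breadcrumb entries accumulate in forward-moving keys.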

Bump the internal monitor protocol version.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-25 21:25:04 -07:00
Sage Weil
5680fa1e85 doc/release-notes: v0.65
Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-25 14:14:39 -07:00
Gary Lowell
70be76b2e2 Merge branch 'next' 2013-06-25 13:45:22 -07:00
Josh Durgin
0e1612b3c4 Merge pull request #380 from dachary/wip-4907
get_xattr() can return more than 4KB

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-06-25 10:57:41 -07:00
Sage Weil
12678a1093 Merge pull request #379 from dachary/wip-5312
skip TEST(EXT4StoreTest, _detect_fs) if DISK or MOUNTPOINT are undefined

Reviewed-by: Sage Weil <sage@inktank.com>
2013-06-25 10:15:10 -07:00
Sage Weil
c8f793694c mon/AuthMonitor: start at format 1 (latest) for new clusters
Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-25 09:57:00 -07:00
Sage Weil
950c0f353b mon/PaxosService: move upgrade_format() machinery into PaxosService
We originally did this in AuthMonitor, but it is perfect for PGMonitor too,
so make it generic.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-25 09:57:00 -07:00
Sage Weil
0d73eb4dad mon/PGMonitor: drop some dead code
Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-25 09:57:00 -07:00
Sage Weil
0fd776da48 mon/PGMap: make int type explicit
We get away with this because int is 32-bits on x86_64 and i386 both, but
we should be explicit anyway!

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-25 09:57:00 -07:00
Sage Weil
29e14bafa4 mon/PaxosService: s/get_version()/get_last_committed()/
Avoid aliasing simple accessors; use a single name instead.  Also, function
name overloading will throw a wrench in the class inheritance later.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-25 09:57:00 -07:00
Gary Lowell
c2d517ef96 v0.65 2013-06-25 09:19:32 -07:00
Loic Dachary
3016f46f53 get_xattr() can return more than 4KB
Instead of failing if the attribute to be returned is larger than 4KB,
double the buffer size each time librados.rados_getxattr returns
-errno.ERANGE and try again.
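
The retry loop described above could look roughly like the following sketch. The `getxattr` callable, `initial_size`, and the `max_size` cap are illustrative assumptions standing in for the real librados.rados_getxattr binding; this is not the actual rados.py code.

```python
import errno


def get_xattr_with_retry(getxattr, key, initial_size=4096, max_size=1 << 20):
    """Fetch an xattr, doubling the buffer size whenever ERANGE is reported.

    getxattr(key, size) is a hypothetical callable that returns the value
    as bytes on success, or a negative errno on failure.
    The max_size cap is an assumption added to keep the loop bounded.
    """
    size = initial_size
    while size <= max_size:
        ret = getxattr(key, size)
        if isinstance(ret, bytes):
            return ret
        if ret != -errno.ERANGE:
            raise OSError(-ret, "getxattr failed for %r" % (key,))
        size *= 2  # buffer too small: double it and try again
    raise OSError(errno.ERANGE, "xattr %r exceeds %d bytes" % (key, max_size))
```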

http://tracker.ceph.com/issues/4907 fixes #4907

Signed-off-by: Loic Dachary <loic@dachary.org>
2013-06-25 16:10:22 +02:00
Loic Dachary
6e320a1bd3 skip TEST(EXT4StoreTest, _detect_fs) if DISK or MOUNTPOINT are undefined
The TEST(EXT4StoreTest, _detect_fs) test is meant to be run from
qa/workunits/filestore/filestore.sh, after the ext4 file system was
created. If the DISK and MOUNTPOINT environment variables are not
defined, display a message explaining the expected environment and
silently skip the test. The tests in store_test.cc are not unit tests
because they depend on their environment.

http://tracker.ceph.com/issues/5312 fixes #5312

Signed-off-by: Loic Dachary <loic@dachary.org>
2013-06-25 15:09:57 +02:00
Laurent Barbe
a4ddf70486 Add rc script for rbd map/unmap
Init script for mapping/unmapping rbd device on startup and shutdown.
On start, map rbd devices according to /etc/rbdmap, and force mount -a.
On stop, unmount the file systems that depend on rbd and unmap all rbd devices.
Since some distributions use a symlink for /etc/mtab, the user-space attribute _netdev is not enough to unmount the file system before the rbd device.
(also concerns: #1790)

Signed-off-by: Laurent Barbe <laurent@ksperis.com>
2013-06-24 21:48:26 -07:00
Sage Weil
b28bd7870f mon/PaxosService: drop unused last_accepted_name
Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-24 21:07:26 -07:00
Sage Weil
6060268f2b mon/PaxosService: some whitespace
Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-24 21:07:26 -07:00
Sage Weil
7c9dee0160 mon/PaxosService: drop unused {get,set,put}_version(prefix, a, bl)
Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-24 21:07:26 -07:00
Sage Weil
1d913d2056 mon/OSDMonitor: use provided get_version_full()
Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-24 21:07:25 -07:00
Sage Weil
872f4d5fb4 mon/PaxosService: simplify full helpers, drop single-use helper
We are the only caller for get_version(prefix, name), so move it inline
and drop it.  Also rename full_version_name to full_prefix_name, which I
find slightly less confusing.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-24 21:07:25 -07:00
Sage Weil
83c49be369 mon/PaxosService: remove mkfs helpers
Keep it simple.  These are one-liners.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-24 21:07:25 -07:00
Sage Weil
c47f271de0 mon: fix mkfs monmap cleanup
exists_key(a,b) was looking for "monmap/mkfs/monmap".

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-24 21:07:25 -07:00