Commit Graph

22470 Commits

Author SHA1 Message Date
Joao Eduardo Luis
493049b4b6 mon: OSDMonitor: clarify some command replies
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
2012-11-16 15:30:53 +00:00
Joao Eduardo Luis
0b28ef6a15 mon: OSDMonitor: fix spacing when outputting items on command reply
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
2012-11-16 15:30:24 +00:00
Sage Weil
71cfaf1cc5 os/FileStore: only try BTRFS_IOC_SUBVOL_CREATE on btrfs
Only try to create a btrfs subvolume if the fs is btrfs.  Otherwise, just
create a directory.  Then we can error out on *any* ioctl error, and not
rely on the ioctl error code to determine if we failed because we are on
a non-btrfs or a real error.

Fixes: #3052
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
2012-11-15 16:50:55 -08:00
Sage Weil
3ca947e018 mon: clean up 'ceph osd ...' list output
No more 'osd.0 is already inosd.1 is already in' crap.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-15 16:35:53 -08:00
Sage Weil
344c4fdc3f mon: correctly identify crush names
get_item_id() returns 0 if the name does not exist; that's not what we
want here.  Verify the name exists before checking its id.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-15 16:32:50 -08:00
Sage Weil
592a89421f mon: use parse_osd_id() throughout
Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-15 16:32:50 -08:00
Samuel Just
918c58c85e PrioritizedQueue: remove internal lock, not used
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2012-11-15 16:02:34 -08:00
Samuel Just
b53e06cac8 DispatchQueue: lock DispatchQueue for get_queue_len()
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2012-11-15 15:59:18 -08:00
Alex Elder
659d4c25b2 run_xfstests.sh: activate more tests that now work
I've gone through the set of xfstests that were previously found to
not work.  Some of those now do work, and with the addition of an
option to pass to "mkfs.xfs", a large number of other tests now
produce expected output as well.

This patch updates the default list of tests to run to reflect
the result of this exercise.  The following 50 additional tests
are now run by default:

    029 074 078 084-087 100 105 117 121 124 126 129-134
    164 165 167 174 181 184 186 187 192 214-216 227 236
    237 241 243 245-249 257-259 261 277 278 280 285 286

Test 127 completed without error, but it took 1-3 hours, so I
kept it out of the list.

Signed-off-by: Alex Elder <elder@inktank.com>
2012-11-15 17:51:34 -06:00
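The test list in the commit above uses ranges such as 084-087. A minimal sketch of how such a list could be expanded and counted follows; the helper name `expand_tests` is hypothetical and is not part of run_xfstests.sh, and it assumes every token is either a single test number or an inclusive `first-last` range.

```python
def expand_tests(spec):
    """Expand a whitespace-separated list like '029 084-087' into test names."""
    tests = []
    for token in spec.split():
        if "-" in token:
            first, last = token.split("-")
            # Keep the zero-padded, three-digit form used in the list above.
            tests.extend(f"{n:03d}" for n in range(int(first), int(last) + 1))
        else:
            tests.append(token)
    return tests


# The list copied from the commit message above.
ADDED = """
    029 074 078 084-087 100 105 117 121 124 126 129-134
    164 165 167 174 181 184 186 187 192 214-216 227 236
    237 241 243 245-249 257-259 261 277 278 280 285 286
"""

tests = expand_tests(ADDED)
print(len(tests))  # 50, matching the count stated in the commit message
```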
Sage Weil
b40387de23 msg/Pipe: fix leak of Authorizer
Reported-by: Joao Luis <joao.luis@inktank.com>
Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-15 10:06:07 -08:00
Sage Weil
0fb23cf8ce Merge remote-tracking branch 'gh/wip-3477' into next
Reviewed-by: Greg Farnum <greg@inktank.com>
2012-11-15 09:48:25 -08:00
Samuel Just
12c2b7fa20 msg/DispatchQueue: release throttle on messages when dropping an id
Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-11-14 17:05:58 -08:00
Samuel Just
5f214b2938 PrioritizedQueue: allow remove_by_class to return removed items
Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-11-14 17:05:55 -08:00
Sage Weil
98b93b5d3d librbd: use delete[] properly
==4986== Mismatched free() / delete / delete []
==4986==    at 0x4C2658C: operator delete(void*) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==4986==    by 0x4ED8EA9: librbd::ImageCtx::~ImageCtx() (ImageCtx.cc:100)
==4986==    by 0x4EF3827: librbd::close_image(librbd::ImageCtx*) (internal.cc:1869)
==4986==    by 0x4EE8FB8: librbd::clone(librados::IoCtx&, char const*, char const*, librados::IoCtx&, char const*, unsigned long, int*, unsigned long, int) (internal.cc:900)
==4986==    by 0x4EC363C: rbd_clone2 (librbd.cc:553)
==4986==    by 0x404C85: do_clone (fsx.c:836)
==4986==    by 0x405639: test (fsx.c:1048)
==4986==    by 0x406369: main (fsx.c:1523)
==4986==  Address 0xd498b30 is 0 bytes inside a block of size 37 alloc'd
==4986==    at 0x4C26CF7: operator new[](unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==4986==    by 0x4ED9B4D: librbd::ImageCtx::init_layout() (ImageCtx.cc:164)
==4986==    by 0x4ED9845: librbd::ImageCtx::init() (ImageCtx.cc:142)
==4986==    by 0x4EF3449: librbd::open_image(librbd::ImageCtx*, bool) (internal.cc:1828)
==4986==    by 0x4EE89E0: librbd::clone(librados::IoCtx&, char const*, char const*, librados::IoCtx&, char const*, unsigned long, int*, unsigned long, int) (internal.cc:871)
==4986==    by 0x4EC363C: rbd_clone2 (librbd.cc:553)
==4986==    by 0x404C85: do_clone (fsx.c:836)
==4986==    by 0x405639: test (fsx.c:1048)
==4986==    by 0x406369: main (fsx.c:1523)

Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-14 17:05:39 -08:00
Sage Weil
4a7a81bb53 objecter: fix leak of out_handlers
The error paths don't use the handlers.  Make sure they get cleaned up.

Fixes: #3446
Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-14 17:05:39 -08:00
Sage Weil
ef4e4c8287 mon: calculate failed_since relative to message receive time
Instead of looking at the current time we process the message, look at the
receive time.  This gives us a more accurate failure time given that messages
may be requeued.

It doesn't solve the problem when messages are forwarded between monitors
due to an election, but that's ok; this is still a net improvement.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-14 17:00:57 -08:00
Yehuda Sadeh
9267d8a42f rgw: update post policy parser
json parser semantics changed a little bit, so
needed to update the policy parser.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-11-14 16:45:56 -08:00
Sage Weil
f6cb0780ac mon: set default port when binding to random local ip
Fixes: #3135
Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-14 16:26:58 -08:00
Sage Weil
dfeb8ded6a Merge remote-tracking branch 'gh/wip-asok' into next
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2012-11-14 16:22:27 -08:00
Yehuda Sadeh
ce28455206 rgw: relax date format check
Don't try to parse beyond the GMT or UTC. Some clients use
special date formatting. If we end up misparsing the date
it'll fail in the authorization, so don't need to be too
restrictive.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-11-14 15:51:36 -08:00
Sage Weil
4a34965c56 client: register admin socket commands without lock held
Avoid a lock cycle.

existing dependency Client::client_lock (11) -> AdminSocket::m_lock (16) at:
 ceph version 0.54-578-g7926ef5 (7926ef5393)
 1: (Mutex::Lock(bool)+0x41) [0x831337]
 2: (AdminSocket::register_command(std::string, AdminSocketHook*, std::string)+0x40) [0x873a32]
 3: (Client::init()+0x454) [0x6f4c24]
 4: (main()+0x637) [0x6ea399]
 5: (__libc_start_main()+0xed) [0x7fd97bbca76d]
 6: ./ceph-fuse() [0x6e9c59]

    -4> 2012-11-13 18:14:48.619714 7fd97b1a3700  0 new dependency AdminSocket::m_lock (16) -> Client::client_lock (11) creates a cycle at
 ceph version 0.54-578-g7926ef5 (7926ef5393)
 1: (Mutex::Lock(bool)+0x41) [0x831337]
 2: (Objecter::RequestStateHook::call(std::string, std::string, ceph::buffer::list&)+0x7a) [0x90627e]
 3: (AdminSocket::do_accept()+0xb1b) [0x87318f]
 4: (AdminSocket::entry()+0x2fa) [0x8725fe]
 5: (Thread::_entry_func(void*)+0x23) [0x86b335]
 6: (()+0x7e9a) [0x7fd97d279e9a]
 7: (clone()+0x6d) [0x7fd97bc9ccbd]

Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-13 18:18:24 -08:00
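The trace in the commit above reports a lock-order cycle: AdminSocket::m_lock was first taken while Client::client_lock was held, and later the reverse order was observed. The following is only an illustrative sketch of how a lock-order checker can detect such a cycle. The class `LockOrderGraph` and its methods are hypothetical and are not Ceph's lockdep implementation.

```python
from collections import defaultdict


class LockOrderGraph:
    """Hypothetical lock-order tracker: an edge A -> B means B was taken while A was held."""

    def __init__(self):
        self.after = defaultdict(set)

    def _reachable(self, start, target):
        # Depth-first search over the recorded dependencies.
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == target:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.after[node])
        return False

    def add_dependency(self, held, wanted):
        """Record 'wanted acquired while held'; return True if this creates a cycle."""
        # A cycle appears if 'held' can already be reached starting from 'wanted'.
        creates_cycle = self._reachable(wanted, held)
        self.after[held].add(wanted)
        return creates_cycle


graph = LockOrderGraph()
# Existing dependency from the trace: client_lock held, then m_lock taken.
first = graph.add_dependency("Client::client_lock", "AdminSocket::m_lock")
# New dependency from the trace: m_lock held, then client_lock taken.
second = graph.add_dependency("AdminSocket::m_lock", "Client::client_lock")
print(first, second)  # False True
```

Registering the admin socket commands without client_lock held, as the commit does, means the first edge is never recorded, so the second one no longer closes a cycle.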
Sage Weil
4db9442bad objecter: separate locked and unlocked init/shutdown
We don't want to hold the lock while we register the admin socket commands
or else we create a lock cycle when we try to process them later.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-13 18:17:55 -08:00
Gary Lowell
7926ef5393 Merge branch 'next'
Conflicts:
	configure.ac
	src/rgw/rgw_common.cc
2012-11-13 17:29:47 -08:00
Samuel Just
d395131c7f osd/: add config helper for min_size and update build_simple*
min_size should never be set to 0 on a pool.  config.h
now has a helper to determine the correct default value.

Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2012-11-13 17:09:26 -08:00
Sage Weil
d5bc66ac49 doc/release-notes: fix heading
Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-13 17:11:34 -08:00
Sage Weil
74f7607afa doc: release-notes for v0.54
Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-13 16:29:54 -08:00
Sage Weil
0d42e9762b doc: update crush weight ramping process
Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-13 16:00:00 -08:00
Yehuda Sadeh
131d15a772 rgw: fix warning
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-11-13 15:43:48 -08:00
Sage Weil
97f65f6e42 Merge branch 'wip-min-size'
Reviewed-by: Sam Just <sam.just@inktank.com>
2012-11-13 15:39:42 -08:00
Sage Weil
a0eb8919ef osd: default pool min_size to 0 (which gives us size-size/2)
Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-13 15:16:56 -08:00
Sage Weil
1d00f3aa67 mon: default min_size to size-size/2 if min_size default is 0
Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-13 15:12:33 -08:00
Sage Weil
9d979d767d osd: default min_size to size - size/2
size -> min_size:
 5 -> 3
 4 -> 2
 3 -> 2
 2 -> 1

Basically, default to tolerating minority down.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-13 15:12:33 -08:00
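The size -> min_size table in the commit above follows from integer division. A minimal sketch of the arithmetic is shown here; the function name `default_min_size` and the `configured_default` parameter are hypothetical, and treating a nonzero configured value as "use it unchanged" is an assumption based only on the neighbouring commit subjects, not on the actual Ceph code.

```python
def default_min_size(size, configured_default=0):
    """Return the default min_size for a pool with the given replica count.

    Assumption: a configured default of 0 means "derive it from size";
    any other value is returned unchanged.
    """
    if configured_default != 0:
        return configured_default
    # Integer division: tolerate a minority of replicas being down.
    return size - size // 2


for size in (5, 4, 3, 2):
    print(size, "->", default_min_size(size))
# 5 -> 3
# 4 -> 2
# 3 -> 2
# 2 -> 1
```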
Sage Weil
735df024ad mon: helpful warning in 'health detail' output about incomplete pgs
Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-13 15:12:33 -08:00
Sage Weil
1679a55662 osd: start_boot() after init()
The previous trigger for start_boot() was racy, depending on whether we
got our rotating keys quickly.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-13 15:12:20 -08:00
Dan Mick
65961ca23b vstart.sh: support -X by adding 'auth required = none' entries
Signed-off-by: Dan Mick <dan.mick@inktank.com>
2012-11-13 15:12:16 -08:00
Sage Weil
6a8a59c5d0 Merge remote-tracking branch 'gh/wip-rgw-integration'
Conflicts:
	src/common/config_opts.h
2012-11-13 14:50:42 -08:00
Gary Lowell
60b84b095b v0.54
2012-11-13 13:18:07 -08:00
Yehuda Sadeh
5d27f3da65 rgw: compile with -Woverloaded-virtual
This will trigger a warning if RGWRados api changes while
RGWCache doesn't.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-11-13 13:06:22 -08:00
Yehuda Sadeh
1be99237d0 rgw: fix RGWCache api
RGWCache api diverged from RGWRados, crippling the cache.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-11-13 12:39:42 -08:00
Yehuda Sadeh
e0e33d2c99 rgw: fix RGWCache api
RGWCache api diverged from RGWRados, crippling the cache.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-11-13 12:39:15 -08:00
Sage Weil
9a38059afa osd: remove dead rotating key code from init
Ancient, dead.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-13 12:29:09 -08:00
Sage Weil
eee0982223 osd: defer boot until we have rotating keys
Make sure we have our rotating keys before we start booting.  This
ensures we can open connections with peers *before* we add ourselves to
the osdmap.  This behavior marks instances of #3292, although it is
not clear whether it is responsible for the actual crash.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Sam Just <sam.just@inktank.com>
2012-11-13 12:28:56 -08:00
Samuel Just
b151597efa Merge branches 'wip_persist_missing' and 'wip_recovery_qos'
Reviewed-by: Sage Weil <sage@inktank.com>
2012-11-13 10:56:41 -08:00
Samuel Just
193e2ea532 PG: persist divergent_priors in ondisklog
Consider the following logs:

a) 10'10(5'7) foo
   12'11(4'3) bar

b) 10'10(5'7) foo
   13'11(4'4) baz

When the osd with log a merges primary log b, bar is deleted and
added to the missing set with need=4'3 and have=0'0.  If
the osd then dies after deleting bar, but before recovering
bar, PG::read_state() on start up will fail to re-add bar
to the missing set, and bar will be incorrect on that osd.

Now, (4'3, bar) will be added to the divergent_priors mapping
to be scanned during read_state along with the log.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-11-13 10:56:10 -08:00
Samuel Just
fcbbebc3d8 PG::merge_old_entry: fix case for divergent prior_version
Previously, we asserted that a log entry with a divergent
prior_version must be a clone.  Consider the following
case:

6'11(6'2)  m foo
7'12(6'3) m bar
7'13(7'12) m bar

If this is merged with:

6'11(6'2)  m foo
8'12(6'4) m baz

we will hit the assert.

Merging a divergent entry with prior_version after current
tail, but not in the log implies that prior_version was a
divergent entry which we have already merged.  The missing
set and filestore collection must therefore have already
been adjusted.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-11-13 10:56:10 -08:00
Sage Weil
f299be00f7 PrioritizedQueue: use iterator to streamline SubQueue::remove_by_class()
Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-13 10:45:00 -08:00
Sage Weil
95cb6cf443 PrioritizedQueue: avoid double-lookup on create_queue()
Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-13 10:45:00 -08:00
Samuel Just
57a62554d6 osd/: de-prioritize recovery ops relative to client ops
Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-11-13 10:45:00 -08:00
Samuel Just
bd4707ad9a msg/: use PrioritizedQueue to handle DispatchQueue queueing
Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-11-13 10:45:00 -08:00
Samuel Just
5d47db2d16 OSD: queue ops based on message priority
Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-11-13 10:45:00 -08:00