Commit Graph

25781 Commits

Author SHA1 Message Date
Sage Weil
bd68b82bd6 mon: enable 'mon compact on trim' by default; trim in larger increments
This resolves the leveldb growth-without-bound problem observed by
mikedawson, and all the badness that stems from it.  Enable this by
default until we figure out why leveldb is not behaving better.

While we are at it, trim more states at a time.  This will make
compaction less frequent, which should help given that there is some
overhead unrelated to the amount of deleted data.

Fixes: #4815
Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-29 17:20:39 -07:00
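
The claim in bd68b82bd6 that trimming more states at a time makes compaction less frequent can be illustrated with a toy model. This is a minimal sketch, not the monitor's code: the function name `count_compactions`, its parameters, and the rule "trim and compact once the backlog reaches `keep + trim_batch`" are all assumptions made for illustration.

```cpp
#include <cstdint>

// Hypothetical model: one new state is committed per step. Once the number
// of stored states reaches keep + trim_batch, trim_batch states are trimmed
// and one compaction runs. Returns the number of compactions over `steps`.
inline uint64_t count_compactions(uint64_t steps, uint64_t keep,
                                  uint64_t trim_batch) {
  uint64_t stored = 0;
  uint64_t compactions = 0;
  for (uint64_t i = 0; i < steps; ++i) {
    ++stored;
    if (stored >= keep + trim_batch) {
      stored -= trim_batch;  // trim a whole batch at once
      ++compactions;         // one compaction per trim
    }
  }
  return compactions;
}
```

In this model, a larger `trim_batch` means fewer compactions for the same number of commits, so any fixed per-compaction overhead is paid less often.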
Sage Weil
95ece01251 Merge pull request #249 from ceph/wip-cuttle-man
man page updates

Reviewed-by: Sage Weil <sage@inktank.com>
2013-04-29 17:09:37 -07:00
Sage Weil
929a9944c9 mon: share extra probe peers with debug log, mon_status
This is useful when debugging initial quorum formation.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-29 17:08:04 -07:00
Sage Weil
030bf8aaa1 debian: only start/stop upstart jobs if upstart is present
This avoids errors on non-upstart distros (like wheezy).

Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-29 17:01:55 -07:00
Sage Weil
5d20c39caa Merge remote-tracking branch 'gh/wip-up' into next
Reviewed-by: Sam Lang <sam.lang@inktank.com>
2013-04-29 16:57:13 -07:00
Sage Weil
4b9325b2b3 Merge pull request #248 from ctrlaltdel/next
Fix a README typo
2013-04-29 16:46:52 -07:00
Josh Durgin
23c591ed99 Merge pull request #244 from dalgaaf/wip-da-pylint-2
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-04-29 16:20:42 -07:00
Josh Durgin
825a43176b man: update remaining copyright notices
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2013-04-29 16:01:38 -07:00
Josh Durgin
4abf081495 man: refresh content from rst
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2013-04-29 16:01:03 -07:00
Samuel Just
2b5dda0e6a Merge branch 'wip_4860' into next
Reviewed-by: Sage Weil <sage@inktank.com>
Reviewed-by: David Zafman <david.zafman@inktank.com>
2013-04-29 15:57:29 -07:00
Samuel Just
1bd011a101 PG,OSD: _remove_pg must remove pg keys
Instead of doing this in OSD::_remove_pg, pass a transaction
to on_removal and do it in PG.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-04-29 15:56:54 -07:00
Samuel Just
714601261b OSD: no need to remove snapdirs on _remove_pg()
The snapmapper patches removed snapdirs altogether.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-04-29 15:56:19 -07:00
Sage Weil
8f6a1b8fa9 mon/Paxos: compact on trim
Compact the paxos keys when we trim old paxos states.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-29 15:45:58 -07:00
Sage Weil
3cb4f6783b mon: compact PaxosService prefix on trim
Each time we trim a PaxosService, have leveldb compact so that the
space from removed states is reclaimed.

This is probably not optimal if leveldb's heuristics are doing the right
thing, but it currently appears as if they are not.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-29 15:45:56 -07:00
Sage Weil
e8c9824102 mon: add compact_prefix transaction operation
Add a prefix compaction operation to the transaction that will be
performed after the transaction applies.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-29 15:45:41 -07:00
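
The idea in e8c9824102 of a compaction request that travels with the transaction and runs only after the transaction applies can be sketched as follows. This is a hedged illustration, not the Ceph implementation: `ToyStore`, `ToyTransaction`, `apply_transaction`, and the `prefix/key` naming are hypothetical, and real compaction is replaced by a log of the prefixes that would have been compacted.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a key/value store. Compaction is not performed;
// the prefixes that would be compacted are only recorded.
struct ToyStore {
  std::map<std::string, std::string> data;
  std::vector<std::string> compacted_prefixes;
};

// Hypothetical transaction: collects erases plus prefixes to compact later.
struct ToyTransaction {
  std::vector<std::string> erase_keys;
  std::vector<std::string> compact_after;

  void erase(const std::string& prefix, const std::string& key) {
    erase_keys.push_back(prefix + "/" + key);  // assumed key layout
  }
  void compact_prefix(const std::string& prefix) {
    compact_after.push_back(prefix);
  }
};

inline void apply_transaction(ToyStore& store, const ToyTransaction& t) {
  for (const auto& k : t.erase_keys)
    store.data.erase(k);
  // Compaction is deferred until every mutation has been applied.
  for (const auto& p : t.compact_after)
    store.compacted_prefixes.push_back(p);
}
```

Queuing the compaction inside the transaction keeps the trim and its follow-up compaction together, so a caller cannot compact before the deletions it is meant to reclaim have landed.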
Sage Weil
a2f7d1d1f1 leveldb: add compact_prefix method
Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-29 15:45:41 -07:00
Sage Weil
90b6b6df31 mon: compact leveldb on bootstrap
This is an opportunistic time to optimize our local data since we are
out of quorum.  It serves as a safety net for cases where leveldb's
automatic compaction doesn't work quite right and lets things get out
of hand.

Anecdotally we have seen stores in excess of 30GB compact down to a few
hundred KB.  And a 9GB store compact down to 900MB in only 1 minute.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-29 15:45:39 -07:00
Sage Weil
ee3cdaa86c mon: compact leveldb on bootstrap
This is an opportunistic time to optimize our local data since we are
out of quorum.  It serves as a safety net for cases where leveldb's
automatic compaction doesn't work quite right and lets things get out
of hand.

Anecdotally we have seen stores in excess of 30GB compact down to a few
hundred KB.  And a 9GB store compact down to 900MB in only 1 minute.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-29 15:45:17 -07:00
Sage Weil
5fa0f04852 mon: --compact argument, config option to compact the store on start
Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-29 15:44:58 -07:00
Sage Weil
6a00f33251 leveldb: add compact() method
This will compact the entire store; it will be slow!

Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-29 15:43:47 -07:00
Josh Durgin
ffc8557acd doc: update rbd man page for new options
--no-progress and --allow-shrink were added recently.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2013-04-29 15:37:06 -07:00
Samuel Just
8b2a1475b0 gitignore: add ceph_monstore_tool
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2013-04-29 15:05:37 -07:00
Sage Weil
29831f9662 Makefile: fix java build warning
This is a workaround that makes the warning go away.  Not certain there
isn't something we should be changing...

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joe Buck <joe.buck@inktank.com>
2013-04-29 14:50:41 -07:00
Sage Weil
6a5be251df Merge branch 'wip-mon-pg' into next
Reviewed-by: Samuel Just <sam.just@inktank.com>
2013-04-29 11:27:22 -07:00
Sage Weil
a2fe013794 mon: remap creating pgs on startup
After Monitor::init_paxos() has loaded all of the PaxosService state,
we should then map creating pgs to osds.  This ensures we do so after the
osdmap has been loaded and the pgs actually map somewhere meaningful.

Fixes: #4675
Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-29 11:11:27 -07:00
Sage Weil
278186d750 mon: only map/send pg creations if osdmap is defined
This avoids calculating new pg creation mappings if the osdmap isn't
loaded yet, which currently happens during Monitor::paxos_init()
on startup.  Assuming osdmap epoch is nonzero, it should always be
safe to do this (although possibly unnecessary).

More cleanup here is certainly possible, but this is one step toward fixing
the bad behavior for #4675.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-29 11:11:24 -07:00
Sage Weil
28d495a371 mon: factor map_pg_creates() out of send_pg_creates()
Factor out the portion of the function that remaps creating pgs to osds
from the part that sends those pending creates out.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-29 11:07:08 -07:00
Sage Weil
896b2777ce client: make dup reply a louder error
If we get a dup reply something is probably wrong!  We should make sure
it appears more loudly in the log.  In particular, it can lead to out
of sync cap state; see #4853.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-29 10:46:04 -07:00
Sage Weil
ee553ac279 client: fix session open vs mdsmap race with request kicking
A sequence like:

 - ceph-fuse starts, make_request on getattr
 - waits for mds to be active
 - tries to open a session
 - mds restarts, recovers
 - eventually gets session open reply
 - sends first getattr (even though mds is in reconnect state)
 - gets mdsmap update that mds is now active
 - kicks request, resends getattr
 - get first reply
 - ignore second reply, caps get out of sync

The bug is that we send the first request when the MDS is still in
the reconnect state.  The fix is to loop in make_request so that we
ensure all conditions are satisfied before sending the request.  Any
time we wait, we loop, so that we know all conditions (still) pass if
we make it to the end.

Fixes: #4853
Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-29 10:46:03 -07:00
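
The fix described in ee553ac279 (loop so that every condition is re-checked after any wait) follows a general pattern that can be sketched as below. This is not the client's `make_request` code: `Condition` and `wait_until_all_ready` are hypothetical names, and the conditions are modelled as plain callbacks.

```cpp
#include <functional>
#include <vector>

// Hypothetical precondition: `ready` reports whether it holds right now,
// `wait` blocks until it might hold.
struct Condition {
  std::function<bool()> ready;
  std::function<void()> wait;
};

// Returns only after one full pass in which every condition held and no
// wait occurred. Any wait restarts the pass, because conditions checked
// earlier may have changed while we were waiting.
// The return value is the number of passes taken.
inline int wait_until_all_ready(const std::vector<Condition>& conds) {
  int passes = 0;
  for (;;) {
    ++passes;
    bool waited = false;
    for (const auto& c : conds) {
      if (!c.ready()) {
        c.wait();
        waited = true;
        break;  // re-check everything from the top
      }
    }
    if (!waited)
      return passes;
  }
}
```

In the scenario from the commit message, the first condition would be "MDS is active" and the second "session is open"; if the MDS restarts while the session is being opened, the restart of the pass catches it before the request is sent.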
Samuel Just
f8f762a281 Merge branch 'wip_4836' into next
Fixes: #4836
Reviewed-by: Sage Weil <sage@inktank.com>
2013-04-29 10:45:24 -07:00
Francois Deppierraz
bf0b4306a6 Fix a README typo
Signed-off-by: François Deppierraz <francois@ctrlaltdel.ch>
2013-04-29 10:22:27 +02:00
Yan, Zheng
cea2ff8615 mon: Fix leak of context
Use Context::complete() to finish context, it frees the context
after executing Context::finish().

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2013-04-28 21:15:25 -07:00
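
The ownership rule stated in cea2ff8615 (complete() runs finish() and then frees the context) can be illustrated with a simplified callback class. This is a sketch under assumptions, not Ceph's `Context` definition: `ToyContext` and `RecordResult` are hypothetical, and the destructor counter exists only to make the free observable.

```cpp
// Hypothetical, simplified callback type: finish() does the work,
// complete() runs finish() and then frees the object.
class ToyContext {
public:
  virtual ~ToyContext() {}
  // The object must be heap-allocated and must not be used after this call.
  void complete(int r) {
    finish(r);
    delete this;
  }
protected:
  virtual void finish(int r) = 0;
};

// Example callback that records its result and counts destructions.
struct RecordResult : ToyContext {
  int* result;
  int* destroyed;
  RecordResult(int* res, int* d) : result(res), destroyed(d) {}
  ~RecordResult() override { ++*destroyed; }
protected:
  void finish(int r) override { *result = r; }
};
```

Keeping finish() protected in this sketch means callers can only go through complete(), which is what prevents the kind of leak the commit fixes: running the callback without ever freeing it.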
John Wilkins
20d99c4a5a doc: Removed extra whitespace.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-04-28 15:01:44 -07:00
John Wilkins
041b0cf950 doc: Added rbd-fuse to TOC.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-04-28 15:01:12 -07:00
John Wilkins
8f48a3d12c Added commentary and removed fourth column for now.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-04-28 15:00:51 -07:00
John Wilkins
4e805a573e doc: Removed. Redundant information now.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-04-28 15:00:10 -07:00
John Wilkins
661278523a doc: Added openssh-server mention, corrections, hyperlink fix.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-04-28 14:59:51 -07:00
John Wilkins
21db055e8d doc: Added openssh-server mention.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-04-28 14:59:17 -07:00
John Wilkins
9fa6ba792e doc: Added manpage link and hidden TOC.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-04-28 14:58:45 -07:00
John Wilkins
dd6e79aa77 doc: Removed installed Chef. This is now in the ceph wiki.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-04-28 14:58:08 -07:00
John Wilkins
945dac6580 doc: Removed text for include directive. Wasn't behaving the way I'd hoped.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-04-28 14:57:44 -07:00
John Wilkins
3d9bc46945 doc: Added ceph-mds to CephFS toc.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-04-28 14:57:03 -07:00
John Wilkins
44d13a76a9 doc: Fix. ceph, not chef.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-04-27 22:28:42 -07:00
Sage Weil
5327d06275 ceph-filestore-dump: fix warnings on i386 build
tools/ceph-filestore-dump.cc: In member function ‘int header::get_header()’:
warning: tools/ceph-filestore-dump.cc:454:19: comparison between signed and unsigned integer expressions [-Wsign-compare]
tools/ceph-filestore-dump.cc: In member function ‘int footer::get_footer()’:
warning: tools/ceph-filestore-dump.cc:471:19: comparison between signed and unsigned integer expressions [-Wsign-compare]
tools/ceph-filestore-dump.cc: In member function ‘int super_header::read_super()’:
warning: tools/ceph-filestore-dump.cc:697:30: comparison between signed and unsigned integer expressions [-Wsign-compare]

Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-27 17:59:24 -07:00
Sage Weil
3cc106453f Merge remote-tracking branch 'gh/next'
2013-04-26 18:12:24 -07:00
Samuel Just
79280d9f4e OSDMonitor: when adding bucket, delay response if pending map has name
Fixes: #4836
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-04-26 17:19:59 -07:00
Samuel Just
e725c3e210 PaxosService: use get and put for version_t
Otherwise, we just duplicate the logic for generating the version
key names.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-04-26 17:19:59 -07:00
Samuel Just
1e6c390a67 tools: add ceph_monstore_tool with getosdmap
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-04-26 17:19:59 -07:00
Gary Lowell
50e58b9f49 ceph.spec.in: remove conditional checks on tcmalloc
tcmalloc is available on all supported platforms now.

Signed-off-by: Gary Lowell  <gary.lowell@inktank.com>
2013-04-26 16:05:25 -07:00
Gary Lowell
5c1782a57c debian/rules: Fix tcmalloc breakage
Since all currently supported platforms have tcmalloc
available and it is now the default, remove broken check code
that turns it off if the package is not listed in build-depends.

Signed-off-by: Gary Lowell  <gary.lowell@inktank.com>
2013-04-26 16:04:52 -07:00