Commit Graph

26319 Commits

Author SHA1 Message Date
Noah Watkins
a805958f89 doc: document new hadoop config options
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
2013-07-14 16:31:33 -07:00
Noah Watkins
352b7b5936 doc: start Hadoop installation docs
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
2013-05-30 13:29:42 -07:00
Noah Watkins
743c528754 doc: Hadoop clarifications
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
2013-05-30 13:29:42 -07:00
Christophe Courtaut
5fa098f10a Added -r option to usage
Added the -r option, which starts the radosgw and apache2 to access it
to the usage message.

Signed-off-by: Christophe Courtaut <christophe.courtaut@gmail.com>
2013-05-30 12:20:17 -07:00
Alex Elder
f40256878d rbd/concurrent.sh: probe rbd module at start
There's no guarantee the rbd module is loaded when this script is
run, so add a line that loads it if necessary.

Signed-off-by: Alex Elder <elder@inktank.com>
2013-05-30 10:10:16 -05:00
Sage Weil
c410f032e5 Merge pull request #331 from ceph/wip-osd-interfacecheck
Reviewed-by: Samuel Just <sam.just@inktank.com>
2013-05-29 22:45:37 -07:00
Sage Weil
bd2ba0e3b2 Merge branch 'next'
2013-05-29 22:44:40 -07:00
Sage Weil
0c0595514d osd: wait for healthy pings from peers in waiting-for-healthy state
If we are (wrongly) marked down, we need to go into the waiting-for-healthy
state and verify that our network interfaces are working before trying to
rejoin the cluster.

 - make _is_healthy() check require positive proof of pings working
 - do heartbeat checks and updates in this state
 - reset the random peers every heartbeat_interval, in case we keep picking
   bad ones

Signed-off-by: Sage Weil <sage@inktank.com>
2013-05-29 22:43:50 -07:00
Sage Weil
04aa2b5edf osd: distinguish between definitely healthy and definitely not unhealthy
is_unhealthy() will assume they are healthy for some period after we
send our first ping attempt.  is_healthy() is now a strict check that we
know they are healthy.

Switch the failure report check to use is_unhealthy(); use is_healthy()
everywhere else, including the waiting-for-healthy pre-boot checks.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-05-29 22:43:49 -07:00
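
The distinction drawn in 04aa2b5edf can be illustrated with a minimal sketch. This is not the Ceph implementation: the struct, its field names, and the grace parameter are hypothetical, and a negative timestamp is assumed to mean "never". The point is only that the strict check needs positive proof of a reply, while the lenient check gives a peer the benefit of the doubt for a grace period after the first ping.

```cpp
#include <cassert>

// Hypothetical per-peer ping record; not Ceph's actual data structure.
// A negative timestamp is assumed to mean "has not happened yet".
struct PeerPingState {
  double first_tx;  // time we sent our first ping to this peer
  double last_rx;   // time of the most recent reply from this peer
};

// Strict check: true only if we hold positive proof of a recent reply.
bool is_healthy(const PeerPingState& p, double now, double grace) {
  assert(grace >= 0.0);
  return p.last_rx >= 0.0 && now - p.last_rx <= grace;
}

// Lenient check: true only once the peer has had a full grace period
// to answer (measured from the last reply, or from our first ping if
// there has been no reply at all) and has stayed silent.
bool is_unhealthy(const PeerPingState& p, double now, double grace) {
  assert(grace >= 0.0);
  if (p.first_tx < 0.0) {
    return false;  // never pinged: no evidence either way
  }
  const double last = (p.last_rx >= 0.0) ? p.last_rx : p.first_tx;
  return now - last > grace;
}
```

Under these assumptions a freshly pinged peer is neither healthy nor unhealthy, which is why a failure report would use the lenient check and a pre-boot gate would use the strict one.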
Sage Weil
28ea184d3a osd: remove down hb peers
If a (say, random) peer goes down, filter it out.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-05-29 22:43:49 -07:00
Sage Weil
a4d3b47a92 osd: only add pg peers if active
We will soon be in this method for the waiting-for-healthy state.  As
a consequence, we need to remove any down peers.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-05-29 22:43:49 -07:00
Sage Weil
b586f4a92d osd: factor out _remove_heartbeat_peer
Signed-off-by: Sage Weil <sage@inktank.com>
2013-05-29 22:43:49 -07:00
Sage Weil
e1dc3fd300 osd: augment osd heartbeat peers with neighbors and randoms, to up some min
- always include our neighbors to ensure we have a fully-connected
  graph
- include some random neighbors to get at least some min number of peers.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-05-29 22:43:49 -07:00
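
The peer selection described in e1dc3fd300 might look roughly like the sketch below. It is an illustration under stated assumptions, not the code from the commit: the function name, its parameters, and the choice of "neighbors" as the next and previous up OSD ids (wrapping around) are all hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <set>
#include <vector>

// Hypothetical helper: start from the peers we already have, always add
// the two id-wise neighbors among the up OSDs, then pad with random up
// OSDs until at least min_peers are present (or no candidates remain).
std::set<int> choose_heartbeat_peers(const std::set<int>& up_osds,
                                     int whoami,
                                     std::set<int> peers,
                                     std::size_t min_peers,
                                     std::mt19937& rng) {
  std::set<int> candidates = up_osds;
  candidates.erase(whoami);  // never ping ourselves
  peers.erase(whoami);
  if (candidates.empty()) {
    return peers;
  }

  // Next neighbor: first up OSD with a larger id, wrapping to the start.
  auto next = candidates.upper_bound(whoami);
  if (next == candidates.end()) {
    next = candidates.begin();
  }
  peers.insert(*next);

  // Previous neighbor: last up OSD with a smaller id, wrapping to the end.
  auto prev = candidates.lower_bound(whoami);
  if (prev == candidates.begin()) {
    prev = candidates.end();
  }
  --prev;
  peers.insert(*prev);

  // Pad with randomly chosen up OSDs until we reach the minimum.
  std::vector<int> rest;
  for (int osd : candidates) {
    if (peers.count(osd) == 0) {
      rest.push_back(osd);
    }
  }
  std::shuffle(rest.begin(), rest.end(), rng);
  for (int osd : rest) {
    if (peers.size() >= min_peers) {
      break;
    }
    peers.insert(osd);
  }
  return peers;
}
```

Because every OSD pings its two neighbors, the peers form at least a ring, so no up OSD is left without someone watching it; the random additions only raise the peer count.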
Sage Weil
50ac8917f1 osd: initialize new_state field when we use it
If we use operator[] on a new int field its value is undefined; avoid
reading it or using |= et al until we initialize it.

Fixes: #4967
Backport: cuttlefish, bobtail
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: David Zafman <david.zafman@inktank.com>
2013-05-29 16:50:04 -07:00
Samuel Just
e21f8df1eb Merge branch 'wip_osd_throttle'
Fixes: #4782
Reviewed-by: Sage Weil
2013-05-29 15:06:18 -07:00
Samuel Just
a55e03cdfe WBThrottle: add some comments and some asserts
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-05-29 15:05:51 -07:00
Samuel Just
4b31c7e792 WBThrottle: rename replica nocache
We may want to influence the caching behavior for other
reasons.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-05-29 15:05:34 -07:00
Sage Weil
80942eb04c osd: move health checks into a single helper
For now we still only look at the internal heartbeats.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-05-29 13:41:44 -07:00
Sage Weil
c093e5bf91 osd: avoid duplicate mon requests for a new osdmap
sub_want() returns true if this is a new sub; only renew then.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-05-29 13:41:44 -07:00
Sage Weil
aac828c2ec osd: tell peers that ping us if they are dead
Signed-off-by: Sage Weil <sage@inktank.com>
2013-05-29 13:41:44 -07:00
Sage Weil
ea2b2329b3 osd: simplify is_healthy() check during boot
This has a slight behavior change in that we ask the mon for the latest
osdmap if our internal heartbeat is failing.  That isn't useful yet, but
will be shortly.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-05-29 13:41:43 -07:00
Sage Weil
482733e960 mds: stay in SCAN state in file_eval
If we are in the SCAN state, stay there until the recovery finishes.  Do
not jump to another state from file_eval().

Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 0071b8e75b)
2013-05-29 10:28:25 -07:00
Sage Weil
0071b8e75b mds: stay in SCAN state in file_eval
If we are in the SCAN state, stay there until the recovery finishes.  Do
not jump to another state from file_eval().

Signed-off-by: Sage Weil <sage@inktank.com>
2013-05-29 10:27:26 -07:00
Sage Weil
f71e1b1fbf Makefile: include new message header files
Signed-off-by: Sage Weil <sage@inktank.com>
2013-05-29 10:27:22 -07:00
Sage Weil
532dee523a Merge remote-tracking branch 'yan/wip-mds'
Reviewed-by: Sage Weil <sage@inktank.com>

Conflicts:
	src/mds/MDCache.cc
2013-05-29 10:26:56 -07:00
Sage Weil
29e4e7e316 osd: do not assume head obc object exists when getting snapdir
For a list-snaps operation on the snapdir, do not assume that the obc for the
head means the object exists.  This fixes a race between a head deletion and
a list-snaps that wrongly returns ENOENT, triggered by the DiffIterateStress
test when thrashing OSDs.

Fixes: #5183
Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
2013-05-29 09:49:11 -07:00
Sage Weil
1d0aa2add3 Merge pull request #329 from javacruft/wip-fuse-deps
Use new fuse package instead of fuse-utils
2013-05-29 08:14:27 -07:00
James Page
e634d9d6b6 Use new fuse package instead of fuse-utils
The fuse-utils package was deprecated a while ago.

Switch the primary dependency for fuse tools to use
the preferred 'fuse' package.

Signed-off-by: James Page <james.page@ubuntu.com>
2013-05-29 10:57:17 +01:00
Sage Weil
1bb4e7435c mon: disable tdump by default
Grr.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-05-28 22:13:11 -07:00
Sage Weil
6afc22a158 Merge remote-tracking branch 'gh/last'
2013-05-28 22:10:21 -07:00
Sage Weil
b6be785775 Merge branch 'wip-5172'
Reviewed-by: Samuel Just <sam.just@inktank.com>
2013-05-28 20:44:48 -07:00
Sage Weil
dd35c26e5b osd: fix note_down_osd
Fix bug introduced in 27381c0c62.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-05-28 20:39:33 -07:00
Sage Weil
45b84f39ba osd: fix hb con failure handler
Fix a few bugs introduced by 27381c0c62:

- check against both front and back cons; either one may have failed.
- close *both* front and back before reopening either.  this is
  overkill, but slightly simpler code.
- fix leak of con when marking down
- handle race against osdmap update and note_down_osd

Fixes: #5172
Signed-off-by: Sage Weil <sage@inktank.com>
2013-05-28 20:39:30 -07:00
Sage Weil
ce6fc2ed87 Merge pull request #319 from dalgaaf/wip-da-pylint-3
Fix some smaller Python issues
2013-05-28 19:52:41 -07:00
Sage Weil
648dcb9240 Merge pull request #326 from dalgaaf/wip-da-CID-727978
kv_flat_btree_async.cc: fix AioCompletion resource leak
2013-05-28 15:48:11 -07:00
Gary Lowell
054e96cf79 v0.63
2013-05-28 13:58:22 -07:00
Samuel Just
5bca9c38ef HashIndex: sync top directory during start_split,merge,col_split
Otherwise, the links might be ordered after the in progress
operation tag write.  We need the in progress operation tag to
correctly recover from an interrupted merge, split, or col_split.

Fixes: #5180
Backport: cuttlefish, bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-05-28 12:47:51 -07:00
Samuel Just
1c35556b56 doc/dev/osd_internals: add wbthrottle.rst
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-05-28 10:41:57 -07:00
Samuel Just
4d53e9c940 WBThrottle: add perfcounters
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-05-28 10:41:52 -07:00
Sage Weil
e8f5284026 Merge pull request #325 from dalgaaf/wip-da-CID-727980
kv_flat_btree_async.cc: fix AioCompletion resource leak
2013-05-28 10:27:56 -07:00
Sage Weil
16e6b081b3 Merge pull request #324 from dalgaaf/wip-da-CID-727979
kv_flat_btree_async.cc: fix AioCompletion resource leak
2013-05-28 10:27:25 -07:00
Sage Weil
b528a915f6 osd/OSDMap: fix Incremental dump
The front hb addr entry may not be present.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-05-28 09:17:05 -07:00
Sage Weil
8e982071c0 Merge pull request #322 from guilhem/patch-1
Reviewed-by: Sage Weil <sage@inktank.com>
2013-05-28 08:43:10 -07:00
Danny Al-Gaaf
478b576a71 kv_flat_btree_async.cc: fix AioCompletion resource leak
Call AioCompletion::release() if the completion is no longer needed.

CID 727978 (#1-2 of 2): Resource leak (RESOURCE_LEAK)
  leaked_storage: Variable "obj_aioc" going out of scope leaks the
  storage it points to.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2013-05-28 12:43:12 +02:00
Danny Al-Gaaf
e6efc39fad kv_flat_btree_async.cc: fix AioCompletion resource leak
Call AioCompletion::release() if the completion is no longer needed.

CID 727979 (#1-2 of 2): Resource leak (RESOURCE_LEAK)
  leaked_storage: Variable "a" going out of scope leaks the storage
  it points to.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2013-05-28 12:38:57 +02:00
Danny Al-Gaaf
6939b12492 kv_flat_btree_async.cc: fix AioCompletion resource leak
Call AioCompletion::release() if the completion is no longer
needed.

CID 727980 (#1-4 of 4): Resource leak (RESOURCE_LEAK)
  leaked_storage: Variable "aioc" going out of scope leaks
  the storage it points to.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2013-05-28 12:27:37 +02:00
Guilhem Lettron
554b41b171 Remove mon socket in post-stop
If ceph-mon segfaults, the socket file isn't removed.

Adding a remove step to post-stop lets upstart clean the run directory properly.

Signed-off-by: Guilhem Lettron <guilhem@lettron.fr>
2013-05-28 10:35:24 +02:00
Yan, Zheng
7e0e0963ed mds: use "open-by-ino" function to open remote link
Also add a new config option "mds_open_remote_link_mode". The anchor
approach is used by default. If mode is non-zero, use the open-by-ino
function. In case open-by-ino function fails, if mode is 1, retry
using the anchor approach, otherwise trigger assertion.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2013-05-28 13:57:22 +08:00
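
The mode handling described in 7e0e0963ed can be sketched as a small decision function. Only the option name "mds_open_remote_link_mode" and the three behaviors come from the commit message; the function, the enum, and the callback are hypothetical, and an exception stands in for the assertion so the sketch can be exercised.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>

// Which approach ended up opening the remote link (illustrative only).
enum class LinkPath { Anchor, OpenByIno, AnchorAfterInoFailure };

// mode == 0: use the anchor approach (the default).
// mode != 0: try open-by-ino first.
//   on failure with mode == 1: retry using the anchor approach.
//   on failure with any other mode: the commit triggers an assertion;
//   this sketch throws std::logic_error instead.
LinkPath open_remote_link(int mode,
                          const std::function<bool()>& try_open_by_ino) {
  if (mode == 0) {
    return LinkPath::Anchor;
  }
  if (try_open_by_ino()) {
    return LinkPath::OpenByIno;
  }
  if (mode == 1) {
    return LinkPath::AnchorAfterInoFailure;
  }
  throw std::logic_error("open-by-ino failed and fallback is disabled");
}
```

For example, calling open_remote_link(1, ...) with a callback that reports failure yields LinkPath::AnchorAfterInoFailure, while the same failure under mode 2 raises the error.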
Yan, Zheng
3120d969fe mds: open missing cap inodes
When a recovering MDS enters the reconnect stage, clients send reconnect
messages to it. Each message lists open files, their paths, and issued
caps. If an inode is not in the cache, the recovering MDS uses the
path the client provides to determine whether it is the inode's authority.
If not, the recovering MDS exports the inode's caps to another MDS. The
issue here is that the path the client provides isn't always accurate.

The fix is to use the recently added "open inode by ino" function to open
any missing cap inodes when the recovering MDS enters the rejoin stage.
Send cache rejoin messages to other MDSs after all caps' authorities
are determined.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2013-05-28 13:57:22 +08:00
Yan, Zheng
ceaf51f78f mds: bump the protocol version
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2013-05-28 13:57:22 +08:00