Commit Graph

26954 Commits

Author SHA1 Message Date
Sage Weil
12678a1093 Merge pull request #379 from dachary/wip-5312
skip TEST(EXT4StoreTest, _detect_fs) if DISK or MOUNTPOINT are undefined

Reviewed-by: Sage Weil <sage@inktank.com>
2013-06-25 10:15:10 -07:00
Loic Dachary
6e320a1bd3 skip TEST(EXT4StoreTest, _detect_fs) if DISK or MOUNTPOINT are undefined
The TEST(EXT4StoreTest, _detect_fs) test is meant to be run from
qa/workunits/filestore/filestore.sh, after the ext4 file system was
created. If the DISK and MOUNTPOINT environment variables are not
defined, display a message explaining the expected environment and
silently skip the test. The tests in store_test.cc are not unit tests
because they depend on their environment.

http://tracker.ceph.com/issues/5312 fixes #5312

Signed-off-by: Loic Dachary <loic@dachary.org>
2013-06-25 15:09:57 +02:00
Sage Weil
37a20174fd Merge remote-tracking branch 'gh/next' 2013-06-24 20:41:15 -07:00
Sage Weil
9ae0ec83da mon/Elector: cancel election timer if we bootstrap
If we short-circuit and bootstrap, cancel our timer.  Otherwise it will
go off some time later when we are in who knows what state.

Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-06-24 18:51:07 -07:00
Sage Weil
03d3be3eaa mon: cancel probe timeout on reset
If we are probing and get (say) an election timeout that calls reset(),
cancel the timer.  Otherwise, we assert later with a splat like

2013-06-24 01:09:33.675882 7fb9627e7700  4 mon.b@0(leader) e1 probe_timeout 0x307a520
2013-06-24 01:09:33.676956 7fb9627e7700 -1 mon/Monitor.cc: In function 'void Monitor::probe_timeout(int)' thread 7fb9627e7700 time 2013-06-24 01:09:43.675904
mon/Monitor.cc: 1888: FAILED assert(is_probing() || is_synchronizing())

 ceph version 0.64-613-g134d08a (134d08a965)
 1: (Monitor::probe_timeout(int)+0x161) [0x56f5c1]
 2: (Context::complete(int)+0xa) [0x574a2a]
 3: (SafeTimer::timer_thread()+0x425) [0x7059a5]
 4: (SafeTimerThread::entry()+0xd) [0x7065dd]
 5: (()+0x7e9a) [0x7fb966f62e9a]
 6: (clone()+0x6d) [0x7fb9652f9ccd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

Fixes: #5438
Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-06-24 18:12:11 -07:00
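
The two timer fixes above (9ae0ec83da and 03d3be3eaa) follow the same pattern: any path that leaves a state must also cancel the timer that was armed for it. A minimal sketch of that pattern follows. ToyMonitor and all of its members are hypothetical stand-ins, not the Ceph Monitor class, and the assertion is simplified from the is_probing() || is_synchronizing() check quoted in the trace.

```cpp
#include <cassert>

// Toy model only: ToyMonitor and its members are hypothetical, not Ceph code.
class ToyMonitor {
public:
  enum class State { Idle, Probing, Electing };

  void start_probe() {
    state_ = State::Probing;
    probe_timer_armed_ = true;   // stand-in for scheduling a timer event
  }

  // Leaving the probing state must also cancel the pending timeout.
  void reset() {
    probe_timer_armed_ = false;  // stand-in for cancelling the timer event
    state_ = State::Electing;
  }

  // Simulates the timer thread firing. Returns true if the callback ran.
  bool fire_probe_timeout() {
    if (!probe_timer_armed_)
      return false;              // cancelled earlier, so nothing runs
    probe_timer_armed_ = false;
    assert(state_ == State::Probing);  // invariant the callback relies on
    state_ = State::Idle;
    return true;
  }

  State state() const { return state_; }

private:
  State state_ = State::Idle;
  bool probe_timer_armed_ = false;
};
```

If reset() skipped the cancellation, a later fire_probe_timeout() would run in the Electing state and trip the assertion, which is the shape of the failure in the trace.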
Sage Weil
521fdc2a4e mon/AuthMonitor: ensure initial rotating keys get encoded when create_initial called 2x
The create_initial() method may get called multiple times; make sure it
will unconditionally generate new/initial rotating keys.  Move the block
up so that we can easily assert as much.

Broken by commit cd98eb0c65.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
2013-06-24 17:58:48 -07:00
Sage Weil
3791a1e558 osd: tolerate racing threads starting recovery ops
We sample the (max - active) recovery ops to know how many to start, but
do not hold the lock over the full duration, such that it is possible to
start too many ops.  This isn't problematic except that our condition
checks for being == max but not beyond it, and we will continue to start
recovery ops when we shouldn't.  Fix this by adjusting the conditional
to be <=.

Reported-by: Stefan Priebe <s.priebe@profihost.ag>
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: David Zafman <david.zafman@inktank.com>
2013-06-24 17:44:06 -07:00
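
The comparison change described in 3791a1e558 can be illustrated with a small sketch. The struct and function names are hypothetical, and writing the check as max <= active is an assumption about how the conditional is phrased, not a copy of the OSD code.

```cpp
#include <cassert>

// Hypothetical model of the recovery throttle; not the actual OSD code.
struct RecoveryState {
  int active;      // recovery ops currently in flight
  int max_active;  // configured ceiling
};

// Exact-match check: misses the case where racing threads overshoot the limit.
inline bool at_limit_exact(const RecoveryState& s) {
  return s.active == s.max_active;
}

// Tolerant check: treats "at or beyond the limit" as "stop starting ops".
inline bool at_limit_tolerant(const RecoveryState& s) {
  return s.max_active <= s.active;
}
```

With active = 6 and max_active = 5, the exact check reports that more ops may start, while the tolerant check correctly stops.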
Sage Weil
31d6062076 init-radosgw.sysv: remove -x debug mode
Fixes: #5443
Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-24 17:42:11 -07:00
Sage Weil
eb86eebe1b common/pick_addresses: behave even after internal_safe_to_start_threads
ceph-mon recently started using Preforker to work around forking issues.
As a result, internal_safe_to_start_threads got set sooner and calls to
pick_addresses() which try to set string config values now fail because
there are no config observers for them.

Work around this by observing the change while we adjust the value.  We
assume pick_addresses() callers are smart enough to realize that their
result will be reflected by cct->_conf and not magically handled elsewhere.

Fixes: #5195, #5205
Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
2013-06-24 16:15:13 -07:00
Dan Mick
cad8cf5818 Add python-argparse to dependencies (for pre-2.7 systems)
Signed-off-by: Dan Mick <dan.mick@inktank.com>
2013-06-24 14:50:07 -07:00
Sage Weil
9a9c941d8d Merge pull request #376 from dalgaaf/wip-da-SCA-cppcheck-3
Reviewed-by: Sage Weil <sage@inktank.com>
2013-06-24 13:47:50 -07:00
Sage Weil
046e3b71a1 debian, rpm: remove python-lockfile dependency
As of 2a4953b697, ceph-disk no longer uses this.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-24 13:04:58 -07:00
Sage Weil
8cc36895e2 Merge remote-tracking branch 'gh/next' 2013-06-24 12:25:58 -07:00
Sage Weil
f046dab88f mds: do not assume segment list is non-empty in standby_trim_segments
If we restart standby replay shortly after startup, before we actually have
any segments, we can trigger a segfault here:

 ceph version 0.64-441-gc39b99c (c39b99cdec)
 1: ceph-mds() [0x975caa]
 2: (()+0xfcb0) [0x7fc33b5a5cb0]
 3: (MDLog::standby_trim_segments()+0x192) [0x78a932]
 4: (MDS::C_MDS_StandbyReplayRestartFinish::finish(int)+0x39) [0x595f69]
 5: (Journaler::_finish_reprobe(int, unsigned long, Context*)+0x190) [0x7917b0]
 6: (Filer::_probed(Filer::Probe*, object_t const&, unsigned long, utime_t)+0x558) [0x7c6b38]
 7: (Objecter::C_Stat::finish(int)+0xc0) [0x7c7930]
 8: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0xe48) [0x7b2c78]
 9: (MDS::handle_core_message(Message*)+0xae8) [0x589858]
 10: (MDS::_dispatch(Message*)+0x2f) [0x589a1f]
 11: (MDS::ms_dispatch(Message*)+0x1d3) [0x58b4a3]
 12: (DispatchQueue::entry()+0x3f1) [0x943861]
 13: (DispatchQueue::DispatchThread::entry()+0xd) [0x86e32d]

Fixes: #5333
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
(cherry picked from commit abd0ff64e1)
2013-06-24 10:58:08 -07:00
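
The guard described in f046dab88f amounts to checking a container for emptiness before dereferencing its last element. A minimal sketch, assuming the segments are kept in an ordered map keyed by sequence number; the function name, the map's value type, and the zero return value are all hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical helper: returns the newest segment's sequence number,
// or 0 when no segment exists yet (for example, right after startup).
inline std::uint64_t newest_segment_seq(
    const std::map<std::uint64_t, int>& segments) {
  if (segments.empty())
    return 0;  // dereferencing rbegin() on an empty map is undefined behavior
  return segments.rbegin()->first;
}
```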
Sage Weil
7ef921c8c2 Merge pull request #374 from ceph/wip-5427
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-06-24 10:20:24 -07:00
Danny Al-Gaaf
ab6ccbe226 test/librados/cmd.cc: use static_cast instead of C-Style cast
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2013-06-24 15:29:12 +02:00
Danny Al-Gaaf
79b5a486ee osdc/Objecter.cc: use static_cast instead of C-Style cast
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2013-06-24 15:24:00 +02:00
Danny Al-Gaaf
e7602a1e9f mon/MonClient.cc: use static_cast instead of C-Style cast
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2013-06-24 15:19:41 +02:00
Danny Al-Gaaf
de4a764565 common/cmdparse.cc: reduce scope of local variable 'pos'
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2013-06-24 14:34:46 +02:00
Danny Al-Gaaf
b485a3e68a common/cmdparse.cc: remove unused variable
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2013-06-24 14:29:50 +02:00
Danny Al-Gaaf
a92a720e00 osd/OSD.cc: prefer prefix ++operator for non-trivial iterator
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2013-06-24 14:24:14 +02:00
Danny Al-Gaaf
835315b690 OSDMonitor.cc: prefer prefix ++operator for non-trivial iterator
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2013-06-24 14:18:52 +02:00
Danny Al-Gaaf
c700db0937 mon/MonCap.cc: use empty() instead of if(size())
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2013-06-24 14:14:38 +02:00
Danny Al-Gaaf
4ab5bf6ff3 common/cmdparse.cc: prefer prefix ++operator for non-trivial iterator
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2013-06-24 13:50:33 +02:00
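
The static-analysis cleanups in this series (prefix increment on iterators, empty() instead of size(), static_cast instead of C-style casts) can be seen together in one small sketch. The helper functions below are hypothetical and only illustrate the idioms; they do not come from the files named in the commits.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

typedef std::map<std::string, std::string> KeyValues;

// Counts entries whose value is not empty.
inline std::size_t count_nonempty(const KeyValues& kv) {
  std::size_t n = 0;
  // Prefix ++ avoids the temporary copy that postfix ++ implies
  // for non-trivial iterators.
  for (KeyValues::const_iterator it = kv.begin(); it != kv.end(); ++it) {
    if (!it->second.empty())  // empty() states the intent directly
      ++n;
  }
  return n;
}

// Percentage of entries with a non-empty value, rounded down.
inline int percent_nonempty(const KeyValues& kv) {
  if (kv.empty())
    return 0;
  // static_cast makes the narrowing conversion explicit and searchable.
  return static_cast<int>(count_nonempty(kv) * 100 / kv.size());
}
```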
Gregory Farnum
134d08a965 Merge pull request #375 from ceph/wip-msgr
misc msgr fixes

Reviewed-by: Greg Farnum <greg@inktank.com>
2013-06-23 22:42:07 -07:00
Sage Weil
57dc73627e msgr: clear_pipe+queue reset when replacing lossy connections
We already handle the lossless replacement and lossy fault paths, but
not the lossy replacement.  This fixes an assert(!cleared) in the
reaper.  Adjust comments appropriately.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-23 18:09:55 -07:00
Sage Weil
9586305a23 msgr: reaper: make sure pipe has been cleared (under pipe_lock)
All paths to pipe shutdown should have cleared the con->pipe reference
already.  Assert as much.

Also, do it under pipe_lock!

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-23 15:10:24 -07:00
Sage Weil
ec612a5bda msg/Pipe: goto fail_unlocked on early failures in accept()
Instead of duplicating an incomplete cleanup sequence (that does not
clear_pipe()), goto fail_unlocked and do the cleanup in a generic way.
s/rc/r/ while we are here.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-23 15:10:24 -07:00
Sage Weil
afafb87e84 msgr: clear con->pipe inside pipe_lock on mark_down
We need to do this under protection of the pipe_lock.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-23 15:10:24 -07:00
Sage Weil
5fc1dabfb3 msgr: clear_pipe inside pipe_lock on mark_down_all
Observed a segfault in rebind -> mark_down_all -> clear_pipe -> put that
may have been due to a racing thread clearing the connection_state pointer.
Do the clear_pipe() call under the protection of pipe_lock, as we do in
all other contexts.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-23 15:10:24 -07:00
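
The locking rule behind 5fc1dabfb3 and the related msgr commits is that the connection reference is only cleared while pipe_lock is held, and that teardown can then assert it is gone. A minimal sketch using std::mutex follows; ToyPipe, ToyConnection, and their members are hypothetical stand-ins, not the messenger's real types.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

// Hypothetical stand-ins for the messenger types; not Ceph code.
struct ToyConnection {
  bool has_pipe = true;
};

struct ToyPipe {
  std::mutex pipe_lock;
  std::shared_ptr<ToyConnection> connection_state;

  // Clear the connection's back-reference only while holding pipe_lock,
  // so a racing thread cannot reset the pointer underneath us.
  void mark_down() {
    std::lock_guard<std::mutex> guard(pipe_lock);
    if (connection_state) {
      connection_state->has_pipe = false;  // stand-in for clear_pipe()
      connection_state.reset();
    }
  }

  bool is_cleared() {
    std::lock_guard<std::mutex> guard(pipe_lock);
    return !connection_state;
  }

  // Reaper-style check: by teardown time the reference must be gone.
  void reap() {
    std::lock_guard<std::mutex> guard(pipe_lock);
    assert(!connection_state);
  }
};
```

Because the null check and the reset happen under the same lock, calling mark_down() twice, or from two threads, is harmless in this sketch.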
Sage Weil
cd98eb0c65 mon/AuthMonitor: make initial auth include rotating keys
This closes a very narrow race during mon creation where there are no
service keys.

Fixes: #5427
Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-23 09:25:55 -07:00
Sage Weil
9b2dfb7507 mon: do not leak no_reply messages
I think I assumed no_reply() was releasing the references, but it is
not.  Which is better, since send_reply() doesn't either.  Fix the leaks
by dropping the message ref explicitly.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-06-23 08:53:12 -07:00
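
The leak fixed in 9b2dfb7507 comes from a helper that does not consume the caller's reference. A minimal sketch of the ownership rule with a toy reference count follows; ToyMessage, its live counter, and handle_without_reply() are hypothetical, and no_reply() here only mirrors the behavior the commit message describes.

```cpp
#include <cassert>

// Toy reference-counted message; not the Ceph Message class.
struct ToyMessage {
  static int live;  // number of ToyMessage objects currently allocated
  int nref;

  ToyMessage() : nref(1) { ++live; }
  ~ToyMessage() { --live; }

  // Drop one reference and free the message when the last one goes away.
  void put() {
    assert(nref > 0);
    if (--nref == 0)
      delete this;
  }
};
int ToyMessage::live = 0;

// Mirrors the described behavior: does not take or release a reference.
inline void no_reply(ToyMessage*) {}

// The handler owns one reference and must drop it explicitly.
inline void handle_without_reply(ToyMessage* m) {
  no_reply(m);
  m->put();  // without this call the message would leak
}
```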
Sage Weil
ad12b0d61b mon: fix leak of MOSDFailure messages
We need to discard/cancel/free the failure report messages before we
cancel a report out.  Assert in the dtor to ensure we didn't forget.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-06-23 08:53:09 -07:00
Sage Weil
1aca370ed0 debian: ceph-common requires matching version of python-ceph
If the versions skew, the ceph_argparse.py module may be missing.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-22 10:28:16 -07:00
Dan Mick
b89d7420e3 Merge branch 'next'
Conflicts:
	src/ceph.in
2013-06-21 18:46:08 -07:00
Dan Mick
94eada4046 Add header comments and Inktank copyrights to ceph.in/ceph_argparse.py
Signed-off-by: Dan Mick <dan.mick@inktank.com>
2013-06-21 18:39:59 -07:00
Dan Mick
67a3c1e48d ceph.in: rip out reusable code to pybind/ceph_argparse.py
Signed-off-by: Dan Mick <dan.mick@inktank.com>

Conflicts:
	src/ceph.in
2013-06-21 18:39:43 -07:00
Sage Weil
c4272a1758 ceph: even shinier
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
2013-06-21 15:52:32 -07:00
Sage Weil
34ef2f2484 ceph: do not busy-loop on ceph -w
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
2013-06-21 15:50:59 -07:00
Sage Weil
27912e5858 librados: make cmd test tolerate NXIO for osd commands
The cluster may be thrashing underneath us; tolerate NXIO in case the OSD
is currently down.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-21 14:53:22 -07:00
Sage Weil
dcd275318d Merge remote-tracking branch 'gh/wip-mds'
Reviewed-by: Sage Weil <sage@inktank.com>
2013-06-21 14:25:34 -07:00
Sage Weil
abd0ff64e1 mds: do not assume segment list is non-empty in standby_trim_segments
If we restart standby replay shortly after startup, before we actually have
any segments, we can trigger a segfault here:

 ceph version 0.64-441-gc39b99c (c39b99cdec)
 1: ceph-mds() [0x975caa]
 2: (()+0xfcb0) [0x7fc33b5a5cb0]
 3: (MDLog::standby_trim_segments()+0x192) [0x78a932]
 4: (MDS::C_MDS_StandbyReplayRestartFinish::finish(int)+0x39) [0x595f69]
 5: (Journaler::_finish_reprobe(int, unsigned long, Context*)+0x190) [0x7917b0]
 6: (Filer::_probed(Filer::Probe*, object_t const&, unsigned long, utime_t)+0x558) [0x7c6b38]
 7: (Objecter::C_Stat::finish(int)+0xc0) [0x7c7930]
 8: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0xe48) [0x7b2c78]
 9: (MDS::handle_core_message(Message*)+0xae8) [0x589858]
 10: (MDS::_dispatch(Message*)+0x2f) [0x589a1f]
 11: (MDS::ms_dispatch(Message*)+0x1d3) [0x58b4a3]
 12: (DispatchQueue::entry()+0x3f1) [0x943861]
 13: (DispatchQueue::DispatchThread::entry()+0xd) [0x86e32d]

Fixes: #5333
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2013-06-21 14:23:45 -07:00
Sage Weil
3bebbc0942 mds: rev protocol
Commit 18b9e63b4d changed the OTW lock
encoding.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-21 08:20:33 -07:00
Yan, Zheng
ded2e84f3d mds: kill Server::handle_client_lookup_hash()
Server::handle_client_lookup_ino() is simpler and more robust. Use it
to handle both LOOKUPHASH and LOOKUPINO requests.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2013-06-21 08:19:25 -07:00
Yan, Zheng
2147c4e3a6 mds: use "open-by-ino" helper to handle LOOKUPINO request
Fixes #3541
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2013-06-21 08:19:24 -07:00
Sage Weil
e97a2c86de Merge remote-tracking branch 'yan/wip-mds' into wip-mds 2013-06-20 15:55:41 -07:00
Dan Mick
31d221c3a4 ceph.in: remove some TAB chars
Signed-off-by: Dan Mick <dan.mick@inktank.com>
2013-06-20 15:14:36 -07:00
Dan Mick
69e1a9121d ceph.in: fix ^C handling in watch (trap exception in while, too)
Signed-off-by: Dan Mick <dan.mick@inktank.com>
2013-06-20 15:14:36 -07:00
Sage Weil
29f6f27729 ceph: --version as well as -v
Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-20 15:04:51 -07:00
Sage Weil
4bf5b732cd Merge remote-tracking branch 'gh/next' 2013-06-20 12:30:54 -07:00