Commit Graph

24890 Commits

Author SHA1 Message Date
Sam Lang
5aa5bc2cae mds: Delay session close if in clientreplay
If the mds is in clientreplay, a session close
request needs to be delayed until it reaches
active.  Otherwise, the session state gets set to
'closing', and the replay requests get dropped on the
floor.

Fixes #4564.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-03-27 11:06:28 -05:00
Sage Weil
295c92ce91 Merge pull request #157 from ceph/wip-4539
mds: Clear backtrace updates on standby_trim_seg

Reviewed-by: Sage Weil <sage@inktank.com>
2013-03-27 08:26:25 -07:00
Sam Lang
0e009b1bb3 mds: Clear backtrace updates on standby_trim_seg
If the mds is standby, when a segment is trimmed, we need
to clear the backtrace updates list to avoid the following
assertion when the segment is deleted.

./include/elist.h: 92: FAILED assert(_head.empty())
ceph version 0.59-478-g8befbca (8befbca77a)
(MDLog::standby_trim_segments()+0xce5) [0x6ccec5]
(MDS::C_MDS_StandbyReplayRestartFinish::finish(int)+0x39) [0x4e86b9]
(Journaler::_finish_reprobe(int, unsigned long, Context*)+0x190)
[0x6d3210]
(Filer::_probed(Filer::Probe*, object_t const&, unsigned long,
utime_t)+0x558) [0x704a88]
(Objecter::C_Stat::finish(int)+0xc0) [0x705900]
(Objecter::handle_osd_op_reply(MOSDOpReply*)+0xe38) [0x6f1df8]
(MDS::handle_core_message(Message*)+0xae8) [0x4dc318]
(MDS::_dispatch(Message*)+0x2f) [0x4dc4df]
(MDS::ms_dispatch(Message*)+0x1db) [0x4ddf7b]
(DispatchQueue::entry()+0x341) [0x81f561]
(DispatchQueue::DispatchThread::entry()+0xd) [0x79c6ad]
(()+0x7e9a) [0x7f346bb9ee9a]
(clone()+0x6d) [0x7f346a3574bd]

Fixes #4539.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-03-27 09:35:08 -05:00
Samuel Just
76b296f01f ReplicatedPG: send entire stats on OP_BACKFILL_FINISH
Otherwise, we update the stat.stat structure, but not the
stat.invalid_stats part.  This will result in a recently
split primary propagating the invalid stats but not the
invalid marker.  Sending the whole pg_stat_t structure
also mirrors MOSDSubOp.

Fixes: #4557
Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-03-26 17:21:51 -07:00
Sage Weil
23faa9f050 Merge pull request #147 from ceph/wip-4537
mds: CInode::build_backtrace() always incr iter

Reviewed-by: Sage Weil <sage@inktank.com>
2013-03-26 09:29:42 -07:00
Sam Lang
14cef276f1 mds: CInode::build_backtrace() always incr iter
Always increment the iterator when adding old pools
to the backtrace.  This fixes a bug on files where
the layout had been set to a different pool and then
back to the same pool, causing continuous looping in
the build_backtrace() function.

Fixes #4537.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-03-26 11:14:52 -05:00
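
The looping described in 14cef276f1 is a common pattern: a hand-advanced iterator that is skipped on one branch never moves forward. The commit message does not show the code, so the following is only a minimal sketch of the assumed shape of the fix. `collect_old_pools`, `old_pools`, and `current_pool` are hypothetical names, not Ceph identifiers.

```python
def collect_old_pools(old_pools, current_pool):
    """Return the old pools that differ from the current pool.

    Hypothetical illustration only; not the actual build_backtrace() code.
    """
    result = []
    i = 0
    while i < len(old_pools):
        pool = old_pools[i]
        if pool != current_pool:
            result.append(pool)
        # Advance unconditionally. If this increment lived only inside the
        # branch above, an entry equal to current_pool would be revisited
        # forever, which is the kind of looping the commit describes.
        i += 1
    return result
```

A file whose layout moved to another pool and back would correspond to an input such as `collect_old_pools([1, 2, 1], 1)`, where the skipped entries must still advance the index.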
Yehuda Sadeh
70e0ee8ba9 rgw: bucket index ops on system buckets shouldn't do anything
Fixes: #4508
Backport: bobtail
On certain bucket index operations we didn't check whether
the bucket was a system bucket, which caused the operations
to fail. This triggered an error message on bucket removal
operations.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2013-03-25 09:55:35 -07:00
Sam Lang
ece4348807 client: don't set other if lookup fails in rename
On rename, only set the other inode if the
lookup for the destination succeeds, otherwise we hit
a segv in set_other_inode().

Fixes #4517.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Tested-by: Noah Watkins <jayhawk@cs.ucsc.edu>
2013-03-23 17:30:53 -07:00
Sam Lang
8e6a970c45 client: Fix rename returning ENOENT for dest
Introduced by fc80c1dc6e,
the client should _not_ fail if the lookup for the
destination path on rename returns ENOENT.

The previous code also did not check that the lookup
returned ENOENT or success.  We add the check and fail
if we get any other errors.

Fixes #4517.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-03-23 11:01:50 -07:00
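
The two rename fixes above (ece4348807 and 8e6a970c45) describe a three-way outcome for the destination lookup: success, ENOENT, or any other error. A minimal sketch of that rule follows, assuming a hypothetical `lookup` callable that returns `(rc, inode)` with a negative errno on failure. It is not the client code itself.

```python
import errno
import os

def resolve_rename_target(lookup, dest_path):
    """Return the destination inode, or None if the destination is absent.

    `lookup` is a hypothetical callable returning (rc, inode), where rc is
    0 on success or a negative errno value on failure.
    """
    rc, inode = lookup(dest_path)
    if rc == 0:
        # Destination exists: only now is it safe to use its inode.
        return inode
    if rc == -errno.ENOENT:
        # A missing destination is normal for rename; there is no
        # "other" inode to set.
        return None
    # Any other error fails the rename.
    raise OSError(-rc, os.strerror(-rc))
```

A caller would set the other inode only when the return value is not `None`, which avoids dereferencing an inode that the lookup never produced.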
Sage Weil
838f1cde94 preserve /var/lib/ceph on deb/rpm purge
We should clobber configuration and log data, but *not* user data.  Leave
/var/lib/ceph alone.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Gary Lowell <gary.lowell@inktank.com>
2013-03-22 15:24:51 -07:00
Samuel Just
4fe4deafbe PG::GetMissing: need to check need_up_thru in MLogRec handler
Backport: bobtail
Fixes: #4534
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-03-22 14:07:04 -07:00
Samuel Just
d611eba9ca PG,osd_types: improve check_new_interval debugging
Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-03-22 14:06:45 -07:00
Sage Weil
c524e2e01d common/MemoryModel: remove logging to /tmp/memlog
This was a hack for dev purposes ages ago; remove it.  The predictable
filename is a security issue.

CVE-2013-1882

Reported-by: Michael Scherer <misc@zarb.org>
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
2013-03-22 13:25:49 -07:00
Sage Weil
6a7ad2eac1 init-ceph: clean up temp ceph.conf filename on exit
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
2013-03-22 13:25:43 -07:00
Sage Weil
051734522f init-ceph: push temp conf file to a unique location on remote host
The predictable file name is a security problem.

CVE-2013-1882

Reported-by: Michael Scherer <misc@zarb.org>
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
2013-03-22 13:25:33 -07:00
Sage Weil
f463ef78d7 mkcephfs: make remote temp directory name unique
The predictable file name is a security problem.

CVE-2013-1882

Reported-by: Michael Scherer <misc@zarb.org>
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
2013-03-22 13:25:23 -07:00
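
The three CVE-2013-1882 commits above replace predictable temporary file names with unique ones. The affected scripts are not shown here, so the sketch below only illustrates the general technique using Python's standard `tempfile` module. `make_private_workdir` and the `mkcephfs.` prefix are hypothetical choices for the example.

```python
import tempfile

def make_private_workdir(prefix="mkcephfs."):
    """Create and return a uniquely named temporary directory.

    Hypothetical helper for illustration; not taken from the Ceph scripts.
    """
    # mkdtemp chooses an unpredictable name and creates the directory
    # itself, accessible only to the creating user. Another local user
    # therefore cannot pre-create the path or plant a symlink at a
    # guessed location, unlike a fixed name such as /tmp/memlog.
    return tempfile.mkdtemp(prefix=prefix)
```

The caller is responsible for removing the directory when done, for example with `shutil.rmtree`, which matches the cleanup-on-exit change in 6a7ad2eac1.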
Sage Weil
38a5acbb82 osd: reenable 'journal aio = true'
Now that #4079 is resolved.  Reverts 1cfc3ae0.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-03-22 09:15:23 -07:00
Sage Weil
e5940da9a5 os/FileJournal: fix aio self-throttling deadlock
This block of code tries to limit the number of aios in flight by waiting
for the amount of data to be written to grow relative to a function of the
number of aios.  Strictly speaking, the condition we are waiting for is a
function of both aio_num and the write queue, but we are only woken by
changes in aio_num, and were (in rare cases) waiting when aio_num == 0 and
there was no possibility of being woken.

Fix this by verifying that aio_num > 0, and restructuring the loop to
recheck that condition on each wakeup.

Fixes: #4079
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
2013-03-22 09:15:20 -07:00
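
The deadlock fixed in e5940da9a5 comes from waiting on a condition that no remaining event can signal. The sketch below is a simplified illustration, not the FileJournal code: it throttles on a fixed limit rather than on the write queue, and `AioThrottle`, `max_in_flight`, `start_aio`, and `finish_aio` are hypothetical names. It shows the two parts of the fix, namely refusing to wait when `aio_num` is 0 and rechecking the whole condition after every wakeup.

```python
import threading

class AioThrottle:
    """Hypothetical stand-in for the journal's in-flight aio accounting."""

    def __init__(self, max_in_flight):
        self.max_in_flight = max_in_flight
        self.aio_num = 0  # number of aios currently in flight
        self.cond = threading.Condition()

    def start_aio(self):
        with self.cond:
            # Recheck the full condition on every wakeup. Waiting is only
            # safe while aio_num > 0, because the only notifier is a
            # completion, and a completion requires an aio in flight.
            while self.aio_num > 0 and self.aio_num >= self.max_in_flight:
                self.cond.wait()
            self.aio_num += 1

    def finish_aio(self):
        with self.cond:
            self.aio_num -= 1
            # Wake any submitter blocked in start_aio().
            self.cond.notify_all()
```

Without the `aio_num > 0` guard, a throttle whose limit is already met with nothing in flight would block forever, since no completion could ever arrive to wake it.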
Sage Weil
7118df89cd Merge pull request #135 from ceph/wip-4519
mon: AuthMonitor: delete auth_handler while increasing max_global_id

Reviewed-by: Sage Weil <sage@inktank.com>
2013-03-21 18:25:01 -07:00
Joao Eduardo Luis
71ec9c6bd5 mon: AuthMonitor: delete auth_handler while increasing max_global_id
By not deleting and setting NULL the session's auth_handler, we could
hit a scenario in which we'd end up dispatching a previously-wait-listed
auth message and we wouldn't start its auth session.

This only happened when increasing max_global_id via Paxos (in which case
we would wait-list the message) and would only be noticeable when running
with cephx disabled.

Fixes: #4519

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-03-22 01:21:00 +00:00
Samuel Just
42a71c1dd8 FileJournal: quieter debugging on journal scanning
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: David Zafman <david.zafman@inktank.com>
(cherry picked from commit 6740d512ac)
2013-03-21 18:09:58 -07:00
Gary Lowell
d67eee1d11 Merge branch 'next'
2013-03-21 00:40:16 -07:00
Sage Weil
17d4a7c457 doc/release-notes: v0.59
Signed-off-by: Sage Weil <sage@inktank.com>
2013-03-20 22:11:15 -07:00
Sage Weil
f21411423b Merge pull request #126 from alram/master
Update Chef deployment documentation

Reviewed-by: Sage Weil <sage@inktank.com>
2013-03-20 17:07:11 -07:00
Alexandre Marangone
e485471765 Update Chef deployment documentation
Signed-off-by: Alexandre Marangone <alexandre.marangone@inktank.com>
2013-03-20 16:49:49 -07:00
Sage Weil
131dce6e8e Merge pull request #124 from ceph/wip-4509
mon: DataHealthService: shutdown mon if failed to obtain disk stats

Reviewed-by: Sage Weil <sage@inktank.com>
2013-03-20 16:17:51 -07:00
Joao Eduardo Luis
97fd7b610d mon: DataHealthService: log to derr instead if we're about to shutdown
Otherwise the message would -- or could -- be lost.

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-03-20 20:51:06 +00:00
Joao Eduardo Luis
51d62d325c mon: DataHealthService: shutdown mon if failed to obtain disk stats
Being unable to run a ::statfs() may be a symptom of something bigger.

We want to cleanly shut down the monitor ASAP if such a thing happens.

Fixes: #4509

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-03-20 20:49:20 +00:00
Sage Weil
06ae519672 Merge pull request #123 from dalgaaf/wip-da-sca-misc-1
Some smaller misc fixes

Reviewed-by: Sage Weil <sage@inktank.com>
2013-03-20 10:12:06 -07:00
Danny Al-Gaaf
5bf0331a97 client/Client.cc: handle error if _lookup() fails
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2013-03-20 17:08:42 +01:00
Danny Al-Gaaf
fc41684e99 qa/workunits/direct_io/test_sync_io.c: add proper error handling
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2013-03-20 16:56:03 +01:00
Danny Al-Gaaf
a8a5683e6d test_short_dio_read.c: add proper error handling
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2013-03-20 16:37:37 +01:00
Danny Al-Gaaf
f9c108c798 mds/Locker.cc: prefer prefix ++operator for iterators
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2013-03-20 16:15:06 +01:00
Danny Al-Gaaf
4151630c58 mount/mount.ceph.c: remove unused variable
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2013-03-20 16:15:00 +01:00
Sage Weil
45d5544c3f Merge pull request #121 from ceph/wip-4448
mon: Monitor: clearer output on error during attempt to convert store

Reviewed-by: Sage Weil <sage@inktank.com>
2013-03-20 06:33:18 -07:00
Joao Eduardo Luis
c29812cdaf mon: Monitor: clearer output on error during attempt to convert store
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-03-20 13:31:14 +00:00
Gary Lowell
cbae6a435c v0.59
2013-03-19 22:27:13 -07:00
Sage Weil
7e7783971e Merge pull request #106 from ceph/wip-crush
crush: update weights properly for DAG (not tree) maps 

Reviewed-by: caleb miles <caleb.miles@inktank.com>
2013-03-19 14:53:23 -07:00
Sage Weil
b7e2a0d464 Merge pull request #118 from dalgaaf/wip-da-enum
QuorumService.h: use enum instead of static const int

Reviewed-by: Sage Weil <sage@inktank.com>
2013-03-19 13:36:12 -07:00
Danny Al-Gaaf
dfb1fbe7eb QuorumService.h: use enum instead of static const int
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2013-03-19 21:33:18 +01:00
David Zafman
6a3aa2a2cc Missed adding rados_types.hpp to package
Caused by 3bd48cbbad
feature 4207 implementation

Signed-off-by: David Zafman <david.zafman@inktank.com>
Reviewed-by: Gary Lowell <gary.lowell@inktank.com>
(cherry picked from commit e1e2d5d217)
2013-03-19 13:02:00 -07:00
Josh Durgin
2900bf4a05 PendingReleaseNotes: fix typo
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2013-03-19 11:42:32 -07:00
Josh Durgin
1597b3e3a1 librbd: optionally wait for a flush before enabling writeback
Older guests may not send flushes properly (i.e. never), so if this is
enabled, rbd_cache=true is safe for them transparently.

Disable by default, since it will unnecessarily slow down newer guest
boot, and prevent writeback caching for things that don't need to send
flushes, like the command line tool.

Refs: #3817
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-03-19 11:42:27 -07:00
Sage Weil
47f1a94547 Makefile: missing header
Signed-off-by: Sage Weil <sage@inktank.com>
2013-03-19 10:39:50 -07:00
Sage Weil
020d1b1610 mon: use enum instead of static const int
This way it compiles.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-03-19 10:21:52 -07:00
Sage Weil
efc4b1268e mon/Paxos: set state to RECOVERING during restart
This ensures that the paxos state is not active when the PaxosService
restart() methods run right afterwards, and that EAGAIN waiters will get
requeued appropriately.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-03-19 10:15:41 -07:00
Joao Eduardo Luis
45843f7501 Makefile.am: fix misspelt header name
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-03-19 16:31:48 +00:00
Sage Weil
bee5046333 mon/PaxosService: handle non-zero return values
In 7aec13f749 we started passing non-zero
return values to these completions; now we have to deal with them
accordingly.

RetryMessage behaves just like the Monitor variant.

Propose and Committed update state but otherwise ignore non-zero
return values.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-03-18 23:09:51 -07:00
Sage Weil
1674519122 Merge branch 'next'
2013-03-18 22:54:25 -07:00
Sage Weil
d47759429a ceph-disk-prepare: 'mkfs -t' instead of 'mkfs --type='
Older mkfs (el6) doesn't like --type=.

Fixes: #4495
Reported-by: Alexandre Marangone <alexandre.marangone@inktank.com>
Signed-off-by: Sage Weil <sage@inktank.com>
2013-03-18 21:13:34 -07:00
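
The portability point in d47759429a is that the short `-t` option is accepted by older mkfs builds that reject `--type=`. A minimal sketch of building such a command line follows. `mkfs_command` is a hypothetical helper, not the function used in ceph-disk-prepare, and it only constructs the argument list without running anything.

```python
def mkfs_command(fstype, device):
    """Build an mkfs argument list using the short -t option.

    Hypothetical helper for illustration. The commit reports that older
    mkfs (el6) does not accept the long --type= spelling.
    """
    return ["mkfs", "-t", fstype, device]
```

The resulting list can be passed to `subprocess.check_call` by a caller that actually intends to format the device.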