Commit Graph

35885 Commits

Author SHA1 Message Date
Loic Dachary
fdbfece81c Merge pull request #2497 from ceph/wip-xfs-inode64
ceph-disk: mount xfs with inode64 by default

Reviewed-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-09-16 15:12:24 +02:00
Sage Weil
782848af59 Merge pull request #2499 from ceph/wip-9219-giant
wip-9219: subscribe to the newest osdmap when reconnecting to a monitor

Reviewed-by: Sage Weil <sage@redhat.com>
2014-09-15 17:40:28 -07:00
Greg Farnum
1b9226c723 osd: subscribe to the newest osdmap when reconnecting to a monitor
This is mostly relevant in testing clusters, but it ensures that an OSD
disconnecting from the monitor at the wrong time will still see any recent
map updates and prevent accidental loss of map injection into the OSD cluster.
Fixes: #9219

Signed-off-by: Greg Farnum <greg@inktank.com>
2014-09-15 17:07:41 -07:00
Sage Weil
11496399ef ceph-disk: mount xfs with inode64 by default
We did this forever ago with mkcephfs, but ceph-disk didn't.  Note that for
modern XFS this option is obsolete, but for older kernels it was not the
default.

Backport: firefly
Signed-off-by: Sage Weil <sage@redhat.com>
2014-09-15 15:29:08 -07:00
John Spray
8c23ef0949 Merge pull request #2492 from ceph/wip-9284
#9284 - fix client RECALL handling and add health metrics

Reviewed-by: Greg Farnum <greg@inktank.com>
2014-09-15 23:23:46 +01:00
Sage Weil
9d36d87c05 Merge pull request #2476 from ceph/wip-9307
rgw: push hash calculator deeper

Reviewed-by: Sage Weil <sage@redhat.com>
2014-09-15 15:19:07 -07:00
Josh Durgin
853ba2dfb3 Merge pull request #2493 from ceph/wip-rbd-objectcacher-hang
rbd: ObjectCacher reads can hang when reading sparse files

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2014-09-15 13:25:33 -07:00
Sage Weil
f2039c4e01 Merge pull request #2495 from dachary/wip-erasure-code-preload
erasure-code: preload fails if < 0

Reviewed-by: Sage Weil <sage@redhat.com>
2014-09-15 11:26:51 -07:00
Loic Dachary
ded1b303b5 erasure-code: preload fails if < 0
And not if < -1.

Signed-off-by: Loic Dachary <loic-201408@dachary.org>
2014-09-15 20:21:14 +02:00
Sage Weil
0eef2d1b6f Merge pull request #2486 from jgalvez/master
init-radosgw.sysv: Support systemd for starting the gateway

Reviewed-by: Sage Weil <sage@redhat.com>
2014-09-15 09:41:45 -07:00
Loic Dachary
1941d7b60f Merge pull request #2472 from dachary/wip-9429-bench
erasure-code: fix erasure_code_benchmark goop (decode)

Reviewed-by: Janne Grunau <j@jannau.net>
2014-09-15 18:23:08 +02:00
John Spray
a140439f85 mds: limit number of caps inspected in caps_tick
This is to avoid hitting an O(caps) loop in the worst
case scenario.  This mechanism is a little crude but
should be superseded at some point by admin socket
functionality to inspect session caps so that we
don't need to spit out this level of detail in logs.

Signed-off-by: John Spray <john.spray@redhat.com>
2014-09-15 15:05:14 +01:00
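The bounded scan described in a140439f85 can be sketched as follows. This is only an illustration, not the MDS code: the `revoking` list of `(client, age)` pairs and the `max_inspect` parameter are hypothetical names chosen for the example.

```python
def late_caps(revoking, timeout, max_inspect):
    """Return clients whose caps have been revoking for longer than `timeout`.

    At most `max_inspect` entries are examined, so one call never
    degenerates into a full O(caps) walk. All names are hypothetical.
    """
    late = []
    for inspected, (client, age) in enumerate(revoking):
        if inspected >= max_inspect:
            break  # crude limit: stop scanning, remaining entries wait for a later tick
        if age > timeout:
            late.append(client)
    return late
```

The trade-off is the one the commit message admits to: entries beyond the limit are simply not reported in that pass.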
John Spray
bf590f8a5d mds: keep per-client revoking caps list
...to avoid doing an O(caps) scan to find out
which clients are responsible for any late-revoking
caps during health checks.

Signed-off-by: John Spray <john.spray@redhat.com>
2014-09-15 15:05:14 +01:00
John Spray
a6a0fd814b xlist: implement copy constructor
...so that I can have a std::map of them.

Signed-off-by: John Spray <john.spray@redhat.com>
2014-09-15 15:05:14 +01:00
John Spray
fd04d5e662 mds: health metric for late releasing caps
Follow up on Yan Zheng's "mds: warn clients which
aren't revoking cap" to include a health metric
for this condition as well as the clog messages.

Signed-off-by: John Spray <john.spray@redhat.com>
2014-09-15 15:05:14 +01:00
John Spray
05d69580b0 mon: trigger transaction on MDS health changes
I think this was previously only working as a side effect
of other MDS map changes.

Signed-off-by: John Spray <john.spray@redhat.com>
2014-09-15 15:05:14 +01:00
John Spray
e6062b8d33 mds: add a health metric for failure to recall caps
Fixes: #9284

Signed-off-by: John Spray <john.spray@redhat.com>
2014-09-15 15:05:14 +01:00
John Spray
8c0f2555fe mds: add state for tracking RECALL progress
To be used later for generating health metrics
for clients which are failing to promptly service
CEPH_SESSION_RECALL_STATE messages.

Signed-off-by: John Spray <john.spray@redhat.com>
2014-09-15 15:05:14 +01:00
John Spray
8199f80846 xlist: implement const_iterator
Signed-off-by: John Spray <john.spray@redhat.com>
2014-09-15 15:05:14 +01:00
John Spray
00a002143a client: fix trim_caps for inodes in root
Previously client would fail to release caps for files
in the root directory in response to CEPH_SESSION_RECALL_STATE
messages.

Signed-off-by: John Spray <john.spray@redhat.com>
2014-09-15 15:05:14 +01:00
John Spray
2b5bbab55c client: failure injection for cap release
Used for simulating a buggy client that trips
the error detection in #9282 (warn clients
which aren't revoking caps)

Signed-off-by: John Spray <john.spray@redhat.com>
2014-09-15 15:05:13 +01:00
John Spray
21f5e18ee3 client: fix potentially invalid read in trim_caps
trim_dentry can potentially free an inode, so get/put
it around the block where we use the inode's dn_set.

Signed-off-by: John Spray <john.spray@redhat.com>
2014-09-15 15:05:13 +01:00
John Spray
9007217239 client: more precise cap trimming
Two fixes:
 * Client would unlink everything it could, instead of just
   meeting its goal, because caps.size() doesn't change until
   dentries are cleaned up later.  Take account of the trimmed
   count in the while() condition to fix that.
 * Don't count the root ino as trimmed, as although it has no
   dentries (of course), we will never give up the cap.

With this change, the client will now precisely achieve the number
of caps requested in CEPH_SESSION_RECALL_STATE messages.

Signed-off-by: John Spray <john.spray@redhat.com>
2014-09-15 15:05:13 +01:00
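The two fixes in 9007217239 can be illustrated with a small sketch. It assumes a simplified model in which caps are a plain list of inode names and cleanup is deferred, so the list length does not shrink while trimming; `trim_caps`, `goal`, and `root` are hypothetical stand-ins, not the client's real interface.

```python
def trim_caps(caps, goal, root="root"):
    """Pick inodes to unlink until only `goal` caps would remain.

    `caps` is not modified here (cleanup happens later), so the loop
    condition must subtract the number already trimmed.
    """
    trimmed = []
    for ino in caps:
        # Fix 1: account for what was already trimmed instead of relying on len(caps).
        if len(caps) - len(trimmed) <= goal:
            break
        # Fix 2: the root inode's cap is never given up, so do not count it.
        if ino == root:
            continue
        trimmed.append(ino)
    return trimmed
```

With five caps and a goal of three, the sketch selects exactly two inodes rather than every inode it could unlink.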
John Spray
c328486f24 client: fix crash in trim_caps
In a75af4c2, a procedure was added to invalidate root's dentries
if the trimming failed to free enough caps.  This would sometimes
crash because root->dir wasn't necessarily open.

Fix by only doing it if root dir is open, though I suspect this
may not be the end of it...

Signed-off-by: John Spray <john.spray@redhat.com>
2014-09-15 15:05:13 +01:00
Loic Dachary
68001fea75 Merge pull request #2485 from Abioy/master
bugfix: wrong socket address in log msg of Pipe.cc

Reviewed-by: Loic Dachary <loic-201408@dachary.org>
2014-09-15 15:40:44 +02:00
Abioy
83fd1cf84a bugfix: wrong socket address in log msg of Pipe.cc
paddr was not yet set up for the socket address

Signed-off-by: Yongyue Sun <abioy.sun@gmail.com>
2014-09-15 20:43:58 +08:00
Loic Dachary
92204287dc Merge pull request #2442 from dachary/wip-6754-jerasure-parameters
erasure-code: fix BlaumRoth sanity check on w

Reviewed-by: Andreas Peters <andreas.joachim.peters@cern.ch>
2014-09-15 12:24:19 +02:00
Loic Dachary
8e625a0032 Merge pull request #2488 from cernceph/docfix
doc: osd_backfill_scan_(min|max) are object counts

Reviewed-by: Loic Dachary <loic-201408@dachary.org>
2014-09-15 11:39:46 +02:00
Dan van der Ster
868b6b99fd doc: osd_backfill_scan_(min|max) are object counts
osd_backfill_scan_min and osd_backfill_scan_max set the number of
items grabbed during a single backfill scan, not an interval in
seconds. Correct the doc.

Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
2014-09-15 11:27:24 +02:00
Jason Dillaman
cdb7675a21 rbd: ObjectCacher reads can hang when reading sparse files
The pending read list was not properly flushed when empty objects
were read from a sparse file.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2014-09-15 00:53:50 -04:00
JuanJose 'JJ' Galvez
ddd52e87b2 init-radosgw.sysv: Support systemd for starting the gateway
When using RHEL7 the radosgw daemon needs to start under systemd.

Check whether systemd is running as PID 1. If it is, start
the daemon using: systemd-run -r <cmd>. pidof returns null
because it runs too soon after the start, so sleep for one
second first; the script then reports startup correctly.

Signed-off-by: JuanJose 'JJ' Galvez <jgalvez@redhat.com>
2014-09-14 20:38:20 -07:00
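The decision made in ddd52e87b2 can be sketched like this. The actual change is in a SysV shell script that this log does not show; the sketch assumes the check compares the name of PID 1 against "systemd", and `is_systemd` and `start_command` are hypothetical helpers. Only `systemd-run -r <cmd>` comes from the commit message.

```python
def is_systemd(pid1_comm):
    """True if the name of process 1 is 'systemd' (assumed form of the check)."""
    return pid1_comm.strip() == "systemd"


def start_command(cmd, pid1_comm):
    """Wrap the daemon command in systemd-run when systemd is PID 1."""
    if is_systemd(pid1_comm):
        return ["systemd-run", "-r"] + list(cmd)
    return list(cmd)
```

On Linux, the name of PID 1 could be read from `/proc/1/comm` and passed in as `pid1_comm`; keeping the helpers free of I/O makes them easy to check in isolation.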
Loic Dachary
d888753c0e Merge pull request #2484 from sjahl/master
doc: Added bucket management commands to ops/crush-map

Reviewed-by: Loic Dachary <loic-201408@dachary.org>
2014-09-14 17:46:04 +02:00
Stephen Jahl
d32b4286de doc: Added bucket management commands to ops/crush-map
Describes the CLI for adding and removing buckets, in addition to the
'moving' instructions which were already present.

Signed-off-by: Stephen Jahl <stephenjahl@gmail.com>
2014-09-14 10:41:16 -04:00
Sage Weil
3f0ca4668e Merge remote-tracking branch 'gh/giant'
2014-09-13 21:20:33 -07:00
Sage Weil
b285788c56 Merge pull request #2481 from sjahl/master
doc: fixes a formatting error on ops/crush-map
2014-09-13 12:46:24 -07:00
Stephen Jahl
b8a1ec08a1 doc: fixes a formatting error on ops/crush-map
Signed-off-by: Stephen Jahl <stephenjahl@gmail.com>
2014-09-13 15:31:53 -04:00
Loic Dachary
8d066732db Merge pull request #2467 from majianpeng/fix3
buffer: In rebuild_page_aligned for the last ptr is page aligned, no need call rebuild().

Reviewed-by: Loic Dachary <loic-201408@dachary.org>
2014-09-13 17:56:16 +02:00
Loic Dachary
04e40737f1 Merge pull request #2478 from ceph/wip-9445
global: fix hang when segv happens inside logging code

Reviewed-by: Loic Dachary <loic-201408@dachary.org>
2014-09-13 17:32:57 +02:00
Yan, Zheng
499a73b3c3 Merge pull request #2477 from ceph/wip-client-msg-leak
client: fix a message leak
2014-09-13 08:42:15 +08:00
John Spray
c3c6468cad mds: update segment references during journal rewrite
... to avoid leaving log events that reference log
segments by offsets which no longer exist.

Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit 386f2d7c82)
Reviewed-by: Greg Farnum <greg@inktank.com>
2014-09-12 17:39:53 -07:00
Gregory Farnum
e06f4251ac Merge pull request #2469 from ceph/wip-9427-rewrite
mds: update segment references during journal rewrite

Reviewed-by: Greg Farnum <greg@inktank.com>
2014-09-12 17:35:54 -07:00
Sage Weil
a8c943a0e4 log: add simple test to verify an internal SEGV doesn't hang
Test that the segv injection works.

Test that a segv while logging something doesn't hang when the signal
handlers are installed.  Note that this fails/hangs without the previous
fix.

Signed-off-by: Sage Weil <sage@redhat.com>
2014-09-12 17:18:01 -07:00
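The hang that a8c943a0e4 tests for, and the guard added in 558463e815 and e3fe18aabe, can be sketched as follows. This is a simplified model with a non-reentrant lock, not the Ceph implementation: the `Log` class, `submit`, and `on_fatal_signal` are hypothetical, and only the name `is_inside_log_lock` is taken from the log.

```python
import threading


class Log:
    """Toy logger guarded by a non-reentrant lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._holder = None  # thread id currently inside the lock, if any
        self.entries = []

    def is_inside_log_lock(self):
        return self._holder == threading.get_ident()

    def submit(self, msg, fault=None):
        with self._lock:
            self._holder = threading.get_ident()
            try:
                if fault is not None:
                    fault()  # simulates a fault raised while logging
                self.entries.append(msg)
            finally:
                self._holder = None


def on_fatal_signal(log, msg):
    """Log the fault unless it originated inside the logging code."""
    if log.is_inside_log_lock():
        return False  # taking the lock again here would deadlock
    log.submit(msg)
    return True
```

Without the `is_inside_log_lock()` check, a handler invoked from inside `submit` would block forever on the lock it already holds, which is the hang the test is meant to catch.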
John Spray
2313ce1d02 client: fix a message leak
Signed-off-by: John Spray <john.spray@redhat.com>
2014-09-13 00:14:36 +01:00
Sage Weil
e3fe18aabe global/signal_handler: do not log if SEGV originated inside log code
Signed-off-by: Sage Weil <sage@redhat.com>
2014-09-12 15:25:03 -07:00
Sage Weil
558463e815 log: add Log::is_inside_log_lock()
Signed-off-by: Sage Weil <sage@redhat.com>
2014-09-12 15:24:50 -07:00
John Spray
386f2d7c82 mds: update segment references during journal rewrite
... to avoid leaving log events that reference log
segments by offsets which no longer exist.

Signed-off-by: John Spray <john.spray@redhat.com>
2014-09-12 23:05:38 +01:00
Yehuda Sadeh
d41c3e858c rgw: push hash calculator deeper
This might have been the culprit for #9307. Before we were calculating
the hash after the call to processor->handle_data(), however, that
method might have spliced the bufferlist, so we can't be sure that the
pointer that we were holding originally is still valid. Instead, push
the hash calculation down. Added a new explicit complete_hash() call to
the processor, since when we're at complete() it's too late (we need to
have the hash at that point already).

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2014-09-12 14:07:44 -07:00
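The ordering problem described in d41c3e858c can be sketched as below. The `Processor` class is a hypothetical stand-in for the rgw processor, a `bytearray` stands in for the bufferlist, and SHA-256 is an arbitrary choice since the log does not name the hash; only `handle_data()` and `complete_hash()` are names from the commit message.

```python
import hashlib


class Processor:
    """Hypothetical processor that hashes data before consuming the buffer."""

    def __init__(self):
        self._hash = hashlib.sha256()  # arbitrary hash, chosen for the sketch
        self.stored = bytearray()

    def handle_data(self, buf):
        # Hash first, while the caller's buffer still holds the data.
        self._hash.update(bytes(buf))
        self.stored += buf
        buf.clear()  # simulates splicing: the caller's view is no longer usable

    def complete_hash(self):
        # Explicit call made before complete(), when the digest is needed.
        return self._hash.hexdigest()
```

A caller that hashed `buf` only after `handle_data()` returned would be hashing an emptied buffer, which is the class of bug the commit moves away from.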
John Spray
9f4c687288 Merge pull request #2471 from ceph/wip-9446
mon: fix MDS health detail output

Reviewed-by: Sage Weil <sage@redhat.com>
2014-09-12 16:47:52 +01:00
Loic Dachary
ce7b2ecc75 erasure-code: fix erasure_code_benchmark goop (decode)
Using a stringstream that is only displayed on error when calling the
erasure code factory, instead of cerr. The user expects the output to be
clean when there is no error. That was done for the encode function but
not the decode function.

http://tracker.ceph.com/issues/9429
Fixes: #9429

Signed-off-by: Loic Dachary <loic-201408@dachary.org>
2014-09-12 17:46:41 +02:00
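The pattern described in ce7b2ecc75, collecting diagnostics in a string buffer and showing them only on failure, can be sketched as follows. The benchmark itself is C++ and uses a stringstream; here `io.StringIO` plays that role, and `make_codec` and the `factory` callable are hypothetical.

```python
import io
import sys


def make_codec(plugin, factory, err=sys.stderr):
    """Call `factory`, buffering its diagnostics and emitting them only on error."""
    diagnostics = io.StringIO()
    codec = factory(plugin, diagnostics)
    if codec is None:
        err.write(diagnostics.getvalue())  # surface the buffered output on failure
    return codec
```

When the factory succeeds, whatever it wrote to the buffer is discarded, so normal runs produce clean output.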
John Spray
9ba4e78f00 mon: fix MDS health detail output
I fat fingered a couple of things here.

Signed-off-by: John Spray <john.spray@redhat.com>
2014-09-12 16:43:20 +01:00