34490 Commits

Author SHA1 Message Date
Greg Farnum
e179e9227b OSD: introduce require_self_aliveness(OpRequestRef&,epoch_t) function
Take the self-aliveness checks out of require_same_or_newer_map() and use
the new function for that and for require_up_osd_peer().

Signed-off-by: Greg Farnum <greg@inktank.com>
2014-07-28 18:39:59 -07:00
Greg Farnum
eb2f1ea2c3 OSD: use OpRequestRef& for a few require_* functions
Signed-off-by: Greg Farnum <greg@inktank.com>
2014-07-28 14:08:30 -07:00
Greg Farnum
ccd0eec501 OSD: introduce require_up_osd_peer() function for gating replica ops
This checks both that a Message originates from an OSD, and that the OSD
is up in the given map epoch.
We use it in handle_replica_op so that we don't inadvertently add operations
from down peers, who might or might not know it.

Signed-off-by: Greg Farnum <greg@inktank.com>
2014-07-22 16:57:00 -07:00
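The gate described in ccd0eec501 combines two checks: the sender must be an OSD, and that OSD must be up in the map for the given epoch. A minimal sketch of that idea follows, assuming hypothetical `PeerInfo` and `MapSnapshot` types and a hypothetical `peer_is_up_osd()` helper; none of these are the actual Ceph types or signatures.

```cpp
#include <set>

// Hypothetical stand-in for "who sent this Message".
struct PeerInfo {
  bool is_osd;  // true if the sender is an OSD (not a client, mon, or MDS)
  int osd_id;   // sender's OSD id; only meaningful when is_osd is true
};

// Hypothetical stand-in for an OSD map at one epoch.
struct MapSnapshot {
  unsigned epoch;         // map epoch this snapshot describes
  std::set<int> up_osds;  // ids of OSDs marked up in that epoch
};

// Returns true only if the sender is an OSD and is up in the given map.
inline bool peer_is_up_osd(const PeerInfo& from, const MapSnapshot& map) {
  if (!from.is_osd) {
    return false;  // not an OSD at all: reject
  }
  return map.up_osds.count(from.osd_id) > 0;  // reject peers that are down
}
```

A replica-op handler would call such a check first and drop the operation when it returns false.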
Sage Weil
36265d0db0 Merge pull request #2125 from ceph/wip-memstore
memstore: a few fixes, and enable the tests!

Reviewed-by: Haomai Wang <haomaiwang@gmail.com>
2014-07-22 10:52:40 -07:00
Sage Weil
f7112c5beb Merge pull request #2105 from rootfs/wip-qa-hadoop-wordcount
Update the hadoop-wordcount test so it can run on Hadoop 2.x.

Reviewed-by: Sage Weil <sage@redhat.com>
2014-07-22 08:42:03 -07:00
rootfs
e311a085a8 uncomment cleanup command
2014-07-22 11:31:37 -04:00
Wido den Hollander
d87e5b9f60 powerdns: RADOS Gateway backend for bucket directioning
This backend can be used to create one global namespace for multiple
RGW regions.

Using a CNAME DNS response, the traffic is directed towards the RGW region
without using HTTP redirects.
2014-07-22 16:51:05 +02:00
Ma, Jianpeng
9061988ec7 osd: init local_connection for fast_dispatch in _send_boot()
We were not properly setting up Sessions on the local_connection for
fast_dispatch'ed Messages if the cluster_addr was set explicitly: the OSD
was not in the dispatch list at bind() time (in ceph_osd.cc), and nothing
called it later on. This issue was missed in testing because Inktank only
uses unified NICs.

That led to errors like the following:

When doing an EC read, I hit a bug that occurred 100% of the time. The messages are:
2014-07-14 10:03:07.318681 7f7654f6e700 -1 osd/OSD.cc: In function
'virtual void OSD::ms_fast_dispatch(Message*)' thread 7f7654f6e700 time
2014-07-14 10:03:07.316782 osd/OSD.cc: 5019: FAILED assert(session)

 ceph version 0.82-585-g79f3f67 (79f3f67491)
 1: (OSD::ms_fast_dispatch(Message*)+0x286) [0x6544b6]
 2: (DispatchQueue::fast_dispatch(Message*)+0x56) [0xb059d6]
 3: (DispatchQueue::run_local_delivery()+0x6b) [0xb08e0b]
 4: (DispatchQueue::LocalDeliveryThread::entry()+0xd) [0xa4a5fd]
 5: (()+0x8182) [0x7f7665670182]
 6: (clone()+0x6d) [0x7f7663a1130d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

To resolve this, we have the OSD invoke ms_handle_fast_connect() explicitly
in send_boot(). It's not really an appropriate location, but we're already
doing a bunch of messenger twiddling there, so it's acceptable for now.

Signed-off-by: Ma Jianpeng <jianpeng.ma@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2014-07-21 13:13:44 -07:00
Sage Weil
c1c5f4b5f5 Merge pull request #2121 from ceph/wip-dencoder
limit leveldb linkage; move ceph-dencoder back into ceph-common

Reviewed-by: Dan Mick <dan.mick@inktank.com>

RGW patch Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
2014-07-21 13:10:02 -07:00
Sage Weil
27f6dbb64a Merge pull request #2067 from thorstenb/wip-janitorial-clang-3
[werror] Fix mismatched tags (struct vs. class) inconsistency

Reviewed-by: Sage Weil <sage@redhat.com>
2014-07-21 09:08:31 -07:00
Thorsten Behrens
b6f3aff766 Fix mismatched tags (struct vs. class) inconsistency
Signed-off-by: Thorsten Behrens <tbehrens@suse.com>
2014-07-21 17:09:17 +02:00
Sage Weil
ff15a43c71 Merge pull request #2111 from ceph/wip-8174
osd: add config for osd_max_object_name_len = 2048 (was hard-coded at 4096)

Reviewed-by: Haomai Wang <haomaiwang@gmail.com>

and the first patch was
Reviewed-by: Samuel Just <sam.just@inktank.com>
2014-07-20 14:21:09 -07:00
Sage Weil
2aa3edcb13 os/FileStore: fix max object name limit
Our max object name is not limited by file name size, but by the length of
the name we can stuff in an xattr.  That will vary from file system to
file system, so just make this 4096.  In practice, it should be limited
via the global tunable, if it is adjusted at all.

Signed-off-by: Sage Weil <sage@redhat.com>
2014-07-20 07:48:47 -07:00
Sage Weil
f4bffece8f ceph_test_objectstore: test memstore
Signed-off-by: Sage Weil <sage@redhat.com>
2014-07-19 13:56:07 -07:00
Sage Weil
6f312b0584 os/MemStore: copy attrs on clone
Backport: firefly
Signed-off-by: Sage Weil <sage@redhat.com>
2014-07-19 13:56:07 -07:00
Sage Weil
8dd6b8f9d8 os/MemStore: fix wrlock ordering checks
We can't compare the shared_ptrs themselves; we need to compare the
addresses of the actual objects.

Signed-off-by: Sage Weil <sage@redhat.com>
2014-07-19 13:56:07 -07:00
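The ordering rule in 8dd6b8f9d8 (order locks by the address of the underlying object, not by the smart-pointer handle) can be sketched as below. This is an illustration only: `Collection`, `lock_order()` and `with_both_locked()` are hypothetical names, and `std::mutex` stands in for whatever lock type MemStore actually uses.

```cpp
#include <functional>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical collection type; the mutex is a stand-in for the real lock.
struct Collection {
  std::mutex lock;
};
using CollectionRef = std::shared_ptr<Collection>;

// Order two collections by the address of the pointed-to objects.
// std::less gives a well-defined total order over raw pointers.
inline std::pair<Collection*, Collection*> lock_order(const CollectionRef& a,
                                                      const CollectionRef& b) {
  Collection* pa = a.get();
  Collection* pb = b.get();
  if (std::less<Collection*>()(pb, pa)) {
    std::swap(pa, pb);
  }
  return {pa, pb};
}

// Run fn with both collections locked, always acquiring in address order.
inline void with_both_locked(const CollectionRef& a, const CollectionRef& b,
                             const std::function<void()>& fn) {
  std::pair<Collection*, Collection*> order = lock_order(a, b);
  std::lock_guard<std::mutex> first(order.first->lock);
  if (order.first == order.second) {
    fn();  // same object on both sides: take the lock only once
    return;
  }
  std::lock_guard<std::mutex> second(order.second->lock);
  fn();
}
```

Because every caller acquires the lower address first, two threads locking the same pair in opposite argument order cannot deadlock.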
Sage Weil
a2594a5472 osd/MemStore: handle collection_move_rename within the same collection
Signed-off-by: Sage Weil <sage@redhat.com>
2014-07-19 13:56:07 -07:00
Sage Weil
34671108ce ceph-dencoder: don't link librgw.la (and rados, etc.)
Signed-off-by: Sage Weil <sage@redhat.com>
2014-07-18 22:44:51 -07:00
Sage Weil
b1a641f307 rgw: move a bunch of stuff into rgw_dencoder
This will help out ceph-dencoder ...

Signed-off-by: Sage Weil <sage@redhat.com>
2014-07-18 22:39:46 -07:00
Sage Weil
1c170776cb libosd_types, libos_types, libmon_types
Signed-off-by: Sage Weil <sage@redhat.com>
2014-07-18 22:33:42 -07:00
Sage Weil
58cc894b32 Revert "ceph.spec: move ceph-dencoder to ceph from ceph-common"
This reverts commit 95f5a448b5.
2014-07-18 20:55:39 -07:00
Sage Weil
f181f78b74 Revert "debian: move ceph-dencoder to ceph from ceph-common"
This reverts commit b37e3bde3b.
2014-07-18 20:55:35 -07:00
Sage Weil
ad4a4e1346 unittest_osdmap: revert a few broken changes
From commit 80ea6067f7.

Signed-off-by: Sage Weil <sage@redhat.com>
2014-07-18 16:51:16 -07:00
Wido den Hollander
09a5974fd3 crushtool: Send output to stdout instead of stderr
A lot of output was sent to stderr instead of stdout and vice versa.

Error messages should go to stderr, but all other output to stdout.
2014-07-18 20:18:18 +02:00
Gregory Farnum
b9463e3497 Merge pull request #2115 from ceph/wip-8811
Make standby-replay MDSes much more careful about journal formats; both changing them and generally being aware.

Reviewed-by: Greg Farnum <greg@inktank.com>
2014-07-18 11:17:52 -07:00
Sage Weil
bd3367eafb osd: add config for osd_max_attr_name_len = 100
Set a limit on the length of an attr name.  The fs can only take 128
bytes, but we were not imposing any limit.

Add a test.

Reported-by: Haomai Wang <haomaiwang@gmail.com>
Signed-off-by: Sage Weil <sage@inktank.com>
2014-07-18 10:44:49 -07:00
Sage Weil
7c0b2a05b9 os: add ObjectStore::get_max_attr_name_length()
Most importantly, capture that attrs on FileStore can't be more than about
100 chars.  The Linux xattrs can only be 128 chars, but we also have some
prefixing we do.

Signed-off-by: Sage Weil <sage@redhat.com>
2014-07-18 10:44:05 -07:00
Sage Weil
7e0aca18a0 osd: add config for osd_max_object_name_len = 2048 (was hard-coded at 4096)
Previously we had a hard-coded limit of 4096.  Object names > 3k crash the OSD
when running on ext4, although they probably work on xfs.  But rgw only
generates object names a bit over 1024 bytes (maybe 1200 tops?), so let's set a
more reasonable limit here.  2048 is a nice round number and should be
safe.

Add a test.

Fixes: #8174
Signed-off-by: Sage Weil <sage@redhat.com>
2014-07-18 10:44:05 -07:00
John Spray
e60dd0f6c5 osdc: refactor JOURNAL_FORMAT_* constants to enum
...so that the upper limit doesn't have to be updated
by hand.

Signed-off-by: John Spray <john.spray@redhat.com>
2014-07-18 18:40:51 +01:00
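One common way to get the effect described in e60dd0f6c5 is to derive the upper limit from a sentinel enumerator, so that adding a format updates the maximum automatically. The sketch below assumes illustrative enumerator names and a hypothetical `format_is_supported()` helper; the actual definitions in osdc may differ.

```cpp
#include <cstdint>

// Illustrative journal format enum; names are assumptions for this sketch.
enum stream_format_t : std::uint32_t {
  JOURNAL_FORMAT_LEGACY = 0,
  JOURNAL_FORMAT_RESILIENT = 1,
  // Add new formats above this line.
  JOURNAL_FORMAT_SENTINEL,  // one past the last real format
  JOURNAL_FORMAT_MAX = JOURNAL_FORMAT_SENTINEL - 1
};

// Hypothetical helper: can this build read the given on-disk format?
inline bool format_is_supported(std::uint32_t on_disk_format) {
  return on_disk_format <= JOURNAL_FORMAT_MAX;
}
```

With this layout, inserting a new enumerator before the sentinel raises `JOURNAL_FORMAT_MAX` without any other edit.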
John Spray
8eef89e663 doc: fix example s/inspect/journal inspect/
Signed-off-by: John Spray <john.spray@redhat.com>
2014-07-18 18:40:51 +01:00
John Spray
5438500af8 mds: fix journal reformat failure in standbyreplay
In the 0.82 release, standbyreplay MDS daemons would try
to reformat the journal if they saw an older version on
disk, where this should have only been done by the active
MDS for the rank.  Depending on timing, this could cause
fatal corruption of the journal.

This change handles the following cases:
* only do reformat if not in standbyreplay (else raise EAGAIN
to keep trying til an active mds reformats it)
* if journal header goes away while in standbyreplay then raise
EAGAIN (handle rewrite happening in background)
* if journal version is greater than the max supported, suicide

Fixes: #8811

Signed-off-by: John Spray <john.spray@redhat.com>
2014-07-18 18:40:51 +01:00
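The three cases listed in 5438500af8 amount to a small decision table. A rough sketch follows, assuming hypothetical names (`JournalAction`, `JournalState`, `decide()`) and a hypothetical `wanted_format` parameter for the format the daemon would write. The `Error` result for a missing header on an active MDS is an assumption; the commit message does not cover that case.

```cpp
#include <cstdint>

// Hypothetical outcomes; RetryLater corresponds to raising EAGAIN.
enum class JournalAction { Proceed, Reformat, RetryLater, Suicide, Error };

// Hypothetical view of what was found on disk.
struct JournalState {
  bool header_present;           // false if the header object has gone away
  std::uint32_t on_disk_format;  // format version read from the header
};

inline JournalAction decide(const JournalState& s, bool standby_replay,
                            std::uint32_t wanted_format,
                            std::uint32_t max_supported) {
  if (!s.header_present) {
    // In standby-replay the header may be mid-rewrite by the active MDS,
    // so keep retrying. The non-standby branch is an assumption.
    return standby_replay ? JournalAction::RetryLater : JournalAction::Error;
  }
  if (s.on_disk_format > max_supported) {
    return JournalAction::Suicide;  // newer than this build understands
  }
  if (s.on_disk_format < wanted_format) {
    // Only the active MDS for the rank may reformat the journal.
    return standby_replay ? JournalAction::RetryLater
                          : JournalAction::Reformat;
  }
  return JournalAction::Proceed;
}
```

Keeping the policy in one pure function makes each case easy to test without a running MDS.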
John Spray
ed3bc4c385 osdc/Journaler: validate header on load and save
Previously if the journal header contained invalid
write, expire or trimmed offsets, we would end up
hitting a hard-to-understand assertion much later.

Instead, raise the error right away if the fields
are identifiably bad at load time, and assert that
they're valid before persisting them.

Signed-off-by: John Spray <john.spray@redhat.com>
2014-07-18 18:40:51 +01:00
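The approach in ed3bc4c385 (return an error at load time, assert before persisting) can be sketched as follows. The struct layout, the function names, and the exact invariant (`trimmed_pos <= expire_pos <= write_pos`) are assumptions made for illustration, not the Journaler's actual code.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>

// Hypothetical header with the three offsets named in the commit message.
struct JournalHeader {
  std::uint64_t trimmed_pos;
  std::uint64_t expire_pos;
  std::uint64_t write_pos;
};

// Assumed invariant: trimmed <= expire <= write.
inline bool header_is_valid(const JournalHeader& h) {
  return h.trimmed_pos <= h.expire_pos && h.expire_pos <= h.write_pos;
}

// Load path: untrusted on-disk data, so report an error instead of asserting.
inline int load_header(const JournalHeader& on_disk, JournalHeader* out) {
  if (!header_is_valid(on_disk)) {
    return -EINVAL;
  }
  *out = on_disk;
  return 0;
}

// Save path: a bad header here is a bug in our own code, so assert.
inline void save_header(const JournalHeader& h, JournalHeader* disk) {
  assert(header_is_valid(h));
  *disk = h;
}
```

The split reflects the two trust levels: data read back from disk may be corrupt, while data about to be written should already be correct.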
Sage Weil
5093666151 Merge pull request #2104 from ceph/wip-dencoder
move ceph-dencoder to ceph from ceph-common

Reviewed-by: Dan Mick <dan.mick@inktank.com>
2014-07-18 10:29:50 -07:00
Sage Weil
094db11623 Merge pull request #2114 from ceph/wip-vstart
vstart.sh: default to 3 osds

Not-NAKed-by: John Spray <john.spray@inktank.com>
2014-07-18 10:27:51 -07:00
John Spray
18ca6b60d1 test: add a missing semicolon
Broke in df8f48628.

Signed-off-by: John Spray <john.spray@redhat.com>
2014-07-18 18:00:44 +01:00
Sage Weil
113c3656a0 Merge pull request #2119 from ceph/wip-vstart-existing-mds
Wip vstart existing mds

Reviewed-by: Sage Weil <sage@redhat.com>
2014-07-18 09:51:13 -07:00
Sage Weil
df8f486288 Merge pull request #2108 from kevincox/sizeint
Fix size of network protocol integers.

Reviewed-by: Sage Weil <sage@redhat.com>
2014-07-18 09:15:09 -07:00
John Spray
0cd0268421 qa: generalise cephtool for vstart+MDS
Previously this test assumed no pre-existing
filesystem and no MDS running.  Generalize it
to nuke any existing filesystems found before
running, so that you can use it inside a vstart
cluster that had MDS>0.

Signed-off-by: John Spray <john.spray@redhat.com>
2014-07-18 16:53:58 +01:00
John Spray
bb5a574f12 mon: carry last_failure_osd_epoch across fs new
So that new MDSs in a new filesystem are guaranteed
to be up to date with anything we blacklisted
from a filesystem coming before.

Signed-off-by: John Spray <john.spray@redhat.com>
2014-07-18 16:53:48 +01:00
John Spray
b936a276e3 mon/MDSMonitor: fix msg on idempotent fs rm
Was outputting trailing "unrecognised command"
because we returned 0 instead of setting r=0.

Signed-off-by: John Spray <john.spray@redhat.com>
2014-07-18 16:53:48 +01:00
Dan Mick
06a8f7b99c configure: do not link leveldb with everything
Detect leveldb, but do not let autoconf blindly link it with everything on the
planet.

Signed-off-by: Dan Mick <dan.mick@inktank.com>
Sighed-off-by: Sage Weil <sage@redhat.com>
2014-07-17 21:44:06 -07:00
Sage Weil
0193d3aa29 AUTHORS
Signed-off-by: Sage Weil <sage@inktank.com>
2014-07-17 21:33:22 -07:00
Wido den Hollander
7b342ef030 doc: Add Note about European mirror in Quick Start
2014-07-17 22:56:01 +02:00
Sage Weil
4d6899c756 qa/workunits/cephtool/test.sh: fix erasure_code_profile get test
I broke this in ce9f12d7a2 (the pool isn't
type erasure).

Signed-off-by: Sage Weil <sage@redhat.com>
2014-07-17 10:14:35 -07:00
John Spray
fe8c04f482 Merge pull request #2113 from ceph/wip-8857
mon/MDSMonitor: make legacy 'newfs' command idempotent

Reviewed-by: John Spray <john.spray@redhat.com>
2014-07-17 14:20:47 +01:00
Sage Weil
ce9f12d7a2 qa/workunits/cephtool/test.sh: test osd pool get erasure_code_profile
Signed-off-by: Sage Weil <sage@inktank.com>
2014-07-16 17:55:36 -07:00
Ma Jianpeng
e8ebcb79a4 mon: OSDMonitor: add "osd pool get <pool> erasure_code_profile" command
Enable us to obtain the erasure-code-profile for a given erasure-pool.

Signed-off-by: Ma Jianpeng <jianpeng.ma@intel.com>
Signed-off-by: Sage Weil <sage@inktank.com>
2014-07-16 17:49:00 -07:00
Sage Weil
5ccfd37b19 vstart.sh: default to 3 osds
Signed-off-by: Sage Weil <sage@inktank.com>
2014-07-16 17:46:11 -07:00
Sage Weil
5f6b11a6ad mon/MDSMonitor: make legacy 'newfs' command idempotent
We need to return success if we get a dup command.  Simply check whether
the fs is already enabled with the same pools and name.

Fixes: #8857
Signed-off-by: Sage Weil <sage@redhat.com>
2014-07-16 17:24:36 -07:00
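The idempotency check in 5f6b11a6ad can be sketched as below: a repeated command with the same name and pools succeeds without changing anything. `FsState` and `fs_new()` are hypothetical, and returning `-EINVAL` for a conflicting request is an assumption; the commit message only specifies the duplicate case.

```cpp
#include <cerrno>
#include <cstdint>
#include <string>

// Hypothetical record of the currently enabled filesystem.
struct FsState {
  bool enabled;
  std::string name;
  std::int64_t metadata_pool;
  std::int64_t data_pool;
};

// Returns 0 on success, including when the request duplicates the
// existing filesystem exactly.
inline int fs_new(FsState& fs, const std::string& name,
                  std::int64_t metadata_pool, std::int64_t data_pool) {
  if (fs.enabled) {
    if (fs.name == name && fs.metadata_pool == metadata_pool &&
        fs.data_pool == data_pool) {
      return 0;  // duplicate command: nothing to do, report success
    }
    return -EINVAL;  // assumption: a conflicting request is rejected
  }
  fs = FsState{true, name, metadata_pool, data_pool};
  return 0;
}
```

A client that retries after a lost reply then sees success both times instead of an error on the second attempt.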
Sage Weil
bf252c8df9 Merge remote-tracking branch 'gh/next'
2014-07-16 15:28:10 -07:00