The in-tree Hadoop shim was a combination of a libcephfs wrapper and the
bits needed to support Hadoop. It has been replaced by src/java, which
implements generic libcephfs wrappers, and by the external Hadoop shim
(see docs).
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
Backportable change to ensure that recovery is indeed complete, even if
no new ops were started or are running. Prevents some error condition or
unforeseen code path from crashing an OSD.
Backport: dumpling, cuttlefish
Signed-off-by: David Zafman <david.zafman@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
Caused by 944f3b7353
Fixes: #6291
Backport: dumpling
Signed-off-by: David Zafman <david.zafman@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
We don't want it binding to whatever willy-nilly, and as an OSD even
its "client" traffic should go on the cluster address.
Signed-off-by: Greg Farnum <greg@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
This makes it possible to converse about op_cancel and cancel_linger_op
without getting too confused.
Signed-off-by: Greg Farnum <greg@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Right now these are very basic and aren't as sophisticated as we
want them to end up, but we have a skeleton for where to put the
decision-making logic.
Signed-off-by: Greg Farnum <greg@inktank.com>
If we get back a redirect reply, we clean up the Op's external references
and re-send using the target_oloc and target_oid. To facilitate this,
recalc_op_target() now only fills them in and overrides them with pool
cache semantics if they're empty.
Keep in mind that this is a pretty simple redirect formula -- the
Objecter will keep following redirects forever if that's what the OSDs
send back. The client is not providing any synchronization right now.
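A minimal sketch of the re-send behaviour described above, written in
Python rather than the Objecter's C++. The field names base_oid,
base_oloc, target_oid and target_oloc follow the commit messages; the Op
and Reply classes, submit() and the send callback are hypothetical
stand-ins, and the pool cache override is left out. The loop
deliberately has no hop limit, matching the caveat above.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Op:
    base_oid: str
    base_oloc: str
    target_oid: str = ""
    target_oloc: str = ""


@dataclass
class Reply:
    # (oloc, oid) to retry at, or None when the request was served.
    redirect: Optional[Tuple[str, str]] = None


def recalc_op_target(op: Op) -> None:
    # Only fill in the targets when they are empty, so that values set
    # by a redirect survive the recalculation.
    if not op.target_oid:
        op.target_oid = op.base_oid
    if not op.target_oloc:
        op.target_oloc = op.base_oloc


def submit(op: Op, send: Callable[[str, str], Reply]) -> int:
    """Send op, following redirects; return the number of sends."""
    sends = 0
    while True:  # no hop limit: follows redirects as long as they come
        recalc_op_target(op)
        reply = send(op.target_oloc, op.target_oid)
        sends += 1
        if reply.redirect is None:
            return sends
        op.target_oloc, op.target_oid = reply.redirect
```

If two OSDs redirected to each other, submit() would never return; any
bound on the number of hops would have to be added by the caller.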
Signed-off-by: Greg Farnum <greg@inktank.com>
We have a little block to clean them up if we get back EAGAIN, but it's
actually leaking map references; we will also use this for redirects
from the OSDs.
Signed-off-by: Greg Farnum <greg@inktank.com>
When present, clients must send the request to the location specified
by the redirect (by using the combine_with_locator() function on
request_redirect_t).
A separate mechanism must be used to ensure that clients see and respect
the redirect, as we do not bump up the minimum required version to
decode.
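A rough Python sketch of what combining a redirect with a locator could
look like. Only the names request_redirect_t and combine_with_locator()
come from the message above; the two fields and the rule that an empty
redirect object keeps the original name are assumptions made for
illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class RequestRedirect:
    redirect_locator: str = ""  # hypothetical field: where to send
    redirect_object: str = ""   # hypothetical field: optional new name

    def combine_with_locator(self, orig_locator: str,
                             orig_oid: str) -> Tuple[str, str]:
        """Return the (locator, oid) the client should send to next."""
        # orig_locator is replaced outright; the object name is kept
        # unless the redirect names a different object (assumption).
        oid = self.redirect_object or orig_oid
        return self.redirect_locator, oid
```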
Signed-off-by: Greg Farnum <greg@inktank.com>
For now it's just a copy of base_oid, but soon we will allow it
to be overwritten for OSD-driven redirects.
Signed-off-by: Greg Farnum <greg@inktank.com>
Analogous to the oloc->base_oloc rename we did in
e2fcad09d9: we may specify a different
target name for a redirect, so rename the existing oid to base_oid to
avoid any confusion.
Signed-off-by: Greg Farnum <greg@inktank.com>
We'll use this so that the OSD can tell the Objecter to redirect a
request to a different object somewhere else.
Signed-off-by: Greg Farnum <greg@inktank.com>
Create the class matching the string found in the
erasure-code-technique parameter, using the same strings as the
original {encoder,decoder}.c examples from Jerasure-1.2A. Registers
the plugin in ErasureCodePluginRegistry.
https://github.com/dachary/ceph/tree/wip-5879 refs #5879
Signed-off-by: Loic Dachary <loic@dachary.org>
technique == "liber8tion"
ErasureCodeInterface (abstract)
|
-> ErasureCodeJerasure (abstract)
|
-> ErasureCodeJerasureLiberation
|
-> ErasureCodeJerasureLiber8tion
| == liber8tion
Derived from Liberation, it overloads the parse and prepare methods.
parse : default to K=2 and packetsize = 8.
If any of the following constraints is not satisfied, revert to the
default:
* K <= 8
* packetsize must not be zero
prepare uses liber8tion_coding_bitmatrix
https://github.com/dachary/ceph/tree/wip-5879 refs #5879
Signed-off-by: Loic Dachary <loic@dachary.org>
technique == "liberation"
parse : default to K=7, M=2 and W=7 and packetsize = 8.
If any of the following constraints is not satisfied, revert to the
default:
* K <= W
* W > 2
* W is a prime number
* packetsize must not be zero
* packetsize must be a multiple of sizeof(int)
pad_in_length : pad to a multiple of k*w*packetsize*sizeof(int)
prepare, jerasure_encode, jerasure_decode map directly to the matching
jerasure functions
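The parse rules above can be sketched as follows, in Python rather than
the plugin's C++. Two assumptions are made: K is taken to be at most W
(the defaults K=7, W=7 must themselves pass the check), and a failed
check reverts every parameter to its default rather than only the
offending one. sizeof(int) is assumed to be 4.

```python
import math

SIZEOF_INT = 4  # assumption: 4-byte int
DEFAULTS = {"k": 7, "m": 2, "w": 7, "packetsize": 8}


def is_prime(n: int) -> bool:
    if n < 2:
        return False
    return all(n % d for d in range(2, math.isqrt(n) + 1))


def parse(params: dict) -> dict:
    """Merge params over the defaults; fall back if a check fails."""
    p = {**DEFAULTS, **params}
    ok = (
        p["k"] <= p["w"]
        and p["w"] > 2
        and is_prime(p["w"])
        and p["packetsize"] != 0
        and p["packetsize"] % SIZEOF_INT == 0
    )
    return p if ok else dict(DEFAULTS)
```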
https://github.com/dachary/ceph/tree/wip-5879 refs #5879
Signed-off-by: Loic Dachary <loic@dachary.org>
The technique Cauchy has two variants:
ErasureCodeInterface (abstract)
|
-> ErasureCodeJerasure (abstract)
|
-> ErasureCodeJerasureCauchy (abstract)
| |
| -> ErasureCodeJerasureCauchyOrig
| | == cauchy_orig
| -> ErasureCodeJerasureCauchyGood
| | == cauchy_good
ErasureCodeJerasureCauchy defines the prepare_schedule method used by
the prepare method, which is the only one overloaded by
ErasureCodeJerasureCauchyOrig (calling cauchy_original_coding_matrix)
and ErasureCodeJerasureCauchyGood (calling
cauchy_good_general_coding_matrix).
The schedule is retained for encoding and the bitmatrix for decoding.
parse : default to K=7, M=3, W=8 and packetsize = 8.
pad_in_length : pad to a multiple of k*w*packetsize*sizeof(int)
jerasure_encode, jerasure_decode map directly to the matching
jerasure functions
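As an illustration of the padding rule, a small Python sketch that
rounds an input length up to the next multiple of
k*w*packetsize*sizeof(int). The function signature is hypothetical and
sizeof(int) is assumed to be 4; the default arguments are the Cauchy
defaults listed above.

```python
SIZEOF_INT = 4  # assumption: 4-byte int


def pad_in_length(in_length: int, k: int = 7, w: int = 8,
                  packetsize: int = 8) -> int:
    """Round in_length up to a multiple of k*w*packetsize*sizeof(int)."""
    alignment = k * w * packetsize * SIZEOF_INT
    remainder = in_length % alignment
    if remainder == 0:
        return in_length
    return in_length + alignment - remainder
```

With the defaults the alignment is 7*8*8*4 = 1792 bytes, so a 1-byte
object is padded to 1792 bytes and a 1793-byte object to 3584.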
https://github.com/dachary/ceph/tree/wip-5879 refs #5879
Signed-off-by: Loic Dachary <loic@dachary.org>
technique == reed_sol_r6_op
parse : default to K=7 and W=8. If W is not 8, 16 or 32, it
reverts to 8.
pad_in_length : pad to a multiple of k*w*sizeof(int)
prepare, jerasure_encode, jerasure_decode map directly to the matching
jerasure functions
https://github.com/dachary/ceph/tree/wip-5879 refs #5879
Signed-off-by: Loic Dachary <loic@dachary.org>
technique == reed_sol_van
parse : default to K=7, M=3 and W=8. If W is not 8, 16 or 32, it
reverts to 8.
pad_in_length : pad to a multiple of k*w*sizeof(int)
prepare, jerasure_encode, jerasure_decode map directly to the matching
jerasure functions
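The W fallback for reed_sol_van amounts to very little code. A Python
sketch, with a hypothetical function name and plain keyword arguments
standing in for the plugin's parameter map:

```python
def parse_reed_sol_van(k: int = 7, m: int = 3, w: int = 8):
    """Return (k, m, w), forcing w back to 8 when it is unsupported."""
    if w not in (8, 16, 32):
        w = 8
    return k, m, w
```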
https://github.com/dachary/ceph/tree/wip-5879 refs #5879
Signed-off-by: Loic Dachary <loic@dachary.org>
A typed unit test is defined and must run regardless of the technique.
When a new technique is derived from ErasureCodeJerasure, it is added
to the JerasureTypes typedef and the test will validate that:
* it provides reasonable defaults for the technique specific
parameters
* it modifies the k, m and w to reasonable defaults depending
on the imposed constraints (for instance Liber8tion requires
that w == 8 but the test sets it to 7)
* the encoding of K=2, M=2 produces 4 chunks, the first two
of which contain the original buffer data, showing the
code is systematic
* decoding when all 4 chunks are available indeed retrieves
the original buffer content
* decoding when the two data chunks are missing indeed
retrieves the original buffer content
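To make the systematic and recovery properties concrete, here is a toy
K=2, M=2 code in Python. It is not Jerasure and not the unit test
itself: the GF(2^8) arithmetic, the coding rows (1,1) and (1,2), and the
function names are all choices made for this sketch. The input length
is assumed to be even.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8) modulo x^8 + x^4 + x^3 + x^2 + 1 (0x11d)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_inv(a: int) -> int:
    # Brute-force search is enough for a 256-element field.
    return next(x for x in range(1, 256) if gf_mul(a, x) == 1)


def encode(data: bytes) -> list:
    """Return [d0, d1, c0, c1]: two data chunks, two coding chunks."""
    if len(data) % 2:
        raise ValueError("toy example expects an even length")
    half = len(data) // 2
    d0, d1 = data[:half], data[half:]
    c0 = bytes(x ^ y for x, y in zip(d0, d1))             # d0 + d1
    c1 = bytes(x ^ gf_mul(2, y) for x, y in zip(d0, d1))  # d0 + 2*d1
    return [d0, d1, c0, c1]


def decode_from_coding(c0: bytes, c1: bytes) -> bytes:
    """Rebuild the data when both data chunks are missing."""
    inv3 = gf_inv(3)
    # c0 + c1 = 3*d1 in GF(2^8), so d1 = (c0 + c1) / 3.
    d1 = bytes(gf_mul(inv3, x ^ y) for x, y in zip(c0, c1))
    d0 = bytes(x ^ y for x, y in zip(c0, d1))
    return d0 + d1
```

The first two chunks returned by encode() are the input split in half,
which is what "systematic" means here; decode_from_coding() covers the
case where only the two coding chunks survive.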
https://github.com/dachary/ceph/tree/wip-5879 refs #5879
Signed-off-by: Loic Dachary <loic@dachary.org>
With the introduction of the erasure code pool, arguments that are
interpreted depending on the pool type must be introduced.
For instance, the erasure code pool, which loads a plugin at run time,
will use erasure-code-k=10 to split each object in 10.
The arguments are described as
name=properties,type=CephString,n=N,req=false,goodchars=[A-Za-z0-9-_.=]
An argument of the form key=value is stored in the new properties data
member of pg_pool_t as properties[key] = value; otherwise the value is
the empty string.
The pg_pool_t version is bumped to 10 and the encode/decode methods
modified to take the properties into account. The
generate_test_instances method creates a two-entry map, one of which
is the empty string to cover the case when no value is specified.
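The mapping from arguments to properties can be sketched in a few lines
of Python. The function name is hypothetical; splitting at the first
"=" and rejecting characters outside the goodchars set are assumptions
about how such arguments would be handled.

```python
import re

# Same character set as the goodchars description above.
GOODCHARS = re.compile(r"[A-Za-z0-9\-_.=]+")


def parse_properties(args):
    """Turn ["key=value", "flag"] into {"key": "value", "flag": ""}."""
    properties = {}
    for arg in args:
        if not GOODCHARS.fullmatch(arg):
            raise ValueError("invalid characters in %r" % arg)
        # partition() yields an empty value when there is no "=".
        key, _, value = arg.partition("=")
        properties[key] = value
    return properties
```

For example, parse_properties(["erasure-code-k=10", "flag"]) yields
{"erasure-code-k": "10", "flag": ""}.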
http://tracker.ceph.com/issues/6113 refs #6113
Signed-off-by: Loic Dachary <loic@dachary.org>
Return -EEXIST on duplicate ID
BUG FIX: crush_add_bucket() mixes error returns and IDs
Add optional argument to return generated ID
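A Python sketch of why mixing IDs and error codes in one return value
is a problem, and of the shape of the fix. It assumes, as in CRUSH,
that bucket IDs are negative, so a return value such as -17 could be
read either as an ID or as -EEXIST. The add_bucket function, the dict
of buckets, the "0 means generate an ID" convention and the list used
as an output argument are all illustrative stand-ins for the C API.

```python
import errno


def add_bucket(buckets, bucket, bucket_id=0, idout=None):
    """Return 0 or a negative errno; never return the ID itself.

    bucket_id == 0 asks for a generated ID (assumption). When idout is
    a list, the ID that was used is stored in it.
    """
    if bucket_id == 0:
        # Pick the first free negative ID: -1, -2, ...
        bucket_id = -1
        while bucket_id in buckets:
            bucket_id -= 1
    elif bucket_id in buckets:
        return -errno.EEXIST
    buckets[bucket_id] = bucket
    if idout is not None:
        idout[:] = [bucket_id]
    return 0
```

The return value now only ever carries success or an error, and callers
that need the generated ID pass idout explicitly.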
Fixes: #6246
Signed-off-by: David Zafman <david.zafman@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>