mirror of
https://github.com/ceph/ceph
synced 2025-03-11 02:39:05 +00:00
erasure code : plugin, interface and glossary documentation updates
* replace the erasure code plugin abstract interface with a doxygen
  link that will be populated when the header shows in master
* update the plugin documentation to reflect the current draft
  implementation
* fix broken link to PGBackend-h
* add a glossary to define chunk, stripe, shard and strip with a
  drawing

http://tracker.ceph.com/issues/4929 refs #4929

Signed-off-by: Loic Dachary <loic@dachary.org>
parent 928a0d4270
commit 3a83129283
@@ -10,6 +10,45 @@ architectural changes <http://tracker.ceph.com/issues/4929>`_, up to
 the point where it becomes a reference of the erasure coding
 implementation itself.
 
+Glossary
+--------
+
+*chunk*
+   when the encoding function is called, it returns chunks of the
+   same size.
+
+*stripe*
+   when an object is too large to be encoded with a single call,
+   each set of chunks created by a call to the encoding function is
+   called a stripe.
+
+*shard|strip*
+   the file that holds all chunks of a same rank for a given object.
+
+Example:
+::
+                 OSD 40                       OSD 33
+        +-------------------------+ +-------------------------+
+        |     shard 0 - PG 10     | |     shard 1 - PG 10     |
+        |+------ object O -------+| |+------ object O -------+|
+        ||+---------------------+|| ||+---------------------+||
+  stripe|||      chunk 0        ||| |||      chunk 1        ||| ...
+    0   |||      [0,+N)         ||| |||      [0,+N)         |||
+        ||+---------------------+|| ||+---------------------+||
+        ||+---------------------+|| ||+---------------------+||
+  stripe|||      chunk 0        ||| |||      chunk 1        ||| ...
+    1   |||      [N,+N)         ||| |||      [N,+N)         |||
+        ||+---------------------+|| ||+---------------------+||
+        ||+---------------------+|| ||+---------------------+||
+  stripe||| chunk 0 [N*2,+len)  ||| ||| chunk 1 [N*2,+len)  ||| ...
+    2   ||+---------------------+|| ||+---------------------+||
+        |+-----------------------+| |+-----------------------+|
+        |           ...           | |           ...           |
+        +-------------------------+ +-------------------------+
+
+Table of content
+----------------
+
 .. toctree::
    :maxdepth: 1
 
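The glossary added in this hunk can be made concrete with a little arithmetic. The following C++ sketch is not Ceph code: it assumes that each stripe holds `data_chunks` data chunks of `chunk_size` bytes (the *N* of the drawing) laid out one after another, and the names `ChunkLocation` and `locate` are hypothetical. Under those assumptions it maps a byte offset within an object to its stripe, its chunk rank (which is also the shard index) and its offset within the shard file.

```cpp
#include <cstdint>

// Hypothetical helper, not part of Ceph: where a logical byte of an object lands.
struct ChunkLocation {
  uint64_t stripe;        // index of the stripe (one call to the encoding function)
  uint64_t rank;          // rank of the chunk in its stripe, i.e. the shard index
  uint64_t shard_offset;  // byte offset inside the shard file
};

// Assumes data_chunks > 0 and chunk_size > 0, and that the data chunks of a
// stripe are contiguous slices of the object (an assumption of this sketch).
ChunkLocation locate(uint64_t object_offset, uint64_t data_chunks,
                     uint64_t chunk_size) {
  const uint64_t stripe_width = data_chunks * chunk_size;
  const uint64_t stripe = object_offset / stripe_width;
  const uint64_t in_stripe = object_offset % stripe_width;
  const uint64_t rank = in_stripe / chunk_size;
  // A shard stores one chunk per stripe, so stripe s starts at s * chunk_size,
  // matching the [0,+N), [N,+N), [N*2,+len) ranges in the drawing.
  const uint64_t shard_offset = stripe * chunk_size + in_stripe % chunk_size;
  return ChunkLocation{stripe, rank, shard_offset};
}
```

With two data chunks of four bytes each, byte 9 of the object falls in stripe 1, chunk rank 0, at offset 5 of shard 0.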
@@ -9,17 +9,6 @@ An erasure coded pool only supports full writes, appends and read. It
 does not support snapshots or clone. An ErasureCodedPGBackend is derived
 from PGBackend.
 
-
-Glossary
---------
-
-* Stripe
-
-* Data chunk and parity chunk
-
-* Shard
-
-
 Reading and writing encoded chunks from and to OSDs
 ---------------------------------------------------
 An erasure coded pool stores each object as M+K chunks. It is divided
@@ -352,7 +341,7 @@ If *OSD 1* goes down while *S2D2* is still in flight, the payload is partially a
 
 The log entry *1,2* found on *OSD 3* is divergent from the new authoritative log provided by *OSD 4* : it is discarded and the file containing the *S2P1* chunk is truncated to the nearest multiple of the stripe size.
 
-Erasure code library
+`Erasure code library <http://tracker.ceph.com/issues/5878>`_
 --------------------
 
 Using `Reed-Solomon <https://en.wikipedia.org/wiki/Reed_Solomon>`_,
@@ -385,83 +374,57 @@ the encoding functions: smaller buffers will mean more calls and more
 overhead.
 
 Although Reed-Solomon is provided as a default, Ceph uses it via an
-abastract API designed to allow each pool to chose the plugin that
+abstract API designed to allow each pool to choose the plugin that
 implements it.
 ::
-  ceph osd pool set-erasure-code <pool> plugin-dir <dir>
-  ceph osd pool set-erasure-code <pool> plugin <plugin>
+  ceph osd pool create <pool> \
+     erasure-code-directory=<dir> \
+     erasure-code-plugin=<plugin>
 
 The *<plugin>* is dynamically loaded from *<dir>* (defaults to
-*/usr/lib/ceph/erasure-code-plugins* ) and expected to implement the
-*create_erasure_code_context* function
+*/usr/lib/ceph/erasure-code* ) and expected to implement the
+*void __erasure_code_init(char *plugin_name)* function
+which is responsible for registering an object derived from
+*ErasureCodePlugin* in the registry singleton :
+::
+  registry.plugins[plugin_name] = new ErasureCodePluginExample();
 
-* erasure_coding_t \*create_erasure_code_context(g_conf)
+The *ErasureCodePlugin* derived object must provide a factory method
+from which the concrete implementation of the *ErasureCodeInterface*
+object can be generated:
+::
+  virtual int factory(ErasureCodeInterfaceRef *erasure_code,
+                      const map<std::string,std::string> &parameters) {
+    *erasure_code = ErasureCodeInterfaceRef(new ErasureCodeExample(parameters));
+    return 0;
+  }
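The registration and factory steps described in the added lines above can be sketched as a small self-contained program. This is only an illustration under stated assumptions: the class names are borrowed from the text, but the declarations here are stand-ins and may differ from the real Ceph headers. `ErasureCodePluginRegistry::instance()`, `chunk_count()`, `example_erasure_code_init()` and `create_codec()` are hypothetical names introduced for the example, and no dynamic loading is performed.

```cpp
#include <map>
#include <memory>
#include <string>

// Stand-ins for the names used in the text; the real declarations may differ.
struct ErasureCodeInterface {
  virtual ~ErasureCodeInterface() {}
  virtual unsigned int chunk_count() const = 0;  // hypothetical method
};
typedef std::shared_ptr<ErasureCodeInterface> ErasureCodeInterfaceRef;

struct ErasureCodePlugin {
  virtual ~ErasureCodePlugin() {}
  virtual int factory(ErasureCodeInterfaceRef *erasure_code,
                      const std::map<std::string, std::string> &parameters) = 0;
};

// Toy registry singleton: maps a plugin name to the object it registered.
struct ErasureCodePluginRegistry {
  std::map<std::string, std::unique_ptr<ErasureCodePlugin>> plugins;
  static ErasureCodePluginRegistry &instance() {
    static ErasureCodePluginRegistry registry;
    return registry;
  }
};

class ErasureCodeExample : public ErasureCodeInterface {
public:
  explicit ErasureCodeExample(const std::map<std::string, std::string> &) {}
  unsigned int chunk_count() const override { return 3; }  // 2 data + 1 parity
};

class ErasureCodePluginExample : public ErasureCodePlugin {
public:
  int factory(ErasureCodeInterfaceRef *erasure_code,
              const std::map<std::string, std::string> &parameters) override {
    *erasure_code = ErasureCodeInterfaceRef(new ErasureCodeExample(parameters));
    return 0;
  }
};

// What the entry point of a plugin could do once the library is loaded.
void example_erasure_code_init(const std::string &plugin_name) {
  ErasureCodePluginRegistry::instance().plugins[plugin_name].reset(
      new ErasureCodePluginExample());
}

// Look a plugin up by name and ask it for a codec; -1 if it is not registered.
int create_codec(const std::string &plugin_name,
                 const std::map<std::string, std::string> &parameters,
                 ErasureCodeInterfaceRef *erasure_code) {
  auto &plugins = ErasureCodePluginRegistry::instance().plugins;
  auto it = plugins.find(plugin_name);
  if (it == plugins.end())
    return -1;
  return it->second->factory(erasure_code, parameters);
}
```

Keeping the registry keyed by plugin name is what would let each pool select its own plugin: the pool only carries a name and a parameter map, and the factory hides the concrete class.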
 
-  return an object configured to encode and decode according to a
-  given algorithm and a given set of parameters as specified in
-  g_conf. Parameters must be prefixed with erasure-code to avoid name
-  collisions
-  ::
-    ceph osd pool set-erasure-code <pool> m 10
-    ceph osd pool set-erasure-code <pool> k 3
-    ceph osd pool set-erasure-code <pool> algorithm Reed-Solomon
+The *parameters* is the list of *key=value* pairs that were set when the pool
+was created. Each *key* must be prefixed with erasure-code to avoid name collisions
+::
+  ceph osd pool create <pool> \
+     erasure-code-directory=<dir>         \ # mandatory
+     erasure-code-plugin=jerasure         \ # mandatory
+     erasure-code-m=10                    \ # optional and plugin dependant
+     erasure-code-k=3                     \ # optional and plugin dependant
+     erasure-code-algorithm=Reed-Solomon  \ # optional and plugin dependant
 
 Erasure code library abstract API
 ---------------------------------
 
-The following are methods of the abstract class erasure_coding_t.
-
-* set<int> minimum_to_decode(const set<int> &want_to_read, const set<int> &available_chunks);
-
-  returns the smallest subset of *available_chunks* that needs to be retrieved in order
-  to successfully decode *want_to_read* chunks.
-
-* set<int> minimum_to_decode_with_cost(const set<int> &want_to_read, const map<int, int> &available)
-
-  returns the minimum cost set required to read the specified
-  chunks given a mapping of available chunks to costs. The costs might
-  allow to consider the difference between reading local chunks vs
-  remote chunks.
-
-* map<int, buffer> encode(const set<int> &want_to_encode, const buffer &in)
-
-  encode the content of *in* and return a map associating the chunk
-  number with its encoded content. The map only contains the chunks
-  contained in the *want_to_encode* set. For instance, in the simplest
-  case M=2,K=1 for a buffer containing AB, calling
-  ::
-    encode([1,2,3], 'AB')
-    => { 1 => 'A', 2 => 'B', 3 => 'Z' }
-
-  If only the parity chunk is of interest, calling
-  ::
-    encode([3], 'AB')
-    => { 3 => 'Z' }
-
-
-* map<int, buffer> decode(const set<int> &want_to_read, const map<int, buffer> &chunks)
-
-  decode *chunks* to read the content of the *want_to_read* chunks and
-  return a map associating the chunk number with its decoded
-  content. For instance, in the simplest case M=2,K=1 for an
-  encoded payload of data A and B with parity Z, calling
-  ::
-    decode([1,2], { 1 => 'A', 2 => 'B', 3 => 'Z' })
-    => { 1 => 'A', 2 => 'B' }
-
-  If however, the chunk B is to be read but is missing it will be:
-  ::
-    decode([2], { 1 => 'A', 3 => 'Z' })
-    => { 2 => 'B' }
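The M=2, K=1 examples in the removed API description can be reproduced with a toy code. The sketch below is an assumption-laden illustration, not the Ceph interface and not Reed-Solomon: it uses a plain XOR parity as a stand-in (so the parity of `'A'` and `'B'` is the byte `'A' ^ 'B'` rather than the placeholder `'Z'`), `std::string` replaces the `buffer` type, and `chunk_map` and `xor_strings` are hypothetical names.

```cpp
#include <map>
#include <set>
#include <stdexcept>
#include <string>

// Toy code: data chunks 1 and 2, XOR parity chunk 3 (assumption, see lead-in).
typedef std::map<int, std::string> chunk_map;

// Byte-wise XOR of two chunks; chunks of one stripe have the same size.
static std::string xor_strings(const std::string &a, const std::string &b) {
  std::string out(a.size(), '\0');
  for (std::string::size_type i = 0; i < a.size(); ++i)
    out[i] = static_cast<char>(a[i] ^ b[i]);
  return out;
}

// Smallest subset of the available chunks needed to rebuild want_to_read.
std::set<int> minimum_to_decode(const std::set<int> &want_to_read,
                                const std::set<int> &available) {
  bool all_available = true;
  for (int chunk : want_to_read)
    if (!available.count(chunk))
      all_available = false;
  if (all_available)
    return want_to_read;  // nothing to reconstruct
  if (available.size() < 2)
    throw std::runtime_error("not enough chunks to decode");
  // Any two of the three chunks are enough; take the first two.
  std::set<int>::const_iterator it = available.begin();
  std::set<int> minimum;
  minimum.insert(*it++);
  minimum.insert(*it);
  return minimum;
}

// Split the input in two data chunks, add the parity, return the wanted ones.
chunk_map encode(const std::set<int> &want_to_encode, const std::string &in) {
  if (in.size() % 2 != 0)
    throw std::invalid_argument("input must split into two equal chunks");
  chunk_map all;
  all[1] = in.substr(0, in.size() / 2);
  all[2] = in.substr(in.size() / 2);
  all[3] = xor_strings(all[1], all[2]);
  chunk_map wanted;
  for (int chunk : want_to_encode)
    if (all.count(chunk))
      wanted[chunk] = all[chunk];
  return wanted;
}

// Return the wanted chunks, rebuilding a missing one from the other two.
chunk_map decode(const std::set<int> &want_to_read, const chunk_map &chunks) {
  chunk_map wanted;
  for (int chunk : want_to_read) {
    if (chunk < 1 || chunk > 3)
      throw std::invalid_argument("unknown chunk number");
    chunk_map::const_iterator found = chunks.find(chunk);
    if (found != chunks.end()) {
      wanted[chunk] = found->second;
      continue;
    }
    // With XOR parity, any chunk is the XOR of the two others.
    std::string rebuilt;
    bool first = true;
    for (int other = 1; other <= 3; ++other) {
      if (other == chunk)
        continue;
      chunk_map::const_iterator it = chunks.find(other);
      if (it == chunks.end())
        throw std::runtime_error("not enough chunks to decode");
      rebuilt = first ? it->second : xor_strings(rebuilt, it->second);
      first = false;
    }
    wanted[chunk] = rebuilt;
  }
  return wanted;
}
```

Reading chunk 2 when only chunks 1 and 3 are available mirrors the last example above: `minimum_to_decode` reports that both remaining chunks are needed, and `decode` rebuilds `'B'` from `'A'` and the parity.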
+.. doxygenfile:: ErasureCodeInterface.h
 
 Erasure code jerasure plugin
 ----------------------------
 
 The parameters interpreted by the jerasure plugin are:
 ::
-  ceph osd pool set-erasure-code <pool> m <unsigned int> (defaults 10)
-  ceph osd pool set-erasure-code <pool> k <unsigned int> (default 3)
-  ceph osd pool set-erasure-code <pool> algorithm <string> (default Reed-Solomon)
+  ceph osd pool create <pool> \
+     erasure-code-directory=<dir>         \ # plugin directory absolute path
+     erasure-code-plugin=jerasure         \ # plugin name (only jerasure)
+     erasure-code-m=<m>                   \ # data chunks (default 10)
+     erasure-code-k=<k>                   \ # parity chunks (default 3)
+     erasure-code-algorithm=Reed-Solomon  \ # algorithm (only Reed-Solomon)
 
 
 Scrubbing
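As an illustration of the jerasure parameters listed in the hunk above, here is one way a plugin could read them from the *key=value* map it receives. This is a sketch, not the jerasure plugin's code: `JerasureParameters`, `to_uint` and `parse_parameters` are hypothetical names, and the only things taken from the text are the `erasure-code-` prefixed keys and their defaults (10, 3 and Reed-Solomon).

```cpp
#include <cstdlib>
#include <map>
#include <string>

// Hypothetical holder for the values a plugin would extract.
struct JerasureParameters {
  unsigned int m;         // data chunks
  unsigned int k;         // parity chunks
  std::string algorithm;  // encoding algorithm name
};

// Read an unsigned value, falling back to a default when the key is absent
// or its value is not made of decimal digits only.
static unsigned int to_uint(const std::map<std::string, std::string> &parameters,
                            const std::string &key, unsigned int fallback) {
  std::map<std::string, std::string>::const_iterator it = parameters.find(key);
  if (it == parameters.end() || it->second.empty() ||
      it->second.find_first_not_of("0123456789") != std::string::npos)
    return fallback;
  return static_cast<unsigned int>(std::strtoul(it->second.c_str(), 0, 10));
}

JerasureParameters parse_parameters(
    const std::map<std::string, std::string> &parameters) {
  JerasureParameters parsed;
  parsed.m = to_uint(parameters, "erasure-code-m", 10);
  parsed.k = to_uint(parameters, "erasure-code-k", 3);
  std::map<std::string, std::string>::const_iterator it =
      parameters.find("erasure-code-algorithm");
  parsed.algorithm =
      it == parameters.end() ? std::string("Reed-Solomon") : it->second;
  return parsed;
}
```

Falling back to the documented defaults keeps the optional keys optional; a stricter plugin might instead reject malformed values.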
@@ -2,7 +2,7 @@
 PG Backend Proposal
 ===================
 
-See also `PGBackend.h <PGBackend>`_
+See also `PGBackend.h <../PGBackend-h>`_
 
 Motivation
 ----------