doc/dev: improve EC glossary

Improve the clarity and syntax of the text in
doc/dev/osd_internals/erasure_coding.rst.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
Zac Dover 2022-10-31 13:17:45 +10:00
parent 0bb4bf67c2
commit 64f10fc14b


@@ -6,47 +6,50 @@ Glossary
--------
*chunk*
when the encoding function is called, it returns chunks of the same
size. Data chunks which can be concatenated to reconstruct the original
object and coding chunks which can be used to rebuild a lost chunk.
when the encoding function is called, it returns chunks of the same size.
There are two kinds of chunks: (1) data chunks, which can be concatenated to
reconstruct the original object, and (2) coding chunks, which can be used to
rebuild a lost chunk.
*chunk rank*
the index of a chunk when returned by the encoding function. The
rank of the first chunk is 0, the rank of the second chunk is 1
etc.
the index of a chunk, as determined by the encoding function. The
rank of the first chunk is 0, the rank of the second chunk is 1,
and so on.
*stripe*
when an object is too large to be encoded with a single call,
each set of chunks created by a call to the encoding function is
called a stripe.
if an object is so large that encoding it requires more than one call to the
encoding function, each of these calls will create a set of chunks called a
*stripe*.
*shard|strip*
*shard* (also called *strip*)
an ordered sequence of chunks of the same rank from the same
object. For a given placement group, each OSD contains shards of
object. For a given placement group, each OSD contains shards of
the same rank. When dealing with objects that are encoded with a
single operation, *chunk* is sometime used instead of *shard*
single operation, *chunk* is sometimes used instead of *shard*
because the shard is made of a single chunk. The *chunks* in a
*shard* are ordered according to the rank of the stripe they belong
to.
*K*
the number of data *chunks*, i.e. the number of *chunks* in which the
original object is divided. For instance if *K* = 2 a 10KB object
will be divided into *K* objects of 5KB each.
the number of data *chunks* into which an object is divided. For example,
if *K* = 2, then a 10KB object is divided into two chunks of 5KB each.
*M*
the number of coding *chunks*, i.e. the number of additional *chunks*
computed by the encoding functions. If there are 2 coding *chunks*,
it means 2 OSDs can be out without losing data.
the number of coding *chunks* (the number of chunks in addition to the "data
chunks") computed by the encoding functions. *M* is equal to the number of
OSDs that can be lost from the cluster without the cluster suffering data
loss. For example, if there are two coding *chunks*, then two OSDs can be
down without data loss.
*N*
the number of data *chunks* plus the number of coding *chunks*,
i.e. *K+M*.
the number of data *chunks* plus the number of coding *chunks*: that is, *K* + *M*.
*rate*
the proportion of the *chunks* that contains useful information, i.e. *K/N*.
For instance, for *K* = 9 and *M* = 3 (i.e. *K+M* = *N* = 12) the rate is
*K* = 9 / *N* = 12 = 0.75, i.e. 75% of the chunks contain useful information.
the proportion of the *chunks* containing useful information: that is, *K*
divided by *N*. For example, suppose that *K* = 9 and *M* = 3. This would
mean that *N* = 12 (because *K* + *M* = 9 + 3). Therefore, the rate (*K* /
*N*) is 9 / 12 = 0.75. In other words, 75% of the chunks contain useful
information.
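The arithmetic behind *K*, *M*, *N*, and *rate*, and the way chunks of the same rank are grouped into shards, can be sketched in a few lines of Python. This is only an illustration of the glossary terms, not Ceph code: the names ``ECProfile``, ``split_into_stripes``, and ``shards_by_rank`` are hypothetical, zero-padding of the last stripe is an assumption, and no coding chunks are computed (only data chunks are shown).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ECProfile:
    """Hypothetical container for the glossary's K and M values."""
    k: int  # number of data chunks
    m: int  # number of coding chunks

    @property
    def n(self) -> int:
        # N = K + M
        return self.k + self.m

    @property
    def rate(self) -> float:
        # rate = K / N
        return self.k / self.n

    def data_chunk_size(self, object_size: int) -> int:
        # Size of each data chunk when the object is encoded in one call.
        # Ceiling division: assumes the last chunk is padded if needed.
        return -(-object_size // self.k)


def split_into_stripes(data: bytes, k: int, chunk_size: int) -> list[list[bytes]]:
    """Cut `data` into stripes of k data chunks each (coding chunks omitted)."""
    stripe_width = k * chunk_size
    stripes = []
    for offset in range(0, len(data), stripe_width):
        # Assumption: a short final stripe is padded with zero bytes.
        raw = data[offset:offset + stripe_width].ljust(stripe_width, b"\0")
        stripes.append(
            [raw[rank * chunk_size:(rank + 1) * chunk_size] for rank in range(k)]
        )
    return stripes


def shards_by_rank(stripes: list[list[bytes]]) -> list[list[bytes]]:
    """Group chunks of the same rank, ordered by the stripe they belong to."""
    return [list(shard) for shard in zip(*stripes)]


profile = ECProfile(k=9, m=3)
stripes = split_into_stripes(b"abcdefgh", k=2, chunk_size=2)
shards = shards_by_rank(stripes)
```

With ``k=9`` and ``m=3``, ``profile.n`` is 12 and ``profile.rate`` is 0.75, matching the example in the *rate* entry. In the toy object above, the shard of rank 0 holds the first chunk of every stripe and the shard of rank 1 holds the second.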
The definitions are illustrated as follows (PG stands for placement group):
::
@@ -71,8 +74,8 @@ The definitions are illustrated as follows (PG stands for placement group):
| ... | | ... |
+-------------------------+ +-------------------------+
Table of content
----------------
Table of contents
-----------------
.. toctree::
:maxdepth: 1