Merge pull request #20131 from jcsp/wip-doc-encoding

doc/dev: add a brief guide to serialization

Reviewed-by: Kefu Chai <kchai@redhat.com>
John Spray 2018-03-05 22:33:34 +00:00 committed by GitHub
commit 86d7f3f72c

doc/dev/encoding.rst Normal file

@@ -0,0 +1,95 @@
Serialization (encode/decode)
=============================

When a structure is sent over the network or written to disk, it is
encoded into a string of bytes. Serializable structures have
``encode`` and ``decode`` methods that write and read from ``bufferlist``
objects representing byte strings.

Adding a field to a structure
-----------------------------

You can see examples of this all over the Ceph code, but here's an
example:

::

    class AcmeClass
    {
      int member1;
      std::string member2;

      void encode(bufferlist &bl)
      {
        ENCODE_START(1, 1, bl);
        ::encode(member1, bl);
        ::encode(member2, bl);
        ENCODE_FINISH(bl);
      }

      void decode(bufferlist::iterator &bl)
      {
        DECODE_START(1, bl);
        ::decode(member1, bl);
        ::decode(member2, bl);
        DECODE_FINISH(bl);
      }
    };

The ``ENCODE_START`` macro writes a header that specifies a *version* and
a *compat_version* (both initially 1). The message version is incremented
whenever a change is made to the encoding. The compat_version is incremented
only if the change will break existing decoders -- decoders are tolerant
of trailing bytes, so changes that add fields at the end of the structure
do not require incrementing compat_version.

The ``DECODE_START`` macro takes an argument specifying the most recent
message version that the code can handle. This is compared with the
compat_version encoded in the message, and if the message is too new then
an exception will be thrown. Because changes to compat_version are rare,
this isn't usually something to worry about when adding fields.
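
To make the header concrete, here is a minimal standalone sketch of a
versioned envelope. It does not use Ceph's macros or ``bufferlist``:
``Bytes``, ``Envelope``, ``wrap``, ``unwrap`` and ``demo`` are hypothetical
names, and the layout (one byte each for version and compat_version, then a
32-bit payload length) is an assumption made for illustration. The length
field is what would let a decoder skip trailing bytes it does not understand.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Toy byte buffer; a stand-in for illustration, not Ceph's bufferlist.
using Bytes = std::vector<std::uint8_t>;

// Assumed header size: version (1) + compat_version (1) + length (4).
constexpr std::size_t kHeaderSize = 6;

// Writer side: prefix the payload with version, compat_version and length.
Bytes wrap(std::uint8_t version, std::uint8_t compat, const Bytes &payload) {
  Bytes out{version, compat};
  const auto len = static_cast<std::uint32_t>(payload.size());
  for (int i = 0; i < 4; ++i) {
    // Store the length in little-endian byte order.
    out.push_back(static_cast<std::uint8_t>((len >> (8 * i)) & 0xff));
  }
  out.insert(out.end(), payload.begin(), payload.end());
  return out;
}

struct Envelope {
  std::uint8_t struct_v;  // version the writer used
  Bytes payload;          // exactly `length` bytes, whatever the writer's version
};

// Reader side: `supported` is the newest version this reader can handle.
Envelope unwrap(std::uint8_t supported, const Bytes &in) {
  if (in.size() < kHeaderSize) throw std::runtime_error("truncated header");
  const std::uint8_t struct_v = in[0];
  const std::uint8_t compat = in[1];
  // The check is against compat_version, not version.
  if (compat > supported) throw std::runtime_error("message too new to decode");
  std::uint32_t len = 0;
  for (int i = 0; i < 4; ++i) {
    len |= static_cast<std::uint32_t>(in[2 + i]) << (8 * i);
  }
  if (len > in.size() - kHeaderSize) throw std::runtime_error("truncated payload");
  const auto first = in.begin() + kHeaderSize;
  return Envelope{struct_v, Bytes(first, first + len)};
}

// Small self-check; call it from main() to exercise the sketch.
void demo() {
  const Bytes msg = wrap(2, 1, Bytes{0xAA, 0xBB});
  assert(unwrap(1, msg).struct_v == 2);  // newer version, compat still 1: accepted
  bool rejected = false;
  try {
    unwrap(1, wrap(3, 2, Bytes{}));  // compat 2 > supported 1: rejected
  } catch (const std::runtime_error &) {
    rejected = true;
  }
  assert(rejected);
}
```

In this model a reader that supports version 1 accepts a version 2 message
as long as its compat_version is still 1, and throws only when the
compat_version itself is newer than what the reader supports.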

In practice, changes to encoding usually involve simply adding the desired fields
at the end of the ``encode`` and ``decode`` functions, and incrementing
the versions in ``ENCODE_START`` and ``DECODE_START``. For example, here's how
to add a third field to ``AcmeClass``:

::

    class AcmeClass
    {
      int member1;
      std::string member2;
      std::vector<std::string> member3;  // new in version 2

      void encode(bufferlist &bl)
      {
        ENCODE_START(2, 1, bl);  // version is now 2, compat_version stays 1
        ::encode(member1, bl);
        ::encode(member2, bl);
        ::encode(member3, bl);   // new fields go at the end
        ENCODE_FINISH(bl);
      }

      void decode(bufferlist::iterator &bl)
      {
        DECODE_START(2, bl);     // this code understands up to version 2
        ::decode(member1, bl);
        ::decode(member2, bl);
        if (struct_v >= 2) {     // older messages do not carry member3
          ::decode(member3, bl);
        }
        DECODE_FINISH(bl);
      }
    };

Note that the compat_version did not change because the encoded message
will still be decodable by versions of the code that only understand
version 1 -- they will just ignore the trailing bytes where we encode ``member3``.

In the ``decode`` function, decoding the new field is conditional: this is
because we might still be passed older-versioned messages that do not
have the field. The ``struct_v`` variable is a local set by the ``DECODE_START``
macro.
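
The interaction between old and new code can be sketched with a toy model
outside of Ceph. Everything below is hypothetical: ``encode_acme`` and
``decode_acme`` take the writer's version and the reader's supported version
as parameters so that one program can play both roles, and the byte layout
is assumed for illustration, not taken from Ceph's wire format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

using Bytes = std::vector<std::uint8_t>;  // toy stand-in for bufferlist

void put_u32(Bytes &out, std::uint32_t v) {
  for (int i = 0; i < 4; ++i) out.push_back(static_cast<std::uint8_t>((v >> (8 * i)) & 0xff));
}
void put_str(Bytes &out, const std::string &s) {
  put_u32(out, static_cast<std::uint32_t>(s.size()));  // length-prefixed string
  out.insert(out.end(), s.begin(), s.end());
}

// Minimal cursor over a byte buffer; throws if asked to read past the end.
struct Reader {
  const Bytes &buf;
  std::size_t pos = 0;
  explicit Reader(const Bytes &b) : buf(b) {}
  void need(std::size_t n) const {
    if (n > buf.size() - pos) throw std::runtime_error("truncated");
  }
  std::uint8_t u8() { need(1); return buf[pos++]; }
  std::uint32_t u32() {
    need(4);
    std::uint32_t v = 0;
    for (int i = 0; i < 4; ++i) v |= static_cast<std::uint32_t>(buf[pos++]) << (8 * i);
    return v;
  }
  std::string str() {
    const std::uint32_t n = u32();
    need(n);
    std::string s(buf.begin() + pos, buf.begin() + pos + n);
    pos += n;
    return s;
  }
};

struct Acme {
  std::int32_t member1 = 0;
  std::string member2;
  std::vector<std::string> member3;  // added in version 2
};

// Encode as a writer at `version` (1 or 2) would; compat_version stays 1.
Bytes encode_acme(const Acme &a, std::uint8_t version) {
  Bytes body;
  put_u32(body, static_cast<std::uint32_t>(a.member1));
  put_str(body, a.member2);
  if (version >= 2) {
    put_u32(body, static_cast<std::uint32_t>(a.member3.size()));
    for (const auto &s : a.member3) put_str(body, s);
  }
  Bytes out{version, 1};  // header: version, compat_version
  put_u32(out, static_cast<std::uint32_t>(body.size()));
  out.insert(out.end(), body.begin(), body.end());
  return out;
}

// Decode as a reader that understands versions up to `supported`.
Acme decode_acme(const Bytes &in, std::uint8_t supported) {
  Reader r(in);
  const std::uint8_t struct_v = r.u8();
  const std::uint8_t compat = r.u8();
  if (compat > supported) throw std::runtime_error("message too new to decode");
  const std::uint32_t len = r.u32();
  r.need(len);
  const std::size_t end = r.pos + len;
  Acme a;
  a.member1 = static_cast<std::int32_t>(r.u32());
  a.member2 = r.str();
  // A version 1 reader has no code for member3; a version 2 reader reads it
  // only when the message actually carries it.
  if (supported >= 2 && struct_v >= 2) {
    const std::uint32_t n = r.u32();
    for (std::uint32_t i = 0; i < n; ++i) a.member3.push_back(r.str());
  }
  r.pos = end;  // skip any trailing bytes this reader does not understand
  return a;
}

// Small self-check; call it from main() to exercise the sketch.
void demo() {
  Acme a;
  a.member1 = 7;
  a.member2 = "hi";
  a.member3 = {"x", "yz"};
  assert(decode_acme(encode_acme(a, 2), 1).member3.empty());      // old reads new
  assert(decode_acme(encode_acme(a, 1), 2).member3.empty());      // new reads old
  assert(decode_acme(encode_acme(a, 2), 2).member3.size() == 2);  // new reads new
}
```

The ``supported >= 2`` half of the condition exists only so that one function
can stand in for both generations of the code. In a real decoder that half is
implicit in which code was compiled, leaving just the ``struct_v >= 2`` check.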