Now, the function htx_from_buf() will set the buffer's length to its size
automatically. In return, the caller should call htx_to_buf() at the end to be
sure to leave the buffer hosting the HTX message in the right state. When the
caller can use the function htxbuf() to get the HTX message without any update
on the underlying buffer.
The small HTX overhead is enough to make the system perform multiple
reads and unaligned memory copies. Here we provide a function whose
purpose is to reduce the apparent room in a buffer by the size of the
overhead for DATA blocks, which is the struct htx plus 2 blocks (one
for DATA, one for the end of message so that small blocks can fit at
once). The muxes using HTX will be encouraged to use this one instead
of b_room() to compute the available buffer room and avoid filling
their demux buf with more data than can fit at once into the HTX
buffer.
This one is used a lot during transfers, let's avoid resetting its
size when there are already data in the buffer since it implies the
size is correct.
Instead, we now use the htx_sl coming from the HTX message. It avoids to have
too H1 specific code in version-agnostic parts. Of course, the concept of the
start-line is higly influenced by the H1, but the structure htx_sl can be
adapted, if necessary. And many things depend on a start-line during HTTP
analyzis. Using the structure htx_sl also avoid boring conversions between HTX
version and H1 version.
If there is no start-line, this offset is set to -1. Otherwise, it is the
relative address where the start-line is stored in the data block. When the
start-line is added, replaced or removed, this offset is updated accordingly. On
remove, if the start-line is no set and if the next block is a start-line, the
offset is updated. Finally, when an HTX structure is defragmented, the offset is
also updated accordingly.
The HTX start-line is now a struct. It will be easier to extend, if needed. Same
info can be found, of course. In addition it is now possible to set flags on
it. It will be used to set some infos about the message.
Some macros and functions have been added in proto/htx.h to help accessing
different parts of the start-line.
The function htx_find_blk() returns the HTX block containing data with a given
offset, relatively to the beginning of the HTX message. It is a good way to skip
outgoing data and find the first HTX block not already processed.
the functions htx_get_next() and htx_get_prev() are used to iterate on an HTX
message using blocks position. With htx_get_next_blk() and htx_get_prev_blk(),
it is possible to do the same, but with HTX blocks. Of course, internally, we
rely on position's versions to do so. But it is handy for callers to not take
care of the blocks position.
The function htx_add_data_before() can be used to add an HTX block before
another one. For instance, it could be used to add some data before the
end-of-message marker.
htx_cut_data_blk() is used to cut the beginning of a DATA block after a
part of it was tranferred. It simply advances the address, reduces the
advertised length and updates the htx's total data count.
Building with musl and gcc-5.3 for MIPS returns this :
include/common/buf.h: In function 'b_dist':
include/common/buf.h:252:2: error: unknown type name 'ssize_t'
ssize_t dist = to - from;
^
Including stdint or stddef is not sufficient there to get ssize_t,
unistd is needed as well. It's likely that other platforms will have
the same issue. This patch also addresses it in ist.h and memory.h.
Building on 32 bits gives this :
include/proto/htx.h: In function 'htx_dump':
include/proto/htx.h:443:25: warning: format '%lu' expects argument of type 'long unsigned int', but argument 8 has type 'uint64_t {aka long long unsigned int}' [-Wformat=]
fprintf(stderr, "htx:%p [ size=%u - data=%u - used=%u - wrap=%s - extra=%lu]\n",
^
In htx_dump(), fprintf() uses %lu but the value is an uint64_t so it
doesn't match on 32-bit. Let's cast this to unsigned long long and use
%llu instead.
The internal representation of an HTTP message, called HTX, is a structured
representation, unlike the old one which is a raw representation of
messages. Idea is to have a version-agnostic representation of the HTTP
messages, which can be easily used by to handle HTTP/1, HTTP/2 and hopefully
QUIC messages, and communication from one of them to another.
In this patch, we add types to define the internal representation itself and the
main functions to manipulate them.