Calling tolower/toupper for each character is slow, a lookup into a
256-byte table is cheaper, especially for common characters used in
header field names which all fit into a cache line. Let's create these
two variables marked weak so that they're included only once.
The ist functions were missing functions to copy an IST into a target
buffer, making some code have to resort to memcpy(), which tends to be
overkill for small strings, that the compiler cannot guess. In addition
sometimes there is a need to turn a string to lower or upper case so it
had to be overwritten after the operation.
This patch adds 6 functions to copy an ist to a buffer, as binary or as a
string (i.e. a zero is or is not appended), and optionally to apply a
lower case or upper case transformation on the fly.
A number of tests were performed to optimize the processing for small
strings. The loops are marked unlikely to dissuade the compilers from
over-optimizing them and switching to SIMD instructions. The lower case
or upper case transformations used to rely on external functions for
each character and to crappify the code due to clobbered registers,
which is not acceptable when we know that only a certain class of chars
has to be transformed, so the test was open-coded.
Building with musl and gcc-5.3 for MIPS returns this :
include/common/buf.h: In function 'b_dist':
include/common/buf.h:252:2: error: unknown type name 'ssize_t'
ssize_t dist = to - from;
^
Including stdint or stddef is not sufficient there to get ssize_t,
unistd is needed as well. It's likely that other platforms will have
the same issue. This patch also addresses it in ist.h and memory.h.
This function modifies the string to add a zero after the end, and returns
the start pointer. The purpose is to use it on strings extracted by parsers
from larger strings cut with delimiters that are not important and can be
destroyed. It allows any such string to be used with regular string
functions. It's also convenient to use with printf() to show data extracted
from writable areas.
It's not possible to use strlen() in const arrays even with const
strings, but we can use sizeof-1 via a macro. Let's provide this in
the IST() macro, as it saves the developer from having to count the
characters.
For HPACK we'll need to perform a lot of string manipulation between the
dynamic headers table and the output stream, and we need an efficient way
to deal with that, considering that the zero character is not an end of
string marker here. It turns out that gcc supports returning structs from
functions and is able to place up to two words directly in registers when
-freg-struct is used, which is the case by default on x86 and armv8. On
other architectures the caller reserves some stack space where the callee
can write, which is equivalent to passing a pointer to the return value.
So let's implement a few functions to deal with this as the resulting code
will be optimized on certain architectures where retrieving the length of
a string will simply consist in reading one of the two returned registers.
Extreme care was taken to ensure that the compiler gets maximum opportunities
to optimize out every bit of unused code. This is also the reason why no
call to regular string functions (such as strlen(), memcmp(), memcpy() etc)
were used. The code involving them is often larger than when they are open
coded. Given that strings are usually very small, especially when manipulating
headers, the time spent calling a function optimized for large vectors often
ends up being higher than the few cycles needed to count a few bytes.
An issue was met with __builtin_strlen() which can automatically convert
a constant string to its constant length. It doesn't accept NULLs and there
is no way to hide them using expressions as the check is made before the
optimizer is called. On gcc 4 and above, using an intermediary variable
is enough to hide it. On older versions, calls to ist() with an explicit
NULL argument will issue a warning. There is normally no reason to do this
but taking care of it the best possible still seems important.