Commit 2c315ee75e ("BUG/MEDIUM: ebtree: don't set attribute packed
without unaligned access support") addressed alignment issues in
ebtrees in a way that is not really optimal since it will leave holes
in eb32trees for example.
This fix is better in that it restores the packed attribute on ebnode
but enforces proper alignment on the carrying nodes where necessary.
This also has the benefit of closing holes wherever possible and to
align data to the minimally required size.
The only thing it cannot close is the 32-bit hole at the end of ebmbnode
due to the required 64-bit on certain archs but at least it guarantees
that the key correctly points to the end of the node and that there is
never a hole after it.
This is a better fix than the one above and should be backported to
branches where the one above will be backported.
This used to be a minor optimization on ix86 where registers are scarce
and the calling convention not very efficient, but this platform is not
relevant enough anymore to warrant all this dirt in the code for the sake
of saving 1 or 2% of performance. Modern platforms don't use this at all
since their calling convention already defaults to using several registers
so better get rid of this once for all.
Last commit 2c315ee ("BUG/MEDIUM: ebtree: don't set attribute packed
without unaligned access support") accidently enclosed the semicolon
in the ifdef so it will fail when HA_UNALIGNED is not set.
An alignment issue on Sparc64 with ebtrees was reported in early 2017
here https://www.mail-archive.com/haproxy@formilux.org/msg25937.html and
a similar one was finally reported in issue #512.
The problem has its roots in the fact that 64-bit keys will end up being
unaligned on such archs which do not support unaligned accesses. But on
most platforms supporting unaligned accesses, dealing with smaller nodes
results in better performance.
One of the possible problems caused by attribute packed there is that it
promotes a structure both to be unaligned and unpadded, which may come
with fun if some fields of the struct itself are accessed and such a node
is placed at an unaligned location. It's not a problem for regular unaligned
accesses but may become one for atomic operations such as the CAS on leaf_p
that's used in struct task. In practice we know that this struct is properly
aligned and is a very edge case so this patch adds comments there to remind
to be careful about it.
This patch depends on previous patch "MINOR: compiler: move CPU capabilities
definition from config.h and complete them" and could be backported to all
stable branches. It fixes issues #512 and #9.
For whatever absurd reason these ones do not take a const, resulting in
some haproxy functions being forced to confusingly use variables instead
of const arguments. Let's fix this and backport it to older versions.
struct eb_node is 36 bytes on a 64-bit machine. It's thus rounded
up to 40 bytes, and when forming a struct eb32_node, another 4 bytes
are added, rounded up to 48 bytes. We waste 8 bytes of space on 48
bytes because of alignments. It's basically the same with memory
blocks and immediate strings.
By packing the structure, eb32_node is down to 40 bytes. This saves
16 bytes per struct task and 20 bytes per struct stksess, used to
store each stick-table key.
Sometimes it's very useful to visit duplicates of a same node, but doing
so from the application is not convenient because keys have to be compared,
while all the information is available during the next/prev steps.
Let's introduce a couple of new eb_next_dup/eb_prev_dup functions to visit
only duplicates of the current node and return NULL once it's done. Now we
have all 3 combinations :
- next : returns next node in the tree
- next_dup : returns next dup in the sub-tree
- next_unique : returns next value after skipping dups
(cherry picked from commit 3327b8ae6866f3878322a1a29e70b450226d216d)
(from ebtree 6.0.7)
This typo has been there since we introduced duplicates. A "struct eb_troot *"
which apparently the compiler doesn't complain about while it is never declared
anywhere. Amazing...
(cherry picked from commit 2879648db5d32cf009ae571cb0e8e1df75152281)
(from ebtree 6.0.6)
This version is mainly aimed at clarifying the fact that the ebtree license
is LGPL. Some files used to indicate LGPL and other ones GPL, while the goal
clearly is to have it LGPL. A LICENSE file has also been added.
No code is affected, but it's better to have the local tree in sync anyway.
(cherry picked from commit 24dc7cca051f081600fe8232f33e55ed30e88425)
(from ebtree 6.0.6)
Care has been taken not to make the code bigger (it even got smaller
due to a possible simplification).
(cherry picked from commit 7a2c1df646049c7daac52677ec11ed63048cd150)
(update to ebtree 6.0.4)
Recent fix fd301cc1370cd4977fe175dfa4544c7dc0e7ce6b was not OK because it
was returning one excess byte, causing some duplicates not to be detected.
The reason is that we added 8 bits to count the trailing zero but they
were implied by the pre-incrementation of the pointer.
Fixing this was still not enough, as the problem appeared when
string_equal_bits() was applied on two identical strings, and it returned
a number of bits covering the trailing zero. Subsequent calls were applied
to the first byte after this trailing zero. It was often zero when doing
insertion from raw files, explaining why the issue was not discovered
earlier. But when the data is from a reused area, duplicate strings are not
correctly detected when inserting into the tree.
Several solutions were tested, and the only efficient one consists in making
string_equal_bits() notify the caller that the end of the string was reached.
It now returns zero and the callers just have to ensure that when they get a
zero, they stop using that bit until a dup tree or a leaf is encountered.
This fix brought the unexpected bonus of simplifying the insertion code a bit
and making it slightly faster to process duplicates.
The impact for haproxy was that if many similar string patterns were loaded
from a file, there was a potential risk that their insertion or matching could
have been slower. The bigger impact was with the URL sorting feature of halog,
which is not yet merged and is how this bug was discovered.
(cherry picked from commit 518d59ec9ba43705f930f9ece3749c450fd005df)
(from ebtree 6.0.2)
When inserting duplicates on x86/x86_64, the assembler optimization
does not support equal strings that both end up with a zero, and
can return garbage in the bit number, possibly causing a segfault
for its users. The only case where this can happen appears to be
in ebst_insert().
(cherry picked from commit 006152c62ae56d151188626e6074a79be3928858)
This version adds support for prefix-based matching of memory blocks,
as well as some code-size and performance improvements on the generic
code. It provides a prefix insertion and longest match which are
compatible with the rest of the common features (walk, duplicates,
delete, ...). This is typically used for network address matching. The
longest-match code is a bit slower than the original memory block
handling code, so they have not been merged together into generic
code. Still it's possible to perform about 10 million networks lookups
per second in a set of 50000, so this should be enough for most usages.
This version also fixes some bugs in parts that were not used, so there
is no need to backport them.
It's a pain to enable regparm because ebtree is built in its corner
and does not depend on the rest of the config. This causes no problem
except that if the regparm settings are not exactly similar, then we
can get inconsistent function interfaces and crashes.
One solution realized in this patch consists in externalizing all
compiler settings and changing CONFIG_XXX_REGPARM into CONFIG_REGPARM
so that we ensure that any sub-component uses the same setting. Since
ebtree used a value here and not a boolean, haproxy's config has been
set to use a number too. Both haproxy's core and ebtree currently use
the same copy of the compiler.h file. That way we don't have any issue
anymore when one setting changes somewhere.
We needed to upgrade ebtree to v5.0 to support string indexing,
and it was getting very painful to have it split across 2 dirs
and to have to patch it. Now we just have to copy the .c and .h
files to the right place.