struct eb_node is 36 bytes on a 64-bit machine. It's thus rounded
up to 40 bytes, and when forming a struct eb32_node, another 4 bytes
are added, rounded up to 48 bytes. We waste 8 bytes of space on 48
bytes because of alignments. It's basically the same with memory
blocks and immediate strings.
By packing the structure, eb32_node is down to 40 bytes. This saves
16 bytes per struct task and 20 bytes per struct stksess, used to
store each stick-table key.
Sometimes it's very useful to visit duplicates of a same node, but doing
so from the application is not convenient because keys have to be compared,
while all the information is available during the next/prev steps.
Let's introduce a couple of new eb_next_dup/eb_prev_dup functions to visit
only duplicates of the current node and return NULL once it's done. Now we
have all 3 combinations :
- next : returns next node in the tree
- next_dup : returns next dup in the sub-tree
- next_unique : returns next value after skipping dups
(cherry picked from commit 3327b8ae6866f3878322a1a29e70b450226d216d)
(from ebtree 6.0.7)
This typo has been there since we introduced duplicates. A "struct eb_troot *"
which apparently the compiler doesn't complain about while it is never declared
anywhere. Amazing...
(cherry picked from commit 2879648db5d32cf009ae571cb0e8e1df75152281)
(from ebtree 6.0.6)
This version is mainly aimed at clarifying the fact that the ebtree license
is LGPL. Some files used to indicate LGPL and other ones GPL, while the goal
clearly is to have it LGPL. A LICENSE file has also been added.
No code is affected, but it's better to have the local tree in sync anyway.
(cherry picked from commit 24dc7cca051f081600fe8232f33e55ed30e88425)
(from ebtree 6.0.6)
Care has been taken not to make the code bigger (it even got smaller
due to a possible simplification).
(cherry picked from commit 7a2c1df646049c7daac52677ec11ed63048cd150)
(update to ebtree 6.0.4)
Recent fix fd301cc1370cd4977fe175dfa4544c7dc0e7ce6b was not OK because it
was returning one excess byte, causing some duplicates not to be detected.
The reason is that we added 8 bits to count the trailing zero but they
were implied by the pre-incrementation of the pointer.
Fixing this was still not enough, as the problem appeared when
string_equal_bits() was applied on two identical strings, and it returned
a number of bits covering the trailing zero. Subsequent calls were applied
to the first byte after this trailing zero. It was often zero when doing
insertion from raw files, explaining why the issue was not discovered
earlier. But when the data is from a reused area, duplicate strings are not
correctly detected when inserting into the tree.
Several solutions were tested, and the only efficient one consists in making
string_equal_bits() notify the caller that the end of the string was reached.
It now returns zero and the callers just have to ensure that when they get a
zero, they stop using that bit until a dup tree or a leaf is encountered.
This fix brought the unexpected bonus of simplifying the insertion code a bit
and making it slightly faster to process duplicates.
The impact for haproxy was that if many similar string patterns were loaded
from a file, there was a potential risk that their insertion or matching could
have been slower. The bigger impact was with the URL sorting feature of halog,
which is not yet merged and is how this bug was discovered.
(cherry picked from commit 518d59ec9ba43705f930f9ece3749c450fd005df)
(from ebtree 6.0.2)
When inserting duplicates on x86/x86_64, the assembler optimization
does not support equal strings that both end up with a zero, and
can return garbage in the bit number, possibly causing a segfault
for its users. The only case where this can happen appears to be
in ebst_insert().
(cherry picked from commit 006152c62ae56d151188626e6074a79be3928858)
This version adds support for prefix-based matching of memory blocks,
as well as some code-size and performance improvements on the generic
code. It provides a prefix insertion and longest match which are
compatible with the rest of the common features (walk, duplicates,
delete, ...). This is typically used for network address matching. The
longest-match code is a bit slower than the original memory block
handling code, so they have not been merged together into generic
code. Still it's possible to perform about 10 million networks lookups
per second in a set of 50000, so this should be enough for most usages.
This version also fixes some bugs in parts that were not used, so there
is no need to backport them.
It's a pain to enable regparm because ebtree is built in its corner
and does not depend on the rest of the config. This causes no problem
except that if the regparm settings are not exactly similar, then we
can get inconsistent function interfaces and crashes.
One solution realized in this patch consists in externalizing all
compiler settings and changing CONFIG_XXX_REGPARM into CONFIG_REGPARM
so that we ensure that any sub-component uses the same setting. Since
ebtree used a value here and not a boolean, haproxy's config has been
set to use a number too. Both haproxy's core and ebtree currently use
the same copy of the compiler.h file. That way we don't have any issue
anymore when one setting changes somewhere.
We needed to upgrade ebtree to v5.0 to support string indexing,
and it was getting very painful to have it split across 2 dirs
and to have to patch it. Now we just have to copy the .c and .h
files to the right place.