Commit 2c315ee75e ("BUG/MEDIUM: ebtree: don't set attribute packed
without unaligned access support") addressed alignment issues in
ebtrees in a way that is not really optimal since it will leave holes
in eb32trees for example.
This fix is better in that it restores the packed attribute on ebnode
but enforces proper alignment on the carrying nodes where necessary.
This also has the benefit of closing holes wherever possible and of
aligning data to the minimally required size.
The only thing it cannot close is the 32-bit hole at the end of ebmbnode,
due to the 64-bit alignment required on certain archs, but at least it
guarantees that the key correctly points to the end of the node and that
there is never a hole after it.
This is a better fix than the one above and should be backported to
branches where the one above will be backported.
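As an illustration of the approach, a minimal sketch follows. HA_UNALIGNED
is the macro named in these messages, but EB_KEY_ALIGN is a made-up helper
and the exact attribute placement in the real code may differ; the types
are the ones from eb64tree.h.

    #if defined(HA_UNALIGNED)
    # define EB_KEY_ALIGN(size)         /* unaligned loads are cheap: pack tight */
    #else
    # define EB_KEY_ALIGN(size) __attribute__((aligned(size)))
    #endif

    struct eb64_node {
            struct eb_node node;        /* 36 bytes once eb_node is packed again */
            u64 key EB_KEY_ALIGN(sizeof(u64)); /* right after the node, or pushed to
                                                * the next 64-bit boundary on archs
                                                * which cannot do unaligned accesses */
    } __attribute__((packed));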
This used to be a minor optimization on ix86 where registers are scarce
and the calling convention not very efficient, but this platform is not
relevant enough anymore to warrant all this dirt in the code for the sake
of saving 1 or 2% of performance. Modern platforms don't use this at all
since their calling convention already defaults to using several registers,
so we'd better get rid of this once and for all.
Last commit 2c315ee ("BUG/MEDIUM: ebtree: don't set attribute packed
without unaligned access support") accidentally enclosed the semicolon
in the ifdef so it will fail when HA_UNALIGNED is not set.
An alignment issue on Sparc64 with ebtrees was reported in early 2017
here https://www.mail-archive.com/haproxy@formilux.org/msg25937.html and
a similar one was finally reported in issue #512.
The problem has its roots in the fact that 64-bit keys will end up being
unaligned on such archs which do not support unaligned accesses. But on
most platforms supporting unaligned accesses, dealing with smaller nodes
results in better performance.
One of the possible problems caused by attribute packed there is that it
promotes a structure both to be unaligned and unpadded, which may come
with fun if some fields of the struct itself are accessed and such a node
is placed at an unaligned location. It's not a problem for regular unaligned
accesses but may become one for atomic operations such as the CAS on leaf_p
that's used in struct task. In practice we know that this struct is properly
aligned and is a very edge case so this patch adds comments there to remind
to be careful about it.
This patch depends on previous patch "MINOR: compiler: move CPU capabilities
definition from config.h and complete them" and could be backported to all
stable branches. It fixes issues #512 and #9.
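A minimal sketch of this approach (EB_PACKED is an illustrative name, not
the one used in the code; HA_UNALIGNED is the macro mentioned above):

    #if defined(HA_UNALIGNED)
    # define EB_PACKED __attribute__((packed)) /* smaller nodes, unaligned keys are fine */
    #else
    # define EB_PACKED                         /* keep natural alignment (e.g. Sparc64) */
    #endif

    struct eb64_node {
            struct eb_node node;
            u64 key;                           /* would be unaligned if packed */
    } EB_PACKED;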
For whatever absurd reason, these functions do not take a const argument,
forcing some haproxy functions to confusingly use variables instead of
const arguments. Let's fix this and backport it to older versions.
There is a 4-byte hole in this structure after the eb_node and the
last field is 4 bytes as well, resulting in a total of 64 bytes with
8 bytes of holes. Just moving the key after the eb_node (like in eb32_node)
fills the hole and reduces the structure's size by 8 bytes.
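For illustration only, with a made-up struct (the real one is the structure
this commit touches), the move looks like this:

    /* before: 64 bytes, a 4-byte hole after the eb_node plus 4 bytes of
     * trailing padding */
    struct foo_before {
            struct eb_node by_key;      /* 36 bytes, followed by a 4-byte hole */
            void *owner;                /*  8 bytes */
            void *peer;                 /*  8 bytes */
            unsigned int key;           /*  4 bytes + 4 bytes of trailing pad */
    };

    /* after: 56 bytes, the key fills the hole behind the eb_node */
    struct foo_after {
            struct eb_node by_key;      /* 36 bytes */
            unsigned int key;           /*  4 bytes, fills the hole */
            void *owner;
            void *peer;
    };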
Clang emits a warning about these types being redefined in eb32sctree
while they are already defined in eb32tree. Let's simply not redefine
them if eb32tree was already included.
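The fix can be as simple as reusing eb32tree's include guard. A sketch,
assuming the conflicting definitions are the u32/s32 typedefs and that
eb32tree.h guards itself with _EB32TREE_H:

    /* in eb32sctree.h: only define the types if eb32tree.h didn't already */
    #ifndef _EB32TREE_H
    typedef unsigned int u32;
    typedef   signed int s32;
    #endif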
Christopher found a case where some tasks would remain unseen in the run
queue and would spontaneously appear after certain apparently unrelated
operations performed by the other thread.
It's in fact the insertion which is not correct: the node serving as the
top of a duplicate tree wasn't properly updated, and neither was each
subtree top within a duplicate tree. This had the effect that after some removals,
the incorrectly tagged node would hide the underlying ones, which would
then suddenly re-appear once they were removed.
This is 1.8-specific, no backport is needed.
Several parts of the code need to access the next node but don't start
from a node but a tagged parent link. Even eb32sc_next() does this.
Let's provide this function to prepare a cleanup for the lookup function.
The eb32sc_walk_down_left() function needs to be able to go up when
it doesn't find a matching entry because this situation may always
happen, especially when combining two constraints (scope + value). It
also happens after certain removal situations where some bits remain
on some intermediary nodes in the tree.
In addition, the algorithm for deciding to take the right branch is
wrong as it would take it if the current node shows a scope that
doesn't match the required one.
The current code is flaky in that it returns NULL when the bottom
has been reached and it's up to the caller to visit other nodes above.
In addition to being complex it's not reliable, and it was noticed a
few times that some tasks could remain lying in the tree after heavy
insertion/removals under multi-threaded workloads.
Now instead we make eb32sc_walk_down_left() visit the leftmost branch
that matches the scope, and automatically go up to visit the closest
matching right branch. This effectively does the same operations as a
next() operation but in reverse order (down then up instead of up then
down).
The eb32sc_next() function now becomes very simple again and matches
the original one, and the initial issues cannot be met anymore.
No backport is needed, this is purely 1.8-specific.
Commit ca30839 and following ("MINOR: ebtree: implement the scope-aware
functions for eb32") improperly dealt with the scope in duplicate trees.
The insertion was too lenient in that it would always mark the whole
rightmost chain below the insertion point, and the removal could leave
marks of non-existing scopes causing next()/first() to visit the wrong
branch and return NULL.
For insertion, we must only tag the nodes between the head of the dup
tree and the insertion point, which is the top of the lowest subtree. For
removal, the new scope must be calculated by ORing the scopes of the
two new branches, regardless of the previous values.
No backport is needed, this is purely 1.8-specific.
In the scheduler we always have to loop back to the beginning when no
entry is found past the last one, so let's implement this in a new lookup
function instead. The resulting code is slightly faster, mostly due
to the fact that there's much less inlined code in the fast path.
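Functionally the new lookup behaves like the sketch below; the name is an
assumption, and the real function performs a single walk precisely to keep
inlined code out of the fast path:

    static struct eb32sc_node *
    eb32sc_lookup_ge_or_first(struct eb_root *root, u32 x, unsigned long scope)
    {
            struct eb32sc_node *node;

            /* first entry >= x visible in <scope>, else wrap to the beginning */
            node = eb32sc_lookup_ge(root, x, scope);
            if (!node)
                    node = eb32sc_first(root, scope);
            return node;
    }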
Now when looking up a node via eb32sc_first(), eb32sc_next(), and
eb32sc_lookup_ge(), we only focus on the branches matching the requested
scope. The code must be careful not to miss any branch. It changes a little
bit from the previous one because the scope stored on the intermediary
nodes is not exact (since we don't propagate upwards during deletion),
so in case a lookup fails, we have to walk up and pick the next matching
entry.
During a delete operation, if the deleted node is above its leaf's
parent, this parent will replace the node and then go up. In this
case it is important to update the new parent's scope to reflect
the presence of other branches.
It's worth noting that in theory we should precisely recompute the
exact node value, but it seems that it's not worth it for the rare
cases where there is a mismatch.
A new kind of tree node is currently being developed in ebtree v7,
consisting in storing a scope in each node that indicates a visibility
mask, so that certain nodes are not reported by certain lookups. The
initial goal was to make this usable with a multi-thread scheduler.
Since the ebtree v7 code is completely different from v6, this patch
instead copies the minimally required functions from eb32 and ebtree
and calls them "eb32sc_*". At the moment the scope is not implemented,
it is only passed as an argument.
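A rough sketch of the node and of the intended use once the scope is
honored; field order and names are indicative only, and run_queue/tid as
well as the loop body are made up for the example:

    struct eb32sc_node {
            struct eb_node node;
            unsigned long  node_s;     /* visibility mask of the node part */
            unsigned long  leaf_s;     /* visibility mask of the leaf part */
            u32            key;
    };

    /* insert <item> with this thread's scope bit, then walk only the
     * entries whose scope intersects that bit (item->key is assumed to
     * be set by the caller) */
    void example(struct eb_root *run_queue, struct eb32sc_node *item, int tid)
    {
            struct eb32sc_node *node;

            eb32sc_insert(run_queue, item, 1UL << tid);

            for (node = eb32sc_first(run_queue, 1UL << tid); node;
                 node = eb32sc_next(node, 1UL << tid)) {
                    /* only entries visible to this thread are returned here */
            }
    }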
And indicate what is required for this (that the pattern is properly
terminated by a zero).
(cherry picked from commit c87c93800ce4045b1053302d99a3cd78321a7ec4)
struct eb_node is 36 bytes on a 64-bit machine. It's thus rounded
up to 40 bytes, and when forming a struct eb32_node, another 4 bytes
are added, rounded up to 48 bytes. We waste 8 bytes of space out of 48
because of alignment. It's basically the same with memory
blocks and immediate strings.
By packing the structure, eb32_node is down to 40 bytes. This saves
16 bytes per struct task and 20 bytes per struct stksess, used to
store each stick-table key.
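The accounting, for reference; the field layout matches ebtree v6 but the
exact attribute placement in the real code may differ:

    struct eb_node {
            struct eb_root branches;    /* 16 bytes: the two branch pointers */
            eb_troot_t    *node_p;      /*  8 bytes */
            eb_troot_t    *leaf_p;      /*  8 bytes */
            short int      bit;         /*  2 bytes */
            short unsigned int pfx;     /*  2 bytes => 36, padded to 40 unpacked */
    } __attribute__((packed));          /* packed: stays at 36 bytes */

    struct eb32_node {
            struct eb_node node;        /* 36 bytes when packed */
            u32 key;                    /*  4 bytes => 40 bytes instead of 48 */
    } __attribute__((packed));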
Sometimes it's very useful to visit duplicates of the same node, but doing
so from the application is not convenient because keys have to be compared,
while all the information is available during the next/prev steps.
Let's introduce a couple of new eb_next_dup/eb_prev_dup functions to visit
only duplicates of the current node and return NULL once it's done. Now we
have all 3 combinations:
- next : returns next node in the tree
- next_dup : returns next dup in the sub-tree
- next_unique : returns next value after skipping dups
(cherry picked from commit 3327b8ae6866f3878322a1a29e70b450226d216d)
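For example, visiting all entries sharing one key could look like this;
eb32_next_dup() is assumed to be the eb32-level wrapper around
eb_next_dup(), and process() is a made-up callback:

    struct eb32_node *node = eb32_lookup(&root, key);

    while (node) {
            process(node);                /* handle this duplicate */
            node = eb32_next_dup(node);   /* NULL once all dups are visited */
    }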
Otherwise we end up comparing the byte past the end, resulting
in duplicate values still being inserted into the tree even if
undesired.
This generally has low impact, though it can sometimes cause one new entry
to be added next to an existing one for stick tables, preventing the results
from being merged.
(cherry picked from commit 12e54ac493a91bb02064568f410592c2700d3933)
(from ebtree 6.0.7)
Julien Thomas provided a reproducible test case where a string lookup
could return the wrong node. The issue is caused by the jump to a node
which has fewer bits in common than the previous node, making the
string_equal_bits() function return -1. We must not remember more bits
than the number stored on the node, otherwise we may be tempted to trust
them even though they can change while going down the tree.
For a valid test case, insert "0", "WW", "W", "S", then look up "W".
Previously, "S" was returned.
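In ebst terms the reproducer looks like this; new_str_node() is a made-up
helper returning an ebmb_node whose key holds the string:

    ebst_insert(&root, new_str_node("0"));
    ebst_insert(&root, new_str_node("WW"));
    ebst_insert(&root, new_str_node("W"));
    ebst_insert(&root, new_str_node("S"));

    node = ebst_lookup(&root, "W");   /* must return the "W" node, not "S" */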
Note: string-based ebtrees are used in haproxy in ACL, peers and
stick-tables. ACLs are not affected because all patterns are
interchangeable. stick-tables are not affected because lookups are
performed using ebmb_lookup(). Only peers might be affected though
it is not easy to confirm or rule out the issue.
(cherry picked from commit dd47a54103597458887d3cc8414853a541aee9c1)
(from ebtree 6.0.7)
root_right was first wrongly initialized to <root>, which is not even of
the same type, before being set to root->b[EB_RGHT] later.
Let's simply remove the wrong and useless initialization.
(cherry picked from commit e63a0c2f56369b52c4d00221d83c2c4569605c06)
(from ebtree 6.0.7)
This typo has been there since we introduced duplicates. A "struct eb_troot *"
which the compiler apparently doesn't complain about even though it is never declared
anywhere. Amazing...
(cherry picked from commit 2879648db5d32cf009ae571cb0e8e1df75152281)
(from ebtree 6.0.6)
This version is mainly aimed at clarifying the fact that the ebtree license
is LGPL. Some files used to indicate LGPL and other ones GPL, while the goal
clearly is to have it LGPL. A LICENSE file has also been added.
No code is affected, but it's better to have the local tree in sync anyway.
(cherry picked from commit 24dc7cca051f081600fe8232f33e55ed30e88425)
(from ebtree 6.0.6)
Care has been taken not to make the code bigger (it even got smaller
due to a possible simplification).
(cherry picked from commit 7a2c1df646049c7daac52677ec11ed63048cd150)
gcc (Debian 4.6.0-2) 4.6.1 20110329 (prerelease)
Copyright (C) 2011 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
...
src/proto_http.c:3029:14: warning: variable ‘del_cl’ set but not used [-Wunused-but-set-variable]
In file included from ebtree/eb64tree.c:23:0:
ebtree/eb64tree.h: In function ‘__eb64_lookup’:
ebtree/eb64tree.h:128:6: warning: variable ‘node_bit’ set but not used [-Wunused-but-set-variable]
ebtree/eb64tree.h: In function ‘__eb64i_lookup’:
ebtree/eb64tree.h:180:6: warning: variable ‘node_bit’ set but not used [-Wunused-but-set-variable]
In file included from ebtree/ebpttree.h:26:0,
from ebtree/ebimtree.c:23:
ebtree/eb64tree.h: In function ‘__eb64_lookup’:
ebtree/eb64tree.h:128:6: warning: variable ‘node_bit’ set but not used [-Wunused-but-set-variable]
ebtree/eb64tree.h: In function ‘__eb64i_lookup’:
ebtree/eb64tree.h:180:6: warning: variable ‘node_bit’ set but not used [-Wunused-but-set-variable]
In file included from ebtree/ebpttree.h:26:0,
from ebtree/ebistree.h:25,
from ebtree/ebistree.c:23:
ebtree/eb64tree.h: In function ‘__eb64_lookup’:
ebtree/eb64tree.h:128:6: warning: variable ‘node_bit’ set but not used [-Wunused-but-set-variable]
ebtree/eb64tree.h: In function ‘__eb64i_lookup’:
ebtree/eb64tree.h:180:6: warning: variable ‘node_bit’ set but not used [-Wunused-but-set-variable]
(from ebtree 6.0.5)
Both of them are very short and rely on another non-inlined lookup function,
so it's pointless to have them as plain functions; it only wastes space.
(cherry picked from commit 1e68d6fef815f759304d4cc0e65f957689e19a7a)
(from ebtree 6.0.5)
The last bugfix introduced a de-optimization in the lookup function because
it artificially extended the scope of some local variables, which resulted in
higher stack usage and more numerous moves between stack and registers.
We can reduce that by moving the return code out of the loop, because gcc
notices that it never needs both "troot" and "node" at the same time and
can use the same register for both. Doing so has reduced the code size by
39 bytes for the lookup function alone, and has noticeably reduced the
instruction dependencies caused by data moves.
(cherry picked from commit 59be3cdb96296b65a57aff30cc203269f9a94ebe)
It should be backported to 1.4 if previous ebtree fix is backported.
(from ebtree 6.0.5)
ebmb_lookup() is used by ebst_lookup_len() to lookup a string starting
with a known substring. Since the substring does not necessarily end
with a zero, we must absolutely ensure that the comparison stops at
<len> bytes, otherwise we can end up comparing crap and most often
returning the wrong node in case of multiple matches.
ebim_lookup() was fixed too by resyncing it with ebmb_lookup().
(cherry picked from commit 98eba315aa2c3285181375d312bcb770f058fd2b)
This should be backported to 1.4 though it's not critical there.
(update to ebtree 6.0.4)
The recent fix fd301cc1370cd4977fe175dfa4544c7dc0e7ce6b was not correct
because it returned one excess byte, causing some duplicates not to be detected.
The reason is that we added 8 bits to count the trailing zero but they
were implied by the pre-incrementation of the pointer.
Fixing this was still not enough, as the problem appeared when
string_equal_bits() was applied on two identical strings, and it returned
a number of bits covering the trailing zero. Subsequent calls were applied
to the first byte after this trailing zero. That byte was often zero when
inserting from raw files, which explains why the issue was not discovered
earlier. But when the data comes from a reused memory area, duplicate strings
were not correctly detected when inserted into the tree.
Several solutions were tested, and the only efficient one consists in making
string_equal_bits() notify the caller that the end of the string was reached.
It now returns zero and the callers just have to ensure that when they get a
zero, they stop using that bit until a dup tree or a leaf is encountered.
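In other words, callers now follow this pattern (a sketch; the exact
signature of string_equal_bits() is assumed):

    /* the first <ignore> bits of <a> and <b> are already known to be equal */
    bits = string_equal_bits(a, b, ignore);
    if (bits == 0) {
            /* end of string reached: both strings are identical, so stop
             * using the returned bit and handle this as a duplicate (dup
             * tree or leaf) */
    }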
This fix brought the unexpected bonus of simplifying the insertion code a bit
and making it slightly faster to process duplicates.
The impact for haproxy was that if many similar string patterns were loaded
from a file, there was a potential risk that their insertion or matching could
have been slower. The bigger impact was with the URL sorting feature of halog,
which is not yet merged and is how this bug was discovered.
(cherry picked from commit 518d59ec9ba43705f930f9ece3749c450fd005df)
(from ebtree 6.0.2)
When inserting duplicates on x86/x86_64, the assembler optimization
does not support equal strings that both end with a zero, and
can return garbage in the bit number, possibly causing a segfault
for its users. The only case where this can happen appears to be
in ebst_insert().
(cherry picked from commit 006152c62ae56d151188626e6074a79be3928858)
This version adds support for prefix-based matching of memory blocks,
as well as some code-size and performance improvements on the generic
code. It provides a prefix insertion and longest match which are
compatible with the rest of the common features (walk, duplicates,
delete, ...). This is typically used for network address matching. The
longest-match code is a bit slower than the original memory block
handling code, so they have not been merged together into generic
code. Still, it's possible to perform about 10 million network lookups
per second in a set of 50000 networks, so this should be enough for most usages.
This version also fixes some bugs in parts that were not used, so there
is no need to backport them.
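Typical use for network address matching would look like the sketch below,
under the assumption that the prefix length in bits is stored in the node's
pfx field and that the last argument of ebmb_insert_prefix() is the key
length in bytes:

    /* index a /24 IPv4 network then return the most specific network
     * containing <addr> (error handling omitted, addresses in network order) */
    struct ebmb_node *index_and_match(struct eb_root *root,
                                      const struct in_addr *subnet,
                                      const struct in_addr *addr)
    {
            struct ebmb_node *net = calloc(1, sizeof(*net) + 4);

            memcpy(net->key, subnet, 4);
            net->node.pfx = 24;                  /* prefix length in bits */
            ebmb_insert_prefix(root, net, 4);    /* 4 = key length in bytes */

            return ebmb_lookup_longest(root, addr);
    }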
Sometimes it's useful to look up a string without terminating it with a
zero. We can do that by relying on ebmb_lookup(), since the string stored
in the tree contains a terminating zero.
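For example (a sketch, assuming this is the ebst_lookup_len() helper being
introduced):

    /* match the stored string "example.com" using only the first 11 chars
     * of a longer, non-terminated buffer */
    struct ebmb_node *node = ebst_lookup_len(&root, "example.com/index.html", 11);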
It's a pain to enable regparm because ebtree is built in its corner
and does not depend on the rest of the config. This causes no problem
except that if the regparm settings are not exactly similar, then we
can get inconsistent function interfaces and crashes.
The solution implemented in this patch consists in externalizing all
compiler settings and changing CONFIG_XXX_REGPARM into CONFIG_REGPARM,
so that any sub-component is guaranteed to use the same setting. Since
ebtree used a value here and not a boolean, haproxy's config has been
set to use a number too. Both haproxy's core and ebtree currently use
the same copy of the compiler.h file. That way we don't have any issue
anymore when one setting changes somewhere.
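Concretely, the shared setting ends up driving the calling convention
macros along these lines (a sketch; the exact macro bodies in compiler.h
may differ):

    /* compiler.h, shared by haproxy and ebtree */
    #if CONFIG_REGPARM >= 2 && defined(__i386__)
    # define REGPRM2 __attribute__((regparm(2)))
    #else
    # define REGPRM2
    #endif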
We needed to upgrade ebtree to v5.0 to support string indexing,
and it was getting very painful to have it split across 2 dirs
and to have to patch it. Now we just have to copy the .c and .h
files to the right place.