Till now it used to call it only if there were not too many objects into
the local cache otherwise would send the latest one directly into the
shared cache. Now it always sends to the local cache and it's up to the
local cache to free its oldest objects. From a cache freshness perspective
it's better this way since we always evict cold objects instead of hot
ones. From an API perspective it's better because it will help make the
shared cache invisible to the public API.
Till now we could only evict oldest objects from all local caches using
pool_evict_from_local_caches() until the cache size was satisfying again,
but there was no way to evict excess objects from a single cache, which
is the reason why pool_put_to_cache() used to refrain from putting into
the local cache and would directly write to the shared cache, resulting
in massive writes when caches were full.
Let's add this new function now. It will stop once the number of objects
in the local cache is no higher than 16+total/8 or the cache size is no
more than 75% full, just like before.
For now the function is not used.
These two functions are now responsible for allocating directly from
the cache and releasing to the cache.
Now the pool_alloc() function simply does this:
if cache enabled
return pool_alloc_from_cache() if no NULL
return pool_alloc_nocache() otherwise
and the pool_free() function does this:
if cache enabled
pool_put_to_cache()
else
pool_free_nocache()
For now this only introduces these two functions without changing anything
else, but the goal is to soon allow to make them implementation-specific.
Continuing the unification of local and shared pools, now the usage of
pools is governed by CONFIG_HAP_POOLS without which allocations and
releases are performed directly from the OS using pool_alloc_nocache()
and pool_free_nocache().
There are two levels of freeing to the OS:
- code that wants to keep the pool's usage counters updated uses
pool_free_area() and handles the counters itself. That's what
pool_put_to_shared_cache() does in the no-global-pools case.
- code that does not want to update the counters because they were
already updated only calls pool_free_area().
Let's extract these calls to establish the symmetry with pool_get_from_os()
and pool_alloc_nocache(), resulting in pool_put_to_os() (which only updates
the allocated counter) and pool_free_nocache() (which also updates the used
counter). This will later allow to simplify the generic code.
Calling pool_free_area() inside a lock in pool_put_to_shared_cache() is
a very bad idea. Fortunately this only happens on the lowest end platforms
which almost never use threads or in very small counts.
This change consists in zeroing the pointer once already released to the
cache in the first test so that the second stage knows if it needs to
pass it to the OS or not. This has slightly reduced the length of the
A part of the code cannot be factored out because it still uses non-atomic
inc/dec for pool->used and pool->allocated as these are located under the
pool's lock. While it can make sense in terms of bus cycles, it does not
make sense in terms of code normalization. Further, some operations were
still performed under a lock that could be totally removed via the use of
atomic ops.
There is still one occurrence in pool_put_to_shared_cache() in the locked
code where pool_free_area() is called under the lock, which must absolutely
be fixed.
Now there's one part dealing with the allocation itself and keeping
counters up to date, and another one on top of it to return such an
allocated pointer to the user and update the use count and stats.
This is in anticipation for being able to group cache-related parts.
The release code is still done at once.
Till now it was limited to objects allocated from the OS which means
it had little use as soon as pools were enabled. Let's move it upper
in the layers so that any code can benefit from fault injection. In
addition this allows to pass a new flag POOL_F_NO_FAIL to disable it
if some callers prefer a no-failure approach.
ha_random() is quite heavy and uses atomic ops or even a lock on some
architectures. Here we don't seek good randoms, just statistical ones,
so let's use the statistical prng instead.
Now the multi-level cache becomes more visible:
pool_get_from_local_cache()
pool_put_to_local_cache()
pool_get_from_shared_cache()
pool_put_to_shared_cache()
The functions were rightfully called from/to_cache when the thread-local
cache was considered as the only cache, but this is getting terribly
confusing. Let's call them from/to local_cache to make it clear that
it is not related with the shared cache.
As a side note, since pool_evict_from_cache() used not to work for a
particular pool but for all of them at once, it was renamed to
pool_evict_from_local_caches() (plural form).
This is exactly what it is, the entry is retrieved from the shared
cache when it is defined. The implementation that is enabled with
CONFIG_HAP_NO_GLOBAL_POOLS continues to return NULL.
Now that __pool_alloc() only surrounds __pool_get_first() with the lock,
let's move it to the only variant that requires it and remove the ugly
ifdefs from the function. This is safe because nobody else calls this
function.
In __pool_alloc(), historically we used to use factor out the
pool's lock between __pool_get_first() and __pool_refill_alloc(),
resulting in real malloc() or mmap() calls being performed under
the pool lock (for platforms using the locked shared pools).
As this is not needed anymore, let's move the call out of the
lock, it may improve allocation patterns on some platforms. This
also makes __pool_alloc() cleaner as we see a first attempt to
allocate from the local cache, then a second from the shared
cache then a reall allocation.
They were strictly equivalent, let's remerge them and rename them to
pool_alloc_nocache() as it's the call which performs a real allocation
which does not check nor update the cache. The only difference in the
past was the former taking the lock and not the second but now the lock
is not needed anymore at this stage since the pool's list is not touched.
In addition, given that the "avail" argument is no longer used by the
function nor by its callers, let's drop it.
Now we don't loop anymore trying to refill multiple items at once, and
an allocated object is directly returned to the requester instead of
being stored into the shared pool. This has multiple benefits. The
first one is that no locking is needed anymore on the allocation path
and the second one is that the loop will no longer cause latency spikes.
This is a first step towards unifying all the fallback code. Right now
these two functions are the only ones which do not update the needed_avg
rate counter since there's currently no shared pool kept when using them.
But their code is similar to what could be used everywhere except for
this one, so let's make them capable of maintaining usage statistics.
As a side effect the needed field in "show pools" will now be populated.
The mem_should_fail() call enabled by DEBUG_FAIL_ALLOC used to be placed
only in the no-cache version of the allocator. Now we can generalize it
to all modes and remove the exclusive test on CONFIG_HAP_NO_GLOBAL_POOLS.
We're going to make the local pool always present unless pools are
completely disabled. This means that pools are always enabled by
default, regardless of the use of threads. Let's drop this notion
of "local" pools and make it just "pool". The equivalent debug
option becomes DEBUG_NO_POOLS instead of DEBUG_NO_LOCAL_POOLS.
For now this changes nothing except the option and dropping the
dependency on USE_THREAD.
Initially per-thread pool caches were stored into a fixed-size array.
But this was a bit ugly because the last allocated pools were not able
to benefit from the cache at all. As a work around to preserve
performance, a size of 64 cacheable pools was set by default (there
are 51 pools at the moment, excluding any addon and debugging code),
so all in-tree pools were covered, at the expense of higher memory
usage.
In addition an index had to be calculated for each pool, and was used
to acces the pool cache head into that array. The pool index was not
even stored into the pools so it was required to determine it to access
the cache when the pool was already known.
This patch changes this by moving the pool cache head into the pool
head itself. This way it is certain that each pool will have its own
cache. This removes the need for index calculation.
The pool cache head is 32 bytes long so it was aligned to 64B to avoid
false sharing between threads. The extra cost is not huge (~2kB more
per pool than before), and we'll make better use of that space soon.
The pool cache head contains the size, which should probably be removed
since it's already in the pool's head.
In commit fb117e6a8 ("MEDIUM: memory: don't let pool_put_to_cache() free
the objects itself") pool_evict_from_cache() was introduced with no
argument, yet the only call place passes it the pool, the pointer and
the index number!
Let's remove these as they even let the reader think that the function
does something specific to the current pool while it's not the case.
When building with DEBUG_FAIL_ALLOC we call a random generator to decide
whether the pool alloc should succeed or fail, and there was a preliminary
debugging mechanism to keep sort of a history of the previous decisions. But
it was never used, enforces a lock during the allocation, and forces to use
static variables, all of which are limiting the ability to pursue the pools
cleanups with no real benefit. Let's get rid of them now.
Since recent commit ae07592 ("MEDIUM: pools: add CONFIG_HAP_NO_GLOBAL_POOLS
and CONFIG_HAP_GLOBAL_POOLS") the pre-allocation of all desired reserved
buffers was not done anymore on systems not using the shared cache. This
basically has no practical impact since these ones will quickly be refilled
by all the ones used at run time, but it may confuse someone checking if
they're allocated in "show pools".
That's only 2.4-dev, no backport is needed.
When running with CONFIG_HAP_NO_GLOBAL_POOLS, it's theoritically possible
to keep an incorrect count of allocated entries in a pool because the
allocated counter was used as a cumulated counter of alloc calls instead
of a number of currently allocated items (it's possible the meaning has
changed over time). The only impact in this mode essentially is that
"show pools" will report incorrect values. But this would only happen on
limited pools, which is not even certain still exist.
This was added by recent commit 0bae07592 ("MEDIUM: pools: add
CONFIG_HAP_NO_GLOBAL_POOLS and CONFIG_HAP_GLOBAL_POOLS") so no backport
is needed.
This patch adds an introduction to the http-request normalize-uri section,
explaining what to expect from the normalizers and possible issues that might
arise when not being careful.
This patch renames all existing uri-normalizers into a more consistent naming
scheme:
1. The part of the URI that is being touched.
2. The modification being performed as an explicit verb.
This normalizer merges `../` path segments with the predecing segment, removing
both the preceding segment and the `../`.
Empty segments do not receive special treatment. The `merge-slashes` normalizer
should be executed first.
See GitHub Issue #714.
It is a workaround to avoid a clang 11 bug that exits with SIGABRT when
stderr is redirected to stdin. This bug was already reported few weeks ago:
https://bugs.llvm.org/show_bug.cgi?id=49463
But because it is pretty annoying, the standard error is now redirected to
/dev/null.
When the session is aborted before any connection attempt to any server, the
number of connection retries reported in the logs is wrong. It happens
because when the retries counter is not strictly positive, we consider the
max number of retries was reached and the backend retries value is used. It
is obviously wrong when no connectioh was performed.
In fact, at this stage, the retries counter is initialized to 0. But the
backend stream-interface is in the INI state. Once it is set to SI_ST_REQ,
the counter is set to the backend value. And it is the only possible state
transition from INI state. Thus it is safe to rely on it to fix the bug.
This patch must be backported to all stable versions.
The http_get_stline() was designed to be called from HTTP analyzers. Thus
before any data forwarding. To prevent any invalid usage, two BUG_ON()
statements were added. However, it is not a good idea because it is pretty
hard to be sure no HTTP sample fetch will never be called outside the
analyzers context. Especially because there is at least one possible area
where it may happens. An HTTP sample fetch may be used inside the unique-id
format string. On the normal case, it is generated in AN_REQ_HTTP_INNER
analyzer. But if an error is reported too early, the id is generated when
the log is emitted.
So, it is safer to remove the BUG_ON() statements and consider the normal
behavior is to return NULL if the first block is not a start-line. Of
course, this means all calling functions must test the return value or be
sure the start-line is really there.
This patch must be backported as far as 2.0.
This patch adds 4 new sample fetches to get the source and the destination
info (ip address and port) of the backend connection :
* bc_dst : Returns the destination address of the backend connection
* bc_dst_port : Returns the destination port of the backend connection
* bc_src : Returns the source address of the backend connection
* bc_src_port : Returns the source port of the backend connection
The configuration manual was updated accordingly.
When method sample fetch is called, if an exotic method is found
(HTTP_METH_OTHER), when smp_prefetch_htx() is called, we must be sure the
start-line is still there. Otherwise, HAproxy may crash because of a NULL
pointer dereference, for instance if the method sample fetch is used inside
a unique-id format string. Indeed, the unique id may be generated when the
log message is emitted. At this stage, the request channel is empty.
This patch must be backported as far as 2.0. But the bug exists in all
stable versions for the legacy HTTP mode too. Thus it must be adapted to the
legacy HTTP mode and backported to all other stable versions.
For all ssl_bc_* sample fetches, the test on the keyword when called from a
health-check is inverted. We must be sure the 5th charater is a 'b' to
retrieve a connection.
This patch must be backported as far as 2.2.
bc_http_major sample fetch now works when it is called from a
tcp-check. When it happens, the session origin is a check. The backend
connection is retrieved from the conn-stream attached to the check.
If required, this path may easily be backported as far as 2.2.
fc_http_major and bc_http_major sample fetches return the major digit of the
HTTP version used, respectively, by the frontend and the backend
connections, based on the mux. However, in reality, "2" is returned if the
H2 mux is detected, otherwise "1" is inconditionally returned, regardless
the mux used. Thus, if called for a raw TCP connection, "1" is returned.
To fix this bug, we now get the multiplexer flags, if there is one, to be
sure MX_FL_HTX is set.
I guess it was made this way on purpose when the H2 multiplexer was
introduced in the 1.8 and with the legacy HTTP mode there is no other
solution at the connection level. Thus this patch should be backported as
far as 2.2. For the 2.0, it must be evaluated first because of the legacy
HTTP mode.
When a log-format string is built from an health-check, the session origin
is the health-check itself and not a connection. In addition, there is no
stream. It means for now some formats are not supported: %s, %sc, %b, %bi,
%bp, %si and %sp.
Thanks to this patch, the session origin is converted to a check. So it is
possible to retrieve the backend and the backend connection. Note this
session have no listener, thus %ft format must be guarded.
This patch is light and standalone, thus it may be backported as far as 2.2
if required. However, because the error is human, it is probably better to
wait a bit to be sure everything is properly protected.
The dummy frontend used to create the session of the tcp-checks is
initialized without identifier. However, it is required because this id may
be used without any guard, for instance in log-format string via "%f" or
when fe_name sample fetch is called. Thus, an unset id may lead to crashes.
This patch must be backported as far as 2.2.
When a thread ends its harmeless period, we must only consider running
threads when testing threads_want_rdv_mask mask. To do so, we reintroduce
all_threads_mask mask in the bitwise operation (It was removed to fix a
deadlock).
Note that for now it is useless because there is no way to stop threads or
to have threads reserved for another task. But it is safer this way to avoid
bugs in the future.