2022-02-24 - Pools structure and API

1. Background
-------------

Memory allocation is a complex problem covered by a massive amount of
literature. Memory allocators found in the field cover a broad spectrum of
capabilities, performance, fragmentation, efficiency etc.

The main difficulty of memory allocation comes from finding the optimal chunks
for arbitrary sized requests, that will still preserve a low fragmentation
level. Doing this well is often expensive in CPU usage and/or memory usage.

In programs like HAProxy that deal with a large number of fixed size objects,
there is no point having to endure all this risk of fragmentation, and the
associated costs (sometimes up to several milliseconds with certain minimalist
allocators) are simply not acceptable. A better approach consists in grouping
frequently used objects by size, knowing that due to the high repetitiveness of
operations, a freed object will immediately be needed for another operation.

This grouping of objects by size is what is called a pool. Pools are created
for certain frequently allocated objects, are usually merged together when they
are of the same size (or almost the same size), and significantly reduce the
number of calls to the memory allocator.

With the arrival of threads, pools started to become a bottleneck so they now
implement an optional thread-local lockless cache. Finally with the arrival of
really efficient memory allocators in modern operating systems, the shared part
has also become optional so that it doesn't consume memory if it does not bring
any value.

In 2.6-dev2, a number of debugging options that used to be configured at build
time only changed to boot-time and can be modified using keywords passed after
"-dM" on the command line, which sets or clears bits in the pool_debugging
variable. The build-time options still affect the default settings however.
Default values may be consulted using "haproxy -dMhelp".
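As an illustration of how a comma-separated keyword list passed after "-dM"
could set or clear bits in a flags variable, here is a minimal sketch. The bit
names, the keyword table and the rule that a "no-" prefix clears a bit are
assumptions made for this example only; they are not taken from the HAProxy
sources.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical bit names, for illustration only. */
#define DBG_FAIL_ALLOC  0x01u
#define DBG_INTEGRITY   0x02u
#define DBG_TAG         0x04u
#define DBG_UAF         0x08u

struct dbg_kw {
    const char *name;
    unsigned int bit;
};

static const struct dbg_kw dbg_kws[] = {
    { "fail",      DBG_FAIL_ALLOC },
    { "integrity", DBG_INTEGRITY  },
    { "tag",       DBG_TAG        },
    { "uaf",       DBG_UAF        },
};

/* Apply a keyword list such as "fail,tag,no-uaf" to <flags>. In this sketch
 * a "no-" prefix clears the bit instead of setting it (an assumption).
 * Returns 0 on success, -1 on an unknown or empty keyword.
 */
int parse_dbg_keywords(const char *arg, unsigned int *flags)
{
    assert(arg != NULL && flags != NULL);

    while (*arg) {
        size_t len = strcspn(arg, ",");
        int clear = (len > 3 && strncmp(arg, "no-", 3) == 0);
        const char *kw = clear ? arg + 3 : arg;
        size_t kwlen = clear ? len - 3 : len;
        size_t n = sizeof(dbg_kws) / sizeof(dbg_kws[0]);
        size_t i;

        for (i = 0; i < n; i++) {
            if (strlen(dbg_kws[i].name) == kwlen &&
                memcmp(dbg_kws[i].name, kw, kwlen) == 0)
                break;
        }
        if (i == n)
            return -1;

        if (clear)
            *flags &= ~dbg_kws[i].bit;
        else
            *flags |= dbg_kws[i].bit;

        arg += len;
        if (*arg == ',')
            arg++;
    }
    return 0;
}
```

Starting from build-time defaults in <flags> and then applying the command-line
keywords on top of them mirrors the behavior described above, where build
options only affect the default settings.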


2. Principles
-------------

The pools architecture is selected at build time. The main options are:

  - thread-local caches and process-wide shared pool enabled (1)

    This is the default situation on most operating systems. Each thread has
    its own local cache, and when depleted it refills from the process-wide
    pool that avoids calling the standard allocator too often. It is possible
    to force this mode at build time by setting CONFIG_HAP_GLOBAL_POOLS or at
    boot time with "-dMglobal".

  - thread-local caches only are enabled (2)

    This is the situation on operating systems where a fast and modern memory
    allocator is detected and when it is estimated that the process-wide shared
    pool will not bring any benefit. This detection is automatic at build time,
    but may also be forced at build time by setting CONFIG_HAP_NO_GLOBAL_POOLS
    or at boot time with "-dMno-global".

  - pass-through to the standard allocator (3)

    This is used when one absolutely wants to disable pools and rely on regular
    malloc() and free() calls, essentially in order to trace memory allocations
    by call points, either internally via DEBUG_MEM_STATS, or externally via
    tools such as Valgrind. This mode of operation may be forced at build time
    by setting DEBUG_NO_POOLS or at boot time with "-dMno-cache".

  - pass-through to an mmap-based allocator for debugging (4)

    This is used only during deep debugging when trying to detect various
    conditions such as use-after-free. In this case each allocated object's
    size is rounded up to a multiple of a page size (4096 bytes) and an
    integral number of pages is allocated for each object using mmap(),
    surrounded by two inaccessible holes that aim to detect some out-of-bounds
    accesses. Released objects are instantly freed using munmap() so that any
    immediate subsequent access to the memory area crashes the process if the
    area had not been reallocated yet. This mode can be enabled at build time
    by setting DEBUG_UAF, or at run time by disabling pools and enabling UAF
    with "-dMuaf". It tends to consume a lot of memory and not to scale at all
    with concurrent calls, which tends to make the system stall. The watchdog
    may even trigger on some slow allocations.
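The mmap-based mode (4) can be sketched as follows. This is only an
illustration of the principle under stated assumptions: the function names are
hypothetical, the two "holes" are modeled as one PROT_NONE page on each side,
and the object is placed at the start of the accessible area. The actual
placement used by HAProxy is not described here.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS with strict C standards */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round <size> up to a whole number of pages of <page> bytes. */
size_t round_to_pages(size_t size, size_t page)
{
    return (size + page - 1) / page * page;
}

/* Allocate <size> bytes surrounded by one inaccessible page on each side. */
void *uaf_alloc(size_t size)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t body = round_to_pages(size, page);
    unsigned char *area;

    assert(size > 0);

    /* reserve everything as inaccessible, then open only the middle part */
    area = mmap(NULL, body + 2 * page, PROT_NONE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (area == MAP_FAILED)
        return NULL;

    if (mprotect(area + page, body, PROT_READ | PROT_WRITE) != 0) {
        munmap(area, body + 2 * page);
        return NULL;
    }
    return area + page;
}

/* Unmap the object together with its guard pages. Any later access to the
 * area faults unless the kernel has already reused the address range.
 */
void uaf_free(void *ptr, size_t size)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t body = round_to_pages(size, page);

    if (ptr)
        munmap((unsigned char *)ptr - page, body + 2 * page);
}
```

Each object costs at least three pages and two or three system calls, which is
consistent with the memory and scalability warnings given for this mode.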
There are no more provisions for running with a shared pool but no thread-local
cache: the shared pool's main goal is to compensate for the expensive calls to
the memory allocator. This gain may be huge on tiny systems using basic
allocators, but the thread-local cache will already achieve this. And on larger
threaded systems, the shared pool's benefit is visible when the underlying
allocator scales poorly, but in this case the shared pool would suffer from
the same limitations without its thread-local cache and wouldn't provide any
benefit.

Summary of the various operation modes:

                     (1)          (2)          (3)          (4)

                    User         User         User         User
                      |            |            |            |
  pool_alloc()        V            V            |            |
                 +---------+  +---------+       |            |
                 | Thread  |  | Thread  |       |            |
                 | Local   |  | Local   |       |            |
                 | Cache   |  | Cache   |       |            |
                 +---------+  +---------+       |            |
                      |            |            |            |
  pool_refill*()      V            |            |            |
                 +---------+       |            |            |
                 | Shared  |       |            |            |
                 | Pool    |       |            |            |
                 +---------+       |            |            |
                      |            |            |            |
  malloc()            V            V            V            |
                 +---------+  +---------+  +---------+       |
                 | Library |  | Library |  | Library |       |
                 +---------+  +---------+  +---------+       |
                      |            |            |            |
  mmap()              V            V            V            V
                 +---------+  +---------+  +---------+  +---------+
                 |   OS    |  |   OS    |  |   OS    |  |   OS    |
                 +---------+  +---------+  +---------+  +---------+

One extra build define, DEBUG_FAIL_ALLOC, is used to enforce random allocation
failure in pool_alloc() by randomly returning NULL, to test that callers
properly handle allocation failures. It may also be enabled at boot time using
"-dMfail". In this case the desired average rate of allocation failures can be
fixed by global setting "tune.fail-alloc" expressed in percent.
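The principle of random allocation failures at a configured average rate can be
sketched like this. The variable and function names are hypothetical stand-ins
(the variable plays the role of the "tune.fail-alloc" setting), and rand() is
used only for brevity; nothing here reflects the actual random source used by
HAProxy.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the "tune.fail-alloc" setting, in percent. */
int fail_alloc_percent = 0;

/* Return non-zero when the current allocation should be made to fail. */
int should_fail_alloc(void)
{
    assert(fail_alloc_percent >= 0 && fail_alloc_percent <= 100);

    if (fail_alloc_percent == 0)
        return 0;
    /* rand() % 100 yields 0..99, so 100 always fails and 0 never does */
    return (rand() % 100) < fail_alloc_percent;
}

/* malloc() wrapper which randomly returns NULL at the configured rate. */
void *alloc_maybe_fail(size_t size)
{
    if (should_fail_alloc())
        return NULL;
    return malloc(size);
}
```

Callers exercised this way must treat a NULL return exactly like a genuine
out-of-memory condition, which is the point of the option.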
The thread-local caches contain the freshest objects. The total size of each
cache amounts to the number of bytes set in global.tune.pool_cache_size, which
may be adjusted by the "tune.memory.hot-size" global option, which itself
defaults to build time setting CONFIG_HAP_POOL_CACHE_SIZE, which was 1MB before
2.6 and 512kB after. The aim is to keep hot objects that still fit in the CPU
core's private L2 cache. Once these objects do not fit into the cache anymore,
there's no benefit keeping them local to the thread, so they'd rather be
returned to the shared pool or the main allocator so that any other thread may
make use of them. Under extreme thread contention the cost of accessing shared
structures in the global cache or in malloc() may still be significant and it
may prove useful to increase the thread-local cache size.


3. Storage in thread-local caches
---------------------------------

This section describes how objects are linked in thread-local caches. This is
not meant to be a concern for users of the pools API but it can be useful when
inspecting post-mortem dumps or when trying to figure out certain size
constraints.

Objects are stored in the local cache using a doubly-linked list. This ensures
that they can be visited by freshness order like a stack, while at the same
time being able to access them from oldest to newest when it is needed to
evict coldest ones first:

  - releasing an object to the cache always puts it on the top.

  - allocating an object from the cache always takes the topmost one, hence the
    freshest one.

  - scanning for older objects to evict starts from the bottom, where the
    oldest ones are located.

To that end, each thread-local cache keeps a list head in the "list" member of
its "pool_cache_head" descriptor, that links all objects cast to type
"pool_cache_item" via their "by_pool" member.

Note that the mechanism described above only works for a single pool. When
trying to limit the total cache size to a certain value, all pools included,
there is also a need to arrange all objects from all pools together in the
local caches. For this, each thread_ctx maintains a list head of recently
released objects, all pools included, in its member "pool_lru_head". All items
in a thread-local cache are linked there via their "by_lru" member.

This means that releasing an object using pool_free() consists in inserting
it at the beginning of two lists:
  - the local pool_cache_head's "list" list head
  - the thread context's "pool_lru_head" list head

Allocating an object consists in picking the first entry from the pool's "list"
and deleting its "by_pool" and "by_lru" links.

Evicting an object consists in scanning the thread context's "pool_lru_head"
backwards and deleting the object's "by_pool" and "by_lru" links.

Given that entries are both inserted and removed synchronously, we have the
guarantee that the oldest object in the thread's LRU list is always the oldest
object in its pool, and that the next element is the cache's list head. This is
what allows the LRU eviction mechanism to figure out which pool an object
belongs to when releasing it.
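The two-list arrangement described above can be sketched with a small circular
doubly-linked list. The structure and function names below are simplified,
hypothetical equivalents of "pool_cache_head", "pool_cache_item" and
"pool_lru_head", written for this example only. The eviction function relies on
the property stated above: the oldest item of the LRU is also the last item of
its pool's list, so its "by_pool" next pointer is the pool's list head.

```c
#include <assert.h>
#include <stddef.h>

struct list {
    struct list *n;                 /* next */
    struct list *p;                 /* prev */
};

struct cache_item {                 /* overlays the start of a free object */
    struct list by_pool;            /* link in the pool's local cache */
    struct list by_lru;             /* link in the thread's global LRU */
};

struct cache_head {
    struct list list;               /* freshest first, oldest last */
    unsigned int count;             /* number of cached objects */
};

#define ITEM_OF(ptr, member) \
    ((struct cache_item *)((char *)(ptr) - offsetof(struct cache_item, member)))

void list_init(struct list *l)
{
    l->n = l->p = l;
}

void list_insert(struct list *head, struct list *el)   /* at the beginning */
{
    el->n = head->n;
    el->p = head;
    head->n->p = el;
    head->n = el;
}

void list_delete(struct list *el)
{
    el->p->n = el->n;
    el->n->p = el->p;
}

/* Release: insert the object at the beginning of both lists. */
void cache_put(struct cache_head *ch, struct list *lru, void *obj)
{
    struct cache_item *it = obj;

    list_insert(&ch->list, &it->by_pool);
    list_insert(lru, &it->by_lru);
    ch->count++;
}

/* Allocate: pick the freshest entry of this pool and unlink it twice. */
void *cache_get(struct cache_head *ch)
{
    struct cache_item *it;

    if (ch->list.n == &ch->list)
        return NULL;
    it = ITEM_OF(ch->list.n, by_pool);
    list_delete(&it->by_pool);
    list_delete(&it->by_lru);
    ch->count--;
    return it;
}

/* Evict: pick the oldest entry of the thread, whichever pool it is from. */
void *cache_evict_oldest(struct list *lru)
{
    struct cache_item *it;
    struct cache_head *ch;

    if (lru->p == lru)
        return NULL;
    it = ITEM_OF(lru->p, by_lru);
    /* the oldest item's by_pool.n points to its pool's list head */
    ch = (struct cache_head *)((char *)it->by_pool.n -
                               offsetof(struct cache_head, list));
    assert(ch->count > 0);
    list_delete(&it->by_pool);
    list_delete(&it->by_lru);
    ch->count--;
    return it;
}
```

Recovering the cache head from the item itself avoids storing a pool pointer in
every cached object, which keeps the minimum object size at two list elements.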
Note:
  | Since a pool_cache_item has two list entries, on 64-bit systems it will be
  | 32 bytes long. This is the smallest size that a pool may be, and any smaller
  | size will automatically be rounded up to this size.

When build option DEBUG_POOL_INTEGRITY is set, or the boot-time option
"-dMintegrity" is passed on the command line, the area of the object between
the two list elements and the end according to pool->size will be filled with
pseudo-random words during pool_put_to_cache(), and these words will be
compared with each other during pool_get_from_cache(), and the process will
crash in case any bit differs, as this would indicate that the memory area was
modified after the free. The pseudo-random pattern is in fact incremented by
(~0)/3 upon each free so that roughly half of the bits change each time and we
maximize the likelihood of detecting a single bit flip in either direction. In
order to avoid an immediate reuse and maximize the time the object spends in
the cache, when this option is set, objects are picked from the cache from the
oldest one instead of the freshest one. This way even late memory corruptions
have a chance to be detected.
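The fill-and-compare principle can be sketched as follows. The names are
hypothetical, the area reserved for the two list elements is assumed to be
four pointers wide, and the sketch returns a status where the text above says
that the process crashes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed size of the two list elements at the start of a cached object. */
#define LINKS_SIZE (4 * sizeof(void *))

static uintptr_t fill_pattern;      /* advanced on every release */

/* Fill the area located after the list elements with the next pattern. */
void integrity_fill(void *obj, size_t size)
{
    uintptr_t *w = (uintptr_t *)((char *)obj + LINKS_SIZE);
    size_t nb, i;

    assert(size >= LINKS_SIZE);
    nb = (size - LINKS_SIZE) / sizeof(*w);

    fill_pattern += ~(uintptr_t)0 / 3;      /* the (~0)/3 increment */
    for (i = 0; i < nb; i++)
        w[i] = fill_pattern;
}

/* Compare the words with each other. Returns 1 if they are all identical,
 * 0 if at least one of them was modified while the object was cached.
 */
int integrity_check(const void *obj, size_t size)
{
    const uintptr_t *w = (const uintptr_t *)((const char *)obj + LINKS_SIZE);
    size_t nb, i;

    assert(size >= LINKS_SIZE);
    nb = (size - LINKS_SIZE) / sizeof(*w);

    for (i = 1; i < nb; i++) {
        if (w[i] != w[0])
            return 0;
    }
    return 1;
}
```

Comparing the words with each other rather than against a stored reference
means that the check does not need to remember which pattern was used for a
given object.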
When build option DEBUG_MEMORY_POOLS is set, or the boot-time option "-dMtag"
is passed on the executable's command line, pool objects are allocated with
one extra pointer compared to the requested size, so that the bytes that follow
the memory area point to the pool descriptor itself as long as the object is
allocated via pool_alloc(). Upon releasing via pool_free(), the pointer is
compared and the code will crash if it differs. This makes it possible to
detect both memory overflows and objects released to the wrong pool (typically
a code bug resulting from a copy-paste error).
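A minimal sketch of the trailing tag, using malloc() directly and a
hypothetical, reduced pool descriptor. The function names are invented for the
example, and abort() stands in for the crash mentioned above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct pool {                       /* hypothetical, reduced descriptor */
    size_t size;                    /* object size requested by the user */
};

/* Allocate pool->size bytes followed by a pointer to the owning pool. */
void *tagged_alloc(struct pool *pool)
{
    char *obj;

    assert(pool != NULL);
    obj = malloc(pool->size + sizeof(pool));
    if (obj)
        memcpy(obj + pool->size, &pool, sizeof(pool));  /* may be unaligned */
    return obj;
}

/* Return 1 if the tag following <obj> still designates <pool>, else 0. */
int tag_matches(const struct pool *pool, const void *obj)
{
    const struct pool *tag;

    memcpy(&tag, (const char *)obj + pool->size, sizeof(tag));
    return tag == pool;
}

/* Release <obj> to <pool>, stopping the program if the tag is wrong. */
void tagged_free(struct pool *pool, void *obj)
{
    if (!tag_matches(pool, obj))
        abort();
    free(obj);
}
```

A write running past the end of the object changes the tag, and an object
handed to the wrong pool carries another pool's address, so both errors show up
as a mismatch at release time.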
Thus an object will look like this depending on whether it's in the cache or
is currently in use:

           in cache                  in use
        +------------+           +------------+
     <--+ by_pool.p  |           |   N bytes  |
        | by_pool.n  +-->        |            |
        +------------+           |N=16 min on |
     <--+ by_lru.p   |           |   32-bit,  |
        | by_lru.n   +-->        | 32 min on  |
        +------------+           |   64-bit   |
        :            :           :            :
        |  N bytes   |           |            |
        +------------+           +------------+ \ optional, only if
        :  (unused)  :           :  pool ptr  :  > DEBUG_MEMORY_POOLS
        +------------+           +------------+ / is set at build time
                                                  or -dMtag at boot time

Right now no provisions are made to return objects aligned on larger boundaries
than those currently covered by malloc() (i.e. two pointers). This need appears
from time to time and the layout above might evolve a little bit if needed.


4. Storage in the process-wide shared pool
------------------------------------------

In order for the shared pool not to be a contention point in a multi-threaded
environment, objects are allocated from or released to shared pools by clusters
of a few objects at once. The maximum number of objects that may be moved to or
from a shared pool at once is defined by CONFIG_HAP_POOL_CLUSTER_SIZE at build
time, and currently defaults to 8.

In order to remain scalable, the shared pool has to make some tradeoffs to
limit the number of atomic operations and the duration of any locked operation.
As such, it's composed of a single-linked list of clusters, themselves made of
a single-linked list of objects.

Clusters and objects are of the same type "pool_item" and are accessed from the
pool's "free_list" member. This member points to the latest pool_item inserted
into the pool by a release operation. And the pool_item's "next" member points
to the next pool_item, which was the one present in the pool's free_list just
before the pool_item was inserted, and the last pool_item in the list simply
has a NULL "next" field.

The pool_item's "down" pointer points down to the next objects part of the same
cluster, that will be released or allocated at the same time as the first one.
Each of these items also has a NULL "next" field, and they are chained by their
respective "down" pointers until the last one is detected by a NULL value.

This results in the following layout:

   pool             pool_item   pool_item   pool_item
   +-----------+    +------+    +------+    +------+
   | free_list +--> | next +--> | next +--> | NULL |
   +-----------+    +------+    +------+    +------+
                    | down |    | NULL |    | down |
                    +--+---+    +------+    +--+---+
                       |                       |
                       V                       V
                    +------+                +------+
                    | NULL |                | NULL |
                    +------+                +------+
                    | down |                | NULL |
                    +--+---+                +------+
                       |
                       V
                    +------+
                    | NULL |
                    +------+
                    | NULL |
                    +------+

Allocating an entry is only a matter of performing two atomic operations on
the free_list and reading the first pool_item's "next" value:

  - atomically mark the free_list as being updated by writing a "magic" pointer
  - read the first pool_item's "next" field
  - atomically replace the free_list with this value

This results in a fast operation that retrieves a whole cluster at once.
Then outside of the critical section entries are walked over and inserted into
the local cache one at a time. In order to keep the code simple and efficient,
objects allocated from the shared pool are all placed into the local cache, and
only then the first one is allocated from the cache. This operation is
performed by the dedicated function pool_refill_local_from_shared() which is
called from pool_get_from_cache() when the cache is empty. It means there is an
overhead of two list insert/delete operations for the first object and that
could be avoided at the expense of more complex code in the fast path, but this
is negligible since it only concerns objects that need to be visited anyway.

Freeing a group of objects consists in performing the operation the other way
around:

  - atomically mark the free_list as being updated by writing a "magic" pointer
  - write the free_list value to the to-be-released item's "next" entry
  - atomically replace the free_list with the pool_item's pointer

The cluster will simply have to be prepared before being sent to the shared
pool. The operation of releasing a cluster at once is performed by function
pool_put_to_shared_cache() which is called from pool_evict_last_items() which
itself is responsible for building the clusters.
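The two three-step sequences above can be sketched with C11 atomics. This is a
simplified illustration, not the HAProxy code: the type and function names and
the value of the "magic" pointer are hypothetical, and waiting threads simply
spin without any pause or backoff.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct item {
    struct item *next;              /* next cluster in the shared pool */
    struct item *down;              /* next object of the same cluster */
};

/* Hypothetical "magic" value meaning "the free list is being updated". */
#define LIST_BUSY ((struct item *)1)

static _Atomic(struct item *) free_list;    /* zero-initialized: empty */

/* Step 1 of both operations: atomically write the magic pointer and get
 * the previous head, retrying for as long as another thread holds it.
 */
static struct item *grab_free_list(void)
{
    struct item *head;

    do {
        head = atomic_exchange(&free_list, LIST_BUSY);
    } while (head == LIST_BUSY);
    return head;
}

/* Detach and return the first cluster, or NULL if the pool is empty. */
struct item *get_cluster(void)
{
    struct item *head = grab_free_list();

    if (!head) {
        atomic_store(&free_list, (struct item *)NULL);
        return NULL;
    }
    /* read the first item's "next" and publish it as the new head */
    atomic_store(&free_list, head->next);
    head->next = NULL;
    return head;
}

/* Insert an already built cluster in front of the list. */
void put_cluster(struct item *cluster)
{
    assert(cluster != NULL);

    /* the previous head becomes this cluster's "next" */
    cluster->next = grab_free_list();
    atomic_store(&free_list, cluster);
}
```

The objects of the returned cluster are then reached through the "down"
pointers outside of the critical section, so the time during which the list is
marked busy does not depend on the cluster size.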
Due to the way objects are stored, it is important to try to group objects as
much as possible when releasing them because this is what will condition their
retrieval as groups as well. This is the reason why pool_evict_last_items()
uses the LRU to find a first entry but tries to pick several items at once from
a single cache. Tests have shown that CONFIG_HAP_POOL_CLUSTER_SIZE set to 8
achieves up to 6-6.5 objects on average per operation, which effectively
divides by as much the average time spent per object by each thread and pushes
the contention point further.

Also, grouping items in clusters is a property of the process-wide shared pool
and not of the thread-local caches. This means that there is no grouped
operation when not using the shared pool (mode "2" in the diagram above).


5. API
------

The following functions are public and available for user code:

struct pool_head *create_pool(char *name, uint size, uint flags)
        Create a new pool named <name> for objects of size <size> bytes. Pool
        names are truncated to their first 11 characters. Pools of very similar
        size will usually be merged if both have set the flag MEM_F_SHARED in
        <flags>. When DEBUG_DONT_SHARE_POOLS was set at build time, or
        "-dMno-merge" is passed on the executable's command line, the pools
        also need to have the exact same name to be merged. In addition, unless
        MEM_F_EXACT is set in <flags>, the object size will usually be rounded
        up to the size of pointers (16 or 32 bytes). The name that will appear
        in the pool upon merging is the name of the first created pool. The
        returned pointer is the new (or reused) pool head, or NULL upon error.
        Pools created this way must be destroyed using pool_destroy().

void *pool_destroy(struct pool_head *pool)
        Destroy pool <pool>, that is, all of its unused objects are freed and
        the structure is freed as well if the pool didn't have any used objects
        anymore. In this case NULL is returned. If some objects remain in use,
        the pool is preserved and its pointer is returned. This ought to be
        used essentially on exit or in rare situations where some internal
        entities that hold pools have to be destroyed.

void pool_destroy_all(void)
        Destroy all pools, without checking which ones still have used entries.
        This is only meant for use on exit.

void *__pool_alloc(struct pool_head *pool, uint flags)
        Allocate an entry from the pool <pool>. The allocator will first look
        for an object in the thread-local cache if enabled, then in the shared
        pool if enabled, then will fall back to the operating system's default
        allocator. NULL is returned if the object couldn't be allocated (due to
        configured limits or lack of memory). Objects allocated this way have
        to be released using pool_free(). Like with malloc(), by default the
        contents of the returned object are undefined. If memory poisoning is
        enabled, the object will be filled with the poisoning byte. If the
        global "tune.fail-alloc" setting is non-zero and DEBUG_FAIL_ALLOC is
        enabled, a random number generator will be called to randomly return a
        NULL. The allocator's behavior may be adjusted using a few flags passed
        in <flags>:
          - POOL_F_NO_POISON : when set, disables memory poisoning (e.g. when
                               pointless and expensive, like for buffers)
          - POOL_F_MUST_ZERO : when set, the memory area will be zeroed before
                               being returned, similar to what calloc() does
          - POOL_F_NO_FAIL   : when set, disables the random allocation
                               failure, e.g. for use during early init code or
                               critical sections.

void *pool_alloc(struct pool_head *pool)
        This is an exact equivalent of __pool_alloc(pool, 0). It is the regular
        way to allocate entries from a pool.

void *pool_alloc_nocache(struct pool_head *pool)
        Allocate an entry from the pool <pool>, bypassing the cache. If shared
        pools are enabled, they will be consulted first. Otherwise the object
        is allocated using the operating system's default allocator. This is
        essentially used during early boot to pre-allocate a number of objects
        for pools which require a minimum number of entries to exist.

void *pool_zalloc(struct pool_head *pool)
        This is an exact equivalent of __pool_alloc(pool, POOL_F_MUST_ZERO).

void pool_free(struct pool_head *pool, void *ptr)
        Free an entry allocated from one of the pool_alloc() functions above
        from pool <pool>. The object will be placed into the thread-local cache
        if enabled, or in the shared pool if enabled, or will be released using
        the operating system's default allocator. When a local cache is
        enabled, if the local cache size becomes larger than 75% of the maximum
        size configured at build time, some objects will be evicted to the
        shared pool. Such objects are taken first from the same pool, but if
        the total size is really huge, other pools might be checked as well.
        Some debug options enabled at build time may enforce extra checks so
        that the process will immediately crash if the object was not allocated
        from this pool or experienced an overflow or some memory corruption.
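The typical life cycle of a pooled object (create the pool, allocate, use,
free, destroy the pool) can be illustrated as follows. To keep the example
self-contained, the functions below are trivial stand-ins built on malloc()
and calloc(): only their names and argument lists follow the prototypes listed
above, while their bodies, the layout of struct pool_head, the value given to
MEM_F_SHARED and the "struct session" object type are assumptions made for
this sketch. In HAProxy code the real functions would be used instead.

```c
#include <assert.h>
#include <stdlib.h>

#define MEM_F_SHARED 0x1    /* placeholder value for this sketch */

/* Stand-in descriptor: only tracks the size and the number of used objects */
struct pool_head {
    unsigned int size;
    unsigned int used;
};

struct pool_head *create_pool(char *name, unsigned int size, unsigned int flags)
{
    struct pool_head *pool = malloc(sizeof(*pool));

    (void)name;
    (void)flags;
    if (pool) {
        pool->size = size;
        pool->used = 0;
    }
    return pool;
}

void *pool_alloc(struct pool_head *pool)
{
    void *obj = malloc(pool->size);         /* contents are undefined */

    if (obj)
        pool->used++;
    return obj;
}

void *pool_zalloc(struct pool_head *pool)
{
    void *obj = calloc(1, pool->size);      /* contents are zeroed */

    if (obj)
        pool->used++;
    return obj;
}

void pool_free(struct pool_head *pool, void *ptr)
{
    assert(pool->used > 0);
    pool->used--;
    free(ptr);
}

/* Returns NULL once destroyed, or <pool> if objects are still in use. */
void *pool_destroy(struct pool_head *pool)
{
    if (pool->used)
        return pool;
    free(pool);
    return NULL;
}

struct session {            /* hypothetical object type */
    int id;
    char peer[28];
};

/* Returns the stored id on success, -1 on allocation failure. */
int session_demo(void)
{
    struct pool_head *pool;
    struct session *s;
    int id;

    pool = create_pool("session", sizeof(struct session), MEM_F_SHARED);
    if (!pool)
        return -1;

    s = pool_zalloc(pool);
    if (!s) {               /* NULL must always be handled by the caller */
        pool_destroy(pool);
        return -1;
    }
    s->id = 42;
    id = s->id;

    pool_free(pool, s);     /* always release to the pool it came from */
    pool_destroy(pool);
    return id;
}
```

The object is released to the same pool it was allocated from, and the pool is
only destroyed once none of its objects remain in use, as required by the
descriptions of pool_free() and pool_destroy() above.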
void pool_flush(struct pool_head *pool)
        Free all unused objects from shared pool <pool>. Thread-local caches
        are not affected. This is essentially used when running low on memory
        or when stopping, in order to release a maximum amount of memory for
        the new process.

void pool_gc(struct pool_head *pool)
        Free all unused objects from all pools, but respecting the minimum
        number of spare objects required for each of them. Then, for operating
        systems which support it, indicate to the system that all unused memory
        can be released. Thread-local caches are not affected. This operation
        differs from pool_flush() in that it is run locklessly, under thread
        isolation, and on all pools in a row. It is called by the SIGQUIT
        signal handler and upon exit. Note that the obsolete argument <pool> is
        not used and the convention is to pass NULL there.

void dump_pools_to_trash(void)
        Dump the current status of all pools into the trash buffer. This is
        essentially used by the "show pools" CLI command or the SIGQUIT signal
        handler to dump them on stderr. The total report size may not exceed
        the size of the trash buffer. If it does, some entries will be missing.

void dump_pools(void)
        Dump the current status of all pools to stderr. This just calls
        dump_pools_to_trash() and writes the trash to stderr.

int pool_total_failures(void)
        Report the total number of failed allocations. This is solely used to
        report the "PoolFailed" metrics of the "show info" output. The total
        is calculated on the fly by summing the number of failures in all pools
        and is only meant to be used as an indicator rather than a precise
        measure.

ullong pool_total_allocated(void)
        Report the total number of bytes allocated in all pools, for reporting
        in the "PoolAlloc_MB" field of the "show info" output. The total is
        calculated on the fly by summing the number of allocated bytes in all
        pools and is only meant to be used as an indicator rather than a
        precise measure.

ullong pool_total_used(void)
        Report the total number of bytes used in all pools, for reporting in
        the "PoolUsed_MB" field of the "show info" output. The total is
        calculated on the fly by summing the number of used bytes in all pools
        and is only meant to be used as an indicator rather than a precise
        measure. Note that objects present in caches are accounted as used.

Some other functions exist and are only used by the pools code itself. While
not strictly forbidden to use outside of this code, it is generally recommended
to avoid touching them in order not to create undesired dependencies that will
complicate maintenance.

A few macros exist to ease the declaration of pools:

DECLARE_POOL(ptr, name, size)
        Placed at the top level of a file, this declares a global memory pool
        as variable <ptr>, name <name> and size <size> bytes per element. This
        is made via a call to REGISTER_POOL() and by assigning the resulting
        pointer to variable <ptr>. <ptr> will be created of type "struct
        pool_head *". If the pool needs to be visible outside of the function
        (which is likely), it will also need to be declared somewhere as
        "extern struct pool_head *<ptr>;". It is recommended to place such
        declarations very early in the source file so that the variable is
        already known to all subsequent functions which may use it.

DECLARE_STATIC_POOL(ptr, name, size)
        Placed at the top level of a file, this declares a static memory pool
        as variable <ptr>, name <name> and size <size> bytes per element. This
        is made via a call to REGISTER_POOL() and by assigning the resulting
        pointer to local variable <ptr>. <ptr> will be created of type "static
        struct pool_head *". It is recommended to place such declarations very
        early in the source file so that the variable is already known to all
        subsequent functions which may use it.


6. Build options
----------------

A number of build-time defines make it possible to tune the pools behavior. All
of them have to be enabled using "-Dxxx" or "-Dxxx=yyy" in the makefile's DEBUG
variable.

DEBUG_NO_POOLS
        When this is set, pools are entirely disabled, and allocations are made
        using malloc() instead. This is not recommended for production but may
        be useful for tracing allocations. It corresponds to "-dMno-cache" at
        boot time.

DEBUG_MEMORY_POOLS
        When this is set, an extra pointer is allocated at the end of each
        object to reference the pool the object was allocated from and detect
        buffer overflows. Then, pool_free() will provoke a crash in case it
        detects an anomaly (pointer at the end not matching the pool). It
        corresponds to "-dMtag" at boot time.

DEBUG_FAIL_ALLOC
        When enabled, a global setting "tune.fail-alloc" may be set to a non-
        zero value representing a percentage of memory allocations that will be
        made to fail in order to stress the calling code. It corresponds to
        "-dMfail" at boot time.

DEBUG_DONT_SHARE_POOLS
        When enabled, pools of similar sizes are not merged unless they have
        the exact same name. It corresponds to "-dMno-merge" at boot time.

DEBUG_UAF
        When enabled, pools are disabled and all allocations and releases pass
        through mmap() and munmap(). The memory usage significantly inflates
        and the performance degrades, but this makes it possible to detect a
        lot of use-after-free conditions by crashing the program at the first
        abnormal access. This should not be used in production. It corresponds
        to boot-time option "-dMuaf". Caching is disabled but may be re-enabled
        using "-dMcache".

DEBUG_POOL_INTEGRITY
        When enabled, objects picked from the cache are checked for corruption
        by comparing their contents against a pattern that was placed when they
        were inserted into the cache. Objects are also allocated in the reverse
        order, from the oldest one to the most recent, so as to maximize the
        ability to detect such a corruption. The goal is to detect writes after
        free (or possibly hardware memory corruptions). Contrary to DEBUG_UAF
        this cannot detect reads after free, but may possibly detect later
        corruptions and will not consume extra memory. The CPU usage will
        increase a bit due to the cost of filling/checking the area and for the
        preference for cold cache instead of hot cache, though not as much as
        with DEBUG_UAF. This option is meant to be usable in production. It
        corresponds to boot-time options "-dMcold-first,integrity".

DEBUG_POOL_TRACING
        When enabled, the callers of pool_alloc() and pool_free() will be
        recorded into an extra memory area placed after the end of the object.
        This may only be required by developers who want to get a few more
        hints about code paths involved in some crashes, but will serve no
        purpose outside of this. It remains compatible with (and complements
        well) DEBUG_POOL_INTEGRITY above. Such information becomes meaningless
        once the objects leave the thread-local cache. It corresponds to
        boot-time option "-dMcaller".

DEBUG_MEM_STATS
        When enabled, all malloc/calloc/realloc/strdup/free calls are accounted
        for per call place (file+line number), and may be displayed or reset on
        the CLI using "debug dev memstats". This is essentially used to detect
        potential leaks or abnormal usages. When pools are enabled (default),
        such calls are rare and the output will mostly contain calls induced by
        libraries. When pools are disabled, almost all calls to pool_alloc()
        and pool_free() will also appear since they will be remapped to
        standard functions.

CONFIG_HAP_GLOBAL_POOLS
        When enabled, process-wide shared pools will be forcefully enabled even
        if not considered useful on the platform. The default is to let haproxy
        decide based on the OS and C library. It corresponds to boot-time
        option "-dMglobal".

CONFIG_HAP_NO_GLOBAL_POOLS
        When enabled, process-wide shared pools will be forcefully disabled
        even if considered useful on the platform. The default is to let
        haproxy decide based on the OS and C library. It corresponds to
        boot-time option "-dMno-global".

CONFIG_HAP_POOL_CACHE_SIZE
        This allows one to define the default size of the per-thread cache, in
        bytes. The default value is 512 kB (524288). Smaller values will use
        less memory at the expense of a possibly higher CPU usage when using
        many threads. Higher values will give diminishing returns on
        performance while using much more memory. Usually there is no benefit
        in using more than a per-core L2 cache size. It would be better not to
        set this value lower than a few times the size of a buffer (bufsize,
        defaults to 16 kB). In addition, keep in mind that this option may be
        changed at runtime using "tune.memory.hot-size".

CONFIG_HAP_POOL_CLUSTER_SIZE
        This allows one to define the maximum number of objects that will be
        grouped together in an allocation from the shared pool. Values 4 to 8
        have experimentally shown good results with 16 threads. On systems with
        more cores or loosely coupled caches exhibiting slow atomic operations,
        it could possibly make sense to slightly increase this value.