MEDIUM: dynbuf: use emergency buffers upon failed memory allocations

Now, if a pool_alloc() fails for a buffer and if conditions are met
based on the queue number, we'll try to get an emergency buffer.
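
As a rough illustration of the rule above, here is a minimal standalone
model of the decision. It is a sketch, not the HAProxy code: struct model,
enum buf_source, may_alloc_for_queue() and get_buffer() are hypothetical
names, the pool is reduced to a flag, and it assumes that taking an
emergency buffer decrements the per-thread counter. Only the integer
comparison is taken from the patch.

```c
#include <assert.h>

/* Hypothetical stand-ins for the per-thread and global state. */
struct model {
    unsigned int reserved_bufs;       /* configured size of the reserve */
    unsigned int emergency_bufs_left; /* reserve buffers not yet handed out */
    int pool_has_memory;              /* 0 simulates a failing pool_alloc() */
};

enum buf_source { BUF_NONE = 0, BUF_POOL, BUF_EMERGENCY };

/* Same comparison as in the patch: queue q is refused once fewer than
 * q/3 of the reserve remains (q is expected to be in the range 0..3).
 */
static int may_alloc_for_queue(const struct model *m, unsigned int q)
{
    assert(q <= 3);
    if (m->emergency_bufs_left * 3 < q * m->reserved_bufs)
        return 0;
    return 1;
}

/* Try the regular pool first, then fall back to the reserve. */
static enum buf_source get_buffer(struct model *m, unsigned int q)
{
    if (!may_alloc_for_queue(m, q))
        return BUF_NONE;          /* the caller would have to wait */
    if (m->pool_has_memory)
        return BUF_POOL;
    if (m->emergency_bufs_left > 0) {
        m->emergency_bufs_left--; /* assumption: the reserve is a counter */
        return BUF_EMERGENCY;
    }
    return BUF_NONE;
}
```

With reserved_bufs set to 4 in this model, queue 3 is only served while all
4 reserve buffers are left, queue 2 needs at least 3, queue 1 at least 2,
and queue 0 is always allowed to try.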

Thanks to this, the situation is much more stable now. With only 4 reserve
buffers and 1 buffer, it's possible to reliably serve 500 concurrent
end-to-end H1 connections while consulting stats in parallel in loops,
with "show activity" showing the growing number of buf_wait events, without
facing an instant stall like in the past. Lower values still cause quick
stalls, though.

It's also apparent that some subsystems do not seem to detach from the
buffer_wait lists when leaving. For example several crashes in the H1
part showed list elements still present after a free(), so maybe some
operations performed inside h1_release() after the b_dequeue() call
can sometimes result in a new allocation. Same for streams, where
the dequeue is done relatively early.
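
The suspected sequence can be pictured with a toy wait list. This is only
a sketch of the hypothesis above, not the H1 or stream code: struct waiter,
struct conn, late_work() and both release helpers are made-up names, and
late_work() merely stands in for any late operation whose buffer allocation
fails and puts the element back in the list.

```c
#include <assert.h>

/* Minimal circular doubly-linked list; a self-linked node is "not queued". */
struct waiter { struct waiter *prev, *next; };

static void list_init(struct waiter *w) { w->prev = w->next = w; }

static int waiter_queued(const struct waiter *w) { return w->next != w; }

static void waiter_enqueue(struct waiter *head, struct waiter *w)
{
    assert(!waiter_queued(w));
    w->next = head;
    w->prev = head->prev;
    head->prev->next = w;
    head->prev = w;
}

static void waiter_dequeue(struct waiter *w)
{
    w->prev->next = w->next;
    w->next->prev = w->prev;
    w->prev = w->next = w;
}

/* Hypothetical connection owning one wait-list element. */
struct conn { struct waiter buf_wait; };

/* Stand-in for a late operation that fails to allocate and re-queues. */
static void late_work(struct waiter *head, struct conn *c)
{
    if (!waiter_queued(&c->buf_wait))
        waiter_enqueue(head, &c->buf_wait);
}

/* Dequeue early: returns 1 if the element is still linked at the point
 * where the connection would be freed.
 */
static int release_dequeue_early(struct waiter *head, struct conn *c)
{
    waiter_dequeue(&c->buf_wait);
    late_work(head, c);
    return waiter_queued(&c->buf_wait);
}

/* Dequeue as the very last step before the free. */
static int release_dequeue_last(struct waiter *head, struct conn *c)
{
    late_work(head, c);
    waiter_dequeue(&c->buf_wait);
    return waiter_queued(&c->buf_wait);
}
```

In this model the early-dequeue variant leaves the element linked when the
object would be freed, while dequeuing last does not. Whether this is what
actually happens in h1_release() remains to be confirmed.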
Willy Tarreau 2024-04-29 08:36:09 +02:00
parent 0ce51dc93b
commit fc792694a6

@@ -32,6 +32,7 @@
 #include <haproxy/buf.h>
 #include <haproxy/chunk.h>
 #include <haproxy/dynbuf-t.h>
+#include <haproxy/global.h>
 #include <haproxy/pool.h>
 extern struct pool_head *pool_head_buffer;
@@ -67,6 +68,18 @@ static inline int b_may_alloc_for_crit(uint crit)
 	if (!(crit & DB_F_NOQUEUE) && th_ctx->bufq_map & ((2 << q) - 1))
 		return 0;
+	/* If the emergency buffers are too low, we won't try to allocate a
+	 * buffer either so that we speed up their release. As a corollary, it
+	 * means that we're always allowed to try to fall back to an emergency
+	 * buffer if pool_alloc() fails. The minimum number of available
+	 * emergency buffers for an allocation depends on the queue:
+	 *   q == 0 -> 0%
+	 *   q == 1 -> 33%
+	 *   q == 2 -> 66%
+	 *   q == 3 -> 100%
+	 */
+	if (th_ctx->emergency_bufs_left * 3 < q * global.tune.reserved_bufs)
+		return 0;
 	return 1;
 }
@@ -100,8 +113,11 @@ static inline char *__b_get_emergency_buf(void)
 	\
 	if (!_retbuf->size) { \
 		*_retbuf = BUF_WANTED; \
-		if (b_may_alloc_for_crit(_criticality)) \
+		if (b_may_alloc_for_crit(_criticality)) { \
 			_area = pool_alloc_flag(pool_head_buffer, POOL_F_NO_POISON | POOL_F_NO_FAIL); \
+			if (unlikely(!_area)) \
+				_area = __b_get_emergency_buf(); \
+		} \
 		if (unlikely(!_area)) { \
 			activity[tid].buf_wait++; \
 			_retbuf = NULL; \