haproxy/include
Willy Tarreau 4dd33d9c32 OPTIM: pool: split the read_mostly from read_write parts in pool_head
Performance profiling on a 48-thread machine showed a lot of time spent
in pool_free(), precisely at the point where pool->limit was retrieved.
And the reason is simple. Some parts of the pool_head are heavily updated
only when facing a cache miss ("allocated", "used", "needed_avg"), while
others are always accessed (limit, flags, size). Because both groups of
fields were stored in the same cache line, each update by one thread
invalidated that line for all the others, which made it very expensive
for each thread to read this precious info even when working with its
own cache.

By just splitting the fields apart, a test on QUIC (which stresses pools
a lot) more than doubled performance from 42 Gbps to 96 Gbps!
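
The idea of the split can be sketched as follows. This is only an illustration, not the actual pool_head definition: the struct name demo_pool, the CACHELINE_SIZE value of 64 bytes, the field types and the helper same_cache_line() are all assumptions made for the example. Only the field names come from the description above.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Assumption: 64-byte cache lines, common on x86-64 but not universal. */
#define CACHELINE_SIZE 64

/* Hypothetical pool descriptor, NOT HAProxy's real struct pool_head. */
struct demo_pool {
    /* Read-mostly part: set at creation, read on every alloc/free. */
    alignas(CACHELINE_SIZE) unsigned int limit;
    unsigned int flags;
    unsigned int size;

    /* Read-write part: updated on cache misses. The alignment forces it
     * to start on a new cache line, away from the fields above.
     */
    alignas(CACHELINE_SIZE) unsigned int allocated;
    unsigned int used;
    unsigned int needed_avg;
};

/* Returns nonzero when two byte offsets fall into the same cache line. */
static int same_cache_line(size_t a, size_t b)
{
    return a / CACHELINE_SIZE == b / CACHELINE_SIZE;
}

/* Compile-time check that the two groups do not share a cache line. */
static_assert(offsetof(struct demo_pool, size) / CACHELINE_SIZE !=
              offsetof(struct demo_pool, allocated) / CACHELINE_SIZE,
              "read-mostly and read-write parts share a cache line");
```

With such a layout, a thread updating "allocated" or "used" no longer invalidates the line holding "limit" in the other threads' caches. The cost is some padding: the structure grows to a multiple of the cache line size.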

Given that the patch only reorders fields and addresses such significant
contention, it should be backported to 2.7 and 2.6.
2022-12-20 14:51:12 +01:00
..
haproxy OPTIM: pool: split the read_mostly from read_write parts in pool_head 2022-12-20 14:51:12 +01:00
import CLEANUP: assorted typo fixes in the code and comments 2022-11-30 14:02:36 +01:00
make BUILD: makefile: move the compiler option detection stuff to compiler.mk 2022-11-17 10:56:35 +01:00