MEDIUM: pools: refine pool size rounding

The pool sizes were rounded up a little too much by commit
30f931ead ("BUG/MEDIUM: pools: fix the minimum allocation size"). The
goal was in fact to make sure they were always at least large enough to
store 2 list heads, but stuffing this requirement into the alignment
calculation resulted in the size always being rounded up to that
minimum. This is problematic because it means that the tag appended at
the end doesn't always catch potential overflows, since more bytes than
needed are allocated. Moreover, this test was later reinforced by commit
b5ba09ed5 ("BUG/MEDIUM: pools: ensure items are always large enough for
the pool_cache_item"), proving that the first test was not always
sufficient.

This needs to be reworked to proceed correctly:
  - the two lists are only needed while the object sits in the cache,
    i.e. when we don't care about the tag, so the tag's size, if any,
    can cover the missing bytes needed to reach that minimum. This is
    actually what was already being checked for.

  - the rounding should not be performed (beyond the size of a word to
    preserve pointer alignment) when pool tagging is enabled, otherwise
    small overflows go undetected (see the sketch after this list). It
    does mean that fewer pools get merged this way: tests show that 93
    pools merge down to 36 without tags versus 43 with tags enabled.

  - the rounding should not consider the extra size, since that is
    already added when calculating the allocated size later (i.e. don't
    round up twice). The difference is subtle, but it is what makes
    sure the tag immediately follows the area instead of being pushed
    to the end of the rounded allocation.
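
The following companion sketch (same assumptions: 40-byte object, 8-byte
tag, 64-bit target; the pool_cache_item minimum check is left out) shows
the effect of the rules above:

  #include <stdio.h>

  int main(void)
  {
      unsigned int size = 40;  /* assumed object size */
      int tag_enabled   = 1;   /* pool tagging (POOL_DBG_TAG) enabled? */

      /* new rounding: only a word when tagging, so that the tag stays
       * adjacent to the object; 16 bytes otherwise, to keep merging
       * almost identical pools
       */
      unsigned int align   = tag_enabled ? sizeof(void *) : 16;
      unsigned int rounded = (size + align - 1) & -align;

      printf("requested=%u rounded=%u slack=%u\n", size, rounded, rounded - size);
      /* with tagging: requested=40 rounded=40 slack=0 -> the tag is
       * appended right at offset 40, so even a one-byte overflow lands
       * on it and can be caught
       */
      return 0;
  }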

Thanks to this, writing one byte too many at the end of a struct
stream is now instantly caught.

Author: Willy Tarreau
Date:   2023-09-12 15:38:32 +02:00
parent 61575769ac
commit 93c2ea0ec3

@@ -295,35 +295,31 @@ struct pool_head *create_pool(char *name, unsigned int size, unsigned int flags)
 	unsigned int align;
 	int thr __maybe_unused;
 
-	/* We need to store a (void *) at the end of the chunks. Since we know
-	 * that the malloc() function will never return such a small size,
-	 * let's round the size up to something slightly bigger, in order to
-	 * ease merging of entries. Note that the rounding is a power of two.
-	 * This extra (void *) is not accounted for in the size computation
-	 * so that the visible parts outside are not affected.
-	 *
-	 * Note: for the LRU cache, we need to store 2 doubly-linked lists.
-	 */
-
 	extra_mark = (pool_debugging & POOL_DBG_TAG) ? POOL_EXTRA_MARK : 0;
 	extra_caller = (pool_debugging & POOL_DBG_CALLER) ? POOL_EXTRA_CALLER : 0;
 	extra = extra_mark + extra_caller;
 
-	if (!(flags & MEM_F_EXACT)) {
-		align = 4 * sizeof(void *); // 2 lists = 4 pointers min
-		size = ((size + extra + align - 1) & -align) - extra;
-	}
-
 	if (!(pool_debugging & POOL_DBG_NO_CACHE)) {
-		/* we'll store two lists there, we need the room for this. This is
-		 * guaranteed by the test above, except if MEM_F_EXACT is set, or if
-		 * the only EXTRA part is in fact the one that's stored in the cache
-		 * in addition to the pci struct.
+		/* we'll store two lists there, we need the room for this. Let's
+		 * make sure it's always OK even when including the extra word
+		 * that is stored after the pci struct.
 		 */
 		if (size + extra - extra_caller < sizeof(struct pool_cache_item))
 			size = sizeof(struct pool_cache_item) + extra_caller - extra;
 	}
 
+	/* Now we know our size is set to the strict minimum possible. It may
+	 * be OK for elements allocated with an exact size (e.g. buffers), but
+	 * we're going to round the size up 16 bytes to merge almost identical
+	 * pools together. We only round up however when we add the debugging
+	 * tag since it's used to detect overflows. Otherwise we only round up
+	 * to the size of a word to preserve alignment.
+	 */
+	if (!(flags & MEM_F_EXACT)) {
+		align = (pool_debugging & POOL_DBG_TAG) ? sizeof(void *) : 16;
+		size = ((size + align - 1) & -align);
+	}
+
 	/* TODO: thread: we do not lock pool list for now because all pools are
 	 * created during HAProxy startup (so before threads creation) */
 	start = &pools;