haproxy public development tree
Aleksey Ponomaryov 593802128c BUG/MEDIUM: stick-table: do not leave entries in end of window during purge
At times, expired stick table records stop being removed. This happens
when the internal time wraps around the 32-bit limit, i.e. once every
49.7 days. What precisely happens is that some elements that are collected
close to the end of the time window (2^32 - table's "expire" setting)
might have been updated and will be requeued further, at the beginning
of the next window. Here, three bad situations happen:

  - the incorrect integer-based comparison, which is not aware of wrapping,
    causes the scan to restart from the freshly requeued element,
    skipping all those at the end of the window. The net effect of this
    is that at each wakeup of the expiration task, only one element from
    the end of the window will be expired, and other ones will remain
    there for a very long time, especially if they have to wait for all
    the predecessors to be picked one at a time after slow wakeups due
    to a long expiration ; this is what was observed in issue #2034
    making the table fill up and appear as not expiring at all, and it
    seems that issue #2024 reports the same problem at the same moment
    (since such issues happen for everyone roughly at the same time
    when the clock doesn't drift too much).

  - the elements that were placed at the beginning of the next window
    are skipped as well for as long as there are refreshed entries at
    the end of the previous window, so these also contribute to filling
    the table. This is caused by the restart from the current, updated
    node, which is generally placed after most other less recently
    updated elements.

  - once the last element at the end of the window is picked, suddenly
    there is a large amount of expired entries at the beginning of the
    next window that all have to be requeued. If the expiration delay
    is large, the number can be big and it can take a long time, which
    can very likely explain the periodic crashes reported in issue #2025.
    Limiting the batch size as done in commit dfe79251d ("BUG/MEDIUM:
    stick-table: limit the time spent purging old entries") would make
    sense for process_table_expire() as well.
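
To illustrate the first point above, here is a minimal sketch of why a plain
integer comparison misorders dates around the wrap. It assumes a 32-bit
millisecond counter (which is what makes the period 49.7 days). The names
wrap_period_days() and naive_is_before() are hypothetical and exist only for
this example; they are not taken from the HAProxy sources.

```c
#include <stdint.h>

/* Hypothetical helper: number of days before a 32-bit millisecond
 * counter wraps (2^32 ms expressed in days, roughly 49.7).
 */
static double wrap_period_days(void)
{
	return 4294967296.0 / (1000.0 * 60.0 * 60.0 * 24.0);
}

/* Hypothetical naive ordering that ignores wrapping. An entry requeued
 * just past the wrap gets a small date and is wrongly considered
 * "before" the older entries still sitting at the end of the window.
 */
static int naive_is_before(uint32_t a, uint32_t b)
{
	return a < b;
}
```

With an entry dated about 5s before the wrap and another one requeued 15s
later (hence roughly 10s after the wrap), naive_is_before() reports the
older entry as not being before the newer one, which matches the ordering
problem described above.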

This patch addresses the incorrect tree scan algorithm to make sure that:
  - there's always a next element to compare against: when dealing with
    the last one in the tree, the first one must be used ;

  - time comparisons used to decide whether to restart from the current
    element use tick_is_lt() as it is the only case where we know the
    current element will be placed before any other one (since the tree
    respects insertion ordering for duplicates).
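
The two rules above can be sketched as follows. This is only an illustration
under stated assumptions, not the code of the patch: wrap_is_lt() is a
hypothetical stand-in showing the kind of wrap-aware "strictly before" test
that tick_is_lt() is used for, and next_circular() uses an index in a sorted
array in place of the real tree to show the "use the first one after the last
one" rule.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wrap-aware comparison: returns non-zero if <a> is
 * strictly before <b>. It is only meaningful while both dates are less
 * than 2^31 ms apart. The unsigned subtraction wraps by definition; the
 * conversion to int32_t assumes the usual two's complement behaviour.
 */
static int wrap_is_lt(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/* Hypothetical circular successor: after the last of <count> elements,
 * go back to the first one so that there is always a next element to
 * compare against. <count> must be non-zero.
 */
static size_t next_circular(size_t idx, size_t count)
{
	return (idx + 1 < count) ? idx + 1 : 0;
}
```

With such a test, a date located 5s before the wrap is correctly seen as
being before a date located 10s after it, and two equal dates are never
reported as strictly ordered.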

In order to reproduce the issue, it was found that injecting traffic on
a random key that spans over half of the size of a table whose expiration
is set to 15s while the date is going to wrap in 20s does exhibit an
increase of the table's size 5s after startup, when entries start to be
pushed to the next window. It's more effective when a second load
generator constantly hammers the same key, to be certain that the entries
at the end of the window are not all ready to expire. This doesn't happen
anymore after this patch.

This fix needs to be backported to all stable versions. The bug has been
there ever since stick tables were introduced in 1.4-dev7 with
commit 3bd697e07 ("[MEDIUM] Add stick table (persistence) management
functions and types"). A cleanup could consist of deduplicating that
code by having process_table_expire() call __stktable_trash_oldest(),
with that one improved to support an optional time check.
2023-02-08 08:55:02 +01:00
.github CI: Reformat matrix.py using black 2023-01-03 16:28:34 +01:00
addons BUG/MINOR: promex: Don't forget to consume the request on error 2023-01-13 09:45:23 +01:00
admin BUILD: halog: fix missing double-quote at end of help line 2022-11-25 11:11:41 +01:00
dev DEV: hpack: fix trash build regression 2023-01-27 10:22:20 +01:00
doc [RELEASE] Released version 2.8-dev3 2023-02-04 10:51:05 +01:00
examples EXAMPLES: remove completely outdated acl-content-sw.cfg 2022-05-30 18:14:24 +02:00
include MINOR: quic: Update version_information transport parameter to draft-14 2023-02-06 11:54:07 +01:00
reg-tests REGTEST: add RFC7239 forwarded header tests 2023-01-27 15:18:59 +01:00
scripts SCRIPTS: run-regtests: add a version check 2022-11-30 18:44:33 +01:00
src BUG/MEDIUM: stick-table: do not leave entries in end of window during purge 2023-02-08 08:55:02 +01:00
tests TESTS: add a unit test for one_among_mask() 2022-06-21 20:29:57 +02:00
.cirrus.yml CI: cirrus-ci: bump FreeBSD image to 13-1 2022-09-09 13:30:17 +02:00
.gitattributes MINOR: Configure the cpp userdiff driver for *.[ch] in .gitattributes 2021-02-22 18:17:57 +01:00
.gitignore CLEANUP: exclude udp-perturb with .gitignore 2022-09-16 15:47:04 +02:00
.mailmap DOC: update Tim's address in .mailmap 2021-09-16 09:14:14 +02:00
.travis.yml CI: travis-ci: temporarily disable arm64 builds 2021-08-07 07:28:15 +02:00
BRANCHES DOC: fix some spelling issues over multiple files 2021-01-08 14:53:47 +01:00
CHANGELOG [RELEASE] Released version 2.8-dev3 2023-02-04 10:51:05 +01:00
CONTRIBUTING CLEANUP: assorted typo fixes in the code and comments 2021-08-16 12:37:59 +02:00
INSTALL MINOR: version: mention that it's development again 2022-12-01 15:24:10 +01:00
LICENSE LICENSE: add licence exception for OpenSSL 2012-09-07 13:52:26 +02:00
MAINTAINERS CLEANUP: assorted typo fixes in the code and comments 2022-11-30 14:02:36 +01:00
Makefile BUILD: makefile: fix PCRE overriding specific lib path 2023-02-03 09:42:49 +01:00
README DOC: create a BRANCHES file to explain the life cycle 2019-06-15 22:00:14 +02:00
SUBVERS BUILD: use format tags in VERDATE and SUBVERS files 2013-12-10 11:22:49 +01:00
VERDATE [RELEASE] Released version 2.8-dev3 2023-02-04 10:51:05 +01:00
VERSION [RELEASE] Released version 2.8-dev3 2023-02-04 10:51:05 +01:00

The HAProxy documentation has been split into a number of different files for
ease of use.

Please refer to the following files depending on what you're looking for :

  - INSTALL for instructions on how to build and install HAProxy
  - BRANCHES to understand the project's life cycle and what version to use
  - LICENSE for the project's license
  - CONTRIBUTING for the process to follow to submit contributions

The more detailed documentation is located in the doc/ directory :

  - doc/intro.txt for a quick introduction on HAProxy
  - doc/configuration.txt for the configuration's reference manual
  - doc/lua.txt for the Lua reference manual
  - doc/SPOE.txt for how to use the SPOE engine
  - doc/network-namespaces.txt for how to use network namespaces under Linux
  - doc/management.txt for the management guide
  - doc/regression-testing.txt for how to use the regression testing suite
  - doc/peers.txt for the peers protocol reference
  - doc/coding-style.txt for how to adopt HAProxy's coding style
  - doc/internals for developer-specific documentation (not all up to date)