Commit Graph

16177 Commits

Author SHA1 Message Date
Remi Tricot-Le Breton
bb3e80e181 BUG/MINOR: vars: Fix the set-var and unset-var converters
In commit 3a4bedccc6 the variable logic was changed. Instead of
accessing variables by their name during runtime, the variable tables
are now indexed by a hash of the name. But the set-var and unset-var
converters try to access the correct variable by calculating a hash on
the sample instead of the already calculated variable hash.

It should be backported to 2.5.
2021-12-01 10:32:19 +01:00
Frédéric Lécaille
008386bec4 MINOR: quic: Delete the ODCIDs asap
As soon as the connection ID (the one choosen by the QUIC server) has been used
by the client, we can delete its original destination connection ID from its tree.
2021-11-30 12:01:32 +01:00
Frédéric Lécaille
a7d2c09468 MINOR: quic: Enable the Key Update process
This patch modifies ha_quic_set_encryption_secrets() to store the
secrets received by the TLS stack and prepare the information for the
next key update thanks to quic_tls_key_update().
qc_pkt_decrypt() is modified to check if we must used the next or the
previous key phase information to decrypt a short packet.
The information are rotated if the packet could be decrypted with the
next key phase information. Then new secrets, keys and IVs are updated
calling quic_tls_key_update() to prepare the next key phase.
quic_build_packet_short_header() is also modified to handle the key phase
bit from the current key phase information.
2021-11-30 11:51:12 +01:00
Frédéric Lécaille
a7973a6dce MINOR: quic: Add quic_tls_key_update() function for Key Update
This function derives the next RX and TX keys and IVs from secrets
for the next key update key phase. We also implement quic_tls_rotate_keys()
which rotate the key update key phase information to be able to continue
to decrypt old key phase packets. Most of these information are pointers
to unsigned char.
2021-11-30 11:51:12 +01:00
Frédéric Lécaille
6e351d6c19 MINOR: quic: Optional header protection key for quic_tls_derive_keys()
quic_tls_derive_keys() is responsible to derive the AEAD keys, IVs and$
header protection key from a secret provided by the TLS stack. We want
to make the derivation of the header protection key be optional. This
is required for the Key Update process where there is no update for
the header protection key.
2021-11-30 11:51:12 +01:00
Frédéric Lécaille
40df78f116 MINOR: quic: Add structures to maintain key phase information
When running Key Update process, we must maintain much information
especially when the key phase bit has been toggled by the peer as
it is possible that it is due to late packets. This patch adds
quic_tls_kp new structure to do so. They are used to store
previous and next secrets, keys and IVs associated to the previous
and next RX key phase. We also need the next TX key phase information
to be able to encrypt packets for the next key phase.
2021-11-30 11:51:12 +01:00
Frédéric Lécaille
39484de813 MINOR: quic: Add a function to derive the key update secrets
This is the function used to derive an n+1th secret from the nth one as
described in RFC9001 par. 6.1.
2021-11-30 11:51:12 +01:00
Frédéric Lécaille
fc768ecc88 MINOR: quic: Dynamically allocate the secrete keys
This is done for any encryption level. This is to prepare the Key Update feature.
2021-11-30 11:51:12 +01:00
Frédéric Lécaille
d77c50b6d6 MINOR: quic: Possible crash when inspecting the xprt context
haproxy may crash when running this statement in qc_lstnr_pkt_rcv():
	conn_ctx = qc->conn->xprt_ctx;
because qc->conn may not be initialized. With this patch we ensure
qc->conn is correctly initialized before accessing its ->xprt_ctx
members. We zero the xrpt_ctx structure (ssl_conn_ctx struct), then
initialize its ->conn member with HA_ATOMIC_STORE. Then, ->conn and
->conn->xptr_ctx members of quic_conn struct can be accessed with HA_ATOMIC_LOAD()
2021-11-30 11:50:42 +01:00
Frédéric Lécaille
e2660e61e2 MINOR: quic: Rename qc_prep_hdshk_pkts() to qc_prep_pkts()
qc_prep_hdshk_pkts() does not prepare only handshake packets but
any type of packet.
2021-11-30 11:47:46 +01:00
Frédéric Lécaille
b5b5247b18 MINOR: quic: Immediately close if no transport parameters extension found
If the ClientHello callback does not manage to find a correct QUIC transport
parameters extension, we immediately close the connection with
missing_extension(109) as TLS alert which is turned into 0x16d QUIC connection
error.
2021-11-30 11:47:46 +01:00
Frédéric Lécaille
1fc5e16c4c MINOR: quic: More accurate immediately close.
When sending a CONNECTION_CLOSE frame to immediately close the connection,
do not provide CRYPTO data to the TLS stack. Do not built anything else than a
CONNECTION_CLOSE and do not derive any secret when in immediately close state.
Seize the opportunity of this patch to rename ->err quic_conn struct member
to ->error_code.
2021-11-30 11:47:46 +01:00
Frédéric Lécaille
067a82bba1 MINOR: quic: Set "no_application_protocol" alert
We set this TLS error when no application protocol could be negotiated
via the TLS callback concerned. It is converted as a QUIC CRYPTO_ERROR
error (0x178).
2021-11-30 11:47:46 +01:00
Willy Tarreau
3cc1e3d5ca BUILD: evports: remove a leftover from the dead_fd cleanup
Commit b1f29bc62 ("MINOR: activity/fd: remove the dead_fd counter") got
rid of FD_UPDT_DEAD, but evports managed to slip through the cracks and
wasn't cleaned up, thus it doesn't build anymore, as reported in github
issue #1467. We just need to remove the related lines since the situation
is already handled by the remaining conditions.

Thanks to Dominik Hassler for reporting the issue and confirming the fix.

This must be backported to 2.5 only.
2021-11-30 09:34:32 +01:00
Christopher Faulet
d98da3bc90 BUG/MEDIUM: cli: Properly set stream analyzers to process one command at a time
The proxy used by the master CLI is an internal proxy and no filter are
registered on it. Thus, there is no reason to take care to set or unset
filter analyzers in the master CLI analyzers. AN_REQ_FLT_END was set on the
request channel to prevent the infinite forward and be sure to be able to
process one commande at a time. However, the only work because
CF_FLT_ANALYZE flag was used by error as a channel analyzer instead of a
channel flag. This erroneously set AN_RES_FLT_END on the request channel,
that really prevent the infinite forward, be side effet.

In fact, We must avoid this kind of trick because this only work by chance
and may be source of bugs in future. Instead, we must always keep the CLI
request analyzer and add an early return if the response is not fully
processed. It happens when the CLI response analyzer is set.

This patch must be backported as far as 2.0.
2021-11-29 11:28:54 +01:00
Willy Tarreau
4673c5e2c8 CI: github actions: add the output of $CC -dM -E-
Sometimes figuring what differs between platforms is useful to fix
build issues, to decide what ifdef to add for example. Let's always
call $CC -dM -E- before starting make.
2021-11-26 17:58:42 +01:00
Willy Tarreau
781f07a620 BUILD: pools: only detect link-time jemalloc on ELF platforms
The build broke on Windows and MacOS after commit ed232148a ("MEDIUM:
pool: refactor malloc_trim/glibc and jemalloc api addition detections."),
because the extern+attribute(weak) combination doesn't result in a really
weak symbol and it causes an undefined symbol at link time.

Let's reserve this detection to ELF platforms. The runtime detection using
dladdr() remains used if defined.

No backport needed, this is purely 2.6.
2021-11-26 16:13:17 +01:00
William Lallemand
efd954793e BUG/MINOR: mworker: deinit of thread poller was called when not initialized
Commit 67e371e ("BUG/MEDIUM: mworker: FD leak of the eventpoll in wait
mode") introduced a regression. Upon a reload it tries to deinit the
poller per thread, but no poll loop was initialized after loading the
configuration.

This patch fixes the issue by moving this part of the code in
mworker_reload(), since this function will be called only when the
poller is fully initialized.

This patch must be backported in 2.5.
2021-11-26 14:43:57 +01:00
David Carlier
d450ff636c MEDIUM: pool: support purging jemalloc arenas in trim_all_pools()
In the case of Linux/glibc, falling back to malloc_trim if jemalloc
had not been detected beforehand.
2021-11-25 18:54:50 +01:00
David Carlier
ed232148a7 MEDIUM: pool: refactor malloc_trim/glibc and jemalloc api addition detections.
Attempt to detect jemalloc at runtime before hand whether linked or via
symbols overrides, and fall back to malloc_trim/glibc for Linux otherwise.
2021-11-25 18:54:50 +01:00
Amaury Denoyelle
5bae85d0d2 MINOR: quic: use more verbose QUIC traces set at compile-time
Remove the verbosity set to 0 on quic_init_stdout_traces. This will
generate even more verbose traces on stdout with the default verbosity
of 1 when compiling with -DENABLE_QUIC_STDOUT_TRACES.
2021-11-25 18:10:58 +01:00
Amaury Denoyelle
118b2cbf84 MINOR: quic: activate QUIC traces at compilation
Implement a function quic_init_stdout_traces called at STG_INIT. If
ENABLE_QUIC_STDOUT_TRACES preprocessor define is set, the QUIC trace
module will be automatically activated to emit traces on stdout on the
developer level.

The main purpose for now is to be able to generate traces on the haproxy
docker image used for QUIC interop testing suite. This should facilitate
test failure analysis.
2021-11-25 16:12:44 +01:00
Amaury Denoyelle
7d3aea50b8 MINOR: qpack: support litteral field line with non-huff name
Support qpack header using a non-huffman encoded name in a litteral
field line with name reference.

This format is notably used by picoquic client and should improve
haproxy interop covering.
2021-11-25 11:41:29 +01:00
Amaury Denoyelle
d6a352a58b MEDIUM: quic: handle CIDs to rattach received packets to connection
Change the way the CIDs are organized to rattach received packets DCID
to QUIC connection. This is necessary to be able to handle multiple DCID
to one connection.

For this, the quic_connection_id structure has been extended. When
allocated, they are inserted in the receiver CID tree instead of the
quic_conn directly. When receiving a packet, the receiver tree is
inspected to retrieve the quic_connection_id. The quic_connection_id
contains now contains a reference to the QUIC connection.
2021-11-25 11:41:29 +01:00
Amaury Denoyelle
42b9f1c6dd CLEANUP: quic: add comments on CID code
Add minor comment to explain how the CID are stored in the QUIC
connection.
2021-11-25 11:33:35 +01:00
Amaury Denoyelle
aff4ec86eb REORG: quic: add comment on rare thread concurrence during CID alloc
The comment is here to warn about a possible thread concurrence issue
when treating INITIAL packets from the same client. The macro unlikely
is added to further highlight this scarce occurence.
2021-11-25 11:13:12 +01:00
Amaury Denoyelle
cb318a80e4 MINOR: quic: do not reject PADDING followed by other frames
It is valid for a QUIC packet to contain a PADDING frame followed by
one or several other frames.

quic_parse_padding_frame() does not require change as it detect properly
the end of the frame with the first non-null byte.

This allow to use quic-go implementation which uses a PADDING-CRYPTO as
the first handshake packet.
2021-11-25 11:13:12 +01:00
William Lallemand
67e371ea14 BUG/MEDIUM: mworker: FD leak of the eventpoll in wait mode
Since 2.5, before re-executing in wait mode, the master can have a
working configuration loaded, with a eventpoll fd. This case was not
handled correctly and a new eventpoll FD is leaking in the master at
each reload, which is inherited by the new worker.

Must be backported in 2.5.
2021-11-25 10:45:29 +01:00
William Lallemand
befab9ee4a BUG/MINOR: mworker: does not add the -sf in wait mode
Since the wait mode is automatically executed after charging the
configuration, -sf was shown in argv[] with the previous PID, which is
normal, but also the current one. This is only a visual problem when
listing the processes, because -sf does not do anything in wait mode.

Fix the issue by removing the whole "-sf" part in wait mode, but the
executed command can be seen in the argv[] of the latest worker forked.

Must be backported in 2.5.
2021-11-25 10:39:54 +01:00
Bertrand Jacquin
7fbc7708d4 BUG/MINOR: lua: remove loop initial declarations
HAProxy is documented to support gcc >= 3.4 as per INSTALL file, however
hlua.c makes use of c11 only loop initial declarations leading to build
failure when using gcc-4.9.4:

  x86_64-unknown-linux-gnu-gcc -Iinclude  -Wchar-subscripts -Wcomment -Wformat -Winit-self -Wmain -Wmissing-braces -Wno-pragmas -Wparentheses -Wreturn-type -Wsequence-point -Wstrict-aliasing -Wswitch -Wtrigraphs -Wuninitialized -Wunknown-pragmas -Wunused-label -Wunused-variable -Wunused-value -Wpointer-sign -Wimplicit -pthread -fdiagnostics-color=auto -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -O3 -msse -mfpmath=sse -march=core2 -g -fPIC -g -Wall -Wextra -Wundef -Wdeclaration-after-statement -fwrapv -Wno-unused-label -Wno-sign-compare -Wno-unused-parameter -Wno-clobbered -Wno-missing-field-initializers -Wtype-limits      -DUSE_EPOLL  -DUSE_NETFILTER   -DUSE_PCRE2 -DUSE_PCRE2_JIT -DUSE_POLL -DUSE_THREAD -DUSE_BACKTRACE   -DUSE_TPROXY -DUSE_LINUX_TPROXY -DUSE_LINUX_SPLICE -DUSE_LIBCRYPT -DUSE_CRYPT_H -DUSE_GETADDRINFO -DUSE_OPENSSL -DUSE_LUA -DUSE_ACCEPT4   -DUSE_SLZ -DUSE_CPU_AFFINITY -DUSE_TFO -DUSE_NS -DUSE_DL -DUSE_RT      -DUSE_PRCTL  -DUSE_THREAD_DUMP        -DUSE_PCRE2 -DPCRE2_CODE_UNIT_WIDTH=8  -I/usr/local/include -DCONFIG_HAPROXY_VERSION=\"2.5.0\" -DCONFIG_HAPROXY_DATE=\"2021/11/23\" -c -o src/connection.o src/connection.c
  src/hlua.c: In function 'hlua_config_prepend_path':
  src/hlua.c:11292:2: error: 'for' loop initial declarations are only allowed in C99 or C11 mode
    for (size_t i = 0; i < 2; i++) {
    ^
  src/hlua.c:11292:2: note: use option -std=c99, -std=gnu99, -std=c11 or -std=gnu11 to compile your code

This commit moves loop iterator to an explicit declaration.

Must be backported to 2.5 because this issue was introduced in
v2.5-dev10~69 with commit 9e5e586e35 ("BUG/MINOR: lua: Fix lua error
 handling in `hlua_config_prepend_path()`")
2021-11-25 09:07:34 +01:00
William Lallemand
2be557f7cb MEDIUM: mworker: seamless reload use the internal sockpairs
With the master worker, the seamless reload was still requiring an
external stats socket to the previous process, which is a pain to
configure.

This patch implements a way to use the internal socketpair between the
master and the workers to transfer the sockets during the reload.
This way, the master will always try to transfer the socket, even
without any configuration.

The master will still reload with the -x argument, followed by the
sockpair@ syntax. ( ex -x sockpair@4 ). Which use the FD of internal CLI
to the worker.
2021-11-24 19:00:39 +01:00
William Lallemand
82d5f013f9 BUG/MINOR: lua: don't expose internal proxies
Since internal proxies are now in the global proxy list, they are now
reachable from core.proxies, core.backends, core.frontends.

This patch fixes the issue by checking the PR_CAP_INT flag before
exposing them in lua, so the user can't have access to them.

This patch must be backported in 2.5.
2021-11-24 16:14:24 +01:00
William Lallemand
f03b53c81d BUG/MINOR: httpclient: allow to replace the host header
This patch allows to replace the host header generated by the
httpclient instead of adding a new one, resulting in the server replying
an error 400.

The host header is now generated from the uri only if it wasn't found in
the list of headers.

Also add a new request in the VTC file to test this.

This patch must be backported in 2.5.
2021-11-24 15:44:36 +01:00
Christopher Faulet
27f88a9059 BUG/MINOR: cache: Fix loop on cache entries in "show cache"
A regression was introduced in the commit da91842b6 ("BUG/MEDIUM: cache/cli:
make "show cache" thread-safe"). When cli_io_handler_show_cache() is called,
only one node is retrieved and is used to fill the output buffer in loop.
Once set, the "node" variable is never renewed. At the end, all nodes are
dumped but each one is duplicated several time into the output buffer.

This patch must be backported everywhere the above commit is. It means only
to 2.5 and 2.4.
2021-11-23 16:15:02 +01:00
Willy Tarreau
73dec76e85 [RELEASE] Released version 2.6-dev0
Released version 2.6-dev0 with the following main changes :
    - MINOR: version: it's development again
2021-11-23 15:50:11 +01:00
Willy Tarreau
3b068c45ee MINOR: version: it's development again
This essentially reverts 9dc4057df0.
2021-11-23 15:48:35 +01:00
Willy Tarreau
f2e0833f16 [RELEASE] Released version 2.5.0
Released version 2.5.0 with the following main changes :
    - BUILD: SSL: add quictls build to scripts/build-ssl.sh
    - BUILD: SSL: add QUICTLS to build matrix
    - CLEANUP: sock: Wrap `accept4_broken = 1` into additional parenthesis
    - BUILD: cli: clear a maybe-unused  warning on some older compilers
    - BUG/MEDIUM: cli: make sure we can report a warning from a bind keyword
    - BUG/MINOR: ssl: make SSL counters atomic
    - CLEANUP: assorted typo fixes in the code and comments
    - BUG/MINOR: ssl: free correctly the sni in the backend SSL cache
    - MINOR: version: mention that it's stable now
2021-11-23 15:40:21 +01:00
Willy Tarreau
9dc4057df0 MINOR: version: mention that it's stable now
This version will be maintained up to around Q1 2023. The INSTALL file
also mentions it.
2021-11-23 15:38:10 +01:00
William Lallemand
ce9903319c BUG/MINOR: ssl: free correctly the sni in the backend SSL cache
__ssl_sock_load_new_ckch_instance() does not free correctly the SNI in
the session cache, it only frees the one in the current tid.

This bug was introduced with e18d4e8 ("BUG/MEDIUM: ssl: backend TLS
resumption with sni and TLSv1.3").

This fix must be backported where the mentionned commit was backported.
(all maintained versions).
2021-11-23 15:20:59 +01:00
Ilya Shipitsin
a4d09e7ffd CLEANUP: assorted typo fixes in the code and comments
This is 28th iteration of typo fixes
2021-11-22 19:08:12 +01:00
Willy Tarreau
c5e7cf9e69 BUG/MINOR: ssl: make SSL counters atomic
SSL counters were added with commit d0447a7c3 ("MINOR: ssl: add counters
for ssl sessions") in 2.4, but their updates were not atomic, so it's
likely that under significant loads they are not correct.

This needs to be backported to 2.4.
2021-11-22 17:46:13 +01:00
Willy Tarreau
0a1e1cb555 BUG/MEDIUM: cli: make sure we can report a warning from a bind keyword
Since recent 2.5 commit c8cac04bd ("MEDIUM: listener: deprecate "process"
in favor of "thread" on bind lines"), the "process" bind keyword may
report a warning. However some parts like the "stats socket" parser
will call such bind keywords and do not expect to face warnings, so
this will instantly cause a fatal error to be reported. A concrete
effect is that "stats socket ... process 1" will hard-fail indicating
the keyword is deprecated and will be removed in 2.7.

We must relax this test, but the code isn't designed to report warnings,
it uses a single string and only supports reporting an error code (-1).

This patch makes a special case of the ERR_WARN code and uses ha_warning()
to report it, and keeps the rest of the existing error code for other
non-warning codes. Now "process" on the "stats socket" is properly
reported as a warning.

No backport is needed.
2021-11-20 20:15:37 +01:00
Willy Tarreau
97b5d07a3e BUILD: cli: clear a maybe-unused warning on some older compilers
The SHOW_TOT() and SHOW_AVG() macros used in cli_io_handler_show_activity()
produce a warning on gcc 4.7 on MIPS with threads disabled because the
compiler doesn't know that global.nbthread is necessarily non-null, hence
that at least one iteration is performed. Let's just change the loop for
a do {} while () that lets the compiler know it's always initialized. It
also has the tiny benefit of making the code shorter.
2021-11-20 20:15:37 +01:00
Tim Duesterhus
f897fc99bd CLEANUP: sock: Wrap accept4_broken = 1 into additional parenthesis
This makes it clear to static analysis tools that this assignment is
intentional and not a mistyped comparison.
2021-11-20 14:52:01 +01:00
Ilya Shipitsin
d69d65a563 BUILD: SSL: add QUICTLS to build matrix
It also enables QUIC when QUICTLS is used.
2021-11-20 08:18:00 +01:00
Ilya Shipitsin
2091c7ca70 BUILD: SSL: add quictls build to scripts/build-ssl.sh
script/build-ssl.sh is used mostly in CI, let us introduce QUIC
OpenSSL fork support
2021-11-20 08:17:22 +01:00
Willy Tarreau
a99cdfb531 [RELEASE] Released version 2.5-dev15
Released version 2.5-dev15 with the following main changes :
    - BUG/MINOR: stick-table/cli: Check for invalid ipv6 key
    - CLEANUP: peers: Remove useless test on peer variable in peer_trace()
    - DOC: log: Add comments to specify when session's listener is defined or not
    - BUG/MEDIUM: mux-h1: Handle delayed silent shut in h1_process() to release H1C
    - REGTESTS: ssl_crt-list_filters: feature cmd incorrectly set
    - DOC: internals: document the list API
    - BUG/MINOR: h3: ignore unknown frame types
    - MINOR: quic: redirect app_ops snd_buf through mux
    - MEDIUM: quic: inspect ALPN to install app_ops
    - MINOR: quic: support hq-interop
    - MEDIUM: quic: send version negotiation packet on unknown version
    - BUG/MEDIUM: mworker: cleanup the listeners when reexecuting
    - DOC: internals: document the scheduler API
    - BUG/MINOR: quic: fix version negotiation packet generation
    - CLEANUP: ssl: fix wrong #else commentary
    - MINOR: config: support default values for environment variables
    - SCRIPTS: run-regtests: reduce the number of processes needed to check options
    - SCRIPT: run-regtests: avoid several calls to grep to test for features
    - SCRIPT: run-regtests: avoid calling awk to compute the version
    - REGTEST: set retries count to zero for all tests that expect at 503
    - REGTESTS: make tcp-check_min-recv fail fast
    - REGTESTS: extend the default I/O timeouts and make them overridable
    - BUG/MEDIUM: ssl: backend TLS resumption with sni and TLSv1.3
    - BUG/MEDIUM: ssl: abort with the correct SSL error when SNI not found
    - REGTESTS: ssl: test the TLS resumption
    - BUILD: makefile: stop opening sub-shells for each and every command
    - BUILD: makefile: reorder objects by build time
    - BUG/MEDIUM: mux-h2: always process a pending shut read
    - MINOR: quic_sock: missing CO_FL_ADDR_TO_SET flag
    - MINOR: quic: Possible wrong connection identification
    - MINOR: quic: Correctly pad UDP datagrams
    - MINOR: quic: Support transport parameters draft TLS extension
    - MINOR: quic: Anti-amplification implementation
    - MINOR: quic: Wrong Initial packet connection initialization
    - MINOR: quic: Wrong ACK range building
    - MINOR: quic: Update some QUIC protocol errors
    - MINOR: quic: Send CONNECTION_CLOSE frame upon TLS alert
    - MINOR: quic: Wrong largest acked packet number parsing
    - MINOR: quic: Add minimalistic support for stream flow control frames
    - MINOR: quic: Wrong value for version negotiation packet 'Unused' field
    - MINOR: quic: Support draft-29 QUIC version
    - BUG/MINOR: quic: fix segfault on trace for version negotiation
    - BUG/MINOR: hq-interop: fix potential NULL dereference
    - BUILD: quic: fix potential NULL dereference on xprt_quic
    - DOC: lua: documentation about the httpclient API
    - BUG/MEDIUM: cache/cli: make "show cache" thread-safe
    - BUG/MEDIUM: shctx: leave the block allocator when enough blocks are found
    - BUG/MINOR: shctx: do not look for available blocks when the first one is enough
    - MINOR: shctx: add a few BUG_ON() for consistency checks
2021-11-19 19:30:04 +01:00
Willy Tarreau
48b608026b MINOR: shctx: add a few BUG_ON() for consistency checks
The shctx code relies on sensitive conditions that are hard to infer
from the code itself, let's add some BUG_ON() to verify them. They
helped spot the previous bugs.
2021-11-19 19:25:13 +01:00
Willy Tarreau
cafe15c743 BUG/MINOR: shctx: do not look for available blocks when the first one is enough
In shctx_row_reserve_hot() we only leave if we've found the exact
requested size instead of at least as large, as is documented. This
results in extra lookups and free calls in the avail loop while it is
not needed, and participates to seeing a negative data_len early as
spotted in previous bugs.

It doesn't seem to have any other impact however, but it's better to
backport it to stable branches.
2021-11-19 19:25:13 +01:00
Willy Tarreau
b15e8a1c96 BUG/MEDIUM: shctx: leave the block allocator when enough blocks are found
In shctx_row_reserve_hot(), a missing break allows the avail loop to
loop for a while after having allocated the required blocks, possibly
leading to the point where it could trigger the watchdog after checking
up to 2 million blocks. In addition, the extra iteration may leave one
block assigned with size zero at the head of the avail list, and mark
it as being an isolated chain of 1 block. It's unclear whether this
could have had other consequences.

There is a non-negligible chance that it addreses bugs #1451 and #1284,
as the pattern observed in the loop looks exactly the same as the one
reported there in the crashes.

It's only marked medium because it is extremely hard to trigger. Here
the conditions were reproduced when starting 4k connections at once
requesting objects of random sizes between 0 and 20k to store them into
a small 1MB cache. However the watchdog will never trigger in such a case
so one needs to instrument the functions.

Thanks to Sohaib Ahmad and @g0uZ for providing useful traces.

This will need to be backported to all stable branches.
2021-11-19 19:25:13 +01:00