15904 Commits

Author SHA1 Message Date
Christopher Faulet
1e83b70409 MINOR: tcp-act: Add set-src/set-src-port for "tcp-request content" rules
This patch was reverted because it was inconsistent to change connection
addresses at stream level. Especially in HTTP because all requests were
affected by this change and not only the current one. In HTTP/2, it was
worse. Several streams were able to change the connection addresses at the
same time.

It is no longer an issue, thanks to recent changes. With multi-level client
source and destination addresses, it is possible to limit the change to the
current request. Thus this patch can be reintroduced.

It is possible to set source IP/Port from "tcp-request connection",
"tcp-request session" and "http-request" rules but not from "tcp-request
content" rules. There is no reason for this limitation and it may be a
problem for anyone wanting to call a lua fetch to dynamically set source
IP/Port from a TCP proxy. Indeed, to call a lua fetch, we must have a
stream. And there is no stream when "tcp-request connection/session" rules
are evaluated.

Thanks to this patch, "set-src" and "set-src-port" actions are now supported
by "tcp-request content" rules.

This patch is related to the issue #1303.
2021-10-27 11:35:59 +02:00
Christopher Faulet
d69377eb02 MEDIUM: tcp-act: Set addresses at the appropriate level in set-(src/dst) actions
When client source or destination addresses are changed via a tcp/http
action, we update addresses at the appropriate level. When "tcp-request
connection" rules are evaluated, we update addresses at the connection
level. When "tcp-request session" rules are evaluated, we update those at the
session level. And finally, when "tcp-request content" or "http-request"
rules are evaluated, we update the addresses at the stream level.

The same is performed when source or destination ports are changed.

Of course, for now, not all levels are supported. But thanks to this patch,
it will be possible.
2021-10-27 11:35:59 +02:00
Christopher Faulet
e83e8821bb MEDIUM: connection: Assign session addresses when NetScaler CIP proto is parsed
Just like for the PROXY protocol, when the NetScaler Client IP insertion
header is received, the retrieved client source and destination addresses
are set at the session level. This leaves those at the connection level
intact.
2021-10-27 11:35:59 +02:00
Christopher Faulet
c105c9213f MEDIUM: connection: Assign session addresses when PROXY line is received
When PROXY protocol line is received, the retrieved client source and
destination addresses are set at the session level. This leaves those at the
connection level intact.
2021-10-27 11:35:59 +02:00
Christopher Faulet
a8e95fed43 MEDIUM: backend: Rely on addresses at stream level to init server connection
Client source and destination addresses at stream level are used to initiate
the connections to a server. For now, stream-interface addresses are never
set. So, thanks to the fallback mechanism, no changes are expected with this
patch. But its purpose is to rely on addresses at the appropriate level when
set instead of those at the connection level.
2021-10-27 11:35:59 +02:00
Christopher Faulet
b097aef2ef MEDIUM: connection: Rely on addresses at stream level to make proxy line
If the stream exists, the frontend stream-interface is used to get the
client source and destination addresses when the proxy line is built. For
now, stream-interface or session addresses are never set. So, thanks to the
fallback mechanism, no changes are expected with this patch. But its purpose
is to rely on addresses at the appropriate level when set instead of those
at the connection level.
2021-10-27 11:35:57 +02:00
Christopher Faulet
c03be1a129 MEDIUM: tcp-sample: Rely on addresses at the appropriate level in tcp samples
In src, src-port, dst and dst-port sample fetches, the client source and
destination addresses are retrieved from the appropriate level. It means
that, if the stream exists, we use the frontend stream-interface to get the
client source and destination addresses. Otherwise, the session is used. For
now, stream-interface or session addresses are never set. So, thanks to the
fallback mechanism, no changes are expected with this patch. But its purpose
is to rely on addresses at the appropriate level when set instead of those
at the connection level.
2021-10-27 11:34:21 +02:00
Christopher Faulet
568008d199 MINOR: mux-fcgi: Rely on client addresses at stream level to set default params
Client source and destination addresses at stream level are now used to emit
SERVER_NAME/SERVER_PORT and REMOTE_ADDR/REMOTE_PORT parameters. For now,
stream-interface addresses are never set. So, thanks to the fallback
mechanism, no changes are expected with this patch. But its purpose is to
rely on addresses at the stream level, when set, instead of those at the
connection level.
2021-10-27 11:34:21 +02:00
Christopher Faulet
6fc817a28e MINOR: http-fetch: Rely on addresses at stream level in HTTP sample fetches
Client source and destination addresses at stream level are now used to
compute base32+src and url32+src hashes. For now, stream-interface addresses
are never set. So, thanks to the fallback mechanism, no changes are expected
with this patch. But its purpose is to rely on addresses at the stream
level, when set, instead of those at the connection level.
2021-10-27 11:34:21 +02:00
Christopher Faulet
8a104ba3e0 MINOR: http-ana: Rely on addresses at stream level to set xff and xot headers
Client source and destination addresses at stream level are now used to emit
X-Forwarded-For and X-Original-To headers. For now, stream-interface addresses
are never set. So, thanks to the fallback mechanism, no changes are expected
with this patch. But its purpose is to rely on addresses at the stream level,
when set, instead of those at the connection level.
2021-10-27 11:34:21 +02:00
Christopher Faulet
c269f664bd MINOR: session: Rely on client source address at session level to log error
When an embryonic session is killed, if no log format is defined for this
error, a generic error is emitted. When this happens, we now rely on the
session to get the client source address. For now, session addresses are
never set. So, thanks to the fallback mechanism, no changes are expected
with this patch. But its purpose is to rely on addresses at the session
level when set instead of those at the connection level.
2021-10-27 11:34:21 +02:00
Christopher Faulet
f9c4d8d5be MINOR: log: Rely on client addresses at the appropriate level to log messages
When a log message is emitted, if the stream exists, we use the frontend
stream-interface to retrieve the client source and destination
addresses. Otherwise, the session is used. For now, stream-interface or
session addresses are never set. So, thanks to the fallback mechanism, no
changes are expected with this patch. But its purpose is to rely on
addresses at the appropriate level when set instead of those at the
connection level.
2021-10-27 11:34:21 +02:00
Christopher Faulet
c9c8e1cc01 MINOR: frontend: Rely on client src and dst addresses at stream level
For now, stream-interface or session addresses are never set. So, thanks to
the fallback mechanism, no changes are expected with this patch. But its
purpose is to rely on the client addresses at the stream level, when set,
instead of those at the connection level. The addresses are retrieved from
the frontend stream-interface.
2021-10-27 11:34:21 +02:00
Christopher Faulet
859ff84f8c MINOR: stream-int: Add src and dst addresses to the stream-interface
For now, these addresses are never set. But the idea is to be able to set, at
least first, the client source and destination addresses at the stream level
without updating the session or connection ones.

Of course, because these addresses are carried by the stream-interface, it
would be possible to set server source and destination addresses at this level
too.

Functions to fill these addresses have been added: si_get_src() and
si_get_dst(). If not already set, these functions rely on underlying
layers to fill stream-interface addresses. On the frontend side, the session
addresses are used if set, otherwise the client connection ones are used. On
the backend side, the server connection addresses are used.

And just like for sessions and connections, si_src() and si_dst() may be used to
get source and destination addresses of the stream-interface. And, if not set,
the same mechanism as above is used.
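
The fallback described above can be pictured with a minimal sketch. All type
and function names below (addr, conn_ex, sess_ex, si_ex, effective_src) are
hypothetical and only illustrate the lookup order; they are not the HAProxy
structures or API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified address: a real one would be a sockaddr_storage. */
struct addr {
    unsigned int ip;
    unsigned short port;
};

/* Hypothetical objects: each level may carry its own source address. */
struct conn_ex { const struct addr *src; };                        /* connection */
struct sess_ex { const struct addr *src; struct conn_ex *conn; };  /* session */
struct si_ex   { const struct addr *src; struct sess_ex *sess; };  /* stream */

/* Return the most specific source address that is set: the stream level
 * first, then the session level, then the connection level.
 */
const struct addr *effective_src(const struct si_ex *si)
{
    assert(si != NULL && si->sess != NULL && si->sess->conn != NULL);

    if (si->src)
        return si->src;
    if (si->sess->src)
        return si->sess->src;
    return si->sess->conn->src;
}
```

With such a layout, setting an address at the stream level changes what the
current stream sees while the session and connection values stay untouched.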
2021-10-27 11:34:21 +02:00
Christopher Faulet
f46e1ea1ad MINOR: session: Add src and dst addresses to the session
For now, these addresses are never set. But the idea is to be able to set
client source and destination addresses at the session level without
updating the connection ones.

Functions to fill these addresses have been added: sess_get_src() and
sess_get_dst(). If not already set, these functions rely on
conn_get_src() and conn_get_dst() to fill session addresses.

And just like for connections, sess_src() and sess_dst() may be used to get
source and destination addresses. However, if not set, the corresponding
address from the underlying client connection is returned. When this
happens, the address is filled in the connection object.
2021-10-27 11:34:21 +02:00
Christopher Faulet
cc6fc26bfe MINOR: connection: Add function to get src/dst without updating the connection
conn_get_src() and conn_get_dst() functions are used to fill the source and
destination addresses of a connection. On success, ->src and ->dst
connection fields can be safely used.

For convenience, 2 new functions are added here: conn_src() and conn_dst().
These functions return the corresponding address, as a const and only if it
is already set. Otherwise NULL is returned.
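
As a rough illustration of the split between a "filler" and a read-only
accessor, here is a small sketch. The names conn_ex, conn_ex_get_src() and
conn_ex_src() are made up for the example, and the from_os argument is an
assumption standing in for whatever the real code asks the operating system.

```c
#include <assert.h>
#include <stddef.h>

struct addr {
    unsigned int ip;
    unsigned short port;
};

/* Hypothetical connection: src_set tells whether src holds a valid address. */
struct conn_ex {
    struct addr src;
    int src_set;
};

/* Filler: stores the address in the connection if it is not set yet.
 * from_os stands in for the value the OS would report. Returns 1 on
 * success, 0 if the address could not be retrieved.
 */
int conn_ex_get_src(struct conn_ex *conn, const struct addr *from_os)
{
    assert(conn != NULL);

    if (conn->src_set)
        return 1;
    if (!from_os)
        return 0;
    conn->src = *from_os;
    conn->src_set = 1;
    return 1;
}

/* Accessor: never modifies the connection, returns NULL when not set. */
const struct addr *conn_ex_src(const struct conn_ex *conn)
{
    assert(conn != NULL);
    return conn->src_set ? &conn->src : NULL;
}
```

The accessor takes a const pointer, so callers that only read an address
cannot trigger an update by accident.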
2021-10-27 11:34:21 +02:00
Christopher Faulet
e6465b3b75 CLEANUP: lua: Use a const address to retrieve info about a connection
hlua_socket_info() only extracts information about an address, there is no
reason to not use a const.
2021-10-27 11:34:21 +02:00
Christopher Faulet
99163b75ec CLEANUP: tools: Use const address for get_net_port() and get_host_port()
These functions only extract the port from an address. There is no reason to
not use a const address.
2021-10-27 11:34:21 +02:00
Christopher Faulet
4bfce397b8 CLEANUP: connection: No longer export make_proxy_line_v1/v2 functions
These functions are only used by the make_proxy_line() function. Thus, we
can make them static.
2021-10-27 11:34:14 +02:00
vishnu
0af4bd7beb BUG/MEDIUM: lua: fix invalid return types in hlua_http_msg_get_body
hlua_http_msg_get_body must return either a Lua string or nil. For some
HTTPMessage objects, HTX_BLK_EOT blocks are also present in the HTX buffer
along with HTX_BLK_DATA blocks. In such cases, _hlua_http_msg_dup will start
copying data into a luaL_Buffer until it encounters an HTX_BLK_EOT. But then
instead of pushing either the luaL_Buffer or `nil` to the Lua stack, the
function will return immediately. The end result will be that the caller of
the HTTPMessage.body() method from a Lua filter will see whatever object was
on top of the stack as return value. It may be either a userdata object if
HTTPMessage.body() was called with only two arguments, or the third argument
itself if called with three arguments. Hence HTTPMessage.body() would return
either nil, or the HTTPMessage body as a Lua string, or a userdata object, or a
number.

This fix ensures that HTTPMessage.body() will always return either a string
or nil.

Reviewed-by: Christopher Faulet <cfaulet@haproxy.com>
2021-10-27 11:04:16 +02:00
Christopher Faulet
24a58fbd7e CLEANUP: lua: Remove any ambiguities about lua txn execution context flags
Flags used to set the execution context of a lua txn are used as an enum. It is
not uncommon but there are few flags otherwise. So to remove ambiguities, a
comment and a _NONE value are added to have a clear definition of supported
values.

This patch should fix the issue #1429. No backport needed.
2021-10-27 11:04:16 +02:00
William Lallemand
6137a9ee20 MINOR: httpclient/lua: return an error when it can't generate the request
Add a check during the httpclient request generation which yields a Lua
error when the generation didn't work. The most common case is the lack
of space in the buffer, which can be caused by too many headers or too big
a body.
2021-10-27 10:19:58 +02:00
William Lallemand
dc2cc9008b MINOR: httpclient/lua: support more HTTP methods
Add support for HEAD/PUT/POST/DELETE methods with the lua httpclient.

This patch uses the httpclient_req_gen() function with a different meth
parameter to implement this.

Also change the reg-test to support a POST request with a body.
2021-10-27 10:19:49 +02:00
William Lallemand
dec25c3e14 MINOR: httpclient: support payload within a buffer
httpclient_req_gen() takes a payload argument which can be used to put a
payload in the request. This payload can only fit a request buffer.

This payload can also be specified by the "body" named parameter within
the lua httpclient.

It is also used within the CLI httpclient when specified as a CLI
payload with "<<".
2021-10-27 10:19:41 +02:00
Willy Tarreau
b4d0cd02c1 [RELEASE] Released version 2.5-dev11
Released version 2.5-dev11 with the following main changes :
    - DEV: coccinelle: Add strcmp.cocci
    - CLEANUP: Apply strcmp.cocci
    - CI: Add `permissions` to GitHub Actions
    - CI: Clean up formatting in GitHub Action definitions
    - MINOR: add ::1 to predefined LOCALHOST acl
    - CLEANUP: assorted typo fixes in the code and comments
    - CLEANUP: Consistently `unsigned int` for bitfields
    - MEDIUM: resolvers: lower-case labels when converting from/to DNS names
    - MEDIUM: resolvers: replace bogus resolv_hostname_cmp() with memcmp()
    - MINOR: jwt: Empty the certificate tree during deinit
    - MINOR: jwt: jwt_verify returns negative values in case of error
    - MINOR: jwt: Do not rely on enum order anymore
    - BUG/MEDIUM: stream: Keep FLT_END analyzers if a stream detects a channel error
    - MINOR: httpclient/cli: access should be only done from expert mode
    - DOC: management: doc about the CLI httpclient
    - BUG/MEDIUM: tcpcheck: Properly catch early HTTP parsing errors
    - BUG/MAJOR: dns: tcp session can remain attached to a list after a free
    - BUG/MAJOR: dns: attempt to lock globaly for msg waiter list instead of use barrier
    - CLEANUP: dns: always detach the appctx from the dns session on release
    - DEBUG: dns: add a few more BUG_ON at sensitive places
    - BUG/MAJOR: resolvers: add other missing references during resolution removal
    - CLEANUP: resolvers: do not export resolv_purge_resolution_answer_records()
    - BUILD: resolvers: avoid a possible warning on null-deref
    - BUG/MEDIUM: resolvers: always check a valid item in query_list
    - CLEANUP: always initialize the answer_list
    - CLEANUP: resolvers: simplify resolv_link_resolution() regarding requesters
    - CLEANUP: resolvers: replace all LIST_DELETE with LIST_DEL_INIT
    - MEDIUM: resolvers: use a kill list to preserve the list consistency
    - MEDIUM: resolvers: remove the last occurrences of the "safe" argument
    - BUG/MEDIUM: checks: fix the starting thread for external checks
    - MEDIUM: resolvers: replace the answer_list with a (flat) tree
    - MEDIUM: resolvers: hash the records before inserting them into the tree
    - BUG/MAJOR: buf: fix varint API post- vs pre- increment
    - OPTIM: resolvers: move the eb32 node before the data in the answer_item
    - MINOR: list: add new macro LIST_INLIST_ATOMIC()
    - OPTIM: dns: use an atomic check for the list membership
    - BUG/MINOR: task: do not set TASK_F_USR1 for no reason
    - BUG/MINOR: mux-h2: do not prevent from sending a final GOAWAY frame
    - MINOR: connection: add a new CO_FL_WANT_DRAIN flag to force drain on close
    - MINOR: mux-h2: perform a full cycle shutdown+drain on close
    - CLEANUP: resolvers: get rid of single-iteration loop in resolv_get_ip_from_response()
    - MINOR: quic: Increase the size of handshake RX UDP datagrams
    - BUG/MEDIUM: lua: fix memory leaks with realloc() on non-glibc systems
    - MINOR: memprof: report the delta between alloc and free on realloc()
    - MINOR: memprof: add one pointer size to the size of allocations
    - BUILD: fix compilation on NetBSD
    - MINOR: backend: add traces for idle connections reuse
    - BUG/MINOR: backend: fix improper insert in avail tree for always reuse
    - MINOR: backend: improve perf with tcp proxies skipping idle conns
    - MINOR: connection: remove unneeded memset 0 for idle conns
2021-10-22 19:40:44 +02:00
Amaury Denoyelle
8e358af8a3 MINOR: connection: remove unneeded memset 0 for idle conns
Remove the zeroing of an idle connection node on remove from a tree.
This is not needed and should slightly improve the performance of idle
connection usage. Besides, it breaks the memory poisoning feature.
2021-10-22 17:29:25 +02:00
Amaury Denoyelle
926712ab2d MINOR: backend: improve perf with tcp proxies skipping idle conns
Skip the connection hash calculation when reuse must not be used in
connect_server() : this is the case for TCP proxies. This should result
in slightly better performance in this use case.
2021-10-22 17:28:29 +02:00
Amaury Denoyelle
aee4fdbd17 BUG/MINOR: backend: fix improper insert in avail tree for always reuse
In connect_server(), if http-reuse always is set, the backend connection
is inserted into the available tree as soon as created. However, the
hash connection field is only set later at the end of the function.

This seems to have no impact as the hash connection field is always
set before a lookup. However, this is not a proper usage of the ebmb
API. Fix this by setting the hash connection field before the insertion
into the avail tree.

This must be backported up to 2.4.
2021-10-22 17:26:22 +02:00
Amaury Denoyelle
1252b6f951 MINOR: backend: add traces for idle connections reuse
Add traces in connect_server() to debug idle connection reuse. These
are attached to stream trace module, as it's already in use in
backend.c with the macro TRACE_SOURCE.
2021-10-22 17:21:14 +02:00
Amaury Denoyelle
28c5b3c0bc BUILD: fix compilation on NetBSD
Use include file <sys/time.h> to fix compilation error with timeval in
some files. This is as documented in 'man 7 system_data_types'. The build
error is reported on NetBSD 9.2.

This should be backported up to 2.2.
2021-10-22 17:04:35 +02:00
Willy Tarreau
1de51eb727 MINOR: memprof: add one pointer size to the size of allocations
The current model causes an issue when trying to spot memory leaks,
because malloc(0) or realloc(0) do not count as allocations since we only
account for the application-usable size. This is the problem that made
issue #1406 not to appear as a leak.

What we're doing now is to account for one extra pointer (the one that
memory allocators usually place before the returned area), so that a
malloc(0) will properly account for 4 or 8 bytes. We don't need something
exact, we just need something non-zero so that a realloc(X) followed by a
realloc(0) without a free() gives a small non-zero result.
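
The accounting idea can be sketched as follows. The memstat_ex structure and
the three functions are hypothetical names for illustration, not the memprof
code; the only point is that charging one extra pointer per allocation makes
a zero-sized allocation visible.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-call-site counters. */
struct memstat_ex {
    size_t alloc_tot;   /* bytes accounted as allocated */
    size_t free_tot;    /* bytes accounted as released */
};

/* Size charged for one area: the usable size plus one pointer, so that
 * even a zero-sized allocation counts for something.
 */
size_t accounted_size(size_t usable)
{
    return usable + sizeof(void *);
}

/* Account for a realloc() going from old_usable to new_usable bytes.
 * had_old tells whether a previous area existed (ptr != NULL).
 */
void account_realloc(struct memstat_ex *st, int had_old,
                     size_t old_usable, size_t new_usable)
{
    assert(st != NULL);

    if (had_old)
        st->free_tot += accounted_size(old_usable);
    st->alloc_tot += accounted_size(new_usable);
}

/* Difference between what was allocated and what was released. */
long long memstat_delta(const struct memstat_ex *st)
{
    return (long long)st->alloc_tot - (long long)st->free_tot;
}
```

In this model, a realloc(X) followed by a realloc(0) that is never freed
leaves a delta of one pointer size instead of zero.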

It was verified that the results are stable including in the presence
of lots of malloc/realloc/free as happens when stressing Lua.

It would make sense to backport this to 2.4 as it helps in bug reports.
2021-10-22 16:40:09 +02:00
Willy Tarreau
8cce4d79ff MINOR: memprof: report the delta between alloc and free on realloc()
realloc() calls are painful to analyse because they have two non-zero
columns and trying to spot a leaking one requires a bit of scripting.
Let's simply append the delta at the end of the line when alloc and
free are non-zero.

It would be useful to backport this to 2.4 to help with bug reports.
2021-10-22 16:40:09 +02:00
Willy Tarreau
a5efdff93c BUG/MEDIUM: lua: fix memory leaks with realloc() on non-glibc systems
In issue #1406, Lev Petrushchak reported a nasty memory leak on Alpine
since haproxy 2.4 when using Lua, that memory profiling didn't detect.
After inspecting the code and Lua's code, it appeared that Lua's default
allocator does an explicit free() on size zero, while since 2.4 commit
d36c7fa5e ("MINOR: lua: simplify hlua_alloc() to only rely on realloc()"),
haproxy only calls realloc(ptr,0) that performs a free() on glibc but not
on other systems as it's not required by POSIX...

This patch reinstalls the explicit test for nsize==0 to call free().
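
A minimal sketch of an allocator callback with the explicit zero-size test is
shown below. It follows the lua_Alloc callback signature (ud, ptr, osize,
nsize), but alloc_ex and the alloc_stats_ex counter are illustrative names
and this is not the hlua_alloc() code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical instrumentation: number of blocks currently allocated. */
struct alloc_stats_ex {
    long live;
};

/* Allocator callback with the same argument list as a lua_Alloc function.
 * A request for size zero is turned into an explicit free() instead of
 * relying on what realloc(ptr, 0) does on a given libc.
 */
void *alloc_ex(void *ud, void *ptr, size_t osize, size_t nsize)
{
    struct alloc_stats_ex *st = ud;
    void *ret;

    (void)osize;
    assert(st != NULL);

    if (nsize == 0) {
        if (ptr)
            st->live--;
        free(ptr);          /* free(NULL) is a no-op */
        return NULL;
    }

    ret = realloc(ptr, nsize);
    if (ret && !ptr)
        st->live++;         /* a new block was created */
    return ret;
}
```

The live counter is only there to make the effect observable: after a
zero-size request it drops back, whatever the underlying C library does.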

Thanks to Lev for the very well documented report, and to Tim for the links
to a musl thread on the same subject that confirms the diagnostic.

This must be backported to 2.4.
2021-10-22 16:40:09 +02:00
Frédéric Lécaille
46be7e92b4 MINOR: quic: Increase the size of handshake RX UDP datagrams
Some browsers may send Initial packets with sizes greater than 1252 bytes
(QUIC_INITIAL_IPV4_MTU). Let us increase this size limit up to 2048 bytes.
Also use this size for "max_udp_payload_size" transport parameter to limit
the size of the datagrams we want to receive.
2021-10-22 15:48:19 +02:00
Willy Tarreau
dbb0bb59e3 CLEANUP: resolvers: get rid of single-iteration loop in resolv_get_ip_from_response()
In issue 1424 Coverity reports that the loop increment is unreachable,
which is true, the list_for_each_entry() was replaced with a for loop,
but it was already not needed and was instead used as a convenient
construct for a single iteration lookup. Let's get rid of all this
now and replace the loop with an "if" statement.
2021-10-22 08:34:14 +02:00
Willy Tarreau
0b22247606 MINOR: mux-h2: perform a full cycle shutdown+drain on close
While in H1 we can usually close quickly, in H2 a client might be sending
window updates or anything while we're sending a GOAWAY and the pending
data in the socket buffers at the moment the close() is performed on the
socket results in the output data being lost and an RST being emitted.

One example where this happens easily is with h2spec, which randomly
reports connection resets when waiting for a GOAWAY while haproxy sends
it, as seen in issue #1422. With h2spec it's not window updates that are
causing this but the fact that h2spec has to upload the payload that
comes with invalid frames to accommodate various implementations, and
does that in two different segments. When haproxy aborts on the invalid
frame header, the payload was not yet received and causes an RST to
be sent.

Here we're dealing with this two ways:
  - we perform a shutdown(WR) on the connection to forcefully push pending
    data on a front connection after the xprt is shut and closed ;
  - we drain pending data
  - then we close
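
The shutdown, drain, close sequence listed above can be sketched on a plain
socket as follows. close_with_drain() is a hypothetical helper, not the
mux-h2 or connection code, and it assumes a platform where recv() accepts
MSG_DONTWAIT (Linux and the BSDs do).

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Close a connected socket after shutting down the write side and
 * discarding the data already received. Returns the number of bytes
 * drained, or -1 if close() failed.
 */
long close_with_drain(int fd)
{
    char buf[1024];
    long drained = 0;
    ssize_t ret;

    assert(fd >= 0);

    /* 1. stop sending so that pending output is pushed to the peer */
    shutdown(fd, SHUT_WR);

    /* 2. drain what the peer already sent, without blocking */
    while ((ret = recv(fd, buf, sizeof(buf), MSG_DONTWAIT)) > 0)
        drained += ret;

    /* 3. close: no unread data is left to turn the close into a reset */
    if (close(fd) < 0)
        return -1;
    return drained;
}
```

Draining before close() matters because closing a TCP socket that still holds
unread input generally makes the kernel send an RST instead of a clean FIN.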

This totally solves the issue with h2spec, and the extra cost is very
low, especially if we consider that H2 connections are not set up and
torn down often. This issue was never observed with regular clients,
most likely because this pattern does not happen in regular traffic.

After more testing it could make sense to backport this, at least to
avoid reporting errors on h2spec tests.
2021-10-21 22:24:31 +02:00
Willy Tarreau
20b622e04b MINOR: connection: add a new CO_FL_WANT_DRAIN flag to force drain on close
Sometimes we'd like to do our best to drain pending data before closing
in order to spare the peer the risk of receiving an RST on close.

This adds a new connection flag CO_FL_WANT_DRAIN that is used to
trigger a call to conn_ctrl_drain() from conn_ctrl_close(), and the
sock_drain() function ignores fd_recv_ready() if this flag is set,
in order to catch latest data. It's not used for now.
2021-10-21 21:48:23 +02:00
Willy Tarreau
e6dc7a0129 BUG/MINOR: mux-h2: do not prevent from sending a final GOAWAY frame
Some checks were added by commit 9a3d3fcb5 ("BUG/MAJOR: mux-h2: Don't try
to send data if we know it is no longer possible") to make sure we don't
loop forever trying to send data that cannot leave. But one of the
conditions there is not correct, the one relying on H2_CS_ERROR2. Indeed,
this state indicates that the error code was serialized into the mux
buffer, and since the test is placed before trying to send the data to
the socket, if the connection states only contains a GOAWAY frame, it
may refrain from sending and may close without sending anything. It's
not dramatic, as GOAWAY reports connection errors in situations where
delivery is not even certain, but it's cleaner to make sure the error
is properly sent, and it avoids upsetting h2spec, as seen in github
issue #1422.

Given that the patch above was backported as far as 1.8, this patch will
also have to be backported that far.

Thanks to Ilya for reporting this one.
2021-10-21 17:37:22 +02:00
Willy Tarreau
3193eb9907 BUG/MINOR: task: do not set TASK_F_USR1 for no reason
This application specific flag was added in 2.4-dev by commit 6fa8bcdc7
("MINOR: task: add an application specific flag to the state: TASK_F_USR1")
to help preserve the idle connections status across wakeup calls. While
the code to do this was OK for tasklets, it was wrong for tasks, as in an
effort not to lose it when setting the RUNNING flag (that tasklets don't
have), it ended up being unconditionally set. It just happens that for now
no regular tasks use it, only tasklets.

This fix makes sure we always atomically perform (state & flags | running)
there, using a CAS. It also does it for tasklets because it was possible
to lose some such flags if set by another thread, even though this should
not happen with current code. In order to make the code more readable (and
avoid the previous mistake of repeated flags in the bit field), a new
TASK_PERSISTENT aggregate was declared in task.h for this.

In practice the CAS is cheap here because task states are stable or
convergent so the loop will almost never be taken.

This should be backported to 2.4.
2021-10-21 16:17:29 +02:00
Willy Tarreau
dde1b4499a OPTIM: dns: use an atomic check for the list membership
The crash that was fixed by commit 7045590d8 ("BUG/MAJOR: dns: attempt
to lock globaly for msg waiter list instead of use barrier") was now
completely analysed and confirmed to be partially a result of the
debugging code added to LIST_INLIST(), which was looking at both
pointers and their reciprocals, and that, if used in a concurrent
context, could perfectly return false if a neighbor was being added or
removed while the current one didn't change, allowing the LIST_APPEND
to fail.

As the LIST API was not designed to be used in a concurrent context,
we should not rely on LIST_INLIST() but on the newly introduced
LIST_INLIST_ATOMIC().

This patch simply reverts the commit above to switch to the new test,
saving a lock during potentially long operations. It was verified that
the check doesn't fail anymore.

It is unclear what the performance impact of the fix above could be in
some contexts. If any performance regression is observed, it could make
sense to backport this patch, along with the previous commit introducing
the LIST_INLIST_ATOMIC() macro.
2021-10-21 15:28:42 +02:00
Willy Tarreau
c79f014972 MINOR: list: add new macro LIST_INLIST_ATOMIC()
This macro is similar to LIST_INLIST() except that it is guaranteed to
perform the test atomically, so that even if LIST_INLIST() is instrumented
with debugging code to perform extra consistency checks, it will not fail
when used in the context of barriers and atomic ops.
2021-10-21 15:28:24 +02:00
Willy Tarreau
9628c42284 OPTIM: resolvers: move the eb32 node before the data in the answer_item
perf top shows that we spend a lot of time trying to read item->type in
the lookup loop, because the node is placed after the very long name,
so when the node is found, no data is in the cache yet. Let's simply
move the node higher up in the struct. This results in the CPU usage of
resolv_validate_dns_response() dropping by 4 points.
2021-10-21 15:28:24 +02:00
Willy Tarreau
dd362b7b24 BUG/MAJOR: buf: fix varint API post- vs pre- increment
A bogus test in b_get_varint(), b_put_varint(), b_peek_varint() shifts
the end of the buffer by one byte. Since the bug is the same in the read
and write functions, the buffer contents remain compatible, which explains
why this bug was not detected earlier. But if the buffer ends on an
aligned address or page, it can result in a one-byte overflow which will
typically cause a crash or an inconsistent behavior.

This API is only used by rings (e.g. for traces and boot messages) and
by DNS responses, so the probability to hit it is extremely low, but a
crash on boot was observed.

This must be backported to 2.2.
2021-10-21 15:28:24 +02:00
Willy Tarreau
dcb696cd31 MEDIUM: resolvers: hash the records before inserting them into the tree
We're using an XXH32() on the record to insert it into or look it up from
the tree. This way we don't change the rest of the code, the comparisons
are still made on all fields and the next node is visited on mismatch. This
also allows to continue to use roundrobin between identical nodes.

Just doing this is sufficient to see the CPU usage go down from ~60-70% to
4% at ~2k DNS requests per second for a farm with 300 servers. A larger
config with 12 backends of 2000 servers each shows ~8-9% CPU for 6-10000
DNS requests per second.

It would probably be possible to go further with multiple levels of indexing
but it's not worth it, and it's important to remember that tree nodes take
space (the struct answer_list went back from 576 to 600 bytes).
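
The "hash as key, full comparison on match, next node on mismatch" logic can
be sketched as below. This is an illustration under loud assumptions: a
sorted array stands in for the eb32 tree, FNV-1a stands in for XXH32(), and
rec_ex, item_ex and index_ex are made-up types.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EX_MAX 64

struct rec_ex {                     /* hypothetical DNS-like record */
    uint16_t type;
    uint16_t port;
    char target[32];
};

struct item_ex {
    uint32_t key;                   /* hash of the record */
    struct rec_ex rec;
};

struct index_ex {
    struct item_ex items[EX_MAX];   /* kept sorted by key, duplicates in insertion order */
    int count;
};

/* FNV-1a, used here only as a stand-in for XXH32() */
static uint32_t fnv1a(uint32_t h, const void *data, size_t len)
{
    const unsigned char *p = data;

    while (len--) {
        h ^= *p++;
        h *= 16777619u;
    }
    return h;
}

uint32_t rec_ex_hash(const struct rec_ex *r)
{
    uint32_t h = 2166136261u;

    h = fnv1a(h, &r->type, sizeof(r->type));
    h = fnv1a(h, &r->port, sizeof(r->port));
    return fnv1a(h, r->target, strlen(r->target));
}

int rec_ex_equal(const struct rec_ex *a, const struct rec_ex *b)
{
    return a->type == b->type && a->port == b->port &&
           strcmp(a->target, b->target) == 0;
}

/* position of the first item whose key is >= key */
static int lower_bound(const struct index_ex *idx, uint32_t key)
{
    int lo = 0, hi = idx->count;

    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;

        if (idx->items[mid].key < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/* insert after any existing item with the same key; returns -1 if full */
int index_ex_insert(struct index_ex *idx, const struct rec_ex *r)
{
    uint32_t key = rec_ex_hash(r);
    int pos;

    assert(idx != NULL);
    if (idx->count >= EX_MAX)
        return -1;

    pos = lower_bound(idx, key);
    while (pos < idx->count && idx->items[pos].key == key)
        pos++;
    memmove(&idx->items[pos + 1], &idx->items[pos],
            (size_t)(idx->count - pos) * sizeof(idx->items[0]));
    idx->items[pos].key = key;
    idx->items[pos].rec = *r;
    idx->count++;
    return 0;
}

/* the key narrows the search, all fields are still compared, and the
 * next item with the same key is visited on mismatch
 */
const struct rec_ex *index_ex_lookup(const struct index_ex *idx,
                                     const struct rec_ex *r)
{
    uint32_t key = rec_ex_hash(r);
    int pos;

    for (pos = lower_bound(idx, key);
         pos < idx->count && idx->items[pos].key == key; pos++) {
        if (rec_ex_equal(&idx->items[pos].rec, r))
            return &idx->items[pos].rec;
    }
    return NULL;
}
```

Because the full comparison is kept, a hash collision only costs an extra
visit and can never return the wrong record.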
2021-10-21 08:29:02 +02:00
Willy Tarreau
7893ae117f MEDIUM: resolvers: replace the answer_list with a (flat) tree
With SRV records, a huge amount of time is spent looking for records
by walking long lists. It is possible to reduce this by indexing values
in trees instead. However the whole code relies a lot on the list
ordering, and even implements some round-robin on it to distribute IP
addresses to servers.

This patch starts carefully by replacing the list with an eb32 tree
that is still used like a list, with a constant key 0. Since ebtrees
preserve insertion order for duplicates, the tree walk visits the nodes
in the exact same order it did with the lists. This allows to implement
the required infrastructure without changing the behavior.
2021-10-21 08:02:08 +02:00
Willy Tarreau
a89c19127d BUG/MEDIUM: checks: fix the starting thread for external checks
When cleaning up the code to remove most explicit task masks in commit
beeabf531 ("MINOR: task: provide 3 task_new_* wrappers to simplify the
API"), a mistake was made with the external checks where the call does
task_new_on(1) instead of task_new_on(0) due to the confusion with the
previous mask 1.

No backport is needed as that's only 2.5-dev.
2021-10-20 18:43:30 +02:00
Willy Tarreau
6878f80427 MEDIUM: resolvers: remove the last occurrences of the "safe" argument
This one was used to indicate whether the callee had to follow particularly
safe code path when removing resolutions. Since the code now uses a kill
list, this is not needed anymore.
2021-10-20 17:54:27 +02:00
Willy Tarreau
f766ec6b53 MEDIUM: resolvers: use a kill list to preserve the list consistency
When scanning resolution.curr it's possible to try to free some
resolutions which will themselves result in freeing other ones. If
one of these other ones is exactly the next one in the list, the list
walk visits deleted nodes and causes memory corruption, double-frees
and so on. The approach taken using the "safe" argument to some
functions seems to work but it's extremely brittle as it is required
to carefully check all call paths from process_resolvers() and set
the argument to 1 there to refrain from deleting entries, so the bug
is very likely to come back after some tiny changes to this code.

A variant was tried, checking at various places that the current task
corresponds to process_resolvers() but this is also quite brittle even
though a bit less.

This patch uses another approach which consists in carefully unlinking
elements from the list and deferring their removal by placing it in a
kill list instead of deleting them synchronously. The real benefit here
is that the complexity only has to be placed where the complications
are.

A thread-local list is fed with elements to be deleted before scanning
the resolutions, and it's flushed at the end by picking the first one
until the list is empty. This way we never dereference the next element
and do not care about its presence or not in the list. One function,
resolv_unlink_resolution(), is exported and used outside, so it had to
be modified to use this list as well. Internal code has to use
_resolv_unlink_resolution() instead.
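
The kill list idea can be sketched on a simple linked list. Everything below
(res_ex, ctx_ex and the functions) is hypothetical and much simpler than the
resolvers code: releasing an element only unlinks it and its dependents and
parks them on a kill list, and memory is freed in one final pass.

```c
#include <assert.h>
#include <stdlib.h>

struct res_ex {
    struct res_ex *next;   /* link in the active list or in the kill list */
    struct res_ex *dep;    /* element released together with this one, or NULL */
    int expired;
};

struct ctx_ex {
    struct res_ex *active; /* elements in use */
    struct res_ex *kill;   /* elements waiting to be freed */
};

/* allocate an element and put it at the head of the active list */
struct res_ex *res_ex_add(struct ctx_ex *ctx, int expired)
{
    struct res_ex *el;

    assert(ctx != NULL);
    el = calloc(1, sizeof(*el));
    if (!el)
        return NULL;
    el->expired = expired;
    el->next = ctx->active;
    ctx->active = el;
    return el;
}

/* remove el from the active list; returns 0 if it was not there */
static int unlink_active(struct ctx_ex *ctx, struct res_ex *el)
{
    struct res_ex **pp;

    for (pp = &ctx->active; *pp; pp = &(*pp)->next) {
        if (*pp == el) {
            *pp = el->next;
            el->next = NULL;
            return 1;
        }
    }
    return 0;
}

/* deferred release: el and its chain of dependents are unlinked and
 * moved to the kill list, nothing is freed here
 */
void res_ex_release(struct ctx_ex *ctx, struct res_ex *el)
{
    while (el && unlink_active(ctx, el)) {
        struct res_ex *dep = el->dep;

        el->next = ctx->kill;
        ctx->kill = el;
        el = dep;
    }
}

/* walk the active list, release expired elements, then flush the kill
 * list by always picking its first element. Returns the number freed.
 */
int process_ex(struct ctx_ex *ctx)
{
    struct res_ex *el = ctx->active;
    int freed = 0;

    while (el) {
        if (el->expired) {
            res_ex_release(ctx, el);
            el = ctx->active;   /* the list changed: restart from the head */
            continue;
        }
        el = el->next;
    }

    while (ctx->kill) {
        el = ctx->kill;
        ctx->kill = el->next;
        free(el);
        freed++;
    }
    return freed;
}
```

Restarting the walk from the head after a release is a simplification chosen
for the sketch; it keeps the loop from ever following a pointer to an element
that was just moved to the kill list.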
2021-10-20 17:54:22 +02:00
Willy Tarreau
aae7320b0d CLEANUP: resolvers: replace all LIST_DELETE with LIST_DEL_INIT
The code as it is uses crossed lists between many elements, and at
many places the code relies on list iterators or emptiness checks,
which does not work with only LIST_DELETE. Further, it is quite
difficult to place debugging code and checks in the current situation,
and gdb is helpless.

This code replaces all LIST_DELETE calls with LIST_DEL_INIT so that
it becomes possible to trust the lists.
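
The difference between a plain delete and a delete followed by a
re-initialization can be sketched as follows. The list_ex type and helpers
are illustrative names, not the LIST_DELETE and LIST_DEL_INIT macros
themselves.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical circular doubly-linked list element. */
struct list_ex {
    struct list_ex *n;   /* next */
    struct list_ex *p;   /* previous */
};

void list_ex_init(struct list_ex *l)
{
    l->n = l->p = l;
}

/* true for an empty head or a detached, initialized element */
int list_ex_isempty(const struct list_ex *l)
{
    return l->n == l;
}

void list_ex_append(struct list_ex *head, struct list_ex *el)
{
    el->p = head->p;
    el->n = head;
    head->p->n = el;
    head->p = el;
}

/* plain delete: neighbours are fixed up, but el keeps its stale pointers */
void list_ex_delete(struct list_ex *el)
{
    el->n->p = el->p;
    el->p->n = el->n;
}

/* delete then re-initialize: el points to itself again, so emptiness
 * checks on it are reliable and a second call is harmless
 */
void list_ex_del_init(struct list_ex *el)
{
    assert(el != NULL);
    list_ex_delete(el);
    list_ex_init(el);
}
```

After a plain delete the element still looks linked, which is the kind of
state that makes emptiness checks and debugging unreliable.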
2021-10-20 17:54:14 +02:00
Willy Tarreau
239675e4a9 CLEANUP: resolvers: simplify resolv_link_resolution() regarding requesters
This function allocates requesters by hand for each and every type. This
is complex and error-prone, and it doesn't even initialize the list part,
leaving dangling pointers that complicate debugging.

This patch introduces a new function resolv_get_requester() that either
returns the current pointer if valid or tries to allocate a new one and
links it to its destination. Then it makes use of it in the function
above to clean it up quite a bit. This allows to remove complicated but
unneeded tests.
2021-10-20 17:54:01 +02:00