In this patch, we add the possibility to declare on a table definition
("table" in peer section, or "stick-table" in proxy section) that we
want the remote/peer updates on that table to be pushed on a local
haproxy table in addition to the source table.
Consider this example:
|peers mypeers
| peer local 127.0.0.1:3334
| peer clust 127.0.0.1:3333
| table t1.local type string size 10m store server_id,server_key expire 30s
| table t1.clust type string size 10m store server_id,server_key write-to mypeers/t1.local expire 30s
With this setup, we consider haproxy uses t1.local as cache/local table
for read and write operations, and that t1.clust is a remote table
containing datas processed from t1.local and similar tables from other
haproxy peers in a cluster setup. The t1.clust table will be used to
refresh the local/cache one via the "write-to" statement.
What will happen, is that every time haproxy will see entry updates for
the t1.clust table: it will overwrite t1.local table with fresh data and
will update the entry expiration timer. If t1.local entry doesn't exist
yet (key doesn't exist), it will automatically create it. Note that only
types that cannot be used for arithmetic ops will be handled, and this
to prevent processed values from the remote table from interfering with
computations based on values from the local table. (ie: prevent
cumulative counters from growing indefinitely).
"write-to" will only push supported types if they both exist in the source
and the target table. Be careful with server_id and server_key storage
because they are often declared implicitly when referencing a table in
sticking rules but it is required to declare them explicitly for them to
be pushed between a remote and a local table through "write-to" option.
Also note that the "write-to" target table should have the same type as
the source one, and that the key length should be strictly equal,
otherwise haproxy will raise an error due to the tables being
incompatibles. A table that is already being written to cannot be used
as a source table for a "write-to" target.
Thanks to this patch, it will now be possible to use sticking rules in
peer cluster context by using a local table as a local cache which
will be automatically refreshed by one or multiple remote table(s).
This commit depends on:
- "MINOR: stktable: stktable_init() sets err_msg on error"
- "MINOR: stktable: check if a type should be used as-is"
stick table types now have an extra bit named 'as_is' that allows us to
check if such type should be used as-is or if it may be involved in
arithmetic operations such as counters. This can be useful since those
types are not common and may require specific handling.
e.g.: stktable_data_types[data_type].as_is will be set to 1 if the type
cannot be used in arithmetic operations.
As a result of copy paste error in 1b8e68e ("MEDIUM: stick-table: Stop
handling stick-tables as proxies."), postparsing stktable_init() failures
were reported as such for named peer tables:
"Proxy 'table_name': failed to initialize stick table."
Now they are correctly reported like this:
"Parsing [file:line]: failed to initialize 'table_name' stick-table."
This should be backported to every stable versions.
When "peers" keyword is encountered within a stick table definition,
peers.name hint gets replaced with a new copy of the provided name using
strdup(). However, there is no detection on whether the name was
previously set or not, so it is currently allowed to reuse the keyword
multiple time to overwrite previous value, but here we forgot to free
previous value for peers.name before assigning it to a new one.
This should be backported to every stable versions.
Simplify stick and store sticktable proxy rules postparsing by adding
a sticking rule entry resolve (postparsing) function.
This will ease code maintenance.
SNI may be specify on a server line for connecting to the remote host.
This requires to manually set it on the connection via
ssl_sock_set_servername().
This step was missing when a server line was used for active reverse
HTTP. Fix this by adding the missing ssl_sock_set_servername()
invocation inside new_reverse_conn().
Note that for the moment, no session is instantiated to carry active
reverse connection. A direct consequence of this is that SNI sample
retrieval may crash depending if it depends on session parameters. This
should be fixed by a later commit. In the meantime, this patch is
sufficient to support simple SNI value such as constant expressions.
No need to backport.
This new fetcher can be used to extract the list of cookie names from
Cookie request header or from Set-Cookie response header depending on
the stream direction. There is an optional argument that can be used
as the delimiter (which is assumed to be the first character of the
argument) between cookie names. The default delimiter is comma (,).
Note that we will treat the Cookie request header as a semi-colon
separated list of cookies and each Set-Cookie response header as
a single cookie and extract the cookie names accordingly.
When the `haproxy -c` check during the reload fails, no error is output
in the logs, this can be quite bothersome to understand what's going on.
This patch removes the -q option on the check so we can see the error
with `journalctl -u haproxy` or `systemctl status haproxy`
This will change the behavior when the check works, and will display
"Configuration file is valid"
Note that in some case this test could be completely removed, because
the master process loads the configuration itself and is able to keep
the previous workers running when the reload failed. This is interesting
to disable the test when there are a lot of certificates of files to
load, to divide the reload time by 2.
No need to backport.
When an expect rule failed for a tcp-check, information about the expect
rule is dumped in the report. For a check on a binary string, a hexstring is
used in the configuration but the decoded string is dumped. It is an problem
because it can contain special characters. And it is not really handy
because there is no correspondance with the config.
So, now, the hexstring is dumped in the report. This way, we are sure there
is no special characters and it is easy to find it in the configuration.
This patch shoudl solve the issue #2326. It must be backported as far as
2.2.
The patch which fixes the certificate selection uses
SSL_CIPHER_get_id() to skip the SCSV ciphers without checking if cipher
is NULL. This patch fixes the issue by skipping any NULL cipher in the
iteration.
Problem was reported in #2329.
Need to be backported where 23093c72f1 was
backported. No release was made with this patch so the severity is
MEDIUM.
When no client timeout is defined in the configuration, QCC timeout task
is never allocated. However, a NULL timeout task is also used as a
criteria in qcc_is_dead() to consider that the MUX instance should be
released as timeout stroke earlier.
This bug causes every connection to be closed by haproxy side with a
CONNECTION_CLOSE. This is notable when using several streams per
connection with only the first stream completed and the others failed.
To fix this, change timeout task allocation policy. It is now always
allocated. This means that if no timeout is defined, it will never be
run. This is not considered a waste of resource as no timeout in the
configuration is considered as an exception case. However, this has the
advantage to simplify the rest of the code which can now check for the
task instance without having an extra check on the timeout value.
This bug is labelled as minor as it only occurs if no timeout client is
defined which reports warning on startup as it may caused unexpected
behavior.
This bug should be backported up to 2.6.
Signature algorithms allows us to select the right certificates when
using TLSv1.3. This patch update the ssl_crt-list_filters.vtc regtest to
do more precise testing with TLSv1.3 in addition to TLSv1.2.
This allow us to test correctly bug #2300.
It could be backported to 2.8 with the previous fix for certificate
selection.
When using TLSv1.3, the signature algorithms extension is used to chose
the right ECDSA or RSA certificate.
However there was an old test for previous version of TLS (< 1.3) which
was testing if the cipher is compatible with ECDSA when an ECDSA
signature algorithm is used. This test was relying on
SSL_CIPHER_get_auth_nid(cipher) == NID_auth_ecdsa to verify if the
cipher is still good.
Problem is, with TLSv1.3, all ciphersuites are compatible with any
authentication algorithm, but SSL_CIPHER_get_auth_nid(cipher) does not
return NID_auth_ecdsa, but NID_auth_any.
Because of this, with TLSv1.3 when both ECDSA and RSA certificates are
available for a domain, the ECDSA one is not chosen in priority.
This patch also introduces a test on the cipher IDs for the signaling
ciphersuites, because they would always return NID_auth_any, and are not
relevent for this selection.
This patch fixes issue #2300.
Must be backported in all stable versions.
Similar to the previous commit which check for maxconn before allocating
a QUIC connection, this patch checks for maxsslconn at the same step.
This is necessary as a QUIC connection cannot run without a SSL context.
This should be backported up to 2.6. It relies on the following patch :
"BUG/MINOR: ssl: use a thread-safe sslconns increment"
Increment actconn and check maxconn limit when a quic_conn is
instantiated. This is necessary because prior to this patch, quic_conn
instances where not counted. Global actconn was only incremented after
the handshake has been completed and the connection structure is
allocated.
The increment is done using increment_actconn() on INITIAL packet
parsing if a new connection is about to be created. If the limit is
reached, the allocation is cancelled and the INITIAL packet is dropped.
The decrement is done under quic_conn_release(). This means that
quic_cc_conn instances are not taken into account. This seems safe
enough because quic_cc_conn are only used for minimal usage.
The counterpart of this change is that maxconn must not be checked a
second time when listener_accept() is done over a QUIC connection. For
this, a new bind_conf flag BC_O_XPRT_MAXCONN is set for listeners when
maxconn is already counted by the lower layer. For the moment, it is
positionned only for QUIC listeners.
Without this patch, haproxy process could suffer from heavy memory/CPU
load if the number of concurrent handshake is high.
This patch is not considered a bug fix per-se. However, it has a major
benefit to protect against too many QUIC handshakes. As such, it should
be backported up to 2.6. For this, it relies on the following patch :
"MINOR: frontend: implement a dedicated actconn increment function"
Each time a new SSL context is allocated, global.sslconns is
incremented. If global.maxsslconn is reached, the allocation is
cancelled.
This procedure was not entirely thread-safe due to the check and
increment operations conducted at different stage. This could lead to
global.maxsslconn slightly exceeded when several threads allocate SSL
context while sslconns is near the limit.
To fix this, use a CAS operation in a do/while loop. This code is
similar to the actconn/maxconn increment for connection.
A new function increment_sslconn() is defined for this operation. For
the moment, only SSL code is using it. However, it is expected that QUIC
will also use it to count QUIC connections as SSL ones.
This should be backported to all stable releases. Note that prior to the
2.6, sslconns was outside of global struct, so this commit should be
slightly adjusted.
When a new frontend connection is instantiated, actconn global counter
is incremented. If global maxconn value is reached, the connection is
cancelled. This ensures that system limit are under control.
Prior to this patch, the atomic check/increment operations were done
directly into listener_accept(). Move them in a dedicated function
increment_actconn() in frontend module. This will be useful when QUIC
connections will be counted in actconn counter.
When entering closing state, a QUIC connection is maintained during a
certain delay. The principle is to ensure the other peer has received
the CONNECTION_CLOSE frame. In case of packet duplication/reordering,
CONNECTION_CLOSE is reemitted.
QUIC RFC recommends to use at least 3 times the PTO value. However,
prior to this patch, haproxy used instead the max value between 3 times
the PTO and the connection idle timeout. In the default case, idle
timeout is set to 30s which is in most of the times largely superior to
the PTO. This has the downside of keeping the connection in memory for
too long whereas all resources could be released much earlier.
Fix this behavior by using 3 times the PTO on closing or draining state.
This value is limited up to 1s. This ensures that most of connections
are covered by this. If a connection runs with a very high RTT, it must
not impact the whole process and should be released in a reasonable
delay.
This should be backported up to 2.6.
Now when calling ha_panic() with a thread still under malloc_trim(),
we'll set a new tainted flag to easily report it, and the output
trace will report that this condition happened and will suggest to
use no-memory-trimming to avoid it in the future.
William suggested that since we can detect the presence of Lua in the
stack, let's combine it with stuck detection to set a new pair of flags
indicating a stuck Lua context and a stuck Lua shared context.
Now, executing an infinite loop in a Lua sample fetch function with
yield disabled crashes with tainted=0xe40 if loaded from a lua-load
statement, or tainted=0x640 from a lua-load-per-thread statement.
In addition, at the end of the panic dump, we can check if Lua was
seen stuck and emit recommendations about lua-load-per-thread and
the choice of dependencies depending on the presence of threads
and/or shared context.
This will make it easier to know that the panic function was called,
for the occasional case where the dump crashes and/or the stack is
corrupted and not much exploitable. Now at least it will be sufficient
to check the tainted value to know that someone called ha_panic(), and
it will also be usable to condition extra analysis.
Remove some code duplication by introducing a basic helper function
to detach a server from its parent proxy. It is supported to call
the function even if the server is not yet listed in the proxy list.
If the server is not yet listed in the proxy, the function will do
nothing. In delete_server(), we previously performed some BUG_ON()
to ensure that the detach always succeeded given that we were certain
that the server was in the proxy list because it was retrieved through
get_backend_server().
However this test is superfluous, we can safely assume that the operation
will always succeed if get_backend_server() returned != NULL (we're under
full thread isolation), and if it's not the case, then we have a bigger
API issue anyway..
In 304672320e ("MINOR: server: support keyword proto in 'add server' cli")
improper use of conn_get_best_mux_entry() function was made:
First, server's proxy mode was directly passed as "proto_mode" argument
to conn_get_best_mux_entry(), but this is strictly invalid because while
there is some relationship between proto modes and proxy modes, they
don't use the same storage mechanism and cannot be used interchangeably.
Because of this bug, conn_get_best_mux_entry() would not work at all for
TCP because PR_MODE_TCP equals 0, where PROTO_MODE_TCP normally equals 1.
Then another, less sensitive bug, remains:
as its name and description implies, conn_get_best_mux_entry() will try
its best to return something to the user, only using keyword (mux_proto)
input as an hint to return the most relevant mux within the list of
mux that are compatibles with proto_side and proto_mode values.
This means that even if mux_proto cannot be found or is not available
with current proto_side and proto_mode values, conn_get_best_mux_entry()
will most probably fallback to a more generic mux.
However in cli_parse_add_server(), we directly check the result of
conn_get_best_mux_entry() and consider that it will return NULL if the
provided keyword hint for mux_proto cannot be found. This will result in
the function not raising errors as expected, because most of the times if
the expected proto cannot be found, then we'll silently switch to the
fallback one, despite the user providing an explicit proto.
To fix that, we store the result of conn_get_best_mux_entry() to compare
the returned mux proto name with the one we're expecting to get, as it
is originally performed in cfgparse during initial server keyword parsing.
This patch depends on
- "MINOR: connection: add conn_pr_mode_to_proto_mode() helper func")
It must be backported up to 2.6.
This function allows to safely map proxy mode to corresponding proto_mode
This will allow for easier code maintenance and prevent mixups between
proxy mode and proto mode.
In 9a74a6c ("MAJOR: log: introduce log backends"), a mistake was made:
it was assumed that the proxy mode was already known during server
keyword parsing in parse_server() function, but this is wrong.
Indeed, "mode log" can be declared late in the proxy section. Due to this,
a simple config like this will cause the process to crash:
|backend test
|
| server name 127.0.0.1:8080
| mode log
In order to fix this, we relax some checks in _srv_parse_init() and store
the address protocol from str2sa_range() in server struct, then we set-up
a postparsing function that is to be called after config parsing to
finish the server checks/initialization that depend on the proxy mode
to be known. We achieve this by checking the PR_CAP_LB capability from
the parent proxy to know if we're in such case where the effective proxy
mode is not yet known (it is assumed that other proxies which are implicit
ones don't provide this possibility and thus don't suffer from this
constraint).
Only then, if the capability is not found, we immediately perform the
server checks that depend on the proxy mode, else the check is postponed
and it will automatically be performed during postparsing thanks to the
REGISTER_POST_SERVER_CHECK() hook.
Note that we remove the SRV_PARSE_IN_LOG_BE flag because it was introduced
in the above commit and it is no longer relevant.
No backport needed unless 9a74a6c gets backported.
The two recent commits below each added one flag to h2c but omitted to
update the __APPEND_FLAG macro used by dev/flags so they are not
properly decoded:
3dd963b35 ("BUG/MINOR: mux-h2: fix http-request and http-keep-alive timeouts again")
68d02e5fa ("BUG/MINOR: mux-h2: make up other blocked streams upon removal from list")
This can be backported along with these commits.
Define a new function srv_add_to_avail_list(). This function is used to
centralize connection insertion in available tree. It reuses a BUG_ON()
statement to ensure the connection is not present in the idle list.
Since the following commit, idle conns are stored in a list as secondary
storage to retrieve them in usage order :
5afcb686b9
MAJOR: connection: purge idle conn by last usage
The list usage has been extended wherever connections lookup are done
both on idle and safe trees. This reduced the code size by replacing a
two tree loops by a single list loop.
LIST_ELEM() is used in this context to retrieve the first idle list
element from the server list head. However, macro usage was wrong due to
an extra '&' operator which returns an invalid connection reference.
This will most of the time caused a crash on conn_delete_from_tree() or
affiliated functions.
This bug only occurs if the FD pool is exhausted and some idle
connections are selected to be killed.
It can be reproduced using the following config and h2load command :
$ h2load -t 8 -c 800 -m 10 -n 800 "http://127.0.0.1:21080/?s=10k"
global
maxconn 100
defaults
mode http
timeout connect 20s
timeout client 20s
timeout server 20s
listen li
bind :21080 proto h2
server nginx 127.99.0.1:30080 proto h1
This bug has been introduced by the above commit. Thus no need to
backport this fix.
Note that LIST_ELEM() macro usage was slightly adjusted also in
srv_migrate_conns_to_remove(). The function used toremove_list instead
of idle_list connection list element. This is not a bug as they are
stored in the same union. However, the new code is clearer as it intends
to move connection from the idle_list only into the toremove_list
mt-list.
Idle connections are both stored in an idle/safe tree and in an idle
list. The list is used as a secondary storage to be able to retrieve
them by usage order.
If a connection is moved into the available tree, it must not be present
in the idle list. A BUG_ON() was written to check this but was placed at
the wrong code section. Fix this by removing the misplaced one and write
new ones for avail_conns tree insertion and lookup.
The impact of this bug is minor as the misplaced BUG_ON() did not seem
to be triggered.
No need to backport.
After making it configurable in previous commit "MINOR: lua: Add flags
to configure logging behaviour", this patch changes the default value
of tune.lua.log.stderr from 'on' (unconditionally forward LUA logs to
stderr) to 'auto' (only forward LUA logs to stderr if logging via a
standard logger is disabled, or none is configured for the current context)
Since this is a change in behaviour, it shouldn't be backported
Until now, messages printed from LUA log functions were sent both to
the any logger configured for the current proxy, and additionally to
stderr (in most cases)
This introduces two flags to configure LUA log handling:
- tune.lua.log.loggers to use standard loggers or not
- tune.lua.log.stderr to use stderr, or not, or only conditionally
This addresses github feature request #2316
This can be backported to 2.8 as it doesn't change previous behaviour.
The configuration parser still adds the 'ca-base' directory when loading
the @system-ca, preventing it to be loaded correctly.
This patch fixes the problem by not adding the ca-base when a file
starts by '@'.
Fix issue #2313.
Must be backported as far as 2.6.
Released version 2.9-dev8 with the following main changes :
- MINOR: ssl: add an explicit error when 'ciphersuites' are not supported
- BUILD: ssl: enable 'ciphersuites' for WolfSSL
- BUILD: ssl: add 'ssl_c_r_dn' fetch for WolfSSL
- BUILD: ssl: add 'secure_memcmp' converter for WolfSSL and awslc
- BUILD: ssl: enable keylog for awslc
- CLEANUP: ssl: remove compat functions for openssl < 1.0.0
- BUILD: ssl: enable keylog for WolfSSL
- REGTESTS: pki: add a pki for SSL tests
- REGTESTS: ssl: update common.pem with the new pki
- REGTESTS: ssl: disable ssl_dh.vtc for WolfSSL
- REGTESTS: wolfssl: temporarly disable some failing reg-tests
- CI: ssl: add wolfssl to build-ssl.sh
- CI: ssl: add git id support for wolfssl download
- CI: github: add a wolfssl entry to the CI
- CI: github: update wolfssl to git revision d83f2fa
- CI: github: add awslc 1.16.0 to the push CI
- BUG/MINOR: quic: Avoid crashing with unsupported cryptographic algos
- REORG: quic: cleanup traces definition
- BUG/MINOR: quic: reject packet with no frame
- BUG/MEDIUM: mux-quic: fix RESET_STREAM on send-only stream
- BUG/MINOR: mux-quic: support initial 0 max-stream-data
- BUG/MINOR: h3: strengthen host/authority header parsing
- CLEANUP: connection: drop an uneeded leftover cast
- BUG/MAJOR: connection: make sure to always remove a connection from the tree
- BUG/MINOR: quic: fix qc.cids access on quic-conn fail alloc
- BUG/MINOR: quic: fix free on quic-conn fail alloc
- BUG/MINOR: mux-quic: fix free on qcs-new fail alloc
- BUG/MEDIUM: quic-conn: free unsent frames on retransmit to prevent crash
- MEDIUM: tree-wide: logsrv struct becomes logger
- MEDIUM: log: introduce log target
- DOC: config: log <address> becomes log <target> in "log" related doc
- MEDIUM: sink/log: stop relying on AF_UNSPEC for rings
- MINOR: log: support explicit log target as argument in __do_send_log()
- MINOR: log: remove the logger dependency in do_send_log()
- MEDIUM: log/sink: simplify log header handling
- MEDIUM: sink: inherit from caller fmt in ring_write() when rings didn't set one
- MINOR: sink: add sink_new_from_srv() function
- MAJOR: log: introduce log backends
- MINOR: log/balance: support for the "sticky" lb algorithm
- MINOR: log/balance: support for the "random" lb algorithm
- MINOR: lbprm: support for the "none" hash-type function
- MINOR: lbprm: compute the hash avalanche in gen_hash()
- MINOR: sample: add sample_process_cnv() function
- MEDIUM: log/balance: support for the "hash" lb algorithm
- REGTEST: add a test for log-backend used as a log target
- MINOR: server: introduce "log-bufsize" kw
- BUG/MEDIUM: stconn: Report a send activity everytime data were sent
- BUG/MEDIUM: applet: Report a send activity everytime data were sent
- BUG/MINOR: mux-h1: Send a 400-bad-request on shutdown before the first request
- MINOR: support for http-response set-timeout
- BUG/MINOR: mux-h2: make up other blocked streams upon removal from list
- DEBUG: pool: store the memprof bin on alloc() and update it on free()
- BUG/MEDIUM: quic_conn: let the scheduler kill the task when needed
- CLEANUP: hlua: Remove dead-code on error path in hlua_socket_new()
- BUG/MEDIUM: mux-h1: do not forget TLR/EOT even when no data is sent
- BUG/MINOR: htpp-ana/stats: Specify that HTX redirect messages have a C-L header
- BUG/MEDIUM: mux-h2: Don't report an error on shutr if a shutw is pending
- MEDIUM: stconn/channel: Move pipes used for the splicing in the SE descriptors
- MINOR: stconn: Start to introduce mux-to-mux fast-forwarding notion
- MINOR: stconn: Extend iobuf to handle a buffer in addition to a pipe
- MINOR: connection: Add new mux callbacks to perform data fast-forwarding
- MINOR: stconn: Temporarily remove kernel splicing support
- MINOR: mux-pt: Temporarily remove splicing support
- MINOR: mux-h1: Temporarily remove splicing support
- MINOR: connection: Remove mux callbacks about splicing
- MEDIUM: stconn: Add mux-to-mux fast-forward support
- MINOR: mux-h1: Use HTX extra field only for responses with known length
- MEDIUM: mux-h1: Properly handle state transitions of chunked outgoing messages
- MEDIUM: raw-sock: Specifiy amount of data to send via snd_pipe callback
- MINOR: mux-h1: Add function to add size of a chunk to an outgoind message
- MEDIUM: mux-h1: Simplify zero-copy on sending path
- MEDIUM: mux-h1: Simplify payload formatting based on HTX blocks on sending path
- MEDIUM: mux-h1: Add fast-forwarding support
- MINOR: h2: Set the BODYLESS_RESP flag on the HTX start-line if necessary
- MEDIUM: mux-h2: Add consumer-side fast-forwarding support
- MEDIUM: channel: don't look at iobuf to report an empty channel
- MINOR: tree-wide: Only rely on co_data() to check channel emptyness
- REGTESTS: Reenable HTTP tests about splicing
- CLEAN: mux-h1: Remove useless __maybe_unused attribute on h1_make_chunk()
- MEDIUM: mux-pt: Add fast-forwarding support
- MINOR: global: Add an option to disable the zero-copy forwarding
- BUILD: mux-h1: Fix build without kernel splicing support
- REORG: stconn/muxes: Rename init step in fast-forwarding
- MINOR: dgram: allow to set rcv/sndbuf for dgram sockets as well
- BUG/MINOR: mux-h2: fix http-request and http-keep-alive timeouts again
- BUG/MINOR: trace: fix trace parser error reporting
- BUG/MEDIUM: peers: Be sure to always refresh recconnect timer in sync task
- BUG/MEDIUM: peers: Fix synchro for huge number of tables
- MINOR: cfgparse: forbid mixing reverse and standard listeners
- MINOR: listener: add nbconn kw for reverse connect
- MINOR: server: convert @reverse to rev@ standard format
- MINOR: cfgparse: rename "rev@" prefix to "rhttp@"
- REGTESTS: remove maxconn from rhttp bind line
- MINOR: listener: forbid most keywords for reverse HTTP bind
- MINOR: sample: Added support for Arrays in sample_conv_json_query in sample.c
- MINOR: mux-h2/traces: explicitly show the error/refused stream states
- MINOR: mux-h2/traces: clarify the "rejected H2 request" event
- BUG/MINOR: mux-h2: commit the current stream ID even on reject
- BUG/MINOR: mux-h2: update tracked counters with req cnt/req err
Originally H2 would transfer everything to H1 and parsing errors were
handled there, so that if there was a track-sc rule in effect, the
counters would be updated as well. As we started to add more and more
HTTP-compliance checks at the H2 layer, then switched to HTX, we
progressively lost this ability. It's a bit annoying because it means
we will not maintain accurate error counters for a given source, for
example.
This patch adds the calls to session_inc_http_req_ctr() and
session_inc_http_err_ctr() when needed (i.e. when failing to parse
an HTTP request since all other cases are handled by the stream),
just like mux-h1 does. The same should be done for mux-h3 by the
way.
This can be backported to recent stable versions. It's not exactly a
bug, rather a missing feature in that we had never updated this counter
for H2 till now, but it does make sense to do it especially based on
what the doc says about its usage.
The H2 spec says that a HEADERS frame turns an idle stream to the open
state, and it may then turn to half-closed(remote) on ES, then to close,
all at once, if we respond with RST (e.g. on error). Due to the fact that
we process a complete frame at once since h2_dec_hdrs() may reassemble
CONTINUATION frames until everything is complete, the state was only
committed after the frame was completley valid (otherwise multiple passes
could result in subsequent frames being rejected as the stream ID would
be equal to the highest one).
However this is not correct because it means that a client may retry on
the same ID as a previously failed one, which technically is forbidden
(for example the client couldn't know which of them a WINDOW_UPDATE or
RST_STREAM frame is for).
In practice, due to the error paths, this would only be possible when
failing to decode HPACK while leaving the HPACK stream intact, thus
when the valid decoded HPACK stream cannot be turned into a valid HTTP
representation, e.g. when the resulting headers are too large for example.
The solution to avoid this consists in committing the stream ID on this
error path as well. h2spec continues to be happy.
Thanks to Annika Wickert and Tim Windelschmidt for reporting this issue.
This fix must be backported to all stable versions.
In h2_frt_handle_headers() all failures lead to a generic message saying
"rejected H2 request". It's quite inexpressive while there are a few
distinct tests that are made before jumping there:
- trailers on closed stream
- unparsable request
- refused stream
Let's emit the traces from these call points instead so that we get more
info about what happened. Since these are user-level messages, we take
care of keeping them aligned as much as possible.
For example before it would say:
[04|h2|1|mux_h2.c:2859] rejected H2 request : h2c=0x7f5d58036fd0(F,FRE)
[04|h2|5|mux_h2.c:2860] h2c_frt_handle_headers(): leaving on error : h2c=0x7f5d58036fd0(F,FRE) dsi=1 h2s=0x9fdb60(0,CLO)
And now it says:
[04|h2|1|mux_h2.c:2817] rcvd unparsable H2 request : h2c=0x7f55f8037160(F,FRH) dsi=1 h2s=CLO
[04|h2|5|mux_h2.c:2875] h2c_frt_handle_headers(): leaving on error : h2c=0x7f55f8037160(F,FRE) dsi=1 h2s=CLO
Sometimes it's unclear whether a stream is still open or closed when
certain traces are emitted, for example when the stream was refused,
because the reported pointer and ID in fact correspond to the refused
stream. And for closed streams, no pointer/name is printed, leaving
some confusion about the state. This patch makes the situation easier
to analyse by explicitly reporting "h2s=CLO" on closed/error/refused
streams so that we don't waste time comparing pointers and we instantly
know the stream is closed. Now instead of emitting:
[03|h2|5|mux_h2.c:2874] h2c_frt_handle_headers(): leaving on error : h2c=0x7fdfa8026820(F,FRE) dsi=201 h2s=0x9fdb60(0,CLO)
It will emit:
[03|h2|5|mux_h2.c:2874] h2c_frt_handle_headers(): leaving on error : h2c=0x7fdfa8026820(F,FRE) dsi=201 h2s=CLO
Method now returns the content of Json Arrays, if it is specified in
Json Path as String. The start and end character is a square bracket. Any
complex object in the array is returned as Json, so that you might get Arrays
of Array or objects. Only recommended for Arrays of simple types (e.g.,
String or int) which will be returned as CSV String. Also updated
documentation and fixed issue with parenthesis and other changes from
comments.
This patch was discussed in issue #2281.
Signed-off-by: William Lallemand <wlallemand@haproxy.com>
Reverse HTTP bind is very specific in that in rely on a server to
initiate connection. All connection settings are defined on the server
line and ignored from the bind line.
Before this patch, most of keywords were silently ignored. This could
result in a configuration from doing unexpected things from the user
point of view. To improve this situation, add a new 'rhttp_ok' field in
bind_kw structure. If not set, the keyword is forbidden on a reverse
bind line and will cause a fatal config error.
For the moment, only the following keywords are usable with reverse bind
'id', 'name' and 'nbconn'.
This change is safe as it's already forbidden to mix reverse and
standard addresses on the same bind line.
The maxconn keyword is not used anymore for reverse HTTP bind. It has
been replaced recently by the new keyword nbconn. As it's default value
is 1, it can be safely removed from the regtest without affecting its
behavior.
Previously, maxconn keyword was reused for a specific usage on reverse
HTTP binds to specify the number of active connect to proceed. To avoid
confusion, introduce a new dedicated keyword 'nbconn' which is specific
to reverse HTTP bind.
This new keyword is forbidden for non-reverse listener. A fatal error is
emitted during config parsing if this rule is not respected. It's safe
because it's also forbidden to mix standard and reverse addresses on the
same bind line.
Internally, nbconn value will be reassigned to 'maxconn' member of
bind_conf structure. This ensures that listener layer will automatically
reenable the preconnect task each time a connection is closed.
Reverse HTTP listeners are very specific and share only a very limited
subset of keywords with other listeners. As such, it is probable
meaningless to mix standard and reverse addresses on the same bind line.
This patch emits a fatal error during configuration parsing if this is
the case.
The number of updates sent at once was limited to not loop too long to emit
updates when the buffer size is huge or when the number of sync tables is
huge. The limit can be configured and is set to 200 by default. However,
this fix introduced a bug. It is impossible to syncrhonize two peers if the
number of tables is higher than this limit. Thus by default, it is not
possible to sync two peers if there are more than 200 tables to sync.
Technically speacking, a teaching process is finished if we loop on all tables
with no new update messages sent. Because we are limited at each call, the loop
is splitted on several calls. However the restart point for the next loop is
always the last table for which we emitted an update message. Thus with more
tables than the limit, the loop never reachs the end point.
Worse, in conjunction with the bug fixed by "BUG/MEDIUM: peers: Be sure to
always refresh recconnect timer in sync task", it is possible to trigger the
watchdog because the applets may be woken up in loop and leave requesting
more room while its buffer is empty.
To fix the issue, restart conditions for a teaching loop were changed. If
the teach process is interrupted, we now save the restart point, called
stop_local_table. It is the last evaluated table on the previous loop. This
restart point is reset when the teach process is finished.
In additionn, the updates_sent variable in peer_send_msgs() was renamed to
updates to avoid ambiguities. Indeed, the variable is incremented, whether
messages were sent or not.
This patch must be backported as far as 2.6.
A sync task used to manage reconnect, sessions creation or shutdown and data
synchronization is responsible to refresh reconnect and heartbeat timers for
each remote peers and trigger applets wakeup. These timers are used to
refresh the sync task timeer itself. Thus it is important to take care to
always properly refresh them.
However, when there are some data to push, the reconnect timer is not
checked. It may be expired and not refreshed. In this case, an expired timer
may be used to the sync task, leading to a storm of wakeups. The sync task
is woken up in loop because its timer is in the past, waking up Peer applets
at each time.
To fix the issue, the peer's reconnect timer is now refresh to the default
reconnect timeout, if necessary, when there are some data to push.
This patch must be backported to all stable versions.