The API was extended by commit e352b9dac ("MINOR: vars: make vars_get_by_*
support an optional default value") but I didn't notice that opentracing
was using it, so it broke the build. No backport needed.
It is quite common to see in configurations constructions like the
following one:
http-request set-var(txn.bodylen) 0
http-request set-var(txn.bodylen) req.hdr(content-length)
...
http-request set-header orig-len %[var(txn.bodylen)]
The set-var() rules are almost always duplicated when manipulating
integers or any other value that is mandatory along operations. This is
a problem because it makes the configurations complicated to maintain
and slower than needed. And it becomes even more complicated when several
conditions may set the same variable because the risk of forgetting to
initialize it or to accidentally reset it is high.
This patch extends the var() sample fetch function to take an optional
argument which contains a default value to be returned if the variable
was not set. This way it becomes much simpler to use the variable, just
set it where needed, and read it with a fall back to the default value:
http-request set-var(txn.bodylen) req.hdr(content-length)
...
http-request set-header orig-len %[var(txn.bodylen,0)]
The default value is always passed as a string, thus it will experience
a cast to the output type. It doesn't seem userful to complicate the
configuration to pass an explicit type at this point.
The vars.vtc regtest was updated accordingly.
In preparation for support default values when fetching variables, we
need to update the internal API to pass an extra argument to functions
vars_get_by_{name,desc} to provide an optional default value. This
patch does this and always passes NULL in this argument. var_to_smp()
was extended to fall back to this value when available.
The two functions vars_get_by_name() and vars_get_by_scope() perform
almost the same operations except that they differ from the way the
name and scope are retrieved. The second part in common is more
complex and involves locking, so better factor this one out into a
new function.
There is no other change than refactoring.
Most often "set var" on the CLI is used to set a string, and using only
expressions is not always convenient, particularly when trying to
concatenate variables sur as host names and paths.
Now the "set var" command supports an optional keyword before the value
to indicate its type. "expr" takes an expression just like before this
patch, and "fmt" a format string, making it work like the "set-var-fmt"
actions.
The VTC was updated to include a test on the format string.
Just like the set-var-fmt action for tcp/http rules, the set-var-fmt
directive in global sections allows to pre-set process-wide variables
using a format string instead of a sample expression. This is often
more convenient when it is required to concatenate multiple fields,
or when emitting just one word.
The log-format strings are usable at plenty of places, but the expressions
using %[] were restricted to request or response context and nothing else.
This prevents from using them from the config context or the CLI, let's
relax this.
We're using a dummy temporary proxy when creating global variables in
the configuration file, it was copied from the CLI's code and was
mistakenly called "CLI", better name it "CFG". It should not appear
anywhere except maybe when debugging cores.
When attempting to set a variable does not start with the "proc" scope on
the CLI, we used to emit "only proc is permitted in the global section"
which obviously is a leftover from the initial code.
This may be backported to 2.4.
When a variable starts with the wrong scope, it is named without stripping
the extra characters that follow it, which usually are closing parenthesis.
Let's make sure we only report what is expected.
This may be backported to 2.4.
In commit 9a621ae76 ("MEDIUM: vars: add a new "set-var-fmt" action")
we introduced the support for format strings in variables with the
ability to release them on exit, except that it's the wrong list that
was being scanned for the rule (http vs vars), resulting in random
crashes during deinit.
This was a recent commit in 2.5-dev, no backport is needed.
The set-var() action is convenient because it preserves the input type
but it's a pain to deal with when trying to concatenate values. The
most recurring example is when it's needed to build a variable composed
of the source address and the source port. Usually it ends up like this:
tcp-request session set-var(sess.port) src_port
tcp-request session set-var(sess.addr) src,concat(":",sess.port)
This is even worse when trying to aggregate multiple fields from stick-table
data for example. Due to this a lot of users instead abuse headers from HTTP
rules:
http-request set-header(x-addr) %[src]:%[src_port]
But this requires some careful cleanups to make sure they won't leak, and
it's significantly more expensive to deal with. And generally speaking it's
not clean. Plus it must be performed for each and every request, which is
expensive for this common case of ip+port that doesn't change for the whole
session.
This patch addresses this limitation by implementing a new "set-var-fmt"
action which performs the same work as "set-var" but takes a format string
in argument instead of an expression. This way it becomes pretty simple to
just write:
tcp-request session set-var-fmt(sess.addr) %[src]:%[src_port]
It is usable in all rulesets that already support the "set-var" action.
It is not yet implemented for the global "set-var" directive (which already
takes a string) and the CLI's "set var" command, which would definitely
benefit from it but currently uses its own parser and engine, thus it
must be reworked.
The doc and regtests were updated.
There is a massive abuse of copy-paste in the doc that is visible in
the examples and arguments declaration. Let's at least remove irrelevant
examples for now.
When the expression called in "set-var" uses argments that require late
resolution, the context must be set. At the moment, any unknown argument
is misleadingly reported as "ACL":
frontend f
bind :8080
mode http
http-request set-var(proc.a) be_conn(foo)
parsing [b1.cfg:4]: unable to find backend 'foo' referenced in arg 1 \
of ACL keyword 'be_conn' in proxy 'f'.
Once the context is properly set, it now says the truth:
parsing [b1.cfg:8]: unable to find backend 'foo' referenced in arg 1 \
of sample fetch keyword 'be_conn' in http-request expression in proxy 'f'.
This may be backported but is not really important. If so, the preceeding
patches "BUG/MINOR: vars: improve accuracy of the rules used to check
expression validity" and "MINOR: sample: add missing ARGC_ entries" must
be backported as well.
For a long time we couldn't have arguments in expressions used in
tcp-request, tcp-response etc rules. But now due to the variables
it's possible, and their context in case of failure to resolve an
argument (e.g. backend name not found) is not properly reported
because there is no arg context values in ARGC_* to report them.
Let's add a number of missing ones for tcp-request {connection,
session,content}, tcp-response content, tcp-check, the config
parser (for "set-var" in the global section) and the CLI parser
(for "set-var" on the CLI).
The set-var() expression naturally checks whether expressions are valid
in the context of the rule, but it fails to differentiate frontends from
backends. As such for tcp-content and http-request rules, it will only
accept frontend-compatible sample-fetches, excluding those declared with
SMP_UES_BKEND (a few such as be_id, be_name). For the response it accepts
the backend-compatible expressions only, though it seems that there are
no sample-fetch function that are valid only in the frontend's content,
so that should not cause any problem.
Note that while allowing valid configs to be used, the fix might also
uncover some incorrect configurations where some expressions currently
return nothing (e.g. something depending on frontend declared in a
backend), and which could be rejected, but there does not seem to be
any such keyword. Thus while it should be backported, better not backport
it too far (2.4 and possibly 2.3 only).
The parser checks first for "set-var" then "unset-var" from the updated
offset instead of testing it only when the other one fails, so it
validates this rule as "unset-var":
http-request set-varunset-var(proc.a)
This should be backported everywhere relevant, though it's mostly harmless
as it's unlikely that some users are purposely writing this in their conf!
A recent update to BoringSSL broke the build again, and given that
it's not used except for QUIC development, let's temporarily disable
it until the issue is analysed and fixed.
Sometimes it is convenient to remap large sets of URIs to new ones (e.g.
after a site migration for example). This can be achieved using
"http-request redirect" combined with maps, but one difficulty there is
that non-matching entries will return an empty response. In order to
avoid this, duplicating the operation as an ACL condition ending in
"-m found" is possible but it becomes complex and error-prone while it's
known that an empty URL is not valid in a location header.
This patch addresses this by improving the redirect rules to be able to
simply ignore the rule and skip to the next one if the result of the
evaluation of the "location" expression is empty. However in order not
to break existing setups, it requires a new "ignore-empty" keyword.
There used to be an ACT_FLAG_FINAL on redirect rules that's used during
the parsing to emit a warning if followed by another rule, so here we
only set it if the option is not there. The http_apply_redirect_rule()
function now returns a 3rd value to mention that it did nothing and
that this was not an error, so that callers can just ignore the rule.
The regular "redirect" rules were not modified however since this does
not apply there.
The map_redirect VTC was completed with such a test and updated to 2.5
and an example was added into the documentation.
Those fetches are used to identify connection errors and SSL handshake
errors on the backend side of a connection. They can for instance be
used in a log-format line as in the regtest.
The bc_conn_err and bc_conn_err_str sample fetches give the status of
the connection on the backend side. The error codes and error messages
are the same than the ones that can be raised by the fc_conn_err fetch.
This new sample fetch along the ssl_bc_hsk_err_str fetch contain the
last SSL error of the error stack that occurred during the SSL
handshake (from the backend's perspective).
The locking in the dequeuing process was significantly improved by commit
49667c14b ("MEDIUM: queue: take the proxy lock only during the px queue
accesses") in that it tries hard to limit the time during which the
proxy's queue lock is held to the strict minimum. Unfortunately it's not
enough anymore, because we take up the task and manipulate a few pendconn
elements after releasing the proxy's lock (while we're under the server's
lock) but the task will not necessarily hold the server lock since it may
not have successfully found one (e.g. timeout in the backend queue). As
such, stream_free() calling pendconn_free() may release the pendconn
immediately after the proxy's lock is released while the other thread
currently proceeding with the dequeuing tries to wake up the owner's
task and dies in task_wakeup().
One solution consists in releasing le proxy's lock later. But tests have
shown that we'd have to sacrifice a significant share of the performance
gained with the patch above (roughly a 20% loss).
This patch takes another approach. It adds a "del_lock" to each pendconn
struct, that allows to keep it referenced while the proxy's lock is being
released. It's mostly a serialization lock like a refcount, just to maintain
the pendconn alive till the task_wakeup() call is complete. This way we can
continue to release the proxy's lock early while keeping this one. It had
to be added to the few points where we're about to free a pendconn, namely
in pendconn_dequeue() and pendconn_unlink(). This way we continue to
release the proxy's lock very early and there is no performance degradation.
This lock may only be held under the queue's lock to prevent lock
inversion.
No backport is needed since the patch above was merged in 2.5-dev only.
This option can be used to define a specific log format that will be
used in case of error, timeout, connection failure on a frontend... It
will be used for any log line concerned by the log-separate-errors
option. It will also replace the format of specific error messages
decribed in section 8.2.6.
If no "error-log-format" is defined, the legacy error messages are still
emitted and the other error logs keep using the regular log-format.
This option will be replaced by a "error-log-format" that enables to use
a dedicated log-format for connection error messages instead of the
regular log-format (in which most of the fields would be invalid in such
a case).
The "log-error-via-logformat" mechanism will then be replaced by a test
on the presence of such an error log format or not. If a format is
defined, it is used for connection error messages, otherwise the legacy
error log format is used.
As seen in issue #1369, supporting #if with unknown macros can silently
hide typos that may result in suboptimal code paths to be used, or even
possibly bugs. It looks like our code base does not rely that much on
this, so it's worth enabling -Wundef to catch future ones and have them
turned to more explicit "#if defined()" or #ifdef.
One was in backend.c and the other one in hlua.c. No other candidate
was found with "git grep '^#if\s*USE'". It's worth noting that 3
other such tests exist for SSL_OP_NO_{SSLv3,TLSv1_1,TLSv1_2} but
that these ones are properly set to 0 in openssl-compat.h when not
defined.
Other build warnings were emitted on LIBRESSL_VERSION_NUMBER with -Wundef
under openssl < 1.1. Related to GH issue #1369. Seems like some of them
could be simplified a little bit.
The condition should first check whether `bsize` is reached, before
dereferencing the offset. Even if this always works fine, due to the
string being null-terminated, this certainly looks odd.
Found using GitHub's CodeQL scan.
This bug traces back to at least 97c2ae13bc
(1.7.0+) and this patch should be backported accordingly.
Using localtime / gmtime is not thread-safe, whereas the `get_*` wrappers are.
Found using GitHub's CodeQL scan.
The use in sample_conv_ltime() can be traced back to at least
fac9ccfb70 (first appearing in 1.6-dev3), so all
supported branches with thread support are affected.
Released version 2.5-dev5 with the following main changes :
- MINOR: httpclient: initialize the proxy
- MINOR: httpclient: implement a simple HTTP Client API
- MINOR: httpclient/cli: implement a simple client over the CLI
- MINOR: httpclient/cli: change the User-Agent to "HAProxy"
- MEDIUM: ssl: Keep a reference to the client's certificate for use in logs
- BUG/MEDIUM: h2: match absolute-path not path-absolute for :path
- BUILD/MINOR: ssl: Fix compilation with OpenSSL 1.0.2
- MINOR: server: check if srv is NULL in free_server()
- MINOR: proxy: check if p is NULL in free_proxy()
- BUG/MEDIUM: cfgparse: do not allocate IDs to automatic internal proxies
- BUG/MINOR: http_client: make sure to preset the proxy's default settings
- REGTESTS: http_upgrade: fix incorrect expectation on TCP->H1->H2
- REGTESTS: abortonclose: after retries, 503 is expected, not close
- REGTESTS: server: fix agent-check syntax and expectation
- BUG/MINOR: httpclient: fix uninitialized sl variable
- BUG/MINOR: httpclient/cli: change the appctx test in the callbacks
- BUG/MINOR: httpclient: check if hdr_num is not 0
- MINOR: httpclient: cleanup the include files
- MINOR: hlua: take the global Lua lock inside a global function
- MINOR: tools: add FreeBSD support to get_exec_path()
- BUG/MINOR: systemd: ExecStartPre must use -Ws
- MINOR: systemd: remove the ExecStartPre line in the unit file
- MINOR: ssl: add an openssl version string parser
- MINOR: cfgcond: implements openssl_version_atleast and openssl_version_before
- CLEANUP: ssl: remove useless check on p in openssl_version_parser()
- BUG/MINOR: stick-table: fix the sc-set-gpt* parser when using expressions
- BUG/MINOR: httpclient: remove deinit of the httpclient
- BUG/MEDIUM: base64: check output boundaries within base64{dec,urldec}
- MINOR: httpclient: set verify none on the https server
- MINOR: httpclient: add the server to the proxy
- BUG/MINOR: httpclient: fix Host header
- BUILD: httpclient: fix build without OpenSSL
- CI: github-actions: remove obsolete options
- CLEANUP: assorted typo fixes in the code and comments
- MINOR: proc: setting the process to produce a core dump on FreeBSD.
- BUILD: adopt script/build-ssl.sh for OpenSSL-3.0.0beta2
- MINOR: server: return the next srv instance on free_server
- BUG/MINOR: stats: use refcount to protect dynamic server on dump
- MEDIUM: server: extend refcount for all servers
- MINOR: server: define non purgeable server flag
- MINOR: server: mark referenced servers as non purgeable
- MINOR: server: mark servers referenced by LUA script as non purgeable
- MEDIUM: server: allow to remove servers at runtime except non purgeable
- BUG/MINOR: base64: base64urldec() ignores padding in output size check
- REGTEST: add missing lua requirements on server removal test
- REGTEST: fix haproxy required version for server removal test
- BUG/MINOR: proxy: don't dump servers of internal proxies
- REGTESTS: Use `feature cmd` for 2.5+ tests
- REGTESTS: Remove REQUIRE_VERSION=1.5 from all tests
- BUG/MINOR: resolvers: mark servers with name-resolution as non purgeable
- MINOR: compiler: implement an ONLY_ONCE() macro
- BUG/MINOR: lua: use strlcpy2() not strncpy() to copy sample keywords
- MEDIUM: ssl: Capture more info from Client Hello
- MINOR: sample: Expose SSL captures using new fetchers
- MINOR: sample: Add be2dec converter
- MINOR: sample: Add be2hex converter
- MEDIUM: config: Deprecate tune.ssl.capture-cipherlist-size
- BUG/MINOR: time: fix idle time computation for long sleeps
- MINOR: time: add report_idle() to report process-wide idle time
- BUG/MINOR: ebtree: remove dependency on incorrect macro for bits per long
- BUILD: activity: use #ifdef not #if on USE_MEMORY_PROFILING
- BUILD/MINOR: defaults: eliminate warning on MAXHOSTNAMELEN with -Wundef
- BUILD/MINOR: ssl: avoid a build warning on LIBRESSL_VERSION with -Wundef
- IMPORT: slz: silence a build warning with -Wundef
- BUILD/MINOR: regex: avoid a build warning on USE_PCRE2 with -Wundef
The test on FIND_OPTIMAL_MATCH for the experimental code can yield a
build warning when using -Wundef, let's turn it into a regular ifdef.
This is slz upstream commit 05630ae8f22b71022803809eb1e7deb707bb30fb
Openssl-compat emits a warning for the test on LIBRESSL_VERSION that might
be underfined, if built with -Wundef. The fix is easy, let's do it. Related
to GH issue #1369.
As reported in GH issue #1369, there is a single case of #if with a
possibly undefined value in defaults.h which is on MAXHOSTNAMELEN. Let's
turn it to a #ifdef.
The code used to rely on BITS_PER_LONG to decide on the most efficient
way to perform a 64-bit shift, but this macro is not defined (at best
it's __BITS_PER_LONG) and it's likely that it's been like this since
the early implementation of ebtrees designed on i386. Let's remove the
test on this macro and rely on sizeof(long) instead, it also has the
benefit of letting the compiler validate the two branches.
This can be backported to all versions. Thanks to Ezequiel Garcia for
reporting this one in issue #1369.
Before threads were introduced in 1.8, idle_pct used to be a global
variable indicating the overall process idle time. Threads made it
thread-local, meaning that its reporting in the stats made little
sense, though this was not easy to spot. In 2.0, the idle_pct variable
moved to the struct thread_info via commit 81036f273 ("MINOR: time:
move the cpu, mono, and idle time to thread_info"). It made it more
obvious that the idle_pct was per thread, and also allowed to more
accurately measure it. But no more effort was made in that direction.
This patch introduces a new report_idle() function that accurately
averages the per-thread idle time over all running threads (i.e. it
should remain valid even if some threads are paused or stopped), and
makes use of it in the stats / "show info" reports.
Sending traffic over only two connections of an 8-thread process
would previously show this erratic CPU usage pattern:
$ while :; do socat /tmp/sock1 - <<< "show info"|grep ^Idle;sleep 0.1;done
Idle_pct: 30
Idle_pct: 35
Idle_pct: 100
Idle_pct: 100
Idle_pct: 100
Idle_pct: 100
Idle_pct: 100
Idle_pct: 100
Idle_pct: 35
Idle_pct: 33
Idle_pct: 100
Idle_pct: 100
Idle_pct: 100
Idle_pct: 100
Idle_pct: 100
Idle_pct: 100
Now it shows this more accurate measurement:
$ while :; do socat /tmp/sock1 - <<< "show info"|grep ^Idle;sleep 0.1;done
Idle_pct: 83
Idle_pct: 83
Idle_pct: 83
Idle_pct: 83
Idle_pct: 83
Idle_pct: 83
Idle_pct: 83
Idle_pct: 83
Idle_pct: 83
Idle_pct: 83
Idle_pct: 83
Idle_pct: 83
Idle_pct: 83
Idle_pct: 83
Idle_pct: 83
This is not technically a bug but this lack of precision definitely affects
some users who rely on the idle_pct measurement. This should at least be
backported to 2.4, and might be to some older releases depending on users
demand.
In 2.4 we extended the max poll time from 1s to 60s with commit
4f59d3861 ("MINOR: time: increase the minimum wakeup interval to 60s").
This had the consequence that the calculation of the idle time percentage
may overflow during the multiply by 100 if the thread had slept 43s or
more. Let's change this to a 64 bit computation. This will have no
performance impact since this is done at most twice per second.
This should fix github issue #1366.
This must be backported to 2.4.
To be able to provide JA3 compatible TLS Fingerprints we need to expose
all Client Hello captured data using fetchers. Patch provides new
and modifies existing fetchers to add ability to filter out GREASE values:
- ssl_fc_cipherlist_*
- ssl_fc_ecformats_bin
- ssl_fc_eclist_bin
- ssl_fc_extlist_bin
- ssl_fc_protocol_hello_id
When we set tune.ssl.capture-cipherlist-size to a non-zero value
we are able to capture cipherlist supported by the client. To be able to
provide JA3 compatible TLS fingerprinting we need to capture more
information from Client Hello message:
- SSL Version
- SSL Extensions
- Elliptic Curves
- Elliptic Curve Point Formats
This patch allows HAProxy to capture such information and store it for
later use.
The lua initialization code which creates the Lua mapping of all converters
and sample fetch keywords makes use of strncpy(), and as such can take ages
to start with large values of tune.bufsize because it spends its time zeroing
gigabytes of memory for nothing. A test performed with an extreme value of
16 MB takes roughly 4 seconds, so it's possible that some users with huge
1 MB buffers (e.g. for payload analysis) notice a small startup latency.
However this does not affect config checks since the Lua stack is not yet
started. Let's replace this with strlcpy2().
This should be backported to all supported versions.
There are regularly places, especially in config analysis, where we
need to report certain things (warnings or errors) only once, but
where implementing a counter is sufficiently deterrent so that it's
not done.
Let's add a simple ONLY_ONCE() macro that implements a static variable
(char) which is atomically turned on, and returns true if it's set for
the first time. This uses fairly compact code, a single byte of BSS
and is thread-safe. There are probably a number of places in the config
parser where this could be used. It may also be used to implement a
WARN_ON() similar to BUG_ON() but which would only warn once.
When a server is configured with name-resolution, resolvers objects are
created with reference to this server. Thus the server is marked as non
purgeable to prevent its removal at runtime.
This does not need to be backport.