Commit Graph

13126 Commits

Author SHA1 Message Date
Amaury Denoyelle
9963fa74d2 MINOR: ssl: instantiate stats module
This module is responsible for providing statistics for ssl. It allocates
counters for frontend/backend/listener/server objects.
2020-11-06 12:05:17 +01:00
Christopher Faulet
a66adf41ea MINOR: http-htx: Add understandable errors for the errorfiles parsing
No details are provided when an error occurs during the parsing of an errorfile,
Thus it is a bit hard to diagnose where the problem is. Now, when it happens, an
understandable error message is reported.

This patch is not a bug fix in itself. But it will be required to change an
fatal error into a warning in last stable releases. Thus it must be backported
as far as 2.0.
2020-11-06 09:13:58 +01:00
Willy Tarreau
6d27a92b83 BUG/MINOR: ssl: don't report 1024 bits DH param load error when it's higher
The default dh_param value is 2048 and it's preset to zero unless explicitly
set, so we must not report a warning about DH param not being loadble in 1024
bits when we're going to use 2048. Thanks to Dinko for reporting this.

This should be backported to 2.2.
2020-11-05 19:40:14 +01:00
Jerome Magnin
eff2e0a958 CLEANUP: cfgparse: remove duplicate registration for transparent build options
Since commit 37bafdcbb ("MINOR: sock_inet: move the IPv4/v6 transparent mode code
to sock_inet"), build options for transparent proxying are registered twice.
This patch removes the older one.
2020-11-05 19:27:16 +01:00
Willy Tarreau
38d41996c1 MEDIUM: pattern: turn the pattern chaining to single-linked list
It does not require heavy deletion from the expr anymore, so we can now
turn this to a single-linked list since most of the time we want to delete
all instances of a given pattern from the head. By doing so we save 32 bytes
of memory per pattern. The pat_unlink_from_head() function was adjusted
accordingly.
2020-11-05 19:27:09 +01:00
Willy Tarreau
867a8a5a10 MINOR: pattern: prepare removal of a pattern from the list head
Instead of using LIST_DEL() on the pattern itself inside an expression,
we look it up from its head. The goal is to get rid of the double-linked
list while this usage remains exclusively for freeing on startup error!
2020-11-05 19:27:09 +01:00
Willy Tarreau
2817472bb0 MINOR: pattern: during reload, delete elements frem the ref, not the expression
Instead of scanning all elements from the expression and using the
slow delete path there, let's use the faster way which involves
pat_delete_gen() while the elements are detached from ther reference.
2020-11-05 19:27:09 +01:00
Willy Tarreau
ae83e63b48 MEDIUM: pattern: make pat_ref_prune() rely on pat_ref_purge_older()
When purging all of a reference, it's much more efficient to scan the
reference patterns from the reference head and delete all derivative
patterns than to scan the expressions. The only thing is that we need
to proceed both for the current and next generations, in case there is
a huge gap between the two. With this, purging 20M IP addresses in small
batches of 100 takes roughly 3 seconds.
2020-11-05 19:27:09 +01:00
Willy Tarreau
94b9abe200 MINOR: pattern: add pat_ref_purge_older() to purge old entries
This function will be usable to purge at most a specified number of old
entries from a reference. Entries are declared old if their generation
number is in the past compared to the one passed in argument. This will
ease removal of early entries when new ones have been appended.

We also call malloc_trim() when available, at the end of the series,
because this is one place where there is a lot of memory to save. Reloads
of 1M IP addresses used in an ACL made the process grow up to 1.7 GB RSS
after 10 reloads and roughly stabilize there without this call, versus
only 260 MB when the call is present. Sadly there is no direct equivalent
for jemalloc, which stabilizes around 800MB-1GB.
2020-11-05 19:27:09 +01:00
Willy Tarreau
1a6857b9c1 MINOR: pattern: implement pat_ref_load() to load a pattern at a given generation
pat_ref_load() basically combines pat_ref_append() and pat_ref_commit().
It's very similar to pat_ref_add() except that it also allows to set the
generation ID and the line number. pat_ref_add() was modified to directly
rely on it to avoid code duplication. Note that a previous declaration
of pat_ref_load() was removed as it was just a leftover of an earlier
incarnation of something possibly similar, so no existing functionality
was changed here.
2020-11-05 19:27:09 +01:00
Willy Tarreau
0439e5eeb4 MINOR: pattern: add pat_ref_commit() to commit a previously inserted element
This function will be used after a successful pat_ref_append() to propagate
the pattern to all use places (including parsing and indexing). On failure,
it will entirely roll back all insertions and free the pattern itself. It
also preserves the generation number so that it is convenient for use in
association with pat_ref_append(). pat_ref_add() was modified to rely on
it instead of open-coding the insertion and roll-back.
2020-11-05 19:27:09 +01:00
Willy Tarreau
c93da6950e MEDIUM: pattern: only match patterns that match the current generation
Instead of matching any pattern found in the tree, only match those
matching the current generation of entries. This will make sure that
reloads are atomic, regardless of the time they take to complete, and
that newly added data are not matched until the whole reference is
committed. For consistency we proceed the same way on "show map" and
"show acl".

This will have no impact for now since generations are not used.
2020-11-05 19:27:09 +01:00
Willy Tarreau
29947745b5 MINOR: pattern: store a generation number in the reference patterns
Right now it's not possible to perform a safe reload because we don't
know what patterns were recently added or were already present. This
patch adds a generation counter to the reference patterns so that it
is possible to know what generation of the reference they were loaded
with. A reference now has two generations, the current one, used for
all additions, and the next one, allocated to those wishing to update
the contents. The generation wraps at 2^32 so comparisons must be made
relative to the current position.

The idea will be that upon full reload, the caller will first get a new
generation ID, will insert all new patterns using it, will then switch
the current ID to the new one, and will delete all entries older than
the current ID. This has the benefit of supporting chunked updates that
remain consistent and that won't block the whole process for ages like
pat_ref_reload() currently does.
2020-11-05 19:27:09 +01:00
Willy Tarreau
1fd52f70e5 MINOR: pattern: introduce pat_ref_delete_by_ptr() to delete a valid reference
Till now the only way to remove a known reference was via
pat_ref_delete_by_id() which scans the whole list to find a matching pointer.
Let's add pat_ref_delete_by_ptr() which takes a valid pointer. It can be
called by the function above after the pointer is found, and can also be
used to roll back a failed insertion much more efficiently.
2020-11-05 19:27:09 +01:00
Willy Tarreau
a98b2882ac CLEANUP: pattern: remove pat_delete_fcts[] and pattern_head->delete()
These ones are not used anymore, so let's remove them to remove a bit
of the complexity. The ACL keyword's delete() function could be removed
as well, though most keyword declarations are positional and we have a
high risk of introducing a mistake here, so let's not touch the ACL part.
2020-11-05 19:27:09 +01:00
Willy Tarreau
b35aa9b256 CLEANUP: acl: don't reference the generic pattern deletion function anymore
A few ACL keyword used to reference pat_delete_gen() as the deletion
function but this is not needed since it's the default one now. Let's
just remove this reference.
2020-11-05 19:27:09 +01:00
Willy Tarreau
e828d8f0e8 MINOR: pattern: perform a single call to pat_delete_gen() under the expression
When we're removing an element under the expression lock, we don't need
anymore to run over all ->delete() functions via the expressions, since
we know that the single function does it fine now. Note that at this
point, pattern->delete() is not used at all through out the code anymore.
2020-11-05 19:27:09 +01:00
Willy Tarreau
f1c0892aa6 MINOR: pattern: remerge the list and tree deletion functions
pat_del_tree_gen() was already chained onto pat_del_list_gen() to deal
with remaining cases, so let's complete the merge and have a generic
pattern deletion function acting on the reference and taking care of
reliably removing all elements.
2020-11-05 19:27:09 +01:00
Willy Tarreau
78777ead32 MEDIUM: pattern: change the pat_del_* functions to delete from the references
This is the next step in speeding up entry removal. Now we don't scan
the whole lists or trees for elements pointing to the target reference,
instead we start from the reference and delete all linked patterns.

This simplifies some delete functions since we don't need anymore to
delete multiple times from an expression since all nodes appear after
the reference element. We can now have one generic list and one generic
tree deletion function.

This required the replacement of pattern_delete() with an open-coded
version since we now need to lock all expressions first before proceeding.
This means there is a high risk of lock inversion here but given that the
expressions are always scanned in the same order from the same head, this
must not happen.

Now deleting first entries is instantaneous, and it's still slow to
delete the last ones when looking up their ID since it still requires
to look them up by a full scan, but it's already way faster than
previously. Typically removing the last 10 IP from a 20M entries ACL
with a full-scan each took less than 2 seconds.

It would be technically possible to make use of indexed entries to
speed up most lookups for removal by value (e.g. IP addresses) but
that's for later.
2020-11-05 19:27:09 +01:00
Willy Tarreau
4bdd0a13d6 MEDIUM: pattern: link all final elements from the reference
There is a data model issue in the current pattern design that makes
pattern deletion extremely expensive: there's no direct way from a
reference to access all indexed occurrences. As such, the only way
to remove all indexed entries corresponding to a reference update
is to scan all expressions's lists and trees to find a link to the
reference. While this was possibly OK when map removal was not
common and most maps were small, this is not conceivable anymore
with GeoIP maps containing 10M+ entries and del-map operations that
are triggered from http-request rulesets.

This patch introduces two list heads from the pattern reference, one
for the objects linked by lists and one for those linked by tree node.
Ideally a single list would be enough but the linked elements are too
much unrelated to be distinguished at the moment, so we'll need two
lists. However for the long term a single-linked list will suffice but
for now it's not possible due to the way elements are removed from
expressions. As such this patch adds 32 bytes of memory usage per
reference plus 16 per indexed entry, but both will be cut in half
later.

The links are not yet used for deletion, this patch only ensures the
list is always consistent.
2020-11-05 19:27:09 +01:00
Willy Tarreau
6d8a68914e MINOR: pattern: make the delete and prune functions more generic
Now we have a single prune() function to act on an expression, and one
delete function for the lists and one for the trees. The presence of a
pointer in the lists is enough to warrant a free, and we rely on the
PAT_SF_REGFREE flag to decide whether to free using free() or regfree().
2020-11-05 19:27:09 +01:00
Willy Tarreau
9b5c8bbc89 MINOR: pattern: new sflag PAT_SF_REGFREE indicates regex_free() is needed
Currently we have no way to know how to delete/prune a pattern in a
generic way. A pattern doesn't contain its own type so we don't know
what function to call. Tree nodes are roughly OK but not lists where
regex are possible. Let's add one new bit for sflags at index time to
indicate that regex_free() will be needed upon deletion. It's not used
for now.
2020-11-05 19:27:08 +01:00
Willy Tarreau
d4164dcd4a CLEANUP: pattern: delete the back refs at once during pat_ref_reload()
It's pointless to delete a backref and relink it to the next entry since
the next entry is going to do the exact same and so on until all of them
are deleted. Let's simply delete backrefs on reload.
2020-11-05 19:27:08 +01:00
Willy Tarreau
3ee0de1b41 MINOR: pattern: move the update revision to the pat_ref, not the expression
It's not possible to uniquely update a single expression without updating
the pattern reference, I don't know why we've put the revision in the
expression back then, given that it in fact provides an update for a
full pattern. Let's move the revision into the reference's head instead.
2020-11-05 19:27:08 +01:00
Willy Tarreau
114d698fde MEDIUM: pattern: call malloc_trim() on pat_ref_reload()
This is one case where we may release large amounts of data at once. Tests
show that without this, after 10 full reloads of an ACL containing  1M IP
addresses, the memory usage grew and stabilized around 1.7 GB of RSS. With
this change, it stays around 260 MB and is stable across reloads.
2020-11-05 19:27:08 +01:00
Willy Tarreau
88366c2926 MEDIUM: pools: call malloc_trim() from pool_gc()
If available it definitely makes sense to call it since it's also
called when stopping to reclaim the maximum possible memory.
2020-11-05 19:27:08 +01:00
Willy Tarreau
1d3c7003d9 MINOR: compat: automatically include malloc.h on glibc
This is in order to access malloc_trim() which is convenient after
clearing huge maps to reclaim memory. When this is detected, we also
define HA_HAVE_MALLOC_TRIM.
2020-11-05 19:27:08 +01:00
Christopher Faulet
32186472cb REGTEST: converter: Add a regtest for MQTT converters
This new script tests mqtt_is_valid() and mqtt_get_field_value() converters used
to validate and extract information from a MQTT (Message Queuing Telemetry
Transport) message.
2020-11-05 19:27:08 +01:00
Baptiste Assmann
e279ca6bbe MINOR: sample: Add converts to parses MQTT messages
This patch implements a couple of converters to validate and extract data from a
MQTT (Message Queuing Telemetry Transport) message. The validation consists of a
few checks as well as "packet size" validation. The extraction can get any field
from the variable header and the payload.

This is limited to CONNECT and CONNACK packet types only. All other messages are
considered as invalid. It is not a problem for now because only the first packet
on each side can be parsed (CONNECT for the client and CONNACK for the server).

MQTT 3.1.1 and 5.0 are supported.

Reviewed and Fixed by Christopher Faulet <cfaulet@haproxy.com>
2020-11-05 19:27:03 +01:00
Christopher Faulet
7983b8687e REGTEST: converter: Add a regtest for fix converters
This new script tests fix_is_valid() and fix_tag_value() converters used to
validate and extract information from a FIX (Financial Information eXchange)
message.
2020-11-05 19:26:40 +01:00
Baptiste Assmann
e138dda1e0 MINOR: sample: Add converters to parse FIX messages
This patch implements a couple of converters to validate and extract tag value
from a FIX (Financial Information eXchange) message. The validation consists in
a few checks such as mandatory fields and checksum computation. The extraction
can get any tag value based on a tag string or tag id.

This patch requires the istend() function. Thus it depends on "MINOR: ist: Add
istend() function to return a pointer to the end of the string".

Reviewed and Fixed by Christopher Faulet <cfaulet@haproxy.com>
2020-11-05 19:26:30 +01:00
Christopher Faulet
cf26623780 MINOR: ist: Add istend() function to return a pointer to the end of the string
istend() is a shortcut to istptr() + istlen().
2020-11-05 19:25:12 +01:00
Willy Tarreau
1db5579bf8 [RELEASE] Released version 2.4-dev0
Released version 2.4-dev0 with the following main changes :
    - MINOR: version: it's development again.
    - DOC: mention in INSTALL that it's development again
2020-11-05 17:20:35 +01:00
Willy Tarreau
587fdece8c DOC: mention in INSTALL that it's development again
This reverts commit 975908c831.
2020-11-05 17:19:13 +01:00
Willy Tarreau
b9b2ac20f8 MINOR: version: it's development again.
This reverts commit 0badabc381.
2020-11-05 17:18:49 +01:00
Willy Tarreau
1c0a722a83 [RELEASE] Released version 2.3.0
Released version 2.3.0 with the following main changes :
    - CLEANUP: pattern: remove unused entry "tree" in pattern.val
    - BUILD: ssl: use SSL_CTRL_GET_RAW_CIPHERLIST instead of OpenSSL versions
    - BUG/MEDIUM: filters: Don't try to init filters for disabled proxies
    - BUG/MINOR: proxy/server: Skip per-proxy/server post-check for disabled proxies
    - BUG/MINOR: checks: Report a socket error before any connection attempt
    - BUG/MINOR: server: Set server without addr but with dns in RMAINT on startup
    - MINOR: server: Copy configuration file and line for server templates
    - BUG/MEDIUM: mux-pt: Release the tasklet during an HTTP upgrade
    - BUILD: ssl: use HAVE_OPENSSL_KEYLOG instead of OpenSSL versions
    - MINOR: debug: don't count free(NULL) in memstats
    - BUG/MINOR: filters: Skip disabled proxies during startup only
    - MINOR: mux_h2: capitalize frame type in stats
    - MINOR: mux_h2: add stat for total count of connections/streams
    - MINOR: stats: do not display empty stat module title on html
    - BUG/MEDIUM: stick-table: limit the time spent purging old entries
    - BUG/MEDIUM: listener: only enable a listening listener if needed
    - BUG/MEDIUM: listener: never suspend inherited sockets
    - BUG/MEDIUM: listener: make the master also keep workers' inherited FDs
    - MINOR: fd: add fd_want_recv_safe()
    - MEDIUM: listeners: make use of fd_want_recv_safe() to enable early receivers
    - REGTESTS: mark abns_socket as working now
    - CLEANUP: mux-h2: Remove the h1 parser state from the h2 stream
    - MINOR: sock: add a check against cross worker<->master socket activities
    - CI: github actions: limit OpenSSL no-deprecated builds to "default,bug,devel" reg-tests
    - BUG/MEDIUM: server: make it possible to kill last idle connections
    - MINOR: mworker/cli: the master CLI use its own applet
    - MINOR: ssl: define SSL_CTX_set1_curves_list to itself on BoringSSL
    - BUILD: ssl: use feature macros for detecting ec curves manipulation support
    - DOC: Add dns as an available domain to show stat
    - BUILD: makefile: usual reorder of objects for faster builds
    - DOC: update INSTALL to mention that TCC is supported
    - DOC: mention in INSTALL that haproxy 2.3 is a stable version
    - MINOR: version: mention that it's stable now
2020-11-05 17:04:53 +01:00
Willy Tarreau
0badabc381 MINOR: version: mention that it's stable now
This version will be maintained up to around Q1 2022.
2020-11-05 17:00:50 +01:00
Willy Tarreau
975908c831 DOC: mention in INSTALL that haproxy 2.3 is a stable version
One day we'll have a script to update the status and support dates :-)
2020-11-05 17:00:46 +01:00
Willy Tarreau
c22747d470 DOC: update INSTALL to mention that TCC is supported
TinyCC as found at https://repo.or.cz/tinycc.git does work to some extents
and is very convenient for developers, so let's mention it.
2020-11-05 16:58:45 +01:00
Willy Tarreau
2404e37838 BUILD: makefile: usual reorder of objects for faster builds
Reordered the objets by reverse build times made the total build time
go down from 17.7s to 17.2s at -O2 using make -j8 on my PC, and from
~3.2 to ~2.7s on the build farm.
2020-11-05 16:46:24 +01:00
Daniel Corbett
c40edacbda DOC: Add dns as an available domain to show stat
Within management.txt, proxy was listed as the only available option. "dns"
is now supported so let's add that. This change also updates the command to list
the available options <dns|proxy> for "domain" as previously it only specified
<domain>, which could be confusing as a user may think this field accepts
dynamic options when it actually requires a specific keyword.
2020-11-05 16:46:24 +01:00
Ilya Shipitsin
0aa8c29460 BUILD: ssl: use feature macros for detecting ec curves manipulation support
Let us use SSL_CTX_set1_curves_list, defined by OpenSSL, as well as in
openssl-compat when SSL_CTRL_SET_CURVES_LIST is present (BoringSSL),
for feature detection instead of versions.
2020-11-05 15:08:41 +01:00
Willy Tarreau
5b8af1e30c MINOR: ssl: define SSL_CTX_set1_curves_list to itself on BoringSSL
OpenSSL 1.0.2 and onwards define SSL_CTX_set1_curves_list which is both a
function and a macro. OpenSSL 1.0.2 to 1.1.0 define SSL_CTRL_SET_CURVES_LIST
as a macro, which disappeared from 1.1.1. BoringSSL only has that one and
not the former macro but it does have the function. Let's keep the test on
the macro matching the function name by defining the macro to itself when
needed.
2020-11-05 15:05:09 +01:00
William Lallemand
99e0bb997f MINOR: mworker/cli: the master CLI use its own applet
Following the patch b4daee ("MINOR: sock: add a check against cross
worker<->master socket activities"), this patch adds a dedicated applet
for the master CLI. It ensures that the CLI connection can't be
used with the master rights in the case of bugs.
2020-11-05 10:28:53 +01:00
Willy Tarreau
21b9ff59b2 BUG/MEDIUM: server: make it possible to kill last idle connections
In issue #933, @jaroslawr provided a report indicating that when using
many threads and many servers, it's very difficult to terminate the last
idle connections on each server. The issue has two causes in fact. The
first one is that during the calculation of the estimate of needed
connections, we round the computation up while in previous round it was
already rounded up, so we end up adding 1 to 1 which once divided by 2
remains 1. The second issue is that servers are not woken up anymore for
purging their connections if they don't have activity. The only reason
that was there to wake them up again was in case insufficient connections
were purged. And even then the purge task itself was not woken up. But
that is not enough for getting rid of the long tail of old connections
nor updating est_need_conns.

This patch makes sure to properly wake up as long as at least one idle
connection remains, and not to round up the needed connections anymore.

Prior to this patch, a test involving many connections which suddenly
stopped would keep many idle connections, now they're effectively halved
every pool-purge-delay.

This needs to be backported to 2.2.
2020-11-05 09:12:20 +01:00
Ilya Shipitsin
f9bc6e4968 CI: github actions: limit OpenSSL no-deprecated builds to "default,bug,devel" reg-tests 2020-11-04 16:14:09 +01:00
Willy Tarreau
b4daeeb094 MINOR: sock: add a check against cross worker<->master socket activities
Given that the previous issues caused spurious worker socket wakeups in
the master for inherited FDs that couldn't be closed, let's add a strict
test in the I/O callback to make sure that an accept() event is always
caught by the appropriate type of process (master for master listeners,
worker for worker listeners).
2020-11-04 15:05:50 +01:00
Christopher Faulet
fafd1b0a5b CLEANUP: mux-h2: Remove the h1 parser state from the h2 stream
Since the h2 multiplexer no longer relies on the legacy HTTP representation, and
uses exclusively the HTX, the H1 parser state (h1m) is no longer used by the h2
streams. Thus it can be removed.

This patch may be backported as far as 2.1.
2020-11-04 15:02:24 +01:00
Willy Tarreau
501c99588e REGTESTS: mark abns_socket as working now
William noticed that the real issue in the abns test was that it was
failing to rebind some listeners, issues which were addressed in the
previous commits (in some cases, the master would even accept traffic
on the worker's socket which was not properly disabled, causing some
of the strange error messages).

The other issue documented in commit 2ea15a080 was the lack of
reliability when the test was run in parallel. This is caused by the
abns socket which uses a hard-coded address, making all tests in
parallel to step onto each others' toes. Since vtest cannot provide
abns sockets, we're instead concatenating the number of the listening
port that vtest allocated for another frontend to the abns path, which
guarantees to make them unique in the system.

The test works fine in all cases now, even with 100 in parallel.
2020-11-04 14:42:15 +01:00
Willy Tarreau
a4380b211f MEDIUM: listeners: make use of fd_want_recv_safe() to enable early receivers
We used to refrain from calling fd_want_recv() if fd_updt was not allocated
but it's not the right solution as this does not allow the FD to be set.
Instead, let's use the new fd_want_recv_safe() which will update the FD and
create an update entry only if possible. In addition, the equivalent test
before calling fd_stop_recv() was removed as totally useless since there's
not fd_updt creation in this case.
2020-11-04 14:22:42 +01:00