mworker_cli_sockpair_new() is used to create the socketpair CLI listener of
the worker. Its FD is referenced in the mworker_proc structure, however,
once it's assigned to the listener the reference should be removed so we
don't use it accidentally.
The same must be done in case of errors if the FDs were already closed.
When starting HAProxy in master-worker, the master pre-allocate a struct
mworker_proc and do a socketpair() before the configuration parsing. If
the configuration loading failed, the FD are never closed because they
aren't part of listener, they are not even in the fdtab.
This patch fixes the issue by cleaning the mworker_proc structure that
were not asssigned a process, and closing its FDs.
Must be backported as far as 2.0, the srv_drop() only frees the memory
and could be dropped since it's done before an exec().
There were a few casts of list* to mt_list* that were upsetting some
old compilers (not sure about the effect on others). We had created
list_to_mt_list() purposely for this, let's use it instead of applying
this cast.
dladdr1() is used on glibc and takes a void**, but we pass it a
const ElfW(Sym)** and some compilers complain that we're aliasing.
Let's just set a may_alias attribute on the local variable to
address this. There's no need to backport this unless warnings are
reported on older distros or uncommon compilers.
At a few places in the code the switch/case ond flags are tested against
64-bit constants without explicitly being marked as long long. Some
32-bit compilers complain that the constant is too large for a long, and
other likely always use long long there. Better fix that as it's uncertain
what others which do not complain do. It may be backported to avoid doubts
on uncommon platforms if needed, as it touches very few areas.
fcgi_trace() declares fconn as a const and casts its mbuf array to
(struct buffer*), which rightfully upsets some older compilers. Better
just declare it as a writable variable and get rid of the cast. It's
harmless anyway. This has been there since 2.1 with commit 5c0f859c2
("MINOR: mux-fcgi/trace: Register a new trace source with its events")
and doens't need to be backported though it would not harm either.
Once in a while we get rid of this one. isblank() is missing on old
C libraries and only matches two values, so let's just replace it.
It was brought with this commit in 2.4:
0bf268e18 ("MINOR: server: Be more strict on the server-state line parsing")
It may be backported though it's really not important.
Compiling vars.c with gcc 4.2 shows that we're initializing some local
structs field members in a not really portable way:
src/vars.c: In function 'vars_parse_cli_set_var':
src/vars.c:1195: warning: initialized field overwritten
src/vars.c:1195: warning: (near initialization for 'px.conf.args')
src/vars.c:1195: warning: initialized field overwritten
src/vars.c:1195: warning: (near initialization for 'px.conf')
src/vars.c:1201: warning: initialized field overwritten
src/vars.c:1201: warning: (near initialization for 'rule.conf')
It's totally harmless anyway, but better clean this up.
These functions are declared as external functions in check.h and
as inline functions in check.c. Let's move them as static inline in
check.h. This appeared in 2.4 with the following commits:
4858fb2e1 ("MEDIUM: check: align agentaddr and agentport behaviour")
1c921cd74 ("BUG/MINOR: check: consitent way to set agentaddr")
While harmless (it only triggers build warnings with some gcc 4.x),
it should probably be backported where the paches above are present
to keep the code consistent.
The man page indicates that CPU_AND() and CPU_ASSIGN() take a variable,
not a const on the source, even though it doesn't make much sense. But
with older libcs, this triggers a build warning:
src/cpuset.c: In function 'ha_cpuset_and':
src/cpuset.c:53: warning: initialization discards qualifiers from pointer target type
src/cpuset.c: In function 'ha_cpuset_assign':
src/cpuset.c:101: warning: initialization discards qualifiers from pointer target type
Better stick stricter to the documented API as this is really harmless
here. There's no need to backport it (unless build issues are reported,
which is quite unlikely).
We have an implementation of atomic ops for older versions of gcc that
do not provide the __builtin_* API (< 4.4). Recent changes to the pools
broke that in pool_releasable() by having a load from a const pointer,
which doesn't work there due to a temporary local variable that is
declared then assigned. Let's make use of a compount statement to assign
it a value when declaring it.
There's no need to backport this.
When the master process is reloaded on a new config, it will try to
connect to the previous process' socket to retrieve all known
listening FDs to be reused by the new listeners. If listeners were
removed, their unused FDs are simply closed.
However there's a catch. In case a socket fails to bind, the master
will cancel its startup and swithc to wait mode for a new operation
to happen. In this case it didn't close the possibly remaining FDs
that were left unused.
It is very hard to hit this case, but it can happen during a
troubleshooting session with fat fingers. For example, let's say
a config runs like this:
frontend ftp
bind 1.2.3.4:20000-29999
The admin wants to extend the port range down to 10000-29999 and
by mistake ends up with:
frontend ftp
bind 1.2.3.41:20000-29999
Upon restart the bind will fail if the address is not present, and the
master will then switch to wait mode without releasing the previous FDs
for 1.2.3.4:20000-29999 since they're now apparently unused. Then once
the admin fixes the config and does:
frontend ftp
bind 1.2.3.4:10000-29999
The service will start, but will bind new sockets, half of them
overlapping with the previous ones that were not properly closed. This
may result in a startup error (if SO_REUSEPORT is not enabled or not
available), in a FD number exhaustion (if the error is repeated many
times), or in connections being randomly accepted by the process if
they sometimes land on the old FD that nobody listens on.
This patch will need to be backported as far as 1.8, and depends on
previous patch:
MINOR: sock: move the unused socket cleaning code into its own function
Note that before 2.3 most of the code was located inside haproxy.c, so
the patch above should probably relocate the function there instead of
sock.c.
The startup code used to scan the list of unused sockets retrieved from
an older process, and to close them one by one. This also required that
the knowledge of the internal storage of these temporary sockets was
known from outside sock.c and that the code was copy-pasted at every
call place.
This patch moves this into sock.c under the name
sock_drop_unused_old_sockets(), and removes the xfer_sock_list
definition from sock.h since the rest of the code doesn't need to know
this.
This cleanup is minimal and preliminary to a future fix that will need
to be backported to all versions featuring FD transfers over the CLI.
In the release callback, ctx.peers was used instead of ctx.sft. Concretly,
it is not an issue because the appctx context is an union and these both
fields are structures with a unique pointer. But it will be a problem if
that changes.
This patch must be backported as far as 2.2.
When a string is converted to a domain name label, the trailing dot must be
ignored. In resolv_str_to_dn_label(), there is a test to do so. However, the
trailing dot is not really ignored. The character itself is not copied but
the string index is still moved to the next char. Thus, this trailing dot is
counted in the length of the last encoded part of the domain name. Worst,
because the copy is skipped, a garbage character is included in the domain
name.
This patch should fix the issue #1528. It must be backported as far as 2.0.
Do not use an extra DCID parameter on new_quic_cid to be able to
associated a new generated CID to a thread ID. Simply do the computation
inside the function. The API is cleaner this way.
This also has the effects to improve the apparent randomness of CIDs.
With the previous version the first byte of all CIDs are identical for a
connection which could lead to privacy issue. This version may not be
totally perfect on this aspect but it improves the situation.
The producer must know where is the tailing hole in the RX buffer
when it purges it from consumed datagram. This is done allocating
a fake datagram with the remaining number of bytes which cannot be produced
at the tail of the RX buffer as length.
The DeviceAtlas addon can optionally interacts with the new service
without change of configuration in the HAProxy part.
Note that however it requires the DeviceAtlas Identification C API
2.4.0 minimum from this point.
Signed-off-by: David Carlier <dcarlier@deviceatlas.com>
Mentions of the new database update runtime mode and update of
the legit module and the dummy part too.
Note the DeviceAtlas C API version 2.4.0 minimum required
alongside with libCURL, libzip and libgz.
New specialized service to daily handle the update of download file without
interruption of service and to be preemptively started before HAProxy.
It consists on a standalone utility which shared a memory block with
the DeviceAtlas module which handles the JSON data file update on
a daily basis.
Signed-off-by: David Carlier <dcarlier@deviceatlas.com>
The CID trees are no more attached to the listener receiver but to the
underlying datagram handlers (one by thread) which run always on the same thread.
So, any operation on these trees do not require any locking.
We copy the first octet of the original destination connection ID to any CID for
the connection calling new_quic_cid(). So this patch modifies only this function
to take a dcid as passed parameter.
As the RX buffer is not consumed by the sock i/o handler as soon as a datagram
is produced, when full an RX buffer must not be reset. The remaining room is
consumed without modifying it. The consumer has a represention of its contents:
a list of datagrams.
Rename quic_lstnr_dgram_read() to quic_lstnr_dgram_dispatch() to reflect its new role.
After calling this latter, the sock i/o handler must consume the buffer only if
the datagram it received is detected as wrong by quic_lstnr_dgram_dispatch().
The datagram handler task mark the datagram as consumed atomically setting ->buf
to NULL value. The sock i/o handler is responsible of flushing its RX buffer
before using it. It also keeps a datagram among the consumed ones so that
to pass it to quic_lstnr_dgram_dispatch() and prevent it from allocating a new one.
As mentionned in the comment, the tx_qrings and rxbufs members of
receiver struct must be pointers to pointers!
Modify the functions responsible of their allocations consequently.
Note that this code could work because sizeof rxbuf and sizeof tx_qrings
are greater than the size of pointer!
The quic_dgram_ctx struct has been replaced by quic_dgram struct.
There is no need to keek a typedef for a pointer to function since we
converted the UDP datagram parser (quic_dgram_read()) into a task.
quic_dgram_read() parses all the QUIC packets from a UDP datagram. It is the best
candidate to be converted into a task, because is processing data unit is the UDP
datagram received by the QUIC sock i/o handler. If correct, this datagram is
added to the context of a task, quic_lstnr_dghdlr(), a conversion of quic_dgram_read()
into a task. This task pop a datagram from an mt_list and passes it among to
the packet handler (quic_lstnr_pkt_rcv()).
Modify the quic_dgram struct to play the role of the old quic_dgram_ctx struct when
passed to quic_lstnr_pkt_rcv().
Modify the datagram handlers allocation to set their tasks to quic_lstnr_dghdlr().
Add quic_dghdlr new struct do define datagram handler tasks, one by thread.
Allocate them and attach them to the listener receiver part calling
quic_alloc_dghdlrs_listener() newly implemented function.
Add quic_dgram new structure to store information about datagrams received
by the sock I/O handler (quic_sock_fd_iocb) and its associated pool.
Implement quic_get_dgram_dcid() to retrieve the datagram DCID which must
be the same for all the packets in the datagram.
Modify quic_lstnr_dgram_read() called by the sock I/O handler to allocate
a quic_dgram each time a correct datagram is found and add it to the sock I/O
handler rxbuf dgram list.
Define the offsets of the DCIDs from the beginning of a QUIC packets.
Note that they must always be present. As QUIC servers, QUIC haproxy listeners
always use a CID, source CID on the haproxy side, which is a destination ID on the
peer side.
This function is no more used anymore, broken and uses code shared with the
listener packet parser. This is becoming anoying to continue to modify
it without testing each time we modify the code it shares with the
listener packet parser.
This is to be sure xprt functions do not manipulate the buffer struct
passed as parameter to quic_lstnr_dgram_read() from low level datagram
I/O callback in quic_sock.c (quic_sock_fd_iocb()).
Mention that the token is sent only by servers in both server and listener
packet parsers.
Remove a "TO DO" section in listener packet parser because there is nothing
more to do in this function about the token
This quic_dgram_ctx struct member is used to denote if we are parsing a new
datagram (null value), or a coalesced packet into the current datagram (non null
value). But it was never set.
There's a risk that fdtab is not 64-byte aligned. The first effect is that
it may cause false sharing between cache lines resulting in contention
when adjacent FDs are used by different threads. The second is related
to what is explained in commit "BUG/MAJOR: compiler: relax alignment
constraints on certain structures", i.e. that modern compilers might
make use of aligned vector operations to zero some entries, and would
crash. We do not use any memset() or so on fdtab, so the risk is almost
inexistent, but that's not a reason for violating some valid assumptions.
This patch addresses this by allocating 64 extra bytes and aligning the
structure manually (this is an extremely cheap solution for this specific
case). The original address is stored in a new variable "fdtab_addr" and
is the one that gets freed. This remains extremely simple and should be
easily backportable. A dedicated aligned allocator later would help, of
course.
This needs to be backported as far as 2.2. No issue related to this was
reported yet, but it could very well happen as compilers evolve. In
addition this should preserve high performance across restarts (i.e.
no more dependency on allocator's alignment).
In github bug #1517, Mike Lothian reported instant crashes on startup
on RHEL8 + gcc-11 that appeared with 2.4 when allocating a proxy.
The analysis brought us down to the THREAD_ALIGN() entries that were
placed inside the "server" struct to avoid false sharing of cache lines.
It turns out that some modern gcc make use of aligned vector operations
to manipulate some fields (e.g. memset() etc) and that these structures
allocated using malloc() are not necessarily aligned, hence the crash.
The compiler is allowed to do that because the structure claims to be
aligned. The problem is in fact that the alignment propagates to other
structures that embed it. While most of these structures are used as
statically allocated variables, some are dynamic and cannot use that.
A deeper analysis showed that struct server does this, propagates to
struct proxy, which propagates to struct spoe_config, all of which
are allocated using malloc/calloc.
A better approach would consist in usins posix_memalign(), but this one
is not available everywhere and will either need to be reimplemented
less efficiently (by always wasting 64 bytes before the area), or a
few functions will have to be specifically written to deal with the
few structures that are dynamically allocated.
But the deeper problem that remains is that it is difficult to track
structure alignment, as there's no available warning to check this.
For the long term we'll probably have to create a macro such as
"struct_malloc()" etc which takes a type and enforces an alignment
based on the one of this type. This also means propagating that to
pools as well, and it's not a tiny task.
For now, let's get rid of the forced alignment in struct server, and
replace it with extra padding. By punching 63-byte holes, we can keep
areas on separate cache lines. Doing so moderately increases the size
of the "server" structure (~+6%) but that's the best short-term option
and it's easily backportable.
This will have to be backported as far as 2.4.
Thanks to Mike for the detailed report.
The standalone testing code used to rely on rand(), but switching to a
xorshift generator speeds up the test by 7% which is important to
accurately measure the real impact of the LRU code itself.