Commit Graph

18349 Commits

Author SHA1 Message Date
Frédéric Lécaille
c591459d11 MINOR: quic: Congestion control architecture refactoring
Ease the integration of new congestion control algorithm to come.
Move the congestion controller state to a private array of uint32_t
to stop using a union. We do not want to continue using such long
paths cc->algo_state.<algo>.<var> to modify the internal state variable
for each algorithm.

Must be backported to 2.6
2022-07-29 17:32:05 +02:00
Amaury Denoyelle
72a78e8290 BUG/MEDIUM: mux-quic: fix missing EOI flag to prevent streams leaks
On H3 DATA frame transfer from the client, some streams are not properly
closed by the upper layer, despite all transfer operation completed.
Data integrity is not impacted but this will prevent the stream timeout
to fire and thus keep the owner session opened.

In most cases, sessions are closed on QUIC idle timeout, but it may stay
forever if a client emits PING frames at a regular interval to maintain
it.

This bug is caused by a missing EOI stream desc flag on certain
condition in qc_rcv_buf(). To be triggered, we have to use the
optimization when conn-stream buffer is empty and can be swapped with
qcs buffer. The problem is that it will skip the function body for
default copy but also a condition to check if EOI must be set. Thus this
bug does not happens for every H3 post requets : it requires that
conn-stream buffer is empty on last qc_rcv_buf() invocation.

This was reproduced more frequently when using ngtcp2 client with one or
multiple streams :
$ ngtcp2-client -m POST -d ~/infra/html/10K 127.0.0.1 20443 \
  http://127.0.0.1:20443/post

This may fix at least partially github issue #1801.

This must be backported up to 2.6.
2022-07-29 16:01:21 +02:00
William Lallemand
b5d062dff1 MINOR: cli: warning on _getsocks when socket were closed
The previous attempt was reverted because it would emit a warning when
the sockets are still in the process when a reload failed, so this was
an expected 2nd try.

This warning however, will be displayed if a new process successfully
get the previous sockets AND the sendable number of sockets is 0.

This way the user will be warned if he tried to get the sockets fromt
the wrong process.
2022-07-28 15:49:43 +02:00
William Lallemand
9c821e615e Revert "MINOR: cli: emit a warning when _getsocks was used more than once"
This reverts commit 519cd2021b.

This was reverted because it's still useful to have access to _getsosks
when the previous reload failed.
2022-07-27 13:55:54 +02:00
William Lallemand
14b98ef1bd BUG/MINOR: mworker: PROC_O_LEAVING used but not updated
Since commit 2be557f ("MEDIUM: mworker: seamless reload use the internal
sockpair"), we are using the PROC_O_LEAVING flag to determine which
sockpair worker will be used with -x during the next reload.

However in mworker_reexec(), the PROC_O_LEAVING flag is not updated, it
is only updated at startup in mworker_env_to_proc_list().

This could be a problem when a remaining process is still in the list,
it could be selected as the current worker, and its socket will be used
even if _getsocks doesn't work anymore on it. (bug #1803)

This patch fixes the issue by updating the PROC_O_LEAVING flag in
mworker_proc_list_to_env() just before using it in mworker_reexec()

Must be backported to 2.6.
2022-07-27 12:13:56 +02:00
William Lallemand
519cd2021b MINOR: cli: emit a warning when _getsocks was used more than once
The _getsocks CLI command can be used only once, after that the sockets
are not available anymore.

Emit a warning when the command was already used once.
2022-07-27 11:48:54 +02:00
Willy Tarreau
b983145837 BUG/MINOR: fd: always remove late updates when freeing fd_updt[]
Christopher found that since commit 8e2c0fa8e ("MINOR: fd: delete unused
updates on close()") we may crash in a late stop due to an fd_delete()
in the main thread performed after all threads have deleted the fd_updt[]
array. Prior to that commit that didn't happen because we didn't touch
the updates on this path, but now it may happen. We don't care about these
ones anyway since the poller is stopped, so let's just wipe them by
resetting their counter before freeing the array.

No backport is needed as this is only 2.7.
2022-07-26 19:06:17 +02:00
William Lallemand
c31577f32e MEDIUM: resolvers: continue startup if network is unavailable
When haproxy starts with a resolver section, and there is a default one
since 2.6 which use /etc/resolv.conf, it tries to do a connect() with the UDP
socket in order to check if the routes of the system allows to reach the
server.

This check is too much restrictive as it won't prevent any runtime
failure.

Relax the check by making it a warning instead of a fatal alert.

This must be backported in 2.6.
2022-07-26 10:59:14 +02:00
William Lallemand
dc66f2f97d DEBUG: fd: split the fd check
Split the BUG_ON(fd < 0 || fd >= global.maxsock) so it's easier to know
if it quits because of a -1.
2022-07-26 10:35:24 +02:00
Christopher Faulet
244331f6e7 Revert "BUG/MINOR: peers: set the proxy's name to the peers section name"
This reverts commit 356866acce.

It seems that an undocumented expectation of peers is based on the peers
proxy name to determine if the local peer is fully configured or not. Thus
because of the commit above, we are no longer able to detect incomplete
peers sections.

On side effect of this bug is a segfault when HAProxy is stopped/reloaded if
we try to perform a local resync on a mis-configured local peer. So waiting
for a better solution, the patch is reverted.

This patch must be backported as far as 2.5.
2022-07-25 16:17:04 +02:00
William Lallemand
708949da49 MINOR: sockpair: move send_fd_uxst() error message in caller
Move the ha_alert() in send_fd_uxst() in the callers and add the FD
numbers in the message.
2022-07-25 16:11:11 +02:00
William Lallemand
f67e8fb92c BUG/MINOR: sockpair: wrong return value for fd_send_uxst()
The fd_send_uxst() function which is used to send a socket over the
socketpair returns 1 upon error instead of -1, which means the error
case of the sendmsg() is never catched correctly.

Must be backported as far as 1.9.
2022-07-25 16:10:58 +02:00
Willy Tarreau
51b1fcedeb DEBUG: fd: detect possibly invalid tgid in fd_insert()
Since the API is still a bit young, let's make sure nobody tries to
assign and FD to a group not strictly 1..MAX_TGROUPS as that would
indicate a bug.

Note: some of these might be relaxed to BUG_ON_HOT() in the future
2022-07-25 15:47:45 +02:00
Willy Tarreau
6983426354 BUG/MAJOR: poller: drop FD's tgid when masks don't match
A bug was introduced in 2.7-dev2 by commit 1f947cb39 ("MAJOR: poller:
only touch/inspect the update_mask under tgid protection"): once the
FD's tgid is held, we would forget to drop it in case the update mask
doesn't match, resulting in random watchdog panics of older processes
on successive reloads.

This should fix issue #1798. Thanks to Christian for the report and
to Christopher for the reproducer.

No backport is needed.
2022-07-25 15:47:15 +02:00
Willy Tarreau
53bfac8c63 BUG/MEDIUM: master: force the thread count earlier
Christopher bisected that recent commit d0b73bca71 ("MEDIUM: listener:
switch bind_thread from global to group-local") broke the master socket
in that only the first out of the Nth initial connections would work,
where N is the number of threads, after which they all work.

The cause is that the master socket was bound to multiple threads,
despite global.nbthread being 1 there, so the incoming connection load
balancing would try to send incoming connections to non-existing threads,
however the bind_thread mask would nonetheless include multiple threads.

What happened is that in 1.9 we forced "nbthread" to 1 in the master's poll
loop with commit b3f2be338b ("MEDIUM: mworker: use the haproxy poll loop").

In 2.0, nbthread detection was enabled by default in commit 149ab779cc
("MAJOR: threads: enable one thread per CPU by default"). From this point
on, the operation above is unsafe because everything during startup is
performed with nbthread corresponding to the default value, then it
changes to one when starting the polling loop. But by then we weren't
using the wait mode except for reload errors, so even if it would have
happened nobody would have noticed.

In 2.5 with commit fab0fdce9 ("MEDIUM: mworker: reexec in waitpid mode
after successful loading") we started to rexecute all the time, not just
for errors, so as to release precious resources and to possibly spot bugs
that were rarely exposed in this mode. By then the incoming connection LB
was enforcing all_threads_mask on the listener's thread mask so that the
incorrect value was being corrected while using it.

Finally in 2.7 commit d0b73bca71 ("MEDIUM: listener: switch bind_thread
from global to group-local") replaces the all_threads_mask there with
the listener's bind_thread, but that one was never adjusted by the
starting master, whose thread group was filled to N threads by the
automatic detection during early setup.

The best approach here is to set nbthread to 1 very early in init()
when we're in the master in wait mode, so that we don't try to guess
the best value and don't end up with incorrect bindings anymore. This
patch does this and also sets nbtgroups to 1 in preparation for a
possible future where this will also be automatically calculated.

There is no need to backport this patch since no other versions were
affected, but if it were to be discovered that the incorrect bind mask
on some of the master's FDs could be responsible for any trouble in
older versions, then the backport should be safe (provided that
nbtgroups is dropped of course).
2022-07-22 17:51:53 +02:00
Christopher Faulet
38c53944cb BUG/MINOR: backend: Fallback on RR algo if balance on source is impossible
If the loadbalancing is performed on the source IP address, an internal
error was returned on error. So for an applet on the client side (for
instance an SPOE applet) or for a client connected to a unix socket, an
internal error is returned.

However, when other LB algos fail, a fallback on round-robin is
performed. There is no reson to not do the same here.

This patch should fix the issue #1797. It must be backported to all
supported versions.
2022-07-22 17:07:34 +02:00
Christopher Faulet
ca67992979 BUG/MEDIUM: stconn: Only reset connect expiration when processing backend side
Since commit ae024ced0 ("MEDIUM: stream-int/stream: Use connect expiration
instead of SI expiration"), the connect expiration date is per-stream. So
there is only one expiration date instead of one per side, front and
back. So when a stream-connector is processed, we must test if it is a
frontend or a backend stconn before updating the connect expiration
date. Indeed, the frontend stconn must not reset the connect expiration
date.

This bug may have several side effect. One known bug is about peer sessions
blocked because the frontend peer applet is in ST_CLO state and its backend
connection is in ST_TAR state but without connect expiration date.

This patch should fix the issue #1791 and #1792. It must be backported to
2.6.
2022-07-21 14:50:14 +02:00
Willy Tarreau
41afd9084e BUILD: add detection for unsupported compiler models
As reported in github issue #1765, some people get trapped into building
haproxy and companion libraries on Windows using a compiler following the
LLP64 model. This has no chance to work, and definitely causes nasty bugs
everywhere when pointers are passed as longs. Let's save them time and
detect this at boot time.

The message and detection was factored with the existing one for -fwrapv
since we need the same info and actions.

This should be backported to all recent supported versions (the ones
that are likely to be tried on such platforms when people don't know).
2022-07-21 09:58:20 +02:00
William Lallemand
d4835a9680 BUG/MEDIUM: mworker: proc_self incorrectly set crashes upon reload
When updating from 2.4 to 2.6, the child->reloads++ instruction changed
place, resulting in a former worker from the 2.4 process, still
identified as a current worker once in 2.6, because its reload counter
is still 0.

Unfortunately this counter is used to chose the mworker_proc structure
that will be used for the new worker.

What happens next, is that the mworker_proc structure of the previous
process is selected, and this one has ipc_fd[1] set to -1, because this
structure was supposed to be in the master.

The process then forks, and mworker_sockpair_register_per_thread() tries
to register ipc_fd[1] which is set to -1, instead of the fd of the new
socketpair.

This patch fixes the issue by checking if child->pid is equal to -1 when
selecting proc_self. This way we could be sure it wasn't a previous
process.

Should fix issue #1785.

This must be backported as far as 2.4 to fix the issue related to the
reload computation difference. However backporting it in every stable
branch will enforce the reload process.
2022-07-21 00:52:43 +02:00
Frédéric Lécaille
a18c3339c8 BUG/MAJOR: mux_quic: fix invalid PROTOCOL_VIOLATION on POST data overlap
Stream data reception is incorrect when dealing with a partially new
offset with some data already consumed out of the RX buffer. In this
case, data length is adjusted but not the data buffer. In most cases,
ncb_add() operation will be rejected as already stored data does not
correspond with the new inserted offset. This will result in an invalid
CONNECTION_CLOSE with PROTOCOL_VIOLATION.

To fix this, buffer pointer is advanced while the length is reduced.

This can be reproduced with a POST request and patching haproxy to call
qcc_recv() multiple times by copying a quic_stream frame with different
offsets.

Must be backported to 2.6.
2022-07-20 15:34:58 +02:00
William Lallemand
bac3a82a50 BUG/MINOR: mworker/cli: relative pid prefix not validated anymore
Since e8422bf ("MEDIUM: global: remove the relative_pid from global and
mworker"), the relative pid prefix is not tested anymore on the master
CLI. Which means any value will fall into the "1" process.

Since we removed the nbproc, only the "1" and the "0" (master) value are
correct, any other value should return an error.

Fix issue #1793.

This must be backported as far as 2.5.
2022-07-20 14:43:47 +02:00
William Lallemand
0f17ab2fdd MINOR: ssl: enhance ca-file error emitting
Enhance the errors and warnings when trying to load a ca-file with
ssl_store_load_locations_file().

Add errors from ERR_get_error() and strerror to give more information to
the user.
2022-07-19 19:13:08 +02:00
William Lallemand
3b8bafd4a7 MINOR: init: load OpenSSL error strings
Load OpenSSL Error strings in order to be able to output reason strings.

This is mandatory to be able to use ERR_reason_error_string().
2022-07-19 19:13:08 +02:00
Willy Tarreau
c1640f79fe BUG/MEDIUM: fd/threads: fix incorrect thread selection in wakeup broadcast
In commit cfdd20a0b ("MEDIUM: fd: support broadcasting updates for foreign
groups in updt_fd_polling") we decided to pick a random thread number among
a set of candidates for a wakeup in case we need an instant change. But the
thread count range was wrong (MAX_THREADS) instead of tg->count, resulting
in random crashes when thread groups are > 1 and MAX_THREADS > 64.

No backport is needed, this was introduced in 2.7-dev2.
2022-07-19 16:01:04 +02:00
Christopher Faulet
7e94b40a22 BUG/MINOR: fd: Properly init the fd state in fd_insert()
When a new fd is inserted in the fdtab array, its state is initialized. The
"newstate" variable is used to compute the right state (0 by default, but
FD_ET_POSSIBLE flag is set if edge-triggered is supported for the fd).
However, this variable is never used and the fd state is always set to 0.

Now, the fd state is initialized with "newstate" variable.

This bug was introduced by commit ddedc1662 ("MEDIUM: fd: make
fd_insert/fd_delete atomically update fd.tgid"). No backport needed.
2022-07-19 12:11:04 +02:00
Christopher Faulet
f7ebe584d7 BUILD: debug: Add braces to if statement calling only CHECK_IF()
In src/ev_epoll.c, a CHECK_IF() is guarded by an if statement. So, when the
macro is empty, GCC (at least 11.3.1) is not happy because there is an if
statement with an empty body without braces... It is handled by
"-Wempty-body" option.

So, braces are added and GCC is now happy.

No backport needed.
2022-07-19 12:11:04 +02:00
Amaury Denoyelle
0933c7b3c8 BUG/MINOR: quic: do not send CONNECTION_CLOSE_APP in initial/handshake
As specified by RFC 9000, it is forbidden to send a CONNECTION_CLOSE of
type 0x1d (CONNECTION_CLOSE_APP) in an Initial or Handshake packet. It
must be converted to type 0x1c (CONNECTION_CLOSE) with APPLICATION_ERROR
code.

CONNECTION_CLOSE_APP are generated by QUIC MUX interaction. Thus,
special care must be taken when dealing with a 0-RTT packet, as this is
the only case where the MUX can be instantiated and quic-conn still on
the Initial or Handshake encryption level.

To enforce RFC 9000, xprt build packet function is now responsible to
translate a CONNECTION_CLOSE_APP if still on Initial/Handshake
encryption. This process is done in a dedicated function named
qc_build_cc_frm().

Without this patch, BUG_ON() statement in qc_build_frm() will be
triggered when building a CONNECTION_CLOSE_APP frame on Initial or
Handshake level. This is because QUIC_FT_CONNECTION_CLOSE_APP frame
builder mask does not allow these encryption levels, as opposed to
QUIC_FT_CONNECTION_CLOSE builder. This crash was reproduced by modifying
the H3 layer to force emission of a CONNECTION_CLOSE_APP on first frame
of a 0-RTT session.

Note however that CONNECTION_CLOSE emission during Handshake is a
complicated process for the server. For the moment, this is still
incomplete on haproxy side. RFC 9000 requires to emit it multiple times
in several packets under different encryption levels, depending on what
we know about the client encryption context.

This patch should be backported up to 2.6.
2022-07-19 11:19:50 +02:00
Willy Tarreau
03f3049df1 BUG/MINOR: tools: fix statistical_prng_range()'s output range
This function was added by commit 84ebfabf7 ("MINOR: tools: add
statistical_prng_range() to get a random number over a range") but it
contains a bug on the range, since mul32hi() covers the whole input
range, we must pass it range-1. For now it didn't have any impact, but
if used to find an array's index it will cause trouble.

This should be backported to 2.4.
2022-07-18 19:09:55 +02:00
William Lallemand
4348232231 BUG/MINOR: ssl: allow duplicate certificates in ca-file directories
It looks like OpenSSL 1.0.2 returns an error when trying to insert a
certificate whis is already present in a X509_STORE.

This patch simply ignores the X509_R_CERT_ALREADY_IN_HASH_TABLE error if
emitted.

Should fix part of issue #1780.

Must be backported in 2.6.
2022-07-18 18:49:27 +02:00
William Lallemand
3bda80789c BUG/MINOR: resolvers: shut off the warning for the default resolvers
When the resolv.conf file is empty or there is no resolv.conf file, an
empty resolvers will be created, which emits a warning during the
postparsing step.

This patch fixes the problem by freeing the resolvers section if the
parsing failed or if the nameserver list is empty.

Must be backported in 2.6, the previous patch which introduces
resolvers_destroy() is also required.
2022-07-18 14:39:36 +02:00
William Lallemand
e606c84fee MINOR: resolvers: resolvers_destroy() deinit and free a resolver
Split the resolvers_deinit() function into resolvers_destroy() and
resolvers_deinit() in order to be able to free a unique resolvers
section.
2022-07-18 14:39:36 +02:00
Willy Tarreau
5b3cd9561b BUG/MEDIUM: tools: avoid calling dlsym() in static builds (try 2)
The first approach in commit 288dc1d8e ("BUG/MEDIUM: tools: avoid calling
dlsym() in static builds") relied on dlopen() but on certain configs (at
least gcc-4.8+ld-2.27+glibc-2.17) it used to catch situations where it
ought not fail.

Let's have a second try on this using dladdr() instead. The variable was
renamed "build_is_static" as it's exactly what's being detected there.
We could even take it for reporting in -vv though that doesn't seem very
useful. At least the variable was made global to ease inspection via the
debugger, or in case it's useful later.

Now it properly detects a static build even with gcc-4.4+glibc-2.11.1 and
doesn't crash anymore.
2022-07-18 14:03:54 +02:00
Brad Smith
bc50e0d0fb BUILD: makefile: Fix install(1) handling for OpenBSD/NetBSD/Solaris/AIX
Add a new INSTALL variable to allow overridiing the flags passed to
install(1). install(1) on OpenBSD/NetBSD/Solaris/AIX does not support
the -v flag. With the new INSTALL variable and handling only use the
-v flag with the Linux targets.
2022-07-16 18:51:13 +02:00
Willy Tarreau
2200a9caef [RELEASE] Released version 2.7-dev2
Released version 2.7-dev2 with the following main changes :
    - BUG/MINOR: qpack: fix build with QPACK_DEBUG
    - MINOR: h3: handle errors on HEADERS parsing/QPACK decoding
    - BUG/MINOR: qpack: abort on dynamic index field line decoding
    - MINOR: qpack: properly handle invalid dynamic table references
    - MINOR: task: Add tasklet_wakeup_after()
    - BUG/MINOR: quic: Dropped packets not counted (with RX buffers full)
    - MINOR: quic: Add new stats counter to diagnose RX buffer overrun
    - MINOR: quic: Duplicated QUIC_RX_BUFSZ definition
    - MINOR: quic: Improvements for the datagrams receipt
    - CLEANUP: h2: Typo fix in h2_unsubcribe() traces
    - MINOR: quic: Increase the QUIC connections RX buffer size (upto 64Kb)
    - CLEANUP: mux-quic: adjust comment on qcs_consume()
    - MINOR: ncbuf: implement ncb_is_fragmented()
    - BUG/MINOR: mux-quic: do not signal FIN if gap in buffer
    - MINOR: fd: add a new FD_DISOWN flag to prevent from closing a deleted FD
    - BUG/MEDIUM: ssl/fd: unexpected fd close using async engine
    - MINOR: tinfo: make tid temporarily still reflect global ID
    - CLEANUP: config: remove unused proc_mask()
    - MINOR: debug: remove mask support from "debug dev sched"
    - MEDIUM: task: add and preset a thread ID in the task struct
    - MEDIUM: task/debug: move the ->thread_mask integrity checks to ->tid
    - MAJOR: task: use t->tid instead of ffsl(t->thread_mask) to take the thread ID
    - MAJOR: task: replace t->thread_mask with 1<<t->tid when thread mask is needed
    - CLEANUP: task: remove thread_mask from the struct task
    - MEDIUM: applet: only keep appctx_new_*() and drop appctx_new()
    - MEDIUM: task: only keep task_new_*() and drop task_new()
    - MINOR: applet: always use task_new_on() on applet creation
    - MEDIUM: task: remove TASK_SHARED_WQ and only use t->tid
    - MINOR: task: replace task_set_affinity() with task_set_thread()
    - CLEANUP: task: remove the unused task_unlink_rq()
    - CLEANUP: task: remove the now unused TASK_GLOBAL flag
    - MINOR: task: make rqueue_ticks atomic
    - MEDIUM: task: move the shared runqueue to one per thread
    - MEDIUM: task: replace the global rq_lock with a per-rq one
    - MINOR: task: remove grq_total and use rq_total instead
    - MINOR: task: replace global_tasks_mask with a check for tree's emptiness
    - MEDIUM: task: use regular eb32 trees for the run queues
    - MEDIUM: queue: revert to regular inter-task wakeups
    - MINOR: thread: make wake_thread() take care of the sleeping threads mask
    - MINOR: thread: move the flags to the shared cache line
    - MINOR: thread: only use atomic ops to touch the flags
    - MINOR: poller: centralize poll return handling
    - MEDIUM: polling: make update_fd_polling() not care about sleeping threads
    - MINOR: poller: update_fd_polling: wake a random other thread
    - MEDIUM: thread: add a new per-thread flag TH_FL_NOTIFIED to remember wakeups
    - MEDIUM: tasks/fd: replace sleeping_thread_mask with a TH_FL_SLEEPING flag
    - MINOR: tinfo: add the tgid to the thread_info struct
    - MINOR: tinfo: replace the tgid with tgid_bit in tgroup_info
    - MINOR: tinfo: add the mask of enabled threads in each group
    - MINOR: debug: use ltid_bit in ha_thread_dump()
    - MINOR: wdt: use ltid_bit in wdt_handler()
    - MINOR: clock: use ltid_bit in clock_report_idle()
    - MINOR: thread: use ltid_bit in ha_tkillall()
    - MINOR: thread: add a new all_tgroups_mask variable to know about active tgroups
    - CLEANUP: thread: remove thread_sync_release() and thread_sync_mask
    - MEDIUM: tinfo: add a dynamic thread-group context
    - MEDIUM: thread: make stopping_threads per-group and add stopping_tgroups
    - MAJOR: threads: change thread_isolate to support inter-group synchronization
    - MINOR: thread: add is_thread_harmless() to know if a thread already is harmless
    - MINOR: debug: mark oneself harmless while waiting for threads to finish
    - MINOR: wdt: do not rely on threads_to_dump anymore
    - MEDIUM: debug: make the thread dumper not rely on a thread mask anymore
    - BUILD: debug: fix build issue on clang with previous commit
    - BUILD: debug: re-export thread_dump_state
    - BUG/MEDIUM: threads: fix incorrect thread group being used on soft-stop
    - BUG/MEDIUM: thread: check stopping thread against local bit and not global one
    - MINOR: proxy: use tg->threads_enabled in hard_stop() to detect stopped threads
    - BUILD: Makefile: Add Lua 5.4 autodetect
    - CI: re-enable gcc asan builds
    - MEDIUM: mworker: set the iocb of the socketpair without using fd_insert()
    - MINOR: fd: Add BUG_ON checks on fd_insert()
    - CLEANUP: mworker: rename mworker_pipe to mworker_sockpair
    - CLEANUP: mux-quic: do not export qc_get_ncbuf
    - REORG: mux-quic: reorganize flow-control fields
    - MINOR: mux-quic: implement accessor for sedesc
    - MEDIUM: mux-quic: refactor streams opening
    - MINOR: mux-quic: rename qcs flag FIN_RECV to SIZE_KNOWN
    - MINOR: mux-quic: emit FINAL_SIZE_ERROR on invalid STREAM size
    - BUG/MINOR: peers/config: always fill the bind_conf's argument
    - BUG/MEDIUM: peers/config: properly set the thread mask
    - CLEANUP: bwlim: Set pointers to NULL when memory is released
    - BUG/MINOR: http-check: Preserve headers if not redefined by an implicit rule
    - BUG/MINOR: http-act: Properly generate 103 responses when several rules are used
    - BUG/MEDIUM: thread: mask stopping_threads with threads_enabled when checking it
    - CLEANUP: thread: also remove a thread's bit from stopping_threads on stop
    - BUG/MINOR: peers: fix possible NULL dereferences at config parsing
    - BUG/MINOR: http-htx: Fix scheme based normalization for URIs wih userinfo
    - MINOR: http: Add function to get port part of a host
    - MINOR: http: Add function to detect default port
    - BUG/MEDIUM: h1: Improve authority validation for CONNCET request
    - MINOR: http-htx: Use new HTTP functions for the scheme based normalization
    - BUG/MEDIUM: http-fetch: Don't fetch the method if there is no stream
    - REGTEESTS: filters: Fix CONNECT request in random-forwarding script
    - MEDIUM: mworker/systemd: send STATUS over sd_notify
    - BUG/MINOR: mux-h1: Be sure to commit htx changes in the demux buffer
    - BUG/MEDIUM: http-ana: Don't wait to have an empty buf to switch in TUNNEL state
    - BUG/MEDIUM: mux-h1: Handle connection error after a synchronous send
    - MEDIUM: epoll: don't synchronously delete migrated FDs
    - BUILD: debug: silence warning on gcc-5
    - BUILD: http: silence an uninitialized warning affecting gcc-5
    - BUG/MEDIUM: mux-quic: fix server chunked encoding response
    - REORG: mux-quic: rename stream initialization function
    - MINOR: mux-quic: rename stream purge function
    - MINOR: mux-quic: add traces on frame parsing functions
    - MINOR: mux-quic: implement qcs_alert()
    - MINOR: mux-quic: filter send/receive-only streams on frame parsing
    - MINOR: mux-quic: do not ack STREAM frames on unrecoverable error
    - MINOR: mux-quic: support stream opening via MAX_STREAM_DATA
    - MINOR: mux-quic: define basic stream states
    - MINOR: mux-quic: use stream states to mark as detached
    - MEDIUM: mux-quic: implement RESET_STREAM emission
    - MEDIUM: mux-quic: implement STOP_SENDING handling
    - BUG/MEDIUM: debug: fix possible hang when multiple threads dump at once
    - BUG/MINOR: quic: fix closing state on NO_ERROR code sent
    - CLEANUP: quic: clean up include on quic_frame-t.h
    - MINOR: quic: define a generic QUIC error type
    - MINOR: mux-quic: support app graceful shutdown
    - MINOR: mux-quic/h3: prepare CONNECTION_CLOSE on release
    - MEDIUM: quic: send CONNECTION_CLOSE on released MUX
    - CLEANUP: mux-quic: move qc_release()
    - MINOR: mux-quic: send one last time before release
    - MINOR: h3: store control stream in h3c
    - MINOR: h3: implement graceful shutdown with GOAWAY
    - BUG/MINOR: threads: produce correct global mask for tgroup > 1
    - BUG/MEDIUM: cli/threads: make "show threads" more robust on applets
    - BUG/MINOR: thread: use the correct thread's group in ha_tkillall()
    - BUG/MINOR: debug: enter ha_panic() only once
    - BUG/MEDIUM: debug: fix parallel thread dumps again
    - MINOR: cli/streams: show a stream's tgid next to its thread ID
    - DEBUG: cli: add a new "debug dev deadlock" expert command
    - MINOR: cli/activity: add a thread number argument to "show activity"
    - CLEANUP: applet: remove the obsolete command context from the appctx
    - MEDIUM: config: remove deprecated "bind-process" directives from frontends
    - MEDIUM: config: remove the "process" keyword on "bind" lines
    - MINOR: listener/config: make "thread" always support up to LONGBITS
    - CLEANUP: fd: get rid of the __GET_{NEXT,PREV} macros
    - MEDIUM: debug/threads: make the lock debugging take tgroups into account
    - MEDIUM: proto: stop protocols under thread isolation during soft stop
    - MEDIUM: poller: program the update in fd_update_events() for a migrated FD
    - MEDIUM: poller: disable thread-groups for poll() and select()
    - MINOR: thread: remove MAX_THREADS limitation
    - MEDIUM: cpu-map: replace the process number with the thread group number
    - MINOR: mworker/threads: limit the mworker sockets to group 1
    - MINOR: cli/threads: always bind CLI to thread group 1
    - MINOR: fd/thread: get rid of thread_mask()
    - MEDIUM: task/thread: move the task shared wait queues per thread group
    - MINOR: task: move the niced_tasks counter to the thread group context
    - DOC: design: add some thoughts about how to handle the update_list
    - MEDIUM: conn: make conn_backend_get always scan the same group
    - MAJOR: fd: remove pending updates upon real close
    - MEDIUM: fd/poller: make the update-list per-group
    - MINOR: fd: delete unused updates on close()
    - MINOR: fd: make fd_insert() apply the thread mask itself
    - MEDIUM: fd: add the tgid to the fd and pass it to fd_insert()
    - MINOR: cli/fd: show fd's tgid and refcount in "show fd"
    - MINOR: fd: add functions to manipulate the FD's tgid
    - MINOR: fd: add fd_get_running() to atomically return the running mask
    - MAJOR: fd: grab the tgid before manipulating running
    - MEDIUM: fd/poller: turn polled_mask to group-local IDs
    - MEDIUM: fd/poller: turn update_mask to group-local IDs
    - MEDIUM: fd/poller: turn running_mask to group-local IDs
    - MINOR: fd: make fd_clr_running() return the previous value instead
    - MEDIUM: fd: make thread_mask now represent group-local IDs
    - MEDIUM: fd: make fd_insert() take local thread masks
    - MEDIUM: fd: make fd_insert/fd_delete atomically update fd.tgid
    - MEDIUM: fd: quit fd_update_events() when FD is closed
    - MEDIUM: thread: change thread_resolve_group_mask() to return group-local values
    - MEDIUM: listener: switch bind_thread from global to group-local
    - MINOR: fd: add fd_reregister_all() to deal with boot-time FDs
    - MEDIUM: fd: support stopping FDs during starting
    - MAJOR: pollers: rely on fd_reregister_all() at boot time
    - MAJOR: poller: only touch/inspect the update_mask under tgid protection
    - MEDIUM: fd: support broadcasting updates for foreign groups in updt_fd_polling
    - CLEANUP: threads: remove the now unused all_threads_mask and tid_bit
    - MINOR: config: change default MAX_TGROUPS to 16
    - BUG/MEDIUM: tools: avoid calling dlsym() in static builds
2022-07-16 17:17:22 +02:00
Willy Tarreau
288dc1d8ee BUG/MEDIUM: tools: avoid calling dlsym() in static builds
Since 2.4 with commit 64192392c ("MINOR: tools: add functions to retrieve
the address of a symbol"), we can resolve symbols. However some old glibc
crash in dlsym() when the program is statically built.

Fortunately even on these old libs we can detect lack of support by
calling dlopen(NULL). Normally it returns a handle to the current
program, but on a static build it returns NULL. This is sufficient to
refrain from calling dlsym() (which will be of very limited use anyway),
so we check this once at boot and use the result when needed.

This may be backported to 2.4. On stable versions, be careful to place
the init code inside an if/endif guard that checks for DL support.
2022-07-16 13:49:34 +02:00
Willy Tarreau
856d56d2d2 MINOR: config: change default MAX_TGROUPS to 16
This will allows nbtgroups > 1 to be declared in the config without
recompiling. The theoretical limit is 64, though we'd rather not push
it too far for now as some structures might be enlarged to be indexed
per group. Let's start with 16 groups max, allowing to experiment with
dual-socket machines suffering from up to 8 loosely coupled L3 caches.
It's a good start and doesn't engage us too far.
2022-07-15 21:51:48 +02:00
Willy Tarreau
c6b596dcce CLEANUP: threads: remove the now unused all_threads_mask and tid_bit
Since these are not used anymore, let's now remove them. Given the
number of places where we're using ti->ldit_bit, maybe an equivalent
might be useful though.
2022-07-15 20:25:41 +02:00
Willy Tarreau
cfdd20a0b2 MEDIUM: fd: support broadcasting updates for foreign groups in updt_fd_polling
We're still facing the situation where it's impossible to update an FD
for a foreign group. That's of particular concern when disabling/enabling
listeners (e.g. pause/resume on signals) since we don't decide which thread
gets the signal and it needs to process all listeners at once.

Fortunately, not that much is unprotected in FDs. This patch adds a test for
tgid's equality in updt_fd_polling() so that if a change is applied for a
foreing group, then it's detected and taken care of separately. The method
consists in forcing the update on all bound threads in this group, adding it
to the group's update_list, and sending a wake-up as would be done for a
remote thread in the local group, except that this is done by grabbing a
reference to the FD's tgid.

Thanks to this, SIGTTOU/SIGTTIN now work for nbtgroups > 1 (after that was
temporarily broken by "MEDIUM: fd/poller: make the update-list per-group").
2022-07-15 20:25:41 +02:00
Willy Tarreau
1f947cb39e MAJOR: poller: only touch/inspect the update_mask under tgid protection
With thread groups and group-local masks, the update_mask cannot be
touched nor even checked if it may change below us. In order to avoid
this, we have to grab a reference to the FD's tgid before checking the
update mask. The operations are cheap enough so that we don't notice it
in performance tests. This is expected because the risk of meeting a
reassigned FD during an update remains very low.

It's worth noting that the tgid cannot be trusted during startup nor
during soft-stop since that may come from anywhere at the moment. Since
soft-stop runs under thread isolation we use that hint to decide whether
or not to check that the FD's tgid matches the current one.

The modification is applied to the 3 thread-aware pollers, i.e. epoll,
kqueue, and evports. Also one poll_drop counter was missing for shared
updates, though it might be hard to trigger it.

With this change applied, thread groups are usable in benchmarks.
2022-07-15 20:16:30 +02:00
Willy Tarreau
d95f18fa39 MAJOR: pollers: rely on fd_reregister_all() at boot time
The poller-specific thread init code now uses that new function to
safely register boot events. This ensures that we don't register an
event for another group and that we properly deal with parallel
thread startup.

It's only done for thread-aware pollers, there's no point in using
that in poll/select though that should work as well.
2022-07-15 20:16:30 +02:00
Willy Tarreau
9baff4ffd9 MEDIUM: fd: support stopping FDs during starting
There's a nasty case during boot, which is the master process. It stops
all listeners from the main thread, and as such we're seeing calls to
fd_delete() from a thread that doesn't match the FD's mask, but more
importantly from a group that doesn't match either. Fortunately this
happens in a process that doesn't see the threads creation, so the FDs
are left intact in the table and we can overwrite the tgid there.

The approach is ugly, it probably shows that we should use a dummy
value for the tgid during boot, that would be replaced once the FDs
migrate to their target, but we also need a way to make sure not to
miss them. Also that doesn't solve the possibility of closing a
listener at run time from the wrong thread group.
2022-07-15 20:16:30 +02:00
Willy Tarreau
88c4c14050 MINOR: fd: add fd_reregister_all() to deal with boot-time FDs
At boot the pollers are allocated for each thread and they need to
reprogram updates for all FDs they will manage. This code is not
trivial, especially when trying to respect thread groups, so we'd
rather avoid duplicating it.

Let's centralize this into fd.c with this function. It avoids closed
FDs, those whose thread mask doesn't match the requested one or whose
thread group doesn't match the requested one, and performs the update
if required under thread-group protection.
2022-07-15 20:16:30 +02:00
Willy Tarreau
d0b73bca71 MEDIUM: listener: switch bind_thread from global to group-local
It requires to both adapt the parser and change the algorithm to
redispatch incoming traffic so that local threads IDs may always
be used.

The internal structures now only reference thread group IDs and
group-local masks which are compatible with those now used by the
FD layer and the rest of the code.
2022-07-15 20:16:30 +02:00
Willy Tarreau
6018c02c36 MEDIUM: thread: change thread_resolve_group_mask() to return group-local values
It used to turn group+local to global but now we're doing the exact
opposite as we want to stick to group-local masks. This means that
"thread 3-4" might very well emit what "thread 2/1-2" used to emit
till now for 2 groups and 4 threads. This is needed because we'll
have to support group-local thread masks in receivers.

However the rest of the code (receivers) is not ready yet for this,
so using this code with more than one thread group will definitely
break some bindings.
2022-07-15 20:16:30 +02:00
Willy Tarreau
0b51eab764 MEDIUM: fd: quit fd_update_events() when FD is closed
The IOCB might have closed the FD itself, so it's not an error to
have fd.tgid==0 or anything else, nor to have a null running_mask.

In fact there are different conditions under which we can leave the
IOCB, all of them have been enumerated in the code's comments (namely
FD still valid and used, hence has running bit, FD closed but not yet
reassigned thus running==0, FD closed and reassigned, hence different
tgid and running becomes irrelevant, just like all other masks). For
this reason we have no other solution but to try to grab the tgid on
return before checking the other bits. In practice it doesn't represent
a big cost, because if the FD was closed and reassigned, it's instantly
detected and the bit is immediately released without blocking other
threads, and if the FD wasn't closed this doesn't prevent it from
being migrated to another thread. In the worst case a close by another
thread after a migration will be postponed till the moment the running
bit is cleared, which is the same as before.
2022-07-15 20:16:30 +02:00
Willy Tarreau
ddedc16624 MEDIUM: fd: make fd_insert/fd_delete atomically update fd.tgid
These functions need to set/reset the FD's tgid but when they're called
there may still be wakeups on other threads that discover late updates
and have to touch the tgid at the same time. As such, it is not possible
to just read/write the tgid there. It must only be done using operations
that are compatible with what other threads may be doing.

As we're using inc/dec on the refcount, it's safe to AND the area to zero
the lower part when resetting the value. However, in order to set the
value, there's no other choice but fd_claim_tgid() which will assign it
only if possible (via a CAS). This is convenient in the end because it
protects the FD's masks from being modified by late threads, so while
we hold this refcount we can safely reset the thread_mask and a few other
elements. A debug test for non-null masks was added to fd_insert() as it
must not be possible to face this situation thanks to the protection
offered by the tgid.
2022-07-15 20:16:30 +02:00
Willy Tarreau
27a3245599 MEDIUM: fd: make fd_insert() take local thread masks
fd_insert() was already given a thread group ID and a global thread mask.
Now we're changing the few callers to take the group-local thread mask
instead. It's passed directly into the FD's thread mask. Just like for
previous commit, it must not change anything when a single group is
configured.
2022-07-15 20:16:30 +02:00
Willy Tarreau
3638d174e5 MEDIUM: fd: make thread_mask now represent group-local IDs
With the change that was started on other masks, the thread mask was
still not fully converted, sometimes being used as a global mask and
sometimes as a local one. This finishes the code modifications so that
the mask is always considered as a group-local mask. This doesn't
change anything as long as there's a single group, but is necessary
for groups 2 and above since it's used against running_mask and so on.
2022-07-15 20:16:30 +02:00
Willy Tarreau
d6e1987612 MINOR: fd: make fd_clr_running() return the previous value instead
It's an AND so it destroys information and due to this there's a call
place where we have to perform two reads to know the previous value
then to change it. With a fetch-and-and instead, in a single operation
we can know if the bit was previously present, which is more efficient.
2022-07-15 20:16:30 +02:00
Willy Tarreau
a707d02657 MEDIUM: fd/poller: turn running_mask to group-local IDs
From now on, the FD's running_mask only refers to local thread IDs. However,
there remains a limitation, in updt_fd_polling(), we temporarily have to
check and set shared FDs against .thread_mask, which still contains global
ones. As such, nbtgroups > 1 may break (but this is not yet supported without
special build options).
2022-07-15 20:16:30 +02:00