Commit Graph

20022 Commits

Author SHA1 Message Date
Amaury Denoyelle
7c5591facb BUG/MEDIUM: mux-quic: improve streams fairness to prevent early timeout
Since the following mentioned patch, a send-list mechanism was
implemented to improve streams priorization on sending.
  commit 20f2a425ff
  MAJOR: mux-quic: rework stream sending priorization

This is done to prevent the same streams to always be used as first ones
on emission. However there is still a flaw on the algorithm. Once put in
the send-list, a streams is not removed until it has sent all of its
content. When a stream transfers a large object, it will remain in the
send-list during all the transfer and will soon monopolize the first
place. the stream does never leave its position until the transfer is
finished and will monopolize the first place. Other streams behind won't
have the opportunity to advance on their own transfers due to a Tx
buffer exhaustion.

This situation is especially problematic if a small timeout client is
used. As some streams won't advance on their transfer for a long period
of time, they will be aborted due to a stream layer timeout client
causing a RESET_STREAM emission.

To fix this, during sending each stream with at least some bytes
transferred from its tx.buf to qc_stream_desc out buffer is put at the
end of the send-list. This ensures that on the next iteration streams
that cannot transfer anything will be used in priority.

This patch improves significantly h2load benchmarks for large objects
with several streams opened in parallel on a single connection. Without
it, errors may be reported by h2load for aborted streams. For example,
this improved the following scenario on a 10mbit/s link with a 10s
timeout client :
  $ ./build/bin/h2load --npn-list h3 -t 1 -c 1 -m 30 -n 30 https://198.18.10.11:20443/?s=500k

This fix may help with the github issue #2004 where chrome browser stop
to use QUIC after receiving RESET_STREAM frames.

This should be backported up to 2.7.
2023-04-26 17:50:16 +02:00
Amaury Denoyelle
24962dd178 BUG/MEDIUM: mux-quic: do not emit RESET_STREAM for unknown length
Some HTX responses may not always contain a EOM block. For example this
is the case if content-length header is missing from the HTTP server
response. Stream termination is thus signaled to QUIC mux via shutw
callback. However, this is interpreted inconditionnally as an early
close by the mux with a RESET_STREAM emission. Most of the times, QUIC
clients report this as an error.

To fix this, check if htx.extra is set to HTX_UNKOWN_PAYLOAD_LENGTH for
a qcs instance. If true, shutw will never be used to emit a
RESET_STREAM. Instead, the stream will be closed properly with a FIN
STREAM frame. If all data were already transfered, an empty STREAM frame
is sent.

This fix may help with the github issue #2004 where chrome browser stop
to use QUIC after receiving RESET_STREAM frames.

This issue was reported by Vladimir Zakharychev. Thanks to him for his
help and testing. It was also reproduced locally using httpterm with the
query string "/?s=1k&b=0&C=1".

This should be backported up to 2.7.
2023-04-26 17:50:09 +02:00
William Lallemand
8c4d7eeff2 MINOR: acme.sh: add the deploy script for acme.sh in admin directory
Add the acme.sh deploy script for haproxy in the admin directory so
users can have an official download source.
2023-04-26 17:32:15 +02:00
Willy Tarreau
e8ae99b111 DEV: h2: support reading frame payload from a file
Now we can build a series of data frames by reading from a file and
chunking it into frames of requested length. It's mostly useful for
data frames (e.g. post). One way to announce these upfront is to
capture the output of curl use without content-length:

  $ nc -lp4446 > post-h2-nocl.bin
  $ curl -v --http2-prior-knowledge http://127.0.0.1:4446/url -H "content-length:" -d @/dev/null

Then just change the 5th byte from the end from 1 to 0 to remove the
end-of-stream bit, it will allow to chain a file, then to send an
empty DATA frame with ES set :

  $ (dev/h2/mkhdr.sh -i 1 -t data -d CHANGELOG;
     dev/h2/mkhdr.sh -i 1 -t data -l 0 -f es) > h2-data-changelog.bin

Then post that to the server:
  $ cat post-h2-nocl.bin h2-data-changelog.bin | nc 0 4446
2023-04-26 11:36:02 +02:00
Willy Tarreau
78bb934607 DEV: h2: add a script "mkhdr" to build h2 frames from scratch
It's a real pain to try to trigger certain edge cases in h2, so let's
write a simple tool aiming at creating frames.
2023-04-26 11:35:45 +02:00
Willy Tarreau
543e2544ca DEBUG: crash using an invalid opcode on aarch64 instead of an invalid access
On aarch64 there's also a guaranted invalid instruction, called UDF, and
which even supports an optional 16-bit immediate operand:

   https://developer.arm.com/documentation/ddi0596/2021-12/Base-Instructions/UDF--Permanently-Undefined-?lang=en

It's conveniently encoded as 4 zeroes (when the operand is zero). It's
unclear when support for it was added into GAS, if at all; even a
not-so-old 2.27 doesn't know about it. Let's byte-encode it.

Tested on an A72 and works as expected.
2023-04-25 19:53:39 +02:00
Willy Tarreau
77787ec9bc DEBUG: crash using an invalid opcode on x86/x86_64 instead of an invalid access
BUG_ON() calls currently trigger a segfault. This is more convenient
than abort() as it doesn't rely on any function call nor signal handler
and never causes non-unwindable stacks when opening cores. But it adds
quite some confusion in bug reports which are rightfully tagged "segv"
and do not instantly allow to distinguish real segv (e.g. null derefs)
from code asserts.

Some CPU architectures offer various crashing methods. On x86 we have
INT3 (0xCC), which stops into the debugger, and UD0/UD1/UD2. INT3 looks
appealing but for whatever reason (maybe signal handling somewhere) it
loses the last call point in the stack, making backtraces unusable. UD2
has the merit of being only 2 bytes and causing an invalid instruction,
which almost never happens normally, so it's easily distinguishable.
Here it was defined as a macro so that the line number in the core
matches the one where the BUG_ON() macro is called, and the debugger
shows the last frame exactly at its calligg point.

E.g. when calling "debug dev bug":

Program terminated with signal SIGILL, Illegal instruction.
  #0  debug_parse_cli_bug (args=<optimized out>, payload=<optimized out>, appctx=<optimized out>, private=<optimized out>) at src/debug.c:408
  408             BUG_ON(one > zero);
  [Current thread is 1 (Thread 0x7f7a660cc1c0 (LWP 14238))]
  (gdb) bt
  #0  debug_parse_cli_bug (args=<optimized out>, payload=<optimized out>, appctx=<optimized out>, private=<optimized out>) at src/debug.c:408
  #1  debug_parse_cli_bug (args=<optimized out>, payload=<optimized out>, appctx=<optimized out>, private=<optimized out>) at src/debug.c:402
  #2  0x000000000061a69f in cli_parse_request (appctx=appctx@entry=0x181c0160) at src/cli.c:832
  #3  0x000000000061af86 in cli_io_handler (appctx=0x181c0160) at src/cli.c:1035
  #4  0x00000000006ca2f2 in task_run_applet (t=0x181c0290, context=0x181c0160, state=<optimized out>) at src/applet.c:449
2023-04-25 18:51:10 +02:00
Frédéric Lécaille
7d23e8d1a6 CLEANUP: quic: Rename several <buf> variables into quic_sock.c
Rename some variables which are not struct buffer variables.

Should be backported to 2.7.
2023-04-24 15:53:27 +02:00
Frédéric Lécaille
bb426aa5f1 CLEANUP: quic: Rename <buf> variable into qc_parse_hd_form()
There is no struct buffer variable manipulated by this function.

Should be backported to 2.7.
2023-04-24 15:53:27 +02:00
Frédéric Lécaille
6ff52f9ce5 CLEANUP: quic: Rename <buf> variable into quic_packet_read_long_header()
Make this function be more readable: there is no struct buffer variable passed
as parameter to this function.

Should be backported to 2.7.
2023-04-24 15:53:27 +02:00
Frédéric Lécaille
81a02b59f5 CLEANUP: quic: Rename several <buf> variables at low level
Make quic_stateless_reset_token_cpy(), quic_derive_cid() and quic_get_cid_tid()
be more readable: there is no struct buffer variable manipulated by these
functions.

Should be backported to 2.7.
2023-04-24 15:53:27 +02:00
Frédéric Lécaille
182934d80b CLEANUP: quic: Rename quic_get_dgram_dcid() <buf> variable
quic_get_dgram_dcid() does not manipulate any struct buffer variable.

Should be backported to 2.7.
2023-04-24 15:53:26 +02:00
Frédéric Lécaille
1e0f8255a1 CLEANUP: quic: Make qc_build_pkt() be more readable
There is no <buf> variable passed to this function.
Also rename <buf_end> to <end> to mimic others functions.
Rename <beg> to <first_byte> and <end> to <last_byte>.

Should be backported to 2.7.
2023-04-24 15:53:26 +02:00
Frédéric Lécaille
3adb9e85a1 CLEANUP: quic: Rename <buf> variable for several low level functions
Make quic_build_packet_long_header(), quic_build_packet_short_header() and
quic_apply_header_protection() be more readable: there is no struct buffer
variables used by these functions.

Should be backported to 2.7.
2023-04-24 15:53:26 +02:00
Frédéric Lécaille
bef3098d33 CLEANUP: quic: Rename <buf> variable into quic_rx_pkt_parse()
Make this function be more readable: there is no struct buffer variable used
by this function.

Should be backported to 2.7.
2023-04-24 15:53:26 +02:00
Frédéric Lécaille
7f0b1c7016 CLEANUP: quic: Rename <buf> variable into quic_padding_check()
Make quic_padding_check() be more readable: there is not struct buffer variable
used by this function.

Should be backported to 2.7.
2023-04-24 15:53:26 +02:00
Frédéric Lécaille
dad0ede28a CLEANUP: quic: Rename <buf> variable to <token> in quic_generate_retry_token()
Make quic_generate_retry_token() be more readable: there is no struct buffer
variable used in this function.

Should be backported to 2.7.
2023-04-24 15:53:26 +02:00
Frédéric Lécaille
e66d67a1ae CLEANUP: quic: Remove useless parameters passes to qc_purge_tx_buf()
Remove the pointer to the connection passed as parameters to qc_purge_tx_buf()
and other similar function which came with qc_purge_tx_buf() implementation.
They were there do track the connection during tests.

Must be backported to 2.7.
2023-04-24 15:53:26 +02:00
Amaury Denoyelle
d5f03cd576 CLEANUP: quic: rename frame variables
Rename all frame variables with the suffix _frm. This helps to
differentiate frame instances from other internal objects.

This should be backported up to 2.7.
2023-04-24 15:35:22 +02:00
Amaury Denoyelle
888c5f283a CLEANUP: quic: rename frame types with an explicit prefix
Each frame type used in quic_frame union has been renamed with the
following prefix "qf_". This helps to differentiate frame instances from
other internal objects.

This should be backported up to 2.7.
2023-04-24 15:35:03 +02:00
Frédéric Lécaille
b73762ad78 BUG/MINOR: quic: Useless I/O handler task wakeups (draining, killing state)
From the idle_timer_task(), the I/O handler must be woken up to send ack. But
there is no reason to do that in draining state or killing state. In draining
state this is even forbidden.

Must be backported to 2.7.
2023-04-24 11:47:11 +02:00
Frédéric Lécaille
d21c628ffd BUG/MINOR: quic: Useless probing retransmission in draining or killing state
The timer task responsible of triggering probing retransmission did not inspect
the state of the connection before doing its job. But there is no need to
probe the peer when the connection is in draining or killing state. About the
draining state, this is even forbidden.

Must be backported to 2.7 and 2.6.
2023-04-24 11:46:33 +02:00
Frédéric Lécaille
c6bec2a3af BUG/MINOR: quic: Possible leak during probing retransmissions
qc_dgrams_retransmit() prepares two list of frames to be retransmitted into
two datagrams. If the first datagram could not be sent, the TX buffer will
be purged with the prepared packet and its frames, but this was not the case for
the second list of frames.

Must be backported in 2.7.
2023-04-24 11:38:28 +02:00
Frédéric Lécaille
ce0bb338c6 BUG/MINOR: quic: Possible memory leak from TX packets
This bug arrived with this commit which was not sufficient:

     BUG/MEDIUM: quic: Missing TX buffer draining from qc_send_ppkts()

Indeed, there were also remaining allocated TX packets to be released and
their TX frames.
Implement qc_purge_tx_buf() to do so which depends on qc_free_tx_coalesced_pkts()
and qc_free_frm_list().

Must be backported to 2.7.
2023-04-24 11:38:28 +02:00
Frédéric Lécaille
e95e00e305 MINOR: quic: Move traces at proto level
These traces has already been useful to debug issues.

Must be backported to 2.7 and 2.6.
2023-04-24 11:38:16 +02:00
Willy Tarreau
3b50e5c164 [RELEASE] Released version 2.8-dev8
Released version 2.8-dev8 with the following main changes :
    - BUG/MEDIUM: cli: Set SE_FL_EOI flag for '_getsocks' and 'quit' commands
    - BUG/MEDIUM: cli: Eat output data when waiting for appctx shutdown
    - BUG/MEDIUM: http-client: Eat output data when waiting for appctx shutdown
    - BUG/MEDIUM: stats: Eat output data when waiting for appctx shutdown
    - BUG/MEDIUM: log: Eat output data when waiting for appctx shutdown
    - BUG/MEDIUM: dns: Kill idle DNS sessions during stopping stage
    - BUG/MINOR: resolvers: Wakeup DNS idle task on stopping
    - BUG/MEDIUM: resolvers: Force the connect timeout for DNS resolutions
    - MINOR: hlua: Stop to check the SC state when executing a hlua cli command
    - BUG/MEDIUM: mux-h1: Report EOI when a TCP connection is upgraded to H2
    - BUG/MEDIUM: mux-h2: Never set SE_FL_EOS without SE_FL_EOI or SE_FL_ERROR
    - MINOR: quic: Trace fix in quic_pto_pktns() (handshaske status)
    - BUG/MINOR: quic: Wrong packet number space probing before confirmed handshake
    - MINOR: quic: Modify qc_try_rm_hp() traces
    - MINOR: quic: Dump more information at proto level when building packets
    - MINOR: quic: Add a trace for packet with an ACK frame
    - MINOR: activity: add a line reporting the average CPU usage to "show activity"
    - BUG/MINOR: stick_table: alert when type len has incorrect characters
    - MINOR: thread: keep a bitmask of enabled groups in thread_set
    - MINOR: fd: optimize fd_claim_tgid() for use in fd_insert()
    - MINOR: fd: add a lock bit with the tgid
    - MINOR: fd: implement fd_migrate_on() to migrate on a non-local thread
    - MINOR: receiver: reserve special values for "shards"
    - MINOR: bind-conf: support a new shards value: "by-group"
    - BUG/MEDIUM: fd: don't wait for tmask to stabilize if we're not in it.
    - MINOR: quic: Add packet loss and maximum cc window to "show quic"
    - BUG/MINOR: quic: Ignored less than 1ms RTTs
    - MINOR: quic: Add connection flags to traces
    - BUG/MEDIUM: quic: Code sanitization about acknowledgements requirements
    - BUG/MINOR: quic: Possible wrapped values used as ACK tree purging limit.
    - BUG/MINOR: quic: SIGFPE in quic_cubic_update()
    - MINOR: quic: Display the packet number space flags in traces
    - MINOR: quic: Remove a useless test about probing in qc_prep_pkts()
    - BUG/MINOR: quic: Wrong Application encryption level selection when probing
    - CI: bump "actions/checkout" to v3 for cross zoo matrix
    - CI: enable monthly test on Fedora Rawhide
    - BUG/MINOR: stream: Fix test on SE_FL_ERROR on the wrong entity
    - BUG/MEDIUM: stream: Report write timeouts before testing the flags
    - BUG/MEDIUM: stconn: Do nothing in sc_conn_recv() when the SC needs more room
    - MINOR: stream: Uninline and export sess_set_term_flags() function
    - MINOR: filters: Review and simplify errors handling
    - REGTESTS: fix the race conditions in log_uri.vtc
    - MINOR: channel: Forwad close to other side on abort
    - MINOR: stream: Introduce stream_abort() to abort on both sides in same time
    - MINOR: stconn: Rename SC_FL_SHUTR_NOW in SC_FL_ABRT_WANTED
    - MINOR: channel/stconn: Replace channel_shutr_now() by sc_schedule_abort()
    - MINOR: stconn: Rename SC_FL_SHUTW_NOW in SC_FL_SHUT_WANTED
    - MINOR: channel/stconn: Replace channel_shutw_now() by sc_schedule_shutdown()
    - MINOR: stconn: Rename SC_FL_SHUTR in SC_FL_ABRT_DONE
    - MINOR: channel/stconn: Replace sc_shutr() by sc_abort()
    - MINOR: stconn: Rename SC_FL_SHUTW in SC_FL_SHUT_DONE
    - MINOR: channel/stconn: Replace sc_shutw() by sc_shutdown()
    - MINOR: tree-wide: Replace several chn_cons() by the corresponding SC
    - MINOR: tree-wide: Replace several chn_prod() by the corresponding SC
    - BUG/MINOR: cli: Don't close when SE_FL_ERR_PENDING is set in cli analyzer
    - MINOR: stconn: Stop to set SE_FL_ERROR on sending path
    - MEDIUM: stconn: Forbid applets with more to deliver if EOI was reached
    - MINOR: stconn: Don't clear SE_FL_ERROR when endpoint is reset
    - MINOR: stconn: Add a flag to ack endpoint errors at SC level
    - MINOR: backend: Set SC_FL_ERROR on connection error
    - MINOR: stream: Set SC_FL_ERROR on channels' buffer allocation error
    - MINOR: tree-wide: Test SC_FL_ERROR with SE_FL_ERROR from upper layer
    - MEDIUM: tree-wide: Stop to set SE_FL_ERROR from upper layer
    - MEDIUM: backend: Stop to use SE flags to detect connection errors
    - MEDIUM: stream: Stop to use SE flags to detect read errors from analyzers
    - MEDIUM: stream: Stop to use SE flags to detect endpoint errors
    - MEDIUM: stconn: Rely on SC flags to handle errors instead of SE flags
    - BUG/MINOR: stconn: Don't set SE_FL_ERROR at the end of sc_conn_send()
    - BUG/MINOR: quic: Do not use ack delay during the handshakes
    - CLEANUP: use "offsetof" where appropriate
    - MINOR: ssl: remove OpenSSL 1.0.2 mention into certificate loading error
    - BUG/MEDIUM: http-ana: Properly switch the request in tunnel mode on upgrade
    - BUG/MEDIUM: log: Properly handle client aborts in syslog applet
    - MINOR: stconn: Add a flag to report EOS at the stream-connector level
    - MINOR: stconn: Propagate EOS from a mux to the attached stream-connector
    - MINOR: stconn: Propagate EOS from an applet to the attached stream-connector
    - MINOR: mux-h2: make the initial window size configurable per side
    - MINOR: mux-h2: make the max number of concurrent streams configurable per side
    - BUG/MINOR: task: allow to use tasklet_wakeup_after with tid -1
    - CLEANUP: quic: remove unused QUIC_LOCK label
    - CLEANUP: quic: remove unused scid_node
    - CLEANUP: quic: remove unused qc param on stateless reset token
    - CLEANUP: quic: rename quic_connection_id vars
    - MINOR: quic: remove uneeded tasklet_wakeup after accept
    - MINOR: quic: adjust Rx packet type parsing
    - MINOR: quic: adjust quic CID derive API
    - MINOR: quic: remove TID ref from quic_conn
    - MEDIUM: quic: use a global CID trees list
    - MINOR: quic: remove TID encoding in CID
    - MEDIUM: quic: handle conn bootstrap/handshake on a random thread
    - MINOR: quic: do not proceed to accept for closing conn
    - MINOR: protocol: define new callback set_affinity
    - MINOR: quic: delay post handshake frames after accept
    - MEDIUM: quic: implement thread affinity rebinding
    - BUG/MINOR: quic: transform qc_set_timer() as a reentrant function
    - MINOR: quic: properly finalize thread rebinding
    - MAJOR: quic: support thread balancing on accept
    - MINOR: listener: remove unneeded local accept flag
    - BUG/MINOR: http-ana: Update analyzers on both sides when switching in TUNNEL mode
    - CLEANUP: backend: Remove useless debug message in assign_server()
    - CLEANUP: cli: Remove useless debug message in cli_io_handler()
    - BUG/MEDIUM: stconn: Propagate error on the SC on sending path
    - MINOR: config: add "no-alpn" support for bind lines
    - REGTESTS: add a new "ssl_alpn" test to test ALPN negotiation
    - DOC: add missing documentation for "no-alpn" on bind lines
    - MINOR: ssl: do not set ALPN callback with the empty string
    - MINOR: ssl_crtlist: dump "no-alpn" on "show crtlist" when "no-alpn" was set
    - MEDIUM: config: set useful ALPN defaults for HTTPS and QUIC
    - BUG/MEDIUM: quic: prevent crash on Retry sending
    - BUG/MINOR: cfgparse: make sure to include openssl-compat
    - MINOR: clock: add now_mono_time_fast() function
    - MINOR: clock: add now_cpu_time_fast() function
    - MEDIUM: hlua: reliable timeout detection
    - MEDIUM: hlua: introduce tune.lua.burst-timeout
    - CLEANUP: hlua: avoid confusion between internal timers and tick based timers
    - MINOR: hlua: hook yield on known lua state
    - MINOR: hlua: safe coroutine.create()
    - BUG/MINOR: quic: Stop removing ACK ranges when building packets
    - MINOR: quic: Do not allocate too much ack ranges
    - BUG/MINOR: quic: Unchecked buffer length when building the token
    - BUG/MINOR: quic: Wrong Retry token generation timestamp computing
    - BUG/MINOR: mux-quic: fix crash with app ops install failure
    - BUG/MINOR: mux-quic: properly handle STREAM frame alloc failure
    - BUG/MINOR: h3: fix crash on h3s alloc failure
    - BUG/MINOR: quic: prevent crash on qc_new_conn() failure
    - BUG/MINOR: quic: consume Rx datagram even on error
    - CLEANUP: errors: fix obsolete function comments
    - CLEANUP: server: fix update_status() function comment
    - MINOR: server/event_hdl: add proxy_uuid to event_hdl_cb_data_server
    - MINOR: hlua/event_hdl: rely on proxy_uuid instead of proxy_name for lookups
    - MINOR: hlua/event_hdl: expose proxy_uuid variable in server events
    - MINOR: hlua/event_hdl: fix return type for hlua_event_hdl_cb_data_push_args
    - MINOR: server/event_hdl: prepare for upcoming refactors
    - BUG/MINOR: event_hdl: don't waste 1 event subtype slot
    - CLEANUP: event_hdl: updating obsolete comment for EVENT_HDL_CB_DATA
    - CLEANUP: event_hdl: fix comment typo about _sync assertion
    - MINOR: event_hdl: dynamically allocated event data members
    - MINOR: event_hdl: provide event->when for advanced handlers
    - MINOR: hlua/event_hdl: timestamp for events
    - DOC: lua: restore 80 char limitation
    - BUG/MINOR: server: incorrect report for tracking servers leaving drain
    - MINOR: server: explicitly commit state change in srv_update_status()
    - BUG/MINOR: server: don't miss proxy stats update on server state transitions
    - BUG/MINOR: server: don't miss server stats update on server state transitions
    - BUG/MINOR: server: don't use date when restoring last_change from state file
    - MINOR: server: central update for server counters on state change
    - MINOR: server: propagate server state change to lb through single function
    - MINOR: server: propagate lb changes through srv_lb_propagate()
    - MINOR: server: change adm_st_chg_cause storage type
    - MINOR: server: srv_append_status refacto
    - MINOR: server: change srv_op_st_chg_cause storage type
    - CLEANUP: server: remove unused variables in srv_update_status()
    - CLEANUP: server: fix srv_set_{running, stopping, stopped} function comment
    - MINOR: server: pass adm and op cause to srv_update_status()
    - MEDIUM: server: split srv_update_status() in two functions
    - MINOR: server/event_hdl: prepare for server event data wrapper
    - MINOR: quic: support migrating the listener as well
    - MINOR: quic_sock: index li->per_thr[] on local thread id, not global one
    - MINOR: listener: support another thread dispatch mode: "fair"
    - MINOR: receiver: add a struct shard_info to store info about each shard
    - MINOR: receiver: add RX_F_MUST_DUP to indicate that an rx must be duped
    - MEDIUM: proto: duplicate receivers marked RX_F_MUST_DUP
    - MINOR: proto: skip socket setup for duped FDs
    - MEDIUM: config: permit to start a bind on multiple groups at once
    - MINOR: listener: make accept_queue index atomic
    - MEDIUM: listener: rework thread assignment to consider all groups
    - MINOR: listener: use a common thr_idx from the reference listener
    - MINOR: listener: resync with the thread index before heavy calculations
    - MINOR: listener: make sure to avoid ABA updates in per-thread index
    - MINOR: listener: always compare the local thread as well
    - MINOR: Make `tasklet_free()` safe to be called with `NULL`
    - CLEANUP: Stop checking the pointer before calling `tasklet_free()`
    - CLEANUP: Stop checking the pointer before calling `pool_free()`
    - CLEANUP: Stop checking the pointer before calling `task_free()`
    - CLEANUP: Stop checking the pointer before calling `ring_free()`
    - BUG/MINOR: cli: clarify error message about stats bind-process
    - CI: cirrus-ci: bump FreeBSD image to 13-1
    - REGTESTS: remove unsupported "stats bind-process" keyword
    - CI: extend spellchecker whitelist, add "clen" as well
    - CLEANUP: assorted typo fixes in the code and comments
    - BUG/MINOR: sock_inet: use SO_REUSEPORT_LB where available
    - BUG/MINOR: tools: check libssl and libcrypto separately
    - BUG/MINOR: config: fix NUMA topology detection on FreeBSD
    - BUILD: sock_inet: forward-declare struct receiver
    - BUILD: proto_tcp: export the correct names for proto_tcpv[46]
    - CLEANUP: protocol: move the l3_addrlen to plug a hole in proto_fam
    - CLEANUP: protocol: move the nb_receivers to plug a hole in protocol
    - REORG: listener: move the bind_conf's thread setup code to listener.c
    - MINOR: proxy: make proxy_type_str() recognize peers sections
    - MEDIUM: peers: call bind_complete_thread_setup() to finish the config
    - MINOR: protocol: add a flags field to store info about protocols
    - MINOR: protocol: move the global reuseport flag to the protocols
    - MINOR: listener: automatically adjust shards based on support for SO_REUSEPORT
    - MINOR: protocol: add a function to check if some features are supported
    - MINOR: sock: add a function to check for SO_REUSEPORT support at runtime
    - MINOR: protocol: perform a live check for SO_REUSEPORT support
    - MINOR: listener: do not restrict CLI to first group anymore
    - MINOR: listener: add a new global tune.listener.default-shards setting
    - MEDIUM: listener: switch the default sharding to by-group
2023-04-23 10:21:37 +02:00
Willy Tarreau
0e875cf291 MEDIUM: listener: switch the default sharding to by-group
Sharding by-group is exactly identical to by-process for a single
group, and will use the same number of file descriptors for more than
one group, while significantly lowering the kernel's locking overhead.

Now that all special listeners (cli, peers) are properly handled, and
that support for SO_REUSEPORT is detected at runtime per protocol, there
should be no more reason for now switching to by-group by default.

That's what this patch does. It does only this and nothing else so that
it's easy to revert, should any issue be raised.

Testing on an AMD EPYC 74F3 featuring 24 cores and 48 threads distributed
into 8 core complexes of 3 cores each, shows that configuring 8 groups
(one per CCX) is sufficient to simply double the forwarded connection
rate from 112k to 214k/s, reducing kernel locking from 71 to 55%.
2023-04-23 10:18:16 +02:00
Willy Tarreau
7310164b2c MINOR: listener: add a new global tune.listener.default-shards setting
This new setting accepts "by-process", "by-group" and "by-thread" and
will dictate how listeners will be sharded by default when nothing is
specified. While the default remains "by-process", "by-group" should be
much more efficient with many threads, while not changing anything for
single-group setups.
2023-04-23 09:46:15 +02:00
Willy Tarreau
c38499ceae MINOR: listener: do not restrict CLI to first group anymore
Now that we're able to run listeners on any set of groups, we don't need
to maintain a special case about the stats socket anymore. It used to be
forced to group 1 only so as to avoid startup failures in case several
groups were configured, but if it's done now, it will automatically bind
the needed FDs to have one per group so this is no more an issue.
2023-04-23 09:46:15 +02:00
Willy Tarreau
f1003ea7fa MINOR: protocol: perform a live check for SO_REUSEPORT support
When testing if a protocol supports SO_REUSEPORT, we're now able to
verify if the OS does really support it. While it may be supported at
build time, it may possibly have been blocked in a container for
example so we'd rather know what it's like.
2023-04-23 09:46:15 +02:00
Willy Tarreau
b073573c10 MINOR: sock: add a function to check for SO_REUSEPORT support at runtime
The new function _sock_supports_reuseport() will be used to check if a
protocol type supports SO_REUSEPORT or not. This will be useful to verify
that shards can really work.
2023-04-23 09:46:15 +02:00
Willy Tarreau
8a5e6f4cca MINOR: protocol: add a function to check if some features are supported
The new function protocol_supports_flag() checks the protocol flags
to verify if some features are supported, but will support being
extended to refine the tests. Let's use it to check for REUSEPORT.
2023-04-23 09:46:15 +02:00
Willy Tarreau
c1fbdd6397 MINOR: listener: automatically adjust shards based on support for SO_REUSEPORT
Now if multiple shards are explicitly requested, and the listener's
protocol doesn't support SO_REUSEPORT, sharding is disabled, which will
result in the socket being automatically duped if needed. A warning is
emitted when this happens. If "shards by-group" or "shards by-thread"
are used, these will automatically be turned down to 1 since we want
this to be possible easily using -dR on the command line without having
to djust the config. For "by-thread", a diag warning will be emitted to
help troubleshoot possible performance issues.
2023-04-23 09:46:15 +02:00
Willy Tarreau
785b89f551 MINOR: protocol: move the global reuseport flag to the protocols
Some protocol support SO_REUSEPORT and others not. Some have such a
limitation in the kernel, and others in haproxy itself (e.g. sock_unix
cannot support multiple bindings since each one will unbind the previous
one). Also it's really protocol-dependent and not just family-dependent
because on Linux for some time it was supported for TCP and not UDP.

Let's move the definition to the protocols instead. Now it's preset in
tcp/udp/quic when SO_REUSEPORT is defined, and is otherwise left unset.
The enabled() config condition test validates IPv4 (generally sufficient),
and -dR / noreuseport all protocols at once.
2023-04-23 09:46:15 +02:00
Willy Tarreau
65df7e028d MINOR: protocol: add a flags field to store info about protocols
We'll use these flags to know if some protocols are supported, and if
so, with what options/extensions. Reuseport will move there for example.
Two functions were added to globally set/clear a flag.
2023-04-23 09:46:15 +02:00
Willy Tarreau
a22db6567f MEDIUM: peers: call bind_complete_thread_setup() to finish the config
The listeners in peers sections were still not handing the thread
groups fine. Shards were silently ignored and if a listener was bound
to more than one group, it would simply fail. Now we can call the
dedicated function to resolve all this and possibly create the missing
extra listeners.

bind_complete_thread_setup() was adjusted to use the proxy_type_str()
instead of writing "proxy" at the only place where this word was still
hard-coded so that we continue to speak about peers sections when
relevant.
2023-04-23 09:46:15 +02:00
Willy Tarreau
da0d2cb698 MINOR: proxy: make proxy_type_str() recognize peers sections
Now proxy_type_str() will emit "peers section" when the mode is set to
peers, so as to ease sharing more code between peers and proxies.
2023-04-23 09:46:15 +02:00
Willy Tarreau
f6a8444f55 REORG: listener: move the bind_conf's thread setup code to listener.c
What used to be only two lines to apply a mask in a loop in
check_config_validity() grew into a 130-line block that performs deeply
listener-specific operations that do not have their place there anymore.
In addition it's worth noting that the peers code still doesn't support
shards nor being bound to more than one group, which is a second reason
for moving that code to its own function. Nothing was changed except
recreating the missing variables from the bind_conf itself (the fe only).
2023-04-23 09:46:15 +02:00
Willy Tarreau
4c538df28c CLEANUP: protocol: move the nb_receivers to plug a hole in protocol
This field forces an unaligned hole between two list heads. Let's move
it up where it will be more easily combined with other fields. In
addition, turn it to unsigned while it's still not used.
2023-04-23 09:46:15 +02:00
Willy Tarreau
798d6b4124 CLEANUP: protocol: move the l3_addrlen to plug a hole in proto_fam
There's a two-byte hole in proto_fam after sock_family, let's move the
l3_addrlen there as a ushort. Note that contrary to what the comment
says, it's still not used by hash algorithms though it could.
2023-04-23 09:46:15 +02:00
Willy Tarreau
df4051cd58 BUILD: proto_tcp: export the correct names for proto_tcpv[46]
The exported names were not correct (missing the 'v').
2023-04-23 09:46:15 +02:00
Willy Tarreau
968a4f34fc BUILD: sock_inet: forward-declare struct receiver
Including sock_inet.h without receiver-t.h causes build failures due to
struct receiver not being defined. Let's just forward-declare it.
2023-04-23 09:46:15 +02:00
Willy Tarreau
e1a0107f9c BUG/MINOR: config: fix NUMA topology detection on FreeBSD
In 2.6-dev1, NUMA topology detection was enabled on FreeBSD with commit
f5d48f8b3 ("MEDIUM: cfgparse: numa detect topology on FreeBSD."). But
it suffers from a minor bug which is that it forgets to check for the
number of domains and always emits a confusing warning indicating that
multiple sockets were found while it's not the case.

This can be backported to 2.6.
2023-04-23 09:46:15 +02:00
Willy Tarreau
997ad155fe BUG/MINOR: tools: check libssl and libcrypto separately
The lib compatibility checks introduced in 2.8-dev6 with commit c3b297d5a
("MEDIUM: tools: further relax dlopen() checks too consider grouped
symbols") were partially incorrect in that they check at the same time
libcrypto and libssl. But if loading a library that only depends on
libcrypto, the ssl-only symbols will be missing and this might present
an inconsistency. This is what is observed on FreeBSD 13.1 when
libcrypto is being loaded, where it sees two symbols having disappeared.

The fix consists in splitting the checks for libcrypto and libssl.

No backport is needed, unless the patch above finally gets backported.
2023-04-23 09:46:15 +02:00
Willy Tarreau
9f53b7b41a BUG/MINOR: sock_inet: use SO_REUSEPORT_LB where available
On FreeBSD 13.1 I noticed that thread balancing using shards was not
always working. Sometimes several threads would work, but most of the
time a single one was taking all the traffic. This is related to how
SO_REUSEPORT works on FreeBSD since version 12, as it seems there is
no guarantee that multiple sockets will receive the traffic. However
there is SO_REUSEPORT_LB that is designed exactly for this, so we'd
rather use it when available.

This patch may possibly be backported, but nobody complained and it's
not sure that many users rely on shards. So better wait for some feedback
before backporting this.
2023-04-23 09:46:15 +02:00
Ilya Shipitsin
ccf8012f28 CLEANUP: assorted typo fixes in the code and comments
This is 36th iteration of typo fixes
2023-04-23 09:44:53 +02:00
Ilya Shipitsin
edfa7c99e9 CI: extend spellchecker whitelist, add "clen" as well
"clen" is all around the code, since codespell cannot distingush
variables names, let us ignore it
2023-04-23 09:44:53 +02:00
Ilya Shipitsin
e247038b1c REGTESTS: remove unsupported "stats bind-process" keyword
reg-tests/connection/proxy_protocol_random_fail.vtc fails due to
"stats bind-process":

***  h1    debug|[ALERT]    (1476756) : config : parsing [/tmp/haregtests-2023-04-22_19-24-10.diuT6B/vtc.1476661.74f1092e/h1/cfg:7] : 'stats' is
***  h1    debug|not supported anymore.
2023-04-23 09:44:53 +02:00
Ilya Shipitsin
dfb48d4c8a CI: cirrus-ci: bump FreeBSD image to 13-1
FreeBSD-13.2 released on April 11, 2023
2023-04-23 09:44:53 +02:00
Willy Tarreau
023c311d70 BUG/MINOR: cli: clarify error message about stats bind-process
In 2.7-dev2, "stats bind-process" was removed by commit 94f763b5e
("MEDIUM: config: remove deprecated "bind-process" directives from
frontends") and an error message indicates that it's no more supported.
However it says "stats" is not supported instead of "stats bind-process",
making it a bit confusing.

This should be backported to 2.7.
2023-04-23 09:40:56 +02:00