Commit Graph

38 Commits

Author SHA1 Message Date
Willy Tarreau
12fcd65468 MINOR: tasklet: support an optional set of wakeup flags to tasklet_wakeup_on()
tasklet_wakeup_on() and its derivates (tasklet_wakeup_after() and
tasklet_wakeup()) do not support passing a wakeup cause like
task_wakeup(). This is essentially due to an API limitation cause by
the fact that for a very long time the only reason for waking up was
to process pending I/O. But with the growing complexity of mux tasks,
it is becoming important to be able to skip certain heavy processing
when not strictly needed.

One possibility is to permit the caller of tasklet_wakeup() to pass
flags like task_wakeup(). Instead of going with a complex naming scheme,
let's simply make the flags optional and be zero when not specified. This
means that tasklet_wakeup_on() now takes either 2 or 3 args, and that the
third one is the optional flags to be passed to the callee. Eligible flags
are essentially the non-persistent ones (TASK_F_UEVT* and TASK_WOKEN_*)
which are cleared when the tasklet is executed. This way the handler
will find them in its <state> argument and will be able to distinguish
various causes for the call.
2024-11-19 20:13:41 +01:00
Willy Tarreau
8dc68f3c75 DOC: sched: document the missing TASK_F_UEVT* flags
These are user-defined one-shot events that are application-specific
and reset upon wakeup and were not documented. No backport is needed
since these were added to 3.1.
2024-11-19 20:13:41 +01:00
Willy Tarreau
e5ca72cb6f DOC: sched: add missing scheduler API documentation for tasklet_wakeup_after()
This was added to 2.6 but the doc was forgotten. Let's add it. It's not
needed to backport this since it's only used for new developments.
2024-11-19 20:13:41 +01:00
Willy Tarreau
8f09bdce10 MINOR: buffer: add a buffer list type with functions
The buffer ring is problematic in multiple aspects, one of which being
that it is only usable by one entity. With multiplexed protocols, we need
to have shared buffers used by many entities (streams and connection),
and the only way to use the buffer ring model in this case is to have
each entity store its own array, and keep a shared counter on allocated
entries. But even with the default 32 buf and 100 streams per HTTP/2
connection, we're speaking about 32*101*32 bytes = 103424 bytes per H2
connection, just to store up to 32 shared buffers, spread randomly in
these tables. Some users might want to achieve much higher than default
rates over high speed links (e.g. 30-50 MB/s at 100ms), which is 3 to 5
MB storage per connection, hence 180 to 300 buffers. There it starts to
cost a lot, up to 1 MB per connection, just to store buffer indexes.

Instead this patch introduces a variant which we call a buffer list.
That's basically just a free list encoded in an array. Each cell
contains a buffer structure, a next index, and a few flags. The index
could be reduced to 16 bits if needed, in order to make room for a new
struct member. The design permits initializing a whole freelist at once
using memset(0).

The list pointer is stored at a single location (e.g. the connection)
and all users (the streams) will just have indexes referencing their
first and last assigned entries (head and tail). This means that with
a single table we can now have all our buffers shared between multiple
streams, irrelevant to the number of potential streams which would want
to use them. Now the 180 to 300 entries array only costs 7.2 to 12 kB,
or 80 times less.

Two large functions (bl_deinit() & bl_get()) were implemented in buf.c.
A basic doc was added to explain how it works.
2024-10-12 16:29:15 +02:00
Willy Tarreau
4e65fc66f6 MAJOR: import: update mt_list to support exponential back-off (try #2)
This is the second attempt at importing the updated mt_list code (commit
59459ea3). The previous one was attempted with commit c618ed5ff4 ("MAJOR:
import: update mt_list to support exponential back-off") but revealed
problems with QUIC connections and was reverted.

The problem that was faced was that elements deleted inside an iterator
were no longer reset, and that if they were to be recycled in this form,
they could appear as busy to the next user. This was trivially reproduced
with this:

  $ cat quic-repro.cfg
  global
          stats socket /tmp/sock1 level admin
          stats timeout 1h
          limited-quic

  frontend stats
          mode http
          bind quic4@:8443 ssl crt rsa+dh2048.pem alpn h3
          timeout client 5s
          stats uri /

  $ ./haproxy -db -f quic-repro.cfg  &

  $ h2load -c 10 -n 100000 --npn h3 https://127.0.0.1:8443/
  => hang

This was purely an API issue caused by the simplified usage of the macros
for the iterator. The original version had two backups (one full element
and one pointer) that the user had to take care of, while the new one only
uses one that is transparent for the user. But during removal, the element
still has to be unlocked if it's going to be reused.

All of this sparked discussions with Fred and Aurélien regarding the still
unclear state of locking. It was found that the lock API does too much at
once and is lacking granularity. The new version offers a much more fine-
grained control allowing to selectively lock/unlock an element, a link,
the rest of the list etc.

It was also found that plenty of places just want to free the current
element, or delete it to do anything with it, hence don't need to reset
its pointers (e.g. event_hdl). Finally it appeared obvious that the
root cause of the problem was the unclear usage of the list iterators
themselves because one does not necessarily expect the element to be
presented locked when not needed, which makes the unlock easy to overlook
during reviews.

The updated version of the list presents explicit lock status in the
macro name (_LOCKED or _UNLOCKED suffixes). When using the _LOCKED
suffix, the caller is expected to unlock the element if it intends to
reuse it. At least the status is advertised. The _UNLOCKED variant,
instead, always unlocks it before starting the loop block. This means
it's not necessary to think about unlocking it, though it's obviously
not usable with everything. A few _UNLOCKED were used at obvious places
(i.e. where the element is deleted and freed without any prior check).

Interestingly, the tests performed last year on QUIC forwarding, that
resulted in limited traffic for the original version and higher bit
rate for the new one couldn't be reproduced because since then the QUIC
stack has gaind in efficiency, and the 100 Gbps barrier is now reached
with or without the mt_list update. However the unit tests definitely
show a huge difference, particularly on EPYC platforms where the EBO
provides tremendous CPU savings.

Overall, the following changes are visible from the application code:

  - mt_list_for_each_entry_safe() + 1 back elem + 1 back ptr
    => MT_LIST_FOR_EACH_ENTRY_LOCKED() or MT_LIST_FOR_EACH_ENTRY_UNLOCKED()
       + 1 back elem

  - MT_LIST_DELETE_SAFE() no longer needed in MT_LIST_FOR_EACH_ENTRY_UNLOCKED()
      => just manually set iterator to NULL however.
    For MT_LIST_FOR_EACH_ENTRY_LOCKED()
      => mt_list_unlock_self() (if element going to be reused) + NULL

  - MT_LIST_LOCK_ELT => mt_list_lock_full()
  - MT_LIST_UNLOCK_ELT => mt_list_unlock_full()

  - l = MT_LIST_APPEND_LOCKED(h, e);  MT_LIST_UNLOCK_ELT();
    => l=mt_list_lock_prev(h); mt_list_lock_elem(e); mt_list_unlock_full(e, l)
2024-07-09 16:46:38 +02:00
Willy Tarreau
bbc2f043e3 [RELEASE] Released version 3.1-dev2
Released version 3.1-dev2 with the following main changes :
    - BUG/MINOR: log: fix broken '+bin' logformat node option
    - DEBUG: hlua: distinguish burst timeout errors from exec timeout errors
    - REGTESTS: ssl: fix some regtests 'feature cmd' start condition
    - BUG/MEDIUM: ssl: AWS-LC + TLSv1.3 won't do ECDSA in RSA+ECDSA configuration
    - MINOR: ssl: activate sigalgs feature for AWS-LC
    - REGTESTS: ssl: activate new SSL reg-tests with AWS-LC
    - BUG/MEDIUM: proxy: fix email-alert invalid free
    - REORG: mailers: move free_email_alert() to mailers.c
    - BUG/MINOR: proxy: fix email-alert leak on deinit() (2nd try)
    - DOC: configuration: fix alphabetical order of bind options
    - DOC: management: document ptr lookup for table commands
    - BUG/MAJOR: quic: fix padding with short packets
    - BUG/MAJOR: quic: do not loop on emission on closing/draining state
    - MINOR: sample: date converter takes HTTP date and output an UNIX timestamp
    - SCRIPTS: git-show-backports: do not truncate git-show output
    - DOC: api/event_hdl: small updates, fix an example and add some precisions
    - BUG/MINOR: h3: fix crash on STOP_SENDING receive after GOAWAY emission
    - BUG/MINOR: mux-quic: fix crash on qcs SD alloc failure
    - BUG/MINOR: h3: fix BUG_ON() crash on control stream alloc failure
    - BUG/MINOR: quic: fix BUG_ON() on Tx pkt alloc failure
    - DEV: flags/show-fd-to-flags: adapt to recent versions
    - MINOR: capabilities: export capget and __user_cap_header_struct
    - MINOR: capabilities: prepare support for version 3
    - MINOR: capabilities: use _LINUX_CAPABILITY_VERSION_3
    - MINOR: cli/debug: show dev: add cmdline and version
    - MINOR: cli/debug: show dev: show capabilities
    - MINOR: debug: print gdb hints when crashing
    - BUILD: debug: also declare strlen() in __ABORT_NOW()
    - BUILD: Missing inclusion header for ssize_t type
    - BUG/MINOR: hlua: report proper context upon error in hlua_cli_io_handler_fct()
    - MINOR: cfgparse/log: remove leftover dead code
    - BUG/MEDIUM: stick-table: Decrement the ref count inside lock to kill a session
    - MINOR: stick-table: Always decrement ref count before killing a session
    - REORG: init: do MODE_CHECK_CONDITION logic first
    - REORG: init: encapsulate CHECK_CONDITION logic in a func
    - REORG: init: encapsulate 'reload' sockpair and master CLI listeners creation
    - REORG: init: encapsulate code that reads cfg files
    - BUG/MINOR: server: fix first server template name lookup UAF
    - MINOR: activity: make the memory profiling hash size configurable at build time
    - BUG/MEDIUM: server/dns: prevent DOWN/UP flap upon resolution timeout or error
    - BUG/MEDIUM: h3: ensure the ":method" pseudo header is totally valid
    - BUG/MEDIUM: h3: ensure the ":scheme" pseudo header is totally valid
    - BUG/MEDIUM: quic: fix race-condition in quic_get_cid_tid()
    - BUG/MINOR: quic: fix race condition in qc_check_dcid()
    - BUG/MINOR: quic: fix race-condition on trace for CID retrieval
2024-06-29 11:28:41 +02:00
Aurelien DARRAGON
13e0972aea DOC: api/event_hdl: small updates, fix an example and add some precisions
Fix an example suggesting that using EVENT_HDL_SUB_TYPE(x, y) with y being
0 was valid. Then add some notes to explain how to use
EVENT_HDL_SUB_FAMILY() and EVENT_HDL_SUB_TYPE() with valid values.

Also mention that the feature is available starting from 2.8 and not 2.7.
Finally, perform some purely cosmetic updates.

This could be backported in 2.8.
2024-06-21 18:12:31 +02:00
Willy Tarreau
72d0dcda8e MINOR: dynbuf: pass a criticality argument to b_alloc()
The goal is to indicate how critical the allocation is, between the
least one (growing an existing buffer ring) and the topmost one (boot
time allocation for the life of the process).

The 3 tcp-based muxes (h1, h2, fcgi) use a common allocation function
to try to allocate otherwise subscribe. There's currently no distinction
of direction nor part that tries to allocate, and this should be revisited
to improve this situation, particularly when we consider that mux-h2 can
reduce its Tx allocations if needed.

For now, 4 main levels are planned, to translate how the data travels
inside haproxy from a producer to a consumer:
  - MUX_RX:   buffer used to receive data from the OS
  - SE_RX:    buffer used to place a transformation of the RX data for
              a mux, or to produce a response for an applet
  - CHANNEL:  the channel buffer for sync recv
  - MUX_TX:   buffer used to transfer data from the channel to the outside,
              generally a mux but there can be a few specificities (e.g.
              http client's response buffer passed to the application,
              which also gets a transformation of the channel data).

The other levels are a bit different in that they don't strictly need to
allocate for the first two ones, or they're permanent for the last one
(used by compression).
2024-05-10 17:18:13 +02:00
Willy Tarreau
ff3dcb20f2 [RELEASE] Released version 2.9-dev9
Released version 2.9-dev9 with the following main changes :
    - DOC: internal: filters: fix reference to entities.pdf
    - BUG/MINOR: ssl: load correctly @system-ca when ca-base is define
    - MINOR: lua: Add flags to configure logging behaviour
    - MINOR: lua: change tune.lua.log.stderr default from 'on' to 'auto'
    - BUG/MINOR: backend: fix wrong BUG_ON for avail conn
    - BUG/MAJOR: backend: fix idle conn crash under low FD
    - MINOR: backend: refactor insertion in avail conns tree
    - DEBUG: mux-h2/flags: fix list of h2c flags used by the flags decoder
    - BUG/MEDIUM: server/log: "mode log" after server keyword causes crash
    - MINOR: connection: add conn_pr_mode_to_proto_mode() helper func
    - BUG/MEDIUM: server: "proto" not working for dynamic servers
    - MINOR: server: add helper function to detach server from proxy list
    - DEBUG: add a tainted flag when ha_panic() is called
    - DEBUG: lua: add tainted flags for stuck Lua contexts
    - DEBUG: pools: detect that malloc_trim() is in progress
    - BUG/MINOR: quic: do not consider idle timeout on CLOSING state
    - MINOR: frontend: implement a dedicated actconn increment function
    - BUG/MINOR: ssl: use a thread-safe sslconns increment
    - MEDIUM: quic: count quic_conn instance for maxconn
    - MEDIUM: quic: count quic_conn for global sslconns
    - BUG/MINOR: ssl: suboptimal certificate selection with TLSv1.3 and dual ECDSA/RSA
    - REGTESTS: ssl: update the filters test for TLSv1.3 and sigalgs
    - BUG/MINOR: mux-quic: fix early close if unset client timeout
    - BUG/MEDIUM: ssl: segfault when cipher is NULL
    - BUG/MINOR: tcpcheck: Report hexstring instead of binary one on check failure
    - MEDIUM: systemd: be more verbose about the reload
    - MINOR: sample: Add fetcher for getting all cookie names
    - BUG/MINOR: proto_reverse_connect: support SNI on active connect
    - MINOR: proxy/stktable: add resolve_stick_rule helper function
    - BUG/MINOR: stktable: missing free in parse_stick_table()
    - BUG/MINOR: cfgparse/stktable: fix error message on stktable_init() failure
    - MINOR: stktable: stktable_init() sets err_msg on error
    - MINOR: stktable: check if a type should be used as-is
    - MEDIUM: stktable/peers: "write-to" local table on peer updates
    - CI: github: update wolfSSL to 5.6.4
    - DOC: install: update the wolfSSL required version
    - MINOR: server: Add parser support for set-proxy-v2-tlv-fmt
    - MINOR: connection: Send out generic, user-defined server TLVs
    - BUG/MEDIUM: pattern: don't trim pools under lock in pat_ref_purge_range()
    - MINOR: mux-h2: always use h2_send() in h2_done_ff(), not h2_process()
    - OPTIM: mux-h2: call h2_send() directly from h2_snd_buf()
    - BUG/MINOR: server: remove some incorrect free() calls on null elements
2023-11-04 09:38:16 +01:00
Aleksandar Lazic
1428e7b66d DOC: internal: filters: fix reference to entities.pdf
In doc/internals/api/filters.txt was the referece to
doc/internals/entities.pdf which was delteted in the
past.
2023-10-23 11:33:45 +02:00
Willy Tarreau
6cbb5a057b Revert "MAJOR: import: update mt_list to support exponential back-off"
This reverts commit c618ed5ff4.

The list iterator is broken. As found by Fred, running QUIC single-
threaded shows that only the first connection is accepted because the
accepter relies on the element being initialized once detached (which
is expected and matches what MT_LIST_DELETE_SAFE() used to do before).
However while doing this in the quic_sock code seems to work, doing it
inside the macro show total breakage and the unit test doesn't work
anymore (random crashes). Thus it looks like the fix is not trivial,
let's roll this back for the time it will take to fix the loop.
2023-09-15 17:13:43 +02:00
Willy Tarreau
c618ed5ff4 MAJOR: import: update mt_list to support exponential back-off
The new mt_list code supports exponential back-off on conflict, which
is important for use cases where there is contention on a large number
of threads. The API evolved a little bit and required some updates:

  - mt_list_for_each_entry_safe() is now in upper case to explicitly
    show that it is a macro, and only uses the back element, doesn't
    require a secondary pointer for deletes anymore.

  - MT_LIST_DELETE_SAFE() doesn't exist anymore, instead one just has
    to set the list iterator to NULL so that it is not re-inserted
    into the list and the list is spliced there. One must be careful
    because it was usually performed before freeing the element. Now
    instead the element must be nulled before the continue/break.

  - MT_LIST_LOCK_ELT() and MT_LIST_UNLOCK_ELT() have always been
    unclear. They were replaced by mt_list_cut_around() and
    mt_list_connect_elem() which more explicitly detach the element
    and reconnect it into the list.

  - MT_LIST_APPEND_LOCKED() was only in haproxy so it was left as-is
    in list.h. It may however possibly benefit from being upstreamed.

This required tiny adaptations to event_hdl.c and quic_sock.c. The
test case was updated and the API doc added. Note that in order to
keep include files small, the struct mt_list definition remains in
list-t.h (par of the internal API) and was ifdef'd out in mt_list.h.

A test on QUIC with both quictls 1.1.1 and wolfssl 5.6.3 on ARM64 with
80 threads shows a drastic reduction of CPU usage thanks to this and
the refined memory barriers. Please note that the CPU usage on OpenSSL
3.0.9 is significantly higher due to the excessive use of atomic ops
by openssl, but 3.1 is only slightly above 1.1.1 though:

  - before: 35 Gbps, 3.5 Mpps, 7800% CPU
  - after:  41 Gbps, 4.2 Mpps, 2900% CPU
2023-09-13 11:50:33 +02:00
Willy Tarreau
5f10176e2c MEDIUM: init: initialize the trash earlier
More and more utility functions rely on the trash while most of the init
code doesn't have access to it because it's initialized very late (in
PRE_CHECK for the initial one). It's a pool, and it purposely supports
being reallocated, so let's initialize it in STG_POOL so that early
STG_INIT code can at least use it.
2023-09-08 16:25:19 +02:00
Willy Tarreau
768b62857e [RELEASE] Released version 2.8-dev7
Released version 2.8-dev7 with the following main changes :
    - BUG/MINOR: stats: Don't replace sc_shutr() by SE_FL_EOS flag yet
    - BUG/MEDIUM: mux-h2: Be able to detect connection error during handshake
    - BUG/MINOR: quic: Missing padding in very short probe packets
    - MINOR: proxy/pool: prevent unnecessary calls to pool_gc()
    - CLEANUP: proxy: remove stop_time related dead code
    - DOC/MINOR: reformat configuration.txt's "quoting and escaping" table
    - MINOR: http_fetch: Add support for empty delim in url_param
    - MINOR: http_fetch: add case insensitive support for smp_fetch_url_param
    - MINOR: http_fetch: Add case-insensitive argument for url_param/urlp_val
    - REGTESTS : Add test support for case insentitive for url_param
    - BUG/MEDIUM: proxy/sktable: prevent watchdog trigger on soft-stop
    - BUG/MINOR: backend: make be_usable_srv() consistent when stopping
    - BUG/MINOR: ssl: Remove dead code in cli_parse_update_ocsp_response
    - BUG/MINOR: ssl: Fix potential leak in cli_parse_update_ocsp_response
    - BUG/MINOR: ssl: ssl-(min|max)-ver parameter not duplicated for bundles in crt-list
    - BUG/MINOR: quic: Wrong use of now_ms timestamps (cubic algo)
    - MINOR: quic: Add recovery related information to "show quic"
    - BUG/MINOR: quic: Wrong use of now_ms timestamps (newreno algo)
    - BUG/MINOR: quic: Missing max_idle_timeout initialization for the connection
    - MINOR: quic: Implement cubic state trace callback
    - MINOR: quic: Adjustments for generic control congestion traces
    - MINOR: quic: Traces adjustments at proto level.
    - MEDIUM: quic: Ack delay implementation
    - BUG/MINOR: quic: Wrong rtt variance computing
    - MINOR: cli: support filtering on FD types in "show fd"
    - MINOR: quic: Add a fake congestion control algorithm named "nocc"
    - CI: run smoke tests on config syntax to check memory related issues
    - CLEANUP: assorted typo fixes in the code and comments
    - CI: exclude doc/{design-thoughts,internals} from spell check
    - BUG/MINOR: quic: Remaining useless statements in cubic slow start callback
    - BUG/MINOR: quic: Cubic congestion control window may wrap
    - MINOR: quic: Add missing traces in cubic algorithm implementation
    - BUG/MAJOR: quic: Congestion algorithms states shared between the connection
    - BUG/MINOR: ssl: Undefined reference when building with OPENSSL_NO_DEPRECATED
    - BUG/MINOR: quic: Remove useless BUG_ON() in newreno and cubic algo implementation
    - MINOR: http-act: emit a warning when a header field name contains forbidden chars
    - DOC: config: strict-sni allows to start without certificate
    - MINOR: quic: Add trace to debug idle timer task issues
    - BUG/MINOR: quic: Unexpected connection closures upon idle timer task execution
    - BUG/MINOR: quic: Wrong idle timer expiration (during 20s)
    - BUILD: quic: 32bits compilation issue in cli_io_handler_dump_quic()
    - BUG/MINOR: quic: Possible wrong PTO computing
    - BUG/MINOR: tcpcheck: Be able to expect an empty response
    - BUG/MEDIUM: stconn: Add a missing return statement in sc_app_shutr()
    - BUG/MINOR: stream: Fix test on channels flags to set clientfin/serverfin touts
    - MINOR: applet: Uninline appctx_free()
    - MEDIUM: applet/trace: Register a new trace source with its events
    - CLEANUP: stconn: Remove remaining debug messages
    - BUG/MEDIUM: channel: Improve reports for shut in co_getblk()
    - BUG/MEDIUM: dns: Properly handle error when a response consumed
    - MINOR: stconn: Remove unecessary test on SE_FL_EOS before receiving data
    - MINOR: stconn/channel: Move CF_READ_DONTWAIT into the SC and rename it
    - MINOR: stconn/channel: Move CF_SEND_DONTWAIT into the SC and rename it
    - MINOR: stconn/channel: Move CF_NEVER_WAIT into the SC and rename it
    - MINOR: stconn/channel: Move CF_EXPECT_MORE into the SC and rename it
    - MINOR: mux-pt: Report end-of-input with the end-of-stream after a read
    - BUG/MINOR: mux-h1: Properly report EOI/ERROR on read0 in h1_rcv_pipe()
    - CLEANUP: mux-h1/mux-pt: Remove useless test on SE_FL_SHR/SE_FL_SHW flags
    - MINOR: mux-h1: Report an error to the SE descriptor on truncated message
    - MINOR: stconn: Always ack EOS at the end of sc_conn_recv()
    - MINOR: stconn/applet: Handle EOI in the applet .wake callback function
    - MINOR: applet: No longer set EOI on the SC
    - MINOR: stconn/applet: Handle EOS in the applet .wake callback function
    - MEDIUM: cache: Use the sedesc to report and detect end of processing
    - MEDIUM: cli: Use the sedesc to report and detect end of processing
    - MINOR: dns: Remove the test on the opposite SC state to send requests
    - MEDIUM: dns: Use the sedesc to report and detect end of processing
    - MEDIUM: spoe: Use the sedesc to report and detect end of processing
    - MEDIUM: hlua/applet: Use the sedesc to report and detect end of processing
    - MEDIUM: log: Use the sedesc to report and detect end of processing
    - MEDIUM: peers: Use the sedesc to report and detect end of processing
    - MINOR: sink: Remove the tests on the opposite SC state to process messages
    - MEDIUM: sink: Use the sedesc to report and detect end of processing
    - MEDIUM: stats: Use the sedesc to report and detect end of processing
    - MEDIUM: promex: Use the sedesc to report and detect end of processing
    - MEDIUM: http_client: Use the sedesc to report and detect end of processing
    - MINOR: stconn/channel: Move CF_EOI into the SC and rename it
    - MEDIUM: tree-wide: Move flags about shut from the channel to the SC
    - MINOR: tree-wide: Simplifiy some tests on SHUT flags by accessing SCs directly
    - MINOR: stconn/applet: Add BUG_ON_HOT() to be sure SE_FL_EOS is never set alone
    - MINOR: server: add SRV_F_DELETED flag
    - BUG/MINOR: server/del: fix srv->next pointer consistency
    - BUG/MINOR: stats: properly handle server stats dumping resumption
    - BUG/MINOR: sink: free forward_px on deinit()
    - BUG/MINOR: log: free log forward proxies on deinit()
    - MINOR: server: always call ssl->destroy_srv when available
    - MINOR: server: correctly free servers on deinit()
    - BUG/MINOR: hlua: hook yield does not behave as expected
    - MINOR: hlua: properly handle hlua_process_task HLUA_E_ETMOUT
    - BUG/MINOR: hlua: enforce proper running context for register_x functions
    - MINOR: hlua: Fix two functions that return nothing useful
    - MEDIUM: hlua: Dynamic list of frontend/backend in Lua
    - MINOR: hlua_fcn: alternative to old proxy and server attributes
    - MEDIUM: hlua_fcn: dynamic server iteration and indexing
    - MEDIUM: hlua_fcn/api: remove some old server and proxy attributes
    - CLEANUP: hlua: fix conflicting comment in hlua_ctx_destroy()
    - MINOR: hlua: add simple hlua reference handling API
    - MINOR: hlua: fix return type for hlua_checkfunction() and hlua_checktable()
    - BUG/MINOR: hlua: fix reference leak in core.register_task()
    - BUG/MINOR: hlua: fix reference leak in hlua_post_init_state()
    - BUG/MINOR: hlua: prevent function and table reference leaks on errors
    - CLEANUP: hlua: use hlua_ref() instead of luaL_ref()
    - CLEANUP: hlua: use hlua_pushref() instead of lua_rawgeti()
    - CLEANUP: hlua: use hlua_unref() instead of luaL_unref()
    - MINOR: hlua: simplify lua locking
    - BUG/MEDIUM: hlua: prevent deadlocks with main lua lock
    - MINOR: hlua_fcn: add server->get_rid() method
    - MINOR: hlua: support for optional arguments to core.register_task()
    - DOC: lua: silence "literal block ends without a blank line" Sphinx warnings
    - DOC: lua: silence "Unexpected indentation" Sphinx warnings
    - BUG/MINOR: event_hdl: fix rid storage type
    - BUG/MINOR: event_hdl: make event_hdl_subscribe thread-safe
    - MINOR: event_hdl: global sublist management clarification
    - BUG/MEDIUM: event_hdl: clean soft-stop handling
    - BUG/MEDIUM: event_hdl: fix async data refcount issue
    - MINOR: event_hdl: normal tasks support for advanced async mode
    - MINOR: event_hdl: add event_hdl_async_equeue_isempty() function
    - MINOR: event_hdl: add event_hdl_async_equeue_size() function
    - MINOR: event_hdl: pause/resume for subscriptions
    - MINOR: proxy: add findserver_unique_id() and findserver_unique_name()
    - MEDIUM: hlua/event_hdl: initial support for event handlers
    - MINOR: hlua/event_hdl: per-server event subscription
    - EXAMPLES: add basic event_hdl lua example script
    - MINOR: http-ana: Add a HTTP_MSGF flag to state the Expect header was checked
    - BUG/MINOR: http-ana: Don't switch message to DATA when waiting for payload
    - BUG/MINOR: quic: Possible crashes in qc_idle_timer_task()
    - MINOR: quic: derive first DCID from client ODCID
    - MINOR: quic: remove ODCID dedicated tree
    - MINOR: quic: remove address concatenation to ODCID
    - BUG/MINOR: mworker: unset more internal variables from program section
    - BUG/MINOR: errors: invalid use of memprintf in startup_logs_init()
    - MINOR: applet: Use unsafe version to get stream from SC in the trace function
    - BUG/MUNOR: http-ana: Use an unsigned integer for http_msg flags
    - MINOR: compression: Make compression offload a flag
    - MINOR: compression: Prepare compression code for request compression
    - MINOR: compression: Store algo and type for both request and response
    - MINOR: compression: Count separately request and response compression
    - MEDIUM: compression: Make it so we can compress requests as well.
    - BUG/MINOR: lua: remove incorrect usage of strncat()
    - CLEANUP: tcpcheck: remove the only occurrence of sprintf() in the code
    - CLEANUP: ocsp: do no use strpcy() to copy a path!
    - CLEANUP: tree-wide: remove strpcy() from constant strings
    - CLEANUP: opentracing: remove the last two occurrences of strncat()
    - BUILD: compiler: fix __equals_1() on older compilers
    - MINOR: compiler: define a __attribute__warning() macro
    - BUILD: bug.h: add a warning in the base API when unsafe functions are used
    - BUG/MEDIUM: listeners: Use the right parameters for strlcpy2().
2023-04-08 17:38:39 +02:00
Aurelien DARRAGON
b289fd1420 MINOR: event_hdl: normal tasks support for advanced async mode
advanced async mode (EVENT_HDL_ASYNC_TASK) provided full support for
custom tasklets registration.

Due to the similarities between tasks and tasklets, it may be useful
to use the advanced mode with an existing task (not a tasklet).
While the API did not explicitly disallow this usage, things would
get bad if we try to wakeup a task using tasklet_wakeup() for notifying
the task about new events.

To make the API support both custom tasks and tasklets, we use the
TASK_IS_TASKLET() macro to call the proper waking function depending
on the task's type:

  - For tasklets: we use tasklet_wakeup()
  - For tasks: we use task_wakeup()

If 68e692da0 ("MINOR: event_hdl: add event handler base api")
is being backported, then this commit should be backported with it.
2023-04-05 08:58:17 +02:00
Christopher Faulet
c960a3b60f BUG/MINOR: pool/stats: Use ullong to report total pool usage in bytes in stats
The same change was already performed for the cli. The stats applet and the
prometheus exporter are also concerned. Both use the stats API and rely on
pool functions to get total pool usage in bytes. pool_total_allocated() and
pool_total_used() must return 64 bits unsigned integer to avoid any wrapping
around 4G.

This may be backported to all versions.
2022-12-22 13:46:21 +01:00
Willy Tarreau
284cfc67b8 MINOR: pool: make the thread-local hot cache size configurable
Till now it was only possible to change the thread local hot cache size
at build time using CONFIG_HAP_POOL_CACHE_SIZE. But along benchmarks it
was sometimes noticed a huge contention in the lower level memory
allocators indicating that larger caches could be beneficial, especially
on machines with large L2 CPUs.

Given that the checks against this value was no longer on a hot path
anymore, there was no reason for continuing to force it to be tuned at
build time. So this patch allows to set it by tune.memory-hot-size.

It's worth noting that during the boot phase the value remains zero so
that it's possible to know if the value was set or not, which opens the
possibility that we try to automatically adjust it based on the per-cpu
L2 cache size or the use of certain protocols (none of this is done yet).
2022-12-20 14:51:12 +01:00
Willy Tarreau
9192d20f02 MINOR: pools: make DEBUG_UAF a runtime setting
Since the massive pools cleanup that happened in 2.6, the pools
architecture was made quite more hierarchical and many alternate code
blocks could be moved to runtime flags set by -dM. One of them had not
been converted by then, DEBUG_UAF. It's not much more difficult actually,
since it only acts on a pair of functions indirection on the slow path
(OS-level allocator) and a default setting for the cache activation.

This patch adds the "uaf" setting to the options permitted in -dM so
that it now becomes possible to set or unset UAF at boot time without
recompiling. This is particularly convenient, because every 3 months on
average, developers ask a user to recompile haproxy with DEBUG_UAF to
understand a bug. Now it will not be needed anymore, instead the user
will only have to disable pools and enable uaf using -dMuaf. Note that
-dMuaf only disables previously enabled pools, but it remains possible
to re-enable caching by specifying the cache after, like -dMuaf,cache.
A few tests with this mode show that it can be an interesting combination
which catches significantly less UAF but will do so with much less
overhead, so it might be compatible with some high-traffic deployments.

The change is very small and isolated. It could be helpful to backport
this at least to 2.7 once confirmed not to cause build issues on exotic
systems, and even to 2.6 a bit later as this has proven to be useful
over time, and could be even more if it did not require a rebuild. If
a backport is desired, the following patches are needed as well:

  CLEANUP: pools: move the write before free to the uaf-only function
  CLEANUP: pool: only include pool-os from pool.c not pool.h
  REORG: pool: move all the OS specific code to pool-os.h
  CLEANUP: pools: get rid of CONFIG_HAP_POOLS
  DEBUG: pool: show a few examples in -dMhelp
2022-12-08 18:54:59 +01:00
Ilya Shipitsin
5fa29b8a74 CLEANUP: assorted typo fixes in the code and comments
This is 34th iteration of typo fixes
2022-12-07 09:08:18 +01:00
Aurelien DARRAGON
d06b9c8b99 DOC/MINOR: api: add documentation for event_hdl feature
This is an initial work for the dedicated
event handler API internal documentation.

The file is located at doc/internals/api/event_hdl.txt

event_hdl feature has been introduced with:
	MINOR: event_hdl: add event handler base api
2022-12-02 09:40:52 +01:00
Ilya Shipitsin
4a689dad03 CLEANUP: assorted typo fixes in the code and comments
This is 32nd iteration of typo fixes
2022-10-30 17:17:56 +01:00
Ilya Shipitsin
3b64a28e15 CLEANUP: assorted typo fixes in the code and comments
This is 31st iteration of typo fixes
2022-08-06 17:12:51 +02:00
Willy Tarreau
eed3911a54 MINOR: task: replace task_set_affinity() with task_set_thread()
The latter passes a thread ID instead of a mask, making the code simpler.
2022-07-01 19:15:14 +02:00
Willy Tarreau
de5b33e339 DOC: internal: add a description of the stream connectors and descriptors
The "layers" mini-doc shows how streams, stconn, sedesc, conns, applets
and muxes interact, with field names, pointers and invariants. It should
be completed but already provides a quick overview about what can be
guaranteed at any step and at different layers.
2022-05-27 19:34:36 +02:00
Willy Tarreau
8f7133e242 DOC: internal: document the new cleaner approach to the appctx
It explains the problems with the previous union, the temporary state
for the transition between 2.6 and 2.7, and how to perform the changes.
2022-05-06 18:33:49 +02:00
William Lallemand
b53eb8790e MINOR: init: add the pre-check callback
This adds a call to function <fct> to the list of functions to be called at
the step just before the configuration validity checks. This is useful when you
need to create things like it would have been done during the configuration
parsing and where the initialization should continue in the configuration
check.
It could be used for example to generate a proxy with multiple servers using
the configuration parser itself. At this step the trash buffers are allocated.
Threads are not yet started so no protection is required. The function is
expected to return non-zero on success, or zero on failure. A failure will make
the process emit a succinct error message and immediately exit.
2022-04-22 15:45:47 +02:00
Willy Tarreau
0722d5d58e DOC: internal: update the pools API to mention boot-time settings
These ones are useful for debugging and must be mentionned in the
API doc.
2022-02-24 08:58:04 +01:00
Willy Tarreau
3ebe4d989c MEDIUM: initcall: move STG_REGISTER earlier
The STG_REGISTER init level is used to register known keywords and
protocol stacks. It must be called earlier because some of the init
code already relies on it to be known. For example, "haproxy -vv"
for now is constrained to start very late only because of this.

This patch moves it between STG_LOCK and STG_ALLOC, which is fine as
it's used for static registration.
2022-02-23 17:11:33 +01:00
Willy Tarreau
af580f659c MINOR: pools: disable redundant poisonning on pool_free()
The poisonning performed on pool_free() used to help a little bit with
use-after-free detection, but usually did more harm than good in that
it was never possible to perform post-mortem analysis on released
objects once poisonning was enabled on allocation. Now that there is
a dedicated DEBUG_POOL_INTEGRITY, let's get rid of this annoyance
which is not even documented in the management manual.
2022-02-23 17:11:33 +01:00
Willy Tarreau
add43fa43e DEBUG: pools: add new build option DEBUG_POOL_TRACING
This new option, when set, will cause the callers of pool_alloc() and
pool_free() to be recorded into an extra area in the pool that is expected
to be helpful for later inspection (e.g. in core dumps). For example it
may help figure that an object was released to a pool with some sub-fields
not yet released or that a use-after-free happened after releasing it,
with an immediate indication about the exact line of code that released
it (possibly an error path).

This only works with the per-thread cache, and even objects refilled from
the shared pool directly into the thread-local cache will have a NULL
there. That's not an issue since these objects have not yet been freed.
It's worth noting that pool_alloc_nocache() continues not to set any
caller pointer (e.g. when the cache is empty) because that would require
a possibly undesirable API change.

The extra cost is minimal (one pointer per object) and this completes
well with DEBUG_POOL_INTEGRITY.
2022-01-24 16:40:48 +01:00
Willy Tarreau
0575d8fd76 DEBUG: pools: add new build option DEBUG_POOL_INTEGRITY
When enabled, objects picked from the cache are checked for corruption
by comparing their contents against a pattern that was placed when they
were inserted into the cache. Objects are also allocated in the reverse
order, from the oldest one to the most recent, so as to maximize the
ability to detect such a corruption. The goal is to detect writes after
free (or possibly hardware memory corruptions). Contrary to DEBUG_UAF
this cannot detect reads after free, but may possibly detect later
corruptions and will not consume extra memory. The CPU usage will
increase a bit due to the cost of filling/checking the area and for the
preference for cold cache instead of hot cache, though not as much as
with DEBUG_UAF. This option is meant to be usable in production.
2022-01-21 19:07:48 +01:00
Willy Tarreau
b64ef3e3f8 DOC: internals: document the pools architecture and API
The purpose here is to explain how memory pools work, what their
architecture is depending on the build options (4 possible combinations),
and how the various build options affect their behavior.

Two pool-specific macros that were previously documented in initcalls
were moved to pools.txt.
2022-01-11 14:51:41 +01:00
Ilya Shipitsin
5e87bcf870 CLEANUP: assorted typo fixes in the code and comments This is 29th iteration of typo fixes 2022-01-03 14:40:58 +01:00
Willy Tarreau
6232d11626 DOC: internals: document the scheduler API
Another non-trivial part that is often needed. Exported functions
and flags available to applications were documented as well as some
restrictions and falltraps.
2021-11-18 11:27:30 +01:00
Willy Tarreau
946ebbb0bd DOC: internals: document the list API
This one is aging and not always trivial. Let's indicate what the
macros do and what traps not to fall into.
2021-11-17 15:30:53 +01:00
Willy Tarreau
bc84657410 DOC: internals: document the IST API
This one was missing. It should be easier to use now. It is obvious that
some functions are missing, and it looks like ist2str() and istpad() are
exactly the same.
2021-11-08 16:50:48 +01:00
Willy Tarreau
08d3220de5 [RELEASE] Released version 2.5-dev13
Released version 2.5-dev13 with the following main changes :
    - SCRIPTS: git-show-backports: re-enable file-based filtering
    - MINOR: jwt: Make invalid static JWT algorithms an error in `jwt_verify` converter
    - MINOR: mux-h2: add trace on extended connect usage
    - BUG/MEDIUM: mux-h2: reject upgrade if no RFC8441 support
    - MINOR: stream/mux: implement websocket stream flag
    - MINOR: connection: implement function to update ALPN
    - MINOR: connection: add alternative mux_ops param for conn_install_mux_be
    - MEDIUM: server/backend: implement websocket protocol selection
    - MINOR: server: add ws keyword
    - BUG/MINOR: resolvers: fix sent messages were counted twice
    - BUG/MINOR: resolvers: throw log message if trash not large enough for query
    - MINOR: resolvers/dns: split dns and resolver counters in dns_counter struct
    - MEDIUM: resolvers: rename dns extra counters to resolvers extra counters
    - BUG/MINOR: jwt: Fix jwt_parse_alg incorrectly returning JWS_ALG_NONE
    - DOC: add QUIC instruction in INSTALL
    - CLEANUP: halog: Remove dead stores
    - DEV: coccinelle: Add ha_free.cocci
    - CLEANUP: Apply ha_free.cocci
    - DEV: coccinelle: Add rule to use `istnext()` where possible
    - CLEANUP: Apply ist.cocci
    - REGTESTS: Use `feature cmd` for 2.5+ tests (2)
    - DOC: internals: move some API definitions to an "api" subdirectory
    - MINOR: quic: Allocate listener RX buffers
    - CLEANUP: quic: Remove useless code
    - MINOR: quic: Enhance the listener RX buffering part
    - MINOR: quic: Remove a useless lock for CRYPTO frames
    - MINOR: quic: Use QUIC_LOCK QUIC specific lock label.
    - MINOR: backend: Get client dst address to set the server's one only if needful
    - MINOR: compression: Warn for 'compression offload' in defaults sections
    - MEDIUM: connection: rename fc_conn_err and bc_conn_err to fc_err and bc_err
    - DOC: configuration: move the default log formats to their own section
    - MINOR: ssl: make the ssl_fc_sni() sample-fetch function always available
    - MEDIUM: log: add the client's SNI to the default HTTPS log format
    - DOC: config: add an example of reasonably complete error-log-format
    - DOC: config: move error-log-format before custom log format
2021-11-06 09:25:57 +01:00
Willy Tarreau
a62f184d3d DOC: internals: move some API definitions to an "api" subdirectory
It's not always easy to figure that there are some docs on internal API
stuff, let's move them to their own directory. There's a diagram for
lists that could be placed there but instead would deserve a greppable
description for quick lookups, so it was not moved there.
2021-11-05 11:53:22 +01:00