haproxy

mirror of http://git.haproxy.org/git/haproxy.git/ synced 2025-01-03 02:32:03 +00:00

Author	SHA1	Message	Date
Aurelien DARRAGON	3a81e997ac	MINOR: event_hdl: global sublist management clarification event_hdl_sub_list_init() and event_hdl_sub_list_destroy() don't expect to be called with a NULL argument (to use global subscription list implicitly), simply because the global subscription list init and destroy is internally managed. Adding BUG_ON() to detect such invalid usages, and updating some comments to prevent confusion around these functions. If `68e692da0` ("MINOR: event_hdl: add event handler base api") is being backported, then this commit should be backported with it.	2023-04-05 08:58:17 +02:00
Aurelien DARRAGON	d514ca45c6	BUG/MINOR: event_hdl: make event_hdl_subscribe thread-safe List insertion in event_hdl_subscribe() was not thread-safe when dealing with unique identifiers. Indeed, in this case the list insertion is conditional (we check for a duplicate, then we insert). And while we're using mt lists for this, the whole operation is not atomic: there is a race between the check and the insertion. This could lead to the same ID being registered multiple times with concurrent calls to event_hdl_subscribe() on the same ID. To fix this, we add a dedicated 'insert_lock' lock in the subscription list struct. The lock's cost is nearly 0 since it is only used when registering identified subscriptions and the lock window is very short: we only guard the duplicate check and the list insertion to make the conditional insertion "atomic" within a given subscription list. This is the only place where we need the lock: as soon as the item is properly inserted we're out of trouble because all other operations on the list are already thread-safe thanks to mt lists. A new lock hint is introduced: LOCK_EHDL, which is dedicated to event_hdl. The patch may seem quite large since we had to rework the logic around the subscribe function and switch from simple mt_list to a dedicated struct wrapping both the mt_list and the insert_lock for the event_hdl_sub_list type. (sizeof(event_hdl_sub_list) is now 24 instead of 16) However, all the changes are internal: we don't break the API. If `68e692da0` ("MINOR: event_hdl: add event handler base api") is being backported, then this commit should be backported with it.	2023-04-05 08:58:17 +02:00
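The check-then-insert race described in d514ca45c6 can be illustrated with a minimal sketch. It is not HAProxy's code: `struct sub`, `struct sub_list`, `sub_list_init()` and `sub_list_insert_unique()` are hypothetical names, the list is a plain singly-linked list rather than an mt_list, and a C11 `atomic_flag` spinlock stands in for the real 'insert_lock'. It only shows the idea of guarding the duplicate check and the insertion as one unit.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified subscription entry (not the event_hdl types). */
struct sub {
    uint64_t id;
    struct sub *next;
};

struct sub_list {
    struct sub *head;
    atomic_flag insert_lock; /* guards only the "check + insert" pair */
};

void sub_list_init(struct sub_list *l)
{
    l->head = NULL;
    atomic_flag_clear(&l->insert_lock);
}

/* Returns 1 if <s> was inserted, 0 if its id was already registered. */
int sub_list_insert_unique(struct sub_list *l, struct sub *s)
{
    struct sub *cur;
    int inserted = 1;

    /* take the lock: spin until the flag was previously clear */
    while (atomic_flag_test_and_set_explicit(&l->insert_lock, memory_order_acquire))
        ;

    /* duplicate check and insertion happen inside the same critical section */
    for (cur = l->head; cur; cur = cur->next) {
        if (cur->id == s->id) {
            inserted = 0;
            break;
        }
    }
    if (inserted) {
        s->next = l->head;
        l->head = s;
    }

    atomic_flag_clear_explicit(&l->insert_lock, memory_order_release);
    return inserted;
}
```

Registering the same id twice through `sub_list_insert_unique()` succeeds only the first time, whichever thread gets there first.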
Aurelien DARRAGON	53eb6aecce	BUG/MINOR: event_hdl: fix rid storage type rid is stored as a uint32_t within struct server, but it was stored as a signed int within the server event data struct. Switching from signed int to uint32_t in event_hdl_cb_data_server struct to make sure it won't overflow. If `129ecf441` ("MINOR: server/event_hdl: add support for SERVER_ADD and SERVER_DEL events") is being backported, then this commit should be backported with it.	2023-04-05 08:58:17 +02:00
Thierry Fournier	1edf36a369	MEDIUM: hlua_fcn: dynamic server iteration and indexing This patch proposes to enumerate servers using internal HAProxy list. Also, remove the flag SRV_F_NON_PURGEABLE which makes the server non purgeable each time Lua uses the server. Removing reg-tests/cli_delete_server_lua.vtc since this test is no longer relevant (we don't set the SRV_F_NON_PURGEABLE flag anymore) and we already have a more generic test: reg-tests/server/cli_delete_server.vtc Co-authored-by: Aurelien DARRAGON <adarragon@haproxy.com>	2023-04-05 08:58:16 +02:00
Thierry Fournier	467913c84e	MEDIUM: hlua: Dynamic list of frontend/backend in Lua When HAProxy is loaded with a lot of frontends/backends (tested with 300k), it is slow to start and it uses a lot of memory just for indexing backends in the Lua tables. This patch uses the internal frontend/backend index of HAProxy in place of the Lua table. HAProxy startup is now quicker as each frontend/backend object is created on demand and not at init. This comes at a cost: the execution of Lua will be a little bit slower.	2023-04-05 08:58:16 +02:00
Thierry Fournier	599f2311a8	MINOR: hlua: Fix two functions that return nothing useful Two Lua init functions seem to return something useful, but this is not the case. The function "hlua_concat_init" seems to return a failure status, but the function never fails. The function "hlua_fcn_reg_core_fcn" seems to return a number of elements in the stack, but this is not the case.	2023-04-05 08:58:16 +02:00
Aurelien DARRAGON	f175b08bfb	BUG/MINOR: server/del: fix srv->next pointer consistency We recently discovered a bug which affects dynamic server deletion: When a server is deleted, it is removed from the "visible" server list. But as we've seen in previous commit ("MINOR: server: add SRV_F_DELETED flag"), it can still be accessed by someone who keeps a reference on it (waiting for the final srv_drop()). Throughout this transient state, server ptr is still valid (may be dereferenced) and the flag SRV_F_DELETED is set. However, as the server is not part of server list anymore, we have an issue: srv->next pointer won't be updated anymore as the only place where we perform such update is in cli_parse_delete_server() by iterating over the "visible" server list. Because of this, we cannot guarantee that a server with the SRV_F_DELETED flag has a valid 'next' ptr: 'next' could be pointing to a fully removed (already freed) server. This problem can be easily demonstrated with server dumping in the stats: server list dumping is performed in stats_dump_proxy_to_buffer() The function can be interrupted and resumed later by design. ie: output buffer is full: partial dump and finish the dump after the flush This is implemented by calling srv_take() on the server being dumped, and only releasing it when we're done with it using srv_drop(). (drop can be delayed after function resume if buffer is full) While the function design seems OK, it works with the assumption that srv->next will still be valid after the function resumes, which is not true. (especially if multiple servers are being removed in between the 2 dumping attempts) In practice, this did not cause any crash yet (at least this was not reported so far), because server dumping is so fast that it is very unlikely that multiple server deletions make their way between 2 dumping attempts in most setups. 
But still, this is a problem that we need to address because some upcoming work might depend on this assumption as well and for the moment it is not safe at all. ======================================================================== Here is a quick reproducer: With this patch, we're creating a large deletion window of 3s as soon as we reach a server named "t2" while iterating over the list. This will give us plenty of time to perform multiple deletions before the function is resumed. \| diff --git a/src/stats.c b/src/stats.c \| index 84a4f9b6e..15e49b4cd 100644 \| --- a/src/stats.c \| +++ b/src/stats.c \| @@ -3189,11 +3189,24 @@ int stats_dump_proxy_to_buffer(struct stconn sc, struct htx htx, \| * Temporarily increment its refcount to prevent its \| * anticipated cleaning. Call free_server to release it. \| / \| + struct server orig = ctx->obj2; \| for (; ctx->obj2 != NULL; \| ctx->obj2 = srv_drop(sv)) { \| \| sv = ctx->obj2; \| + printf("sv = %s\n", sv->id); \| srv_take(sv); \| + if (!strcmp("t2", sv->id) && orig == px->srv) { \| + printf("deletion window: 3s\n"); \| + thread_idle_now(); \| + thread_harmless_now(); \| + sleep(3); \| + thread_harmless_end(); \| + \| + thread_idle_end(); \| + \| + goto full; /* simulate full buffer / \| + } \| \| if (htx) { \| if (htx_almost_full(htx)) \| @@ -4353,6 +4366,7 @@ static void http_stats_io_handler(struct appctx appctx) \| struct channel res = sc_ic(sc); \| struct htx req_htx, res_htx; \| \| + printf("http dump\n"); \| / only proxy stats are available via http / \| ctx->domain = STATS_DOMAIN_PROXY; \| Ok, we're ready, now we start haproxy with the following conf: global stats socket /tmp/ha.sock mode 660 level admin expose-fd listeners thread 1-1 nbthread 2 frontend stats mode http bind :8081 thread 2-2 stats enable stats uri / backend farm server t1 127.0.0.1:1899 disabled server t2 127.0.0.1:18999 disabled server t3 127.0.0.1:18998 disabled server t4 127.0.0.1:18997 disabled And finally, we execute the following 
script: curl localhost:8081/stats& sleep .2 echo "del server farm/t2" \| nc -U /tmp/ha.sock echo "del server farm/t3" \| nc -U /tmp/ha.sock This should be enough to reveal the issue, I easily manage to consistently crash haproxy with the following reproducer: http dump sv = t1 http dump sv = t1 sv = t2 deletion window = 3s [NOTICE] (2940566) : Server deleted. [NOTICE] (2940566) : Server deleted. http dump sv = t2 sv = ��U [1] 2940566 segmentation fault (core dumped) ./haproxy -f ttt.conf ======================================================================== To fix this, we add prev_deleted mt_list in server struct. For a given "visible" server, this list will contain the pending "deleted" servers references that point to it using their 'next' ptr. This way, whenever this "visible" server is going to be deleted via cli_parse_delete_server() it will check for servers in its 'prev_deleted' list and update their 'next' pointer so that they no longer point to it, and then it will push them in its 'next->prev_deleted' list to transfer the update responsibility to the next 'visible' server (if next != NULL). Then, following the same logic, the server about to be removed in cli_parse_delete_server() will push itself as well into its 'next->prev_deleted' list (if next != NULL) so that it may still use its 'next' ptr for the time it is in transient removal state. In srv_drop(), right before the server is finally freed, we make sure to remove it from the 'next->prev_deleted' list so that 'next' won't try to perform the pointers update for this server anymore. This has to be done atomically to prevent 'next' srv from accessing a purged server. As a result: for a valid server, either deleted or not, 'next' ptr will always point to a non deleted (ie: visible) server. With the proposed fix, and several removal combinations (including unordered cli_parse_delete_server() and srv_drop() calls), I cannot reproduce the crash anymore. 
Example tricky removal sequence that is now properly handled: sv list: t1,t2,t3,t4,t5,t6 ops: take(t2) del(t4) del(t3) del(t5) drop(t3) drop(t4) drop(t5) drop(t2)	2023-04-05 08:58:16 +02:00
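The 'prev_deleted' bookkeeping described in f175b08bfb can be modelled in a few lines. The sketch below is a single-threaded toy model, not the actual patch: `struct srv` here is a hypothetical reduced struct, the mt_list is replaced by a small fixed array, `freed` is a marker set instead of calling free(), and `srv_del()` stands in for the list manipulation done in cli_parse_delete_server().

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TRACKED 8 /* arbitrary bound for this model */

/* Hypothetical reduced server struct; not HAProxy's struct server. */
struct srv {
    const char *id;
    struct srv *next;      /* always a visible server or NULL */
    int refcount;
    int deleted;           /* plays the role of SRV_F_DELETED */
    int freed;             /* set instead of calling free() */
    struct srv *prev_deleted[MAX_TRACKED]; /* deleted servers whose next is us */
    int nb_prev_deleted;
};

static void track(struct srv *to, struct srv *s)
{
    assert(to->nb_prev_deleted < MAX_TRACKED);
    to->prev_deleted[to->nb_prev_deleted++] = s;
}

static void untrack(struct srv *from, struct srv *s)
{
    int i;

    for (i = 0; i < from->nb_prev_deleted; i++) {
        if (from->prev_deleted[i] == s) {
            from->prev_deleted[i] = from->prev_deleted[--from->nb_prev_deleted];
            return;
        }
    }
}

void srv_take(struct srv *s)
{
    s->refcount++;
}

/* Remove <s> from the visible list; it stays allocated until the last drop. */
void srv_del(struct srv **head, struct srv *s)
{
    struct srv **pp;
    int i;

    for (pp = head; *pp && *pp != s; pp = &(*pp)->next)
        ;
    if (!*pp)
        return; /* not in the visible list */
    *pp = s->next;

    /* re-point the deleted servers that referenced us, then hand them over */
    for (i = 0; i < s->nb_prev_deleted; i++) {
        s->prev_deleted[i]->next = s->next;
        if (s->next)
            track(s->next, s->prev_deleted[i]);
    }
    s->nb_prev_deleted = 0;

    s->deleted = 1;
    if (s->next)
        track(s->next, s); /* our own 'next' must stay valid too */
}

/* Release one reference; on the last one, stop being tracked and "free". */
void srv_drop(struct srv *s)
{
    if (--s->refcount > 0)
        return;
    if (s->deleted && s->next)
        untrack(s->next, s);
    s->freed = 1;
}
```

Replaying the sequence quoted in the commit message (take(t2), del(t4), del(t3), del(t5), then the four drops) on a list t1..t6 leaves t2's 'next' pointing at t6, which is still visible and not freed.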
Aurelien DARRAGON	75b9d1c041	MINOR: server: add SRV_F_DELETED flag Set the SRV_F_DELETED flag when server is removed from the cli. When removing a server from the cli (in cli_parse_delete_server()), we update the "visible" server list so that the removed server is no longer part of the list. However, despite the server being removed from "visible" server list, one could still access the server data from a valid ptr (ie: srv_take()) Deleted flag helps detecting when a server is in transient removal state: that is, removed from the list, thus not visible but not yet purged from memory.	2023-04-05 08:58:16 +02:00
Christopher Faulet	7faac7cf34	MINOR: tree-wide: Simplifiy some tests on SHUT flags by accessing SCs directly At many places, we simplify the tests on SHUT flags to remove calls to chn_prod() or chn_cons() function because the corresponding SC is available.	2023-04-05 08:57:06 +02:00
Christopher Faulet	87633c3a11	MEDIUM: tree-wide: Move flags about shut from the channel to the SC The purpose of this patch is only a one-to-one replacement, as far as possible. CF_SHUTR(_NOW) and CF_SHUTW(_NOW) flags are now carried by the stream-connector. The CF_ prefix is replaced by SC_FL_. Of course, it is not so simple because in many places we were testing whether a channel was shut for reads and writes at the same time. To do the same, shut for reads must be tested on the SC on one side and shut for writes on the opposite SC. Special care was taken with process_stream(): the flags of the SCs must be saved to be able to detect changes, just like for the channels.	2023-04-05 08:57:06 +02:00
Christopher Faulet	904763f562	MINOR: stconn/channel: Move CF_EOI into the SC and rename it The channel flag CF_EOI is renamed to SC_FL_EOI and moved into the stream-connector.	2023-04-05 08:57:06 +02:00
Christopher Faulet	84d3ef982c	MINOR: stconn/channel: Move CF_EXPECT_MORE into the SC and rename it The channel flag CF_EXPECT_MORE is renamed to SC_FL_SND_EXP_MORE and moved into the stream-connector.	2023-04-05 08:57:05 +02:00
Christopher Faulet	68ef218a72	MINOR: stconn/channel: Move CF_NEVER_WAIT into the SC and rename it The channel flag CF_NEVER_WAIT is renamed to SC_FL_SND_NEVERWAIT and moved into the stream-connector.	2023-04-05 08:57:05 +02:00
Christopher Faulet	5c281d58ea	MINOR: stconn/channel: Move CF_SEND_DONTWAIT into the SC and rename it The channel flag CF_SEND_DONTWAIT is renamed to SC_FL_SND_ASAP and moved into the stream-connector.	2023-04-05 08:57:05 +02:00
Christopher Faulet	9a790f63ed	MINOR: stconn/channel: Move CF_READ_DONTWAIT into the SC and rename it The channel flag CF_READ_DONTWAIT is renamed to SC_FL_RCV_ONCE and moved into the stream-connector.	2023-04-05 08:57:05 +02:00
Christopher Faulet	26e0935681	MEDIUM: applet/trace: Register a new trace source with its events Traces are now supported for applets. The first argument is always the appctx. This will help to debug applets.	2023-04-05 08:46:06 +02:00
Christopher Faulet	a5915eb1dd	MINOR: applet: Uninline appctx_free() This function is uninlined and moved to src/applet.c. This is required to add traces for applets.	2023-04-05 08:46:06 +02:00
Remi Tricot-Le Breton	26e1432436	BUG/MINOR: ssl: Undefined reference when building with OPENSSL_NO_DEPRECATED If OPENSSL_NO_DEPRECATED is set, we get a 'error: ‘RSA_PKCS1_PADDING’ undeclared' when building jwt.c. The symbol is not deprecated, we are just missing an include. This was raised in GitHub issue #2098. It does not need to be backported.	2023-04-03 11:46:54 +02:00
Frédéric Lécaille	7d6270a845	BUG/MAJOR: quic: Congestion algorithms states shared between the connection This very old bug has been there since the first implementation of the newreno congestion algorithm. It was a very bad idea to put a state variable into the quic_cc_algo struct, which only defines the congestion control algorithm used by a QUIC listener, typically its type and its callbacks. This bug could lead to crashes since BUG_ON() calls have been added to each algorithm implementation. This was revealed by interop tests, but not very often, as several connections were rarely run at the same time during these tests. Fortunately this was also reported by Tristan in GH #2095. Move the congestion algorithm state to the correct structures which are private to a connection (see cubic and nr structs). Must be backported to 2.7 and 2.6.	2023-04-02 13:10:13 +02:00
Ilya Shipitsin	07be66d21b	CLEANUP: assorted typo fixes in the code and comments This is 35th iteration of typo fixes	2023-04-01 18:33:40 +02:00
Frédéric Lécaille	db4bc6b4f3	MINOR: quic: Add a fake congestion control algorithm named "nocc" This algorithm does nothing except initializing the congestion control window to a fixed value. Very smart! Modify the QUIC congestion control configuration parser to support this new algorithm. The congestion control algorithm must be set as follows: quic-cc-algo nocc-<cc window size(KB)> For instance if "nocc-15" is provided as quic-cc-algo keyword value, this will set a fixed window of 15KB.	2023-03-31 17:09:03 +02:00
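As an illustration of the "nocc-<size>" syntax from db4bc6b4f3, here is a minimal parsing sketch. It is not the parser added by the commit: `parse_nocc()` is a hypothetical helper, and treating 1KB as 1024 bytes is an assumption of this example. Overflow of very large sizes is not handled.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parses "nocc-<KB>" and stores the window in bytes.
 * Returns 1 on success, 0 if <arg> does not match the expected form.
 */
int parse_nocc(const char *arg, unsigned long *cwnd_bytes)
{
    const char *prefix = "nocc-";
    size_t plen = strlen(prefix);
    unsigned long kb;
    char *end;

    if (strncmp(arg, prefix, plen) != 0)
        return 0;
    /* require at least one digit right after the prefix */
    if (arg[plen] < '0' || arg[plen] > '9')
        return 0;
    kb = strtoul(arg + plen, &end, 10);
    if (*end != '\0')
        return 0; /* trailing garbage after the number */
    *cwnd_bytes = kb * 1024UL; /* assumption: 1KB == 1024 bytes */
    return 1;
}
```

With these assumptions, "nocc-15" yields a fixed window of 15360 bytes, while "nocc-" or "nocc-15x" are rejected.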
Frédéric Lécaille	d721571d26	MEDIUM: quic: Ack delay implementation Reuse the idle timeout task to delay the acknowledgments. The time of the idle timer expiration is from now on stored in ->idle_expire. The one to trigger the acknowledgements is stored in ->ack_expire. Add the new QUIC_FL_CONN_ACK_TIMER_FIRED connection flag to mark a connection whose acknowledgement timer has been triggered. Modify qc_may_build_pkt() to prevent the sending of "ack only" packets and to allow the connection to send packets when the ack timer has fired. It is possible that acks are sent before the ack timer has triggered. In this case it is cancelled only if ACK frames are really sent. The idle timer expiration must be set again when the ack timer has been triggered or when it is cancelled. Must be backported to 2.7.	2023-03-31 13:41:17 +02:00
Frédéric Lécaille	8f991948f5	MINOR: quic: Traces adjustments at proto level. Dump the variables previously displayed by TRACE_ENTER() or TRACE_LEAVE() through calls to TRACE_PROTO() instead. The two former macros no longer display any variables. From now on, this information is accessible from the proto level. Add new calls to TRACE_PROTO() at important locations in relation with the QUIC transport protocol. When relevant, try to prefix such traces with the TX or RX keyword to identify the concerned subpart (transmission or reception) of the protocol. Must be backported to 2.7.	2023-03-31 09:54:59 +02:00
Frédéric Lécaille	acc9cfdf79	MINOR: quic: Adjustments for generic control congestion traces Display the elapsed time since packets were sent instead of the timestamp, which is not easy to read. Must be backported to 2.7.	2023-03-31 09:54:59 +02:00
Frédéric Lécaille	d7243318c4	BUG/MINOR: quic: Wrong use of now_ms timestamps (cubic algo) As now_ms may wrap, one must use the ticks API to protect the cubic congestion control algorithm implementation from side effects due to this. Furthermore to make the cubic congestion control algorithm more readable and easy to maintain, adding a new state ("in recovery period" QUIC_CC_ST_RP new enum) helps in reaching this goal. Implement quic_cc_cubic_rp_cb() which is the callback for this new state. Must be backported to 2.7 and 2.6.	2023-03-31 09:54:59 +02:00
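The wrapping issue mentioned in d7243318c4 comes from comparing millisecond timestamps directly. The sketch below shows the general wrap-safe technique (comparing through a signed difference), assuming 32-bit ticks that are less than 2^31 ms apart. `tick_before()` and `tick_elapsed()` are hypothetical helpers written for this example; they are not HAProxy's ticks API.

```c
#include <stdint.h>

/* Returns non-zero if tick <a> is strictly before tick <b>.
 * The unsigned subtraction wraps modulo 2^32, and reading the result as a
 * signed value gives the right order as long as both ticks are less than
 * 2^31 ms apart.
 */
int tick_before(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}

/* Elapsed milliseconds between <since> and <now>, valid across a wrap. */
uint32_t tick_elapsed(uint32_t since, uint32_t now)
{
    return now - since;
}
```

For example, with a timestamp taken 16 ms before the counter wraps and another taken 16 ms after, a plain `a < b` comparison gives the wrong order, whereas `tick_before()` still reports the first one as earlier and `tick_elapsed()` returns 32.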
Aurelien DARRAGON	7e64d8720e	BUG/MINOR: backend: make be_usable_srv() consistent when stopping When a proxy enters the STOPPED state, it will no longer accept new connections. However, it doesn't mean that it's completely inactive yet: it will still be able to handle already pending / keep-alive connections, thus finishing ongoing work before effectively stopping. be_usable_srv(), which is used by nbsrv converter and sample fetch, will return 0 if the proxy is either stopped or disabled. nbsrv behaves this way since it was originally implemented in `b7e7c4720` ("MINOR: Add nbsrv sample converter"). (Since then, multiple refactors were performed around this area, but the current implementation still follows the same logic) It was found that if nbsrv is used in a proxy section to perform routing logic, unexpected decisions are being made when nbsrv is used on a proxy with STOPPED state, since in-flight requests will suffer from nbsrv returning 0 instead of the current number of usable servers which may still process existing connections. For instance, this can happen during process soft-stop, or even when stopping the proxy from the cli / lua. To fix this: we now make sure be_usable_srv() always returns the current number of usable servers, unless the proxy is explicitly disabled (from the config, not at runtime) This could be backported up to 2.6. For older versions, the need for a backport should be evaluated first. -- Note for 2.4: proxy flags did not exist, it was implemented with fd10ab5e ("MINOR: proxy: Introduce proxy flags to replace disabled bitfield") For 2.2: STOPPED and DISABLED states were not separated, so we have no easy way to apply the fix anyway.	2023-03-31 07:45:08 +02:00
Martin DOLEZ	110e4a8733	MINOR: http_fetch: add case insensitive support for smp_fetch_url_param This commit adds a new argument to smp_fetch_url_param that makes the parameter key comparison case-insensitive. Several levels of callers were modified to pass this info.	2023-03-30 14:11:10 +02:00
Aurelien DARRAGON	2c5b9ded9b	CLEANUP: proxy: remove stop_time related dead code Since `eb77824` ("MEDIUM: proxy: remove the deprecated "grace" keyword"), stop_time is never set, so the related code in manage_proxy() is not relevant anymore. Removing code that refers to p->stop_time, since it was probably overlooked.	2023-03-28 20:26:47 +02:00
Frédéric Lécaille	c425e03b28	BUG/MINOR: quic: Missing STREAM frame type updated This patch follows this commit which was not sufficient: BUG/MINOR: quic: Missing STREAM frame data pointer updates Indeed, after updating the ->offset field, the bit which informs the frame builder of its presence must be systematically set. This bug was revealed by the following BUG_ON() from quic_build_stream_frame() : bug condition "!!(frm->type & 0x04) != !!stream->offset.key" matched at src/quic_frame.c:515 This should fix the last crash that occurred in GitHub issue #2074. Must be backported to 2.6 and 2.7.	2023-03-27 16:01:44 +02:00
Willy Tarreau	1751db140a	MINOR: pools: report a replaced memory allocator instead of just malloc_trim() Instead of reporting the inaccurate "malloc_trim() support" on -vv, let's report the case where the memory allocator was actively replaced from the one used at build time, as this is the corner case we want to be cautious about. We also put a tainted bit when this happens so that it's possible to detect it at run time (e.g. the user might have inherited it from an environment variable during a reload operation). The now unused is_trim_enabled() function was finally dropped.	2023-03-22 18:05:02 +01:00
Willy Tarreau	7aee683541	MINOR: pools: export trim_all_pools() This way it will be usable from outside instead of malloc_trim().	2023-03-22 17:30:28 +01:00
Willy Tarreau	eaba76b02d	MINOR: pools: intercept malloc_trim() instead of trying to plug holes As reported by Miroslav in commit `d8a97d8f6` ("BUG/MINOR: illegal use of the malloc_trim() function if jemalloc is used") there are still occasional cases where it's discovered that malloc_trim() is being used without its suitability being checked first. This is a problem when using another incompatible allocator. But there's a class of use cases we'll never be able to cover, it's dynamic libraries loaded from Lua. In order to address this more reliably, we now define our own malloc_trim() that calls the previous one after checking that the feature is supported and that the allocator is the expected one. This way child libraries that would call it will also be safe. The function is intentionally left defined all the time so that it will be possible to clean up some code that uses it by removing ifdefs.	2023-03-22 17:30:28 +01:00
Amaury Denoyelle	1d0ed1a2e9	BUG/MINOR: trace: fix hardcoded level for TRACE_PRINTF Level argument was not ignored by TRACE_PRINTF due to a hardcoded value of TRACE_LEVEL_DEVELOPER inside the macro. This must be backported up to 2.6.	2023-03-22 15:31:55 +01:00
Miroslav Zagorac	d8a97d8f60	BUG/MINOR: illegal use of the malloc_trim() function if jemalloc is used In the event that HAProxy is linked with the jemalloc library, it is still shown that malloc_trim() is enabled when executing "haproxy -vv": .. Support for malloc_trim() is enabled. .. It's not so much a problem as it is that malloc_trim() is called in the pat_ref_purge_range() function without any checking. This was solved by setting the using_default_allocator variable to the correct value in the detect_allocator() function and before calling malloc_trim() it is checked whether the function should be called.	2023-03-22 14:14:50 +01:00
Willy Tarreau	0de1e6180a	BUILD: thread: implement thread_harmless_end_sig() for threadless builds Building without thread support was broken in 2.8-dev2 with commit `7e70bfc8c` ("MINOR: threads: add a thread_harmless_end() version that doesn't wait") that forgot to define the function for the threadless cases. No backport is needed.	2023-03-22 10:40:06 +01:00
Willy Tarreau	69869e6354	MINOR: dynbuf: set POOL_F_NO_FAIL on buffer allocation b_alloc() is used to allocate a buffer. We can provoke fault injection based on forced memory allocation failures using -dMfail on the command line, but we know that the buffer_wait list is a bit weak and doesn't always recover well. As such, submitting buffer allocation to such a treatment seriously limits the usefulness of -dMfail which cannot really be used for other purposes. Let's just disable it for buffers for now.	2023-03-21 09:15:13 +01:00
Willy Tarreau	ac78c4fd9d	MINOR: ssl-sock: pass the CO_SFL_MSG_MORE info down the stack Despite having replaced the SSL BIOs to use our own raw_sock layer, we still didn't exploit the CO_SFL_MSG_MORE flag which is pretty useful to avoid sending incomplete packets. It's particularly important for SSL since the extra overhead almost guarantees that each send() will be followed by an incomplete (and often odd-sided) segment. We already have an xprt_st set of flags to pass info to the various layers, so let's just add a new one, SSL_SOCK_SEND_MORE, that is set or cleared during ssl_sock_from_buf() to transfer the knowledge of CO_SFL_MSG_MORE. This way we can recover this information and pass it to raw_sock. This alone is sufficient to increase by ~5-10% the H2 bandwidth over SSL when multiple streams are used in parallel.	2023-03-17 16:43:51 +01:00
Frédéric Lécaille	ca07979b97	BUG/MINOR: quic: Missing STREAM frame data pointer updates This patch follows this one which was not sufficient: "BUG/MINOR: quic: Missing STREAM frame length updates" Indeed, it is not sufficient to update the ->len and ->offset member of a STREAM frame to move it forward. The data pointer must also be updated. This is not done by the STREAM frame builder. Must be backported to 2.6 and 2.7.	2023-03-17 09:21:18 +01:00
Willy Tarreau	9824f8c890	MINOR: buffer: add br_single() to check if a buffer ring has more than one buf It's cheaper and cleaner than using br_count()==1 given that it just compares two indexes, and that a ring having a single buffer is in a special case where it is between empty and used up-to-1. In other words it's not congested.	2023-03-16 18:45:46 +01:00
Willy Tarreau	e5a26eb2de	MINOR: buffer: add br_count() to return the number of allocated bufs We have no way to know how many buffers are currently allocated in a buffer ring. Let's add br_count() for this.	2023-03-16 18:45:46 +01:00
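To make the distinction between br_count() (e5a26eb2de) and br_single() (9824f8c890) concrete, here is a toy model of a buffer ring. It is an assumption-laden sketch and does not reproduce HAProxy's buffer ring layout: `struct ring_model`, `ring_count()` and `ring_single()` are hypothetical, and the model assumes the ring always holds at least the slot being filled, so that "single" means the head and tail indexes are equal.

```c
#include <stddef.h>

/* Hypothetical ring model: <head> is the oldest slot in use, <tail> the slot
 * currently being filled, <size> the number of slots in the ring.
 */
struct ring_model {
    size_t head;
    size_t tail;
    size_t size;
};

/* Number of slots currently in use, counting the tail slot. */
size_t ring_count(const struct ring_model *r)
{
    return (r->tail + r->size - r->head) % r->size + 1;
}

/* Cheaper check for "exactly one slot in use": just compare two indexes. */
int ring_single(const struct ring_model *r)
{
    return r->head == r->tail;
}
```

In this model `ring_single(r)` is equivalent to `ring_count(r) == 1` but avoids the modulo arithmetic, which matches the motivation given in the commit message.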
Christopher Faulet	3a7b539b12	BUG/MEDIUM: connection: Preserve flags when a conn is removed from an idle list The commit `5e1b0e7bf` ("BUG/MEDIUM: connection: Clear flags when a conn is removed from an idle list") introduced a regression. CO_FL_SAFE_LIST and CO_FL_IDLE_LIST flags are used when the connection is released to properly decrement used/idle connection counters. If a connection is idle, these flags must be preserved until the connection is really released. It may be removed from the list but not immediately released. If these flags are lost when it is finally released, the current number of used connections is erroneously decremented. It means this counter may become negative and the counter tracking the number of idle connections is not decremented, suggesting a leak. So, the above commit is reverted and instead we improve a bit the way to detect an idle connection. The function conn_get_idle_flag() must now be used to know if a connection is in an idle list. It returns the connection flag corresponding to the idle list if the connection is idle (CO_FL_SAFE_LIST or CO_FL_IDLE_LIST) or 0 otherwise. But if the connection is scheduled to be removed, 0 is also returned, regardless of the connection flags. This new function is used when the connection is temporarily removed from the list to be used, mainly in muxes. This patch should fix #2078 and #2057. It must be backported as far as 2.2.	2023-03-16 15:34:20 +01:00
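The return contract described for conn_get_idle_flag() in 3a7b539b12 can be summarised with a small model. This is only a sketch of the behaviour stated in the commit message: `struct conn_model`, the `MODEL_FL_*` values and the `removal_scheduled` field are hypothetical stand-ins, and how the real function detects a scheduled removal is not shown here.

```c
/* Placeholder flag values for this model; not HAProxy's CO_FL_* values. */
#define MODEL_FL_SAFE_LIST 0x1u
#define MODEL_FL_IDLE_LIST 0x2u
#define MODEL_FL_OTHER     0x4u

/* Hypothetical reduced connection. */
struct conn_model {
    unsigned int flags;
    int removal_scheduled; /* stand-in for "scheduled to be removed" */
};

/* Returns the idle-list flag of <c>, or 0 if it is not idle or is about to
 * be removed. The flags themselves are left untouched so that the counters
 * can still be updated correctly at release time.
 */
unsigned int conn_model_idle_flag(const struct conn_model *c)
{
    if (c->removal_scheduled)
        return 0;
    return c->flags & (MODEL_FL_SAFE_LIST | MODEL_FL_IDLE_LIST);
}
```

The key point of the design is that the query is separated from the stored state: a caller may see 0 for a connection being removed while its flags remain set for the final accounting.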
Remi Tricot-Le Breton	a6c0a59e9a	MINOR: ssl: Use ocsp update task for "update ssl ocsp-response" command Instead of having a dedicated httpclient instance and its own code decorrelated from the actual auto update one, the "update ssl ocsp-response" will now use the update task in order to perform updates. Since the cli command allows to update responses that were never included in the auto update tree, a new flag was added to the certificate_ocsp structure so that the said entry can be inserted into the tree "by hand" and it won't be reinserted back into the tree after the update process is performed. The 'update_once' flag "stole" a bit from the 'fail_count' counter since it is the one less likely to reach UINT_MAX among the ocsp counters of the certificate_ocsp structure. This new logic required that every certificate_ocsp entry contained all the ocsp-related information at all time since entries that are not supposed to be configured automatically can still be updated through the cli. The logic of the ssl_sock_load_ocsp was changed accordingly.	2023-03-14 11:07:32 +01:00
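The "stolen bit" mentioned in a6c0a59e9a can be pictured with a C bit-field. The layout below is an assumption made for illustration: the 31/1 split, the struct name `ocsp_counters_model` and the helper `ocsp_model_record_failure()` are hypothetical and are not taken from the certificate_ocsp structure.

```c
/* Hypothetical layout: one bit of the failure counter's storage is reused
 * for the 'update_once' flag (assumes unsigned int is at least 32 bits).
 */
struct ocsp_counters_model {
    unsigned int fail_count:31;
    unsigned int update_once:1;
};

#define OCSP_MODEL_FAIL_MAX 0x7FFFFFFFu /* largest value fitting in 31 bits */

/* Count one more failure, saturating instead of wrapping back to 0. */
void ocsp_model_record_failure(struct ocsp_counters_model *c)
{
    if (c->fail_count < OCSP_MODEL_FAIL_MAX)
        c->fail_count++;
}
```

The counter loses half of its range, which is why the commit picks the counter least likely to reach UINT_MAX; updating it never disturbs the flag stored next to it.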
Willy Tarreau	8f6da64641	MINOR: quic_sock: un-statify quic_conn_sock_fd_iocb() This one is printed as the iocb in the "show fd" output, and arguably this wasn't very convenient as-is: 293 : st=0x000123(cl heopI W:sRa R:sRA) ref=0 gid=1 tmask=0x8 umask=0x0 prmsk=0x8 pwmsk=0x0 owner=0x7f488487afe0 iocb=0x50a2c0(main+0x60f90) Let's unstatify it and export it so that the symbol can now be resolved from the various points that need it.	2023-03-10 14:30:01 +01:00
William Lallemand	2078d4b1f7	BUG/MINOR: mworker: use MASTER_MAXCONN as default maxconn value In environments where SYSTEM_MAXCONN is defined when compiling, the master will use this value instead of the original minimal value which was set to 100. When this happens, the master process could allocate RAM excessively since it does not need to have a high maxconn. (For example if SYSTEM_MAXCONN was set to 100000 or more) This patch fixes the issue by using the new define MASTER_MAXCONN which defines a default maxconn of 100 for the master process. Must be backported as far as 2.5.	2023-03-09 14:28:44 +01:00
Willy Tarreau	cd8914bc52	BUG/MAJOR: fd/threads: close a race on closing connections after takeover As mentioned in commit `237e6a0d6` ("BUG/MAJOR: fd/thread: fix race between updates and closing FD"), a race was found during stress tests involving heavy backend connection reuse with many competing closes. Here the problem is complex. The analysis in commit `f69fea64e` ("MAJOR: fd: get rid of the DWCAS when setting the running_mask") that removed the DWCAS in 2.5 overlooked a few races. First, a takeover from thread1 could happen just after fd_update_events() in thread2 validates it holds the tmask bit in the CAS loop. Since thread1 releases running_mask after the operation, thread2 will succeed the CAS and both will believe the FD is theirs. This does explain the occasional crashes seen with h1_io_cb() being called on a bad context, or sock_conn_iocb() seeing conn->subs vanish after checking it. This issue can be addressed using a DWCAS in both fd_takeover() and fd_update_events() as it was before the patch above but this is not portable to all archs and is not easy to adapt for those lacking it, due to some operations still happening only on individual masks after the thread groups were added. Second, the checks after fd_clr_running() for the current thread being the last one is not sufficient: at the exact moment the operation completes, another thread may also set and drop the running bit and see itself as alone, and both can call _fd_close_orphan() in parallel. In order to prevent this from happening, we cannot rely on the absence of others, we need an explicit flag indicating that the FD must be closed. One approach that was attempted consisted in playing with the thread_mask but that was not reliable since it could still match between the late deletion and the early insertion that follows. Instead, a new FD flag was added, FD_MUST_CLOSE, that exactly indicates that the call to _fd_delete_orphan() must be done. 
It is set by fd_delete(), and atomically cleared by the first one which checks it, and which is the only one to call _fd_delete_orphan(). With both points addressed, there's no more visible race left: - takeover() only happens under the connection list's lock and cannot compete with fd_delete() since fd_delete() must first remove the connection from the list before deleting the FD. That's also why it doesn't need to call _fd_delete_orphan() when dropping its running bit. - takeover() sets its running bit then atomically replaces the thread mask, so that until that's done, it doesn't validate the condition to end the synchronization loop in fd_update_events(). Once it's OK, the previous thread's bit is lost, and this is checked for in fd_update_events() - fd_update_events() can compete with fd_delete() at various places which are explained above. Since fd_delete() clears the thread mask after setting its running bit and after setting the FD_MUST_CLOSE bit, the synchronization loop guarantees that the thread mask is seen before going further, and that once it's seen, the FD_MUST_CLOSE flag is already present. - fd_delete() may start while fd_update_events() has already started, but fd_delete() must hold a bit in thread_mask before starting, and that is checked by the first test in fd_update_events() before setting the running_mask. - the poller's _update_fd() will not compete against _fd_delete_orphan() nor fd_insert() thanks to the fd_grab_tgid() that's always done before updating the polled_mask, and guarantees that we never pretend that a polled_mask has a bit before the FD is added. The issue is very hard to reproduce and is extremely time-sensitive. Some tests were required with a 1-ms timeout with request rates closely matching 1 kHz per server, though certain tests sometimes benefitted from saturation.
It was found that adding the following slowdown at a few key places helped a lot and managed to trigger the bug in 0.5 to 5 seconds instead of tens of minutes on a 20-thread setup: { volatile int i = 10000; while (i--); } Particularly, placing it at key places where only one of running_mask or thread_mask is set and not the other one yet (e.g. after the synchronization loop in fd_update_events or after dropping the running bit) did yield great results. Many thanks to Olivier Houchard for this expert help analysing these races and reviewing candidate fixes. The patch must be backported to 2.5. Note that 2.6 does not have tgid in FDs, and that it requires a change of output on fd_clr_running() as we need the previous bit. This is provided by carefully backporting commit `d6e1987612` ("MINOR: fd: make fd_clr_running() return the previous value instead"). Tests have shown that the lack of tgid is a showstopper for 2.6 and that unless a better workaround is found, it could still be preferable to backport the minimum pieces required for fd_grab_tgid() to 2.6 so that it stays stable long.	2023-03-09 14:01:48 +01:00
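The "set by the deleter, atomically cleared by the first checker" idea described for FD_MUST_CLOSE can be sketched with C11 atomics. This is only an illustration under assumptions: `demo_fd`, `DEMO_MUST_CLOSE` and the `demo_*` functions are hypothetical names, not HAProxy's fdtab layout or API, and the `closed` counter merely stands in for the call to _fd_delete_orphan().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bit, not HAProxy's actual FD_MUST_CLOSE value. */
#define DEMO_MUST_CLOSE 0x1u

struct demo_fd {
    atomic_uint state;  /* flag word shared by all threads */
    int closed;         /* how many times the close path ran */
};

/* Deleter side: publish the intent to close the FD. */
static void demo_request_close(struct demo_fd *fd)
{
    atomic_fetch_or(&fd->state, DEMO_MUST_CLOSE);
}

/* Atomically clear the flag and report whether this caller was the one
 * that cleared it. Only one caller can ever observe the bit as set. */
static bool demo_try_claim_close(struct demo_fd *fd)
{
    unsigned int prev = atomic_fetch_and(&fd->state, ~DEMO_MUST_CLOSE);

    return (prev & DEMO_MUST_CLOSE) != 0;
}

/* Called by any thread once it has dropped its running bit. */
static void demo_drop_running(struct demo_fd *fd)
{
    if (demo_try_claim_close(fd))
        fd->closed++;  /* stands in for _fd_delete_orphan() */
}
```

Because the decision relies on the value returned by the atomic read-modify-write rather than on "nobody else seems to be running", two threads finishing at the same instant cannot both take the close path.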
Frédéric Lécaille	cc101cd2aa	BUG/MINOR: quic: Wrong RETIRE_CONNECTION_ID sequence number check This bug arrived with this commit: b5a8020e9 MINOR: quic: RETIRE_CONNECTION_ID frame handling (RX) and was revealed by h3 interop tests with clients like s2n-quic and quic-go as noticed by Amaury. Indeed, one must check that the CID matching the sequence number provided by a received RETIRE_CONNECTION_ID frame does not match the DCID of the packet. Remove the useless ->curr_cid_seq_num member from quic_conn struct. The sequence number lookup must be done in qc_handle_retire_connection_id_frm() to check the validity of the RETIRE_CONNECTION_ID frame. It returns the CID to be retired in the <cid_to_retire> variable passed as a parameter to this function if the frame is valid and the CID was not already retired. Must be backported to 2.7.	2023-03-08 14:53:12 +01:00
Amaury Denoyelle	5907fede87	MEDIUM: quic: release closing connections on stopping Since the following commit : commit `fb375574f9` MINOR: quic: mark quic-conn as jobs on socket allocation quic-conn instances are marked as jobs. This prevents the haproxy process from stopping while there are transfers in progress. To not delay process termination, idle connections are woken up through their MUX instances to be able to release them immediately. However, there is no mechanism to wake up quic connections left in closing or draining state. This means that haproxy process termination is delayed until the timer of every closing quic connection has expired. To improve this, a new function quic_handle_stopping() is called when the haproxy process is stopping. It simply wakes up the idle timer task of all connections in the global closing list. These connections will thus be released immediately so as not to delay haproxy process stopping. This should be backported up to 2.7.	2023-03-08 14:41:28 +01:00
Amaury Denoyelle	efed86c973	MINOR: quic: create a global list dedicated for closing QUIC conns When a CONNECTION_CLOSE is emitted or received, a QUIC connection enters respectively the closing or draining state. These states are a loose equivalent of TCP TIME_WAIT. No data can be exchanged anymore but the connection is maintained for a certain time to handle packet reordering or loss. A new global list has been defined for QUIC connections in closing/draining state inside thread_ctx structure. Each time a connection enters one of these states, it will be moved from the default global list to the new closing list. The objective of this patch is to quickly filter connections on closing/draining. Most notably, this will be used to wake up these connections and avoid haproxy process stopping being delayed by them. A dedicated function qc_detach_th_ctx_list() has been implemented to transfer a quic-conn from one list instance to the other. This takes care of back-references attached to a quic-conn instance in case of a running "show quic". This should be backported up to 2.7.	2023-03-08 14:39:48 +01:00
Frédéric Lécaille	5e3201ea77	MINOR: quic: Add transport parameters to "show quic" Modify quic_transport_params_dump() and the other functions related to dumping transport parameter values from TRACE() to make their output more compact. Add a call to quic_transport_params_dump() to dump the transport parameters from the "show quic" CLI command. Must be backported to 2.7.	2023-03-08 08:50:54 +01:00
Frédéric Lécaille	ece86e64c4	MINOR: quic: Add spin bit support Add QUIC_FL_RX_PACKET_SPIN_BIT new RX packet flag to mark an RX packet as having the spin bit set. Idem for the connection with QUIC_FL_CONN_SPIN_BIT flag. Implement qc_handle_spin_bit() to set/unset QUIC_FL_CONN_SPIN_BIT for the connection as soon as a packet number could be deciphered. Modify quic_build_packet_short_header() to set the spin bit when building a short packet header. Validated by quic-tracker spin bit test. Must be backported to 2.7.	2023-03-08 08:50:54 +01:00
Frédéric Lécaille	8ac8a8778d	MINOR: quic: RETIRE_CONNECTION_ID frame handling (RX) Add ->curr_cid_seq_num new quic_conn struct member to store the connection ID sequence number currently used by the connection. Implement qc_handle_retire_connection_id_frm() to handle this RX frame. Implement qc_retire_connection_seq_num() to remove a connection ID from its sequence number. Implement qc_build_new_connection_id_frm to allocate a new NEW_CONNECTION_ID frame from a CID. Modify qc_parse_pkt_frms() which parses the frames of an RX packet to handle the case of the RETIRE_CONNECTION_ID frame. Must be backported to 2.7.	2023-03-08 08:50:54 +01:00
Frédéric Lécaille	b4c5471425	MINOR: quic: Store the next connection IDs sequence number in the connection Add ->next_cid_seq_num new member to quic_conn struct to store the sequence number to be used to allocate the next connection ID. It is initialized to 0 from qc_new_conn() which initializes a connection. Modify new_quic_cid() to use this variable each time it is called without giving the possibility to the caller to pass the sequence number for the connection ID to be allocated. Modify quic_build_post_handshake_frames() to use ->next_cid_seq_num when building NEW_CONNECTION_ID frames after the handshake has been completed. Limit the number of connection IDs provided to the peer to the minimum between 4 and the value it sent with active_connection_id_limit transport parameter. This includes the connection ID used by the connection to send these new connection IDs. Must be backported to 2.7.	2023-03-08 08:50:54 +01:00
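The sequence numbering and the cap on advertised connection IDs can be sketched as follows. This is a sketch under assumptions, not the code of the patch: `demo_conn`, `demo_new_cid_seq()` and `demo_cids_to_announce()` are hypothetical names, and the "minus one" reflects one reading of the message, namely that the connection ID already in use counts toward the limit.

```c
#include <assert.h>
#include <stdint.h>

/* Cap taken from the commit message: at most 4 connection IDs. */
#define DEMO_MAX_CIDS 4

/* Hypothetical stand-in for the relevant part of quic_conn. */
struct demo_conn {
    uint64_t next_cid_seq_num;  /* starts at 0 for a new connection */
};

/* Hand out the sequence number for a newly allocated connection ID.
 * The caller cannot choose it: it always comes from the connection. */
static uint64_t demo_new_cid_seq(struct demo_conn *qc)
{
    return qc->next_cid_seq_num++;
}

/* Number of NEW_CONNECTION_ID frames to build after the handshake:
 * min(4, peer limit), minus the connection ID already in use. */
static uint64_t demo_cids_to_announce(uint64_t peer_active_cid_limit)
{
    uint64_t max = peer_active_cid_limit < DEMO_MAX_CIDS ?
                   peer_active_cid_limit : DEMO_MAX_CIDS;

    return max > 0 ? max - 1 : 0;
}
```

With a peer advertising active_connection_id_limit = 8, this sketch would announce 3 extra connection IDs, numbered 1 to 3 if sequence number 0 was taken by the initial one.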
Frédéric Lécaille	51a7caf921	MINOR: quic: Add traces about QUIC TLS key update Dump the secret used to derive the next one during a key update initiated by the client and dump the resulting new secret and the new key and iv to be used to decrypt Application level packets. Also add a trace when the key update is supposed to be initiated on haproxy side. This has already helped in diagnosing an issue revealed by the key update interop test with xquic as client. Must be backported to 2.7.	2023-03-03 19:12:26 +01:00
Amaury Denoyelle	c8a0efbda8	BUG/MEDIUM: quic: properly handle duplicated STREAM frames When a STREAM frame is re-emitted, it will point to the same stream buffer as the original one. If an ACK is received for either one of these frames, the underlying buffer may be freed. Thus, if the second frame is declared as lost and scheduled for retransmission, we must ensure that the underlying buffer is still allocated or interrupt the retransmission. Stream buffer is stored as an eb_tree indexed by the stream ID. To avoid a tree lookup each time a STREAM frame is re-emitted, a lost STREAM frame is flagged as QUIC_FL_TX_FRAME_LOST. In most cases, this code is functional. However, there are several potential issues which may cause a segfault : - when explicitly probing with a STREAM frame, the frame won't be flagged as lost - when splitting a STREAM frame during retransmission, the flag is not copied To fix both these cases, QUIC_FL_TX_FRAME_LOST flag has been converted to a <dup> field in quic_stream structure. This field is now properly copied when splitting a STREAM frame. Also, as this is now an inner quic_frame field, it will be copied automatically on qc_frm_dup() invocation thus ensuring that it will be set on probing. 
This issue was encountered randomly with the following backtrace : #0 __memmove_avx512_unaligned_erms () #1 0x000055f4d5a48c01 in memcpy (__len=18446698486215405173, __src=<optimized out>, #2 quic_build_stream_frame (buf=0x7f6ac3fcb400, end=<optimized out>, frm=0x7f6a00556620, #3 0x000055f4d5a4a147 in qc_build_frm (buf=buf@entry=0x7f6ac3fcb5d8, #4 0x000055f4d5a23300 in qc_do_build_pkt (pos=<optimized out>, end=<optimized out>, #5 0x000055f4d5a25976 in qc_build_pkt (pos=0x7f6ac3fcba10, #6 0x000055f4d5a30c7e in qc_prep_app_pkts (frms=0x7f6a0032bc50, buf=0x7f6a0032bf30, #7 qc_send_app_pkts (qc=0x7f6a0032b310, frms=0x7f6a0032bc50) at src/quic_conn.c:4184 #8 0x000055f4d5a35f42 in quic_conn_app_io_cb (t=0x7f6a0009c660, context=0x7f6a0032b310, This should fix github issue #2051. This should be backported up to 2.6.	2023-03-03 15:08:02 +01:00
Remi Tricot-Le Breton	86d1e0b163	BUG/MINOR: ssl: Fix ocsp-update when using "add ssl crt-list" When adding a new certificate through the CLI and appending it to a crt-list with the 'ocsp-update' option set, the new certificate would not be added to the OCSP response update list. The only thing that was missing was the copy of the ocsp_update mode from the ssl_bind_conf into the ckch_store's object. An extra wakeup of the update task also needed to happen in case the newly inserted entry needs to be updated before the next wakeup of the task. This patch does not need to be backported.	2023-03-02 15:57:56 +01:00
Remi Tricot-Le Breton	5843237993	MINOR: ssl: Add global options to modify ocsp update min/max delay The minimum and maximum delays between two automatic updates of a given OCSP response can now be set via global options. It allows to limit the update rate of OCSP responses for configurations that use many frontend certificates with the ocsp-update option set if the updates are deemed too costly.	2023-03-02 15:37:23 +01:00
Remi Tricot-Le Breton	07b7c15bce	MINOR: ssl: Reorder struct certificate_ocsp members Just swapping those two 'refcount' and 'response' members enables to fill two 4 bytes holes in the structure.	2023-03-02 15:37:20 +01:00
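Why swapping two members can shrink a structure is easiest to see on a toy example. The sketch below uses made-up structures, not the real certificate_ocsp layout: on a typical 64-bit ABI each 4-byte integer placed before a pointer is followed by a 4-byte alignment hole, and grouping the integers together removes both holes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout with small members interleaved with pointers:
 * on LP64 each int is followed by 4 bytes of padding. */
struct demo_holey {
    int   refcount;
    void *response;
    int   fail_count;
    void *next;
};

/* Same members, reordered so the two ints share one 8-byte slot. */
struct demo_tight {
    int   refcount;
    int   fail_count;
    void *response;
    void *next;
};

/* Bytes saved by the reordering on the current platform (may be 0 on
 * ABIs where int and pointers have the same size and alignment). */
static size_t demo_bytes_saved(void)
{
    return sizeof(struct demo_holey) - sizeof(struct demo_tight);
}
```

Tools such as pahole can report these holes on real structures; the exact sizes depend on the target ABI, so no specific figure is claimed here.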
Remi Tricot-Le Breton	0c96ee48b4	MINOR: ssl: Add certificate's path to certificate_ocsp structure In order to have some information about the frontend certificate when dumping the contents of the ocsp update tree from the cli, we could either keep a reference to a ckch_store in the certificate_ocsp structure, which might cause some dangling reference problems, or simply copy the path to the certificate in the ocsp response structure. This latter solution was chosen because of its simplicity.	2023-03-02 15:37:15 +01:00
Remi Tricot-Le Breton	ad6cba83a4	MINOR: ssl: Store specific ocsp update errors in response and update ctx Those new specific error codes will make it easier to know what went wrong during an OCSP update process. They will come to use in future sample fetches as well as in debugging means (via the cli or future traces).	2023-03-02 15:37:12 +01:00
Remi Tricot-Le Breton	9e94df3e55	MINOR: ssl: Add ocsp update success/failure counters Those counters will be used for debugging purposes and will be dumped via a cli command.	2023-03-02 15:37:11 +01:00
Amaury Denoyelle	e0fe118dad	MINOR: quic: implement qc_notify_send() Implement qc_notify_send(). This function is responsible for notifying the upper layer subscribed on SUB_RETRY_SEND if sending conditions are back to normal. For the moment, this patch has no functional change as only congestion window room is checked before notifying the upper layer. However, this will be extended when poller subscription of the socket on sendto() error is implemented. qc_notify_send() will thus be responsible for ensuring that all conditions are met before waking up the upper layer. This should be backported up to 2.7.	2023-03-01 14:29:16 +01:00
Amaury Denoyelle	1febc2d316	MEDIUM: quic: improve fatal error handling on send Send is conducted through qc_send_ppkts() for a QUIC connection. There are two types of errors which can be encountered on sendto() or affiliated syscalls : * transient error. In this case, sending is simulated with the remaining data and retransmission process is used to have the opportunity to retry emission * fatal error. If this happens, the connection should be closed as soon as possible. This is done via qc_kill_conn() function. Until this patch, only ECONNREFUSED errno was considered as fatal. Modify the QUIC send API to be able to differentiate transient and fatal errors more easily. This is done by fixing the return value of the sendto() wrapper qc_snd_buf() : * on fatal error, a negative error code is returned. This is now the case for every errno except EAGAIN, EWOULDBLOCK, ENOTCONN, EINPROGRESS and EBADF. * on a transient error, 0 is returned. This is the case for the listed errno values above and also if a partial send has been conducted by the kernel. * on success, the return value of sendto() syscall is returned. This commit will be useful to be able to handle transient errors with a quic-conn owned socket. In this case, the socket should be subscribed to the poller and no simulated send will be conducted. This commit allows errno management to be confined in the quic-sock module which is a nice cleanup. On a final note, EBADF should be considered as fatal. This will be the subject of an upcoming commit. This should be backported up to 2.7.	2023-02-28 10:51:25 +01:00
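The return-value convention described above (negative on fatal error, 0 on transient error or partial send, byte count on success) can be sketched as a pure classification function. The names `demo_errno_is_transient()` and `demo_send_result()` are hypothetical, and returning plain -1 for fatal errors is an assumption; the commit only states that a negative error code is returned.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* errno values the commit message lists as transient. */
static int demo_errno_is_transient(int err)
{
    return err == EAGAIN || err == EWOULDBLOCK || err == ENOTCONN ||
           err == EINPROGRESS || err == EBADF;
}

/* Map the outcome of a sendto()-like call to the convention above.
 * <ret> is the raw syscall return value, <err> the errno captured
 * right after it, <len> the number of bytes that were requested. */
static ssize_t demo_send_result(ssize_t ret, int err, size_t len)
{
    if (ret < 0)
        return demo_errno_is_transient(err) ? 0 : -1;

    if ((size_t)ret != len)
        return 0;  /* partial send: handled like a transient error */

    return ret;    /* full send: report the byte count */
}
```

Keeping the errno inspection in one small function mirrors the stated goal of confining errno management to a single module: callers only need to test the sign of the result.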
Willy Tarreau	7b8aac4439	MINOR: tinfo: make thread_set functions return nth group/mask instead of first thread_set_first_group() and thread_set_first_tmask() were modified and renamed to instead return the number and mask of the nth group. Passing zero continues to return the first one, but it will be more convenient to use this way when building shards.	2023-02-28 10:28:47 +01:00
Willy Tarreau	fea8c19119	CLEANUP: listener: only store conn counts for local threads The listeners have a thr_conn[] array indexed on the thread number that is used during connection redispatching to know what threads are the least loaded. Since we introduced thread groups, and based on the fact that a listener may only belong to one group, there's no point storing counters for all threads, we just need to store them for all threads in the group. Doing so reduces the struct listener from 1500 to 632 bytes. This may be backported to 2.7 to save a bit of resources.	2023-02-28 10:28:47 +01:00
Christopher Faulet	85eabfbf67	MEDIUM: mux-quic: Don't expect data from server as long as request is unfinished As for the H1 and H2 stream, the QUIC stream now states it does not expect data from the server as long as the request is unfinished. The aim is the same. We must be sure to not trigger a read timeout on server side if the client is still uploading data. From the moment the end of the request is received and forwarded to upper layer, the QUIC stream reports it expects to receive data from the opposite endpoint. This re-enables read timeout on the server side.	2023-02-27 17:45:45 +01:00
Christopher Faulet	8aabc8ebfd	MINOR: stconn: Report a send activity when endpoint is willing to consume data When the endpoint (applet or mux) is now willing to consume data while it said it wouldn't, a send activity is reported. Indeed, the writes were blocked because of the endpoint. It is now ready to consume outgoing data. So a send activity must be reported to reset corresponding timers. Concretely, when the flag SE_FL_WONT_CONSUME is removed, a send activity is reported.	2023-02-27 17:45:45 +01:00
Willy Tarreau	a2a3d5dd25	CLEANUP: ring: remove the now unused ring's offset Since the previous patch, the ring's offset is not used anymore. The haring utility remains backward-compatible since it can trust the buffer element that's at the beginning of the map and which still contains all the valid data.	2023-02-24 09:26:30 +01:00
Aurelien DARRAGON	d3ffba4512	MINOR: listener: pause_listener() becomes suspend_listener() We are simply renaming pause_listener() to suspend_listener() to prevent confusion around listener pausing. A suspended listener can be in two different valid states: - LI_PAUSED: the listener is effectively paused, it will unpause on resume_listener() - LI_ASSIGNED (not bound): the listener does not support the LI_PAUSED state, so it was unbound to satisfy the suspend request, it will correctly re-bind on resume_listener() Besides that, we add the LI_F_SUSPENDED flag to mark suspended listeners in suspend_listener() and unmark them in resume_listener(). We're also adding li_suspend proxy variable to track the number of currently suspended listeners: That is, the number of listeners that were suspended through suspend_listener() and that are either in LI_PAUSED or LI_ASSIGNED state. Counter is increased on successful suspend in suspend_listener() and it is decreased on successful resume in resume_listener() -- Backport notes: -> 2.4 only, as "MINOR: proxy/listener: support for additional PAUSED state" was not backported: Replace this: \| /* PROXY_LOCK is require \| proxy_cond_resume(px); By this: \| ha_warning("Resumed %s %s.\n", proxy_cap_str(px->cap), px->id); \| send_log(px, LOG_WARNING, "Resumed %s %s.\n", proxy_cap_str(px->cap), px->id); -> 2.6 and 2.7 only, as "MINOR: listener: make sure we don't pause/resume" was custom patched: Replace this: \|@@ -253,6 +253,7 @@ struct listener { \| \| /* listener flags (16 bits) / \| #define LI_F_FINALIZED 0x0001 / listener made it to the READY\|\|LIMITED\|\|FULL state at least once, may be suspended/resumed safely / \|+#define LI_F_SUSPENDED 0x0002 / listener has been suspended using suspend_listener(), it is either is LI_PAUSED or LI_ASSIGNED state / \| \| / Descriptor for a "bind" keyword. The ->parse() function returns 0 in case of \| * success, or a combination of ERR_* flags if an error is encountered. 
The By this: \|@@ -222,6 +222,7 @@ struct li_per_thread { \| \| #define LI_F_QUIC_LISTENER 0x00000001 /* listener uses proto quic / \| #define LI_F_FINALIZED 0x00000002 / listener made it to the READY\|\|LIMITED\|\|FULL state at least once, may be suspended/resumed safely / \|+#define LI_F_SUSPENDED 0x00000004 / listener has been suspended using suspend_listener(), it is either is LI_PAUSED or LI_ASSIGNED state / \| \| / The listener will be directly referenced by the fdtab[] which holds its \| * socket. The listener provides the protocol-specific accept() function to	2023-02-23 15:05:05 +01:00
Aurelien DARRAGON	2370599f96	MINOR: listener: make sure we don't pause/resume bypassed listeners Some listeners are kept in LI_ASSIGNED state but are not supposed to be started since they were bypassed on initial startup (eg: in protocol_bind_all() or in enable_listener()...) Introduce the LI_F_FINALIZED flag: when the variable is non zero it means that the listener made it past the LI_LISTEN state (finalized) at least once so we can safely pause / resume. This way we won't risk starting a previously bypassed listener which never made it that far and thus was not expected to be lazy-started by accident. As listener_pause() and listener_resume() are currently partially broken, such unexpected lazy-start won't happen. But we're trying to restore pause() and resume() behavior so this patch will be required before going any further. We had to re-introduce listeners 'flags' struct member since it was recently moved into bind_conf struct. But here we do have a legitimate need for these listener-only flags. This should only be backported if explicitly required by another commit. -- Backport notes: -> 2.4 and 2.5: The 2-bytes hole we're using in the current patch does not apply, let's use the 4-byte hole located under the 'option' field. 
Replace this: \|@@ -226,7 +226,8 @@ struct li_per_thread { \| struct listener { \| enum obj_type obj_type; /* object type = OBJ_TYPE_LISTENER / \| enum li_state state; / state: NEW, INIT, ASSIGNED, LISTEN, READY, FULL / \|- / 2-byte hole here / \|+ uint16_t flags; / listener flags: LI_F_* / \| int luid; / listener universally unique ID, used for SNMP / \| int nbconn; / current number of connections on this listener / \| unsigned int thr_idx; / thread indexes for queue distribution : (t2<<16)+t1 / By this: \|@@ -209,6 +209,8 @@ struct listener { \| short int nice; / nice value to assign to the instantiated tasks / \| int luid; / listener universally unique ID, used for SNMP / \| int options; / socket options : LI_O_* / \|+ uint16_t flags; / listener flags: LI_F_* / \|+ / 2-bytes hole here / \| __decl_thread(HA_RWLOCK_T lock); \| \| struct fe_counters counters; /* statistics counters / -> 2.4 only: We need to adjust some contextual lines. Replace this: \|@@ -477,7 +478,7 @@ int pause_listener(struct listener l, int lpx, int lli) \| if (!lli) \| HA_RWLOCK_WRLOCK(LISTENER_LOCK, &l->lock); \| \|- if (l->state <= LI_PAUSED) \|+ if (!(l->flags & LI_F_FINALIZED) \|\| l->state <= LI_PAUSED) \| goto end; \| \| if (l->rx.proto->suspend) By this: \|@@ -477,7 +478,7 @@ int pause_listener(struct listener l, int lpx, int lli) \| !(proc_mask(l->rx.settings->bind_proc) & pid_bit)) \| goto end; \| \|- if (l->state <= LI_PAUSED) \|+ if (!(l->flags & LI_F_FINALIZED) \|\| l->state <= LI_PAUSED) \| goto end; \| \| if (l->rx.proto->suspend) And this: \|@@ -535,7 +536,7 @@ int resume_listener(struct listener l, int lpx, int lli) \| if (MT_LIST_INLIST(&l->wait_queue)) \| goto end; \| \|- if (l->state == LI_READY) \|+ if (!(l->flags & LI_F_FINALIZED) \|\| l->state == LI_READY) \| goto end; \| \| if (l->rx.proto->resume) By this: \|@@ -535,7 +536,7 @@ int resume_listener(struct listener l, int lpx, int lli) \| !(proc_mask(l->rx.settings->bind_proc) & pid_bit)) \| goto end; \| \|- if 
(l->state == LI_READY) \|+ if (!(l->flags & LI_F_FINALIZED) \|\| l->state == LI_READY) \| goto end; \| \| if (l->rx.proto->resume) -> 2.6 and 2.7 only: struct listener 'flags' member still exists, let's use it. Remove this from the current patch: \|@@ -226,7 +226,8 @@ struct li_per_thread { \| struct listener { \| enum obj_type obj_type; / object type = OBJ_TYPE_LISTENER / \| enum li_state state; / state: NEW, INIT, ASSIGNED, LISTEN, READY, FULL / \|- / 2-byte hole here / \|+ uint16_t flags; / listener flags: LI_F_* / \| int luid; / listener universally unique ID, used for SNMP / \| int nbconn; / current number of connections on this listener / \| unsigned int thr_idx; / thread indexes for queue distribution : (t2<<16)+t1 / Then, replace this: \|@@ -251,6 +250,9 @@ struct listener { \| EXTRA_COUNTERS(extra_counters); \| }; \| \|+/ listener flags (16 bits) / \|+#define LI_F_FINALIZED 0x0001 / listener made it to the READY\|\|LIMITED\|\|FULL state at least once, may be suspended/resumed safely / \|+ \| / Descriptor for a "bind" keyword. The ->parse() function returns 0 in case of \| * success, or a combination of ERR_* flags if an error is encountered. The \| * function pointer can be NULL if not implemented. The function also has an By this: \|@@ -221,6 +221,7 @@ struct li_per_thread { \| }; \| \| #define LI_F_QUIC_LISTENER 0x00000001 /* listener uses proto quic / \|+#define LI_F_FINALIZED 0x00000002 / listener made it to the READY\|\|LIMITED\|\|FULL state at least once, may be suspended/resumed safely / \| \| / The listener will be directly referenced by the fdtab[] which holds its \| * socket. The listener provides the protocol-specific accept() function to	2023-02-23 15:05:05 +01:00
Aurelien DARRAGON	bcad7e6319	MINOR: listener: add relax_listener() function There is a need for a small difference between resuming and relaxing a listener. When resuming, we expect that the listener may completely resume, this includes unpausing or rebinding if required. Resuming a listener is a best-effort operation: no matter the current state, try our best to bring the listener up to the LI_READY state. There are some cases where we only want to "relax" listeners that were previously restricted using limit_listener() or listener_full() functions. Here we don't want to resuscitate listeners, we're simply interested in cancelling out the previous restriction. To this day, listener_resume() on an unbound listener is broken, that's why the need for this wasn't felt yet. But we're trying to restore historical listener_resume() behavior, so we better prepare for this by introducing an explicit relax_listener() function that only does what is expected in such cases. This commit depends on: - "MINOR: listener/api: add lli hint to listener functions"	2023-02-23 15:05:05 +01:00
Aurelien DARRAGON	4059e094db	MINOR: listener/api: add lli hint to listener functions Add listener lock hint (AKA lli) to (stop/resume/pause)_listener() functions. All these functions implicitly take the listener lock when they are called: It could be useful to be able to call them while already holding the lock, so we're adding lli hint to make them take the lock only when it is missing. This should only be backported if explicitly required by another commit -- -> 2.4 and 2.5 common backport notes: These 2 commits need to be backported first: - `187396e34` "CLEANUP: listener: function comment typo in stop_listener()" - `a57786e87` "BUG/MINOR: listener: null pointer dereference suspected by coverity" -> 2.4 special backport notes: In addition to the previously mentioned dependencies, the patch needs to be slightly adapted to match the corresponding contextual lines: Replace this: \|@@ -471,7 +474,8 @@ int pause_listener(struct listener l, int lpx) \| if (!lpx && px) \| HA_RWLOCK_WRLOCK(PROXY_LOCK, &px->lock); \| \|- HA_RWLOCK_WRLOCK(LISTENER_LOCK, &l->lock); \|+ if (!lli) \|+ HA_RWLOCK_WRLOCK(LISTENER_LOCK, &l->lock); \| \| if (l->state <= LI_PAUSED) \| goto end; By this: \|@@ -471,7 +474,8 @@ int pause_listener(struct listener l, int lpx) \| if (!lpx && px) \| HA_RWLOCK_WRLOCK(PROXY_LOCK, &px->lock); \| \|- HA_RWLOCK_WRLOCK(LISTENER_LOCK, &l->lock); \|+ if (!lli) \|+ HA_RWLOCK_WRLOCK(LISTENER_LOCK, &l->lock); \| \| if ((global.mode & (MODE_DAEMON \| MODE_MWORKER)) && \| !(proc_mask(l->rx.settings->bind_proc) & pid_bit)) Replace this: \|@@ -169,7 +169,7 @@ void protocol_stop_now(void) \| HA_SPIN_LOCK(PROTO_LOCK, &proto_lock); \| list_for_each_entry(proto, &protocols, list) { \| list_for_each_entry_safe(listener, lback, &proto->receivers, rx.proto_list) \|- stop_listener(listener, 0, 1); \|+ stop_listener(listener, 0, 1, 0); \| } \| HA_SPIN_UNLOCK(PROTO_LOCK, &proto_lock); \| } By this: \|@@ -169,7 +169,7 @@ void protocol_stop_now(void) \| HA_SPIN_LOCK(PROTO_LOCK, 
&proto_lock); \| list_for_each_entry(proto, &protocols, list) { \| list_for_each_entry_safe(listener, lback, &proto->receivers, rx.proto_list) \| if (!listener->bind_conf->frontend->grace) \|- stop_listener(listener, 0, 1); \|+ stop_listener(listener, 0, 1, 0); \| } \| HA_SPIN_UNLOCK(PROTO_LOCK, &proto_lock); Replace this: \|@@ -2315,7 +2315,7 @@ void stop_proxy(struct proxy p) \| HA_RWLOCK_WRLOCK(PROXY_LOCK, &p->lock); \| \| list_for_each_entry(l, &p->conf.listeners, by_fe) \|- stop_listener(l, 1, 0); \|+ stop_listener(l, 1, 0, 0); \| \| if (!(p->flags & (PR_FL_DISABLED\|PR_FL_STOPPED)) && !p->li_ready) { \| / might be just a backend / By this: \|@@ -2315,7 +2315,7 @@ void stop_proxy(struct proxy p) \| HA_RWLOCK_WRLOCK(PROXY_LOCK, &p->lock); \| \| list_for_each_entry(l, &p->conf.listeners, by_fe) \|- stop_listener(l, 1, 0); \|+ stop_listener(l, 1, 0, 0); \| \| if (!p->disabled && !p->li_ready) { \| /* might be just a backend */	2023-02-23 15:05:05 +01:00
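The lock-hint pattern introduced by this commit (take the listener lock only when the caller does not already hold it) can be illustrated in isolation. This is a minimal sketch with hypothetical names: `demo_listener`, `demo_lock()`, `demo_unlock()` and `demo_stop_listener()` are not HAProxy functions, and a C11 atomic_flag spinlock replaces HA_RWLOCK_WRLOCK() to keep the example self-contained.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical listener with a tiny spinlock standing in for the
 * LISTENER_LOCK rwlock used by HAProxy. */
struct demo_listener {
    atomic_flag lock;
    int state;  /* 1 = running, 0 = stopped */
};

static void demo_lock(struct demo_listener *l)
{
    while (atomic_flag_test_and_set_explicit(&l->lock, memory_order_acquire))
        ;  /* spin until the flag is released */
}

static void demo_unlock(struct demo_listener *l)
{
    atomic_flag_clear_explicit(&l->lock, memory_order_release);
}

/* <lli> is the lock hint: non-zero means the caller already holds
 * l->lock, so the function must neither take nor release it. */
static void demo_stop_listener(struct demo_listener *l, int lli)
{
    if (!lli)
        demo_lock(l);

    l->state = 0;

    if (!lli)
        demo_unlock(l);
}
```

A caller that already holds the lock would use `demo_stop_listener(l, 1)`. Passing 0 in that situation would spin forever on this non-recursive lock, which is the kind of self-deadlock the hint is meant to avoid.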
Christopher Faulet	2bf99123ef	MINOR: stconn: Add functions to set/clear SE_FL_EXP_NO_DATA flag from endpoint se_expect_data() and se_expect_no_data() should be used from the endpoint to inform upper layer it expects data or not from the opposite endpoint.	2023-02-23 13:44:32 +01:00
Christopher Faulet	be5cc766b0	MINOR: stconn: Remove half-closed timeout The half-closed timeout is now directly retrieved from the proxy settings. There is no longer usage for the .hcto field in the stconn structure. So let's remove it.	2023-02-22 15:59:16 +01:00
Christopher Faulet	bcdcfad3ff	MINOR: stconn: Set half-close timeout using proxy settings We now directly use the proxy settings to set the half-close timeout of a stream-connector. The function sc_set_hcto() must be used to do so. This timeout is only set when a shutw is performed. So it is not really a big deal to use a dedicated function to do so.	2023-02-22 15:59:16 +01:00
Christopher Faulet	15315d6c0a	CLEANUP: stconn: Remove old read and write expiration dates Old read and write expiration dates are no longer used. Thus we can safely remove them.	2023-02-22 15:59:16 +01:00
Christopher Faulet	b374ba563a	MAJOR: stream: Use SE descriptor date to detect read/write timeouts We stop using the channel's expiration dates to detect read and write timeouts on the channels. We now rely on the stream-endpoint descriptor to do so. All the stuff is handled in process_stream(). The stream relies on 2 helper functions to know if the receives or sends may expire: sc_rcv_may_expire() and sc_snd_may_expire().	2023-02-22 15:57:16 +01:00
Christopher Faulet	2ca4cc1936	MINOR: applet/stconn: Add a SE flag to specify an endpoint does not expect data An endpoint should now set SE_FL_EXP_NO_DATA flag if it does not expect any data from the opposite endpoint. This way, the stream will be able to disable any read timeout on the opposite endpoint. Applets should use applet_expect_no_data() and applet_expect_data() functions to set or clear the flag. For now, only dns and sink forwarder applets are concerned.	2023-02-22 15:56:28 +01:00
Christopher Faulet	4c13568b49	MEDIUM: stconn: Add two dates to track successful reads and blocked sends The stream endpoint descriptor now owns two dates, lra (last read activity) and fsb (first send blocked). The first one is updated every time a read activity is reported, including data received from the endpoint, successful connect, end of input and shutdown for reads. A read activity is also reported when receives are unblocked. It will be used to detect read timeouts. The other one is updated when no data can be sent to the endpoint and reset when some data are sent. It is the date of the first send blocked by the endpoint. It will be used to detect write timeouts. Helper functions are added to report read/send activity and to retrieve lra/fsb dates.	2023-02-22 14:52:15 +01:00
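How the two dates can drive timeout detection is sketched below. All names (`demo_sedesc`, the `demo_*` helpers, `DEMO_TICK_UNSET`) are hypothetical and the 32-bit millisecond tick with 0 meaning "unset" is an assumption made for the example, not a description of the helpers added by the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sentinel meaning "no date recorded". */
#define DEMO_TICK_UNSET 0u

struct demo_sedesc {
    uint32_t lra;  /* last read activity */
    uint32_t fsb;  /* first send blocked */
};

/* Any read activity (data, connect, end of input...) refreshes lra. */
static void demo_report_read_activity(struct demo_sedesc *se, uint32_t now)
{
    se->lra = now;
}

/* Only the FIRST blocked send is recorded; later ones keep the date. */
static void demo_report_blocked_send(struct demo_sedesc *se, uint32_t now)
{
    if (se->fsb == DEMO_TICK_UNSET)
        se->fsb = now;
}

/* A successful send clears the blocked date. */
static void demo_report_send_activity(struct demo_sedesc *se)
{
    se->fsb = DEMO_TICK_UNSET;
}

/* Read timeout: nothing was read for <timeout> ticks since lra. */
static int demo_read_expired(const struct demo_sedesc *se, uint32_t now,
                             uint32_t timeout)
{
    return se->lra != DEMO_TICK_UNSET && (uint32_t)(now - se->lra) >= timeout;
}

/* Write timeout: sends have been blocked for <timeout> ticks. */
static int demo_write_expired(const struct demo_sedesc *se, uint32_t now,
                              uint32_t timeout)
{
    return se->fsb != DEMO_TICK_UNSET && (uint32_t)(now - se->fsb) >= timeout;
}
```

The asymmetry is the point of the design: the read date moves forward on every activity, while the send date stays pinned to the moment blocking started, so a write timeout measures how long the endpoint has been refusing data.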
Christopher Faulet	5aaacfbccd	MEDIUM: stconn: Replace read and write timeouts by a unique I/O timeout Read and write timeouts (.rto and .wto) are now replaced by a unique timeout, called .ioto. Since the recent refactoring on channel's timeouts, both use the same value, the client timeout on client side and the server timeout on the server side. Thus, this part may be simplified. Now it represents the I/O timeout.	2023-02-22 14:52:15 +01:00
Christopher Faulet	f8413cba2a	MEDIUM: channel/stconn: Move rex/wex timer from the channel to the sedesc These timers are related to the I/O. Thus it is logical to move them into the SE descriptor. The patch is a bit huge but it is just a replacement. However it is error-prone. From the stconn or the stream, helper functions are used to get, set or reset these timers. This simplifies timer manipulation.	2023-02-22 14:52:15 +01:00
Christopher Faulet	ed7e66fe1a	MINOR: channel/stconn: Move rto/wto from the channel to the stconn Read and write timeouts concern the I/O. Thus, it is logical to move them into the stconn. In the end, the stream is responsible for detecting the timeouts. So it is logical to have these values in the stconn and not in the SE descriptor. But it may change depending on the refactoring. So, now: * scf->rto is used instead of req->rto * scf->wto is used instead of res->wto * scb->rto is used instead of res->rto * scb->wto is used instead of req->wto	2023-02-22 14:52:15 +01:00
Christopher Faulet	2e56a73459	MAJOR: channel: Remove flags to report READ or WRITE errors This patch removes CF_READ_ERROR and CF_WRITE_ERROR flags. We now rely on SE_FL_ERR_PENDING and SE_FL_ERROR flags. SE_FL_ERR_PENDING is used for write errors and SE_FL_ERROR for read or unrecoverable errors. When a connection error is reported, SE_FL_ERROR and SE_FL_EOS are now set and a read event and a write event are reported to be sure the stream will properly process the error. At the stream-connector level, it is similar. When an error is reported during a send, a write event is triggered. On the read side, nothing more is performed because an error at this stage is enough to wake the stream up. A major change is brought with this patch. We stop checking flags of the opposite channel to report abort or timeout. It also means that when a read or write error is reported on a side, we no longer update the other side. Thus a read error on the server side no longer leads to a write error on the client side. This should ease error reporting.	2023-02-22 14:52:15 +01:00
Christopher Faulet	81fdeb8ce2	MEDIUM: channel: Remove CF_READ_NOEXP flag This flag was introduced in 1.3 to fix a design issue. It has been untouched since then but there is no reason to still have this trick. Note it could be good to review what happens in HTTP when the server is waiting for the end of the request. It could be good to be sure a client timeout is always reported.	2023-02-22 14:52:14 +01:00
Aurelien DARRAGON	3ffbf3896d	BUG/MEDIUM: httpclient/lua: fix a race between lua GC and hlua_ctx_destroy In bb581423b ("BUG/MEDIUM: httpclient/lua: crash when the lua task timeout before the httpclient"), a new logic was implemented to make sure that when a lua ctx is destroyed, related httpclients are correctly destroyed too to prevent such httpclients from being resuscitated on a destroyed lua ctx. This was implemented by adding a list of httpclients within the lua ctx, and a new function, hlua_httpclient_destroy_all(), that is called under hlua_ctx_destroy() and runs through the httpclients list in the lua context to properly terminate them. This was done with the assumption that no concurrent Lua garbage collection cycles could occur on the same resources, which seems OK since the "lua" context is about to be freed and is not explicitly being used by other threads. But when 'lua-load' is used, the main lua stack is shared between multiple OS threads, which means that all lua ctx in the process are linked to the same parent stack. Yet it seems that lua GC, which can be triggered automatically under lua_resume() or manually through lua_gc(), does not limit itself to the "coroutine" stack (the stack referenced in lua->T) when performing the cleanup, but is able to perform some cleanup on the main stack plus coroutines stacks that were created under the same main stack (via lua_newthread()) as well. This can be explained by the fact that lua_newthread() coroutines are not meant to be thread-safe by design. Source: http://lua-users.org/lists/lua-l/2011-07/msg00072.html (lua co-author) It did not cause other issues so far because most of the time when using 'lua-load', the global lua lock is taken when performing critical operations that are known to interfere with the main stack. But here in hlua_httpclient_destroy_all(), we don't run under the global lock. 
Now that we properly understand the issue, the fix is pretty trivial: We could simply guard the hlua_httpclient_destroy_all() under the global lua lock, this would work but it could increase the contention over the global lock. Instead, we switched 'lua->hc_list' which was introduced with bb581423b from simple list to mt_list so that concurrent accesses between hlua_httpclient_destroy_all and hlua_httpclient_gc() are properly handled. The issue was reported by @Mark11122 on Github #2037. This must be backported with bb581423b ("BUG/MEDIUM: httpclient/lua: crash when the lua task timeout before the httpclient") as far as 2.5.	2023-02-22 11:44:22 +01:00
Willy Tarreau	27629a7d65	MINOR: compiler: add a TOSTR() macro to turn a value into a string Pretty often we have to emit a value (setting, limit etc) in an error message, and this value is known at compile-time, and just doing this forces to use a printf format such as "%d". Let's have a simple macro to turn any other macro or value into a string that can be concatenated with the rest of the string around. This simplifies error messages production on the CLI for example.	2023-02-22 09:10:53 +01:00
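The stringification technique that 27629a7d65 describes can be sketched as follows. This assumes the usual two-level macro idiom; the helper name `TOSTR_RAW`, the `MAX_ITEMS` limit and the message text are illustrative and not taken from the HAProxy tree.

```c
#include <assert.h>
#include <string.h>

/* Two-level expansion: the outer macro expands its argument first, then
 * the inner one applies the # operator. TOSTR_RAW is a hypothetical name.
 */
#define TOSTR_RAW(x) #x
#define TOSTR(x)     TOSTR_RAW(x)

/* Hypothetical compile-time limit, used only for this illustration. */
#define MAX_ITEMS 64

/* Adjacent string literals are concatenated by the compiler, so the
 * message needs no printf-style "%d" format.
 */
const char *limit_error_msg(void)
{
	return "too many items (max " TOSTR(MAX_ITEMS) ")";
}
```

Without the intermediate macro, the `#` operator would stringify the macro name itself (`"MAX_ITEMS"`) rather than its value.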
Remi Tricot-Le Breton	879debeecb	BUG/MINOR: cache: Cache response even if request has "no-cache" directive Since commit `cc9bf2e5f` "MEDIUM: cache: Change caching conditions" responses that do not have an explicit expiration time are not cached anymore. But this mechanism wrongly used the TX_CACHE_IGNORE flag instead of the TX_CACHEABLE one. The effect this had is that a cacheable response that corresponded to a request having a "Cache-Control: no-cache" for instance would not be cached. Contrary to what was said in the other commit message, the "checkcache" option should not be impacted by the use of the TX_CACHEABLE flag instead of the TX_CACHE_IGNORE one. The response is indeed considered as not cacheable if it has no expiration time, regardless of the presence of a cookie in the response. This should fix GitHub issue #2048. This patch can be backported up to branch 2.4.	2023-02-21 18:35:41 +01:00
Christopher Faulet	c13f3028e8	MINOR: cfgcond: Implement enabled condition expression Implement a way to test if some options are enabled at run-time. For now, following options may be detected: POLL, EPOLL, KQUEUE, EVPORTS, SPLICE, GETADDRINFO, REUSEPORT, FAST-FORWARD, SERVER-SSL-VERIFY-NONE These options are those that can be disabled on the command line. This way it is possible, from a reg-test for instance, to know if a feature is supported or not : feature cmd "$HAPROXY_PROGRAM -cc '!(globa.tune & GTUNE_NO_FAST_FWD)'"	2023-02-21 11:44:55 +01:00
Christopher Faulet	a1fdad784b	MINOR: cfgcond: Implement strstr condition expression Implement a way to match a substring in a string. The strstr expression can now be used to do so.	2023-02-21 11:44:55 +01:00
Christopher Faulet	2f7c82bfdf	BUG/MINOR: haproxy: Fix option to disable the fast-forward The option was renamed to only permit to disable the fast-forward. First there is no reason to enable it because it is the default behavior. Then it introduced a bug because there is no way to be sure the command line has precedence over the configuration this way. So, the option is now named "tune.disable-fast-forward" and does not support any argument. And of course, the command line option "-dF" now has precedence over the configuration. No backport needed.	2023-02-21 11:44:55 +01:00
Amaury Denoyelle	77ed63106d	MEDIUM: quic: trigger fast connection closing on process stopping With previous commit, quic-conn are now handled as jobs to prevent the termination of haproxy process. This ensures that QUIC connections are closed when all data are acknowledged by the client and there are no more active streams. The quic-conn layer emits a CONNECTION_CLOSE once the MUX has been released and all streams are acknowledged. Then, the timer is scheduled to definitely free the connection after the idle timeout period. This allows to treat late-arriving packets. Adjust this procedure to deactivate this timer when process stopping is in progress. In this case, quic-conn timer is set to expire immediately to free the quic-conn instance as soon as possible. This allows to quickly close haproxy process. This should be backported up to 2.7.	2023-02-20 11:20:18 +01:00
Amaury Denoyelle	eb7d320d25	MINOR: mux-quic: implement client-fin timeout Implement client-fin timeout for MUX quic. This timeout is used once an applicative layer shutdown has been called. In HTTP/3, this corresponds to the emission of a GOAWAY. This should be backported up to 2.7.	2023-02-20 11:20:18 +01:00
Amaury Denoyelle	b30247b16c	MINOR: mux-quic: define qc_shutdown() Factorize shutdown operation in a dedicated function qc_shutdown(). This will allow to call it from multiple places. A new flag QC_CF_APP_SHUT is also defined to ensure it will only be executed once even if called multiple times per connection. This commit will be useful to properly support haproxy soft stop. This should be backported up to 2.7.	2023-02-20 11:18:58 +01:00
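The "execute once even if called multiple times" guard mentioned in b30247b16c follows a common flag pattern, sketched below. The flag name, the struct and the counter are hypothetical stand-ins, loosely modelled on QC_CF_APP_SHUT, and do not reproduce the actual qc_shutdown() code.

```c
#include <assert.h>

/* Hypothetical flag, standing in for something like QC_CF_APP_SHUT. */
#define CONN_F_APP_SHUT 0x00000001u

/* Hypothetical connection type for this illustration. */
struct conn_sketch {
	unsigned int flags;
	int shutdown_frames_sent; /* e.g. a GOAWAY in HTTP/3 */
};

/* Performs the application-level shutdown at most once per connection. */
void conn_app_shutdown(struct conn_sketch *conn)
{
	if (conn->flags & CONN_F_APP_SHUT)
		return; /* already done by an earlier caller */

	conn->shutdown_frames_sent++;
	conn->flags |= CONN_F_APP_SHUT;
}
```

With such a guard, callers on different code paths (timeout, soft stop, release) can all invoke the shutdown without coordinating with each other.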
Frédéric Lécaille	2f531116ed	MINOR: quic: Add traces to qc_kill_conn() Very minor modification to help in debugging issues. Must be backported to 2.7.	2023-02-17 17:36:30 +01:00
Frédéric Lécaille	a2c62c3141	MINOR: quic: Kill the connections on ICMP (port unreachable) packet receipt The send*() syscalls which are responsible for the reception of such ICMP packets fail with ECONNREFUSED as errno. From the udp(7) man page: ECONNREFUSED No receiver was associated with the destination address. This might be caused by a previous packet sent over the socket. We must kill the underlying connection asap. Must be backported to 2.7.	2023-02-17 17:36:30 +01:00
Frédéric Lécaille	75c8ad5490	MINOR: quic: Move code to wakeup the timer task to avoid anti-amplification deadlock This code was there because the timer task was not running on the same thread as the one which parses the QUIC packets. Now that this is no longer the case, we can wake up this task directly. Must be backported to 2.7.	2023-02-17 17:36:30 +01:00
Frédéric Lécaille	1dbeb35f80	MINOR: quic: Add new traces about by connection RX buffer handling Move quic_rx_pkts_del() out of quic_conn.h to make it benefit from the TRACE API. Add traces which already helped in diagnosing an issue encountered with ngtcp2 which sent too many 1RTT packets before the handshake completion. This has been fixed here after having discussed with Tatsuhiro on QUIC dev slack: https://github.com/ngtcp2/ngtcp2/pull/663 Must be backported to 2.7.	2023-02-17 17:36:30 +01:00
Amaury Denoyelle	14037bf26f	MINOR: h3: add traces on decode_qcs callback Add traces inside h3_decode_qcs(). Every error path now has its dedicated trace which should simplify debugging. Each early return has been converted to a goto invocation. To complete the demux tracing, demux frame type and length are now printed using the h3s instance whenever it's possible on trace invocation. A new internal value H3_FT_UNINIT is used as a frame type to mark demuxing as inactive. This should be backported up to 2.7.	2023-02-17 17:31:52 +01:00
Amaury Denoyelle	381d8137e3	MINOR: h3/hq-interop: handle no data in decode_qcs() with FIN set Properly handle a STREAM frame with no data but the FIN bit set at the application layer. H3 and hq-interop decode_qcs() callbacks have been adjusted to not return early in this case. If the FIN bit is accepted, an HTX EOM must be inserted for the upper stream layer. If the FIN is rejected because the stream cannot be closed, a proper CONNECTION_CLOSE error will be triggered. A new utility function qcs_http_handle_standalone_fin() has been implemented in the qmux_http module. This allows to simply add the HTX EOM on qcs HTX buffer. If the HTX buffer is empty, an EOT is first added to ensure it will be transmitted above. This commit will allow to properly handle FIN notify through an empty STREAM frame. However, it is not sufficient as currently qcc_recv() skips the decode_qcs() invocation when the offset is already received. This will be fixed in the next commit. This should be backported up to 2.6 along with the next patch.	2023-02-17 16:25:00 +01:00
Willy Tarreau	3e820a1056	MINOR: threads: add flags to know if a thread is started and/or running Several times during debugging it has been difficult to find a way to reliably indicate if a thread had been started and if it was still running. It's really not easy because the elements we look at are not necessarily reliable (e.g. harmless bit or idle bit might not reflect what we think during a signal). And such notions can be subjective anyway. Here we define two thread flags, TH_FL_STARTED which is set as soon as a thread enters run_thread_poll_loop() and drops the idle bit, and another one, TH_FL_IN_LOOP, which is set when entering run_poll_loop() and cleared when leaving it. This should help init/deinit code know whether it's called from a non-initialized thread (i.e. tid must not be trusted), or shared functions know if they're being called from a running thread or from init/deinit code outside of the polling loop.	2023-02-17 16:01:34 +01:00
Christopher Faulet	d4eaa8af6b	MINOR: global: Add an option to disable the data fast-forward The new global option "tune.fast-forward" can be set to "off" to disable the data fast-forward. It is a debug option, thus it is internally marked as experimental. The directive "expose-experimental-directives" must be set first to use this one. By default, the data fast-forward is enabled. It could be useful to force the stream to be woken up when data are received, to be sure everything works fine in this case. The data fast-forward is an optimization. Everything must work without it. But some code may rely on the fact the stream will not be woken up. With this option, it is possible to spot some hidden bugs.	2023-02-17 10:17:02 +01:00
William Lallemand	44979ad680	BUG/MINOR: config: crt-list keywords mistaken for bind ssl keywords This patch fixes an issue in the "-dK" keywords dumper, which was mistakenly displaying the "crt-list" keywords for "bind ssl" keywords. The patch fixes the issue by dumping the "crt-list" keywords in its own section, and dumping the "bind" keywords which are in the "SSL" scope with a "bind ssl" prefix. This commit depends on the previous "MINOR: ssl: rename confusing ssl_bind_kws" commit. Must be backported in 2.6. Diff of the `./haproxy -dKall -q -c -f /dev/null` output before and after the patch in 2.8-dev4: \| @@ -190,30 +190,9 @@ listen \| use-fcgi-app \| bind <addr> accept-netscaler-cip +1 \| bind <addr> accept-proxy \| - bind <addr> allow-0rtt \| - bind <addr> alpn +1 \| bind <addr> backlog +1 \| - bind <addr> ca-file +1 \| - bind <addr> ca-ignore-err +1 \| - bind <addr> ca-sign-file +1 \| - bind <addr> ca-sign-pass +1 \| - bind <addr> ca-verify-file +1 \| - bind <addr> ciphers +1 \| - bind <addr> ciphersuites +1 \| - bind <addr> crl-file +1 \| - bind <addr> crt +1 \| - bind <addr> crt-ignore-err +1 \| - bind <addr> crt-list +1 \| - bind <addr> curves +1 \| bind <addr> defer-accept \| - bind <addr> ecdhe +1 \| bind <addr> expose-fd +1 \| - bind <addr> force-sslv3 \| - bind <addr> force-tlsv10 \| - bind <addr> force-tlsv11 \| - bind <addr> force-tlsv12 \| - bind <addr> force-tlsv13 \| - bind <addr> generate-certificates \| bind <addr> gid +1 \| bind <addr> group +1 \| bind <addr> id +1 \| @@ -225,48 +204,52 @@ listen \| bind <addr> name +1 \| bind <addr> namespace +1 \| bind <addr> nice +1 \| - bind <addr> no-ca-names \| - bind <addr> no-sslv3 \| - bind <addr> no-tls-tickets \| - bind <addr> no-tlsv10 \| - bind <addr> no-tlsv11 \| - bind <addr> no-tlsv12 \| - bind <addr> no-tlsv13 \| - bind <addr> npn +1 \| - bind <addr> prefer-client-ciphers \| bind <addr> process +1 \| bind <addr> proto +1 \| bind <addr> severity-output +1 \| bind <addr> shards +1 \| - bind 
<addr> ssl \| - bind <addr> ssl-max-ver +1 \| - bind <addr> ssl-min-ver +1 \| - bind <addr> strict-sni \| bind <addr> tcp-ut +1 \| bind <addr> tfo \| bind <addr> thread +1 \| - bind <addr> tls-ticket-keys +1 \| bind <addr> transparent \| bind <addr> uid +1 \| bind <addr> user +1 \| bind <addr> v4v6 \| bind <addr> v6only \| - bind <addr> verify +1 \| bind <addr> ssl allow-0rtt \| bind <addr> ssl alpn +1 \| bind <addr> ssl ca-file +1 \| + bind <addr> ssl ca-ignore-err +1 \| + bind <addr> ssl ca-sign-file +1 \| + bind <addr> ssl ca-sign-pass +1 \| bind <addr> ssl ca-verify-file +1 \| bind <addr> ssl ciphers +1 \| bind <addr> ssl ciphersuites +1 \| bind <addr> ssl crl-file +1 \| + bind <addr> ssl crt +1 \| + bind <addr> ssl crt-ignore-err +1 \| + bind <addr> ssl crt-list +1 \| bind <addr> ssl curves +1 \| bind <addr> ssl ecdhe +1 \| + bind <addr> ssl force-sslv3 \| + bind <addr> ssl force-tlsv10 \| + bind <addr> ssl force-tlsv11 \| + bind <addr> ssl force-tlsv12 \| + bind <addr> ssl force-tlsv13 \| + bind <addr> ssl generate-certificates \| bind <addr> ssl no-ca-names \| + bind <addr> ssl no-sslv3 \| + bind <addr> ssl no-tls-tickets \| + bind <addr> ssl no-tlsv10 \| + bind <addr> ssl no-tlsv11 \| + bind <addr> ssl no-tlsv12 \| + bind <addr> ssl no-tlsv13 \| bind <addr> ssl npn +1 \| - bind <addr> ssl ocsp-update +1 \| + bind <addr> ssl prefer-client-ciphers \| bind <addr> ssl ssl-max-ver +1 \| bind <addr> ssl ssl-min-ver +1 \| + bind <addr> ssl strict-sni \| + bind <addr> ssl tls-ticket-keys +1 \| bind <addr> ssl verify +1 \| server <name> <addr> addr +1 \| server <name> <addr> agent-addr +1 \| @@ -591,6 +574,23 @@ listen \| http-after-response unset-var* \| userlist \| peers \| +crt-list \| + allow-0rtt \| + alpn +1 \| + ca-file +1 \| + ca-verify-file +1 \| + ciphers +1 \| + ciphersuites +1 \| + crl-file +1 \| + curves +1 \| + ecdhe +1 \| + no-ca-names \| + npn +1 \| + ocsp-update +1 \| + ssl-max-ver +1 \| + ssl-min-ver +1 \| + verify +1 \| # List of registered CLI 
keywords: \| @!<pid> [MASTER] \| @<relative pid> [MASTER]	2023-02-16 16:14:37 +01:00
William Lallemand	af67806651	MINOR: ssl: rename confusing ssl_bind_kws The ssl_bind_kw structure is exclusively used for crt-list keyword, it must be named otherwise to remove the confusion. The structure was renamed ssl_crtlist_kws.	2023-02-16 16:03:45 +01:00
Amaury Denoyelle	15c74702d5	MINOR: quic: implement a basic "show quic" CLI handler Implement a basic "show quic" CLI handler. This command will be useful to display various information on all the active QUIC frontend connections. This work is heavily inspired by "show sess". Most notably, a global list of quic_conn has been introduced to be able to loop over them. This list is stored per thread in ha_thread_ctx. Also add three CLI handlers for "show quic" in order to allocate and free the command context. The dump handler runs on thread isolation. Each quic_conn is referenced using a back-ref to handle deletion during handler yielding. For the moment, only a list of raw quic_conn pointers is displayed. The handler will be completed over time with more information as needed. This should be backported up to 2.7.	2023-02-09 18:11:00 +01:00
Aurelien DARRAGON	3e7a0bb70b	MINOR: cfgparse/server: move (min/max)conn postparsing logic into dedicated function In check_config_validity() function, we performed some consistency checks to adjust minconn/maxconn attributes for each declared server. We move this logic into a dedicated function named srv_minmax_conn_apply() to be able to perform those checks later in the process life when needed (ie: dynamic servers)	2023-02-08 14:48:21 +01:00
William Lallemand	a14686d096	MINOR: ssl/ocsp: add a function to check the OCSP update configuration Deduplicate the code which checks the OCSP update in the ckch_store and in the crtlist_entry. Also, jump immediately to error handling when ERR_FATAL is caught.	2023-02-08 11:40:31 +01:00
William Lallemand	b4b9caa65f	BUILD: ssl/ocsp: ssl_ocsp-t.h depends on ssl_sock-t.h ssl_ocsp-t.h uses SSL_SOCK_NUM_KEYTYPES which is defined in ssl_sock-t.h. No backport needed.	2023-02-08 11:31:03 +01:00
Willy Tarreau	28360dc53f	MEDIUM: clock: force internal time to wrap early after boot GH issue #2034 clearly indicates yet another case of time roll-over that went badly. Issues that happen only once every 50 days are hard to detect and debug, and are usually reported more or less synchronized from multiple sources. This patch finally does what had long been planned but never done yet, which is to force the time to wrap early after boot so that any such remaining issue can be spotted quicker. The margin delay here is 20s (it may be changed by setting BOOT_TIME_WRAP_SEC to another value). This value seems sufficient to permit failed health checks to succeed and traffic to come in and possibly start to update some time stamps (accept dates in logs, freq counters, stick-tables expiration dates etc). It could theoretically be helpful to have this in 2.7, but as can be seen with the two patches below, we've already had incorrect use cases of the internal monotonic time when the wall-clock one was needed, so we could expect to detect other ones in the future. Note that this will not induce bugs, it will only make them happen much faster (i.e. no need to wait for 50 days before seeing them). If it were to eventually be backported, these two previous patches must also be backported: BUG/MINOR: clock: use distinct wall-clock and monotonic start dates BUG/MEDIUM: cache: use the correct time reference when comparing dates	2023-02-08 11:10:33 +01:00
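The early-wrap idea in 28360dc53f can be illustrated with a small sketch, assuming a 32-bit millisecond counter (which matches the "once every 50 days" roll-over mentioned above). The function names are hypothetical; only BOOT_TIME_WRAP_SEC and its 20s value come from the commit message.

```c
#include <assert.h>
#include <stdint.h>

#define BOOT_TIME_WRAP_SEC 20 /* margin named in the commit message */

/* Offset to add to a 32-bit millisecond counter so that the adjusted
 * counter wraps BOOT_TIME_WRAP_SEC seconds after <boot_ms>. Unsigned
 * arithmetic is modulo 2^32, which is exactly what is wanted here.
 */
uint32_t boot_wrap_offset(uint32_t boot_ms)
{
	return (uint32_t)0 - boot_ms - (uint32_t)(BOOT_TIME_WRAP_SEC * 1000);
}

/* Wrap-safe "a is strictly before b", valid while both ticks are less
 * than 2^31 ms apart. Code comparing raw values with '<' instead is the
 * kind of bug an early wrap is meant to reveal.
 */
int tick_is_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}
```

With this offset, the adjusted clock starts 20 seconds below the 32-bit limit, so any code that mishandles the roll-over fails within the first minute of a test run rather than after weeks of uptime.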
Willy Tarreau	6093ba47c0	BUG/MINOR: clock: do not mix wall-clock and monotonic time in uptime calculation We've had a start date even before the internal monotonic clock existed, but once the monotonic clock was added, the start date was not updated to distinguish the wall clock time units and the internal monotonic time units. The distinction is important because both clocks do not necessarily progress at the same speed. The very rare occurrences of the wall-clock date are essentially for human consumption and communication with third parties (e.g. report the start date in "show info" for monitoring purposes). However currently this one is also used to measure the distance to "now" as being the process' uptime. This is actually not correct. It only works because for now the two dates are initialized at the exact same instant at boot but could still be wrong if the system's date shows a big jump backwards during startup for example. In addition the current situation prevents us from enforcing an arbitrary offset at boot to reveal some heisenbugs. This patch adds a new "start_time" at boot that is set from "now" and is used in uptime calculations. "start_date" instead is now set from "date" and will always reflect the system date for human consumption (e.g. in "show info"). This way we're now sure that any drift of the internal clock relative to the system date will not impact the reported uptime. This could possibly be backported though it's unlikely that anyone has ever noticed the problem.	2023-02-08 11:06:55 +01:00
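The separation between "start_date" and "start_time" in 6093ba47c0 can be summarized by the sketch below. The struct and function are hypothetical and use plain millisecond integers; they only show that the uptime is derived from the monotonic reference and never from the wall-clock one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pair of boot references, both in milliseconds. */
struct boot_ref {
	int64_t start_date_ms; /* wall-clock date at boot, for display only */
	int64_t start_time_ms; /* monotonic time at boot, for durations */
};

/* Uptime only depends on the monotonic clock, so a jump of the system
 * date (NTP step, manual change) cannot affect it.
 */
int64_t uptime_ms(const struct boot_ref *ref, int64_t now_mono_ms)
{
	return now_mono_ms - ref->start_time_ms;
}
```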
Frédéric Lécaille	b7a406ac34	MINOR: quic: Update version_information transport parameter to draft-14 This is necessary to make our stack negotiate the QUIC versions with clients. (See https://author-tools.ietf.org/iddiff?url1=draft-ietf-quic-version-negotiation-13&url2=draft-ietf-quic-version-negotiation-14&difftype=--html) Must be backported to 2.7.	2023-02-06 11:54:07 +01:00
Aurelien DARRAGON	e5958d0292	BUG/MEDIUM: stats: fix resolvers dump In ("BUG/MEDIUM: stats: Rely on a local trash buffer to dump the stats"), we forgot to apply the patch in resolvers.c which provides the stats_dump_resolvers() function that is involved when dumping with "resolvers" domain. As a consequence, resolvers dump was broken because stats_dump_one_line(), which is used in stats_dump_resolv_to_buffer(), implicitly uses trash_chunk from stats.c to prepare the dump, and stats_putchk() is then called with global trash (currently empty) as output data. Given that trash_dump variable is static and thus only available within stats.c we change stats_putchk() function prototype so that the function does not take the output buffer as an argument. Instead, stats_putchk() will implicitly use the local trash_dump variable declared in stats.c. It will also prevent further mixups between stats_dump_* functions and stats_putchk(). This needs to be backported with ("BUG/MEDIUM: stats: Rely on a local trash buffer to dump the stats")	2023-02-06 07:53:03 +01:00
Willy Tarreau	f2988e1447	CLEANUP: listener/thread: remove now unused bind_conf's bind_tgroup/bind_thread Not needed anymore since last commit, let's get rid of it.	2023-02-03 18:00:21 +01:00
Willy Tarreau	f0de8cacc4	MEDIUM: listener/config: make the "thread" parser rely on thread_sets Instead of reading and storing a single group and a single mask for a "thread" directive on a bind line, we now store the complete range in a thread set that's stored in the bind_conf. The bind_parse_thread() function now just calls parse_thread_set() to complete the current set, which starts empty, and thread_resolve_group_mask() was updated to support retrieving thread group numbers or absolute thread numbers directly from the pre-filled thread_set, and continue to feed bind_tgroup and bind_thread. The CLI parsers which were pre-initialized to set the bind_tgroup to 1 cannot do it anymore as it would prevent one from restricting the thread set. Instead check_config_validity() now detects the CLI frontend and passes the info down to thread_resolve_group_mask() that will automatically use only the group 1's threads for these listeners. The same is done for the peers listeners for now. At this step it's already possible to start with all previous valid configs as well as extended ones supporting comma-delimited thread sets. In addition the parser already accepts large ranges spanning multiple groups, but since the underlying listeners infrastructure is not ready, for now we're maintaining a specific check against this at the higher level of the config validity check. The patch is a bit large because thread resolution is performed in multiple steps, so we need to adjust all of them at once to preserve functional and technical consistency.	2023-02-03 18:00:21 +01:00
Willy Tarreau	bef43dfa60	MINOR: thread: add a simple thread_set API The purpose is to be able to store large thread sets, defined by ranges that may cross group boundaries, as well as define lists of groups and masks. The thread_set struct implements the storage, and the parser is in parse_thread_set(), with a focus on "bind" lines, but not only.	2023-02-03 18:00:21 +01:00
Willy Tarreau	9e2682afed	MINOR: listener: remove the now useless LI_F_QUIC_LISTENER flag This flag is only used to tag a QUIC listener, which we now know by its bind_conf's xprt as well. It's only used to decide whether or not to perform an extra initialization step on the listener. Let's drop it as well as the flags field. With the various fields and options moved, the listener struct reduced by 48 bytes total.	2023-02-03 18:00:20 +01:00
Willy Tarreau	b25634d23e	CLEANUP: listener: remove the now unused options field All options that made sense were moved to the bind_conf, and remaining ones were removed. This field isn't used at all anymore. The thr_idx field was moved there to plug the hole.	2023-02-03 18:00:20 +01:00
Willy Tarreau	4c1d3a953d	MINOR: listener: get rid of LI_O_TCP_L4_RULES and LI_O_TCP_L5_RULES LI_O_TCP_L4_RULES and LI_O_TCP_L5_RULES are only set from the proxy based on the presence or absence of tcp_req l4/l5 rules. It's basically as cheap to check the list as it is to check the flag, except that there is no need to maintain a copy. Let's get rid of them, and this may ease addition of more dynamic stuff later.	2023-02-03 18:00:20 +01:00
Willy Tarreau	1714680cec	MINOR: listener: move LI_O_UNLIMITED and LI_O_NOSTOP to bind_conf These two flags are entirely for internal use and are even per proxy in practice since they're used for peers and CLI to indicate (for the first one) that the listener(s) are not subject to connection limits, and for the second that the listener(s) should not be stopped on soft-stop. No need to keep them in the listeners, let's move them to the bind_conf under names BC_O_UNLIMITED and BC_O_NOSTOP.	2023-02-03 18:00:20 +01:00
Willy Tarreau	f1b4730f7d	MINOR: listener: move the ACC_PROXY and ACC_CIP options to bind_conf These are only set per bind line and used when creating a session, we can move them to the bind_conf under the names BC_O_ACC_PROXY and BC_O_ACC_CIP respectively.	2023-02-03 18:00:20 +01:00
Willy Tarreau	c492f1b17f	MINOR: listener: move TCP_FO to bind_conf It's set per bind line ("tfo") and only used in tcp_bind_listener() so there's no point keeping the address family tests, let's just store the flag in the bind_conf under the name BC_O_TCP_FO.	2023-02-03 18:00:20 +01:00
Willy Tarreau	d9b4d21248	MINOR: listener: move the DEF_ACCEPT option to the bind_conf This option is set per bind line, and was only stored when the address family is AF_INET4 or AF_INET6. That's pointless since it's used only in tcp_bind_listener() which is only used for such families as well, so it can now be moved to the bind_conf under the name BC_O_DEF_ACCEPT.	2023-02-03 18:00:20 +01:00
Willy Tarreau	9bdcf42922	MINOR: listener: move the NOQUICKACK option to the bind_conf It solely depends on the bind line so let's move it there under the name BC_O_NOQUICKACK.	2023-02-03 18:00:20 +01:00
Willy Tarreau	cfb7c2f515	MINOR: listener: move the NOLINGER option to the bind_conf It's currently declared per-frontend, though it would make sense to support it per-line but in no case per-listener. Let's move the option to a bind_conf option BC_O_NOLINGER.	2023-02-03 18:00:20 +01:00
Willy Tarreau	7dbd4187dc	MINOR: listener: move the nice field to the bind_conf This is another bind line setting which can move to the bind_conf. Note that it leaves a 2-byte hole in the listener struct.	2023-02-03 18:00:20 +01:00
Willy Tarreau	d5983cef80	MINOR: listener: remove the useless ->default_target field This field is used by stream_new() to optionally set the applet the stream will connect to for simple proxies like the CLI for example. But it has never been configurable to anything and is always strictly equal to the frontend's ->default_target. Let's just drop it and make stream_new() only use the frontend's. It makes more sense anyway as we don't want the proxy to work differently based on the "bind" line. This idea was brought in 1.6 hoping that the h2 implementation would use applets for decoding (which was dropped after the very first attempt in 1.8).	2023-02-03 18:00:20 +01:00
Willy Tarreau	3083615410	MINOR: listener: move the ->accept callback to the bind_conf The accept callback directly derives from the upper layer, generally it's session_accept_fd(). As such it's also defined per bind line so it makes sense to move it there.	2023-02-03 18:00:20 +01:00
Willy Tarreau	758c69d951	MINOR: listener: move the maxconn parameter to the bind_conf The maxconn is set per bind line so let's move it there. This might possibly even slightly reduce inter-thread contention since this one is read-mostly and it was stored next to nbconn which changes for each connection setup or teardown.	2023-02-03 18:00:20 +01:00
Willy Tarreau	1920f897d8	MINOR: listener: move the backlog setting from listener to bind_conf The backlog setting is also defined by the bind_conf, so let's move it there.	2023-02-03 18:00:20 +01:00
Willy Tarreau	882f2485a1	MINOR: listener: move maxaccept from listener to bind_conf Like for previous values, maxaccept is really per-bind_conf, so let's move it there. Some frontends (peers, log) set it to 1 so the assignment was slightly moved.	2023-02-03 18:00:20 +01:00
Willy Tarreau	ee378165fb	MINOR: listener: move maxseg and tcp_ut to bind_conf These two arguments were only set and only used with tcpv4/tcpv6. Let's just store them into the bind_conf instead of duplicating them for all listeners since they're fixed per "bind" line.	2023-02-03 18:00:20 +01:00
Willy Tarreau	7866e8e50d	MEDIUM: listener: move the analysers mask to the bind_conf When bind_conf were created, some elements such as the analysers mask ought to have moved there but that wasn't the case. Now that it's getting clearer that bind_conf provides all binding parameters and the listener is essentially a listener on an address, it's starting to get really confusing to keep such parameters in the listener, so let's move the mask to the bind_conf. We also take this opportunity for pre-setting the mask to the frontend's upon initialization. Now several loops have one less argument to take care of.	2023-02-03 18:00:20 +01:00
Frédéric Lécaille	0aa79953c9	BUG/MINOR: quic: Unchecked source connection ID The SCID (source connection ID) used by a peer (client or server) is sent into the long header of a QUIC packet in clear. But it is also sent into the transport parameters (initial_source_connection_id). As these latter are encrypted into the packet, one must check that these two pieces of information do not differ due to a packet header corruption. Furthermore as such a connection is unusable it must be killed and must stop processing RX/TX packets as soon as possible. Implement qc_kill_con() to flag a connection as unusable and to kill it asap by waking up the idle timer task to release the connection. Add a check to quic_transport_params_store() to detect that the SCIDs do not match and make it call qc_kill_con(). Add several tests about connection to be killed at several critical locations, especially in the TLS stack callback to receive CRYPTO data from or derive secrets, and before preparing packet after having received others. Must be backported to 2.6 and 2.7.	2023-02-03 17:55:55 +01:00
Frédéric Lécaille	af25a69c8b	MEDIUM: quic: Remove qc_conn_finalize() from the ClientHello TLS callbacks It is a bad idea to make the TLS ClientHello callback call qc_conn_finalize(). If this latter fails, this would generate a TLS alert and make the connection send packets whereas it is not functional. But qc_conn_finalize() job was to install the transport parameters sent by the QUIC listener. This installation cannot be done at any time. This must be done after having possibly negotiated the QUIC version and before sending the first Handshake packets. It seems the best moment to do that is when the Handshake TX secrets are derived. This has been found inspecting the ngtcp2 code. Calling SSL_set_quic_transport_params() too late would make the ServerHello to be sent without the transport parameters. The code for the connection update which was done from qc_conn_finalize() has been moved to quic_transport_params_store(). So, this update is done as soon as possible. Add QUIC_FL_CONN_TX_TP_RECEIVED to flag the connection as having received the peer transport parameters. Indeed this is required when the ClientHello message is split between packets. Add QUIC_FL_CONN_FINALIZED to protect the connection from calling qc_conn_finalize() more than one time. This latter is called only when the connection has received the transport parameters and after returning from SSL_do_handshake() which is the function which triggers the TLS ClientHello callback call. Remove the calls to qc_conn_finalize() from the TLS ClientHello callbacks. Must be backported to 2.6 and 2.7.	2023-02-03 17:55:55 +01:00
Frédéric Lécaille	9969adbcdc	MINOR: stats: add by HTTP version cumulated number of sessions and requests Add cum_sess_ver[] new array of counters to count the number of cumulated HTTP sessions by version (h1, h2 or h3). Implement proxy_inc_fe_cum_sess_ver_ctr() to increment these counters. This function is called each time an HTTP mux is correctly initialized. QUIC must first verify that the application operations for the mux are for h3 before calling proxy_inc_fe_cum_sess_ver_ctr(). ST_F_SESS_OTHER stat field for the cumulated number of sessions other than HTTP sessions is deduced from ->cum_sess_ver counter (for all the session, not only HTTP sessions) from which the HTTP sessions counters are subtracted. Add cum_req[] new array of counters to count the number of cumulated HTTP requests by version and others than HTTP requests. This new member replaces ->cum_req. Modify proxy_inc_fe_req_ctr() which increments these counters to pass an HTTP version, the special value 0 meaning "other than an HTTP request". This is the case for instance for syslog.c from which proxy_inc_fe_req_ctr() is called with 0 as version parameter. ST_F_REQ_TOT stat field computing for the cumulated number of requests is modified to count the sum of all the cum_req[] counters. As this patch is useful for QUIC, it must be backported to 2.7.	2023-02-03 17:55:49 +01:00
Willy Tarreau	1ea5f410ff	CLEANUP: quic: no need for atomics on packet refcnt This is a leftover from the implementation's history, but the quic_rx_packet and quic_tx_packet ref counts were still atomically updated. It was found in perf top that the cost of the atomic inc in quic_tx_packet_refinc() alone was responsible for 1% of the CPU usage at 135 Gbps. Given that packets are only processed on their assigned thread we don't need that anymore and this can be replaced with regular non-atomic operations. Doing this alone has reduced the CPU usage of qc_do_build_pkt() from 3.6 to 2.5% and increased the overall bit rate by about 1%.	2023-02-03 13:39:20 +01:00
Amaury Denoyelle	24d5b72ca9	MINOR: quic: add config for retransmit limit Define a new configuration option "tune.quic.max-frame-loss". This is used to specify the limit for which a single frame instance can be detected as lost. If exceeded, the connection is closed. This should be backported up to 2.7.	2023-02-03 11:56:46 +01:00
Amaury Denoyelle	e4abb1f2da	MEDIUM: quic: implement a retransmit limit per frame Add a <loss_count> new field in quic_frame structure. This field is set to 0 and incremented each time a sent packet is declared lost. If <loss_count> reached a hard-coded limit, the connection is deemed as failing and is closed immediately with a CONNECTION_CLOSE using INTERNAL_ERROR. By default, limit is set to 10. This should ensure that overall memory usage is limited if a peer behaves incorrectly. This should be backported up to 2.7.	2023-02-03 11:56:42 +01:00
Amaury Denoyelle	57b3eaa793	MINOR: quic: refactor frame deallocation Define a new function qc_frm_free() to handle frame deallocation. New BUG_ON() statements ensure that the deallocated frame is not referenced by other frame. To support this, all LIST_DELETE() have been replaced by LIST_DEL_INIT(). This should enforce that frame deallocation is robust. As a complement, qc_frm_unref() has been moved into quic_frame module. It is justified as this is a utility function related to frame deallocation. It allows to use it in quic_pktns_tx_pkts_release() before calling qc_frm_free(). This should be backported up to 2.7.	2023-02-03 11:55:41 +01:00
Amaury Denoyelle	40c24f1a10	MINOR: quic: define new functions for frame alloc Define two utility functions for quic_frame allocation : * qc_frm_alloc() is used to allocate a new frame * qc_frm_dup() is used to allocate a new frame by duplicating an existing one These functions are useful to centralize quic_frame initialization. Note that pool_zalloc() is replaced by a proper pool_alloc() + explicit initialization code. This commit will simplify implementation of the per frame retransmission limitation. Indeed, a new counter will be added in quic_frame structure which must be initialized to 0. This should be backported up to 2.7.	2023-02-03 10:44:26 +01:00
Amaury Denoyelle	2216b0866e	MINOR: quic: remove fin from quic_stream frame type A dedicated <fin> field was used in quic_stream structure. However, this info is already encoded in the frame type field as specified by QUIC protocol. In fact, only code for packet reception used the <fin> field. On the sending side, we only checked for the FIN bit. To align both sides, remove the <fin> field and only use the FIN bit. This should be backported up to 2.7.	2023-02-03 09:46:55 +01:00
Amaury Denoyelle	1e340ba6bc	MINOR: mux-quic/h3: define stream close callback Define a new qcc_app_ops callback named close(). This will be used to notify app-layer about the closure of a stream by the remote peer. Its main usage is to ensure that the closure is allowed by the application protocol specification. For the moment, close is not implemented by H3 layer. However, this function will be mandatory to properly reject a STOP_SENDING on the control stream and preventing a later crash. As such, this commit must be backported with the next one on 2.6. This is related to github issue #2006.	2023-01-30 15:56:25 +01:00
Aurelien DARRAGON	b2e2ec51b3	MEDIUM: proxy/http_ext: implement dynamic http_ext proxy http-only options implemented in http_ext were statically stored within proxy struct. We're making some changes so that http_ext are now stored in dynamically allocated structs. http_ext related structs are only allocated when needed to save some space whenever possible, and they are automatically freed upon proxy deletion. Related PX_O_HTTP{7239,XFF,XOT} option flags were removed because we're now considering an http_ext option as 'active' if it is allocated (ptr is not NULL) A few checks (and BUG_ON) were added to make these changes safe because it adds some (acceptable) complexity to the previous design. Also, proxy.http was renamed to proxy.http_ext to make things more explicit.	2023-01-27 15:18:59 +01:00
Aurelien DARRAGON	9ded834adc	OPTIM: http_ext/7239: introduce c_mode to save some space forwarded header option (rfc7239) deals with sample expressions in two steps: first a sample expression string is extracted from the config file and later in startup sequence this string is converted into the resulting sample_expr. We need to perform these two steps because we cannot compile the expr too early in the parsing sequence. (or we would miss some context) Because of this, we have two distinct structure members (expr and expr_s) for each 7239 field supporting sample expressions. This is not cool, because we're bloating the http forwarded config structure, and thus, bloating proxy config structure. To address this, we now merge both expr and expr_s members inside a single union to regain some space. This forces us to perform some additional logic to make sure to use the proper structure member at different parsing steps. Thanks to this, we're also able to free/release related config hints and sample expression strings as soon as the sample expression compilation is finished.	2023-01-27 15:18:59 +01:00
Aurelien DARRAGON	f958341610	MINOR: proxy: move 'originalto' option to http_ext Just like forwarded (7239) header and forwardfor header, move parsing, logic and management of 'originalto' option into http_ext dedicated class. We're only doing this to standardize proxy http options management. Existing behavior remains untouched.	2023-01-27 15:18:59 +01:00
Aurelien DARRAGON	730b9836a6	MINOR: proxy: move 'forwardfor' option to http_ext Just like forwarded (7239) header, move parsing, logic and management of 'forwardfor' option into http_ext dedicated class. We're only doing this to standardize proxy http options management. Existing behavior remains untouched.	2023-01-27 15:18:59 +01:00
Aurelien DARRAGON	b2bb9257d2	MINOR: proxy/http_ext: introduce proxy forwarded option Introducing http_ext class for http extension related work that doesn't fit into existing http classes. HTTP extension "forwarded", introduced with 7239 RFC is now supported by haproxy. The option supports various modes from simple to complex usages involving custom sample expressions. Examples : # Those servers want the ip address and protocol of the client request # Resulting header would look like this: # forwarded: proto=http;for=127.0.0.1 backend www_default mode http option forwarded #equivalent to: option forwarded proto for # Those servers want the requested host and hashed client ip address # as well as client source port (you should use seed for xxh32 if ensuring # ip privacy is a concern) # Resulting header would look like this: # forwarded: host="haproxy.org";for="_000000007F2F367E:60138" backend www_host mode http option forwarded host for-expr src,xxh32,hex for_port # Those servers want custom data in host, for and by parameters # Resulting header would look like this: # forwarded: host="host.com";by=_haproxy;for="[::1]:10" backend www_custom mode http option forwarded host-expr str(host.com) by-expr str(_haproxy) for for_port-expr int(10) # Those servers want random 'for' obfuscated identifiers for request # tracing purposes while protecting sensitive IP information # Resulting header would look like this: # forwarded: for=_000000002B1F4D63 backend www_for_hide mode http option forwarded for-expr rand,hex By default (no argument provided), forwarded option will try to mimic x-forward-for common setups (source client ip address + source protocol) The option is not available for frontends. no option forwarded is supported. 
More info about 7239 RFC here: https://www.rfc-editor.org/rfc/rfc7239.html More info about the feature in doc/configuration.txt This should address feature request GH #575 Depends on: - "MINOR: http_htx: add http_append_header() to append value to header" - "MINOR: sample: add ARGC_OPT" - "MINOR: proxy: introduce http only options"	2023-01-27 15:18:59 +01:00
Aurelien DARRAGON	832e9f4119	MINOR: proxy: introduce http only options This commit is inoffensive but will allow to do some code refactors in existing proxy http options. Newly created http related proxy options will also benefit from this.	2023-01-27 15:18:59 +01:00
Aurelien DARRAGON	5f7f5fe76a	MINOR: sample: add ARGC_OPT Add ARGC_OPT enum to provide more context for upcoming sample parse errors involving proxy "option" config directives.	2023-01-27 15:18:59 +01:00
Aurelien DARRAGON	38ebffaf10	MINOR: http_htx: add http_prepend_header() to prepend value to header Just like http_append_header(), but this time to insert new value before an existing one. If the header already contains one or multiple values, ',' is automatically inserted after the new value.	2023-01-27 15:18:59 +01:00
Aurelien DARRAGON	a5a8552cab	MINOR: http_htx: add http_append_header() to append value to header Calling this function as an alternative to http_replace_header_value() to append a new value to existing header instead of replacing the whole header content. If the header already contains one or multiple values: a ',' is automatically appended before the new value. This function is not meant for prepending (providing empty ctx value), in which case we should consider implementing dedicated prepend alternative function.	2023-01-27 15:18:59 +01:00
Willy Tarreau	271c440392	MINOR: h2: add h2_phdr_to_ist() to make ISTs from pseudo headers Till now pseudo headers were passed as const strings, but having them as ISTs will be more convenient for traces. This doesn't change anything for strings which are derived from them (and being constants they're still zero-terminated).	2023-01-26 15:49:43 +01:00
Willy Tarreau	b8b243ac6a	MINOR: trace: add the long awaited TRACE_PRINTF() TRACE_PRINTF() can be used to produce arbitrary trace contents at any trace level. It uses the exact same arguments as other TRACE_* macros, but here they are mandatory since they are followed by the format-string, though they may be filled with zeroes. The reason for the arguments is to match tracking or filtering and not pollute other non-inspected objects. It will probably be used inside loops, in which case there are three points to be careful about: - output atomicity is only per-message, so competing threads may see their messages interleaved. As such, it is recommended that the caller places a recognizable unique context at the beginning of the message such as a connection pointer. - iterating over arrays or lists for all requests could be very expensive. In order to avoid this it is best to condition the call via TRACE_ENABLED() with the same arguments, which will return the same decision. - messages longer than TRACE_MAX_MSG-1 (1023 by default) will be truncated. For example, in order to dump the list of HTTP headers between hpack and h2: if (outlen > 0 && TRACE_ENABLED(TRACE_LEVEL_DEVELOPER, H2_EV_RX_FRAME|H2_EV_RX_HDR, h2c->conn, 0, 0, 0)) { int i; for (i = 0; list[i].n.len; i++) TRACE_PRINTF(TRACE_LEVEL_DEVELOPER, H2_EV_RX_FRAME|H2_EV_RX_HDR, h2c->conn, 0, 0, 0, "h2c=%p hdr[%d]=%s:%s", h2c, i, list[i].n.ptr, list[i].v.ptr); } In addition, a lower-level TRACE_PRINTF_LOC() macro is provided, that takes two extra arguments, the caller's location and the caller's function name. This will allow to emit composite traces from central functions on the behalf of another one.	2023-01-26 15:49:43 +01:00
Willy Tarreau	4b36d5e8de	MINOR: trace: add a trace_no_cb() dummy callback for when to use no callback By default, passing a NULL cb to the trace functions will result in the source's default one to be used. For some cases we won't want to use any callback at all, not even the default one. Let's define a trace_no_cb() function for this, that does absolutely nothing.	2023-01-26 15:49:43 +01:00
Willy Tarreau	8f9a9704bb	MINOR: trace: add a TRACE_ENABLED() macro to determine if a trace is active Sometimes it would be necessary to prepare some messages, pre-process some blocks or maybe duplicate some contents before they vanish for the purpose of tracing them. However we don't want to do that for everything that is submitted to the traces, it's important to do it only for what will really be traced. The __trace() function has all the knowledge for this, to the point of even checking the lockon pointers. This commit splits the function in two, one with the trace decision logic, and the other one for the trace production. The first one is now usable through wrappers such as _trace_enabled() and TRACE_ENABLED() which will indicate whether traces are going to be produced for the current source, level, event mask, parameters and tracking.	2023-01-26 15:49:43 +01:00
Willy Tarreau	80f36b2ac2	CLEANUP: trace: remove the QUIC-specific ifdefs There are ifdefs at several places to only define TRC_ARGS_QCON when QUIC is defined, but nothing prevents this code from building without. Let's just remove those ifdefs, the single "if" they avoid is not worth the extra maintenance burden.	2023-01-26 15:49:43 +01:00
Amaury Denoyelle	71fd03632f	MINOR: mux-quic/h3: send SETTINGS as soon as transport is ready As specified by HTTP3 RFC, SETTINGS frame should be sent as soon as possible. Before this patch, this was only done on the first qc_send() invocation. This significantly delays SETTINGS emission until the first H3 response is ready to be transferred. This patch fixes this by ensuring SETTINGS is emitted when MUX-QUIC is being setup. As a side point, return value of finalize operation is checked. This means that an error during SETTINGS emission will cause the connection init to fail. This should be backported up to 2.7.	2023-01-25 16:01:55 +01:00
Willy Tarreau	7e70bfc8cb	MINOR: threads: add a thread_harmless_end() version that doesn't wait thread_harmless_end() needs to wait for rdv_requests to disappear so that we're certain to respect a harmless promise that possibly allowed another thread to proceed under isolation. But this doesn't work in a signal handler because a thread could be interrupted by the debug handler while already waiting for isolation and with rdv_request>0. As such this function could cause a deadlock in such a signal handler. Let's implement a specific variant for this, thread_harmless_end_sig(), that just resets the thread's bit and doesn't wait. It must of course not be used past a check point that would allow the isolation requester to return and see the thread as temporarily harmless then turning back on its promise. This will be needed to fix a race in the debug handler.	2023-01-19 19:22:17 +01:00
Willy Tarreau	b2f38c13d1	BUG/MINOR: thread: always reload threads_enabled in loops A few loops waiting for threads to synchronize such as thread_isolate() rightfully filter the thread masks via the threads_enabled field that contains the list of enabled threads. However, it doesn't use an atomic load on it. Before 2.7, the equivalent variables were marked as volatile and were always reloaded. In 2.7 they're fields in ha_tgroup_ctx[], and the risk that the compiler keeps them in a register inside a loop is not null at all. In practice when ha_thread_relax() calls sched_yield() or an x86 PAUSE instruction, it could be verified that the variable is always reloaded. If these are avoided (e.g. architecture providing neither solution), it's visible in asm code that the variables are not reloaded. In this case, if a thread exits just between the moment the two values are read, the loop could spin forever. This patch adds the required _HA_ATOMIC_LOAD() on the relevant threads_enabled fields. It must be backported to 2.7.	2023-01-19 19:22:17 +01:00
Amaury Denoyelle	7d78eff889	MINOR: h3: extend function for QUIC varint encoding Slightly adjust b_quic_enc_int(). This function is used to encode an integer as a QUIC varint in a struct buffer. A new parameter is added to the function API to specify the width of the encoded integer. By default, 0 should be used to ensure that the minimum space is used. Other valid values are 1, 2, 4 or 8. An error is reported if the width is not large enough. This new parameter will be useful when buffer space is reserved prior to encoding an unknown integer value. The maximum size of 8 bytes will be reserved and some data can be put after. When finally encoding the integer, the width can be requested to be 8 bytes. With this new parameter, a small refactoring of the function has been conducted to remove some useless internal variables. This should be backported up to 2.7. It will be mostly useful to implement H3 trailers encoding.	2023-01-19 15:09:01 +01:00
Remi Tricot-Le Breton	bb35e1f5aa	BUG/MINOR: ssl: Fix compilation with OpenSSL 1.0.2 (missing ECDSA_SIG_set0) This function was introduced in OpenSSL 1.1.0. Prior to that, the ECDSA_SIG structure was public. This function was used in commit `5a8f02ae` "BUG/MEDIUM: jwt: Properly process ecdsa signatures (concatenated R and S params)". This patch needs to be backported up to branch 2.5 alongside commit `5a8f02ae`.	2023-01-19 11:13:51 +01:00
William Lallemand	2edc6d0301	Revert "BUILD: ssl: add ECDSA_SIG_set0() for openssl < 1.1 or libressl < 2.7" This reverts commit d65791e26c12b57723f2feb7eacdbbd99601371b. Conflict with the patch which was originally written and lacks the BN_clear_free() and the NULL check.	2023-01-19 11:13:24 +01:00
Willy Tarreau	d65791e26c	BUILD: ssl: add ECDSA_SIG_set0() for openssl < 1.1 or libressl < 2.7 Commit `5a8f02ae6` ("BUG/MEDIUM: jwt: Properly process ecdsa signatures (concatenated R and S params)") makes use of ECDSA_SIG_set0() which only appeared in openssl-1.1.0 and libressl 2.7, and breaks the build before. Let's just do what it minimally does (only assigns the two fields to the destination). This will need to be backported where the commit above is, likely 2.5.	2023-01-19 10:57:00 +01:00
Frédéric Lécaille	21c4c9b854	MINOR: quic: Replace v2 draft definitions by those of the final 2 version This should finalize the support for the QUIC version 2. Must be backported to 2.7.	2023-01-17 16:35:20 +01:00
Frédéric Lécaille	12a0317fed	MINOR: quic: Add "no-quic" global option Add "no-quic" to "global" section to disable the use of QUIC transport protocol by all configured QUIC listeners. These are listeners with QUIC addresses on their "bind" lines. Internally, the socket addresses binding is skipped by protocol_bind_all() for receivers with <proto_quic4> or <proto_quic6> as protocol (see protocol struct). Add information about "no-quic" global option to the documentation. Must be backported to 2.7.	2023-01-17 16:35:20 +01:00
Willy Tarreau	e77f4306ba	BUG/MEDIUM: stconn: also consider SE_FL_EOI to switch to SE_FL_ERROR In se_fl_set_error() we used to switch to SE_FL_ERROR only when there is already SE_FL_EOS indicating that the read side is closed. But that is not sufficient, we need to consider all cases where no more reads will be performed on the connection, and as such also include SE_FL_EOI. Without this, some aborted connections during a transfer sometimes only stop after the timeout, because the ERR_PENDING is never promoted to ERROR. This must be backported to 2.7 and requires previous patch "CLEANUP: stconn: always use se_fl_set_error() to set the pending error".	2023-01-17 16:27:35 +01:00
Christopher Faulet	2e47e3a1cf	MINOR: htx: Add an HTX value for the extra field if payload length is unknown When the payload length cannot be determined, the htx extra field is set to the magical value ULLONG_MAX. It is not obvious. Thus a dedicated HTX value is now used. Now, HTX_UNKOWN_PAYLOAD_LENGTH must be used in this case, instead of ULLONG_MAX.	2023-01-13 11:51:11 +01:00
Christopher Faulet	4da82395d8	CLEANUP: http-ana: Remove HTTP_MSG_ERROR state This state is now unused. Thus it can be removed.	2023-01-13 11:22:13 +01:00
Christopher Faulet	71236dedb9	MINOR: http-ana: Add a function to set HTTP termination flags There is already a function to set termination flags but it is not well suited for HTTP streams. So a function, dedicated to the HTTP analysis, was added. This way, this new function will be called for HTTP analysers on error. And if the error is not caught at this stage, the generic function will still be called from process_stream(). Here, by default a PRXCOND error is reported and depending on the stream state, the reason will be set accordingly: * If the backend SC is in INI state, SF_FINST_T is reported on tarpit and SF_FINST_R otherwise. * SF_FINST_Q if the server connection is queued * SF_FINST_C in any connection attempt state (REQ/TAR/ASS/CONN/CER/RDY). Except for applets, a SF_FINST_R is reported. * Once the server connection is established, SF_FINST_H is reported while HTTP_MSG_DATA state on the response side. * SF_FINST_L is reported if the response is in HTTP_MSG_DONE state or higher and a client error/timeout was reported. * Otherwise SF_FINST_D is reported.	2023-01-13 09:45:23 +01:00
Willy Tarreau	6be8d09a61	OPTIM: global: move byte counts out of global and per-thread During multiple tests we've already noticed that shared stats counters have become a real bottleneck under large thread counts. With QUIC it's pretty visible, with qc_snd_buf() taking 2.5% of the CPU on a 48-thread machine at only 25 Gbps, and this CPU is entirely spent in the atomic increment of the byte count and byte rate. It's also visible in H1/H2 but slightly less since we're working with larger buffers, hence less frequent updates. These counters are exclusively used to report the byte count in "show info" and the byte rate in the stats. Let's move them to the thread_ctx struct and make the stats reader just collect each thread's stats when requested. That's way more efficient than competing on a single cache line. After this, qc_snd_buf has totally disappeared from the perf profile and tests made in h1 show roughly 1% performance increase on small objects.	2023-01-12 16:37:45 +01:00
Amaury Denoyelle	0a1154afb5	MINOR: mux-quic: use send-list for STOP_SENDING/RESET_STREAM emission When a STOP_SENDING or RESET_STREAM must be sent, its corresponding qcs is inserted into <qcc.send_list> via qcc_reset_stream() or qcc_abort_stream_read(). This allows to remove the iteration on full qcs tree in qc_send(). Instead, STOP_SENDING and RESET_STREAM is done in the loop over <qcc.send_list> as with STREAM frames. This should improve slightly the performance, most notably when large number of streams are opened. This must be backported up to 2.7.	2023-01-10 17:49:50 +01:00
Amaury Denoyelle	f9b03265f0	MEDIUM: h3: send SETTINGS before STREAM frames Complete qcc_send_stream() function to allow to specify if the stream should be handled in priority. Internally this will insert the qcs instance in front of <qcc.send_list> to be able to treat it before other streams. This functionality is useful when some QUIC streams should be sent before others. Most notably, this is used to guarantee that H3 SETTINGS is done first via the control stream. This must be backported up to 2.7.	2023-01-10 17:49:50 +01:00
Amaury Denoyelle	20f2a425ff	MAJOR: mux-quic: rework stream sending prioritization Implement a mechanism to register streams ready to send data in new STREAM frames. Internally, this is implemented with a new list <qcc.send_list> which contains qcs instances. A qcs can be registered safely using the new function qcc_send_stream(). This is done automatically in qc_send_buf() which covers most cases. Also, application layer is free to use it for internal usage streams. This is currently the case for H3 control stream with SETTINGS sending. The main point of this patch is to handle stream sending fairly. This is in stark contrast with previous code where streams with lower ID were always prioritized. This could cause other streams to be indefinitely blocked behind a stream which has a lot of data to transfer. Now, streams are handled in an order scheduled by se_desc layer. This commit is the first one of a series which will bring other improvements which also rely on the send_list implementation. This must be backported up to 2.7 when deemed sufficiently stable.	2023-01-10 17:49:50 +01:00
Christopher Faulet	da89e9b95b	MINOR: channel/applets: Stop to test CF_WRITE_ERROR flag if CF_SHUTW is enough In applets, we stop processing when a write error (CF_WRITE_ERROR) or a shutdown for writes (CF_SHUTW) is detected. However, any write error leads to an immediate shutdown for writes. Thus, it is enough to only test if CF_SHUTW is set.	2023-01-09 18:41:08 +01:00
Christopher Faulet	4b490b7517	MINOR: channel: Stop to test CF_READ_ERROR flag if CF_SHUTR is enough When a read error (CF_READ_ERROR) is reported, a shutdown for reads is always performed (CF_SHUTR). Thus, there is no reason to check if CF_READ_ERROR is set if CF_SHUTR is also checked.	2023-01-09 18:41:08 +01:00
Christopher Faulet	2357718217	MEDIUM: channel: Remove CF_READ_ATTACHED and report CF_READ_EVENT instead CF_READ_ATTACHED flag is only used in input events for stream analyzers, CF_MASK_ANALYSER. A read event can be reported instead and this flag can be removed. We must only take care to report a read event when the client connection is upgraded from TCP to HTTP.	2023-01-09 18:41:08 +01:00
Christopher Faulet	049fbcd36a	MINOR: channel: Remove CF_ANA_TIMEOUT and report CF_READ_EVENT instead It appears CF_ANA_TIMEOUT is a flag only used in CF_MASK_ANALYSER. All analyzer timeouts rely on the analysis expiration date (chn->analyse_exp). Worse, once set, this flag is never removed. Thus this flag can be removed and replaced by a read event (CF_READ_EVENT).	2023-01-09 18:41:08 +01:00
Christopher Faulet	a63f8f379f	MINOR: channel: Remove CF_WRITE_ACTIVITY Thanks to previous changes, CF_WRITE_ACTIVITY flags can be removed. Everywhere it was used, its value is now directly used (CF_WRITE_EVENT|CF_WRITE_ERROR).	2023-01-09 18:41:08 +01:00
Christopher Faulet	33e03cec5f	MINOR: channel: Remove CF_READ_ACTIVITY Thanks to previous changes, CF_READ_ACTIVITY flags can be removed. Everywhere it was used, its value is now directly used (CF_READ_EVENT|CF_READ_ERROR).	2023-01-09 18:41:08 +01:00
Christopher Faulet	d898841530	MEDIUM: channel: Use CF_WRITE_EVENT instead of CF_WRITE_PARTIAL Just like CF_READ_PARTIAL, CF_WRITE_PARTIAL is now merged with CF_WRITE_EVENT. There is a subtlety in sc_notify(). The "connect" event (formerly CF_WRITE_NULL) is now detected with (CF_WRITE_EVENT + sc->state < SC_ST_EST).	2023-01-09 18:41:08 +01:00
Christopher Faulet	285f7616ee	MEDIUM: channel: Use CF_READ_EVENT instead of CF_READ_PARTIAL CF_READ_PARTIAL flag is now merged with CF_READ_EVENT. It means CF_READ_EVENT is set when a read0 is received (formerly CF_READ_NULL) or when data are received (formerly CF_READ_ACTIVITY). There is nothing special here, except conditions to wake the stream up in sc_notify(). Indeed, the test was a bit changed to reflect recent change. read0 event is now formalized by (CF_READ_EVENT + CF_SHUTR).	2023-01-09 18:41:08 +01:00
Christopher Faulet	b96f2aa380	REORG: channel: Rename CF_WRITE_NULL to CF_WRITE_EVENT As for CF_READ_NULL, it appears CF_WRITE_NULL and other write events on a channel are mainly used to wake up the stream and may be replaced by one write event. In this patch, we introduce CF_WRITE_EVENT flag as a replacement to CF_WRITE_NULL. There is no breaking change for now, it is just a rename. Gradually, other write events will be merged with this one.	2023-01-09 18:41:08 +01:00
Christopher Faulet	6e1bbc446b	REORG: channel: Rename CF_READ_NULL to CF_READ_EVENT CF_READ_NULL flag is not really useful and used. It is a transient event used to wakeup the stream. As we will see, all read events on a channel may be resumed to only one and are all used to wake up the stream. In this patch, we introduce CF_READ_EVENT flag as a replacement to CF_READ_NULL. There is no breaking change for now, it is just a rename. Gradually, other read events will be merged with this one.	2023-01-09 18:41:08 +01:00
Willy Tarreau	5a72d03a58	MINOR: stick-table: implement the sc-add-gpc() action This action increments the General Purpose Counter at the index <idx> of the array associated to the sticky counter designated by <sc-id> by the value of either integer <int> or the integer evaluation of expression <expr>. Integers and expressions are limited to unsigned 32-bit values. If an error occurs, this action silently fails and the actions evaluation continues. <idx> is an integer between 0 and 99 and <sc-id> is an integer between 0 and 2. It also silently fails if the there is no GPC stored at this index. The entry in the table is refreshed even if the value is zero. The 'gpc_rate' is automatically adjusted to reflect the average growth rate of the gpc value. The main use of this action is to count scores or total volumes (e.g. estimated danger per source IP reported by the server or a WAF, total uploaded bytes, etc).	2023-01-07 09:11:22 +01:00
Willy Tarreau	6c0117168e	MEDIUM: stick-table: set the track-sc limit at boottime via tune.stick-counters The number of stick-counter entries usable by track-sc rules is currently set at build time. There is no good value for this since the vast majority of users don't need any, most need only a few and rare users need more. Adding more counters for everyone increases memory and CPU usages for no reason. This patch moves the per-session and per-stream arrays to a pool of a size defined at boot time. This way it becomes possible to set the number of entries at boot time via a new global setting "tune.stick-counters" that sets the limit for the whole process. When not set, the MAX_SESS_STR_CTR value still applies, or 3 if not set, as before. It is also possible to lower the value to 0 to save a bit of memory if not used at all. Note that a few low-level sample-fetch functions had to be protected due to the ability to use sample-fetches in the global section to set some variables.	2023-01-06 18:08:49 +01:00
Christopher Faulet	61aded057d	BUG/MAJOR: buf: Fix copy of wrapping output data when a buffer is realigned There is a bug in b_slow_realign() function when wrapping output data are copied in the swap buffer. block1 and block2 sizes are inverted. Thus blocks with a wrong size are copied. It leads to data mixing if the first block is in reality larger than the second one or to a copy of data outside the buffer if the first block is smaller than the second one. The bug was introduced when the buffer API was refactored in 1.9. It was found by a code review and seems never to have been triggered in almost 5 years. However, we cannot exclude it is responsible for some unresolved bugs. This patch should fix issue #1978. It must be backported as far as 2.0.	2023-01-05 09:34:49 +01:00
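The kind of two-block copy this fix is about can be sketched in C as below. This is not the code of b_slow_realign(): struct ring and ring_linearize() are hypothetical names for a simplified ring buffer, used only to show how the two block sizes relate when wrapping data is copied into a linear swap area.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified ring buffer: <data> bytes of content start at
 * offset <head> in a storage area of <size> bytes and may wrap.
 */
struct ring {
	char  *area;
	size_t size;
	size_t head;
	size_t data;
};

/* Copy the possibly wrapping content of <r> into <swap> (at least r->size
 * bytes) so that it becomes contiguous. Returns the number of bytes copied.
 */
size_t ring_linearize(const struct ring *r, char *swap)
{
	size_t block1, block2;

	assert(r->head < r->size && r->data <= r->size);

	/* block1: bytes from <head> up to the end of the area */
	block1 = r->data;
	block2 = 0;
	if (block1 > r->size - r->head) {
		/* block2: remaining bytes, wrapped to the start of the area */
		block2 = block1 - (r->size - r->head);
		block1 -= block2;
	}

	memcpy(swap, r->area + r->head, block1);
	memcpy(swap + block1, r->area, block2);
	return block1 + block2;
}
```

If the two sizes were exchanged in the memcpy() calls, the first copy would either stop short of the end of the area or read past it, which matches the two failure modes described in the commit message.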
Willy Tarreau	6e70a3986c	BUILD: makefile: only consider settings from enabled options Due to the previous SSL exception we couldn't restrict the collected CFLAGS/LDFLAGS to those of enabled options, so all of them were considered if set. The problem is that it would prevent simply disabling a build option without unsetting its xxx_CFLAGS or _LDFLAGS values if those had incompatible values (e.g. -lfoo). Now that only existing options are listed in collect_opts_flags, we can safely check that the option is set and only consider its settings in this case. Thus OT_LDFLAGS will not be used if USE_OT is not set for example.	2022-12-23 17:01:55 +01:00
Willy Tarreau	6a2cd33509	BUILD: makefile: remove the special case of the SSL option By creating USE_SSL and enabling it when USE_OPENSSL is set, we can get rid of the special case that was made with it regarding cflags collect and when resetting options. The option doesn't need to be manually set, though in the future it might prove useful if other non-openssl API are supported.	2022-12-23 16:53:35 +01:00
Willy Tarreau	2b8d0978f3	BUILD: makefile: make all OpenSSL variants use the same settings It's getting complicated to configure includes and lib dirs for OpenSSL API variants such as WolfSSL, because some settings are common and others are specific but carry a prefix that doesn't match the USE_* rule scheme. This patch simplifies everything by considering that all SSL libs will use SSL_INC, SSL_LIB, SSL_CFLAGS and SSL_LDFLAGS. That's much more convenient. This works thanks to the settings collector which explicitly checks the SSL_* settings. When USE_OPENSSL_WOLFSSL is set, then USE_OPENSSL is implied, so that there's no need to duplicate maintenance effort.	2022-12-23 16:53:35 +01:00
Willy Tarreau	8fa2f49f24	BUILD: makefile: add a function to collect all options' CFLAGS/LDFLAGS The new function collect_opts_flags now scans all USE_* options defined in use_opts and appends the corresponding _CFLAGS and _LDFLAGS to OPTIONS_{C,LD}FLAGS respectively. This will be useful to get rid of all the individual concatenations to these variables.	2022-12-23 16:53:35 +01:00
Willy Tarreau	b14e89e322	BUILD: makefile: initialize all build options' variables at once A lot of _SRC, _INC, _LIB etc variables are set and expected to be initialized to an empty string by default. However, an in-depth review of all of them showed that WOLFSSL_{INC,LIB}, SSL_{INC,LIB}, LUA_{INC,LIB}, and maybe others were not always initialized and could sometimes leak from the environment and as such cause strange build issues when running from cascaded scripts that had exported them. The approach taken here consists in iterating over all USE_* options and unsetting any _SRC, _INC, _LIB, _CFLAGS and _LDFLAGS that follows the same name. For the few variable names that don't exactly match the build option (SSL & WOLFSSL), these ones are specifically added to the list. The few that were explicitly cleared in their own sections were just removed since not needed anymore. Note that an "undefine" command appeared in GNU make 3.82 but since we support older ones we can only initialize the variables to an empty string here. It's not a problem in practice. We're now certain that these variables are empty wherever they are used, and that it is possible to just append to them, or use them as-is.	2022-12-23 16:53:35 +01:00
Willy Tarreau	848362f2d2	BUILD: makefile: sort the features list The features list that appears in -vv appears in a random order, which always makes it a pain to look for certain features. Let's sort it.	2022-12-23 16:53:35 +01:00
Willy Tarreau	69e7b7f677	BUILD: makefile: move common options-oriented macros to include/make/options.mk Some macros and functions are barely understandable and are only used to iterate over known options from the use_opts list. Better assign them a name and move them into a dedicated file to clean the makefile a little bit. Now at least "use_opts" only appears once, where it is defined. This also allowed to completely remove the BUILD_FEATURES macro that caused some confusion until previous commit.	2022-12-23 16:53:35 +01:00
Amaury Denoyelle	663e872e3a	MEDIUM: mux-quic: implement STOP_SENDING emission Implement STOP_SENDING. This is divided in two main functions : * qcc_abort_stream_read() which can be used by application protocol to request a STOP_SENDING. This sets the flag QC_SF_READ_ABORTED. * qcs_send_reset() is a static function called after the preceding one. It will send a STOP_SENDING via qcc_send(). QC_SF_READ_ABORTED flag is now properly used : if activated on a stream during qcc_recv(), <qcc.app_ops.decode_qcs> callback is skipped. Also, abort reading on unknown unidirectional remote stream is now fully supported with the emission of a STOP_SENDING as specified by RFC 9000. This commit is part of implementing H3 errors at the stream level. This will allow the H3 layer to request the peer to close its endpoint for an error on a stream. This should be backported up to 2.7.	2022-12-22 16:38:16 +01:00
Amaury Denoyelle	5854fc08cc	MINOR: mux-quic: handle RESET_STREAM reception Implement RESET_STREAM reception by mux-quic. On reception, qcs instance will be marked as remotely closed and its Rx buffer released. The stream layer will be flagged on error if still attached. This commit is part of implementing H3 errors at the stream level. Indeed, on H3 stream errors, STOP_SENDING + RESET_STREAM should be emitted. The STOP_SENDING will in turn generate a RESET_STREAM by the remote peer which will be handled thanks to this patch. This should be backported up to 2.7.	2022-12-22 16:38:04 +01:00
Amaury Denoyelle	a473f196f1	MEDIUM: mux-quic: implement shutw Implement mux_ops shutw operation for QUIC mux. A RESET_STREAM is emitted unless the stream is already closed due to all data or RESET_STREAM already transmitted. This operation is notably useful when upper stream layer wants to close the connection early due to an error. This was tested by using an HTTP server which listens with PROXY protocol support. The corresponding server line in the haproxy configuration deliberately does not specify send-proxy. This causes the server to close the connection abruptly. Without this patch, nothing was done on the QUIC stream which was kept open until the whole connection is closed. Now, a proper RESET_STREAM is emitted to report the error. This should be backported up to 2.7.	2022-12-22 16:22:39 +01:00
William Lallemand	be6a873096	BUG/MINOR: httpclient/log: free of invalid ptr with httpclient_log_format free_proxy() must check if the ptr is not httpclient_log_format before trying to free p->conf.logformat_string. No backport needed.	2022-12-22 15:39:31 +01:00
Christopher Faulet	c960a3b60f	BUG/MINOR: pool/stats: Use ullong to report total pool usage in bytes in stats The same change was already performed for the cli. The stats applet and the prometheus exporter are also concerned. Both use the stats API and rely on pool functions to get total pool usage in bytes. pool_total_allocated() and pool_total_used() must return a 64-bit unsigned integer to avoid any wrapping around 4G. This may be backported to all versions.	2022-12-22 13:46:21 +01:00
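The wrap around 4G mentioned in this commit can be illustrated with a small C sketch. The structure and the two functions below are hypothetical and are not the implementation of pool_total_used(). They only contrast a 32-bit accumulator with a 64-bit one.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-pool usage record, for this sketch only. */
struct pool_usage {
	uint32_t used;   /* objects in use */
	uint32_t size;   /* object size in bytes */
};

/* 32-bit accumulator: silently wraps once the total reaches 4 GiB. */
uint32_t total_used_32(const struct pool_usage *p, size_t n)
{
	uint32_t total = 0;

	for (size_t i = 0; i < n; i++)
		total += p[i].used * p[i].size;
	return total;
}

/* 64-bit accumulator: widen before multiplying so nothing is truncated. */
uint64_t total_used_64(const struct pool_usage *p, size_t n)
{
	uint64_t total = 0;

	for (size_t i = 0; i < n; i++)
		total += (uint64_t)p[i].used * p[i].size;
	return total;
}
```

With two pools of 3 GiB each, the 32-bit version reports 2 GiB while the 64-bit version reports the expected 6 GiB.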
Remi Tricot-Le Breton	c8d814ed63	MINOR: ssl: Move OCSP code to a dedicated source file This is a simple cleanup that moves OCSP related code to a dedicated file instead of interlacing it in some pure ssl connection code.	2022-12-21 11:21:07 +01:00
Remi Tricot-Le Breton	6477bbd78d	MEDIUM: ssl: Add ocsp update task main function This patch contains the main function of the ocsp auto update mechanism as well as an init and destroy function of the task used for this. The task is not created in this patch but in a later one. The function has two distinct parts and the branching to one or the other is completely based on the fact that the cur_ocsp pointer of the ssl_ocsp_task_ctx member is set. If the pointer is not set, we need to look at the first item of the update tree and see if it needs to be updated. If it does not we simply wait until the time is right and leave the task asleep. If it does need to be updated, we simply build and send the corresponding ocsp request thanks to the http_client. The task is then sent to sleep with an expire time set to infinity. The http_client will wake it back up once the response is received (or a timeout occurs). Just note that during this whole process the certificate_ocsp object corresponding to the entry being updated is taken out of the update tree and only stored in the ssl_ocsp_task_ctx context. Once the task is woken up by the http_client, it branches on the response processing part of the function which basically checks that the response is valid and inserts it into the ocsp_response tree. The task then goes back to sleep until another entry needs to be updated.	2022-12-21 11:21:07 +01:00
Remi Tricot-Le Breton	fb2b9988e8	MINOR: ssl: Store 'ocsp-update' mode in the ckch_data and check for inconsistencies The 'ocsp-update' option is parsed at the same time as all the other bind line options but it does not actually have anything to do with the bind line since it concerns the frontend certificate instead. For that reason, we should have a means to identify inconsistencies in the configuration and raise an error when a given certificate has two different ocsp-update modes specified in one or more crt-lists. The simplest way to do it is to store the ocsp update mode directly in the ckch and not only in the ssl_bind_conf.	2022-12-21 11:21:07 +01:00
Remi Tricot-Le Breton	03c5ffff8e	MINOR: ssl: Add crt-list ocsp-update option This option will define how the ocsp update mechanism behaves. The option can either be set to 'on' or 'off' and can only be specified in a crt-list entry so that we ensure that it concerns a single certificate. The 'off' mode is the default one and corresponds to the old behavior (no automatic update). When the option is set to 'on', we will try to get an ocsp response whenever an ocsp uri can be found in the frontend's certificate. The only limitation of this mode is that the certificate's issuer will have to be known in order for the OCSP certid to be built. This patch only adds the parsing of the option. The full functionality will come in a later commit.	2022-12-21 11:21:07 +01:00

6954 Commits