haproxy

mirror of http://git.haproxy.org/git/haproxy.git/ synced 2025-02-16 18:46:54 +00:00

Author	SHA1	Message	Date
Willy Tarreau	721ea5b06c	MINOR: mux-h2: count within a connection, how many streams are receiving data A stream is receiving data from after the HEADERS frame missing END_STREAM, to the end of the stream or HREM (the presence of END_STREAM). We're now adding a flag to the stream that indicates this state, as well as a counter in the connection of streams currently receiving data. The purpose will be to gauge at any instant the number of streams that might have to share the available bandwidth and buffers count in order not to allocate too much flow control to any single stream. For now the counter is kept up to date, and is reported in "show fd".	2024-10-12 16:29:16 +02:00
Willy Tarreau	c9275084bc	MEDIUM: mux-h2: start to introduce the window size in the offset calculation Instead of incrementing the last_max_ofs by the amount of received bytes, we now start from the new current offset to which we add the static window size. The result is exactly the same but it prepares the code to use a window size combined with an offset instead of just refilling the budget from what was received. It was even verified that changing h2_fe_settings_initial_window_size in the middle of a transfer using gdb does indeed allow the transfer speed to adapt accordingly.	2024-10-12 16:29:16 +02:00
Willy Tarreau	1cc851d9f2	MEDIUM: mux-h2: start to update stream when sending WU The rationale here is that we don't absolutely need to update the stream offset live, there's already the rcvd_s counter to remind us we've received data. So we can continue to exploit the current check points for this. Now we know that rcvd_s indicates the amount of newly received bytes for the stream since last call to h2c_send_strm_wu() so we can update our stream offsets within that function. The wu_s counter is set to the difference between next_adv_ofs and last_adv_ofs, which are resynchronized once the frame is sent. If the stream suddenly disappears with unacked data (aborted upload), the presence of the last update in h2c->wu_s is sufficient to let the connection ack the data alone, and upon subsequent calls with new rcvd_s, the received counter will be used to ack, like before. We don't need to do more anyway since the goal is to let the client abort ASAP when it gets an RST. At this point, the stream knows its current rx offset, the computed max offset and the last advertised one.	2024-10-12 16:29:16 +02:00
Willy Tarreau	eb0fe66c61	MINOR: mux-h2: create and initialize an rx offset per stream In H2, everything is accounted as budget. But if we want to moderate the rcv window that's not very convenient, and we'd rather have offsets instead so that we know where we are in the stream. Let's first add the fields to the struct and initialize them. The curr_rx_ofs indicates the position in the stream where next incoming bytes will be stored. last_adv_ofs tells what's the offset that was last advertised as the window limit, and next_max_ofs is the one that will need to be advertised, which is curr_rx_ofs plus the current window. next_max_ofs will have to cause a WINDOW_UPDATE to be emitted when it's higher than last_adv_ofs, and once the WU is sent, its value will have to be copied over last_adv_ofs. The problem is, for now wherever we emit a stream WU, we have no notion of stream (the stream might even not exist anymore, e.g. after aborting an upload), because we currently keep a counter of stream window to be acked for the current stream ID (h2c->dsi) in the connection (rcvd_s). Similarly there are a few places early in the frame header processing where rcvd_s is incremented without knowing the stream yet. Thus, lookups will be needed for that, unless such a connection-level counter remains used and poured into the stream's count once known (delicate). Thus for now this commit only creates the fields and initializes them.	2024-10-12 16:29:15 +02:00
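The offset bookkeeping described in the entries above can be sketched as follows. This is only an illustration under stated assumptions: the field names come from the commit text, but the `demo_rx` struct, the `demo_*` helpers and the choice to start with `last_adv_ofs` equal to the initial window are hypothetical and are not taken from the HAProxy sources.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-stream receive state; field names follow the commit text. */
struct demo_rx {
    uint64_t curr_rx_ofs;   /* position where the next incoming bytes will be stored */
    uint64_t last_adv_ofs;  /* window limit that was last advertised to the peer */
    uint64_t next_max_ofs;  /* window limit that will need to be advertised */
};

/* Assumption: the peer may initially send up to <window> bytes. */
static void demo_rx_init(struct demo_rx *rx, uint64_t window)
{
    rx->curr_rx_ofs  = 0;
    rx->last_adv_ofs = window;
    rx->next_max_ofs = window;
}

/* Account for <len> received bytes: the new limit is the current offset
 * plus the current window, not "old limit plus received bytes".
 */
static void demo_rx_received(struct demo_rx *rx, uint64_t len, uint64_t window)
{
    assert(rx->curr_rx_ofs + len <= rx->last_adv_ofs); /* peer stayed within the window */
    rx->curr_rx_ofs += len;
    rx->next_max_ofs = rx->curr_rx_ofs + window;
}

/* Return the WINDOW_UPDATE increment to emit (0 if none is needed) and
 * copy next_max_ofs over last_adv_ofs once it is considered sent.
 */
static uint64_t demo_rx_window_update(struct demo_rx *rx)
{
    uint64_t inc = 0;

    if (rx->next_max_ofs > rx->last_adv_ofs) {
        inc = rx->next_max_ofs - rx->last_adv_ofs;
        rx->last_adv_ofs = rx->next_max_ofs;
    }
    return inc;
}
```

With a constant window the increment equals the number of bytes received, which matches the "result is exactly the same" remark; changing the window between calls makes the next increment grow accordingly.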
Willy Tarreau	560e474cdd	MINOR: mux-h2: split the amount of rx data from the amount to ack We'll need to keep track of the total amount of data received for the current stream, and the amount of data to ack for the current stream, which might soon diverge as soon as we'll have to update the stream's offset with received data, which are different from those to be ACKed. One reason is that in case a stream doesn't exist anymore (e.g. aborted an upload), the rcvd_s info might get lost after updating the stream, so we do need to have an in-connection counter for that. What's done here is that the rcvd_s count is transferred to wu_s in h2c_send_strm_wu(), to be used as the counter to send, and both are considered as sufficient when non-null to call the function.	2024-10-12 16:29:15 +02:00
Willy Tarreau	8f09bdce10	MINOR: buffer: add a buffer list type with functions The buffer ring is problematic in multiple aspects, one of which being that it is only usable by one entity. With multiplexed protocols, we need to have shared buffers used by many entities (streams and connection), and the only way to use the buffer ring model in this case is to have each entity store its own array, and keep a shared counter on allocated entries. But even with the default 32 buf and 100 streams per HTTP/2 connection, we're speaking about 32*101*32 bytes = 103424 bytes per H2 connection, just to store up to 32 shared buffers, spread randomly in these tables. Some users might want to achieve much higher than default rates over high speed links (e.g. 30-50 MB/s at 100ms), which is 3 to 5 MB storage per connection, hence 180 to 300 buffers. There it starts to cost a lot, up to 1 MB per connection, just to store buffer indexes. Instead this patch introduces a variant which we call a buffer list. That's basically just a free list encoded in an array. Each cell contains a buffer structure, a next index, and a few flags. The index could be reduced to 16 bits if needed, in order to make room for a new struct member. The design permits initializing a whole freelist at once using memset(0). The list pointer is stored at a single location (e.g. the connection) and all users (the streams) will just have indexes referencing their first and last assigned entries (head and tail). This means that with a single table we can now have all our buffers shared between multiple streams, irrelevant to the number of potential streams which would want to use them. Now the 180 to 300 entries array only costs 7.2 to 12 kB, or 80 times less. Two large functions (bl_deinit() & bl_get()) were implemented in buf.c. A basic doc was added to explain how it works.	2024-10-12 16:29:15 +02:00
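A minimal sketch of "a free list encoded in an array" that can be initialized with memset(0) is shown below. It is not the buf.c implementation: the `demo_cell` layout, the `demo_*` helpers, the use of cell 0 as the list head and the convention that a zero `next` means "the adjacent cell" are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical cell: a stand-in buffer, a next index and a few flags. */
struct demo_cell {
    void *area;          /* placeholder for a real buffer structure */
    size_t size;
    unsigned int next;   /* next free cell; 0 means "the cell right after me" */
    unsigned int flags;  /* bit 0 set while the cell is allocated */
};

/* Resolve the implicit encoding: a zeroed <next> designates index i + 1. */
static unsigned int demo_next(const struct demo_cell *c, unsigned int i)
{
    return c[i].next ? c[i].next : i + 1;
}

/* Cell 0 is the list head and is never handed out, so after memset(0)
 * cells 1..n-1 already form the free list. Assumes n >= 1.
 */
static void demo_init(struct demo_cell *c, unsigned int n)
{
    memset(c, 0, n * sizeof(*c));
}

/* Take one free cell; returns its index, or 0 when none is left. */
static unsigned int demo_get(struct demo_cell *c, unsigned int n)
{
    unsigned int idx = demo_next(c, 0);

    if (idx >= n)
        return 0;                    /* an index past the array marks the end */
    c[0].next = demo_next(c, idx);   /* unlink the cell from the free list */
    c[idx].next = 0;
    c[idx].flags = 1;
    return idx;
}

/* Give a cell back by pushing it at the front of the free list. */
static void demo_put(struct demo_cell *c, unsigned int idx)
{
    c[idx].next = demo_next(c, 0);
    c[idx].flags = 0;
    c[0].next = idx;
}
```

Because users only hold indexes into one shared array, the table size depends on the number of buffers rather than on the number of streams, which is the saving the commit message quantifies.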
Willy Tarreau	ac66df4e2e	REORG: buffers: move some of the heavy functions from buf.h to buf.c Over time, some of the buffer management functions grew quite a bit, and were still forced to remain inlined since all defined in buf.h. Let's create buf.c and move the heaviest ones there. All those moved here were above 200 bytes.	2024-10-12 16:29:15 +02:00
Willy Tarreau	d288ddb575	CLEANUP: muxes: remove useless inclusion of ebmbtree.h Since 2.7 with commit `8522348482` ("BUG/MAJOR: conn-idle: fix hash indexing issues on idle conns"), we've been using eb64 trees and not ebmb trees anymore, and later we dropped all that to centralize the operations in the server. Let's remove the ebmbtree.h includes from the muxes that do not use them.	2024-10-12 16:29:15 +02:00
Willy Tarreau	cf3fe1eed4	MINOR: mux-h2/traces: print the size of the DATA frames DATA frames produce a special trace with the amount of transferred data in arg4, but this was not reported by h2_trace(). This commit just adds it.	2024-10-12 16:29:15 +02:00
Willy Tarreau	af064b497a	BUG/MINOR: mux-h2/traces: present the correct buffer for trailers errors traces The local "rxbuf" buffer was passed to the trace instead of h2s->rxbuf that is used when decoding trailers. The impact is essentially the impossibility to present some buffer contents in some rare cases. It may be backported but it's unlikely that anyone will ever notice the difference.	2024-10-12 16:29:15 +02:00
Willy Tarreau	0fa654ca92	BUILD: cache: silence an uninitialized warning at -Og with gcc-12.2 Building with gcc-12.2 -Og yields this incorrect warning in cache.c: In function 'release_entry_unlocked', inlined from 'http_action_store_cache' at src/cache.c:1449:4: src/cache.c:330:9: warning: 'object' may be used uninitialized [-Wmaybe-uninitialized] 330 \| release_entry(cache, entry, 1); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ src/cache.c: In function 'http_action_store_cache': src/cache.c:1200:29: note: 'object' was declared here 1200 \| struct cache_entry object, old; \| ^~~~~~ This is wrong, the only way to reach the function is with first!=NULL and the gotos that reach there are all those made with first==NULL. Let's just preset object to NULL to silence it.	2024-10-12 16:28:54 +02:00
William Lallemand	edf85a1d76	MINOR: cfgparse: simulate long configuration parsing with force-cfg-parser-pause This command is pausing the configuration parser for <timeout> milliseconds. This is useful for development or for testing timeouts of init scripts, particularly to simulate a very long reload. It requires the expose-experimental-directives to be set.	2024-10-11 17:40:37 +02:00
Amaury Denoyelle	232083c3e5	BUG/MEDIUM: mux-quic: ensure timeout server is active for short requests If a small request is received on QUIC MUX frontend, it can be transmitted directly with the FIN on attach operation. rcv_buf is skipped by the stream layer. Thus, it is necessary to ensure that there is similar behavior when FIN is reported either on attach or rcv_buf. One difference was that se_expect_data() was called only for rcv_buf but not on attach. The most obvious effect is that stream timeout was deactivated for this request: client timeout was disabled on EOI but the server one was not armed due to previous se_expect_no_data(). This prevents the early closure of too long requests. To fix this, add an invocation of se_expect_data() on attach operation. This bug can simply be detected using httpterm with delay request (for example /?t=10000) and using smaller client/server timeouts. The bug is present if the request is not aborted on timeout but instead continues until its proper HTTP 200 termination. This has been introduced by the following commit: `85eabfbf67` MEDIUM: mux-quic: Don't expect data from server as long as request is unfinished This must be backported up to 2.8.	2024-10-10 17:20:39 +02:00
Aurelien DARRAGON	7144e60cd2	MINOR: sample: postresolve sink names in debug() converter debug() converter used to resolve sink names during parsing time. Because of this, we were unable to specify sink names that were defined after the debug() converter was placed. Like in the previous commit, let's implement proper postparsing for the debug() converter, in order to be able to use sink names that are about to be defined later in the config file.	2024-10-10 16:55:15 +02:00
Aurelien DARRAGON	ed266589b6	MINOR: trace: postresolve sink names A previous known limitation about traces was that parsing was performed on the fly, meaning that when using "sink" keyword, only sinks that were either internal or previously defined in the config could be used. Indeed, it was not possible to use a ring section defined AFTER the traces section when using the 'sink' keyword from traces. This limitation was also mentioned in the config file. Let's get rid of that limitation by implementing proper postparsing for the sink parameter in traces section. To do this, make use of the new sink_find_early() helper to start referencing sink by their names even if they don't exist yet (if they are about to be defined later in the config) Traces commands on the cli are not concerned by this change.	2024-10-10 16:55:15 +02:00
Aurelien DARRAGON	1bdf6e884a	MEDIUM: sink: implement sink_find_early() sink_find_early() is a convenient function that can be used instead of sink_find() during parsing time in order to try to find a matching sink even if the sink is not defined yet. Indeed, if the sink is not defined, sink_find_early() will try to create it and mark it as forward-declared. It will also save information from the caller to better identify it in case of errors. If the sink happens to be found in the config, it will transition from forward-declared type to its final type. Else, it means that the sink was not found in the config; in this case, during postresolve, we raise an error to indicate that the sink was not found in the configuration. It should help solve postresolving issues with rings, because for now only log targets implement proper ring postresolving, but rings may be used at different places in the code, such as debug() converter or in "traces" section.	2024-10-10 16:55:15 +02:00
Damien Claisse	ba7c03c18e	MINOR: ssl: disable server side default CRL check with WolfSSL Patch `64a77e3ea5` disabled CRL check when no CRL file was provided, but it only did it on bind side. Add the same fix in server context initialization side. This allows to enable peer verification (verify required) on a server using TLS, without having to provide a CRL file.	2024-10-10 09:31:19 +02:00
Amaury Denoyelle	456c3997b2	BUG/MEDIUM: quic: properly decount out-of-order ACK on stream release Out-of-order STREAM ACK are buffered in its related streambuf tree. On insertion, overlapping or contiguous ranges are merged together. The total size of buffered ack range is stored in <room> streambuf member and reported to QUIC MUX layer on streambuf release. The objective is to ensure QUIC MUX layer can allocate Tx buffers conveniently to preserve a good transfer throughput. Streamdesc is the overall container of many streambufs. It may also be released when its upper QCS instance is freed, after all stream data have been emitted. In this case, the active streambuf is also released via custom code. However, in this code path, <room> was not reported to the QUIC MUX layer. This bug caused a wrong estimation for the QUIC MUX txbuf window, with bytes remaining even after all ACK reception. This may cause transfer freeze on other connection streams, with RESET_STREAM emission on timeout client. To fix this, reuse the existing qc_stream_buf_release() function on streamdesc release. This ensures that notify_room is correctly used. No need to backport.	2024-10-09 17:47:16 +02:00
Amaury Denoyelle	f0049d0748	BUG/MINOR: quic: fix discarding of already stored out-of-order ACK To properly decount out-of-order acked data range, contiguous or overlapping ranges are first merged before their insertion in a tree. The first step ensures that a newly reported range is not completely covered by the existing tree ranges. However, one of the conditions was incorrect. Fix this to ensure that the final range tree does not contain duplicated entries. The impact of this bug is unknown. However, it may have allowed the insertion of overlapping ranges, which could in turn cause an error in QUIC MUX txbuf window, with a possible transfer freeze. No need to backport.	2024-10-09 17:32:30 +02:00
Aurelien DARRAGON	f88f162868	BUG/MEDIUM: hlua: properly handle sample func errors in hlua_run_sample_{fetch,conv}() To execute sample fetches and converters from Lua, the hlua API leverages the sample API. Prior to executing the sample func, the arg checker is called from hlua_run_sample_{fetch,conv}() to detect potential errors. However, hlua_run_sample_{fetch,conv}() both pass NULL as <err> argument, but it is wrong for two reasons. First we miss an opportunity to report precise error messages to help the user know what went wrong during the check, and more importantly, some val check functions consider that the <err> pointer is never NULL. This is the case for example with check_crypto_hmac(). Because of this, when such val check functions encounter an error, they will crash the process because they will try to de-reference NULL. This bug was discovered and reported by GH user @JB0925 on #2745. Perhaps val check functions should make sure that the provided <err> pointer is != NULL prior to de-referencing it. But since there are multiple occurrences found in the code and the API isn't clear about that, it is easier to fix the hlua part (caller) for now. To fix the issue, let's always provide a valid <err> pointer when leveraging val_arg() check function pointer, and make use of it in case of error to report relevant message to the user before freeing it. It should be backported to all stable versions.	2024-10-08 12:00:42 +02:00
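The caller-side fix described in the entry above follows a common C pattern, sketched here with invented names. `demo_check_arg()` and `demo_run()` are hypothetical stand-ins, not the hlua or sample API; the only point illustrated is that the caller always passes a valid error pointer, reports the message, then frees it.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical checker which, like some val check functions, assumes
 * that <err> is never NULL and writes an allocated message through it.
 */
static int demo_check_arg(const char *arg, char **err)
{
    static const char msg[] = "missing argument";

    if (!arg || !*arg) {
        *err = malloc(sizeof(msg));   /* would crash here if err were NULL */
        if (*err)
            memcpy(*err, msg, sizeof(msg));
        return 0;
    }
    return 1;
}

/* Hypothetical caller: provides a valid <err>, uses it to build a precise
 * message for the user, then releases it. Returns 1 on success, 0 on error.
 */
static int demo_run(const char *arg, char *out, size_t outlen)
{
    char *err = NULL;

    assert(out != NULL && outlen > 0);
    if (!demo_check_arg(arg, &err)) {
        snprintf(out, outlen, "invalid args: %s", err ? err : "unknown error");
        free(err);
        return 0;
    }
    out[0] = '\0';
    return 1;
}
```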
Aurelien DARRAGON	d0e0105181	BUG/MEDIUM: hlua: make hlua_ctx_renew() safe hlua_ctx_renew() is called from unsafe places where the caller doesn't expect it to LJMP.. however hlua_ctx_renew() makes use of Lua library function that could potentially raise errors, such as lua_newthread(), and it does nothing to catch errors. Because of this, haproxy could unexpectedly crash. This was discovered and reported by GH user @JB0925 on #2745. To fix the issue, let's simply make hlua_ctx_renew() safe by applying the same logic implemented for hlua_ctx_init() or hlua_ctx_destroy(), which is catching Lua errors by leveraging SET_SAFE_LJMP_PARENT() helper. It should be backported to all stable versions.	2024-10-08 12:00:36 +02:00
Aurelien DARRAGON	3f4a788329	REGTESTS: add some tests for 'do-log' action Now that 'do-log' action may be used for all existing action contexts, let's add some tests in reg-tests/log/log_profile.vtc to ensure it works as expected. quic-ini is not tested as it may not be builtin depending on build options..	2024-10-04 21:38:19 +02:00
Aurelien DARRAGON	3ba924a4da	MINOR: action: add do-log action Thanks to the two previous commits, we can now expose the do-log action on all available action contexts, including the new quic-init context. Each context is responsible for exposing the do-log action by registering the relevant log steps, saving the identifier, and then storing it in the rule's context so that do_log_action() automatically uses it to produce the log during runtime. To use the feature, it is simply needed to use "do-log" (without argument) on an action directive, example: "tcp-request connection do-log". As mentioned before, each context where the action is exposed has its own log step identifier. Currently known identifiers are: quic-initial: quic-init; tcp-request connection: tcp-req-conn; tcp-request session: tcp-req-sess; tcp-request content: tcp-req-cont; tcp-response content: tcp-res-cont; http-request: http-req; http-response: http-res; http-after-response: http-after-res. Thus, these "additional" logging steps can be used as-is under log-profile section (after "on" keyword). However, although the parser will accept them, it makes no sense to use them with the "log-steps" proxy keyword, since the only path for these origins to trigger a log generation is through the explicit use of "do-log" action. This need was described in GH #401, it should help to conditionally trigger logs using ACL at specific key points, and may either be used alone or combined with "log-steps" to add additional log "trackers" during transaction handling. Documentation was updated and some examples were added.	2024-10-04 21:38:14 +02:00
Aurelien DARRAGON	0e271f1d2a	MINOR: log: add do_log_parse_act() helper func Function may be used from places where per-context actions are usually registered (tcp_act.c, http_act.c, quic_rules.c.. to name a few) in order to expose the do_log() action.	2024-10-04 21:38:08 +02:00
Aurelien DARRAGON	e63c7da508	MINOR: log: add do_log() logging helper do_log() is quite similar to sess_log() or strm_log(), except that it may be called at any time during session handling in an opportunistic way as long as the session exists (the stream may or may not exist). Also, it will try to emit the log as INFO by default, unless set-log-level is used on the stream, or error origin flag is set.	2024-10-04 21:38:02 +02:00
Amaury Denoyelle	f6599cf5a6	MEDIUM: quic: decount out-of-order ACK data range for MUX txbuf window This commit is the last one of a series whose objective is to restore QUIC transfer throughput performance to the state prior to the recent QUIC MUX buffer allocator rework. This gain is obtained by reporting received out-of-order ACK data range to the QUIC MUX which can then decount room in its txbuf window. This is implemented in QUIC streamdesc layer by adding a new invocation of notify_room callback. This is done into qc_stream_buf_store_ack() which handles out-of-order ACK data range. Previous commit has introduced merging of overlapping ACK data range. As such, it's easy to only report the newly acknowledged data range. As with in-order ACKs, this new notification is only performed on released streambuf. As such, when a streambuf instance is released, notify_room notification now also reports the total length of out-of-order ACK data range currently stored. This value is stored in a new streambuf member <room> to avoid unnecessary tree lookup. This <room> member also serves on in-order ACK notification to reduce the notified room. This prevents reporting invalid values when overlap ranges are treated first out-of-order and then in-order, which would cause an invalid QUIC MUX txbuf window value. After this change has been implemented, performance has been significantly improved, both with ngtcp2-client rate usage and on interop goodput test. These values are now similar to the rate observed on older haproxy version before QUIC MUX buffer allocator rework.	2024-10-04 18:09:51 +02:00
Amaury Denoyelle	ae3e768d32	MEDIUM: quic: merge contiguous/overlapping buffered ack stream range Transfer throughput was deteriorated since recent rework of QUIC MUX txbuf allocator. This was partially restored with the commit to decount individual in-order ACK from the MUX buffer window. To fully retrieve the old performance level, all ACKs must be decounted when handled by QUIC streamdesc layer, even out-of-order ranges. However, this is not easily implemented as several ranges may exist in parallel with overlap on the underlying data. It would cause miscalculation for QUIC MUX buffer window if such ranges were blindly reported. The proper solution is to first implement merge of contiguous or overlapping ACK data ranges to reduce the number of stored ranges to the minimal. This is the purpose of this patch. This is implemented in a new static function named qc_stream_buf_store_ack() into streamdesc layer. The merge algorithm is simple enough. First, it ensures the newly added range is not already fully covered by a preexisting entry. Then, it checks if there is contiguity/overlap with one or several ranges starting at the same or a greater offset. If true, the newly added entry is extended to cover them all, and all contiguous/overlapped ranges are removed. Finally, if there is contiguity or overlap with an entry starting at a smaller offset, no new range is instantiated and instead the smaller offset is extended. Now that contiguous or overlapped ranges cannot exist anymore, ACK data ranges tree instantiation can use EB_ROOT_UNIQUE. Outside of the longer term objective which is to decount out-of-order ACKs from MUX txbuf window, this commit could also improve some performance and/or memory usage for connections where stream data fragmentation and packet reordering is high.	2024-10-04 18:07:52 +02:00
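The merge of contiguous or overlapping ranges can be illustrated with a small sketch. To stay self-contained it swaps the tree used by the streamdesc layer for a sorted fixed-size array, and every name (`demo_range`, `demo_ranges`, `demo_store_ack()`, `DEMO_MAX_RANGES`) is hypothetical. It also returns the number of newly covered bytes, which is the kind of value a room notification would need.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MAX_RANGES 16

struct demo_range { uint64_t start, end; };   /* half-open: [start, end) */

struct demo_ranges {
    struct demo_range r[DEMO_MAX_RANGES];     /* sorted, disjoint, non-adjacent */
    int count;
};

/* Store [start, end), merging it with every stored range it overlaps or
 * touches. Returns the number of bytes that were not covered before
 * (0 if the range was already fully covered), or -1 if the array is full.
 */
static int64_t demo_store_ack(struct demo_ranges *s, uint64_t start, uint64_t end)
{
    uint64_t old_total = 0;
    int i = 0, j;

    /* ranges ending strictly before <start> cannot touch the new one */
    while (i < s->count && s->r[i].end < start)
        i++;

    /* absorb every range overlapping or contiguous with the new one */
    for (j = i; j < s->count && s->r[j].start <= end; j++) {
        if (s->r[j].start < start)
            start = s->r[j].start;
        if (s->r[j].end > end)
            end = s->r[j].end;
        old_total += s->r[j].end - s->r[j].start;
    }

    if (j == i) {
        /* nothing absorbed: open a slot at position i */
        if (s->count == DEMO_MAX_RANGES)
            return -1;
        memmove(&s->r[i + 1], &s->r[i], (size_t)(s->count - i) * sizeof(s->r[0]));
        s->count++;
    } else if (j > i + 1) {
        /* several ranges absorbed: keep slot i, drop the others */
        memmove(&s->r[i + 1], &s->r[j], (size_t)(s->count - j) * sizeof(s->r[0]));
        s->count -= j - i - 1;
    }
    s->r[i].start = start;
    s->r[i].end = end;
    return (int64_t)(end - start - old_total);
}
```

Since stored ranges never overlap after a merge, start offsets are unique, which is the property that lets a tree-based version reject duplicate keys.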
Amaury Denoyelle	e7578084b0	MINOR: quic: implement dedicated type for out-of-order stream ACK QUIC streamdesc layer is responsible to handle reception of ACK for streams. It removes stream data from the underlying buffers on ACK reception. Streamdesc layer treats ACK in order at the stream level. Out of order ACKs are buffered in a tree until they can be handled on older data acknowledgement reception. Previously, qf_stream instance which comes from the quic_tx_packet was used as tree node to buffer such ranges. Introduce a new type dedicated to represent out of order stream ack data range. This type is named qc_stream_ack. It contains minimal infos only relative to the acknowledged stream data range. This allows to reduce size of frequently used quic_frame with the removal of tree node from qf_stream. Another side effect of this change is that now quic_frame are always released immediately on ACK reception, both in-order and out-of-order. This allows to also release the quic_tx_packet instance which should reduce memory consumption. The drawback of this change is that qc_stream_ack instance must be allocated on out-of-order ACK reception. As such, qc_stream_desc_ack() may fail if an error happens on allocation. For the moment, such error is silently recovered up to qc_treat_rx_pkts() with the dropping of the received packet containing the ACK frame. In the future, it may be useful to close the connection as this error may only happen when memory is low.	2024-10-04 17:56:45 +02:00
Amaury Denoyelle	4ff87db5fe	MEDIUM: quic: decount acknowledged data for MUX txbuf window Recently, a new allocation mechanism was implemented for Tx buffers used by QUIC MUX. Now, underlying congestion window size is used to determine if it is still possible or not to allocate a new buffer when necessary. This mechanism has rendered the QUIC stack more flexible. However, it also has brought some performance degradation, with transfer time longer in certain environment. It was first discovered on the measurement results of the interop. It can also easily be reproduced using the following ngtcp2-client example which forces a very small congestion window due to frequent loss : $ ngtcp2-client -q --no-quic-dump --no-http-dump --exit-on-all-streams-close -r 0.1 127.0.0.1 20443 "https://[::]:20443/?s=10m" This performance decrease is caused by the allocator which is now too strict. It may cause buffer underrun frequently at the MUX layer when the congestion window is too small, as new buffers cannot be allocated until the current one is fully acknowledged. This results in transfers with very bad throughput utilisation. The objective of this new series of patches is to relax some restrictions to permit QUIC MUX to allocate new buffers more quickly, while preserving the initial limitation based on congestion window size. An interesting method for this is to notify QUIC MUX about newly available room on individual ACK reception, without waiting for the full buffer acknowledgement. This is easily implemented by adding a new notify_room invocation in QUIC streamdesc layer on ACK reception. However, ACK reception are handled in-order at the stream level. Out of order ACKs are buffered and are not decounted for now. This will be implemented in a future commit. Note that for a single buffer instance, data can in parallel be written by QUIC MUX and removed on ACK reception. This could cause room notification to QUIC MUX layer to report invalid values. As such, ACK reception are only accounted for released buffers. This ensures that such buffers won't receive any new data. In the same time, buffer room is notified on release operation as it does not need acknowledgement. This commit has improved performance for the ngtcp2-client scenario above. However, it is not yet sufficient for interop goodput test.	2024-10-04 17:31:26 +02:00
Amaury Denoyelle	324a49ed4d	MINOR: quic: strengthen qc_release_frm() quic_frame is the type used to represent frames emitted in a QUIC Tx packet. Each frame is attached to a packet, and can also be linked to other frames from the same packet, or duplicated frames for retransmission. As such, quic_frame free operation is a tedious process. qc_release_frm() has been implemented to ensure quic_frame is always properly freed after detaching from all its list attach point. One particular point is to ensure that when a frame is released, the frame origin and all origin copies, including the current <frm> are flagged as acked and detached from the reflist. Add a BUG_ON() to ensure this loop is properly conducted when dealing with the current <frm> instance.	2024-10-04 16:00:05 +02:00
Christopher Faulet	131b877565	BUG/MINOR: stats: Fix the name for the total number of streams created Because of a copy/paste error, CurrStreams was reused by mistake. It should be "CumStreams" No backports needed.	2024-10-04 15:44:40 +02:00
Amaury Denoyelle	c1d714156e	BUG/MAJOR: mux-quic: do not crash on empty STREAM frame emission Most of the time STREAM frames emitted by QUIC MUX have some data in it. However, it is possible to use an empty frame when a delayed FIN must be transferred. Recently, QUIC MUX send callback notification has been refactored. Now, this callback is blindly called by quic_conn lower layer each time a STREAM frame is built into a newly Tx packet. QUIC MUX is responsible to ensure the notified frame corresponds to newly emitted data or retransmission. Offsets are used for this comparison, but this requires special care for empty FIN frames. Sadly, the comparison written to determine if an empty FIN frame was sent for the first time or retransmitted is not correct. This caused such frame to always be dismissed as retransmission in QUIC MUX sent callback. This prevented the related QCS instance to be removed from the send_list, causing qcc_io_send() to retry a new emission. This was finally interrupted by the BUG_ON() assertion to prevent an infinite loop. Fix this crash by updating the condition in QUIC MUX send callback. For empty STREAM frame, it is sufficient to check if QC_SF_FIN_STREAM was already removed or not to detect a retransmission. Indeed, empty STREAM frames are never used outside of delayed FIN reporting. No need to backport. This crash was introduced in the current dev branch by the following commit. `d7f4e5abf0` MEDIUM: quic: strengthen MUX send notification	2024-10-04 11:31:11 +02:00
Willy Tarreau	7cdc9325a1	[RELEASE] Released version 3.1-dev9 Released version 3.1-dev9 with the following main changes : - MINOR: tools: add minimal file name management - CLEANUP: stick-table: make the file location point to a global file name - MINOR: proxy: use the global file names for conf->file - CLEANUP: cfgparse: factor proxy vs log-forward collisions - BUG/MINOR: cfgparse: detect another uncaught case of duplicate defaults - MINOR: proxy: add a list of orphaned defaults sections - MEDIUM: cfgparse: drop duplicate named defaults sections after use - OPTIM: cfgparse: speed up duplicate server detection - MEDIUM: cfgparse: warn about deprecated use of duplicate server names - BUG/MINOR: server: shut down streams under thread isolation - BUG/MINOR: proxy: also make the cli and resolvers use the global name - REGTESTS: log: fix log-profile.vtc - MEDIUM: mailers: warn about deprecated legacy mailers - BUG/MEDIUM: cli: Be sure to catch immediate client abort - DEV: flags/applet: decode appctx flags - BUG/MEDIUM: cli: Deadlock when setting frontend maxconn - MINOR: log: fix indent in strm_log() - MINOR: log: introduce extra log profile steps - MINOR: log: handle extra log origins in _process_send_log_override() - MINOR: log: introduce log_orig flags - MINOR: log: explicitly handle extra log origins as error when relevant - MINOR: log: support extra log origins for '%OG' alias - MINOR: proxy: add log_steps struct member - MINOR: log: introduce "log-steps" proxy keyword - MINOR: log: add log_orig_proxy() helper function - MEDIUM: log: consider log-steps proxy setting for existing log origins - DOC: config: document proxy "log-steps" keyword - REGTESTS: add a test for proxy "log-steps" - Revert "BUG/MINOR: server: shut down streams under thread isolation" - MINOR: task: define two new one-shot events for use with WOKEN_OTHER or MSG - BUG/MEDIUM: stream: make stream_shutdown() async-safe - BUG/MINOR: server: make sure the HMAINT state is part of MAINT - BUG/MINOR: 
queue: make sure that maintenance redispatches server queue - MINOR: server: make srv_shutdown_sessions() call pendconn_redistribute() - BUILD: tools: only include execinfo.h for the real backtrace() function - MINOR: tools: do not attempt to use backtrace() on linux without glibc - OPTIM: channel: speed up co_getline()'s search of the end of line - OPTIM: stconn: Don't pretend mux have more data to deliver on EOI/EOS/ERROR - BUG/MINOR: mcli: Pretend the mux have more data to deliver between two commands - MINOR: action: Export release_expr_int_action() release function - MINOR: stream: Rely on a per-stream max connection retries value - MINOR: stream: Support dynamic changes of the number of connection retries - MINOR: stream/stats: Expose the current number of streams in stats - MINOR: stream/stats: Expose the total number of streams ever created in stats - BUG/MINOR: cfgparse-global: fix allowed args number for setenv - MINOR: cfgparse-global: add dedicated parser for *env keywords - MINOR: mux-quic: complete Tx infos for QCS dump - MINOR: quic: ensure txbuf realloc is only performed on empty buffer - MINOR: mux-quic: strengthen qcs_send_metadata() usage - MINOR: quic: remove unneeded notification of txbuf room - MINOR: quic: refactor MUX send notification - MEDIUM: quic: strengthen MUX send notification - MINOR: quic: refactor STREAM room notification - MINOR: quic: do not remove qc_stream_desc automatically on ACK handling - MINOR: quic: store streambuf in a streamdesc tree - MINOR: quic: move buffered ACK to streambuf - MEDIUM: quic: handle out-of-order ACK at streamdesc layer - MEDIUM: quic: refactor buffered STREAM ACK consuming - BUG/MEDIUM: queue: always dequeue the backend when redistributing the last server - MINOR: config/trace: Add a 'traces' section to declare debug traces - MINOR: trace: Be able to chain commands for a source in one line - MINOR: tcpcheck: Add support for an option host header value for httpchk option - BUG/MINOR: mux-h1: Fix 
condition to set EOI on SE during zero-copy forwarding - MINOR: mux-h1: Use a dedicated function to conditionnaly set EOI flag on SE - BUG/MINOR: http-ana: Disable fast-fwd for unfinished req waiting for upgrade - BUG/MINOR: mux-quic: fix crash on qcc_init() early return - BUG/MINOR: quic: fix trace on releasing STREAM frame after ack	2024-10-03 17:47:33 +02:00
Amaury Denoyelle	b74df9fbc9	BUG/MINOR: quic: fix trace on releasing STREAM frame after ack Fix the NULL argument passed to qc_release_frm(). This gives more context to the traces inside it. Note that no crash occurred, as QUIC traces always check the validity of the first arg before dereferencing it. No backport needed.	2024-10-02 17:10:51 +02:00
Amaury Denoyelle	58b7a72d07	BUG/MINOR: mux-quic: fix crash on qcc_init() early return qcc_release() may be used in case qcc_init() cannot complete. In this case, connection instance is NULL. As such, it cannot be dereferenced without testing it first. This should fix github coverity report #2739. No backport needed.	2024-10-02 17:06:31 +02:00
Christopher Faulet	cea1379cf1	BUG/MINOR: http-ana: Disable fast-fwd for unfinished req waiting for upgrade If a request is waiting for a protocol upgrade but it is not finished, the data fast-forwarding is disabled. Otherwise, the request analyzers will miss the end of the message. This case is possible since the commit 01fb1a54 ("BUG/MEDIUM: mux-h1/mux-h2: Reject upgrades with payload on H2 side only"). Indeed, before, a protocol upgrade was not allowed for requests with a payload. But it is now possible and this comes with a side-effect. It is not really satisfying but for now there is no other way to sync the muxes and the applicative stream. It seems to be a reasonable fix for now, waiting for a deeper refactoring. This patch must be backported with the commit above.	2024-10-02 10:31:40 +02:00
Christopher Faulet	267ba1d889	MINOR: mux-h1: Use a dedicated function to conditionnaly set EOI flag on SE The same conditions are evaluated in h1_process_demux() and h1_fastfwd() to know if SE_FL_EOI flag must be set or not on the sedesc. So now, a dedicated function is used.	2024-10-02 10:22:51 +02:00
Christopher Faulet	6b39e245e1	BUG/MINOR: mux-h1: Fix condition to set EOI on SE during zero-copy forwarding During zero-copy data forwarding, the producer must set the EOI flag on the SE when the end of the message is reached. It is already done but there is a case where this flag is set while it should not. When a request wants to perform a protocol upgrade and it is waiting for the server response, the flag must not be set because the HTTP message is finished but some data are possibly still expected, depending on the server response. On a 101-switching-protocol, more data will be sent because the producer is switched to TUNNEL state. So, now, the right condition is used. In DONE state, SE_FL_EOI flag is set on the sedesc iff: - it is the response - it is the request and the response is also in DONE state - it is a request but not a protocol upgrade nor a CONNECT This patch must be backported as far as 2.9.	2024-10-02 10:22:51 +02:00
Christopher Faulet	27ee292731	MINOR: tcpcheck: Add support for an option host header value for httpchk option Support for headers and body hidden in the version for the "option httpchk" directive was removed. However a Host header is mandatory for HTTP/1.1 requests and some servers may return an error if it is not set. For now, to add it, an "http-check send" rule must be added. But it is not really handy to use an extra config line for this purpose. So now, it is possible to set the host header value, a log-format string, as extra argument to "option httpchk" directive. It must be the fourth argument: option httpchk GET / HTTP/1.1 www.srv.com While this patch is not a bug fix, it is simple enough to be backported if necessary. On 2.9 and older, lf_init_expr() does not exist and LIST_INIT() must be used instead.	2024-10-02 10:22:51 +02:00
Christopher Faulet	c39c351a73	MINOR: trace: Be able to chain commands for a source in one line In the configuration file or on the CLI, configuring traces for a specific source is a bit painful because this must be done in several lines. Thanks to this patch, it is now possible to fully configure traces for a source in one line. For instance, the following on the CLI: trace h1 sink stderr; trace h1 level developer; trace h1 verbosity complete; trace h1 start now can now be replaced by: trace h1 sink stderr level developer verbosity complete start now The same is true for the 'trace' directives in the configuration file.	2024-10-02 10:22:51 +02:00
Christopher Faulet	15a520d474	MINOR: config/trace: Add a 'traces' section to declare debug traces It is no longer supported to declare debug traces, via 'trace' directive, in a global section. A 'traces' section must be used instead. The syntax of the 'trace' directive in these sections remains the same. But it is no longer experimental. The main reason for this change is to avoid having a ring section defined before a global one. Indeed, for now, forward declarations of ring sections are not supported. So to configure traces, you had to add a ring section before the global one defining the traces. Most of the time, that meant having two global sections : global [...] # global settings ring <name> [...] global [...] # trace config In addition, it will be possible to easily extend the traces section by adding some new directives.	2024-10-02 10:22:51 +02:00
Willy Tarreau	53f52e67a0	BUG/MEDIUM: queue: always dequeue the backend when redistributing the last server An interesting bug was revealed by commit 5541d4995d ("BUG/MEDIUM: queue: deal with a rare TOCTOU in assign_server_and_queue()"). When shutting down a server to redistribute its connections, no check is made on the backend's queue. If we're turning off the last server and the backend has pending connections, these ones will wait there till the queue timeout. But worse, since the commit above, we can enter an endless loop in the following situation: - streams are present in the backend's queue - streams are purged on the last server via srv_shutdown_streams() - that one calls pendconn_redistribute(srv) which does not purge the backend's pendconns - a stream performs some load balancing and enters assign_server_and_queue() - assign_server() is called in turn - the LB algo is non-deterministic and there are entries in the backend's queue. The function notices it and returns SRV_STATUS_FULL - assign_server_and_queue() calls pendconn_add() to add the connection to the backend's queue - on return, pendconn_must_try_again() is called, it figures there's no stream served anymore on the server nor the proxy, so it removes the pendconn from the queue and returns 1 - assign_server_and_queue() loops back to the beginning to try again, while the conditions have not changed, resulting in an endless loop. Ideally a change count should be used in the queues so that it's possible to detect that some dequeuing happened and/or that a last stream has left. But that wouldn't completely solve the problem that is that we must never ever add to a queue when there's no server streams to dequeue the new entries. The current solution consists in making pendconn_redistribute() take care of the proxy after the server in case there's no more server available on the proxy. 
It at least ensures that no pending streams are left in the backend's queue when shutting streams down or when the last server goes down. The try_again loop remains necessary to deal with inevitable races during pendconn additions. It could be limited to a few rounds, though, but it should never trigger if the conditions are sufficient to permit it to converge. One way to reproduce the issue is to run a config with a single server with maxconn 1 and plenty of threads, then run in loops series of: "disable server px/s;shutdown sessions server px/s; wait 100ms server-removable px/s; show servers conn px; enable server px/s" on the CLI at ~10/s while injecting with around 40 concurrent conns at 40-100k RPS. In this case in 10s - 1mn the crash can appear with a backtrace like this one for at least 1 thread: #0 pendconn_add (strm=strm@entry=0x17f2ce0) at src/queue.c:487 #1 0x000000000064797d in assign_server_and_queue (s=s@entry=0x17f2ce0) at src/backend.c:1064 #2 0x000000000064a928 in srv_redispatch_connect (s=s@entry=0x17f2ce0) at src/backend.c:1962 #3 0x000000000064ac54 in back_handle_st_req (s=s@entry=0x17f2ce0) at src/backend.c:2287 #4 0x00000000005ae1d5 in process_stream (t=t@entry=0x17f4ab0, context=0x17f2ce0, state=<optimized out>) at src/stream.c:2336 It's worth noting that other threads may often appear waiting after the poller and one in server_atomic_sync() waiting for isolation, because the event that is processed when shutting the server down is consumed under isolation, and having less threads available to dequeue remaining requests increases the probability to trigger the problem, though it is not at all necessary (some less common traces never show them). This should carefully be backported wherever the commit above was backported.	2024-10-01 18:57:51 +02:00
Amaury Denoyelle	8d68717a41	MEDIUM: quic: refactor buffered STREAM ACK consuming For the moment, streamdesc layer can only deal with in-order ACK at the stream level. Received out-of-order ACKs are buffered in a tree attached to a streambuf instance. Previously, the caller of qc_stream_desc_ack() was responsible for implementing consumption of these buffered ACKs. Refactor this by implementing it directly at the streamdesc layer within qc_stream_desc_ack(). This simplifies quic_rx ACK handling and ensures buffered ACKs are consumed as soon as possible.	2024-10-01 16:22:23 +02:00
Amaury Denoyelle	cc4384aeb7	MEDIUM: quic: handle out-of-order ACK at streamdesc layer qc_stream_desc_ack() is the entrypoint for streamdesc layer to handle a new acknowledgement of previously emitted STREAM data. Previously, it was only able to deal with in-order ACK offset. The caller was responsible for buffering out-of-order ACKs. Change this by dealing with the latter case directly in qc_stream_desc_ack(). This notably simplifies ACK handling in quic_rx module.	2024-10-01 16:22:20 +02:00
Amaury Denoyelle	62558a9285	MINOR: quic: move buffered ACK to streambuf QUIC streamdesc layer is used to manage QUIC MUX stream txbuf data storage until acknowledgment. Currently, it only supports in-order acknowledgment at the stream level. This requires being able to buffer out-of-order ACKs until they can be handled. Previously, these ACKs were stored in a tree attached to the streamdesc instance. Move this indexed storage to the streambuf instance. This commit is purely an architecture change. However, it will allow extending ACK management in future patches, such as the ability to merge overlapping out-of-order ACKs.	2024-10-01 16:19:42 +02:00
Amaury Denoyelle	943e48dadd	MINOR: quic: store streambuf in a streamdesc tree qc_stream_desc layer is used by QUIC MUX to store emitted STREAM data until their acknowledgement. Each stream with Tx capability can allocate its own qc_stream_desc. In turn, each stream desc can have one or multiple data buffers. This is useful when a MUX stream releases a buffer and allocates a new one, to preserve bandwidth without waiting to receive all acknowledgements of the previous buffer. Each buffer is encapsulated in a qc_stream_buf structure. Previously, it was stored as a list into qc_stream_desc. Change this storage to use a tree instead. Each buffer is indexed by its offset. This commit does not introduce functional changes. However, this rearchitecture will be necessary for future commits extending ACK management, which requires fetching an individual buffer instance by its offset, not just the first or last element of a streamdesc.	2024-10-01 16:19:41 +02:00
Amaury Denoyelle	f4a83fbb14	MINOR: quic: do not remove qc_stream_desc automatically on ACK handling qc_stream_desc_ack() is used to handle ACK received for STREAM frame. It removes acknowledged data from their underlying buffer. If all data were removed after ACK handling, qc_stream_desc instance would automatically be freed at the end of qc_stream_desc_ack(). However, this renders the function complicated to use. Simplify this by removing this automatic removal. Now, caller is responsible to check after ACK handling if qc_stream_desc instance can be removed. This is easily done using qc_stream_desc_done() helper.	2024-10-01 16:19:25 +02:00
Amaury Denoyelle	db68f8ed86	MINOR: quic: refactor STREAM room notification qc_stream_desc is an intermediary layer between QUIC MUX and quic_conn. It is a facility which allows storing data to emit and keeping them for retransmission until acknowledgment. This layer is responsible for notifying QUIC MUX each time a buffer is freed. This is necessary as MUX buffer allocation is limited by the underlying congestion window size. Refactor this to use a mechanism similar to send notification. A new callback notify_room can now be registered to qc_stream_desc instance. This is set by QUIC MUX to qmux_ctrl_room(). On MUX QUIC free, special care is now taken to reset notify_room callback to NULL. Thanks to this refactoring, further adjustments have been made to refine the architecture. One of them is the removal of qc_stream_desc QC_SD_FL_OOB_BUF, which is now converted to a MUX layer flag QC_SF_TXBUF_OOB.	2024-10-01 16:19:25 +02:00
Amaury Denoyelle	d7f4e5abf0	MEDIUM: quic: strengthen MUX send notification The previous commit implemented a refactoring of MUX send notification from quic_conn layer. With this new architecture, a proper callback is defined for each qc_stream_desc instance. This architecture change allows simplifying notification from quic_conn layer. First, ensure that the MUX callback properly ignores retransmission of an already emitted frame. Luckily, this can be handled easily by comparing offsets and FIN status. Also, each QCS instance can now be unregistered from send notification just prior to qc_stream_desc release. This ensures a QCS is never manipulated from quic_conn after its emission ends. Both these changes render the send notification more robust. As a nice effect, flag QUIC_FL_CONN_TX_MUX_CONTEXT can be removed as it is now unneeded.	2024-10-01 16:19:25 +02:00
Amaury Denoyelle	6ad99af0a9	MINOR: quic: refactor MUX send notification For STREAM emission, MUX QUIC generates one or several frames and emits them via qc_send_mux(). Lower layer may use them as-is, or split them into smaller chunks to fit in a QUIC packet. It is then responsible for notifying the MUX to report the amount of data sent. Previously, this was done via a direct call from quic_conn to MUX using qcc_streams_sent_done(). Modify this to have a better isolation across layers. Define a send callback handled by the qc_stream_desc instance. This allows the MUX to register each QCS instance individually to the renamed qmux_ctrl_send() which replaces qcc_streams_sent_done(). At quic_conn layer, qc_stream_desc_send() can be used now. This is a wrapper to qc_stream_desc layer to invoke the send callback if registered. This mechanism of qc_stream_desc callback should be extended later to implement other notifications across the QUIC stack.	2024-10-01 16:19:25 +02:00
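
The SE_FL_EOI rule listed in commit 6b39e245e1 above can be read as a small predicate. The sketch below only illustrates that rule and is not the mux-h1 code: the `msg_view` struct, its fields, the `msg_state` enum and `should_set_eoi()` are hypothetical names invented here, and only the three conditions are taken from the commit message.

```c
#include <stdbool.h>

/* Hypothetical, simplified message states; MSG_OTHER stands for any
 * state that is not DONE. */
enum msg_state { MSG_OTHER = 0, MSG_DONE };

/* Hypothetical view of the message being forwarded. */
struct msg_view {
    bool is_response;          /* true for the response, false for the request */
    bool is_upgrade;           /* request asks for a protocol upgrade */
    bool is_connect;           /* request uses the CONNECT method */
    enum msg_state state;      /* state of this message */
    enum msg_state peer_state; /* state of the opposite message */
};

/* Returns true when, per the commit message, the EOI flag may be set. */
static bool should_set_eoi(const struct msg_view *m)
{
    if (m->state != MSG_DONE)
        return false;          /* the rule only applies in DONE state */
    if (m->is_response)
        return true;           /* it is the response */
    if (m->peer_state == MSG_DONE)
        return true;           /* request, and the response is also DONE */
    /* request that is neither a protocol upgrade nor a CONNECT */
    return !m->is_upgrade && !m->is_connect;
}
```

With this reading, a finished upgrade request whose response is still pending does not get the flag, which matches the case the commit describes as previously mishandled.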

23283 Commits