haproxy

mirror of http://git.haproxy.org/git/haproxy.git/ synced 2025-01-05 03:29:35 +00:00

Author	SHA1	Message	Date
Fr�d�ric L�caille	464281af46	CLEANUP: quic: Useless tests in qc_rx_pkt_handle() There is no reason to test <qc> nullity at the end of this function because it is clearly not null, furthermore the trace handle the case where <qc> is null. Must be backported to 2.7.	2023-05-24 16:30:11 +02:00
Fr�d�ric L�caille	ab3aa0ff22	CLEANUP: quic: Indentation fix quic_rx_pkt_retrieve_conn() Add missing spaces. Must be backported to 2.7.	2023-05-24 16:30:11 +02:00
Fr�d�ric L�caille	5fa633e22f	MINOR: quic: Align "show quic" command help information Align the "show quic" help information with all the others command help information. Furthermore, makes this information match the management documentation. Must be backported to 2.7.	2023-05-24 16:30:11 +02:00
Fr�d�ric L�caille	35b63964a0	BUG/MINOR: quic: Missing Retry token length on receipt quic_retry_token_check() must decipher the token sent to and received back from clients. This token is made of the token format byte, the ODCID prefixed by its one byte length, the timestamp of its creation, and terminated by an AEAD TAG followed by the salt used to derive the secret to cipher the token. So, the length of these data must be between 2 + QUIC_ODCID_MINLEN + sizeof(uint32_t) + QUIC_TLS_TAG_LEN + QUIC_RETRY_TOKEN_SALTLEN and 2 + QUIC_CID_MAXLEN + sizeof(uint32_t) + QUIC_TLS_TAG_LEN + QUIC_RETRY_TOKEN_SALTLEN. Must be backported to 2.7 and 2.6.	2023-05-24 16:30:11 +02:00
Fr�d�ric L�caille	6d6ddb2ce5	BUG/MINOR: quic: Wrong token length check (quic_generate_retry_token()) This bug would never occur because the buffer supplied to quic_generate_retry_token() to build a Retry token is large enough to embed such a token. Anyway, this patch fixes quic_generate_retry_token() implementation. There were two errors: this is the ODCID which is added to the token. Furthermore the timestamp was not taken into an account. Must be backported to 2.6 and 2.7.	2023-05-24 16:30:11 +02:00
Fr�d�ric L�caille	aaf32f0c83	MINOR: quic: Add low level traces (addresses, DCID) Add source and destination addresses to QUIC_EV_CONN_RCV trace event. This is used by datagram/socket level functions (quic_sock.c). Must be backported to 2.7.	2023-05-24 16:30:11 +02:00
Willy Tarreau	9577a152b5	BUILD: makefile: do not erase build options for some build options One painfully annoying thing with the build options change detection is that they get rebuild for about everything except when the build target is exactly "reg-tests". But in practice every time reg tests are run we end up having to experience a full rebuild because the reg-tests script runs "make version" which is sufficient to refresh the file. There are two issues here. The first one is that we ought to skip all targets that do not make use of the build options. This includes all the tools such as "flags" for example, or utility targets like "tags", "help" or "version". The second issue is that with most of these extra targets we do not set the TARGET variable, and that one is used when creating the build_opts file, so let's preserve the file when TARGET is not set. Now it's possible to re-run a make after a make reg-tests without having to rebuild the whole project.	2023-05-24 16:23:24 +02:00
Willy Tarreau	060769836e	CLEANUP: makefile: don't display a dummy features list without a target "make help" ends with a list of enabled/disabled features for TARGET '', which makes no sense. Let's only display enabled/disabled features when a target is set. It also removes visual pollution when users seek help.	2023-05-24 16:23:24 +02:00
Amaury Denoyelle	f1df006ffe	DEV: add a Lua helper script for SSL keys logging This script can be used through a http-request rules to log SSL keys for traffic on a dedicated frontend. The resulting file can then be injected into wireshark to decipher the corresponding network capture.	2023-05-24 16:08:23 +02:00
Christopher Faulet	c2f1d0ee5e	BUG/MEDIUM: mux-h2: Propagate termination flags when frontend SC is created We must evaluate if EOS/EOI/ERR_PENDING/ERROR flags must be set on the SE when the frontend SC is created because the rxbuf is transferred to the steeam at this stage. It means the call to h2_rcv_buf() may be skipped on some circumstances. And indeed, it happens when HAproxy quickly replies, for instance because of a deny rule. In this case, depending on the scheduling, the abort may block the receive attempt from the SC. In this case if SE flags were not properly set earlier, there is no way to terminate the request and the session may be freezed. For now, I can't explain why there is no timeout when this happens but it remains an issue because here we should not rely on timeouts to close the stream. This patch relies on following commits: * MINOR: mux-h2: Add a function to propagate termination flags from h2s to SE * MINOR: mux-h2: Set H2_SF_ES_RCVD flag when decoding the HEADERS frame The issue was encountered on the 2.8 but it seems the bug exists since the 2.4. But it is probably a good idea to only backport the series to 2.7 only and wait for a bug report on earlier versions. This patch should solve the issue #2147.	2023-05-24 16:06:11 +02:00
Christopher Faulet	531dd050ff	MINOR: mux-h2: Add a function to propagate termination flags from h2s to SE The function h2s_propagate_term_flags() was added to check the H2S state and evaluate when EOI/EOS/ERR_PENDING/ERROR flags must be set on the SE. It is not the only place where those flags are set. But it centralizes the synchro between the H2 stream and the SC. For now, this function is only used at the end of h2_rcv_buf(). But it will be used to fix a bug.	2023-05-24 16:06:11 +02:00
Christopher Faulet	1a60a66306	MINOR: mux-h2: Set H2_SF_ES_RCVD flag when decoding the HEADERS frame The flag H2_SF_ES_RCVD is set on the H2 stream when the ES flag is found in a frame. On HEADERS frame, it was set in function processing the frame. It is moved in the function decoding the frame. Fundamentally, this changes nothing. But it will be useful to have this information earlier when a client H2 stream is created.	2023-05-24 16:06:11 +02:00
Christopher Faulet	78b1eb2b04	BUG/MINOR: mux-h2: Check H2_SF_BODY_TUNNEL on H2S flags and not demux frame ones In h2c_frt_stream_new(), H2_SF_BODY_TUNNEL flags was tested on demux frame flags (h2c->dff) instead of the h2s flags. By chance, it is a noop test becasue H2_SF_BODY_TUNNEL value, once converted to an int8_t, is 0. It is a 2.8-specific issue. No backport needed.	2023-05-24 16:06:11 +02:00
Willy Tarreau	1e1c28873c	BUILD: makefile: fix build issue on GNU make < 3.82 Thierry Fournier reported a build breakage with the ubiquitous make 3.81, LDFLAGS were ignored. This is caused by the declaration of the collect_opt_flags macro that is defined with an "=" sign, something that only appeared in 3.82 and that is not necessary. With it removed, the build now works fine at least from 3.80 to 4.3. No backport is needed since this makefile cleanup appeared in 2.8.	2023-05-24 15:51:03 +02:00
Amaury Denoyelle	152beeec34	MINOR: mux-quic: report error on stream-endpoint earlier A RESET_STREAM is emitted in several occasions : - protocol error during HTTP/3.0 parsing - STOP_SENDING reception In both cases, if a stream-endpoint is attached we must set its ERR flag. This was correctly done but after some delay as it was only when the RESET_STREAM was emitted. Change this to set the ERR flag as soon as one of the upper cases has been encountered. This should help to release faster streams in error. This should be backported up to 2.7.	2023-05-24 14:46:52 +02:00
Amaury Denoyelle	37d78997ae	MINOR: mux-quic: only set EOS on RESET_STREAM recv A recent review was done to rationalize ERR/EOS/EOI flags on stream endpoint. A common definition for both H1/H2/QUIC mux have been written in the following documentation : ./doc/internals/stconn-close.txt In QUIC it is possible to close each channels of a stream independently with RESET_STREAM and STOP_SENDING frames. When a RESET_STREAM is received, it indicates that the peer has ended its transmission in an abnormal way. However, it is still ready to receive. Previously, on RESET_STREAM reception, QUIC MUX set the ERR flag on stream-endpoint. However, according to the QUIC mechanism, it should be instead EOS but this was impossible due to a BUG_ON() which prevents EOS without EOI or ERR. This BUG_ON was only present because this case was never used before the introduction of QUIC. It was removed in a recent commit which allows us to now properly set EOS alone on RESET_STREAM reception. In practice, this change allows to continue to send data even after RESET_STREAM reception. However, currently browsers always emit it with a STOP_SENDING as this is used to abort the whole H3 streams. In the end this will result in a stream-endpoint with EOS and ERR_PENDING/ERR flags. This should be backported up to 2.7.	2023-05-24 14:39:17 +02:00
Amaury Denoyelle	8de35925f7	MINOR: mux-quic: set both EOI EOS for stream fin A recent review was done to rationalize ERR/EOS/EOI flags on stream endpoint. A common definition for both H1/H2/QUIC mux have been written in the following documentation : ./doc/internals/stconn-close.txt Always set EOS with EOI flag to conform to this specification. EOI is set whenever the proper stream end has been encountered : with QUIC it corresponds to a STREAM frame with FIN bit. At this step, RESET_STREAM frames are ignored by QUIC MUX as allowed by RFC 9000. This means we can always set EOS at the same time with EOI. This should be backported up to 2.7.	2023-05-24 14:23:22 +02:00
Ilya Shipitsin	97c344dae0	BUILD: quic: re-enable chacha20_poly1305 for libressl this reverts `d2be9d4c48` LibreSSL implements EVP_chacha20_poly1305() with EVP_CIPHER for every released version starting with 3.6.0	2023-05-23 19:20:36 +02:00
Mariam John	6ff043de2c	DOC/MINOR: config: Fix typo in description for `ssl_bc` in configuration.txt Fix a minor typo in the description of the `ssl_bc` sample fetch method described under Section `7.3.4. Fetching samples at Layer 5` in configuration.txt. Changed `other` to `to`.	2023-05-23 17:06:06 +02:00
Willy Tarreau	e49e9e64a2	DOC: internal: add a bit of documentation for the stconn closing conditions The conditions where ERR, EOS and EOI are found are not always crystal clear, and the fact that there's still a good bunch of original ones dating from the early days and that seem to test for non-existing cases doesn't help either. After auditing the code base and projecting the 3 main muxes' stream termination conditions, with Christopher and Amaury we could establish the current flags matrix which indicates both what each combination means for each mux and when it is set by each of them (or not set and for what reason). It should be sufficient to void doubts when adding code or when chasing a bug. It must not be backported because it is highly specific to the latest 2.8-dev.	2023-05-23 16:18:19 +02:00
Willy Tarreau	b7209d42d9	MEDIUM: stconn: make the SE_FL_ERR_PENDING to ERROR transition systematic During a code audit of the various situations that promote ERR_PENDING to ERROR, it appeared that: - all muxes use se_fl_set_error() to set it, which chooses either based on EOI/EOS presence ; - EOI/EOS that arrive late after ERR_PENDING were not systematically upgraded to ERROR This results in confusion about how such ERROR or ERR_PENDING ought to be handled, which is not quite desirable. This patch adds a test to se_fl_set() to detect if we're setting EOI or EOS while ERR_PENDING is present, or the other way around so that any sequence of EOI/EOS <-> ERR_PENDING results in ERROR being set. This way there will no longer be possible situations where ERROR is missing while the other ones are set.	2023-05-23 16:17:04 +02:00
Christopher Faulet	2437377445	MEDIUM: stconn/applet: Allow SF_SL_EOS flag alone During the refactoring on SC/SE flags, it was stated that SE_FL_EOS flag should not be set without on of SE_FL_EOI or SE_FL_ERROR flags. In fact, it is a problem for the QUIC/H3 multiplexer. When a RST_STREAM frame is received, it means no more data will be received from the peer. And this happens before the end of the message (RST_STREAM frame received after the end of the message are ignored). At this stage, it is a problem to report an error because from the QUIC point of view, it is valid. Data may still be sent to the peer. If an error is reported, this will stop the data sending too. In the same idea, the H1 mulitplexer reports an error when the message is truncated because of a read0. But only an EOS flag should be reported in this case, not an error. Fundamentally, it is important to distinguish errors from shuts for reads because some cases are valid. For instance a H1 client can choose to stop uploading data if it received the server response. So, relax tests on SE flags by removing BUG_ON_HOT() on SE_FL_EOS flag. For now, the abort will be handled in the HTTP analyzers.	2023-05-23 15:52:35 +02:00
Amaury Denoyelle	aa39cc9f42	MINOR: quic: fix alignment of oneline show quic Output of 'show quic' CLI in oneline mode was not correctly done. This was caused both due to differing qc pointer size and ports length. Force proper alignment by using maximum sizes as expected and complete with blanks if needed. This should be backported up to 2.7.	2023-05-22 14:18:02 +02:00
Amaury Denoyelle	7385ff3f0c	BUG/MINOR: quic: handle Tx packet allocation failure properly qc_prep_app_pkts() is responsible to built several new packets for sending. It can fail due to memory allocation error. Before this patch, the Tx buffer was released on error even if some packets were properly generated. With this patch, if an error happens on qc_prep_app_pkts(), we still try to send already built packets if Tx buffer is not empty. The sending loop is then interrupted and the Tx buffer is released with data cleared. This should be backported up to 2.7.	2023-05-22 14:18:02 +02:00
Amaury Denoyelle	f8fbb0b94e	MINOR: quic: use WARN_ON for encrypt failures It is expected that quic_packet_encrypt() and quic_apply_header_protection() never fails as encryption is done in place. This allows to remove their return value. This is useful to simplify error handling on sending path. An error can only be encountered on the first steps when allocating a new packet or copying its frame content. After a clear packet is successfully built, no error is expected on encryption. However, it's still unclear if our assumption that in-place encryption function never fail. As such, a WARN_ON() statement is used if an error is detected at this stage. Currently, it's impossible to properly manage this without data loss as this will leave partially unencrypted data in the send buffer. If warning are reported a solution will have to be implemented. This should be backported up to 2.7.	2023-05-22 11:20:44 +02:00
Amaury Denoyelle	5eadc27623	MINOR: quic: remove return val of quic_aead_iv_build() quic_aead_iv_build() should never fail unless we call it with buffers of different size. This never happens in the code as every input buffers are of size QUIC_TLS_IV_LEN. Remove the return value and add a BUG_ON() to prevent future misusage. This is especially useful to remove one error handling on the sending patch via quic_packet_encrypt(). This should be backported up to 2.7.	2023-05-22 11:17:18 +02:00
Amaury Denoyelle	8d6d246dbc	CLEANUP: mux-quic/h3: complete BUG_ON with comments Complete each useful BUG_ON statements with a comment to explain its purpose. Also convert BUG_ON_HOT to BUG_ON as they should not have a big impact. This should be backported up to 2.7.	2023-05-22 11:17:18 +02:00
Daniel Epperson	ffdf6a32a7	DOC: add size format section to manual The manual refers to an HAProxy size format but does not define it. This patch adds a section to the manual to define the HAProxy size format.	2023-05-17 17:21:44 +02:00
Christopher Faulet	f48b23f5da	[RELEASE] Released version 2.8-dev12 Released version 2.8-dev12 with the following main changes : - BUILD: mjson: Fix warning about unused variables - MINOR: spoe: Don't stop disabled proxies - BUG/MEDIUM: filters: Don't deinit filters for disabled proxies during startup - BUG/MINOR: hlua_fcn/queue: fix broken pop_wait() - BUG/MINOR: hlua_fcn/queue: fix reference leak - CLEANUP: hlua_fcn/queue: make queue:push() easier to read - BUG/MINOR: quic: Buggy acknowlegments of acknowlegments function - DEBUG: list: add DEBUG_LIST to purposely corrupt list heads after delete - MINOR: stats: report the total number of warnings issued - MINOR: stats: report the number of times the global maxconn was reached - BUG/MINOR: mux-quic: do not prevent shutw on error - BUG/MINOR: mux-quic: do not free frame already released by quic-conn - BUG/MINOR: mux-quic: no need to subscribe for detach streams - MINOR: mux-quic: add traces for stream wake - MINOR: mux-quic: do not send STREAM frames if already subscribe - MINOR: mux-quic: factorize send subscribing - MINOR: mux-quic: simplify return path of qc_send() - MEDIUM: quic: streamline error notification - MEDIUM: mux-quic: adjust transport layer error handling - MINOR: stats: report the listener's protocol along with the address in stats - BUG/MEDIUM: mux-fcgi: Never set SE_FL_EOS without SE_FL_EOI or SE_FL_ERROR - BUG/MEDIUM: mux-fcgi: Don't request more room if mux is waiting for more data - MINOR: stconn: Add a cross-reference between SE descriptor - BUG/MINOR: proxy: missing free in free_proxy for redirect rules - MINOR: proxy: add http_free_redirect_rule() function - BUG/MINOR: http_rules: fix errors paths in http_parse_redirect_rule() - CLEANUP: http_act: use http_free_redirect_rule() to clean redirect act - MINOR: tree-wide: use free_acl_cond() where relevant - CLEANUP: acl: discard prune_acl_cond() function - BUG/MINOR: cli: don't complain about empty command on empty lines - MINOR: cli: add an option to display the uptime in the CLI's prompt - MINOR: master/cli: also implement the timed prompt on the master CLI - MINOR: cli: make "show fd" identify QUIC connections and listeners - MINOR: httpclient: allow to disable the DNS resolvers of the httpclient - BUILD: debug: fix build issue on 32-bit platforms in "debug dev task" - MINOR: ncbuf: missing malloc checks in standalone code - DOC: lua: fix core.{proxies,frontends,backends} visibility - EXAMPLES: fix race condition in lua mailers script - BUG/MINOR: errors: handle malloc failure in usermsgs_put() - BUG/MINOR: log: fix memory error handling in parse_logsrv() - BUG/MINOR: quic: Wrong redispatch for external data on connection socket - MINOR: htx: add function to set EOM reliably - MINOR: mux-quic: remove dedicated function to handle standalone FIN - BUG/MINOR: mux-quic: properly handle buf alloc failure - BUG/MINOR: mux-quic: handle properly recv ncbuf alloc failure - BUG/MINOR: quic: do not alloc buf count on alloc failure - BUG/MINOR: mux-quic: differentiate failure on qc_stream_desc alloc - BUG/MINOR: mux-quic: free task on qc_init() app ops failure - MEDIUM: session/ssl: return the SSL error string during a SSL handshake error - CI: enable monthly Fedora Rawhide clang builds - MEDIUM: mworker/cli: does not disconnect the master CLI upon error - MINOR: stconn: Remove useless test on sedesc on detach to release the xref - MEDIUM: proxy: stop emitting logs for internal proxies when stopping - MINOR: ssl: add new sample ssl_c_r_dn - BUG/MEDIUM: mux-h2: make sure control frames do not refresh the idle timeout - BUILD: ssl: ssl_c_r_dn fetches uses functiosn only available since 1.1.1 - BUG/MINOR: mux-quic: handle properly Tx buf exhaustion - BUG/MINOR: h3: missing goto on buf alloc failure - BUILD: ssl: get0_verified chain is available on libreSSL - BUG/MINOR: makefile: use USE_LIBATOMIC instead of USE_ATOMIC - MINOR: mux-quic: add trace to stream rcv_buf operation - MINOR: mux-quic: properly report end-of-stream on recv - MINOR: mux-quic: uninline qc_attach_sc() - BUG/MEDIUM: mux-quic: fix EOI for request without payload - MINOR: checks: make sure spread-checks is used also at boot time - BUG/MINOR: tcp-rules: Don't shortened the inspect-delay when EOI is set - REGTESTS: log: Reduce response inspect-delay for last_rule.vtc - DOC: config: Clarify conditions to shorten the inspect-delay for TCP rules - CLEANUP: server: remove useless tmptrash assigments in srv_update_status() - BUG/MINOR: server: memory leak in _srv_update_status_op() on server DOWN - CLEANUP: check; Remove some useless assignments to NULL - CLEANUP: stats: update the trash chunk where it's used - MINOR: clock: measure the total boot time - MINOR: stats: report the boot time in "show info" - BUG/MINOR: checks: postpone the startup of health checks by the boot time - MINOR: clock: provide a function to automatically adjust now_offset - BUG/MINOR: clock: automatically adjust the internal clock with the boot time - CLEANUP: fcgi-app; Remove useless assignment to NULL - REGTESTS: log: Reduce again response inspect-delay for last_rule.vtc - CI: drop Fedora m32 pipeline in favour of cross matrix - MEDIUM: checks: Stop scheduling healthchecks during stopping stage - MEDIUM: resolvers: Stop scheduling resolution during stopping stage - BUG/MINOR: hlua: SET_SAFE_LJMP misuse in hlua_event_runner() - BUG/MINOR: debug: fix pointer check in debug_parse_cli_task()	2023-05-17 17:10:12 +02:00
Aurelien DARRAGON	b6a24a52a2	BUG/MINOR: debug: fix pointer check in debug_parse_cli_task() Task pointer check in debug_parse_cli_task() computes the theoric end address of provided task pointer to check if it is valid or not thanks to may_access() helper function. However, relative ending address is calculated by adding task size to 't' pointer (which is a struct task pointer), thus it will result to incorrect address since the compiler automatically translates 't + x' to 't + x * sizeof(t)' internally (with sizeof(t) != 1 here). Solving the issue by using 'ptr' (which is the void * raw address) as starting address to prevent automatic address scaling. This was revealed by coverity, see GH #2157. No backport is needed, unless `9867987` ("DEBUG: cli: add "debug dev task" to show/wake/expire/kill tasks and tasklets") gets backported.	2023-05-17 16:49:17 +02:00
Aurelien DARRAGON	7428adaf0d	BUG/MINOR: hlua: SET_SAFE_LJMP misuse in hlua_event_runner() When hlua_event_runner() pauses the subscription (ie: if the consumer can't keep up the pace), hlua_traceback() is used to get the current lua trace (running context) to provide some info to the user. However, as hlua_traceback() may raise an error (__LJMP) is set, it is used within a SET_SAFE_LJMP() / RESET_SAFE_LJMP() combination to ensure lua errors are properly handled and don't result in unexpected behavior. But the current usage of SET_SAFE_LJMP() within the function is wrong since hlua_traceback() will run a second time (unprotected) if the first (protected) attempt fails. This is undefined behavior and could even lead to crashes. Hopefully it is very hard to trigger this code path, thus we can consider this as a minor bug. Also using this as an opportunity to enhance the message report to make it more meaningful to the user. This should fix GH #2159. It is a 2.8 specific bug, no backport needed unless `c84899c636` ("MEDIUM: hlua/event_hdl: initial support for event handlers") gets backported.	2023-05-17 16:48:40 +02:00
Christopher Faulet	06e9c81bd0	MEDIUM: resolvers: Stop scheduling resolution during stopping stage When the process is stopping, the server resolutions are suspended. However the task is still periodically woken up for nothing. If there is a huge number of resolution, it may lead to a noticeable CPU consumption for no reason. To avoid this extra CPU cost, we stop to schedule the the resolution tasks during the stopping stage. Of course, it is only true for server resolutinos. Dynamic ones, via do-resolve actions, are not concerned. These ones must still be triggered during stopping stage. Concretly, during the stopping stage, the resolvers task is no longer scheduled if there is no running resolutions. In this case, if a do-resolve action is evaluated, the task is woken up. This patch should partially solve the issue #2145.	2023-05-17 16:48:33 +02:00
Christopher Faulet	8bca3cc8c7	MEDIUM: checks: Stop scheduling healthchecks during stopping stage When the process is stopping, the health-checks are suspended. However the task is still periodically woken up for nothing. If there is a huge number of health-checks and if they are woken up in same time, it may lead to a noticeable CPU consumption for no reason. To avoid this extra CPU cost, we stop to schedule the health-check tasks when the proxy is disabled or stopped. This patch should partially solve the issue #2145.	2023-05-17 14:57:10 +02:00
Ilya Shipitsin	8a46f98615	CI: drop Fedora m32 pipeline in favour of cross matrix Fedora m32 monthly was introduced before cross matrix. Actually, many of cross builds are 32 bit, no need to keep dedicated Fedora definition	2023-05-17 14:57:10 +02:00
Christopher Faulet	292619fc90	REGTESTS: log: Reduce again response inspect-delay for last_rule.vtc It was previously reduced from 10s to 1s but it remains too high, espeically for the CI. It may be drastically reduced to 100ms. Idea is to just be sure we will wait for the response before evaluating the TCP rules.	2023-05-17 11:12:25 +02:00
Christopher Faulet	c8a7bb16b7	CLEANUP: fcgi-app; Remove useless assignment to NULL When the fcgi configuration is checked and fcgi rules are created, a useless assignment to NULL is reported by Covertiy. Let's remove it. This patch should fix the coverity report #2161.	2023-05-17 09:42:37 +02:00
Willy Tarreau	c7b9308f20	BUG/MINOR: clock: automatically adjust the internal clock with the boot time This is a better and more general solution to the problem described in this commit: BUG/MINOR: checks: postpone the startup of health checks by the boot time Now we're updating the now_offset that is used to compute now_ms at the few points where we update the ready date during boot. This ensures that now_ms while being stable during all the boot process will be correct and will start with the boot value right after the boot is finished. As such the patch above is rolled back (we don't want to count the boot time twice). This must not be backported because it relies on the more flexible clock architecture in 2.8.	2023-05-17 09:33:54 +02:00
Willy Tarreau	5345490b8e	MINOR: clock: provide a function to automatically adjust now_offset Right now there's no way to enforce a specific value of now_ms upon startup in order to compensate for the time it takes to load a config, specifically when dealing with the health check startup. For this we'd need to force the now_offset value to compensate for the last known value of the current date. This patch exposes a function to do exactly this.	2023-05-17 09:33:54 +02:00
Willy Tarreau	8e978a094d	BUG/MINOR: checks: postpone the startup of health checks by the boot time When health checks are started at boot, now_ms could be off by the boot time. In general it's not even noticeable, but with very large configs taking up to one or even a few seconds to start, this can result in a part of the servers' checks being scheduled slightly in the past. As such all of them will start groupped, partially defeating the purpose of the spread-checks setting. For example, this can cause a burst of connections for the network, or an excess of CPU usage during SSL handshakes, possibly even causing some timeouts to expire early. Here in order to compensate for this, we simply add the known boot time to the computed delay when scheduling the startup of checks. That's very simple and particularly efficient. For example, a config with 5k servers in 800 backends checked every 5 seconds, that was taking 3.8 seconds to start used to show this distribution of health checks previously despite the spread-checks 50: 3690 08:59:25 417 08:59:26 213 08:59:27 71 08:59:28 428 08:59:29 860 08:59:30 918 08:59:31 938 08:59:32 1124 08:59:33 904 08:59:34 647 08:59:35 890 08:59:36 973 08:59:37 856 08:59:38 893 08:59:39 154 08:59:40 Now with the fix it shows this: 470 08:59:59 929 09:00:00 896 09:00:01 937 09:00:02 854 09:00:03 827 09:00:04 906 09:00:05 863 09:00:06 913 09:00:07 873 09:00:08 162 09:00:09 This should be backported to all supported versions. It depends on this commit: MINOR: clock: measure the total boot time For 2.8 where the internal clock is now totally independent on the human one, an more generic fix will consist in simply updating now_ms to reflect the startup time.	2023-05-17 09:33:54 +02:00
Willy Tarreau	5723b382ed	MINOR: stats: report the boot time in "show info" Just like we have the uptime in "show info", let's add the boot time. It's trivial to collect as it's just the difference between the ready date and the start date, and will allow users to monitor this element in order to take action before it starts becoming problematic. Here the boot time is reported in milliseconds, so this allows to even observe sub-second anomalies in startup delays.	2023-05-17 09:33:54 +02:00
Willy Tarreau	da4aa6905c	MINOR: clock: measure the total boot time Some huge configs take a significant amount of time to start and this can cause some trouble (e.g. health checks getting delayed and grouped, process not responding to the CLI etc). For example, some configs might start fast in certain environments and slowly in other ones just due to the use of a wrong DNS server that delays all libc's resolutions. Let's first start by measuring it by keeping a copy of the most recently known ready date, once before calling check_config_validity() and then refine it when leaving this function. A last call is finally performed just before deciding to split between master and worker processes, and it covers the whole boot. It's trivial to collect and even allows to get rid of a call to clock_update_date() in function check_config_validity() that was used in hope to better schedule future events.	2023-05-17 09:33:54 +02:00
Willy Tarreau	52fd879953	CLEANUP: stats: update the trash chunk where it's used When integrating the number of warnings in "show info" in 2.8 with commit `3c4a297d2` ("MINOR: stats: report the total number of warnings issued"), the update of the trash buffer used by the Tainted flag got displaced lower. There's no harm for now util someone adds a new metric requiring a call to chunk_newstr() and gets both values merged. Let's move the call to its location now.	2023-05-17 09:33:54 +02:00
Christopher Faulet	cb76030356	CLEANUP: check; Remove some useless assignments to NULL In process_chk_conn(), some assignments to NULL are useless and are reported by Coverity as unused value. while it is harmless, these assignments can be removed. This patch should fix the coverity report #2158.	2023-05-17 09:28:23 +02:00
Aurelien DARRAGON	0d2f1acee6	BUG/MINOR: server: memory leak in _srv_update_status_op() on server DOWN When server is transitionning from UP to DOWN, a log message is generated. e.g.: "Server backend_name/server_name is DOWN") However since `f71e064` ("MEDIUM: server: split srv_update_status() in two functions"), the allocated buffer tmptrash which is used to prepare the log message is not freed after it has been used, resulting in a small memory leak each time a server goes DOWN because of an operational change. This is a 2.8 specific bug, no backport needed unless the above commit gets backported.	2023-05-17 09:21:01 +02:00
Aurelien DARRAGON	22d584a993	CLEANUP: server: remove useless tmptrash assigments in srv_update_status() Within srv_update_status subfunctions _op() and _adm(), each time tmptrash is freed, we assign it to NULL to ensure it will not be reused. However, within those functions it is not very useful given that tmptrash is never checked against NULL except upon allocation through alloc_trash_chunk(), which happens everytime a new log message is generated, sent, and then freed right away, so there are no code paths that could lead to tmptrash being checked for reuse (tmptrash is systematically overwritten since all log messages are independant from each other). This was raised by coverity, see GH #2162.	2023-05-17 09:21:01 +02:00
Christopher Faulet	43525abceb	DOC: config: Clarify conditions to shorten the inspect-delay for TCP rules Add a sentence to state when the inspect-delay is shortened for a TCP rule.	2023-05-17 09:21:01 +02:00
Christopher Faulet	4ab27a2403	REGTESTS: log: Reduce response inspect-delay for last_rule.vtc Because of the previous fix, log/last_rule.vtc script is failing. The inspect-delay is no longer shorten when the end of the message is reached. Thus WAIT_END acl is trully respected. 10s is too high and hit the Vtext timeout, making the script fails.	2023-05-17 09:21:01 +02:00
Christopher Faulet	2d5a5665fe	BUG/MINOR: tcp-rules: Don't shortened the inspect-delay when EOI is set A regression was introduced with the commit `cb59e0bc3` ("BUG/MINOR: tcp-rules: Stop content rules eval on read error and end-of-input"). We should not shorten the inspect-delay when the EOI flag is set on the SC. Idea of the inspect-delay is to wait a TCP rule is matching. It is only interrupted if an error occurs, on abort or if the peer shuts down. It is also interrupted if the buffer is full. This last case is a bit ambiguous and discutable. It could be good to add ACLS, like "wait_complete" and "wait_full" to do so. But for now, we only remove the test on SC_FL_EOI flag. This patch must be backported to all stable versions.	2023-05-17 09:21:01 +02:00
Willy Tarreau	b93758cec9	MINOR: checks: make sure spread-checks is used also at boot time This makes use of spread-checks also for the startup of the check tasks. This provides a smoother load on startup for uneven configurations which tend to enable only some servers. Below is the connection distribution per second of the SSL checks of a config with 5k servers spread over 800 backends, with a check inter of 5 seconds: - default: 682 08:00:50 826 08:00:51 773 08:00:52 1016 08:00:53 885 08:00:54 889 08:00:55 825 08:00:56 773 08:00:57 1016 08:00:58 884 08:00:59 888 08:01:00 491 08:01:01 - with spread-checks 50: 437 08:01:19 866 08:01:20 777 08:01:21 1023 08:01:22 1118 08:01:23 923 08:01:24 641 08:01:25 859 08:01:26 962 08:01:27 860 08:01:28 929 08:01:29 909 08:01:30 866 08:01:31 849 08:01:32 114 08:01:33 - with spread-checks 50 + this patch: 680 08:01:55 922 08:01:56 962 08:01:57 899 08:01:58 819 08:01:59 843 08:02:00 916 08:02:01 896 08:02:02 886 08:02:03 846 08:02:04 903 08:02:05 894 08:02:06 178 08:02:07 The load is much smoother from the start, this can help initial health checks succeed when many target the same overloaded server for example. This could be backported as it should make border-line configs more reliable across reloads.	2023-05-17 08:10:40 +02:00
Amaury Denoyelle	bf86d89ea6	BUG/MEDIUM: mux-quic: fix EOI for request without payload When a full message is received for a stream, MUX is responsible to set EOI flag. This was done through rcv_buf stream callback by checking if QCS HTX buffer contained the EOM flag. This is not correct for HTTP without body. In this case, QCS HTX buffer is never used. Only a local HTX buffer is used to transfer headers just as stream endpoint is created. As such, EOI is never transmitted to the upper layer. If the transfer occur without any issue, this does not seem to cause any problem. However, in case the transfer is aborted, the stream is never released which cause a memory leak and prevent the process soft-stop. To fix this, also check if EOM is put by application layer during headers conversion. If true, this is transferred through a new argument to qc_attach_sc() MUX function which is responsible to set the EOI flag. This issue was reproduced using h2load with hundred of connections. h2load is interrupted with a SIGINT which causes streams to never be closed on haproxy side. This should be backported up to 2.6.	2023-05-16 17:53:45 +02:00

1 2 3 4 5 ...

20223 Commits