Since commit "BUG/MEDIUM: ssl: Fix crash in ocsp-update log function",
some information from the log line are "faked" because they can be
actually retrieved anymore (or never could). We should then remove them
from the logline all along instead of providing some useless fields.
We then only keep pure OCSP-update information in the log line:
"<certname> <status> <status str> <fail count> <success count>"
The ocsp-update logging mechanism was built around the 'sess_log'
function, which required keeping a pointer to the session in question
until the logging function could be called. This was done by keeping a
pointer to the appctx returned by the 'httpclient_start' function. But
this appctx lives its own life and might be destroyed before
'ssl_ocsp_send_log' is called, which could result in a crash
(use-after-free).
Fixing this crash requires that we stop using the 'sess_log' function to
emit the ocsp-update logs. The log line then needs to be built by hand
out of the information actually available when 'ssl_ocsp_send_log' is
called. Since we don't use the "regular" logging functions anymore, we
don't need the error_logformat anymore either. In order to keep a
behavior consistent with the previous one, we keep the same format for
the logs but replace the fields that required a 'sess' pointer with fake
values (%ci:%cp for instance, which was never filled anyway).
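As an illustration, building such a line by hand could look like the
following sketch (the member names on the ocsp entry are illustrative,
not the actual ones):

    /* sketch: compose the log line from data owned by the OCSP entry
     * itself, so that no session pointer is needed anymore */
    chunk_printf(&trash, "%s %u \"%s\" %u %u",
                 ocsp->certname, ocsp->status, ocsp->status_str,
                 ocsp->fail_count, ocsp->success_count);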
This crash was raised in GitHub issue #2442.
It should be backported up to branch 2.8.
The CLI command "update ssl ocsp-response" was forcefully removing an
OCSP response from the update tree regardless of whether it used to be
in it beforehand or not. But since the main OCSP upate task works by
removing the entry being currently updated from the update tree and then
reinserting it when the update process is over, it meant that in the CLI
command code we were modifying a structure that was already being used.
These concurrent accesses were not properly locked on the "regular"
update case because it was assumed that once an entry was removed from
the update tree, the update task was the only one able to work on it.
Rather than locking the whole update process, an "updating" flag was
added to the certificate_ocsp in order to prevent the "update ssl
ocsp-response" command from trying to update a response already being
updated.
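A hedged sketch of the resulting guard on the CLI side (flag and
message are illustrative):

    /* sketch: refuse to touch a response the update task is working on */
    if (ocsp->updating)
        return cli_err(appctx, "Response is already being updated.\n");
    ocsp->updating = 1;   /* cleared once the update completes */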
An easy way to reproduce this crash was to perform two "simultaneous"
calls to "update ssl ocsp-response" on the same certificate. It would
then crash on an eb64_delete call in the main ocsp update task function.
This patch can be backported up to 2.8. Wait a little bit before
backporting.
With the current way OCSP responses are stored, a single OCSP response
is stored (in a certificate_ocsp structure) when it is loaded during a
certificate parsing, and each SSL_CTX that references it increments its
refcount. The reference to the certificate_ocsp is kept in the SSL_CTX
linked to each ckch_inst, in an ex_data entry that gets freed when the
context is freed.
One of the downsides of this implementation is that if every ckch_inst
referencing a certificate_ocsp gets destroyed, then the OCSP response is
removed from the system. So if we were to remove all crt-list lines
containing a given certificate (that has an OCSP response), and if all
the corresponding SSL_CTXs were destroyed (no ongoing connection using
them), the OCSP response would be destroyed even if the certificate
remains in the system (as an unused certificate).
In such a case, we would want the OCSP response not to be "usable",
since it is not used by any ckch_inst, but still remain in the OCSP
response tree so that if the certificate gets reused (via an "add ssl
crt-list" command for instance), its OCSP response is still known as
well.
But we would also like such an entry not to be updated automatically
anymore once no instance uses it. An easy way to do it could have been
to keep a reference to the certificate_ocsp structure in the ckch_store
as well, on top of all the ones in the ckch_instances, and to remove the
ocsp response from the update tree once the refcount falls to 1, but it
would not work because of the way the ocsp response tree keys are
calculated. They are decorrelated from the ckch_store and are the actual
OCSP_CERTIDs, which are a combination of the issuer's name hash and key
hash, and the certificate's serial number. So two copies of the same
certificate but with different names would still point to the same ocsp
response tree entry.
The solution that answers all the needs expressed above is actually
to have two reference counters in the certificate_ocsp structure, one
actual reference counter corresponding to the number of "live" pointers
on the certificate_ocsp structure, incremented for every SSL_CTX using
it, and one for the ckch stores.
If the ckch_store reference counter falls to 0, the corresponding
certificate must have been removed via CLI calls ('set ssl cert' for
instance).
If the actual refcount falls to 0, then no live SSL_CTX uses the
response anymore. It could happen if all the corresponding crt-list
lines were removed and there are no live SSL sessions using the
certificate anymore.
If any of the two refcounts becomes 0, we will always remove the
response from the auto update tree, because there's no point in spending
time updating an OCSP response that no new SSL connection will be able
to use. But the certificate_ocsp object won't be removed from the tree
unless both refcounts are 0.
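A minimal sketch of the resulting release logic, with illustrative
member names (the real counters and tree nodes may differ):

    static void ocsp_release(struct certificate_ocsp *ocsp, int *refcnt)
    {
        /* <refcnt> is either &ocsp->refcount (SSL_CTX side) or
         * &ocsp->refcount_store (ckch_store side) */
        if (--(*refcnt) == 0)
            eb64_delete(&ocsp->next_update); /* leave the auto-update tree */
        if (!ocsp->refcount && !ocsp->refcount_store)
            ebmb_delete(&ocsp->key);         /* leave the response tree */
    }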
Must be backported up to 2.8. Wait a little bit before backporting.
By default, backend connections are accounted for by the server. This
makes it possible to determine the number of idle connections to keep. A
backend connection can also be marked as private to prevent its reuse.
It is then moved from the server lists into the session list. As such, a
private connection is not accounted for by the server:
conn_set_private() uses srv_release_conn() to ensure this.
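For reference, the logic looks roughly like this simplified sketch:

    /* simplified sketch: a connection turning private leaves the
     * server's accounting */
    static inline void conn_set_private(struct connection *conn)
    {
        if (!(conn->flags & CO_FL_PRIVATE)) {
            conn->flags |= CO_FL_PRIVATE;
            if (obj_type(conn->target) == OBJ_TYPE_SERVER)
                srv_release_conn(__objt_server(conn->target), conn);
        }
    }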
When using HTTP/2 on the backend side with the default http-reuse safe
mode, the above principles are mixed. Indeed, when a connection is first
used, or switches from idle to used, it is moved into the session list
but it is not flagged as private. This is done to prevent its sharing by
different clients and thus avoid head-of-line blocking issues. When all
streams are closed, the connection becomes idle again and is reinserted
into the server list. This was introduced by the following patch:
0d21deaded
MEDIUM: backend: add reused conn to sess if mux marked as HOL blocking
When freeing a backend connection, special care is taken to ensure the
server's used counter is decremented. This is implemented in
conn_backend_deinit(). However, this function does so only if the
connection is not present in a session list. This is valid for private
connections. However, if a connection is non-private and only
temporarily present in a session list, the decrement operation won't be
executed despite the connection being accounted for by the server.
This bug has several impacts. The server's used counter won't be able
to return to its initial value of zero, even when all its connections
are closed. This can result in a wrong estimation of the necessary idle
connections, which may cause unnecessary new connections to be opened.
Also, it definitively prevents the server from being removed via the
"delete server" CLI command.
This should be backported up to 2.4. Note that conn_backend_deinit() was
introduced in 2.9. For older versions, the change should be done
directly into conn_free().
Backend connections can be marked as private to prevent their sharing
by multiple clients. Nowadays this has become an exception, as only two
situations in data traffic can trigger it (checks are ignored here):
* http-reuse never
* HTTP response with NTLM header
The first case is easy to manage as the connection is flagged as
private from its inception. However, the second case is dynamic as the
connection can be flagged at any time during its lifetime. When using a
backend protocol such as HTTP/2 with reuse mode aggressive or always, we
face a design issue as the connection would be marked as private,
despite potentially being shared by several clients at the same time.
This is conceptually invalid, but worse, it can trigger crashes in the
MUX stream detach callback depending on the order of release of the
streams, by calling session_check_idle_conn() with a NULL session. It
would also be possible to have several NTLM responses on a single
connection for different sessions. In this case, the connection owner
keeps being updated without attaching the connection to its correct
session, which would ultimately cause a crash in
session_check_idle_conn() with an invalid session.
Here are two backtrace examples from GDB for such cases:
Thread 1 (Thread 0x7ff73e9fc700 (LWP 648859)):
#0 session_check_idle_conn (conn=0x7ff72f597800, sess=0x0) at include/haproxy/session.h:209
#1 h2_detach (sd=<optimized out>) at src/mux_h2.c:4520
#2 0x000056151742be24 in sc_detach_endp (scp=scp@entry=0x7ff73e9f0f18) at src/stconn.c:376
#3 0x000056151742c208 in sc_destroy (sc=<optimized out>) at src/stconn.c:444
#4 0x0000561517370871 in stream_free (s=s@entry=0x7ff72a2dbd80) at src/stream.c:728
#5 0x000056151737541f in process_stream (t=t@entry=0x7ff72d5e2620, context=0x7ff72a2dbd80, state=<optimized out>) at src/stream.c:2645
#6 0x0000561517456cbb in run_tasks_from_lists (budgets=budgets@entry=0x7ff73e9f10d0) at src/task.c:632
#7 0x00005615174576b9 in process_runnable_tasks () at src/task.c:876
#8 0x000056151742275a in run_poll_loop () at src/haproxy.c:2996
#9 0x0000561517422db1 in run_thread_poll_loop (data=<optimized out>) at src/haproxy.c:3195
#10 0x00007ff789e081ca in start_thread () from /lib64/libpthread.so.0
#11 0x00007ff789a39e73 in clone () from /lib64/libc.so.6
(gdb)
Thread 1 (Thread 0x7ff52e7fc700 (LWP 681458)):
#0 0x0000556ebd6e7e69 in session_check_idle_conn (conn=0x7ff5787ff100, sess=0x7ff51d2539a0) at include/haproxy/session.h:209
#1 h2_detach (sd=<optimized out>) at src/mux_h2.c:4520
#2 0x0000556ebd7f3e24 in sc_detach_endp (scp=scp@entry=0x7ff52e7f0f18) at src/stconn.c:376
#3 0x0000556ebd7f4208 in sc_destroy (sc=<optimized out>) at src/stconn.c:444
#4 0x0000556ebd738871 in stream_free (s=s@entry=0x7ff520e28200) at src/stream.c:728
#5 0x0000556ebd73d41f in process_stream (t=t@entry=0x7ff565783700, context=0x7ff520e28200, state=<optimized out>) at src/stream.c:2645
#6 0x0000556ebd81ecbb in run_tasks_from_lists (budgets=budgets@entry=0x7ff52e7f10d0) at src/task.c:632
#7 0x0000556ebd81f6b9 in process_runnable_tasks () at src/task.c:876
#8 0x0000556ebd7ea75a in run_poll_loop () at src/haproxy.c:2996
#9 0x0000556ebd7eadb1 in run_thread_poll_loop (data=<optimized out>) at src/haproxy.c:3195
#10 0x00007ff5752081ca in start_thread () from /lib64/libpthread.so.0
#11 0x00007ff574e39e73 in clone () from /lib64/libc.so.6
(gdb)
To solve this issue, simply ignore NTLM responses when using a
multiplexer with streams support if the connection is not already
attached to the session. The connection is not marked as private and
will continue to be shared freely across clients. This is considered
conceptually valid as NTLM usage (RFC 4559) with HTTP is broken and was
designed only with HTTP/1.1 in mind. A side effect of the change is that
SESS_FL_PREFER_LAST is not set on NTLM detection anymore either, which
allows subsequent requests to be load-balanced across several server
instances.
The original behavior is kept for HTTP/1 or if the connection is already
attached to the session. This last case happens when using HTTP/2 with
default http-reuse safe mode since the following patch :
0d21deaded
MEDIUM: backend: add reused conn to sess if mux marked as HOL blocking
This should be backported to all stable releases. Down to 2.4, it can
be taken as-is. For older versions, the above patch is not present; in
this case the condition should be restricted to HTTP/1 usage only:
if (srv_conn && strcmp(srv_conn->mux->name, "H1") == 0) {
A crash could occur if session_add_conn() temporarily failed when
called via h2_detach(). In this case, the connection owner is reset to
NULL. However, if this wasn't the last connection stream, the connection
won't be destroyed. When h2_detach() is called again for another stream
and this time session_add_conn() succeeds, a crash will occur due to
session_check_idle_conn() being invoked with a NULL connection owner.
To fix this, ensure the connection owner is always set after
session_add_conn() succeeds.
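A hedged sketch of the fix, assuming session_add_conn() returns
non-zero on success:

    /* sketch: never touch conn->owner before insertion succeeded */
    if (session_add_conn(sess, conn, conn->target))
        conn->owner = sess;   /* only once the conn is in the session */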
This bug is considered minor as the only failure reason for
session_add_conn() is a pool allocation issue.
This should be backported up to all stable releases.
Frames with a too-small size must be detected on receipt and an error
must be triggered. It is especially important for frames of size 0.
Otherwise, because the frame length is used as the return value, the
frame is ignored (0 is the return value stating that the frame must be
ignored). It is an issue because in this case, the outgoing data, i.e.
the 4 bytes representing the frame size, are never consumed. If the
agent also closes the connection, this leads to a wakeup loop because
outgoing data are stuck and a shutdown is pending.
In addition, all pending outgoing data are now systematically skipped
when the applet is in SPOE_APPCTX_ST_END state.
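A hedged sketch of the check on the receive path (the constant name is
illustrative):

    /* sketch: an undersized frame (size 0 in particular) must raise an
     * error; returning 0 would mean "ignore the frame" and leave the
     * 4-byte length prefix unconsumed forever */
    if (framesz < SPOP_MIN_FRAME_SIZE)
        return -1;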
This patch should fix issue #2490. It must be backported to all stable
versions.
This is the first deprecated directive exposed via the
'expose-deprecated-directives' global option. This way, it is possible
to silence the warning about SPOE usage.
Similarly to the "expose-experimental-directives" option, there is now a
global option to expose some deprecated directives. The idea is to have
a way to silence warnings about deprecated directives when there is no
alternative solution. Of course, the deprecated directives covered by
this option are not listed and may change. It is only a best effort to
let users upgrade smoothly.
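For instance, a configuration still relying on SPOE could silence the
deprecation warning like this:

    global
        expose-deprecated-directives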
As announced on the ML a few weeks (months?) ago and in several GH
issues, the SPOE is now deprecated. Sadly, this filter would need to be
refactored to work properly. It was implemented as a functional PoC for
1.7 and, since then, no time was invested to improve it and make it
truly maintainable over time. Worse, other parts of HAProxy keep
evolving, especially the applets part, making maintenance ever more
expensive.
Instead of keeping the SPOE filter in this state and always replying to
users encountering issues or limitations that it is far from perfect but
that we cannot work on it for now, we decided to deprecate it.
We can still change our mind before the 3.0.0 release if the situation
evolves. Otherwise the filter will be removed or marked as unmaintained
in 3.1. If the situation does not change, it means 3.0 will be the last
version with true SPOE support.
On soft-stop, we try, as far as possible, to process all pending
messages before closing SPOE applets. However, in sync mode, when an
applet waiting for a response receives the ACK frame, it is switched to
IDLE state without checking if it may be closed. In this case, we will
wait for the idle timeout before closing the applet, delaying the
soft-stop.
To reduce this delay, on soft-stop, IDLE applets are woken up. On the next
wakeup, the applet will try to process pending messages or will be
closed.
This patch should be backported to all stable versions.
On the stream side, the SPOE filter relied on the stream's expiration
date to be woken up and be able to detect processing timeouts. However,
the stream expiration date must not be updated this way, mainly because
it may be overwritten at the end of process_stream(). In the worst case,
it is set to TICK_ETERNITY for whatever reason. In that case, it is
impossible to detect that the SPOE filter must time out and abort the
processing.
The right way to do it is to set an analysis expiration date on the
corresponding channel, depending on the direction. This expiration date
will be used to compute the stream's expiration date at the end of
process_stream().
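A hedged sketch of the idea (the timeout member is illustrative):

    /* sketch: arm the analysis timeout on the channel; process_stream()
     * will fold it into the stream's expiration date by itself */
    chn->analyse_exp = tick_add_ifset(now_ms, agent->timeout.processing);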
This patch may be related to issue #2478. It must be backported to all
stable versions.
A server can only be deleted if no element references it anymore. This
is taken care of via srv_check_for_deletion(), most notably for active
and idle connections.
A special case occurs for connections directly managed by a session.
This applies to so-called private connections, for example when using
http-reuse never or H2 with http-reuse safe. In this case, the server
does not account for these connections in its idle lists. This caused a
bug where the server could be deleted despite the session still being
able to access it.
To properly fix this, add a new referencing element into the server for
these session connections. An mt_list has been chosen for this. With
default http-reuse, private connections are typically not used, so it
won't make any difference. When using H2 servers, or more generally when
dealing with private connections, insert/delete should typically occur
only once per session lifetime, so the impact on performance should be
minimal.
This should be backported up to 2.4. Note that srv_check_for_deletion()
was introduced in 3.0 dev tree. On backport, the extra condition in it
should be placed in cli_parse_delete_server() instead.
By default, backend connections are attached to a server instance. This
is how connection reuse is implemented. However, in some particular
cases, a connection cannot be shared across several clients. These
connections are considered private and are attached to the session
instance instead.
These private connections are also indexed by the target server so as
not to mix them. All of this is implemented via a dedicated structure
previously named struct sess_srv_list.
Rename it to better reflect its usage to struct sess_priv_conns. Also
rename its internal members and all of the associated functions.
This commit is only a renaming, thus no functional impact is expected.
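As a rough sketch, the renamed structure looks like this (member names
may differ slightly):

    struct sess_priv_conns {
        void *target;           /* server the connections belong to */
        struct list conn_list;  /* the private connections themselves */
        struct list sess_el;    /* attachment point in the session */
    };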
A null pointer dereference was reported by Coverity in the
listener_release() function. Indeed, we must not try to schedule a
frontend with no task when a limit is still blocking the frontend. This
issue was introduced by commit 65ae1347c7 ("BUG/MINOR: listener: Wake
proxy's mngmt task up if necessary on session release").
This patch should fix issue #2488. It must be backported to all stable
versions with the commit above.
When a session is released, the listener_release() function is called
to notify the listener. It is an opportunity to resume limited/full
listeners. We first try to resume the listener owning the released
session, then all limited listeners in the global queue, and finally all
limited listeners in the frontend's waiting queue. This last step is
only performed if no limit is applied on the frontend. Nothing is
performed if the session rate is still limited. And this is an issue
because if it happens for the listener's last session, there is no other
event to wake the frontend's management task up and the listener remains
in the limited state.
To fix the issue, when a limit is still applied on the frontend, we must
compute the new wake-up date from the session rate and schedule the
frontend's management task.
It is easy to reproduce the issue in SSL by setting a maxconn and a rate
limit on sessions.
This patch should fix the issue #2476. It must be backported to all stable
versions.
-dI allows enabling "insecure-fork-wanted" directly from the command
line, which is useful when you want to run ASAN with addr2line with a
lot of configuration files without editing them.
While trying to reproduce another crash case involving lua filters
reported by @bgrooot on GH #2467, we found out that mixing filters loaded
from different contexts ('lua-load' vs 'lua-load-per-thread') for the same
stream isn't supported and may even cause the process to crash.
Historically, mixing lua-load and lua-load-per-thread for a stream wasn't
supported, but this changed thanks to 0913386 ("BUG/MEDIUM: hlua: streams
don't support mixing lua-load with lua-load-per-thread").
However, the above fix didn't consider lua filters' use-case properly:
unlike lua fetches, actions or even services, lua filters don't simply
use the stream hlua context as a "temporary" hlua running context to
process some hlua code. For fetches, actions, etc., hlua executions are
processed sequentially, so we simply reuse the hlua context from the
previous action/fetch to run the next one (this avoids memory
allocations and initialization, thus increasing performance), unless we
need to run on a different hlua state-id, in which case we perform a
reset of the hlua context.
But this cannot work with filters: indeed, once registered, a filter
will last for the whole stream duration. It means that the filter will
rely on the stream hlua context from ->attach() to ->detach(). And here
is the catch: if for the same stream we register 2 lua filters from
different contexts ('lua-load' + 'lua-load-per-thread'), then we have an
issue, because the hlua stream will be re-created each time we switch
between runtime contexts, which means each time we switch between the
filters (which may happen at each stream processing step), and since lua
filters rely on the stream hlua to carry context between filtering
steps, this context will be lost upon a switch. Given that the lua
filters code was not designed with that in mind, it would confuse the
code and cause unexpected behaviors ranging from lua errors to a
crashing process.
So here we take another approach: instead of re-creating the stream
hlua context each time we switch between "global" and "per-thread"
runtime contexts, let's have both of them inside the stream directly, as
initially suggested by Christopher back when we talked about the
original issue.
For this we leverage hlua_stream_ctx_prepare() and hlua_stream_ctx_get()
helper functions which return the proper hlua context for a given stream
and state_id combination.
As for the debugging info reported after ha_panic(), we now check both
hlua runtime contexts to see if one of them was active when the panic
occurred (only 1 runtime ctx per stream may be active at a given time).
This should be backported to all stable versions with 0913386
("BUG/MEDIUM: hlua: streams don't support mixing lua-load with lua-load-per-thread")
This commit depends on:
- "DEBUG: lua: precisely identify if stream is stuck inside lua or not"
[for versions < 2.9 the ha_thread_dump_one() part should be skipped]
- "MINOR: hlua: use accessors for stream hlua ctx"
For 2.4, the filters API didn't exist. However it may be a good idea to
backport it anyway because ->set_priv()/->get_priv() from tcp/http lua
applets may also be affected by this bug, plus it will ease code
maintenance. Of course, filters-related parts should be skipped in this
case.
Change hlua_stream_ctx_prepare() prototype so that it now returns the
proper hlua ctx on success instead of returning a boolean.
Add hlua_stream_ctx_get() to retrieve hlua ctx out of a given stream.
This way we may easily change the storage mechanism for stream hlua
contexts in the future without extensive code changes.
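Sketched from the description above, the resulting prototypes should
look like this:

    /* returns the hlua ctx for <s> and <state_id>, allocating it if
     * needed, or NULL on allocation failure */
    struct hlua *hlua_stream_ctx_prepare(struct stream *s, int state_id);

    /* returns the hlua ctx for <s> and <state_id>, or NULL if it was
     * never prepared */
    struct hlua *hlua_stream_ctx_get(struct stream *s, int state_id);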
No backport needed unless a commit depends on it.
When ha_panic() is called by the watchdog, we try to guess from
ha_task_dump() and ha_thread_dump_one() whether the thread was stuck
while executing lua from the stream context. However, we consider this
to be the case by simply checking if the stream hlua context was set,
which is not very precise: if the hlua context is set, it simply means
that at least one lua instruction was executed at the stream level, not
that the thread was executing lua at the moment the panic occurred.
This is especially true with filters: one could simply register a lua
filter that does nothing, but this would still end up initializing the
stream hlua context for each stream. If the thread ends up being stuck
during the stream handling, then the debug dumping functions will report
that the stream was stuck while handling lua, which is not necessarily
true and could in fact confuse us even more.
So here we take another approach: we add a BUSY flag to the hlua
context. This flag is set by hlua_ctx_resume() around the lua_resume()
call; this way we can precisely tell if the thread was handling lua when
it was interrupted, and we rely on this flag in the debug functions to
check whether the thread was effectively stuck inside lua while
processing the stream.
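A minimal sketch of the flag handling (names are illustrative, Lua 5.3
signature shown):

    /* sketch: the flag only covers the lua execution itself, so a dump
     * taken outside this window won't wrongly blame lua */
    hlua->flags |= HLUA_BUSY;
    ret = lua_resume(hlua->T, NULL, 0);
    hlua->flags &= ~HLUA_BUSY;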
No backport needed unless a commit depends on it.
hlua_filter_delete() calls hlua_unref() on the stream hlua stack, but
we should own the lock prior to manipulating the stack.
This should be backported up to 2.6.
This is a complementary patch to 8670db7 ("BUG/MAJOR: hlua: improper lock
usage with hlua_ctx_resume()") for hlua_filter_new().
Indeed, the HLUA_E_ERRMSG case still relies on the lua stack but didn't
take the lock to do so.
This should be backported up to 2.6.
Trying to register the same lua filter from global and per-thread context
(using 'lua-load' + 'lua-load-per-thread') causes a segmentation fault in
hlua_post_init().
This is due to a simple copy-paste error: we try to print the function
name in the error message (like we do when loading the same lua function
from different contexts) instead of the filter name.
This should be backported up to 2.6.
The new "ssl-security-level" option allows one to change the OpenSSL
security level without having to change the openssl.cnf global file of
your distribution. This directives applies on every SSL_CTX context.
People sometimes change their security level directly in the ciphers
directive, however there are some cases when the security level change
is not applied in the right order (for example when applying a DH
param).
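For example, enforcing OpenSSL security level 2 on every context:

    global
        ssl-security-level 2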
Before this patch, it was possible to work around this by using a
specific openssl.cnf file and starting haproxy this way:
OPENSSL_CONF=./openssl.cnf ./haproxy -f bug-2468.cfg
Values for the security level can be found here:
https://www.openssl.org/docs/man1.1.1/man3/SSL_CTX_set_security_level.html
This was discussed in github issue #2468.
In issue #2448, users are complaining that FIPS does not work correctly
since the removal of SSL_library_init().
This call was removed because SSL_library_init() is deprecated with
OpenSSL 3.x and emits a warning, and the initialization was not needed
anymore because it is done upon the first OpenSSL API call.
However, in some cases it is still needed. SSL_library_init() is now a
define for OPENSSL_init_ssl(0, NULL). This patch adds
OPENSSL_init_ssl(0, NULL) to the init.
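The added call is effectively what the old macro expanded to:

    #include <openssl/ssl.h>

    /* explicit libssl initialization: SSL_library_init() is nowadays
     * just a define for this exact call */
    OPENSSL_init_ssl(0, NULL);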
This could be backported to all stable branches, however let's wait
before backporting it.
3.0-dev1 introduced a small regression with commit b4db3be86e ("BUG/MINOR:
server: fix server_find_by_name() usage during parsing"). By changing the
way servers are indexed and moving it into the server template loop, the
first one is no longer indexed because the loop starts at low+1 since it
focuses on duplication. Let's index the first one explicitly now.
This should not be backported, unless the commit above is backported.
This was not useful and was using an uninitialized value. Introduced
with commit 08ac28237 ("MINOR: Add aes_gcm_enc converter").
Must be backported wherever commit 08ac28237 was backported.
The issue was introduced with the commit c31499d74 ("MINOR: ssl: Add
aes_gcm_dec converter").
This must be backported to all stable branches where the above converter
is present, but it may need to be adjusted for older branches because of
code refactoring.
Where possible (FreeBSD 13+), use the public, documented interface to
the ELF auxiliary argument vector: elf_aux_info().
__elf_aux_vector is a private interface, exported so that the runtime
linker can set its value during process startup, and is not intended for
public consumption. In FreeBSD 15 it has been removed from libc and
moved to libsys.
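For illustration, querying a single auxv entry through the documented
API looks like this:

    #include <sys/auxv.h>

    /* elf_aux_info() returns 0 on success, an errno value otherwise;
     * no need to walk __elf_aux_vector by hand */
    unsigned long pagesz = 0;
    if (elf_aux_info(AT_PAGESZ, &pagesz, sizeof(pagesz)) != 0)
        pagesz = 0;   /* entry not available */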
The previous attempt removed the TLSv1.3 version from the
"ciphersuites" keywords. However, it looks like the TLSv1.2 support of
SSL_CTX_set_ciphersuites() is a bug and can have inconsistent behavior.
This patch reverts the previous attempt and adds explanations about this
problem as well as clear examples on how to configure TLSv1.2 ciphers +
TLSv1.3 ciphersuites.
Revert "DOC: configuration: clarify ciphersuites usage"
This reverts commit e2a44d6c94.
This must be backported to all stable branches.
Fixes issue #2459.
This commit removes qc_treat_rx_crypto_frms(). This function was used
in a single place, inside qc_ssl_provide_all_quic_data(). Besides, its
naming was confusing as conceptually it is directly linked to the
quic_ssl module rather than quic_rx.
Thus, the body of qc_treat_rx_crypto_frms() is inlined directly inside
qc_ssl_provide_all_quic_data(). Also, qc_ssl_provide_quic_data() is now
only used inside quic_ssl, so its scope is set to static. Overall, the
API for CRYPTO frame handling is now cleaner.
On CRYPTO frames reception, the tasklet is rescheduled with TASK_HEAVY
to limit CPU consumption. This commit slightly simplifies this by
regrouping the TASK_HEAVY setting and the tasklet_wakeup() call in a
single location in qc_handle_crypto_frm(). All other unnecessary
tasklet_wakeup() calls are removed.
Till now it was still necessary to write rules to eliminate badly
behaving H2 clients, while most of the time it would be desirable to
just be able to set a threshold on the level of anomalies on a
connection.
This is what this patch does. By setting a glitches threshold for
frontends and backends, it allows a connection to be automatically
switched to the error state when the threshold is reached, so that the
connection dies by itself without having to write possibly complex
rules.
One subtlety is that we still have the error state being exclusive to the
parser's state so this requires the h2c_report_glitches() function to return
a status indicating if the threshold was reached or not so that processing
can instantly stop and bypass the state update, otherwise the state could
be turned back to a valid one (e.g. after parsing CONTINUATION); we
should really contemplate the possibility of using H2_CF_ERROR for this.
Fortunately
there were very few places where a glitch was reported outside of an error
path so the changes are quite minor.
Now by setting the front value to 1000, a client flooding with short
CONTINUATION frames is instantly stopped.
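In practice this is done with the new global tuning keywords; for
example, for the frontend side mentioned above:

    global
        tune.h2.fe.glitches-threshold 1000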
The function aims at centralizing countermeasures, but because it only
increments the counter by one unit, sometimes it was not used and the
value was calculated directly. Let's pass the increment as an argument
so that it can be used everywhere.
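A sketch of the resulting calling convention:

    /* sketch: account <inc> glitches at once; a non-zero return means
     * the threshold was reached and processing must stop immediately */
    if (h2c_report_glitches(h2c, inc))
        goto fail;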
Released version 3.0-dev5 with the following main changes :
- BUG/MEDIUM: applet: Fix HTX .rcv_buf callback function to release outbuf buffer
- BUG/MAJOR: ssl/ocsp: crash with ocsp when old process exit or using ocsp CLI
- BUG/MEDIUM: server: fix dynamic servers initial settings
- BUG/MINOR: ssl/cli: duplicate cleaning code in cli_parse_del_crtlist
- LICENSE: event_hdl: fix GPL license version
- LICENSE: http_ext: fix GPL license version
- BUG/MEDIUM: mux-h1: Fix again 0-copy forwarding of chunks with an unknown size
- BUG/MINOR: mux-h1: Properly report when mux is blocked during a nego
- MINOR: mux-h1: Move checks performed before a shutdown in a dedicated function
- MINOR: mux-h1: Move all stuff to detach a stream in an internal function
- MAJOR: mux-h1: Drain requests on client side before shut a stream down
- MEDIUM: htx/http-ana: No longer close connection on early HAProxy response
- MINOR: quic: filter show quic by address
- MINOR: quic: specify show quic output fields
- MINOR: quic: add MUX output for show quic
- CLEANUP: mux-h2: Fix h2s_make_data() comment about the return value
- DOC: configuration: clarify ciphersuites usage
- BUG/MINOR: config/quic: Alert about PROXY protocol use on a QUIC listener
- BUG/MINOR: hlua: Fix log level to the right value when set via TXN:set_loglevel
- MINOR: hlua: Be able to disable logging from lua
- BUG/MINOR: tools: seed the statistical PRNG slightly better
- BUG/MINOR: hlua: fix unsafe lua_tostring() usage with empty stack
- BUG/MINOR: hlua: don't use lua_tostring() from unprotected contexts
- BUG/MINOR: hlua: fix possible crash in hlua_filter_new() under load
- BUG/MINOR: hlua: improper lock usage in hlua_filter_callback()
- BUG/MINOR: hlua: improper lock usage in hlua_filter_new()
- BUG/MEDIUM: hlua: improper lock usage with SET_SAFE_LJMP()
- BUG/MAJOR: hlua: improper lock usage with hlua_ctx_resume()
- BUG/MINOR: hlua: don't call ha_alert() in hlua_event_subscribe()
- MINOR: hlua: use SEND_ERR to report errors in hlua_event_runner()
- CLEANUP: hlua: txn class functions may LJMP
- BUG/MINOR: sink: fix a race condition in the TCP log forwarding code
- BUILD: thread: move lock label definitions to thread-t.h
- BUILD: tree-wide: fix a few missing includes in a few files
- BUILD: buf: make b_ncat() take a const for the source
- CLEANUP: assorted typo fixes in the code and comments
- CLEANUP: fix typo in naming for variable "unused"
- CI: run more smoke tests on config syntax to check memory related issues
- CI: enable monthly build only test on netbsd-9.3
- CI: skip scheduled builds on forks
- BUG/MINOR: ssl/cli: typo in new ssl crl-file CLI description
- BUG/MEDIUM: quic: fix connection freeze on post handshake
- BUG/MINOR: mux-quic: fix crash on aborting uni remote stream
- CLEANUP: log: fix obsolete comment for add_sample_to_logformat_list()
- CLEANUP: tree-wide: use proper ERR_* return values for PRE_CHECK fcts
- BUG/MINOR: cfgparse: report proper location for log-format-sd errors
- MINOR: vars: export var_set and var_unset functions
- MINOR: Add aes_gcm_enc converter
- BUG/MEDIUM: quic: fix handshake freeze under high traffic
- MINOR: quic: always use ncbuf for rx CRYPTO
- BUILD: ssl: define EVP_CTRL_AEAD_GET_TAG for older versions
- DOC: design: write first notes about ring-v2
- OPTIM: sink: try to merge "dropped" messages faster
- OPTIM: sink: drop the sink lock used to count drops
- DEV: haring: make haring not depend on the struct ring itself
- DEV: haring: split the code between ring and buffer
- DEV: haring: automatically use the advertised ring header size
- BUILD: solaris: fix compilation errors
Compilation on Solaris fails because of the usage of names reserved on
that platform, i.e. 'queue' and 's_addr'.
This patch redefines 'queue' as '_queue' and renames 's_addr' to
'srv_addr', which fixes the compilation for now.
Future plan: rename 'queue' in the code base so the define can be
removed again.
Backporting: 2.9, 2.8
Instead of emitting a warning, since we don't need the ring struct
anymore, we can just read what we need, parse the buffer and use the
advertised offset. Thus for now -f is simply ignored.
By splitting the initialization and the parsing of the ring, we'll ease
the support for multiple ring sizes and get rid of the annoyances of the
optional lock.
haring needs to be self-sufficient about the ring format so that it continues
to build when the ring API changes. Let's import the struct ring definition
and call it "ring_v1".
The sink lock was made to prevent event producers from passing while
there were other threads trying to print a "dropped" message, in order
to guarantee the absence of reordering. It has a serious impact however,
which is that all threads need to take the read lock when producing a
regular trace even when there's no reader.
This patch takes a different approach. The drop counter is shifted left
by one so that the lowest bit is used to indicate that one thread is
already taking care of trying to dump the counter. Threads only read
this value normally, and will only try to change it if it's non-null,
in which case they'll first check if they are the first ones trying to
dump it, otherwise will simply count another drop and leave. This has
a large benefit. First, it will avoid the locking that causes stalls
as soon as a slow reader is present. Second, it avoids any write on the
fast path as long as there's no drop. And it remains very lightweight
since we just need to add +2 or subtract 2*dropped in operations, while
offering the guarantee that the sink_write() has succeeded before
unlocking the counter.
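A sketch of the scheme with illustrative names; the counter holds twice
the number of drops, and bit 0 marks that one thread is already busy
dumping it:

    unsigned long curr;

    HA_ATOMIC_ADD(&ctr, 2);            /* account one more dropped event */

    curr = HA_ATOMIC_LOAD(&ctr);
    if (curr && !(curr & 1) &&
        HA_ATOMIC_CAS(&ctr, &curr, curr | 1)) {
        /* we own the right to emit the "dropped" message; once
         * sink_write() succeeded, subtract 2 * dumped + 1 to release
         * both the counted drops and the busy bit */
    }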
While a reader was previously limiting the traffic to 11k RPS under
4C/8T, now we reach 36k RPS vs 14k with no reader, so readers will no
longer slow the traffic down and will instead even speed it up due to
avoiding the contention down the chain in the ring. The locking cost
dropped from ~75% to ~60% (it's in ring_write now).
When a reader doesn't read fast enough and causes drops, subsequent
threads try to produce a "dropped" message. But it takes time to
produce and emit this message, in part due to the use of chunk_printf()
that relies on vfprintf() which has to parse the printf format, and
during this time other threads may continue to increment the counter.
This is the reason why this is currently performed in a loop. When
reading what is received, it's common to see a large count followed
by one or two single-digit counts, indicating that we could possibly
have improved that by writing faster.
Let's improve the situation a little bit. First, we now use a static
message prefixed with enough space to write the digits, and a call to
ultoa_r() fills these digits from right to left so that we neither have
to process a format string nor perform a copy of the message. Second, we
now re-check the counter immediately after having prepared the message
so that we still get an opportunity to update it. In order to avoid
overly long loops, this is limited to 10 iterations.
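For reference, ultoa_r() renders the digits without any format string
parsing; a minimal sketch:

    /* sketch: convert the counter into a small scratch area; the
     * returned pointer is the start of the decimal string, ready to be
     * glued in front of a constant " events dropped" suffix */
    char area[21];
    const char *digits = ultoa_r(dropped, area, sizeof(area));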
Tests show that the number of single-digit "dropped" counters on output
now dropped roughly by 15-30%. Also, it was observed that with 8 threads,
there's almost never more than one retry.