haproxy

mirror of http://git.haproxy.org/git/haproxy.git/ synced 2024-12-27 07:02:11 +00:00

Author	SHA1	Message	Date
Christopher Faulet	046cf44f6c	MINOR: http-rules: Make set/del-map and add/del-acl custom actions Now, these actions use their own dedicated function and are no longer handled "in place" during the HTTP rules evaluation. Thus the action names ACT_HTTP__ACL and ACT_HTTP__MAP are removed. The action type is now mapped as following: 0 = add-acl, 1 = set-map, 2 = del-acl and 3 = del-map.	2020-01-20 15:18:45 +01:00
Christopher Faulet	d1f27e3394	MINOR: http-rules: Make set-header and add-header custom actions Now, these actions use their own dedicated function and are no longer handled "in place" during the HTTP rules evaluation. Thus the action names ACT_HTTP_SET_HDR and ACT_HTTP_ADD_VAL are removed. The action type is now set to 0 to set a header (so remove existing ones if any and add a new one) or to 1 to add a header (add without remove).	2020-01-20 15:18:45 +01:00
Christopher Faulet	92d34fe38d	MINOR: http-rules: Make replace-header and replace-value custom actions Now, these actions use their own dedicated function and are no longer handled "in place" during the HTTP rules evaluation. Thus the action names ACT_HTTP_REPLACE_HDR and ACT_HTTP_REPLACE_VAL are removed. The action type is now set to 0 to evaluate the whole header or to 1 to evaluate every comma-delimited values. The function http_transform_header_str() is renamed to http_replace_hdrs() to be more explicit and the function http_transform_header() is removed. In fact, this last one is now more or less the new action function. The lua code has been updated accordingly to use http_replace_hdrs().	2020-01-20 15:18:45 +01:00
Christopher Faulet	006f6507d7	MINOR: actions: Use an integer to set the action type <action> field in the act_rule structure is now an integer. The act_name values are used for all actions without action function (but it is not a pre-requisit though) or the action will have no effect. But for all other actions, any integer value may used, only the action function will take care of it. The default for such actions is ACT_CUSTOM.	2020-01-20 15:18:45 +01:00
Christopher Faulet	245cf795c1	MINOR: actions: Add flags to configure the action behaviour Some flags can now be set on an action when it is registered. The flags are defined in the act_flag enum. For now, only ACT_FLAG_FINAL may be set on an action to specify if it stops the rules evaluation. It is set on ACT_ACTION_ALLOW, ACT_ACTION_DENY, ACT_HTTP_REQ_TARPIT, ACT_HTTP_REQ_AUTH, ACT_HTTP_REDIR and ACT_TCP_CLOSE actions. But, when required, it may also be set on custom actions. Consequently, this flag is checked instead of the action type during the configuration parsing to trigger a warning when a rule inhibits all the following ones.	2020-01-20 15:18:45 +01:00
Christopher Faulet	105ba6cc54	MINOR: actions: Rename the act_flag enum into act_opt The flags in the act_flag enum have been renamed act_opt. It means ACT_OPT prefix is used instead of ACT_FLAG. The purpose of this patch is to reserve the action flags for the actions configuration.	2020-01-20 15:18:45 +01:00
Christopher Faulet	cd26e8a2ec	MINOR: http-rules/tcp-rules: Call the defined action function first if defined When TCP and HTTP rules are evaluated, if an action function (action_ptr field in the act_rule structure) is defined for a given action, it is now always called in priority over the test on the action type. Concretly, for now, only custom actions define it. Thus there is no change. It just let us the choice to extend the action type beyond the existing ones in the enum.	2020-01-20 15:18:45 +01:00
Christopher Faulet	96bff76087	MINOR: actions: Regroup some info about HTTP rules in the same struct Info used by HTTP rules manipulating the message itself are splitted in several structures in the arg union. But it is possible to group all of them in a unique struct. Now, <arg.http> is used by most of these rules, which contains: * <arg.http.i> : an integer used as status code, nice/tos/mark/loglevel or action id. * <arg.http.str> : an IST used as header name, reason string or auth realm. * <arg.http.fmt> : a log-format compatible expression * <arg.http.re> : a regular expression used by replace rules	2020-01-20 15:18:45 +01:00
Christopher Faulet	58b3564fde	MINOR: actions: Add a function pointer to release args used by actions Arguments used by actions are never released during HAProxy deinit. Now, it is possible to specify a function to do so. ".release_ptr" field in the act_rule structure may be set during the configuration parsing to a specific deinit function depending on the action type.	2020-01-20 15:18:45 +01:00
Christopher Faulet	e00d06c99f	MINOR: http-rules: Handle all message rewrites the same way In HTTP rules, error handling during a rewrite is now handle the same way for all rules. First, allocation errors are reported as internal errors. Then, if soft rewrites are allowed, rewrite errors are ignored and only the failed_rewrites counter is incremented. Otherwise, when strict rewrites are mandatory, interanl errors are returned. For now, only soft rewrites are supported. Note also that the warning sent to notify a rewrite failure was removed. It will be useless once the strict rewrites will be possible.	2020-01-20 15:18:45 +01:00
Christopher Faulet	a00071e2e5	MINOR: http-ana: Add a txn flag to support soft/strict message rewrites the HTTP_MSGF_SOFT_RW flag must now be set on the HTTP transaction to ignore rewrite errors on a message, from HTTP rules. The mode is called the soft rewrites. If thes flag is not set, strict rewrites are performed. In this mode, if a rewrite error occurred, an internal error is reported. For now, HTTP_MSGF_SOFT_RW is always set and there is no way to switch a transaction in strict mode.	2020-01-20 15:18:45 +01:00
Christopher Faulet	a08546bb5a	MINOR: counters: Remove failed_secu counter and use denied_resp instead The failed_secu counter is only used for the servers stats. It is used to report the number of denied responses. On proxies, the same info is stored in the denied_resp counter. So, it is more consistent to use the same field for servers.	2020-01-20 15:18:45 +01:00
Christopher Faulet	0159ee4032	MINOR: stats: Report internal errors in the proxies/listeners/servers stats The stats field ST_F_EINT has been added to report internal errors encountered per proxy, per listener and per server. It appears in the CLI export and on the HTML stats page.	2020-01-20 15:18:45 +01:00
Christopher Faulet	30a2a3724b	MINOR: http-rules: Add more return codes to let custom actions act as normal ones When HTTP/TCP rules are evaluated, especially HTTP ones, some results are possible for normal actions and not for custom ones. So missing return codes (ACT_RET_) have been added to let custom actions act as normal ones. Concretely following codes have been added: * ACT_RET_DENY : deny the request/response. It must be handled by the caller * ACT_RET_ABRT : abort the request/response, handled by action itsleft. * ACT_RET_INV : invalid request/response	2020-01-20 15:18:45 +01:00
Christopher Faulet	4d90db5f4c	MINOR: http-rules: Add a rule result to report internal error Now, when HTTP rules are evaluated, HTTP_RULE_RES_ERROR must be returned when an internal error is catched. It is a way to make the difference between a bad request or a bad response and an error during its processing.	2020-01-20 15:18:45 +01:00
Christopher Faulet	d4ce6c2957	MINOR: counters: Add a counter to report internal processing errors This counter, named 'internal_errors', has been added in frontend and backend counters. It should be used when a internal error is encountered, instead for failed_req or failed_resp.	2020-01-20 15:18:45 +01:00
Christopher Faulet	cb5501327c	BUG/MINOR: http-rules: Remove buggy deinit functions for HTTP rules Functions to deinitialize the HTTP rules are buggy. These functions does not check the action name to release the right part in the arg union. Only few info are released. For auth rules, the realm is released and there is no problem here. But the regex <arg.hdr_add.re> is always unconditionally released. So it is easy to make these functions crash. For instance, with the following rule HAProxy crashes during the deinit : http-request set-map(/path/to/map) %[src] %[req.hdr(X-Value)] For now, These functions are simply removed and we rely on the deinit function used for TCP rules (renamed as deinit_act_rules()). This patch fixes the bug. But arguments used by actions are not released at all, this part will be addressed later. This patch must be backported to all stable versions.	2020-01-20 15:18:45 +01:00
Willy Tarreau	ee1a6fc943	MINOR: connection: make the last arg of subscribe() a struct wait_event* The subscriber used to be passed as a "void param" that was systematically cast to a struct wait_event. By now it appears clear that the subscribe() call at every layer is well defined and always takes a pointer to an event subscriber of type wait_event, so let's enforce this in the functions' prototypes, remove the intermediary variables used to cast it and clean up the comments to clarify what all these functions do in their context.	2020-01-17 18:30:37 +01:00
Willy Tarreau	7872d1fc15	MEDIUM: connection: merge the send_wait and recv_wait entries In practice all callers use the same wait_event notification for any I/O so instead of keeping specific code to handle them separately, let's merge them and it will allow us to create new events later.	2020-01-17 18:30:36 +01:00
Willy Tarreau	3a9312af8f	REORG: stream/backend: move backend-specific stuff to backend.c For more than a decade we've kept all the sess_update_st_*() functions in stream.c while they're only there to work in relation with what is currently being done in backend.c (srv_redispatch_connect, connect_server, etc). Let's move all this pollution over there and take this opportunity to try to find slightly less confusing names for these old functions whose role is only to handle transitions from one specific stream-int state: sess_update_st_rdy_tcp() -> back_handle_st_rdy() sess_update_st_con_tcp() -> back_handle_st_con() sess_update_st_cer() -> back_handle_st_cer() sess_update_stream_int() -> back_try_conn_req() sess_prepare_conn_req() -> back_handle_st_req() sess_establish() -> back_establish() The last one remained in stream.c because it's more or less a completion function which does all the initialization expected on a connection success or failure, can set analysers and emit logs. The other ones could possibly slightly benefit from being modified to take a stream-int instead since it's really what they're working with, but it's unimportant here.	2020-01-17 18:30:36 +01:00
Willy Tarreau	3381bf89e3	MEDIUM: connection: get rid of CO_FL_CURR_* flags These ones used to serve as a set of switches between CO_FL_SOCK_* and CO_FL_XPRT_, and now that the SOCK layer is gone, they're always a copy of the last know CO_FL_XPRT_ ones that is resynchronized before I/O events by calling conn_refresh_polling_flags(), and that are pushed back to FDs when detecting changes with conn_xprt_polling_changes(). While these functions are not particularly heavy, what they do is totally redundant by now because the fd_want_/fd_stop_() actions already perform test-and-set operations to decide to create an entry or not, so they do the exact same thing that is done by conn_xprt_polling_changes(). As such it is pointless to call that one, and given that the only reason to keep CO_FL_CURR_* is to detect changes there, we can now remove them. Even if this does only save very few cycles, this removes a significant complexity that has been responsible for many bugs in the past, including the last one affecting FreeBSD. All tests look good, and no performance regressions were observed.	2020-01-17 17:45:12 +01:00
Willy Tarreau	e2a0eeca77	MINOR: connection: move the CO_FL_WAIT_ROOM cleanup to the reader only CO_FL_WAIT_ROOM is set by the splicing function in raw_sock, and cleared by the stream-int when splicing is disabled, as well as in conn_refresh_polling_flags() so that a new call to ->rcv_pipe() could be attempted by the I/O callbacks called from conn_fd_handler(). This clearing in conn_refresh_polling_flags() makes no sense anymore and is in no way related to the polling at all. Since we don't call them from there anymore it's better to clear it before attempting to receive, and to set it again later. So let's move this operation where it should be, in raw_sock_to_pipe() so that it's now symmetric. It was also placed in raw_sock_to_buf() so that we're certain that it gets cleared if an attempt to splice is replaced with a subsequent attempt to recv(). And these were currently already achieved by the call to conn_refresh_polling_flags(). Now it could theorically be removed from the stream-int.	2020-01-17 17:19:27 +01:00
Willy Tarreau	17ccd1a356	BUG/MEDIUM: connection: add a mux flag to indicate splice usability Commit `c640ef1a7d` ("BUG/MINOR: stream-int: avoid calling rcv_buf() when splicing is still possible") fixed splicing in TCP and legacy mode but broke it badly in HTX mode. What happens in HTX mode is that the channel's to_forward value remains set to CHN_INFINITE_FORWARD during the whole transfer, and as such it is not a reliable signal anymore to indicate whether more data are expected or not. Thus, when data are spliced out of the mux using rcv_pipe(), even when the end is reached (that only the mux knows about), the call to rcv_buf() to get the final HTX blocks completing the message were skipped and there was often no new event to wake this up, resulting in transfer timeouts at the end of large objects. All this goes down to the fact that the channel has no more information about whether it can splice or not despite being the one having to take the decision to call rcv_pipe() or not. And we cannot afford to call rcv_buf() inconditionally because, as the commit above showed, this reduces the forwarding performance by 2 to 3 in TCP and legacy modes due to data lying in the buffer preventing splicing from being used later. The approach taken by this patch consists in offering the muxes the ability to report a bit more information to the upper layers via the conn_stream. This information could simply be to indicate that more data are awaited but the real need being to distinguish splicing and receiving, here instead we clearly report the mux's willingness to be called for splicing or not. Hence the flag's name, CS_FL_MAY_SPLICE. The mux sets this flag when it knows that its buffer is empty and that data waiting past what is currently known may be spliced, and clears it when it knows there's no more data or that the caller must fall back to rcv_buf() instead. The stream-int code now uses this to determine if splicing may be used or not instead of looking at the rcv_pipe() callbacks through the whole chain. And after the rcv_pipe() call, it checks the flag again to decide whether it may safely skip rcv_buf() or not. All this bitfield dance remains a bit complex and it starts to appear obvious that splicing vs reading should be a decision of the mux based on permission granted by the data layer. This would however increase the API's complexity but definitely need to be thought about, and should even significantly simplify the data processing layer. The way it was integrated in mux-h1 will also result in no more calls to rcv_pipe() on chunked encoded data, since these ones are currently disabled at the mux level. However once the issue with chunks+splice is fixed, it will be important to explicitly check for curr_len\|CHNK to set MAY_SPLICE, so that we don't call rcv_buf() after each chunk. This fix must be backported to 2.1 and 2.0.	2020-01-17 17:00:12 +01:00
Willy Tarreau	340b07e868	BUG/MAJOR: hashes: fix the signedness of the hash inputs Wietse Venema reported in the thread below that we have a signedness issue with our hashes implementations: due to the use of const char* for the input key that's often text, the crc32, sdbm, djb2, and wt6 algorithms return a platform-dependent value for binary input keys containing bytes with bit 7 set. This means that an ARM or PPC platform will hash binary inputs differently from an x86 typically. Worse, some algorithms are well defined in the industry (like CRC32) and do not provide the expected result on x86, possibly causing interoperability issues (e.g. a user-agent would fail to compare the CRC32 of a message body against the one computed by haproxy). Fortunately, and contrary to the first impression, the CRC32c variant used in the PROXY protocol processing is not affected. Thus the impact remains very limited (the vast majority of input keys are text-based, such as user-agent headers for exmaple). This patch addresses the issue by fixing all hash functions' prototypes (even those not affected, for API consistency). A reg test will follow in another patch. The vast majority of users do not use these hashes. And among those using them, very few will pass them on binary inputs. However, for the rare ones doing it, this fix MAY have an impact during the upgrade. For example if the package is upgraded on one LB then on another one, and the CRC32 of a binary input is used as a stick table key (why?) then these CRCs will not match between both nodes. Similarly, if "hash-type ... crc32" is used, LB inconsistency may appear during the transition. For this reason it is preferable to apply the patch on all nodes using such hashes at the same time. Systems upgraded via their distros will likely observe the least impact since they're expected to be upgraded within a short time frame. And it is important for distros NOT to skip this fix, in order to avoid distributing an incompatible implementation of a hash. This is the reason why this patch is tagged as MAJOR, eventhough it's extremely unlikely that anyone will ever notice a change at all. This patch must be backported to all supported branches since the hashes were introduced in 1.5-dev20 (commit `98634f0c`). Some parts may be dropped since implemented later. Link to Wietse's report: https://marc.info/?l=postfix-users&m=157879464518535&w=2	2020-01-16 08:23:42 +01:00
Willy Tarreau	f31af9367e	MEDIUM: lua: don't call the GC as often when dealing with outgoing connections In order to properly close connections established from Lua in case a Lua context dies, the context currently automatically gets a flag HLUA_MUST_GC set whenever an outgoing connection is used. This causes the GC to be enforced on the context's death as well as on yield. First, it does not appear necessary to do it when yielding, since if the connections die they are already cleaned up. Second, the problem with the flag is that even if a connection gets properly closed, the flag is not removed and the GC continues to be called on the Lua context. The impact on performance looks quite significant, as noticed and diagnosed by Sadasiva Gujjarlapudi in the following thread: https://www.mail-archive.com/haproxy@formilux.org/msg35810.html This patch changes the flag for a counter so that each created connection increments it and each cleanly closed connection decrements it. That way we know we have to call the GC on the context's death only if the count is non-null. As reported in the thread above, the Lua performance gain is now over 20% by doing this. Thanks to Sada and Thierry for the design discussion and tests that led to this solution.	2020-01-14 10:12:31 +01:00
Olivier Houchard	3c4f40acbf	BUG/MEDIUM: tasks: Use the MT macros in tasklet_free(). In tasklet_free(), to attempt to remove ourself, use MT_LIST_DEL, we can't just use LIST_DEL(), as we theorically could be in the shared tasklet list. This should be backported to 2.1.	2020-01-10 16:56:59 +01:00
Florian Tham	9205fea13a	MINOR: http: Add 404 to http-request deny This patch adds http status code 404 Not Found to http-request deny. See issue #80.	2020-01-08 16:15:23 +01:00
Florian Tham	272e29b5cc	MINOR: http: Add 410 to http-request deny This patch adds http status code 410 Gone to http-request deny. See issue #80.	2020-01-08 16:15:23 +01:00
Willy Tarreau	eaf05be0ee	OPTIM: polling: do not create update entries for FD removal In order to reduce the number of poller updates, we can benefit from the fact that modern pollers use sampling to report readiness and that under load they rarely report the same FD multiple times in a row. As such it's not always necessary to disable such FDs especially when we're almost certain they'll be re-enabled again and will require another set of syscalls. Now instead of creating an update for a (possibly temporary) removal, we only perform this removal if the FD is reported again as ready while inactive. In addition this is performed via another update so that alternating workloads like transfers have a chance to re-enable the FD without any syscall during the loop (typically after the data that filled a buffer have been sent). However we only do that for single- threaded FDs as the other ones require a more complex setup and are not on the critical path. This does cause a few spurious wakeups but almost totally eliminates the calls to epoll_ctl() on connections seeing intermitent traffic like HTTP/1 to a server or client. A typical example with 100k requests for 4 kB objects over 200 connections shows that the number of epoll_ctl() calls doesn't depend on the number of requests anymore but most exclusively on the number of established connections: Before: % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 57.09 0.499964 0 654361 321190 recvfrom 38.33 0.335741 0 369097 1 epoll_wait 4.56 0.039898 0 44643 epoll_ctl 0.02 0.000211 1 200 200 connect ------ ----------- ----------- --------- --------- ---------------- 100.00 0.875814 1068301 321391 total After: % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 59.25 0.504676 0 657600 323630 recvfrom 40.68 0.346560 0 374289 1 epoll_wait 0.04 0.000370 0 620 epoll_ctl 0.03 0.000228 1 200 200 connect ------ ----------- ----------- --------- --------- ---------------- 100.00 0.851834 1032709 323831 total As expected there is also a slight increase of epoll_wait() calls since delaying de-activation of events can occasionally cause one spurious wakeup.	2019-12-27 16:38:47 +01:00
Willy Tarreau	19689882e6	MINOR: poller: do not call the IO handler if the FD is not active For now this almost never happens but with subsequent patches it will become more important not to uselessly call the I/O handlers if the FD is not active.	2019-12-27 16:38:47 +01:00
Willy Tarreau	0fbc318e24	CLEANUP: connection: merge CO_FL_NOTIFY_DATA and CO_FL_NOTIFY_DONE Both flags became equal in commit `82967bf9` ("MINOR: connection: adjust CO_FL_NOTIFY_DATA after removal of flags"), which already predicted the overlap between xprt_done_cb() and wake() after the removal of the DATA specific flags in 1.8. Let's simply remove CO_FL_NOTIFY_DATA since the "_DONE" version already covers everything and explains the intent well enough.	2019-12-27 16:38:47 +01:00
Willy Tarreau	4970e5adb7	REORG: connection: move tcp_connect_probe() to conn_fd_check() The function is not TCP-specific at all, it covers all FD-based sockets so let's move this where other similar functions are, in connection.c, and rename it conn_fd_check().	2019-12-27 16:38:43 +01:00
Willy Tarreau	11ef0837af	MINOR: pollers: add a new flag to indicate pollers reporting ERR & HUP In practice it's all pollers except select(). It turns out that we're keeping some legacy code only for select and enforcing it on all pollers, let's offer the pollers the ability to declare that they do not need that.	2019-12-27 14:04:33 +01:00
Lukas Tribus	a26d1e1324	BUILD: ssl: improve SSL_CTX_set_ecdh_auto compatibility SSL_CTX_set_ecdh_auto() is not defined when OpenSSL 1.1.1 is compiled with the no-deprecated option. Remove existing, incomplete guards and add a compatibility macro in openssl-compat.h, just as OpenSSL does: `bf4006a6f9/include/openssl/ssl.h (L1486)` This should be backported as far as 2.0 and probably even 1.9.	2019-12-21 06:46:55 +01:00
Rosen Penev	b3814c2ca8	BUG/MINOR: ssl: openssl-compat: Fix getm_ defines LIBRESSL_VERSION_NUMBER evaluates to 0 under OpenSSL, making the condition always true. Check for the define before checking it. Signed-off-by: Rosen Penev <rosenp@gmail.com> [wt: to be backported as far as 1.9]	2019-12-20 16:01:31 +01:00
Willy Tarreau	dd0e89a084	BUG/MAJOR: task: add a new TASK_SHARED_WQ flag to fix foreing requeuing Since 1.9 with commit `b20aa9eef3` ("MAJOR: tasks: create per-thread wait queues") a task bound to a single thread will not use locks when being queued or dequeued because the wait queue is assumed to be the owner thread's. But there exists a rare situation where this is not true: the health check tasks may be running on one thread waiting for a response, and may in parallel be requeued by another thread calling health_adjust() after a detecting a response error in traffic when "observe l7" is set, and "fastinter" is lower than "inter", requiring to shorten the running check's timeout. In this case, the task being requeued was present in another thread's wait queue, thus opening a race during task_unlink_wq(), and gets requeued into the calling thread's wait queue instead of the running one's, opening a second race here. This patch aims at protecting against the risk of calling task_unlink_wq() from one thread while the task is queued on another thread, hence unlocked, by introducing a new TASK_SHARED_WQ flag. This new flag indicates that a task's position in the wait queue may be adjusted by other threads than then one currently executing it. This means that such WQ manipulations must be performed under a lock. There are two types of such tasks: - the global ones, using the global wait queue (technically speaking, those whose thread_mask has at least 2 bits set). - some local ones, which for now will be placed into the global wait queue as well in order to benefit from its lock. The flag is automatically set on initialization if the task's thread mask indicates more than one thread. The caller must also set it if it intends to let other threads update the task's expiration delay (e.g. delegated I/Os), or if it intends to change the task's affinity over time as this could lead to the same situation. Right now only the situation described above seems to be affected by this issue, and it is very difficult to trigger, and even then, will often have no visible effect beyond stopping the checks for example once the race is met. On my laptop it is feasible with the following config, chained to httpterm: global maxconn 400 # provoke FD errors, calling health_adjust() defaults mode http timeout client 10s timeout server 10s timeout connect 10s listen px bind :8001 option httpchk /?t=50 server sback 127.0.0.1:8000 backup server-template s 0-999 127.0.0.1:8000 check port 8001 inter 100 fastinter 10 observe layer7 This patch will automatically address the case for the checks because check tasks are created with multiple threads bound and will get the TASK_SHARED_WQ flag set. If in the future more tasks need to rely on this (multi-threaded muxes for example) and the use of the global wait queue becomes a bottleneck again, then it should not be too difficult to place locks on the local wait queues and queue the task on its bound thread. This patch needs to be backported to 2.1, 2.0 and 1.9. It depends on previous patch "MINOR: task: only check TASK_WOKEN_ANY to decide to requeue a task". Many thanks to William Dauchy for providing detailed traces allowing to spot the problem.	2019-12-19 14:42:22 +01:00
Christopher Faulet	76014fd118	MEDIUM: h1-htx: Add HTX EOM block when the message is in H1_MSG_DONE state During H1 parsing, the HTX EOM block is added before switching the message state to H1_MSG_DONE. It is an exception in the way to convert an H1 message to HTX. Except for this block, the message is first switched to the right state before starting to add the corresponding HTX blocks. For instance, the message is switched in H1_MSG_DATA state and then the HTX DATA blocks are added. With this patch, the message is switched to the H1_MSG_DONE state when all data blocks or trailers were processed. It is the caller responsibility to call h1_parse_msg_eom() when the H1_MSG_DONE state is reached. This way, it is far easier to catch failures when the HTX buffer is full. The H1 and FCGI muxes have been updated accordingly. This patch may eventually be backported to 2.1 if it helps other backports.	2019-12-11 16:46:16 +01:00
Willy Tarreau	fec56c6a76	BUG/MINOR: listener: fix off-by-one in state name check As reported in issue #380, the state check in listener_state_str() is invalid as it allows state value 9 to report crap. We don't use such a state value so the issue should never happen unless the memory is already corrupted, but better clean this now while it's harmless. This should be backported to all maintained branches.	2019-12-11 15:51:37 +01:00
Willy Tarreau	d26c9f9465	BUG/MINOR: mworker: properly pass SIGTTOU/SIGTTIN to workers If a new process is started with -sf and it fails to bind, it may send a SIGTTOU to the master process in hope that it will temporarily unbind. Unfortunately this one doesn't catch it and stops to background instead of forwarding the signal to the workers. The same is true for SIGTTIN. This commit simply implements an extra signal handler for the master to deal with such signals that must be passed down to the workers. It must be backported as far as 1.8, though there the code differs in that it's entirely in haproxy.c and doesn't require an extra sig handler.	2019-12-11 14:26:53 +01:00
Willy Tarreau	c49ba52524	MINOR: tasks: split wake_expired_tasks() in two parts to avoid useless wakeups We used to have wake_expired_tasks() wake up tasks and return the next expiration delay. The problem this causes is that we have to call it just before poll() in order to consider latest timers, but this also means that we don't wake up all newly expired tasks upon return from poll(), which thus systematically requires a second poll() round. This is visible when running any scheduled task like a health check, as there are systematically two poll() calls, one with the interval, nothing is done after it, and another one with a zero delay, and the task is called: listen test bind *:8001 server s1 127.0.0.1:1111 check 09:37:38.200959 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8696843}) = 0 09:37:38.200967 epoll_wait(3, [], 200, 1000) = 0 09:37:39.202459 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8712467}) = 0 >> nothing run here, as the expired task was not woken up yet. 09:37:39.202497 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8715766}) = 0 09:37:39.202505 epoll_wait(3, [], 200, 0) = 0 09:37:39.202513 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8719064}) = 0 >> now the expired task was woken up 09:37:39.202522 socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 7 09:37:39.202537 fcntl(7, F_SETFL, O_RDONLY\|O_NONBLOCK) = 0 09:37:39.202565 setsockopt(7, SOL_TCP, TCP_NODELAY, [1], 4) = 0 09:37:39.202577 setsockopt(7, SOL_TCP, TCP_QUICKACK, [0], 4) = 0 09:37:39.202585 connect(7, {sa_family=AF_INET, sin_port=htons(1111), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress) 09:37:39.202659 epoll_ctl(3, EPOLL_CTL_ADD, 7, {EPOLLOUT, {u32=7, u64=7}}) = 0 09:37:39.202673 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8814713}) = 0 09:37:39.202683 epoll_wait(3, [{EPOLLOUT\|EPOLLERR\|EPOLLHUP, {u32=7, u64=7}}], 200, 1000) = 1 09:37:39.202693 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8818617}) = 0 09:37:39.202701 getsockopt(7, SOL_SOCKET, SO_ERROR, [111], [4]) = 0 09:37:39.202715 close(7) = 0 Let's instead split the function in two parts: - the first part, wake_expired_tasks(), called just before process_runnable_tasks(), wakes up all expired tasks; it doesn't compute any timeout. - the second part, next_timer_expiry(), called just before poll(), only computes the next timeout for the current thread. Thanks to this, all expired tasks are properly woken up when leaving poll, and each poll call's timeout remains up to date: 09:41:16.270449 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=10223556}) = 0 09:41:16.270457 epoll_wait(3, [], 200, 999) = 0 09:41:17.270130 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=10238572}) = 0 09:41:17.270157 socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 7 09:41:17.270194 fcntl(7, F_SETFL, O_RDONLY\|O_NONBLOCK) = 0 09:41:17.270204 setsockopt(7, SOL_TCP, TCP_NODELAY, [1], 4) = 0 09:41:17.270216 setsockopt(7, SOL_TCP, TCP_QUICKACK, [0], 4) = 0 09:41:17.270224 connect(7, {sa_family=AF_INET, sin_port=htons(1111), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress) 09:41:17.270299 epoll_ctl(3, EPOLL_CTL_ADD, 7, {EPOLLOUT, {u32=7, u64=7}}) = 0 09:41:17.270314 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=10337841}) = 0 09:41:17.270323 epoll_wait(3, [{EPOLLOUT\|EPOLLERR\|EPOLLHUP, {u32=7, u64=7}}], 200, 1000) = 1 09:41:17.270332 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=10341860}) = 0 09:41:17.270340 getsockopt(7, SOL_SOCKET, SO_ERROR, [111], [4]) = 0 09:41:17.270367 close(7) = 0 This may be backported to 2.1 and 2.0 though it's unlikely to bring any user-visible improvement except to clarify debugging.	2019-12-11 09:42:58 +01:00
Willy Tarreau	440d09b244	BUG/MINOR: tasks: only requeue a task if it was already in the queue Commit `0742c314c3` ("BUG/MEDIUM: tasks: Make sure we switch wait queues in task_set_affinity().") had a slight side effect on expired timeouts, which is that when used before a timeout is updated, it will cause an existing task to be requeued earlier than its expected timeout when done before being updated, resulting in the next poll wakup timeout too early or even instantly if the previous wake up was done on a timeout. This is visible in strace when health checks are enabled because there are two poll calls, one of which has a short or zero delay. The correct solution is to only requeue a task if it was already in the queue. This can be backported to all branches having the fix above.	2019-12-11 09:21:36 +01:00
Willy Tarreau	a1d97f88e0	REORG: listener: move the global listener queue code to listener.c The global listener queue code and declarations were still lying in haproxy.c while not needed there anymore at all. This complicates the code for no reason. As a result, the global_listener_queue_task and the global_listener_queue were made static.	2019-12-10 14:16:03 +01:00
Willy Tarreau	241797a3fc	MINOR: listener: split dequeue_all_listener() in two We use it half times for the global_listener_queue and half times for a proxy's queue and this requires the callers to take care of these. Let's split it in two versions, the current one working only on the global queue and another one dedicated to proxies for the per-proxy queues. This cleans up quite a bit of code.	2019-12-10 14:14:09 +01:00
Willy Tarreau	a45a8b5171	MEDIUM: init: set NO_NEW_PRIVS by default when supported HAProxy doesn't need to call executables at run time (except when using external checks which are strongly recommended against), and is even expected to isolate itself into an empty chroot. As such, there basically is no valid reason to allow a setuid executable to be called without the user being fully aware of the risks. In a situation where haproxy would need to call external checks and/or disable chroot, exploiting a vulnerability in a library or in haproxy itself could lead to the execution of an external program. On Linux it is possible to lock the process so that any setuid bit present on such an executable is ignored. This significantly reduces the risk of privilege escalation in such a situation. This is what haproxy does by default. In case this causes a problem to an external check (for example one which would need the "ping" command), then it is possible to disable this protection by explicitly adding this directive in the global section. If enabled, it is possible to turn it back off by prefixing it with the "no" keyword. Before the option: $ socat - /tmp/sock1 <<< "expert-mode on; debug dev exec sudo /bin/id" uid=0(root) gid=0(root) groups=0(root After the option: $ socat - /tmp/sock1 <<< "expert-mode on; debug dev exec sudo /bin/id" sudo: effective uid is not 0, is /usr/bin/sudo on a file system with the 'nosuid' option set or an NFS file system without root privileges?	2019-12-06 17:20:26 +01:00
Olivier Houchard	0742c314c3	BUG/MEDIUM: tasks: Make sure we switch wait queues in task_set_affinity(). In task_set_affinity(), leave the wait_queue if any before changing the affinity, and re-enter a wait queue once it is done. If we don't do that, the task may stay in the wait queue of another thread, and we later may end up modifying that wait queue while holding no lock, which could lead to memory corruption. THis should be backported to 2.1, 2.0 and 1.9.	2019-12-05 15:11:19 +01:00
Willy Tarreau	d96f1126fe	MEDIUM: init: prevent process and thread creation at runtime Some concerns are regularly raised about the risk to inherit some Lua files which make use of a fork (e.g. via os.execute()) as well as whether or not some of bugs we fix might or not be exploitable to run some code. Given that haproxy is event-driven, any foreground activity completely stops processing and is easy to detect, but background activity is a different story. A Lua script could very well discretely fork a sub-process connecting to a remote location and taking commands, and some injected code could also try to hide its activity by creating a process or a thread without blocking the rest of the processing. While such activities should be extremely limited when run in an empty chroot without any permission, it would be better to get a higher assurance they cannot happen. This patch introduces something very simple: it limits the number of processes and threads to zero in the workers after the last thread was created. By doing so, it effectively instructs the system to fail on any fork() or clone() syscall. Thus any undesired activity has to happen in the foreground and is way easier to detect. This will obviously break external checks (whose concept is already totally insecure), and for this reason a new option "insecure-fork-wanted" was added to disable this protection, and it is suggested in the fork() error report from the checks. It is obviously recommended not to use it and to reconsider the reasons leading to it being enabled in the first place. If for any reason we fail to disable forks, we still start because it could be imaginable that some operating systems refuse to set this limit to zero, but in this case we emit a warning, that may or may not be reported since we're after the fork point. Ideally over the long term it should be conditionned by strict-limits and cause a hard fail.	2019-12-03 11:49:00 +01:00
Emmanuel Hocdet	e9a100e982	BUG/MINOR: ssl: fix X509 compatibility for openssl < 1.1.0 Commit `d4f9a60e` "MINOR: ssl: deduplicate ca-file" uses undeclared X509 functions when build with openssl < 1.1.0. Introduce this functions in openssl-compat.h . Fix issue #385.	2019-12-03 07:13:12 +01:00
Emmanuel Hocdet	d4f9a60ee2	MINOR: ssl: deduplicate ca-file Typically server line like: 'server-template srv 1-1000 *:443 ssl ca-file ca-certificates.crt' load ca-certificates.crt 1000 times and stay duplicated in memory. Same case for bind line: ca-file is loaded for each certificate. Same 'ca-file' can be load one time only and stay deduplicated in memory. As a corollary, this will prevent file access for ca-file when updating a certificate via CLI.	2019-11-28 11:11:20 +01:00
Willy Tarreau	cdb27e8295	MINOR: version: this is development again, update the status It's basically a revert of commit `9ca7f8cea`.	2019-11-25 20:38:32 +01:00
Willy Tarreau	2e077f8d53	[RELEASE] Released version 2.2-dev0 Released version 2.2-dev0 with the following main changes : - exact copy of 2.1.0	2019-11-25 20:36:16 +01:00

1 2 3 4 5 ...

3853 Commits