This patch adds the `unique-id` option to `proxy-v2-options`. If this
option is set, a unique ID is generated based on the `unique-id-format`
while sending the proxy protocol v2 header, and stored as the unique ID
for the first stream of the connection.
This feature is meant to be used in `tcp` mode. It also works in HTTP mode, but
might result in inconsistent unique IDs for the first request on a keep-alive
connection, because the unique ID for the first stream is generated earlier
than the others.
Now that we can send unique IDs in `tcp` mode, the `%ID` log variable is made
available in TCP mode as well.
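For illustration, here is a minimal configuration sketch (addresses and the
format string are placeholders):

    listen tcp-in
        mode tcp
        unique-id-format %{+X}o\ %ci:%cp_%fi:%fp_%Ts_%rt:%pid
        log-format "%ci:%cp [%t] %ft %b/%s %ID"
        server srv1 192.0.2.10:443 send-proxy-v2 proxy-v2-options unique-id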
Historically we used to require that the connections held the desired
polling states for the data layer and the socket layer. Then with muxes
these were more or less merged into the transport layer, and now it
happens that with all transport layers having their own state, the
"transport layer state" as we have it in the connection (XPRT_RD_ENA,
XPRT_WR_ENA) is only an exact copy of the underlying file descriptor
state, but with a delay. All of this is causing some difficulties at
many places in the code because there are still some locations which
use the conn_want_* API to remain clean and only rely on the connection,
and count on a later collection call to conn_cond_update_polling(),
while others need an immediate action and directly use the FD updates.
Since our updates are now much cheaper, most of them being only an
atomic test-and-set operation, and since our I/O callbacks are deferred,
there's no benefit anymore in trying to "cache" the transient state
change in the connection flags hoping to cancel them before they
become an FD event. Better make such calls transparent indirections
to the FD layer instead and get rid of the deferred operations which
needlessly complicate the logic inside.
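As a sketch of the resulting model, the connection-level calls become thin
wrappers over the FD layer (simplified; the real helpers carry a few more
checks):

    static inline void conn_xprt_want_recv(struct connection *c)
    {
        fd_want_recv(c->handle.fd);   /* immediate, atomic FD update */
    }

    static inline void conn_xprt_stop_recv(struct connection *c)
    {
        fd_stop_recv(c->handle.fd);   /* no transient state cached in flags */
    }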
This removes flags CO_FL_XPRT_{RD,WR}_ENA and CO_FL_WILL_UPDATE.
A number of functions related to polling updates were either greatly
simplified or removed.
Two places were using CO_FL_XPRT_WR_ENA as a hint to know if more data
were expected to be sent after a PROXY protocol or SOCKSv4 header. These
ones were simply replaced with a check on the subscription, which is
where we ought to get the authoritative information from.
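For instance, such a check may look like this (sketch; this assumes the
merged conn->subs subscription field, older trees used separate
recv_wait/send_wait pointers):

    /* more data are expected after the PROXY/SOCKSv4 header only if
     * the data layer is subscribed for send events
     */
    if (conn->subs && (conn->subs->events & SUB_RETRY_SEND))
        return 1;  /* the sender still has something to push */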
Now the __conn_xprt_want_* and their conn_xprt_want_* counterparts
are the same. conn_stop_polling() and conn_xprt_stop_both() are the
same as well. conn_cond_update_polling() only causes errors to stop
polling. It also becomes way more obvious that muxes should not at
all employ conn_xprt_{want|stop}_{recv,send}(), and that the call
to __conn_xprt_stop_recv() in case a mux failed to allocate a buffer
is inappropriate; it ought to unsubscribe from reads instead. All of
this definitely requires a serious cleanup.
A few places in health checks and stream-int on the send path were still
checking for this flag. Now we do not, and instead rely on snd_buf()
to report the error, if any.
It's worth noting that all 3 real muxes still use CO_FL_SOCK_WR_SH and
CO_FL_ERROR interchangeably at various places to decide to abort and/or
free their data. This should be clarified and fixed so that only
CO_FL_ERROR is used, and this will render the error paths simpler and
more accurate.
Just like with CO_FL_SOCK_RD_SH, we don't need to check for this flag too
early because conn_sock_send() already does it. No error was lost, so it
was harmless; it was only useless code.
As mentioned in commit c192b0ab95 ("MEDIUM: connection: remove
CO_FL_CONNECTED and only rely on CO_FL_WAIT_*"), there is a lack of
consistency on which flags are checked among L4/L6/HANDSHAKE depending
on the code areas. A number of sample fetch functions only check for
L4L6 to report MAY_CHANGE, some places only check for HANDSHAKE and
many check both L4L6 and HANDSHAKE.
This patch starts to make all of this more consistent by introducing a
new mask CO_FL_WAIT_XPRT which is the union of L4/L6/HANDSHAKE and
reports whether the transport layer is ready or not.
All inconsistent call places were updated to rely on this one each time
the goal was to check for the readiness of the transport layer.
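A sketch of the new composite mask (bit values illustrative, not the real
ones):

    #define CO_FL_WAIT_L4_CONN 0x00000400  /* waiting for L4 to be connected */
    #define CO_FL_WAIT_L6_CONN 0x00000800  /* waiting for L6 (e.g. SSL) */
    #define CO_FL_HANDSHAKE    0x00001000  /* a handshake is in progress */
    #define CO_FL_WAIT_XPRT    (CO_FL_WAIT_L4_CONN|CO_FL_WAIT_L6_CONN|CO_FL_HANDSHAKE)

    /* call places now perform a single readiness test: */
    if (!(conn->flags & CO_FL_WAIT_XPRT)) {
        /* the transport layer is ready */
    }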
As mentioned in c192b0ab95 ("MEDIUM: connection: remove CO_FL_CONNECTED
and only rely on CO_FL_WAIT_*"), si_cs_recv() currently does not propagate
CS_FL_EOS to CF_READ_NULL if CO_FL_WAIT_L4L6 is set, while this situation
doesn't exist anymore. Let's get rid of this confusing test.
Commit 477902bd2e ("MEDIUM: connections: Get rid of the xprt_done
callback.") broke the master CLI for a very obscure reason. It happens
that short requests immediately terminated by a shutdown are properly
received, CS_FL_EOS is correctly set, but in si_cs_recv(), we refrain
from setting CF_SHUTR on the channel because CO_FL_CONNECTED was not
yet set on the connection since we've not passed again through
conn_fd_handler() and it was not done in conn_complete_session(). While
commit a8a415d31a ("BUG/MEDIUM: connections: Set CO_FL_CONNECTED in
conn_complete_session()") fixed the issue, such accident may happen
again as the root cause is deeper and actually comes down to the fact
that CO_FL_CONNECTED is lazily set at various check points in the code
but not every time we drop one wait bit. It is not the first time we
face this situation.
Originally this flag was used to detect the transition between WAIT_*
and CONNECTED in order to call ->wake() from the FD handler. But since
at least 1.8-dev1 with commit 7bf3fa3c23 ("BUG/MAJOR: connection: update
CO_FL_CONNECTED before calling the data layer"), CO_FL_CONNECTED is
always synchronized against the two others before being checked. Moreover,
with the I/Os moved to tasklets, the decision to call the ->wake() function
is performed after the I/Os in si_cs_process() and equivalent, which don't
care about this transition either.
So in essence, checking for CO_FL_CONNECTED has become a lazy way to
check for (CO_FL_WAIT_L4_CONN | CO_FL_WAIT_L6_CONN), but that always
relies on someone else having synchronized it.
This patch addresses it once for all by killing this flag and only checking
the two others (for which a composite mask CO_FL_WAIT_L4L6 was added). This
revealed a number of inconsistencies that were purposely not addressed here
for the sake of bisectability:
- while most places do check both L4+L6 and HANDSHAKE at the same time,
some places like assign_server() or back_handle_st_con() and a few
sample fetches looking for proxy protocol do check for L4+L6 but
don't care about HANDSHAKE ; these ones will probably fail on TCP
request session rules if the handshake is not complete.
- some handshake handlers do validate that a connection is established
at L4 but didn't clear CO_FL_WAIT_L4_CONN
- the ->ctl method of mux_fcgi, mux_pt and mux_h1 only checks for L4+L6
before declaring the mux ready while the snd_buf function also checks
for the handshake's completion. Likely the former should validate the
handshake as well and we should get rid of these extra tests in snd_buf.
- raw_sock_from_buf() would directly set CO_FL_CONNECTED and would only
later clear CO_FL_WAIT_L4_CONN.
- xprt_handshake would set CO_FL_CONNECTED itself without actually
clearing CO_FL_WAIT_L4_CONN, which could apparently happen only if
waiting for a pure Rx handshake.
- most places in ssl_sock that were checking CO_FL_CONNECTED don't need
to include the L4 check as an L6 check is enough to decide whether to
wait for more info or not.
It also becomes obvious when reading the test in si_cs_recv() that caused
the failure mentioned above that once converted it doesn't make any sense
anymore: having CS_FL_EOS set while still waiting for L4 and L6 to complete
cannot happen since for CS_FL_EOS to be set, the other ones must have been
validated.
Some of these parts will still deserve further cleanup, and some of the
observations above may induce some backports of potential bug fixes once
totally analyzed in their context. The risk of breaking existing stuff
is too high to blindly backport everything.
In connect_server(), when creating a new connection for which we don't yet
know the mux (because it'll be decided by the ALPN), instead of associating
the connection to the stream_interface, always create a conn_stream. This way,
less special-casing is needed. Store the conn_stream in conn->ctx,
so that we can reach the upper layers if needed.
The only case where this made sense was with mux_h1, but since we
introduced CS_FL_MAY_SPLICE, we don't need to rely on this anymore,
and thus don't need to clear it either when we do not splice.
There is a last check on this flag, used to determine whether the rx channel
is full, which cannot go away unless it's changed to use the CS
instead; but for now this wouldn't add any benefit, so better not do
it yet.
Commit c640ef1a7d ("BUG/MINOR: stream-int: avoid calling rcv_buf() when
splicing is still possible") fixed splicing in TCP and legacy mode but
broke it badly in HTX mode.
What happens in HTX mode is that the channel's to_forward value remains
set to CHN_INFINITE_FORWARD during the whole transfer, and as such it is
not a reliable signal anymore to indicate whether more data are expected
or not. Thus, when data are spliced out of the mux using rcv_pipe(), even
when the end is reached (that only the mux knows about), the call to
rcv_buf() to get the final HTX blocks completing the message was skipped
and there was often no new event to wake this up, resulting in transfer
timeouts at the end of large objects.
All this goes down to the fact that the channel has no more information
about whether it can splice or not despite being the one having to take
the decision to call rcv_pipe() or not. And we cannot afford to call
rcv_buf() unconditionally because, as the commit above showed, this
reduces the forwarding performance by a factor of 2 to 3 in TCP and legacy modes
due to data lying in the buffer preventing splicing from being used
later.
The approach taken by this patch consists in offering the muxes the ability
to report a bit more information to the upper layers via the conn_stream.
This information could simply have been an indication that more data are
awaited, but since the real need is to distinguish splicing from receiving,
here we instead clearly report the mux's willingness to be called for
splicing or not. Hence the flag's name, CS_FL_MAY_SPLICE.
The mux sets this flag when it knows that its buffer is empty and that
data waiting past what is currently known may be spliced, and clears it
when it knows there's no more data or that the caller must fall back to
rcv_buf() instead.
The stream-int code now uses this to determine if splicing may be used
or not instead of looking at the rcv_pipe() callbacks through the whole
chain. And after the rcv_pipe() call, it checks the flag again to decide
whether it may safely skip rcv_buf() or not.
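In short, the contract looks like this (sketch; the condition names on the
mux side are placeholders for the mux's internal knowledge):

    /* mux side: advertise whether rcv_pipe() is worth calling */
    if (buffer_is_empty && more_payload_may_follow)
        cs->flags |= CS_FL_MAY_SPLICE;
    else
        cs->flags &= ~CS_FL_MAY_SPLICE;

    /* stream-int side: splice only when the mux agrees */
    if (cs->flags & CS_FL_MAY_SPLICE)
        ret = cs->conn->mux->rcv_pipe(cs, ic->pipe, ic->to_forward);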
All this bitfield dance remains a bit complex and it starts to appear
obvious that splicing vs reading should be a decision of the mux based
on permission granted by the data layer. This would however increase
the API's complexity but definitely need to be thought about, and should
even significantly simplify the data processing layer.
The way it was integrated in mux-h1 will also result in no more calls
to rcv_pipe() on chunked encoded data, since these ones are currently
disabled at the mux level. However once the issue with chunks+splice
is fixed, it will be important to explicitly check for curr_len|CHNK
to set MAY_SPLICE, so that we don't call rcv_buf() after each chunk.
This fix must be backported to 2.1 and 2.0.
When an HTTP response is received, at the stream-interface level, if a L7 retry
must be triggered because of the status code, the response is trashed and a read
error is reported on the response channel. Then the stream handles this error
and performs the retry. Except if the maximum number of connection retries has
been reached. In this case, an error is reported. Because the server response
was already trashed by the stream-interface, a generic 502 error is returned to
the client instead of the server's response.
Now, the stream-interface triggers a L7 retry only if the maximum number of
connection retries has not already been reached. Thus, in the end, the last
server response is returned.
This patch must be backported to 2.1 and 2.0. It should fix issue #439.
In si_cs_recv(), we can end up with a partial splice() call that will be
followed by an attempt to use rcv_buf(). Sometimes this works and places
data into the buffer, which then prevents splicing from being used, and
this causes splice() and recvfrom() calls to alternate. Better simply
refrain from calling rcv_buf() when there are data in the pipe and still
data to be forwarded. Usually this indicates that we've eaten everything
available and that we still want to use splice() on subsequent calls.
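The guard is essentially this (sketch):

    /* spliced data are pending and more are expected: skip rcv_buf()
     * so the buffer stays empty and splicing remains usable
     */
    if (ic->pipe && ic->pipe->data && ic->to_forward)
        goto end;  /* label name illustrative */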
This should be backported to 2.1 and 2.0.
If we cannot splice incoming data using rcv_pipe() due to remaining data
in the buffer, we must not subscribe to the mux but instead tag the
stream-int as blocked on missing Rx room. Otherwise, when data are
flushed, calling si_chk_rcv() will have no effect because the WAIT_EP
flag remains present, and we'll end up in an rx timeout. This case is very
hard to reproduce, and requires an inversion of the polling side in the
middle of a transfer. This can only happen when the client and the server
are using similar links and when splicing is enabled. It typically takes
hundreds of MB to GB for the problem to happen, and it tends to be magnified
by the use of option contstats, which causes process_stream() to be called
every 5s and to try to recv again.
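A sketch of the fix (si_rx_room_blk() is the existing helper; the
surrounding condition is simplified):

    if (blocked_by_buffered_data) {
        si_rx_room_blk(si);  /* si_chk_rcv() will unblock us once the
                              * buffer is flushed; no mux subscription */
        return 0;
    }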
This fix must be backported to 2.1, 2.0, and possibly 1.9.
The previous patch on this function (36b536d6c "BUG/MEDIUM: stream-int: Don't
loose events on the CS when an EOS is reported") contains a bug. The return
value is based on the conn-stream's flags, but these may be reset if the CS is
closed. Ironically, that was exactly the purpose of this patch...
This patch must be backported to 2.0 and 1.9.
In si_cs_recv(), when a shutdown for reads is handled, the conn-stream may be
closed. It happens when the output channel is closed for writes or if
SI_FL_NOHALF is set on the stream-interface. In this case, the conn-stream's flags
are reset. Thus, if an error (CS_FL_ERROR) or an end of input (CS_FL_EOI) is
reported by the mux, the event is lost. si_cs_recv() does not report these
events by itself. It relies on si_cs_process() to report them to the
stream-interface and/or the channel.
For instance, if CS_FL_EOS and CS_FL_EOI are set by the H1 multiplexer during a
call to si_cs_recv() on the server side, if the conn-stream is closed (read0 +
SI_FL_NOHALF), the CS_FL_EOI flag is lost. Thus, this may lead the stream to
interpret it as a server abort.
Now, the conn-stream's flags are processed at the end of si_cs_recv(). The
function is responsible for setting the right flags on the stream-interface
and/or the channel. Thanks to this patch, the function is now almost linear.
Except for some early checks at the beginning, there is only one return
statement. It also fixes a potential bug caused by an inconsistency between the
splicing and the buffered receipt: in the first case, CS_FL_EOS is handled
before errors on the connection or the conn-stream; in the second one, it is
the opposite.
This patch must be backported to 2.0 and 1.9.
This bug is pretty pernicious and has serious consequences: in 2.1, an
infinite loop in process_stream() because the backend stream-interface remains
in the ready state (SI_ST_RDY); in 2.0, process_stream() being called in a loop
because the stream-interface remains blocked in the connect state
(SI_ST_CON). In both cases, it happens after a connection retry attempt. In 1.9,
it seems not to happen. But it may be just by chance, or just because it is
harder to get the right conditions to trigger the bug. However, reading the
code, the bug seems to exist there too.
Here is how the bug happens in 2.1. When we try to establish a new connection to
a server, the corresponding stream-interface is first set to the connect state
(SI_ST_CON). When the underlying connection is known to be connected (the flag
CO_FL_CONNECTED is set), the stream-interface is switched to the ready state
(SI_ST_RDY). It is a transient state, between the connect state (SI_ST_CON) and
the established state (SI_ST_EST). It must be handled on the next call to
process_stream(), which is responsible for operating the transition. During all
this time, errors can occur: a connection error or a client abort. The transient
state SI_ST_RDY was introduced to give process_stream() a chance to catch
these errors before considering the connection as fully established.
Unfortunately, if a read0 is caught in states SI_ST_CON or SI_ST_RDY, it is
possible to have a shutdown without a transition to SI_ST_DIS (in fact, here,
SI_ST_CON is switched to SI_ST_RDY). This happens if the request was fully
received and analyzed. In this case, the flag SI_FL_NOHALF is set on the backend
stream-interface. If an error is also reported during the connect, the behavior
is undefined, because an error is returned to the client and a connection retry
is performed. So on the next connection attempt to the server, if another error
is reported, a client abort is detected. But the shutdown for writes was already
done. So the transition to the state SI_ST_DIS is impossible. We stay in the
state SI_ST_RDY. Because it is a transient state, we loop in process_stream() to
perform the transition.
It is hard to understand how the bug happens by reading the code, and even
harder to explain. But there is a trivial way to hit the bug: sending h2
requests to a server only speaking h1. For instance, with the following config:
    listen tst
        bind *:80
        server www 127.0.0.1:8000 proto h2 # in reality, it is an HTTP/1.1 server
It is a configuration error, but it is an easy way to observe the bug. Note it
may happen with a valid configuration.
So, after a careful analysis, it appears that si_cs_recv() should never be
called for a stream-interface that is not fully established. This way the
connection retries will be performed before reporting an error to the client.
Thus, if a shutdown is performed because a read0 is handled, the
stream-interface is unconditionally set to the transient state SI_ST_DIS.
This patch must be backported to 2.0 and 1.9. However on these versions, this
patch reveals a design flaw about connections and a bad way to perform the
connection retries. We are working on it.
If an error occurred on the connection or the conn-stream, no synchronous send
is performed. If the error was not already processed and there is no more I/O,
it will never be processed and the stream will never be notified of this
error. This may block the stream until a timeout is reached, or indefinitely if
there is no timeout.
Concretely, this bug can be triggered from time to time with h2spec, running
the test "http2/5.1.1/2".
This patch depends on the commit 328ed220a "BUG/MINOR: stream-int: Process
connection/CS errors first in si_cs_send()". Both must be backported to 2.0 and
probably to 1.9. In 1.9, the code is totally different, so this patch would have
to be adapted.
Errors on the connection or the conn-stream must always be processed in
si_cs_send(), even if the stream-interface is already subscribed on
sending. This patch does not fix any concrete bug per se, but it is required by
the following one to handle those errors during synchronous sends.
This patch must be backported with the following one to 2.0 and probably to 1.9
too, but with caution because the code is really different.
Between 1.6 and 1.7, some parts of the stream forwarding process were
moved into lower layers and the stream-interface had to keep the
stream's task up to date regarding the timeouts. The analyser timeouts
were not updated there as it was believed this was not needed during
forwarding, but actually there is a case for this which is "option
contstats" which periodically triggers the analyser timeout, and this
change broke the option in case of sustained traffic (if there is some
I/O activity during the same millisecond as the timeout expires, then
the update will be missed).
This patch simply brings back the analyser expiration updates from
process_stream() to stream_int_notify().
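The restored update is essentially this (sketch, mirroring what
process_stream() does with the channels' analyse_exp):

    /* refresh the task's expiration with the analyser timeouts so an
     * expiring analyse_exp is not missed during sustained I/O
     */
    task->expire = tick_first(task->expire, ic->analyse_exp);
    task->expire = tick_first(task->expire, oc->analyse_exp);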
It may be backported as far as 1.7, taking care to adjust the fields
names if needed.
There are some situations, after sending a request or response, upon I/O
completion, or applet execution, where we end up with an empty buffer that
was not released. This results in excessive memory usage (back to 1.5) and
a lower CPU cache efficiency since buffers are not recycled as fast. This
has changed since the places where we send have changed with the new
layering, but not all cases susceptible of leaving an empty buffer were
properly spotted. Doing so reduces the memory pressure on buffers by about
2/3 in high traffic tests.
This should be backported to 2.0 and maybe 1.9.
These ones replace the previous conn_get_{from,to}_addr() used to wait
for the connection establishment before sending a LOCAL line. The
error handling was preserved.
In the function stream_int_notify(), when the opposite stream-interface is
blocked because there is no more room in the input buffer, it is unblocked
if the flag CF_WRITE_PARTIAL is set on the channel. It is a way to unblock
the reads on the other side because some data were sent.
But it is a problem during fast-forwarding, because only the stream is able
to remove the flag CF_WRITE_PARTIAL. So it is possible to have this flag set
because of a previous send while the input buffer of the opposite
stream-interface is now full. In such a case, the opposite stream-interface
will be woken up for nothing, because its input buffer is full. If the same
happens on the opposite side, we will have a loop consuming all the CPU.
To fix the bug, the opposite side is now only notified, in the function
si_cs_send(), if there is some available room in its input buffer, so only
if some data were sent.
This patch must be backported to 2.0 and 1.9.
In the function si_cs_send(), what is done when an error occurred on the
connection or the conn_stream, or when some data were successfully sent via a
pipe or the channel's buffer, may be factorized at the end of the function. It
slightly simplifies the function.
This patch must be backported to 2.0 and 1.9 because a bugfix depends on it.
Only add SI_FL_ERR if the stream_interface is connected, or is attempting
a connection. We may get there because the stream_interface's tasklet
was woken up, but before it actually runs, process_stream() may be called,
detect that there was an error, and change the state of the stream_interface
to SI_ST_TAR. When the stream_interface's tasklet then runs, the connection
may still have CO_FL_ERROR, but that error was already accounted for, so
just ignore it.
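The added guard amounts to this (sketch; state names per the
stream-interface enum):

    /* only account for a connection error while connecting or
     * connected; in SI_ST_TAR and friends it was already handled
     */
    if (si->state >= SI_ST_CON && si->state <= SI_ST_EST)
        si->flags |= SI_FL_ERR;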
This should be backported to 2.0.
It's really confusing to call it a task because it's a tasklet and used
in places where tasks and tasklets are used together. Let's rename it
to tasklet to remove this confusion.
In si_cs_recv(), if we got the CS_FL_EOI flag on the conn_stream, make sure
we return 1, so that si_cs_process() will be called, and wake
process_stream() up, otherwise if we're unlucky the flag will never be
noticed, and the stream won't be woken up.
The "goto redo" at the end of process_stream() to make the states converge
is still a big source of problems and mostly stems from the very late call
to the send() functions, whose results need to be considered, while it's
being done in si_update_both() when leaving.
This patch extracts the si_sync_send() calls from si_update_both(), and
places them at the relevant places in process_stream(), which are just
after the amount of data to forward is updated and before the shutw()
calls (which were also moved). The stream-interface resynchronization
needs to go slightly higher up to take into account the transition from CON
to RDY that will happen following some successful send(), and that's
all.
By doing so we can now get rid of this loop and have si_update_both()
called only to update the stream interface and channel when leaving the
function, as it was initially designed to work.
It is worth noting that a number of the remaining conditions to perform
a goto resync_XXX still seem suboptimal and would benefit from being
refined to perform less resynchronization. But what matters at this stage
is that the code remains valid and efficient.
Just like we have a synchronous recv() function for the stream interface,
let's have a synchronous send function that we'll be able to call from
different places. For now this only moves the code, nothing more.
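A sketch of the function (simplified; the real one carries a few extra
guards):

    void si_sync_send(struct stream_interface *si)
    {
        struct channel *oc = si_oc(si);
        struct conn_stream *cs;

        if (oc->flags & CF_SHUTW)
            return;
        if (channel_is_empty(oc))
            return;
        cs = objt_cs(si->end);
        if (!cs || !cs->conn->mux)  /* only mux-backed endpoints */
            return;
        si_cs_send(cs);
    }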
We should not update the two directions at once, in fact we should update
the Rx path after recv() and the Tx path after send(). Let's start by
splitting the update function in two for this.
Now whenever an I/O event succeeds during a connection attempt, we
switch the stream-int's state to SI_ST_RDY. This allows si_update()
to update R/W timeouts on the channel and end points to start to
consume outgoing data and to subscribe to lower layers in case of
failure. It also allows chk_rcv() to be performed on the other side
to enable data forwarding and make sure we don't fall into a situation
where no more events happen and nothing moves anymore.
The main reason for all the trouble we're facing with stream interface
error or timeout reports during the connection phase is that we currently
can't make the difference between a connection attempt and a validated
connection attempt. It is problematic because we tend to switch early
to SI_ST_EST but can't always do what we want in this state since it's
supposed to be set when we don't need to visit sess_establish() again.
This patch introduces a new state between SI_ST_CON and SI_ST_EST, which
is SI_ST_RDY. It indicates that we've verified that the connection is
ready. It's a transient state, like SI_ST_DIS, that cannot persist when
leaving process_stream(). For now it is not set, only verified in various
tests where SI_ST_CON was used or SI_ST_EST depending on the cases.
The stream-int state diagram was minimally updated to reflect the new
state, though it is largely obsolete and would need to be seriously
updated.
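The resulting ordering is (sketch; comments abridged):

    enum si_state {
        SI_ST_INI = 0,  /* not initialized */
        SI_ST_REQ,      /* [transient] connection initiation desired */
        SI_ST_QUE,      /* waiting in a server queue */
        SI_ST_TAR,      /* turn-around before a retry */
        SI_ST_ASS,      /* server just assigned */
        SI_ST_CON,      /* connection attempt in progress */
        SI_ST_RDY,      /* [transient] attempt validated, not yet EST */
        SI_ST_EST,      /* connection established */
        SI_ST_DIS,      /* [transient] disconnected, cleanup pending */
        SI_ST_CLO,      /* closed */
    };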
The stream interface state checks involving ranges were replaced with
checks on a set of states, already revealing some issues. No issue was
fixed, all was replaced in a one-to-one mapping for easier control. Some
checks involving a strict difference were also replaced with fields to
be clearer. At this stage, the result must be strictly equivalent. A few
tests were also turned to their bit-field equivalent for better readability
or in preparation for upcoming changes.
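For example, a range check becomes an explicit membership test (sketch;
SI_SB_* being one-bit-per-state masks):

    /* before: (si->state >= SI_ST_CON && si->state <= SI_ST_EST) */
    if (si_state_in(si->state, SI_SB_CON|SI_SB_RDY|SI_SB_EST))
        handle_connecting_or_established();  /* placeholder */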
The test performed in the SPOE filter was swapped so that the closed and
error states are evicted first and that the established vs conn state is
tested second.
Now that the various handshakes come with their own XPRT, there's no
need for the CO_FL_SOCK* flags, and the conn_sock_want|stop functions,
so garbage-collect them.
Connection handshakes were rarely stacked on top of each other, but the
recent experiments consisting in sending PROXY over SOCKS4 revealed a
number of issues in these lower layers. First, each handler waiting for
data MUST subscribe to recv events with __conn_sock_want_recv() and MUST
unsubscribe from send events using __conn_sock_stop_send() to avoid any
wake-up loop in case a previous sender has set this. Second, each handler
waiting for sending MUST subscribe to send events with __conn_sock_want_send()
and MUST unsubscribe from recv events using __conn_sock_stop_recv() to
avoid any wake-up loop in case some data are available on the connection.
Till now this was done at various random places, and in particular the
cases where the FD was not ready for recv forgot to re-enable reading.
Second, while senders can happily use conn_sock_send(), which automatically
handles EINTR, loops, and marks the FD as not ready with fd_cant_send(),
there is no equivalent for recv, so receivers facing EAGAIN MUST call
fd_cant_recv() to enable polling. It could be argued that implementing
an equivalent conn_sock_recv() function could be useful and more
long-term proof than the current situation.
Third, both types of handlers MUST unsubscribe from their respective
events once they managed to do their job, and none may ever play with
__conn_xprt_*(). Here again this was lacking, and one surprising call
to __conn_xprt_stop_recv() was present in the proxy protocol parser
for TCP6 messages!
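To summarize the discipline for a recv-waiting handler (sketch assembled
from the rules above):

    /* on entry: wait for data, never for send events */
    __conn_sock_want_recv(conn);
    __conn_sock_stop_send(conn);

    /* on EAGAIN: mark the FD not ready so polling takes over */
    fd_cant_recv(conn->handle.fd);

    /* once the job is done: release the recv subscription */
    __conn_sock_stop_recv(conn);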
Thanks to Alexander Liu for his help on this issue.
This patch must be backported to 1.9 and possibly some older versions,
though the SOCKS parts should be dropped.
In conn_si_send_proxy(), if we don't have a conn_stream yet, because the mux
won't be created until the SSL handshake is done, retrieve the opposite's
connection from the session. At this point, we know the session associated
with the connection is the one that initiated it, and we can thus just use
the session's origin.
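The retrieval amounts to this (sketch; field shapes per that era's API):

    /* no conn_stream yet: the session owning this connection is the
     * one that initiated it, so its origin is the other side
     */
    struct session *sess = conn->owner;
    struct connection *remote = objt_conn(sess->origin);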
This should be backported to 1.9.
Because channel_recv_max() always returns the right value, for both HTX and
legacy streams, we don't need to set this flag. The multiplexers don't use it
anymore.
Now, we only return the start-line. If not found, NULL is returned. No lookup is
performed and the HTX message is no longer updated. It is now the caller's
responsibility to update the position of the start-line to the right value. So
when it is not found, i.e. sl_pos is set to -1, it means the last start-line has
already been processed and the next one has not been inserted yet.
It is mandatory to rely on this kind of guarantee to store 1xx informational
responses and the final response in the same HTX message.
When a parsing error occurs in the H1 multiplexer, we stop copying HTX
blocks. So the error may be reported with an empty HTX message; for instance, if
the headers parsing failed. When it happens, the flag CS_FL_EOS is also set on
the conn_stream. But it is an error. Most of the time, it is set on established
connections, so it is not really an issue. But if it happens when the server
connection is not fully established, the connection is shut down immediately and
the stream-interface is switched from SI_ST_CON to SI_ST_DIS/CLO. So HTX
analyzers have no chance to catch the error.
Instead of setting CS_FL_EOS, it is fairly better to set CS_FL_EOI, which is the
right flag to use. The same is also done on H2 upgrade. As a side effect of this
fix, in the stream-interface code, we must now set the flag CF_READ_PARTIAL on
the channel when the flag CF_EOI is set. It guarantees the stream is woken up
when EOI is reported to the channel while no data are received.
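The side effect amounts to this (sketch):

    /* propagate end of input and guarantee a wake-up even when no
     * data were received along with it
     */
    if ((cs->flags & CS_FL_EOI) && !(ic->flags & CF_EOI))
        ic->flags |= (CF_EOI|CF_READ_PARTIAL);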
This patch must be backported to 1.9.
When we receive a read0, and we're still in SI_ST_CON state (so on an
outgoing connection), don't immediately switch to SI_ST_DIS, or we would
never call sess_establish(), and so the analysers would never run.
Instead, let sess_establish() handle that case, and switch to SI_ST_DIS if
we already have CF_SHUTR on the channel.
This should be backported to 1.9.
In si_cs_send(), don't check CF_EOI on the request channel to decide if the
request is complete and if we should save the buffer to eventually attempt
L7 retries. The flag may not be set yet, and it may also be set too early,
before we're done modifying the buffer. Instead, get the msg, and make sure
its state is HTTP_MSG_DONE.
That way we will store the request buffer when sending it even in H2.
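The condition now looks like this (sketch; the l7_buffer field name for the
retry copy is assumed):

    /* only save a copy of the request for L7 retries once the HTTP
     * message is complete, instead of trusting CF_EOI
     */
    if (s->txn && s->txn->req.msg_state == HTTP_MSG_DONE) {
        b_reset(&si->l7_buffer);
        b_putblk(&si->l7_buffer, b_head(&oc->buf), b_data(&oc->buf));
    }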
When running in HTX mode, if we sent the request but failed to get the
answer, either because the server just closed its socket, we hit a server
timeout, or we got a 404, 408, 425, 500, 501, 502, 503 or 504 error,
attempt to retry the request, exactly as if we had just failed to connect to
the server.
To do so, add a new backend keyword, "retry-on".
It accepts a list of keywords, which can be "none" (never retry),
"conn-failure" (we failed to connect, or to do the SSL handshake),
"empty-response" (the server closed the connection without answering),
"response-timeout" (we timed out while waiting for the server response),
or "404", "408", "425", "500", "501", "502", "503" and "504".
The default is "conn-failure".
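Example (sketch):

    backend app
        mode http
        retries 3
        retry-on conn-failure response-timeout 503
        server s1 192.0.2.1:8080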
When deciding if we want to wake the task of an applet up, don't give up
if task_in_rq returns 1, as there's a race condition and another thread
may run it. Instead, always attempt to task_wakeup(), at worst the task
is already in the run queue, and nothing will happen.
If the interface is not in state SI_ST_CON or SI_ST_EST, don't bother
trying to send/recv data, we can't do it anyway, and if we're in SI_ST_TAR,
that may lead to adding the SI_FL_ERR flag back on the stream_interface,
while we don't want it.
This should be backported to 1.9.
In commit d7704b534, we introduced an expiration flag on the stream interface,
which is used for the connect, the queue and the turn-around. Because the
turn-around state isn't an error, the flag was reset in process_stream();
later, in commit cff6411f9 introducing the SI_FL_ERR flag, the cleanup
of the flag at this place was erroneously generalized.
To fix this, the SI_FL_EXP flag is only cleared at the end of the turn-around
state, and nobody should clear the stream interface flags anymore.
This should be backported to 1.9, it has no known impact on older versions.
Don't unconditionally remove the SI_FL_ERR flag in si_update_both(), which
is called at the end of process_stream(). Doing so was a bug that had been
there since the flag was introduced, because we were always setting si->flags
to SI_FL_NONE; however, we don't want to lose that one, except if we will
retry connecting, so only remove it in sess_update_st_cer().
This should be backported to 1.9.
This flag is set by the stream layer to request an abort, and results in
CF_SHUTR being set once the abort is performed. However by analogy with
the send side, the flag was removed once the CF_SHUTR flag was set, thus
we lose the information about the cause of the shutr. This is what creates
the confusion that sometimes arises between client and server aborts.
This patch makes sure we don't remove this flag anymore in this case.
All call places only use it to perform the shutr and already check it
against CF_SHUTR. So no condition needs to be updated to take this into
account.
Some later, more careful changes may consist in refining the conditions
where we report a client reset or a server reset to ignore SHUTR when
SHUTR_NOW is set so that we don't report such misleading information
anymore.
The flag CF_EOI is now set on the input channel when the flag CS_FL_EOI is set
on the corresponding conn_stream. In addition, if a read activity is reported
when this flag is set, the stream is woken up.
This patch should be backported to 1.9.
For convenience, in the HTTP muxes (h1 and h2), the end of the stream and the
end of the message are reported the same way to the stream, by setting the flag
CS_FL_EOS. In the stream-interface, when CS_FL_EOS is detected, a shutdown for
read is reported on the channel side. This is historical. With the legacy HTTP
layer, because the parsing is done by the stream in HTTP analyzers, the EOS
really means a shutdown for read.
Most of the time, for the muxes h1 and h2, it works pretty well, especially
because the keep-alive is handled by the muxes. The stream is only used for one
transaction. So mixing EOS and EOM is good enough. But not every time. For now,
client aborts are only reported if they happen before the end of the request. It
is an error and it is properly handled. But because the EOS was already
reported, client aborts after the end of the request are silently
ignored. Eventually an error can be reported when the response is sent to the
client, if the sending fails. Otherwise, if the server does not reply fast
enough, an error is reported when the server timeout is reached. It is the
expected behaviour, except when the option abortonclose is set. In this case,
we must report an error when the client aborts. But as said before, this event
can be ignored. So, in short, for now the option abortonclose is broken.
In fact, it is a design problem and we have to rethink all the channel's flags,
and probably the conn-stream ones too. It is important to split EOS and EOM so
as not to lose information anymore. But it is not a small job and the
refactoring will be far from straightforward.
So for now, temporary flags are introduced. When the last read is received, the
flag CS_FL_READ_NULL is set on the conn-stream. This way, we can set the flag
SI_FL_READ_NULL on the stream interface. Both flags are persistent. And to be
sure to wake the stream, the event CF_READ_NULL is reported. So the stream will
always have a chance to handle the last read.
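The propagation chain is thus (sketch; flag names are the ones introduced or
used by this patch):

    cs->flags |= CS_FL_READ_NULL;  /* mux: last read seen (persistent) */
    si->flags |= SI_FL_READ_NULL;  /* stream-int: remember the event */
    ic->flags |= CF_READ_NULL;     /* channel: wake the stream up */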
This patch must be backported to 1.9 because it will be used by another patch to
fix the option abortonclose.