haproxy

mirror of http://git.haproxy.org/git/haproxy.git/ synced 2024-12-14 15:34:35 +00:00

Author	SHA1	Message	Date
Willy Tarreau	a1bd1faeeb	BUILD: use inttypes.h instead of stdint.h I found on an (old) AIX 5.1 machine that stdint.h didn't exist while inttypes.h which is expected to include it does exist and provides the desired functionalities. As explained here, stdint being just a subset of inttypes for use in freestanding environments, it's probably always OK to switch to inttypes instead: https://pubs.opengroup.org/onlinepubs/009696799/basedefs/stdint.h.html Also it's even clearer here in the autoconf doc : https://www.gnu.org/software/autoconf/manual/autoconf-2.61/html_node/Header-Portability.html "The C99 standard says that inttypes.h includes stdint.h, so there's no need to include stdint.h separately in a standard environment. Some implementations have inttypes.h but not stdint.h (e.g., Solaris 7), but we don't know of any implementation that has stdint.h but not inttypes.h"	2019-04-01 07:44:56 +02:00
Christopher Faulet	c6827d52c1	MINOR: channel/htx: Add function to skip output bytes from an HTX channel It is the HTX version of co_skip(). Internally, it uses the function htx_drain(). It will be used by other commits to fix bugs, so it must be backported to 1.9.	2019-02-26 14:04:23 +01:00
Christopher Faulet	729b5b308c	BUG/MINOR: channel: Set CF_WROTE_DATA when outgoing data are skipped In co_skip(), the flag CF_WRITE_PARTIAL is set on the channel. The flag CF_WROTE_DATA must also be set to notify the channel that some data were sent. This patch must be backported to 1.9.	2019-02-26 14:04:23 +01:00
Christopher Faulet	f7ed195ac8	MINOR: channel/htx: Add the HTX version of channel_truncate/erase The function channel_htx_truncate() can now be used on HTX buffer to truncate all incoming data, keeping outgoing one intact. This function relies on the function channel_htx_erase() and htx_truncate(). This patch may be backported to 1.9. If so, the patch "MINOR: channel/htx: Add the HTX version of channel_truncate()" must also be backported.	2019-01-08 12:06:55 +01:00
Christopher Faulet	5811db0043	MINOR: channel/htx: Add HTX version for some helper functions HTX versions for functions to test the free space in input against the reserve have been added. Now, on HTX streams, following functions can be used: * channel_htx_may_recv * channel_htx_recv_limit * channel_htx_recv_max * channel_htx_full This patch must be backported to 1.9 because it will be used by a further patch to fix a bug.	2019-01-07 16:32:05 +01:00
Christopher Faulet	e64582929f	MINOR: channel: Add the function channel_add_input This function must be called when new incoming data are pushed in the channel's buffer. It updates the channel state and takes care of the fast forwarding by consuming the right amount of data and decrementing "->to_forward" accordingly when necessary. In fact, this patch just moves a part of ci_putblk into a dedicated function. This patch must be backported to 1.9.	2019-01-02 20:12:44 +01:00
Willy Tarreau	b96b77ed6e	REORG: htx: merge types+proto into common/htx.h All the HTX definition is self-contained and doesn't really depend on anything external since it's mostly a protocol. In addition, some external similar files (like h2) also placed in common used to rely on it, making it a bit awkward. This patch moves the two htx.h files into a single self-contained one. The historical dependency on sample.h could be also removed since it used to be there only for http_meth_t which is now in http.h.	2018-12-11 17:15:04 +01:00
Christopher Faulet	b2aedea142	MEDIUM: channel/htx: Add functions to forward HTX data To ease the fast forwarding and the infinite forwarding on HTX proxies, 2 functions have been added to let the channel be almost aware of the way data are stored in its buffer. By calling these functions instead of legacy ones, we are sure to forward the right amount of data.	2018-12-05 17:29:30 +01:00
Willy Tarreau	ede3d884fc	MEDIUM: channel: merge back flags CF_WRITE_PARTIAL and CF_WRITE_EVENT The behaviour of the flag CF_WRITE_PARTIAL was modified by commit `95fad5ba4` ("BUG/MAJOR: stream-int: don't re-arm recv if send fails") due to a situation where it could trigger an immediate wake up of the other side, both acting in loops via the FD cache. This loss has caused the need to introduce CF_WRITE_EVENT as commit `c5a9d5bf`, to replace it, but both flags express more or less the same thing and this distinction creates a lot of confusion and complexity in the code. Since the FD cache now acts via tasklets, the issue worked around in the first patch no longer exists, so it's more than time to kill this hack and to restore CF_WRITE_PARTIAL's semantics (i.e.: there has been some write activity since we last left process_stream). This patch mostly reverts the two commits above. Only the part making use of CF_WROTE_DATA instead of CF_WRITE_PARTIAL to detect the loss of data upon connection setup was kept because it's more accurate and better suited.	2018-10-26 08:32:57 +02:00
Christopher Faulet	d44a9b3627	MEDIUM: mux: Remove const on the buffer in mux->snd_buf() This is a partial revert of the commit `deccd1116` ("MEDIUM: mux: make mux->snd_buf() take the byte count in argument"). It is a requirement to do zero-copy transfers. This will be mandatory when the TX buffer of the conn_stream will be used. So, now, data are consumed by mux->snd_buf() and not only sent. So it needs to update the buffer state. On its side, the caller must be aware the buffer can be replaced by an empty or unallocated one. As a side effect of this change, the function co_set_data() is now only responsible for updating the channel set, by updating the ->output field.	2018-08-07 14:36:52 +02:00
Willy Tarreau	83061a820e	MAJOR: chunks: replace struct chunk with struct buffer Now all the code used to manipulate chunks uses a struct buffer instead. The functions are still called "chunk*", and some of them will progressively move to the generic buffer handling code as they are cleaned up.	2018-07-19 16:23:43 +02:00
Willy Tarreau	843b7cbe9d	MEDIUM: chunks: make the chunk struct's fields match the buffer struct Chunks are only a subset of a buffer (a non-wrapping version with no head offset). Despite this we still carry a lot of duplicated code between buffers and chunks. Replacing chunks with buffers would significantly reduce the maintenance efforts. This first patch renames the chunk's fields to match the name and types used by struct buffers, with the goal of isolating the code changes from the declaration changes. Most of the changes were made with spatch using this coccinelle script : @rule_d1@ typedef chunk; struct chunk chunk; @@ - chunk.str + chunk.area @rule_d2@ typedef chunk; struct chunk chunk; @@ - chunk.len + chunk.data @rule_i1@ typedef chunk; struct chunk chunk; @@ - chunk->str + chunk->area @rule_i2@ typedef chunk; struct chunk chunk; @@ - chunk->len + chunk->data Some minor updates to 3 http functions had to be performed to take size_t ints instead of ints in order to match the unsigned length here.	2018-07-19 16:23:43 +02:00
Willy Tarreau	c9fa0480af	MAJOR: buffer: finalize buffer detachment Now the buffers only contain the header and a pointer to the storage area which can be anywhere. This will significantly simplify buffer swapping and will make it possible to map chunks on buffers as well. The buf_empty variable was removed, as now it's enough to have size==0 and area==NULL to designate the empty buffer (thus a non-allocated head is the empty buffer by default). buf_wanted for now is indicated by size==0 and area==(void *)1. The channels and the checks now embed the buffer's head, and the only pointer is to the storage area. This slightly increases the unallocated buffer size (3 extra ints for the empty buffer) but considerably simplifies dynamic buffer management. It will also later permit to detach unused checks. The way the struct buffer is arranged has proven quite efficient on a number of tests, which makes sense given that size is always accessed and often first, followed by the other ones.	2018-07-19 16:23:43 +02:00
Willy Tarreau	bd1dba8a89	MINOR: buffer: rename the data length member to '->data' It used to be called 'len' during the reorganisation but strictly speaking it's not a length since it wraps. Also we already use '_data' as the suffix to count available data, and data is also what we use to indicate the amount of data in a pipe so let's improve consistency here. It was important to do this in two operations because data used to be the name of the pointer to the storage area.	2018-07-19 16:23:43 +02:00
Willy Tarreau	4d893d440c	MINOR: buffers/channel: replace buffer_insert_line2() with ci_insert_line2() There was no point keeping that function in the buffer part since it's exclusively used by HTTP at the channel level, since it also automatically appends the CRLF. This further cleans up the buffer code.	2018-07-19 16:23:43 +02:00
Olivier Houchard	08afac0fd7	MEDIUM: buffers: move "output" from struct buffer to struct channel Since we never access this field directly anymore, but only through the channel's wrappers, it can now move to the channel. The buffers are now completely free from the distinction between input and output data.	2018-07-19 16:23:43 +02:00
Willy Tarreau	abed1e7f34	MINOR: buffer: remove the check for output on b_del() b_del() is used in : - mux_h2 with the demux buffer : always processes input data - checks with output data though output is not considered at all there - b_eat() which is not used anywhere - co_skip() where the len is always <= output Thus the distinction for output data is not needed anymore and the decrement can be made unconditionally in co_skip().	2018-07-19 16:23:43 +02:00
Willy Tarreau	d54a8ceb97	MAJOR: start to change buffer API This is intentionally the minimal and safest set of changes, some cleanups are still required. These changes are quite tricky and cannot be independently tested, so it's important to keep this patch as bisectable as possible. buf_empty and buf_wanted were changed and are now exactly similar since there's no <p> member in the structure anymore. Given that no test is ever made in the code to check that buf == &buf_wanted, it may be possible that we don't need to have two anymore, unless some buf_empty tests have precedence. This will have to be investigated. A significant part of this commit affects the HTTP compression code, which used to deeply manipulate the input and output buffers without any reasonable solution for a better abstraction. For this reason, if any regression is met and designates this patch as the culprit, it is important to run tests which specifically involve compression or which definitely don't use it in order to spot the issue. Cc: Olivier Houchard <ohouchard@haproxy.com>	2018-07-19 16:23:42 +02:00
Willy Tarreau	cd9e60db00	MEDIUM: channel: adapt to the new buffer API Also, ci_swpbuf() was removed (unused).	2018-07-19 16:23:42 +02:00
Olivier Houchard	d4251a7e98	MINOR: channel: Add co_set_data(). Add a new function that lets one set the channel's output amount.	2018-07-19 16:23:42 +02:00
Willy Tarreau	3ee8344b7b	MINOR: channel: remove almost all references to buf->i and buf->o We use ci_data() and co_data() instead now everywhere we read these values.	2018-07-19 16:23:42 +02:00
Willy Tarreau	50227f9b88	MINOR: buffer: use c_head() instead of buffer_wrap_sub(c->buf, p-o) This way we don't need o anymore.	2018-07-19 16:23:42 +02:00
Willy Tarreau	3f6799975f	MINOR: buffer: replace bi_space_for_replace() with ci_space_for_replace() This one computes the size that can be overwritten over the input part of the buffer, so it's channel-specific.	2018-07-19 16:23:41 +02:00
Willy Tarreau	2375233ef0	MINOR: buffer: replace buffer_full() with channel_full() It's only used by channels since we need to know the amount of output data.	2018-07-19 16:23:41 +02:00
Willy Tarreau	0c7ed5d264	MINOR: buffer: replace buffer_empty() with b_empty() or c_empty() For the same consistency reasons, let's use b_empty() at the few places where an empty buffer is expected, or c_empty() if it's done on a channel. Some of these places were there to realign the buffer so {b,c}_realign_if_empty() was used instead.	2018-07-19 16:23:41 +02:00
Willy Tarreau	55f3ce1c91	MINOR: buffer: make b_getblk_nc() take size_t for the block sizes Till now we used to reimplement it using ints to limit external changes but we must adjust it and the various users to switch to size_t.	2018-07-19 16:23:41 +02:00
Willy Tarreau	206ba834ef	MINOR: buffer: make b_getblk_nc() take const pointers Now that there are no more users requiring to modify the buffer anymore, switch these ones to const char and const buffer. This will make it more obvious next time send functions are tempted to modify the buffer's output count. Minor adaptations were necessary at a few call places which were using char due to the function's previous prototype.	2018-07-19 16:23:41 +02:00
Willy Tarreau	e5f12ce7f2	MINOR: buffer: replace bi_del() and bo_del() with b_del() Till now the callers had to know which one to call for specific use cases. Let's fuse them now since a single one will remain after the API migration. Given that bi_del() may only be used where o==0, just combine the two tests by first removing output data then only input.	2018-07-19 16:23:40 +02:00
Willy Tarreau	7194d3cc3b	MINOR: buffer: split bi_contig_data() into ci_contig_data() and b_contig_data() This function was sometimes used from a channel and sometimes from a buffer. In both cases it requires knowledge of the size of the output data (to skip them). Here the split ensures the channel can deal with this point, and that other places not having output data can continue to work.	2018-07-19 16:23:40 +02:00
Willy Tarreau	bcbd39370f	MINOR: channel/buffer: replace b_{adv,rew} with c_{adv,rew} These ones manipulate the output data count which will be specific to the channel soon, so prepare the call points to use the channel only. The b_* functions are now unused and were removed.	2018-07-19 16:23:40 +02:00
Willy Tarreau	fd8d42f496	MEDIUM: channel: make channel_slow_realign() take a swap buffer The few call places where it's used can use the trash as a swap buffer, which is made for this exact purpose. This way we can rely on the generic b_slow_realign() call.	2018-07-19 16:23:40 +02:00
Willy Tarreau	4cf1300e6a	MINOR: channel/buffer: replace buffer_slow_realign() with channel_slow_realign() and b_slow_realign() Where relevant, the channel version is used instead. The buffer version was ported to be more generic and now takes a swap buffer and the output byte count to know where to set the alignment point. The H2 mux still uses buffer_slow_realign() with buf->o but it will change later.	2018-07-19 16:23:40 +02:00
Willy Tarreau	08d5ac8f27	MINOR: channel: add a few basic functions for the new buffer API This adds : - c_orig() : channel buffer's origin - c_size() : channel buffer's size - c_wrap() : channel buffer's wrapping location - c_data() : channel buffer's total data count - c_room() : room left in channel buffer's - c_empty() : true if channel buffer is empty - c_full() : true if channel buffer is full - c_ptr() : pointer to an offset relative to input data in the buffer - c_adv() : advances the channel's buffer (bytes become part of output) - c_rew() : rewinds the channel's buffer (output bytes not output anymore) - c_realign_if_empty() : realigns the buffer if it's empty - co_data() : # of output data - co_head() : beginning of output data - co_tail() : end of output data - ci_data() : # of input data - ci_head() : beginning of input data - ci_tail() : end of input data - ci_stop() : location after ci_tail() - ci_next() : pointer to next input byte And for the ci_* / co_* functions above, the "__*" variants which disable wrapping checks, and the "_ofs" variants which return an offset relative to the buffer's origin instead.	2018-07-19 16:23:39 +02:00
Olivier Houchard	673867c357	MAJOR: applets: Use tasks, instead of rolling our own scheduler. There's no real reason to have a specific scheduler for applets anymore, so nuke it and just use tasks. This comes with some benefits, the first one being that applets cannot induce high latencies anymore since they share nice values with other tasks. Later it will be possible to configure the applets' nice value. The second benefit is that the applet scheduler was not very thread-friendly, having a big lock around it in prevision of this change. Thus applet-intensive workloads should now scale much better with threads. Some more improvement is possible now : some applets also use a task to handle timers and timeouts. These ones could now be simplified to use only one task.	2018-05-26 20:03:30 +02:00
Christopher Faulet	c5a9d5bf23	BUG/MEDIUM: stream-int: Don't lose write's notifs when a stream is woken up When a write activity is reported on a channel, it is important to keep this information for the stream because it takes part in the analyzers' triggering. When some data are written, the flag CF_WRITE_PARTIAL is set. It participates in the task's timeout updates and in the stream's waking. It is also used in the CF_MASK_ANALYSER mask to trigger channel analyzers. In the past, it was cleared by process_stream. Because of a bug (fixed in commit `95fad5ba4` ["BUG/MAJOR: stream-int: don't re-arm recv if send fails"]), it is now cleared before each send and in stream_int_notify. So it is possible to lose this information when process_stream is called, preventing analyzers from being called, and possibly leading to a stalled stream. Today, this happens in HTTP2 when you call the stat page or when you use the cache filter. In fact, this happens when the response is sent by an applet. In HTTP1, everything seems to work as expected. To fix the problem, we need to distinguish between the write activity reported to lower layers and the one reported to the stream. So the flag CF_WRITE_EVENT has been added to notify the stream of the write activity on a channel. It is set when a send succeeded and reset by process_stream. It is also used in CF_MASK_ANALYSER. Finally, it is checked in stream_int_notify to wake up a stream and in channel_check_timeouts. This bug is probably present in 1.7 but it seems to have no effect. So for now, there is no need to backport it.	2017-11-09 15:16:05 +01:00
Christopher Faulet	2a944ee16b	BUILD: threads: Rename SPIN/RWLOCK macros using HA_ prefix This removes any name conflicts, especially on Solaris.	2017-11-07 11:10:24 +01:00
Emeric Brun	a1dd243adb	MAJOR: threads/buffer: Make buffer wait queue thread safe Adds a global lock to protect the buffer wait queue.	2017-10-31 13:58:31 +01:00
Willy Tarreau	41ab86898e	MINOR: channel: make the channel be a const in all {ci,co}_get* functions There's no point having the channel marked writable as these functions only extract data from the channel. The code was retrieved from their ci/co ancestors.	2017-10-19 15:01:08 +02:00
Willy Tarreau	06d80a9a9c	REORG: channel: finally rename the last bi_* / bo_* functions For HTTP/2 we'll need some buffer-only equivalent functions to some of the ones applying to channels and still squatting the bi_* / bo_* namespace. Since these names have kept being misleading for quite some time now and are really getting annoying, it's time to rename them. This commit will use "ci/co" as the prefix (for "channel in", "channel out") instead of "bi/bo". The following ones were renamed : bi_getblk_nc, bi_getline_nc, bi_putblk, bi_putchr, bo_getblk, bo_getblk_nc, bo_getline, bo_getline_nc, bo_inject, bi_putchk, bi_putstr, bo_getchr, bo_skip, bi_swpbuf	2017-10-19 15:01:08 +02:00
Christopher Faulet	533182f1c8	CLEANUP: http: Remove channel_congested function Not used anymore since last commit.	2017-03-31 14:38:08 +02:00
Christopher Faulet	a73e59b690	BUG/MAJOR: Fix how the list of entities waiting for a buffer is handled When an entity tries to get a buffer, if it cannot be allocated, for example because the number of buffers which may be allocated per process is limited, this entity is added to a list (called <buffer_wq>) and waits for an available buffer. Historically, the <buffer_wq> list was logically attached to streams because they were the only entities likely to be added to it. Now, applets can also be waiting for a free buffer. And with filters, we could imagine having other entities waiting for a buffer. So it makes sense to have a generic list. Anyway, with the current design there is a bug. When an applet fails to get a buffer, it waits. But we add the stream attached to the applet to <buffer_wq>, instead of the applet itself. So when a buffer is available, we wake up the stream and not the waiting applet. So, it is possible to have waiting applets that are never awakened. So, now, <buffer_wq> is independent of streams. And we really add the waiting entity to <buffer_wq>. To be generic, the entity is responsible for defining the callback used to awaken it. In addition, applets will still request an input buffer when they become active. But they will not be put to sleep anymore if no buffer is available. So it is the responsibility of the applet I/O handler to check if this buffer is allocated or not. This way, an applet can decide if this buffer is required or not and can do additional processing if not. [wt: backport to 1.7 and 1.6]	2016-12-12 19:11:04 +01:00
Willy Tarreau	8bf242b764	BUG/MEDIUM: channel: fix inconsistent handling of 4GB-1 transfers In 1.4-dev3, commit `31971e5` ("[MEDIUM] add support for infinite forwarding") made it possible to configure the lower layer to forward data indefinitely by setting the forward size to CHN_INFINITE_FORWARD (4GB-1). By then larger chunk sizes were not supported so there was no confusion in the usage of the function. Since 1.5 we support 64-bit content-lengths and chunk sizes and the function has grown to support 64-bit arguments, though it still limits a single pass to 32-bit quantities (what fit in the channel's to_forward field). The issue now becomes that a 4GB-1 content-length can be confused with infinite forwarding (in fact it's 4GB-1+what was already in the buffer). It causes a visible effect when transferring this exact size because the transfer rate is lower than with other sizes due in part to the disabling of the Nagle algorithm on the sendto() call. In theory with keep-alive it should prevent a second request from being processed after such a transfer, but since the analysers are still present, the forwarding analyser properly counts down the remaining size to transfer and ultimately the transaction gets correctly reset so there is no visible effect. Since the root cause of the issue is an API problem (lack of distinction between a real valid length and a magic value), this patch modifies the API to have a new dedicated function called channel_forward_forever() to program a permanent forwarding. The existing function __channel_forward() was modified to properly take care of the requested sizes and ensure it 1) never overflows and 2) never reaches CHN_INFINITE_FORWARD by accident. It is worth noting that the function used to have a bug causing a 2GB forward to be scheduled if it was called with less data than what is present in buf->i. Fortunately this bug couldn't be triggered with existing code. This fix should be backported to 1.6 and 1.5. 
While it also theoretically affects 1.4, it's better not to backport it there, as the risk of breaking large object transfers due to significant API differences is high, compared to the fact that the largest supported objects (4GB-1) are just slower to transfer.	2016-05-04 15:26:37 +02:00
Willy Tarreau	ef907fee12	BUG/MAJOR: channel: fix miscalculation of available buffer space (4th try) Unfortunately, commit `169c470` ("BUG/MEDIUM: channel: fix miscalculation of available buffer space (3rd try)") was still not enough to completely address the issue. It fell into an integer comparison trap. Contrary to expectations, chn->to_forward may also have the sign bit set when forwarding regular data having a large content-length, resulting in an incomplete check of the result and of the reserve because, with to_forward very large, to_forward+o could become very small and also the reserve could become positive again and make channel_recv_limit() return a negative value. One way to reproduce this situation is to transfer a large file (> 2GB) with http-keep-alive or http-server-close, without splicing, and ensure that the server uses content-length instead of chunks. The transfer should stall very early after the first buffer has been transferred to the client. This fix now properly checks 1) for an overflow caused by summing o and to_forward, and 2) for o+to_forward being smaller or larger than maxrw before performing the subtract, so that all sensitive operations are properly performed on 33-bit arithmetics. The code was subjected again to a series of tests using inject+httpterm scanning a wide range of object sizes (+10MB after each new request) : $ printf "new page 1\nget 127.0.0.1:8002 / s=%%s0m\n" | \ inject64 -o 1 -u 1 -f /dev/stdin With previous fix, the transfer would suddenly stop when reaching 2GB : hits ^hits hits/s ^h/s bytes kB/s last errs tout htime sdht ptime 203 1 2 1 216816173354 2710202 3144892 0 0 685.0 0.0 685.0 205 2 2 2 219257283186 2706880 2441109 0 0 679.5 6.5 679.5 205 0 2 0 219257283186 2673836 0 0 0 0.0 0.0 0.0 205 0 2 0 219257283186 2641622 0 0 0 0.0 0.0 0.0 205 0 2 0 219257283186 2610174 0 0 0 0.0 0.0 0.0 Now it's fine even past 4 GB.
Many thanks to Vedran Furac for reporting this issue early with a common access pattern helping to troubleshoot this. This fix must be backported to 1.6 and 1.5 where the commit above was already backported.	2016-05-03 17:58:03 +02:00
Willy Tarreau	55e58f2334	MINOR: channel: add new function channel_congested() This function returns non-zero if the channel is congested with data in transit waiting for leaving, indicating to the caller that it should wait for the reserve to be released before starting to process new data in case it needs the ability to append data. This is meant to be used while waiting for a clean response buffer before processing a request.	2016-05-02 16:39:22 +02:00
Willy Tarreau	169c47028a	BUG/MEDIUM: channel: fix miscalculation of available buffer space (3rd try) Latest fix `8a32106` ("BUG/MEDIUM: channel: fix miscalculation of available buffer space (2nd try)") did happen to fix some observable issues but not all of them in fact, some corner cases still remained and at least one user reported a busy loop that appeared possible, though not easily reproducible under experimental conditions. The remaining issue is that we still consider min(i, to_fwd) as the number of bytes in transit, but in fact <i> is not relevant here. Indeed, what matters is that we can read everything we want at once provided that at the end, <i> cannot be larger than <size-maxrw> (if it was not already). This is visible in two cases : - let's have i=o=max/2 and to_fwd=0. Then i+o >= max indicates that the buffer is already full, while it is not since once <o> is forwarded, some space remains. - when to_fwd is much larger than i, it's obvious that we can fill the buffer. The only relevant part in fact is o + to_fwd. to_fwd will ensure that at least this many bytes will be moved from <i> to <o> hence will leave the buffer, whatever the number of rounds it takes. Interestingly, the fix applied here ensures that channel_recv_max() will now equal (size - maxrw - i + to_fwd), which is indeed what remains available below maxrw after to_fwd bytes are forwarded from i to o and leave the buffer. Additionally, the latest fix made it possible to meet an integer overflow that was not caught by the range test when forwarding in TCP or tunnel mode due to to_forward being added to an existing value, causing the buffer size to be limited when it should not have been, resulting in 2 to 3 recv() calls when a single one was enough. The first one was limited to the unreserved buffer size, the second one to the size of the reserve minus 1, and the last one to the last byte. 
Eg with a 2kB buffer : recvfrom(22, "HTTP/1.1 200\r\nConnection: close\r"..., 1024, 0, NULL, NULL) = 1024 recvfrom(22, "23456789.123456789.123456789.123"..., 1023, 0, NULL, NULL) = 1023 recvfrom(22, "5", 1, 0, NULL, NULL) = 1 This bug is still present in 1.6 and 1.5 so the fix should be backported there.	2016-04-21 18:06:08 +02:00
Willy Tarreau	93dc478a04	BUG/MEDIUM: channel: incorrect polling condition may delay event delivery The condition to poll for receive as implemented in channel_may_recv() is still incorrect. If buf->o is null and buf->i is slightly larger than chn->to_forward and at least as large as buf->size - maxrewrite, then reading will be disabled. It may slightly delay some data delivery by having first to forward pending bytes, but may also cause some random issues with analysers that wait for some data before starting to forward what they correctly parsed. For instance, a body analyser may be prevented from seeing the data that only fits in the reserve. This bug may also prevent an applet's chk_rcv() function from being called when part of a buffer is released. It is possible (though not verified) that this participated to some peers frozen session issues some people have been facing. This fix should be backported to 1.6 and 1.5 to ensure better coherency with channel_recv_limit().	2016-04-21 17:03:46 +02:00
Willy Tarreau	4b46a3e8cc	BUG/MEDIUM: channel: don't allow to overwrite the reserve until connected Commit `9c06ee4` ("BUG/MEDIUM: channel: don't schedule data in transit for leaving until connected") took care of an issue involving POST in conjunction with http-send-name-header, where we absolutely never want to touch the reserve until we're sure not to touch the buffer contents anymore, which is indicated by the output stream-interface being connected. But channel_may_recv() was not equipped with such a test, so in some situations it might decide that it is possible to poll for reads, and later channel_recv_limit() will decide it's not possible to read, causing a loop. So we must add a similar test there. Since the fix above was backported to 1.6 and 1.5, this fix must as well.	2016-04-21 15:31:22 +02:00
Willy Tarreau	8a32106fff	BUG/MEDIUM: channel: fix miscalculation of available buffer space (2nd try) Commit `999f643` ("BUG/MEDIUM: channel: fix miscalculation of available buffer space.") introduced a bug which made output data to be ignored when computing the remaining room in a buffer. The problem is that channel_may_recv() properly considers them and may declare that the FD may be polled for read events, but once the event strikes, channel_recv_limit() called before recv() says the opposite. In 1.6 and later this case is automatically caught by polling loop detection at the connection level and is harmless. But the backport in 1.5 ends up with a busy polling loop as soon as it becomes possible to have a buffer with this conflict. In order to reproduce it, it is necessary to have less than [maxrewrite] bytes available in a buffer, no forwarding enabled (end of transfer) and [buf->o >= maxrewrite - free space]. Since this heavily depends on socket buffers, it will randomly strike users. On 1.5 with 8kB buffers it was possible to reproduce it with httpterm using the following command line : $ (printf "GET /?s=675000 HTTP/1.0\r\n\r\n"; sleep 60) | \ nc6 --rcvbuf-size 1 --send-only 127.0.0.1 8002 This bug is only medium in 1.6 and later but is major in the 1.5 backport, so it must be backported there. Thanks to Nenad Merdanovic and Janusz Dziemidowicz for reporting this issue with enough elements to help understand it.	2016-04-11 17:13:35 +02:00
Willy Tarreau	999f643ed2	BUG/MEDIUM: channel: fix miscalculation of available buffer space. The function channel_recv_limit() relies on channel_reserved() which itself relies on channel_in_transit(). Individually they're OK but combined they're doing the wrong thing. The problem is that we refrain from filling buffers while to_forward is even much larger than the buffer because of a semantic issue along the call chain. This is particularly visible when offloading SSL on moderately large files (1 MB), though it is also visible on clear text. Twice the number of recv() calls are made compared to what is needed, and the typical performance drops by 15-20% in SSL in 1.6 and later, and no directly measurable drop in 1.5 except when using strace. There's no need for all these intermediate functions, so let's get rid of them and reimplement channel_recv_limit() from scratch in a safer way. This fix needs to be backported to 1.6 and 1.5 (at least). Note that in 1.5 the function is called buffer_recv_limit() and it may differ a bit.	2016-01-25 02:31:18 +01:00
Thierry FOURNIER	27929fbfd7	MINOR: channel: rename function chn_sess to chn_strm The name of the function chn_sess is no longer appropriate. This patch renames it to chn_strm.	2015-09-25 23:27:33 +02:00
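
Commit 8bf242b764 in the log above describes an API problem: a genuine 4GB-1 length could be mistaken for the CHN_INFINITE_FORWARD magic value, so a dedicated channel_forward_forever() was introduced and __channel_forward() was made to never overflow and never reach the magic value by accident. The following is only a minimal sketch of that idea, not HAProxy's code: the `sketch_*` names, the struct layout and the return convention are assumptions made for illustration.

```c
#include <inttypes.h>

/* Hypothetical stand-in for CHN_INFINITE_FORWARD (4GB-1) from the log. */
#define SKETCH_INFINITE_FORWARD UINT32_MAX

/* Hypothetical, reduced channel: only the 32-bit forward counter. */
struct sketch_channel {
	uint32_t to_forward; /* bytes still scheduled for forwarding */
};

/* Program permanent forwarding. In this sketch it is the only function
 * allowed to store the magic value.
 */
static void sketch_forward_forever(struct sketch_channel *c)
{
	c->to_forward = SKETCH_INFINITE_FORWARD;
}

/* Schedule up to <bytes> more bytes and return how many were accepted.
 * The counter is clamped to (magic value - 1), so a real length can
 * neither wrap around nor be confused with infinite forwarding. The
 * caller is expected to come back later with the remainder.
 */
static uint64_t sketch_forward(struct sketch_channel *c, uint64_t bytes)
{
	uint64_t room;

	if (c->to_forward == SKETCH_INFINITE_FORWARD)
		return bytes; /* already forwarding forever */

	room = (uint64_t)(SKETCH_INFINITE_FORWARD - 1) - c->to_forward;
	if (bytes > room)
		bytes = room;
	c->to_forward += (uint32_t)bytes;
	return bytes;
}
```

With this split, a caller that wants a tunnel calls `sketch_forward_forever()`, while a caller handling a 64-bit content-length calls `sketch_forward()` repeatedly with whatever was not accepted on the previous pass.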

91 Commits
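
The "miscalculation of available buffer space" fixes listed above (169c47028a and ef907fee12) state that only o + to_forward counts as data in transit, that the sum must not overflow, and that the receivable amount should come out as (size - maxrw - i + to_fwd) while part of the reserve is still protected. The sketch below is one possible reading of those messages under stated assumptions, not the actual channel_recv_limit(): the `sketch_*` names and the struct are hypothetical, and maxrw is assumed to be no larger than the buffer size.

```c
#include <inttypes.h>

/* Hypothetical, reduced view of the fields named in the commit messages. */
struct sketch_chn {
	uint32_t size;       /* total buffer size */
	uint32_t i;          /* input bytes present in the buffer */
	uint32_t o;          /* output bytes present in the buffer */
	uint32_t to_forward; /* bytes scheduled to be forwarded */
};

/* Highest fill level a read may reach: the buffer size minus whatever
 * part of the reserve (maxrw) is not yet covered by data in transit.
 * The sum o + to_forward is computed on 64 bits so that a very large
 * to_forward cannot wrap around and look small.
 */
static uint32_t sketch_recv_limit(const struct sketch_chn *c, uint32_t maxrw)
{
	uint64_t transit = (uint64_t)c->o + c->to_forward;
	uint32_t reserve = 0;

	if (transit < maxrw)
		reserve = maxrw - (uint32_t)transit;
	return c->size - reserve;
}

/* Bytes that may still be received without exceeding the limit. */
static uint32_t sketch_recv_max(const struct sketch_chn *c, uint32_t maxrw)
{
	uint64_t used = (uint64_t)c->i + c->o;
	uint32_t limit = sketch_recv_limit(c, maxrw);

	return used >= limit ? 0 : (uint32_t)(limit - used);
}
```

When o + to_forward is below maxrw, `sketch_recv_max()` reduces to size - maxrw - i + to_forward, which matches the expression quoted in the third-try commit. With a 32-bit sum, a to_forward close to 4GB would wrap and wrongly re-enable the reserve, which is the kind of trap the fourth-try commit describes.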