haproxy

mirror of http://git.haproxy.org/git/haproxy.git/ synced 2025-04-25 20:38:03 +00:00

Author	SHA1	Message	Date
Willy Tarreau	e6300be8f8	BUG/MEDIUM: stream-interface: don't wake the task up before end of transfer Recent commit `d7ad9f5` ("MAJOR: channel: add a new flag CF_WAKE_WRITE to notify the task of writes") was not correct. It used to wake up the task as soon as there was some write activity and the flag was set, even if there were still some data to be forwarded. This resulted in process_session() being called a lot when transferring chunk-encoded HTTP responses made of very large chunks. The purpose of the flag is to wake up only a task waiting for some room and not the other ones, so it's totally counter-productive to wake it up as long as there are data to forward because the task will not be allowed to write anyway. Also, the commit above was taking some risks by not considering certain events anymore (eg: state != SI_ST_EST). While such events are not used at the moment, if some new features were developed in the future relying on these, it would be better that they could be notified when subscribing to the WAKE_WRITE event, so let's restore the condition.	2014-01-25 22:28:22 +01:00
Willy Tarreau	46be2e5039	MEDIUM: connection: update callers of ctrl->drain() to use conn_drain() Now we can more safely rely on the connection state to decide how to drain and what to do when data are drained. Callers don't need to manipulate the file descriptor's state anymore. Note that it also removes the need for the fix `ea90063` ("BUG/MEDIUM: stream-int: fix the keep-alive idle connection handler") since conn_drain() correctly sets the polling flags.	2014-01-20 22:27:17 +01:00
Willy Tarreau	7f4bcc312d	MINOR: protocol: improve the proto->drain() API It was not possible to know if the drain() function had hit an EAGAIN, so now we change the API of this function to return : < 0 if EAGAIN was met = 0 if some data remain > 0 if a shutdown was received	2014-01-20 22:27:16 +01:00
Willy Tarreau	d7ad9f5b0d	MAJOR: channel: add a new flag CF_WAKE_WRITE to notify the task of writes Since commit `6b66f3e` ([MAJOR] implement autonomous inter-socket forwarding) introduced in 1.3.16-rc1, we've been relying on a stupid mechanism to wake up the task after a write, which was an exact copy-paste of the reader side. The principle was that if we empty a buffer and there's no forwarding scheduled or if the producer is not in a connected state, then we wake the task up. That does not make any sense. It happens to wake up too late sometimes (eg, when the request analyser waits for some room in the buffer to start to work), and leads to unneeded wakeups in client-side keep-alive, because the task is woken up when the response is sent, while the analysers are simply waiting for a new request. In order to fix this, we introduce a new channel flag : CF_WAKE_WRITE. It is designed so that an analyser can explicitly request being notified when some data were written. It is used only when the HTTP request or response analysers need to wait for more room in the buffers. It is automatically cleared upon wake up. The flag is also automatically set by the functions which try to write into a buffer from an applet when they fail (bi_putblk() etc...). That allows us to remove the stupid condition above and avoid some wakeups. In http-server-close and in http-keep-alive modes, this reduces from 4 to 3 the average number of wakeups per request, and increases the overall performance by about 1.5%.	2013-12-31 18:37:36 +01:00
Willy Tarreau	61f7f0a959	BUG/MINOR: stream-int: do not clear the owner upon unregister Since the applet rework and the removal of the inter-task applets, we must not clear the stream-interface's owner task anymore otherwise we risk a crash when maintaining keep-alive with an applet. This is not possible right now so there is no impact yet, but this bug is not easy to track down. No backport is needed.	2013-12-28 21:33:37 +01:00
Willy Tarreau	ea90063cbc	BUG/MEDIUM: stream-int: fix the keep-alive idle connection handler Commit `2737562` (MEDIUM: stream-int: implement a very simplistic idle connection manager) implemented an idle connection handler. In the case where all data is drained from the server, it fails to disable polling, resulting in a busy spinning loop. Thanks to Sander Klein and Guillaume Castagnino for reporting this bug. No backport is needed.	2013-12-17 14:21:48 +01:00
Willy Tarreau	2737562e43	MEDIUM: stream-int: implement a very simplistic idle connection manager Idle connections are not monitored right now. So if a server closes after a response without advertising it, it won't be detected until a next request wants to use the connection. This is a bit problematic because it unnecessarily maintains file descriptors and sockets in an idle state. This patch implements a very simple idle connection manager for the stream interface. It presents itself as an I/O callback. The HTTP engine enables it when it recycles a connection. If a close or an error is detected on the underlying socket, it tries to drain as much data as possible from the socket, detects the close and responds with a close as well, then detaches from the stream interface.	2013-12-17 00:00:28 +01:00
Willy Tarreau	ad38acedaa	MEDIUM: connection: centralize handling of nolinger in fd management Right now we see many places doing their own setsockopt(SO_LINGER). Better only do it just before the close() in fd_delete(). For this we add a new flag on the file descriptor, indicating if it's safe or not to linger. If not (eg: after a connect()), then the setsockopt() call is automatically performed before a close(). The flag automatically turns to safe when receiving a read0.	2013-12-16 02:23:52 +01:00
Willy Tarreau	d02cdd23be	MINOR: connection: add simple functions to report connection readiness conn_xprt_ready() reports if the transport layer is ready. conn_ctrl_ready() reports if the control layer is ready. The stream interface uses si_conn_ready() to report that the underlying connection is ready. This will be used for connection reuse in keep-alive mode.	2013-12-16 02:23:52 +01:00
Willy Tarreau	0a23bcb8be	MAJOR: stream-interface: dynamically allocate the applet context From now on, a call to stream_int_register_handler() causes a call to si_alloc_appctx() and returns an initialized appctx for the current stream interface. If one was previously allocated, it is released. If the stream interface was attached to a connection, it is released as well. The appctx are allocated from the same pools as the connections, because they're substantially smaller in size, and we can't have both a connection and an appctx on an interface at any moment. In case of memory shortage, the call may return NULL, which is already handled by all consumers of stream_int_register_handler(). The field appctx was removed from the stream interface since we only rely on the endpoint now. On 32-bit, the stream_interface size went down from 108 to 44 bytes. On 64-bit, it went down from 144 to 64 bytes. This represents a memory saving of 160 bytes per session. It seems that a later improvement could be to move the call to stream_int_register_handler() to session.c for most cases.	2013-12-09 15:40:23 +01:00
Willy Tarreau	1fbe1c9ec8	MEDIUM: stream-int: return the allocated appctx in stream_int_register_handler() The task returned by stream_int_register_handler() is never used, however we always need to access the appctx afterwards. So make it return the appctx instead. We already plan for it to fail, which is the reason for the addition of a few tests and the possibility for the HTTP analyser to return a status code 500.	2013-12-09 15:40:23 +01:00
Willy Tarreau	57cd3e46b9	MEDIUM: connection: merge the send_proxy and local_send_proxy calls We used to have two very similar functions for sending a PROXY protocol line header. The reason is that the default one relies on the stream interface to retrieve the other end's address, while the "local" one performs a local address lookup and sends that instead (used by health checks). Now that the send_proxy_ofs is stored in the connection and not the stream interface, we can make the local_send_proxy rely on it and support partial sends. This also simplifies the code by removing the local_send_proxy function, making health checks use send_proxy_ofs, resulting in the removal of the CO_FL_LOCAL_SPROXY flag, and the associated test in the connection handler. The other flag, CO_FL_SI_SEND_PROXY was renamed without the "SI" part so that it is clear that it is not dedicated anymore to a usage with a stream interface.	2013-12-09 15:40:23 +01:00
Willy Tarreau	b8020cefed	MEDIUM: connection: move the send_proxy offset to the connection Till now the send_proxy_ofs field remained in the stream interface, but since the dynamic allocation of the connection, it makes a lot of sense to move that into the connection instead of the stream interface, since it will not be statically allocated for each session. Also, it turns out that moving it to the connection fills an alignment hole on 64-bit architectures so it does not consume more memory, and removing it from the stream interface was an opportunity to correctly reorder fields and reduce the stream interface's size from 160 to 144 bytes (-10%). This is 32 bytes saved per session.	2013-12-09 15:40:23 +01:00
Willy Tarreau	32e3c6a607	MAJOR: stream interface: dynamically allocate the outgoing connection The outgoing connection is now allocated dynamically upon the first attempt to touch the connection's source or destination address. If this allocation fails, we fail on SN_ERR_RESOURCE. As we didn't use si->conn anymore, it was removed. The endpoints are released upon session_free(), on the error path, and upon a new transaction. That way we are able to carry the existing server's address across retries. The stream interfaces are not initialized anymore before session_complete(), so we could even think about allocating them dynamically as well, though that would not provide much savings. The session initialization now makes use of conn_new()/conn_free(). This slightly simplifies the code and makes it more logical. The connection initialization code is now shorter by about 120 bytes because it's done at once, allowing the compiler to remove all redundant initializations. The si_attach_applet() function now takes care of first detaching the existing endpoint, and it is called from stream_int_register_handler(), so we can safely remove the calls to si_release_endpoint() in the application code around this call. A call to si_detach() was made upon stream_int_unregister_handler() to ensure we always free the allocated connection if one was allocated in parallel to setting an applet (eg: detect HTTP proxy while proceeding with stats maybe).	2013-12-09 15:40:23 +01:00
Willy Tarreau	2a6e8802c0	MEDIUM: stream-interface: introduce si_attach_conn to replace si_prepare_conn si_prepare_conn() is not appropriate in our case as it both initializes and attaches the connection to the stream interface. Due to the asymmetry between accept() and connect(), it causes some fields such as the control and transport layers to be reinitialized. Now that we can separately initialize these fields using conn_prepare(), let's break this function to only attach the connection to the stream interface. Also, by analogy, si_prepare_none() was renamed si_detach(), and si_prepare_applet() was renamed si_attach_applet().	2013-12-09 15:40:23 +01:00
Willy Tarreau	f79c8171b2	MAJOR: connection: add two new flags to indicate readiness of control/transport Currently the control and transport layers of a connection are supposed to be initialized when their respective pointers are not NULL. This will not work anymore when we plan to reuse connections, because there is an asymmetry between the accept() side and the connect() side : - on accept() side, the fd is set first, then the ctrl layer then the transport layer ; upon error, they must be undone in the reverse order, then the FD must be closed. The FD must not be deleted if the control layer was not yet initialized ; - on the connect() side, the fd is set last and there is no reliable way to know if it has been initialized or not. In practice it's initialized to -1 first but this is hackish and supposes that local FDs only will be used forever. Also, there are even fewer solutions for keeping track of the transport layer's state. Also it is possible to support delayed close() when something (eg: logs) tracks some information requiring the transport and/or control layers, making it even more difficult to clean them. So the proposed solution is to add two flags to the connection : - CO_FL_CTRL_READY is set when the control layer is initialized (fd_insert) and cleared after it's released (fd_delete). - CO_FL_XPRT_READY is set when the transport layer is initialized (xprt->init) and cleared after it's released (xprt->close). The functions have been adapted to rely on this and not on the pointers anymore. conn_xprt_close() was unused and dangerous : it did not close the control layer (eg: the socket itself) but still marked the transport layer as closed, preventing any future call to conn_full_close() from finishing the job. The problem comes from conn_full_close() in fact. It needs to close the xprt and ctrl layers independently. After that we're still having an issue : we don't know based on ->ctrl alone whether the fd was registered or not.
For this we use the two new flags CO_FL_XPRT_READY and CO_FL_CTRL_READY. We now rely on this and not on conn->xprt nor conn->ctrl anymore to decide what remains to be done on the connection. In order not to miss some flag assignments, we introduce conn_ctrl_init() to initialize the control layer, register the fd using fd_insert() and set the flag, and conn_ctrl_close() which unregisters the fd and removes the flag, but only if the transport layer was closed. Similarly, at the transport layer, conn_xprt_init() calls ->init and sets the flag, while conn_xprt_close() checks the flag, calls ->close and clears the flag, regardless of xprt_ctx or xprt_st. This also ensures that the ->init and the ->close functions are called only once each and in the correct order. Note that conn_xprt_close() does nothing if the transport layer is still tracked. conn_full_close() now simply calls conn_xprt_close() then conn_ctrl_close() in turn, which do nothing if CO_FL_XPRT_TRACKED is set. In order to handle the error path, we also provide conn_force_close() which ignores CO_FL_XPRT_TRACKED and closes the transport and the control layers in turn. All relevant instances of fd_delete() have been replaced with conn_force_close(). Now we always know what state the connection is in and we can expect to split its initialization.	2013-12-09 15:40:23 +01:00
Willy Tarreau	b363a1f469	MAJOR: stream-int: stop using si->conn and use si->end instead The connection will only remain there as a pre-allocated entity whose goal is to be placed in ->end when establishing an outgoing connection. All connection initialization can be made on this connection, but all information retrieved should be applied to the end point only. This change is huge because there were many users of si->conn. Now the only users are those who initialize the new connection. The difficulty appears in a few places such as backend.c, proto_http.c, peers.c where si->conn is used to hold the connection's target address before assigning the connection to the stream interface. This is why we have to keep si->conn for now. A future improvement might consist in dynamically allocating the connection when it is needed.	2013-12-09 15:40:22 +01:00
Willy Tarreau	cf644ed37a	MEDIUM: stream-int: make ->end point to the connection or the appctx The long-term goal is to have a context for applets as an alternative to the connection and not as a complement. At the moment, the context is still stored into the stream interface, and we only put a pointer to the applet's context in si->end, initialize the context with object type OBJ_TYPE_APPCTX, and this allows us not to allocate an entry when deciding to switch to an applet. A special care is taken to never dereference si->conn anymore when dealing with an applet. That's why it's important that si->end is always set to the proper type : si->end == NULL => not connected to anything si->end == OBJ_TYPE_APPCTX => connected to an applet si->end == OBJ_TYPE_CONN => real connection (server, proxy, ...) The session management code used to check the applet from the connection's target. Now it uses the stream interface's end point and does not touch the connection at all. Similarly, we stop checking the connection's addresses and file descriptors when reporting the applet's status in the stats dump.	2013-12-09 15:40:22 +01:00
Willy Tarreau	4a59f2f954	MAJOR: stream interface: remove the ->release function pointer Since last commit, we now have a pointer to the applet in the applet context. So we don't need the si->release function pointer anymore, it can be extracted from applet->applet.release. At many places, the ->release function was still tested for real connections while it is only limited to applets, so most of them were simply removed. For the remaining valid uses, a new inline function si_applet_release() was added to simplify the check and the call.	2013-12-09 15:40:22 +01:00
Willy Tarreau	7d67d7b9e5	MINOR: stream-int: add a new pointer to the end point The end point will correspond to either an applet context or a connection, depending on the object type. For now the pointer remains null.	2013-12-09 15:40:22 +01:00
Willy Tarreau	372d6708fb	MINOR: stream-int: split si_prepare_embedded into si_prepare_none and si_prepare_applet si_prepare_embedded() was used both to attach an applet and to detach anything from a stream interface. Split it into si_prepare_none() to detach and si_prepare_applet() to attach an applet. si->conn->target is now assigned from within these two functions instead of their respective callers.	2013-12-09 15:40:22 +01:00
Willy Tarreau	6fe1541285	MINOR: stream-int: make the shutr/shutw functions void This is to be more consistent with the other functions. The only reason why these functions used to return a value was to let the caller adjust polling by itself, but now their only callers were the si_shutr()/si_shutw() inline functions. Now these functions do not depend anymore on the connection. The connection variants of these functions now call conn_data_stop_recv()/conn_data_stop_send() before returning, in order not to require a return code anymore. The applet version does not need this at all.	2013-12-09 15:40:22 +01:00
Willy Tarreau	8b3d7dfd7c	MEDIUM: stream-int: split the shutr/shutw functions between applet and conn These functions induce a lot of ifs everywhere because they consider two different cases, one which is where the connection exists and has a file descriptor, and the other one which is the default case where at most an applet has to be notified. Let's have them in si_ops and automatically decide which one to use. The connection shutdown sequence has been slightly simplified, and we now clear the flags at the end. Also we remove SHUTR_NOW after a shutw with nolinger, as it's cleaner not to keep it.	2013-12-09 15:40:22 +01:00
Willy Tarreau	26f4a04744	MEDIUM: connection: set the socket shutdown flags on socket errors When we get a hard error from a syscall indicating the socket is dead, it makes sense to set the CO_FL_SOCK_WR_SH and CO_FL_SOCK_RD_SH flags to indicate that the socket may not be used anymore. It will ease the error processing in health checks where the state of socket is very important. We'll also be able to avoid some setsockopt(nolinger) after an error. For now, the rest of the code is not impacted because CO_FL_ERROR is always tested prior to these flags.	2013-12-04 23:50:36 +01:00
Willy Tarreau	7fe45698f5	BUG/MINOR: connection: check EINTR when sending a PROXY header PROXY protocol header was not tolerant to signals, so it might cause a connection to report an error if a signal comes in at the exact same moment the send is done. This is 1.5-specific and does not need any backport.	2013-12-04 23:50:26 +01:00
Godbach	4f48990c1a	OPTIM: stream_interface: return directly if the connection flag CO_FL_ERROR has been set The connection flag CO_FL_ERROR is tested in both the si_conn_recv_cb() and si_conn_send_cb() functions. If CO_FL_ERROR has been set, the out_error branch is executed. But the only job of the out_error branch is to set CO_FL_ERROR on the connection flags. So it's better to return directly than to go to the out_error branch under such conditions. As a result, the out_error branch becomes needless and can be removed. In addition, the return type of si_conn_send_loop() is also changed to void. The caller should check conn->flags for errors just like stream_int_chk_snd_conn() does as below: static void stream_int_chk_snd_conn(struct stream_interface *si) { ... conn_refresh_polling_flags(si->conn); - if (si_conn_send(si->conn) < 0) { + si_conn_send(si->conn); + if (si->conn->flags & CO_FL_ERROR) { ... } Signed-off-by: Godbach <nylzhaowei@gmail.com>	2013-12-04 10:46:09 +01:00
Godbach	e68e02dc1d	CLEANUP: stream_interface: cleanup loop information in si_conn_send_loop() Though si_conn_send_loop() does not loop over ->snd_buf() after commit `ed7f836`, there is still some code left which uses `while` but only executes once. This commit does the cleanup job and renames si_conn_send_loop() to si_conn_send(). Signed-off-by: Godbach <nylzhaowei@gmail.com>	2013-10-12 07:53:33 +02:00
Willy Tarreau	95742a43aa	BUG/MEDIUM: fix broken send_proxy on FreeBSD David Berard reported that send-proxy was broken on FreeBSD and tracked the issue to be an error returned by send(). We already had the same issue in the past in another area which was addressed by the following commit : `0ea0cf6` BUG: raw_sock: also consider ENOTCONN in addition to EAGAIN In fact, on Linux send() returns EAGAIN when the connection is not yet established while other OSes return ENOTCONN. Let's consider ENOTCONN for send-proxy there as the same as EAGAIN. David confirmed that this change properly fixed the issue. Another place was affected as well (health checks with send-proxy), and was fixed. This fix does not need any backport since it only affects 1.5.	2013-09-03 09:08:31 +02:00
Willy Tarreau	fa8e2bc68c	OPTIM: splicing: use splice() for the last block when relevant Splicing is avoided for small transfers because it's generally cheaper to perform a couple of recv+send calls than pipe+splice+splice. This has the consequence that the last chunk of a large transfer may be transferred using recv+send if it's less than 4 kB. But when the pipe is already set up, it's better to use splice() to read the pending data, since they will get merged with the pending ones. This is what now happens every time the reader is slower than the writer. Note that this change alone could have fixed most of the CPU hog bug, except at the end when only the close was pending.	2013-07-22 09:31:56 +02:00
Willy Tarreau	5007d2aa33	BUG/MINOR: stream_interface: don't call chk_snd() on polled events As explained in previous patch, we incorrectly call chk_snd() when performing a read even if the write event is already subscribed to poll(). This is counter-productive because we're almost sure to get an EAGAIN. A quick test shows that this fix halves the number of failed splice() calls without adding any extra work on other syscalls. This could have been tagged as an improvement, but since this behaviour made the analysis of previous bug more complex, it still qualifies as a fix.	2013-07-22 09:31:55 +02:00
Willy Tarreau	61d39a0e2a	BUG/MEDIUM: splicing: fix abnormal CPU usage with splicing Mark Janssen reported an issue in 1.5-dev19 which was introduced in 1.5-dev12 by commit `96199b10`. From time to time, randomly, the CPU usage spikes to 100% for seconds to minutes. A deep analysis of the traces provided shows that it happens when waiting for the response to a second pipelined HTTP request, or when trying to handle the received shutdown advertised by epoll() after the last block of data. Each time, splice() was involved with data pending in the pipe. The cause of this was that such events could not be taken into account by splice nor by recv and were left pending : - the transfer of the last block of data, optionally with a shutdown was not handled by splice() because of the validation that to_forward is higher than MIN_SPLICE_FORWARD ; - the next recv() call was inhibited because of the test on presence of data in the pipe. This is also what prevented the recv() call from handling a response to a pipelined request until the client had ACKed the previous response. No fewer than 4 different methods were tried to fix this, and the current one was finally chosen. The principle is that if an event is not caught by splice(), then it MUST be caught by recv(). So we remove the condition on the pipe's emptiness to perform a recv(), and in order to prevent recv() from being used in the middle of a transfer, we mark supposedly full pipes with CO_FL_WAIT_ROOM, which makes sense because the reason for stopping a splice()-based receive is that the pipe is supposed to be full. The net effect is that we don't wake up and sleep in loops during these transient states. This happened much more often than expected, sometimes for a few cycles at end of transfers, but rarely long enough to be noticed, unless a client timed out with data pending in the pipe.
The effect on CPU usage is visible even when transferring 1MB objects in pipeline, where the CPU usage drops from 10 to 6% on a small machine at medium bandwidth. Some further improvements are needed : - the last chunk of a splice() transfer is never done using splice due to the test on to_forward. This is wrong and should be performed with splice if the pipe has not yet been emptied ; - si_chk_snd() should not be called when the write event is already being polled, otherwise we're almost certain to get EAGAIN. Many thanks to Mark for all the traces he cared to provide, they were essential for understanding this issue which was not reproducible without. Only 1.5-dev is affected, no backport is needed.	2013-07-22 09:31:55 +02:00
Willy Tarreau	9568d7108f	BUG/MEDIUM: stream_interface: don't close outgoing connections on shutw() Commit `7bb68abb` introduced the SI_FL_NOHALF flag in dev10. It is used to automatically close the write side of a connection whose read side is closed. But the patch also caused the opposite to happen, which is that a simple shutw() call would immediately close the connection. This is not desired because when using option abortonclose, we want to pass the client's shutdown to the server which will decide what to do with it. So let's avoid the close when SHUTR is not set.	2012-12-30 01:39:37 +01:00
Willy Tarreau	34ac5665d4	BUG/MEDIUM: stream_interface: fix another case where the reader might not be woken up The code review during the chase for the POST freeze uncovered another possible issue which might appear when we perform an incomplete read and want to stop because of READ_DONTWAIT or because we reached the maximum read_poll limit. Reading is disabled but SI_FL_WAIT_ROOM was not set, possibly causing some cases where a send() on the other side would not wake the reader up until another activity on the same side calls the update function which fixes its status.	2012-12-19 19:28:57 +01:00
Willy Tarreau	6657276871	BUG/MAJOR: stream_interface: fix occasional data transfer freezes Since the changes in connection management, it became necessary to re-enable polling after a fast-forward transfer would complete. One such issue was addressed after dev12 by commit `9f7c6a18` (BUG/MAJOR: stream_interface: certain workloads could cause get stuck) but unfortunately, it was incomplete as very subtle cases would occasionally remain unaddressed when a buffer was marked with the NOEXP flag, which is used during POST uploads. The wake up must be performed even when the flag is there, the flag is used only to refresh the timeout. Too many conditions need to be hit together for the situation to be reproducible, but it systematically appears for some users. It is particularly important to credit Sander Klein and John Rood from Picturae ICT ( http://picturae.com/ ) for reporting this bug on the mailing list, providing configs and countless traces showing the bug in action, and for their patience testing literally tens of snapshots and versions of supposed fixes during a full week to narrow the commit range until the bug was really knocked down! As a side effect of their numerous tests, several other bugs were fixed.	2012-12-19 19:20:24 +01:00
Willy Tarreau	7d28149e92	BUG/MEDIUM: connection: always update connection flags prior to computing polling stream_int_chk_rcv_conn() did not clear connection flags before updating them. It is unsure whether this could have caused the stalled transfers that have been reported since dev15. In order to avoid such further issues, we now use a simple inline function to do all the job.	2012-12-17 01:14:25 +01:00
Willy Tarreau	b016587068	BUG/MINOR: stream_interface: don't return when the fd is already set Back in the days where polling was made with select() where all FDs were checked at once, stream_int_chk_snd_conn() used to check whether the file descriptor it was passed was ready or not, so that it did not perform the work for nothing. Right now FDs are checked just before calling the I/O handler so this test never matches at best, or may return false information at worst. Since conn_fd_handler() always clears the flags upon exit, it looks like a missed event cannot happen right now. Still, better remove this outdated check than wait for it to cause issues.	2012-12-15 10:12:39 +01:00
Willy Tarreau	ca00fbcb91	BUG/MEDIUM: stream-interface: fix possible stalls during transfers Sander Klein reported a rare case of POST transfers being stalled after a few megabytes since dev15. One possible culprit is the fix for the CPU spinning issues which is not totally correct, because stream_int_chk_snd_conn() would unconditionally enable the CO_FL_CURR_WR_ENA flag. What could theoretically happen is the following sequence : 1) send buffer is empty, server-side polling is disabled 2) client sends some data 3) such data are forwarded to the server using stream_int_chk_snd_conn() 4) conn->flags |= CO_FL_CURR_WR_ENA 5) si_conn_send_loop() is called 6) raw_sock_from_buf() does a partial write due to full kernel buffers 7) stream_int_chk_snd_conn() detects this and requests to be called to send the remaining data using __conn_data_want_send(), and clears the SI_FL_WAIT_DATA flag on the stream interface, indicating that it is already congested. 8) conn_cond_update_polling() calls conn_data_update_polling() which sees that both CO_FL_DATA_WR_ENA and CO_FL_CURR_WR_ENA are set, so it does not enable polling on the output fd. 9) the next chunk from the client fills the buffer 10) stream_int_chk_snd_conn() is called again 11) SI_FL_WAIT_DATA is already cleared, so the function immediately returns without doing anything. 12) the buffer is now full with the FD write polling disabled and everything deadlocks. Note that there is no reason for such an issue not to happen the other way around, from server to client, except maybe that due to the speed difference between the client and the server, client-side polling is always enabled and the buffer is never empty. All this shows that the new polling still looks fragile, in part due to the double information on the FD status, being both in fdtab[] and in the connection, which looks unavoidable. We should probably have some functions to tighten the relation between such flags and avoid manipulating them by hand.
Also, the effects of chk_snd() on the polling are still under-estimated, while the relation between the stream_int and the FD is still too much present. Maybe the function should be rethought to only call the connection's fd handler. The connection model probably needs two calling conventions for bottom half and upper half.	2012-12-15 09:18:05 +01:00
Willy Tarreau	d486ef5045	BUG/MINOR: connection: remove a few synchronous calls to polling updates There were a few synchronous calls to polling updates in some functions called from the connection handler. These are not needed and should be replaced by more efficient and more debuggable asynchronous calls.	2012-12-10 17:03:52 +01:00
Willy Tarreau	d29a06689f	BUG/MAJOR: connection: always recompute polling status upon I/O Bryan Berry and Baptiste Assmann both reported some occasional CPU spinning loops where haproxy was still processing I/O but burning CPU for apparently uncaught events. What happens is the following sequence : - proxy is in TCP mode - a connection from a client initiates a connection to a server - the connection to the server does not immediately happen and is polled for - in the meantime, the client speaks and the stream interface calls ->chk_snd() on the peer connection to send the new data - chk_snd() calls send_loop() to send the data. The latter makes the connection succeed and empties the buffer, so it disables polling on the connection and on the FD by creating an update entry. - before the update is processed, poll() succeeds and reports a write event for this fd. The poller does fd_ev_set() on the FD to switch it to speculative mode - the IO handler is called with a connection which has no write flag but an FD which is enabled in speculative mode. - the connection does nothing useful. - conn_update_polling() at the end of conn_fd_handler() cannot disable the FD because there were no changes on this FD. - the handler is left with speculative polling still enabled on the FD, and will be called over and over until a poll event is needed to transfer data. There is no perfectly elegant solution to this. At least we should update the flags indicating the current polling status to reflect what is being done at the FD level. This will make it possible to detect that the FD needs to be disabled upon exit. chk_snd() also needs minor changes to correctly switch to speculative polling before calling send_loop(), and to reflect this in the connection flags. This is needed so that no event remains stuck there without any polling. In fact, chk_snd() and chk_rcv() should perform the same number of preparations and cleanups as conn_fd_handler().	2012-12-10 16:52:10 +01:00
Willy Tarreau	d1b3f0498d	MINOR: connection: don't remove failed handshake flags It's annoying that handshake handlers remove themselves from the connection flags when they fail because there is no way to tell which one fails. So now we only remove them when they succeed.	2012-12-03 14:22:12 +01:00
Willy Tarreau	2b199c9ac3	MEDIUM: connection: provide a common conn_full_close() function Several places got the connection close sequence wrong because it was not obvious. In practice we always need the same sequence when aborting, so let's have a common function for this.	2012-11-23 17:32:21 +01:00
Willy Tarreau	f9fbfe8229	BUG/MAJOR: stream_interface: read0 not always handled since dev12 The connection handling changes made in 1.5-dev12 introduced a regression with commit `9bf9c14c`. The issue is that the stream_sock_read0() callback must update the channel flags to indicate that the side is closed so that when process_session() is called, it can propagate the close to the other side and terminate the session. The issue only appears in HTTP tunnel mode. It's a bit tricky to trigger the issue: it requires that the request channel is full with data flowing from the client to the server and that both the response and the read0() are received at once so that the flags are not updated, and that the HTTP analyser switches to tunnel mode without being informed that the request write side is closed. After that, process_session() does not know that the connection has to be aborted either, and no more events appear on this side, so the connection stays there forever. Many thanks to Igor at owind for testing several snapshots and for providing valuable traces to reproduce and diagnose the issue!	2012-11-21 21:59:51 +01:00
Willy Tarreau	9f7c6a183b	BUG/MAJOR: stream_interface: certain workloads could get stuck Some very specifically scheduled workloads could sometimes get stuck when data receive was disabled due to a full buffer and then re-enabled due to a full send(). A conn_data_want_recv() had to be set again in this specific case. This bug was introduced with the connection rework and polling changes in dev12.	2012-11-19 17:11:00 +01:00
Willy Tarreau	3fdb366885	MAJOR: connection: replace struct target with a pointer to an enum Instead of storing an (int, ptr) pair in the struct connection and the struct session, we use a different method : we only store a pointer to an integer which is stored inside the target object and which contains a unique type identifier. That way, the pointer allows us to retrieve the object type (by dereferencing it) and the object's address (by computing the displacement in the target structure). The NULL pointer always corresponds to OBJ_TYPE_NONE. This reduces the size of the connection and session structs. It also simplifies target assignment and comparison. In order to improve the generated code, we try to put the obj_type element at the beginning of all the structs (listener, server, proxy, si_applet), so that the original and target pointers are always equal. A lot of code was touched by massive replaces, but the changes are not that important.	2012-11-12 00:42:33 +01:00
Willy Tarreau	128b03c9ab	CLEANUP: stream_interface: remove the external task type target Before connections were introduced, it was possible to connect an external task to a stream interface. However it was left as an exercise for the brave implementer to find how that ought to be done. The feature has been broken since the introduction of connections and was never fixed due to lack of users. Better remove this dead code now.	2012-11-11 23:14:16 +01:00
Willy Tarreau	b31c971bef	CLEANUP: channel: remove any reference of the hijackers Hijackers were functions designed to inject data into channels in the distant past. They became unused around 1.3.16, and since there has not been any user of this mechanism to date, it's uncertain whether the mechanism still works (and it's not really useful anymore). So better remove it as well as the pointer it uses in the channel struct.	2012-11-11 23:05:39 +01:00
Willy Tarreau	7f7ad91056	BUILD: stream_interface: remove si_fd() and its references si_fd() is not used a lot, and breaks builds on OpenBSD 5.2 which defines this name for its own purpose. It's easy enough to remove this one-liner function, so let's do it.	2012-11-11 20:53:29 +01:00
Willy Tarreau	5fddab0a56	OPTIM: stream_interface: disable reading when CF_READ_DONTWAIT is set CF_READ_DONTWAIT was designed to avoid getting an EAGAIN upon recv() when very few data are expected. It prevents the reader from looping over recv(). Unfortunately with speculative I/O, it is very common that the same event has the time to be called twice before the task handles the data and disables the recv(). This is because not all tasks are always processed at once. Instead of leaving the buffer free-wheeling and doing an EAGAIN, we disable reading as soon as the first recv() succeeds. This way we're sure that only the next wakeup of the task will re-enable it if needed. Doing so has totally removed the EAGAIN we were seeing till now (30% of recv).	2012-11-10 00:23:38 +01:00
Willy Tarreau	ed7f836f07	BUG/MINOR: stream_interface: don't loop over ->snd_buf() It is stupid to loop over ->snd_buf() because the snd_buf() itself already loops and stops when system buffers are full. By looping again over it, we lose the information of the full buffers and perform one useless syscall. Furthermore, this causes issues when dealing with large uploads while waiting for a connection to establish, as it can report a server reject of some data as a connection abort, which is wrong. 1.4 does not have this issue as it loops at most twice (once for each buffer half) and exits as soon as system buffers are full. So no backport is needed.	2012-10-29 23:30:33 +01:00
Willy Tarreau	19d14ef104	MEDIUM: make the trash be a chunk instead of a char * The trash is used everywhere to store the results of temporary strings built out of s(n)printf, or as a storage for a chunk when chunks are needed. Using global.tune.bufsize is not the most convenient thing either. So let's replace trash with a chunk and directly use it as such. We can then use trash.size as the natural way to get its size, and get rid of many intermediary chunks that were previously used. The patch is huge because it touches many areas but it makes the code a lot more clear and even outlines places where trash was used without being that obvious.	2012-10-29 16:57:30 +01:00
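
The idea behind the last commit above (`19d14ef104`), a scratch buffer that carries its own size instead of a bare `char *` paired with a global size setting, can be sketched roughly as follows. This is only an illustration under stated assumptions: `demo_chunk`, `demo_chunk_alloc()` and `demo_chunk_fmt()` are hypothetical names invented for this sketch and are not HAProxy's actual definitions or API.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical, simplified chunk: a buffer that knows its own capacity. */
struct demo_chunk {
    char *str;  /* start of the storage area */
    int size;   /* total capacity in bytes */
    int len;    /* number of bytes currently used */
};

/* Allocate <size> bytes of storage. Returns 1 on success, 0 on failure. */
int demo_chunk_alloc(struct demo_chunk *chk, int size)
{
    assert(size > 0);
    chk->str = malloc((size_t)size);
    chk->size = chk->str ? size : 0;
    chk->len = 0;
    return chk->str != NULL;
}

/* Format into the chunk, replacing its contents. The limit comes from
 * chk->size, so callers never pass a separate buffer size around.
 * Returns the new length, or -1 if the output does not fit, in which
 * case the chunk is left empty.
 */
int demo_chunk_fmt(struct demo_chunk *chk, const char *fmt, ...)
{
    va_list ap;
    int ret;

    va_start(ap, fmt);
    ret = vsnprintf(chk->str, (size_t)chk->size, fmt, ap);
    va_end(ap);

    if (ret < 0 || ret >= chk->size) {
        /* error or truncation: reset to an empty string */
        chk->len = 0;
        if (chk->size > 0)
            chk->str[0] = '\0';
        return -1;
    }
    chk->len = ret;
    return ret;
}
```

With such a structure, a caller would format into the shared scratch area with something like `demo_chunk_fmt(&trash, "srv=%d", id)` and read the capacity from `trash.size`, rather than consulting a separate tunable for the buffer size.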


144 Commits