haproxy

mirror of http://git.haproxy.org/git/haproxy.git/ synced 2025-02-17 11:06:54 +00:00

Author	SHA1	Message	Date
Willy Tarreau	25002d206b	MINOR: polling: create function fd_compute_new_polled_status() This function is used to compute the new polling state based on the previous state. All pollers have to do this in their update loop, so better centralize the logic for it.	2014-01-26 00:42:32 +01:00
Willy Tarreau	e852545594	MEDIUM: polling: centralize polled events processing Currently, each poll loop handles the polled events the same way, resulting in a lot of duplicated, complex code. Additionally, epoll was the only one to handle newly created FDs immediately. So instead, let's move that code to fd.c in a new function dedicated to this task : fd_process_polled_events(). All pollers now use this function.	2014-01-26 00:42:32 +01:00
Willy Tarreau	6c11bd2f89	OPTIM: raw-sock: don't speculate after a short read if polling is enabled This is the reimplementation of the "done" action : when we experience a short read, we're almost certain that we've exhausted the system's buffers and that we'll meet an EAGAIN if we attempt to read again. If the FD is not yet polled, the stream interface already takes care of stopping the speculative read. When the FD is already being polled, we have two options : - either we're running from a level-triggered poller, in which case we'd rather report that we've reached the end so that we don't speculate over the poller and let it report next time data are available ; - or we're running from an edge-triggered poller in which case we have no choice and have to see the EAGAIN to re-enable events. At the moment we don't have any edge-triggered poller, so it's desirable to avoid speculative I/O that we know will fail. Note that this must not be ported to SSL since SSL hides the real readiness of the file descriptor. Thanks to this change, we observe no EAGAIN anymore during keep-alive transfers, and failed recvfrom() are reduced by half in http-server-close mode (the client-facing side is always being polled and the second recv can be avoided). Doing so results in about 5% performance increase in keep-alive mode. Similarly, we used to have up to about 1.6% of EAGAIN on accept() (1/maxaccept), and these have completely disappeared under high loads.	2014-01-26 00:42:32 +01:00
Willy Tarreau	baf5b9b445	CLEANUP: connection: fix comments in connection.h to reflect new behaviour. The polling has substantially changed, better fix the comments.	2014-01-26 00:42:31 +01:00
Willy Tarreau	aad69387ac	CLEANUP: connection: use conn_xprt_ready() instead of checking the flag It's easier and safer to rely on conn_xprt_ready() everywhere than to check the flag itself. It will also simplify adding extra checks later if needed. Some useless controls for !xprt have been removed, as the XPRT_READY flag itself guarantees xprt is set.	2014-01-26 00:42:31 +01:00
Willy Tarreau	3c72872da1	CLEANUP: connection: use conn_ctrl_ready() instead of checking the flag It's easier and safer to rely on conn_ctrl_ready() everywhere than to check the flag itself. It will also simplify adding extra checks later if needed. Some useless controls for !ctrl have been removed, as the CTRL_READY flag itself guarantees ctrl is set.	2014-01-26 00:42:31 +01:00
Willy Tarreau	d8375891fc	MEDIUM: connection: no need to recheck FD state We already have everything in the connection flags using the CO_FL_DATA_*_ENA bits combined with the fd's ready state, so we do not need to check fdtab[fd].ev anymore. This considerably simplifies the connection handling logic since it doesn't have to mix connection flags with past polling states.	2014-01-26 00:42:31 +01:00
Willy Tarreau	798c3c9c41	MINOR: stream-interface: no need to call fd_stop_both() on error We don't need to call fd_stop_both() since we already call conn_cond_update_polling() which will do it. This call was introduced by commit `d29a066` ("BUG/MAJOR: connection: always recompute polling status upon I/O").	2014-01-26 00:42:31 +01:00
Willy Tarreau	708e717251	MEDIUM: stream-interface: the polling flags must always be updated in chk_snd_conn We used to only update the polling flags in data phase, but after that we could update other flags. It does not seem possible to trigger a bug here but it's not very safe either. Better always keep them up to date.	2014-01-26 00:42:30 +01:00
Willy Tarreau	fd803bb4d7	MEDIUM: connection: add check for readiness in I/O handlers The recv/send callbacks must check for readiness themselves instead of having their callers do it. This will strengthen the test and will also ensure we never refrain from calling a handshake handler because a direction is being polled while the other one is ready.	2014-01-26 00:42:30 +01:00
Willy Tarreau	e1f50c4b02	MEDIUM: connection: remove conn_{data,sock}_poll_{recv,send} We simply remove these functions and replace their calls with the appropriate ones : - if we're in the data phase, we can simply report wait on the FD - if we're in the socket phase, we may also have to signal the desire to read/write on the socket because it might not be active yet.	2014-01-26 00:42:30 +01:00
Willy Tarreau	310987a038	MAJOR: connection: remove the CO_FL_WAIT_{RD,WR} flags These flags were used to report the readiness of the file descriptor. Now this readiness is directly checked at the file descriptor itself. This removes the need for constantly synchronizing updates between the file descriptor and the connection and ensures that all layers share the same level of information. For now, the readiness is updated in conn_{sock,data}_poll_* by directly touching the file descriptor. This must move to the lower layers instead so that these functions can disappear as well. In this state, the change works but is incomplete. It's sensible enough to avoid making it more complex. Now the sock/data updates become much simpler because they just have to enable/disable access to a file descriptor and not to care anymore about its readiness.	2014-01-26 00:42:30 +01:00
Willy Tarreau	f817e9f473	MAJOR: polling: rework the whole polling system This commit heavily changes the polling system in order to definitely fix the frequent breakage of SSL which needs to remember the last EAGAIN before deciding whether to poll or not. Now we have a state per direction for each FD, as opposed to a previous and current state previously. An FD can have up to 8 different states for each direction, each of which being the result of a 3-bit combination. These 3 bits indicate a wish to access the FD, the readiness of the FD and the subscription of the FD to the polling system. This means that it will now be possible to remember the state of a file descriptor across disable/enable sequences that generally happen during forwarding, where enabling reading on a previously disabled FD would result in forgetting the EAGAIN flag it met last time. Several new state manipulation functions have been introduced or adapted : - fd_want_{recv,send} : enable receiving/sending on the FD regardless of its state (sets the ACTIVE flag) ; - fd_stop_{recv,send} : stop receiving/sending on the FD regardless of its state (clears the ACTIVE flag) ; - fd_cant_{recv,send} : report a failure to receive/send on the FD corresponding to EAGAIN (clears the READY flag) ; - fd_may_{recv,send} : report the ability to receive/send on the FD as reported by poll() (sets the READY flag) ; Some functions are used to report the current FD status : - fd_{recv,send}_active - fd_{recv,send}_ready - fd_{recv,send}_polled Some functions were removed : - fd_ev_clr(), fd_ev_set(), fd_ev_rem(), fd_ev_wai() The POLLHUP/POLLERR flags are now reported as ready so that the I/O layers knows it can try to access the file descriptor to get this information. In order to simplify the conditions to add/remove cache entries, a new function fd_alloc_or_release_cache_entry() was created to be used from pollers while scanning for updates. 
The following pollers have been updated : ev_select() : done, built, tested on Linux 3.10 ev_poll() : done, built, tested on Linux 3.10 ev_epoll() : done, built, tested on Linux 3.10 & 3.13 ev_kqueue() : done, built, tested on OpenBSD 5.2	2014-01-26 00:42:30 +01:00
Willy Tarreau	033cd9d78c	REORG: polling: rename "fd_process_spec_events()" to "fd_process_cached_events()" This is in order to be coherent with the rest.	2014-01-26 00:42:29 +01:00
Willy Tarreau	899d95757e	REORG: polling: rename the cache allocation functions - alloc_spec_entry() becomes fd_alloc_cache_entry() - release_spec_entry() becomes fd_release_cache_entry()	2014-01-26 00:42:29 +01:00
Willy Tarreau	16f649c82c	REORG: polling: rename "fd_spec" to "fd_cache" So fd_spec was renamed "fd_cache" as it's becoming an event cache, and fd_nbspec becomes fd_cache_num.	2014-01-26 00:42:29 +01:00
Willy Tarreau	15a4dec87e	REORG: polling: rename "spec_e" to "state" and "spec_p" to "cache" We're completely changing the way FDs will be polled. There will be no more speculative I/O since we'll know the exact FD state, so these will only be cached events. First, let's fix a few field names which become confusing. "spec_e" was used to store a speculative I/O event state. Now we'll store the whole R/W states for the FD there. "spec_p" was used to store a speculative I/O cache position. Now let's clearly call it "cache".	2014-01-26 00:42:29 +01:00
Willy Tarreau	8e84c637d1	DOC: add a diagram showing polling state transitions This is internal stuff.	2014-01-26 00:42:29 +01:00
Willy Tarreau	69a41fa8a3	CLEANUP: polling: rename "spec_e" to "state" We're completely changing the way FDs will be polled. First, let's fix a few field names which become confusing. "spec_e" was used to store a speculative I/O event state. Now we'll store the whole R/W states for the FD there.	2014-01-26 00:42:28 +01:00
Willy Tarreau	f1ed327a7a	BUILD: fix VERDATE exclusion regex A backslash was missing. It used to work well with GNU grep anyway but better fix it.	2014-01-26 00:39:22 +01:00
Willy Tarreau	e6300be8f8	BUG/MEDIUM: stream-interface: don't wake the task up before end of transfer Recent commit `d7ad9f5` ("MAJOR: channel: add a new flag CF_WAKE_WRITE to notify the task of writes") was not correct. It used to wake up the task as soon as there was some write activity and the flag was set, even if there were still some data to be forwarded. This resulted in process_session() being called a lot when transferring chunk-encoded HTTP responses made of very large chunks. The purpose of the flag is to wake up only a task waiting for some room and not the other ones, so it's totally counter-productive to wake it up as long as there are data to forward because the task will not be allowed to write anyway. Also, the commit above was taking some risks by not considering certain events anymore (eg: state != SI_ST_EST). While such events are not used at the moment, if some new features were developed in the future relying on these, it would be better that they could be notified when subscribing to the WAKE_WRITE event, so let's restore the condition.	2014-01-25 22:28:22 +01:00
Willy Tarreau	4afd70aeab	BUG/MAJOR: fix freezes during compression Recent commit `d7ad9f5` ("MAJOR: channel: add a new flag CF_WAKE_WRITE to notify the task of writes") introduced this new CF_WAKE_WRITE flag that an analyser which requires some free space to write must set if it wants to be notified. Unfortunately, some places were missing. More specifically, the compression engine can rarely be stuck by a lack of output space, especially when dealing with non-compressible data. It then has to stop until some pending data are flushed and for this it must set the CF_WAKE_WRITE flag. But these cases were missed by the commit above. Fortunately, this change was introduced very recently and never released, so the impact was limited. Huge thanks to Sander Klein who first reported this issue and who kindly and patiently provided lots of traces and test data that made it possible to reproduce, analyze, then fix this issue.	2014-01-25 22:28:22 +01:00
Willy Tarreau	d153b3b096	DOC: fix misleading information about SIGQUIT SIGQUIT dumps the pools state to stderr, not to the logs. Thanks to Jim Freeman for reporting this.	2014-01-25 18:19:32 +01:00
Willy Tarreau	1f0da2485e	BUG/MEDIUM: unique_id: HTTP request counter is not stable Patrick Hemmer reported that using unique_id_format and logs did not report the same unique ID counter since commit `9f09521` ("BUG/MEDIUM: unique_id: HTTP request counter must be unique!"). This is because the increment was done while producing the log message, so it was performed twice. A better solution consists in fetching a new value once per request and saving it in the request or session context for all of this request's life. It happens that sessions already have a unique ID field which is used for debugging and reporting errors, and which differs from the one sent in logs and unique_id header. So let's change this to reuse this field to have coherent IDs everywhere. As of now, a session gets a new unique ID once it is instantiated. This means that TCP sessions will also benefit from a unique ID that can be logged. And this ID is renewed for each extra HTTP request received on an existing session. Thus, all TCP sessions and HTTP requests will have distinct IDs that will be stable along all their life, and coherent between all places where they're used (logs, unique_id header, "show sess", "show errors"). This feature is 1.5-specific, no backport to 1.4 is needed.	2014-01-25 11:07:06 +01:00
Thierry FOURNIER	e7054c7177	BUG/MINOR: payload: the patterns of the acl "req.ssl_ver" are not parsed with the right function. The fetch "req.ssl_ver" is not declared as an explicit ACL. If it is used as an implicit ACL, the ACL engine detects the SMP_T_UINT output type and chooses the default integer parser: pat_parse_int(). This fetch needs the parser pat_parse_dotted_ver(). This patch declares an explicit ACL named "req.ssl_ver" that uses the right parser function pat_parse_dotted_ver().	2014-01-25 02:50:58 +01:00
Willy Tarreau	4bd07de087	MEDIUM: checks: make use of chk_report_conn_err() for connection errors Checks used not to precisely report the errors that were detected at the connection layer (eg: too many SSL connections). Using chk_report_conn_err() makes this possible.	2014-01-24 16:15:04 +01:00
Willy Tarreau	9ce7013429	MEDIUM: tcp: report connection error at the connection level Now when a connection error happens, it is reported in the connection so that upper layers know exactly what happened. This is particularly useful with health checks and resources exhaustion.	2014-01-24 16:15:04 +01:00
Willy Tarreau	45b34e8abc	MINOR: connection: add more error codes to report connection errors Quite often, a connection error only reports "socket error" with no more information. This is especially problematic with health checks where many causes are possible, including resource exhaustion which does not lead to a valid errno code. So let's add explicit codes to cover these cases.	2014-01-24 16:15:04 +01:00
Thierry FOURNIER	ee330afba0	MINOR: standard: parse_binary() returns the length consumed and its documentation is updated Currently the value returned by this function is never used. All the callers just check whether the result is non-zero. Before this patch, the function returned the length of the produced content. This value is not useful because it is returned twice: the first time in the return value and the second time in the <binstrlen> argument. Now the function returns the number of bytes consumed from <source>.	2014-01-21 22:14:44 +01:00
Thierry FOURNIER	e7ba23633b	MINOR: pattern: move functions for grouping pat_match_* and pat_parse_* and add documentation.	2014-01-21 22:14:21 +01:00
Thierry FOURNIER	46ceb01c24	BUG/MEDIUM: pattern: Segfault in binary parser The pat_parse_* functions must return 0 on failure and the number of elements consumed from **text on success. The function pat_parse_bin() returns 0 or the length parsed. This causes a segfault. I just apply the double "!" operator to the result of pat_parse_bin() so that the return value matches the expected value.	2014-01-21 22:14:21 +01:00
Willy Tarreau	46be2e5039	MEDIUM: connection: update callers of ctrl->drain() to use conn_drain() Now we can more safely rely on the connection state to decide how to drain and what to do when data are drained. Callers don't need to manipulate the file descriptor's state anymore. Note that it also removes the need for the fix `ea90063` ("BUG/MEDIUM: stream-int: fix the keep-alive idle connection handler") since conn_drain() correctly sets the polling flags.	2014-01-20 22:27:17 +01:00
Willy Tarreau	3bd3e57a9b	MEDIUM: tcp: report in tcp_drain() that lingering is already disabled on close When an incoming shutdown or error is detected, we know that we can safely close without disabling lingering. Do it in tcp_drain() so that we don't have to do it from each and every caller.	2014-01-20 22:27:17 +01:00
Willy Tarreau	2aefad5df7	MINOR: connection: add a new conn_drain() function Till now there was no way to know from a connection if a previous call to drain() had done any change. This function is used to drain incoming data and to update the connection's flags at the same time. It also correctly sets the polling flags on the connection if the drain function indicates inability to receive. This function will be used preferably over ctrl->drain() when a connection is used.	2014-01-20 22:27:16 +01:00
Willy Tarreau	7f4bcc312d	MINOR: protocol: improve the proto->drain() API It was not possible to know if the drain() function had hit an EAGAIN, so now we change the API of this function to return : < 0 if EAGAIN was met = 0 if some data remain > 0 if a shutdown was received	2014-01-20 22:27:16 +01:00
Willy Tarreau	a593ec5bf4	MEDIUM: listener: fix polling management in the accept loop The accept loop used to force fd_poll_recv() even in places where it was not completely appropriate (eg: unexpected errors). It does not yet cause trouble but will do with the upcoming polling changes. Let's use it only where relevant now. EINTR/ECONNABORTED do not result in poll() anymore but the failed connection is simply skipped (this code dates from 1.1.32 when error codes were first considered).	2014-01-20 22:27:16 +01:00
Willy Tarreau	fa7fc95e16	BUG/MEDIUM: polling: ensure we update FD status when there's no more activity Some rare unexplained busy loops were observed on versions up to 1.5-dev19. It happens that if a file descriptor happens to be disabled for both read and write while it was speculatively enabled for both and this without creating a new update entry, there will be no way to remove it from the speculative I/O list until some other changes occur. It is suspected that a double sequence such as enable_both/disable_both could have led to this situation where an update cancels itself and does not clear the spec list in the poll loop. While it is unclear what I/O sequence may cause this situation to arise, it is safer to always add the FD to the update list if nothing could be done on it so that the next poll round will automatically take care of it. This is 1.5-specific, no backport is needed.	2014-01-20 20:57:02 +01:00
Willy Tarreau	00b0fb9349	BUG/MAJOR: ssl: fix breakage caused by recent fix `abf08d9` Recent commit `abf08d9` ("BUG/MAJOR: connection: fix mismatch between rcv_buf's API and usage") accidentally broke SSL by relying on an uninitialized value to enter the read loop. Many thanks to Cyril Bonté and Steve Ruiz for reporting this issue.	2014-01-17 11:09:40 +01:00
Thierry FOURNIER	410f8101ae	BUG/MEDIUM: map: segmentation fault with the stats socket command "set map ..." The value of the variable "appctx->ctx.map.ent" is used after the loop, but its value has changed. The variable "value" is initialized and contains the correct value. This is a recent bug, no backport is needed.	2014-01-15 18:39:38 +01:00
Willy Tarreau	2317976daa	BUILD: listener: fix recent accept4() again Recent commit `4448925` ("BUILD/MINOR: listener: remove a glibc warning on accept4()") broke accept4() on some systems because the glibc's version may now conflict with the local one.	2014-01-15 16:45:17 +01:00
Willy Tarreau	abf08d9365	BUG/MAJOR: connection: fix mismatch between rcv_buf's API and usage Steve Ruiz reported some reproducible crashes with HTTP health checks on a certain page returning a huge length. The traces he provided clearly showed that the recv() call was performed twice for a total size exceeding the buffer's length. Cyril Bonté tracked down the problem to be caused by the full buffer size being passed to rcv_buf() in event_srv_chk_r() instead of passing just the remaining amount of space. Indeed, this change happened during the connection rework in 1.5-dev13 with the following commit : `f150317` MAJOR: checks: completely use the connection transport layer But one of the problems is also that the comments at the top of the rcv_buf() functions suggest that the caller only has to ensure the requested size doesn't overflow the buffer's size. Also, these functions already have to care about the buffer's size to handle wrapping free space when there are pending data in the buffer. So let's change the API instead to more closely match what could be expected from these functions : - the caller asks for the maximum amount of bytes it wants to read ; This means that only the caller is responsible for enforcing the reserve if it wants to (eg: checks don't). - the rcv_buf() functions fix their computations to always consider this size as a max, and always perform validity checks based on the buffer's free space. As a result, the code is simplified and reduced, and made more robust for callers which now just have to care about whether they want the buffer to be filled or not. Since the bug was introduced in 1.5-dev13, no backport to stable versions is needed.	2014-01-15 01:09:48 +01:00
Willy Tarreau	4448925930	BUILD/MINOR: listener: remove a glibc warning on accept4() The accept4() Linux syscall requires _GNU_SOURCE on ix86, otherwise it emits a warning. On other archs including x86_64, this problem doesn't happen. Thanks to Charles Carter from Sigma Software for reporting this.	2014-01-14 17:54:12 +01:00
Thierry FOURNIER	35249cb045	BUG/MINOR: pattern: pattern comparison executed twice If the pattern is set as case insensitive, the string comparison is executed twice. The first time is insensitive comparison, the second is sensitive. This is a recent bug, no backport is needed.	2014-01-14 15:42:59 +01:00
Willy Tarreau	8663105095	BUG: Revert "OPTIM: poll: restore polling after a poll/stop/want sequence" This reverts commit `1208266356`. It randomly breaks SSL. What happens is that if the SSL response is read at once by the SSL stack and is partially delivered to the buffer, then there's no way to read the next parts because we wait for some polling first. So we'll fix this after the polling rework.	2014-01-13 11:34:42 +01:00
Willy Tarreau	17edc81e7e	MEDIUM: config: report a warning when multiple servers have the same name A config where multiple servers have the same name in the same backend is prone to a number of issues : logs are not really exploitable, stats get really tricky and even harder to change, etc... In fact, it can be safe to have the same name between multiple servers only when their respective IDs are known and used. So now we detect this situation and emit a warning for the first conflict detected per server if any of the servers uses an automatic ID.	2014-01-03 12:20:22 +01:00
Willy Tarreau	2b028dd828	OPTIM: session: put unlikely() around the freewheeling code The code which enables tunnel mode or TCP transfers is rarely used and at most once per session. Putting it in an unlikely() clause reduces the length of the hot path of process_session() which is already quite long, and also slightly reduces its overall size. Some measurements show a steady gain of about 0.2% thanks to this.	2013-12-31 23:56:46 +01:00
Willy Tarreau	9fe7aae6eb	MINOR: checks: use an inline function for health_adjust() This function is called twice per request, and does almost always nothing. Better use an inline version to avoid entering it when we can. About 0.5% additional performance was gained this way.	2013-12-31 23:47:37 +01:00
Willy Tarreau	9e5a3aacf4	MEDIUM: stream-int: make si_connect() return an established state when possible si_connect() used to only return SI_ST_CON. But it already detects the connection reuse and is the function which avoids calling connect(). So it already knows the connection is valid and reused. Thus we make it return SI_ST_EST when a connection is reused. This means that connect_server() can return this state and sess_update_stream_int() as well. Thanks to this change, we don't need to leave process_session() in SI_ST_CON state to immediately enter it again to switch to SI_ST_EST. Implementing this removes one call to process_session() per request in keep-alive mode. We're now at 2 calls per request, which is the minimum (one for the request and another one for the response). The number of calls to http_wait_for_response() has also dropped from 2 to one. Tests indicate a performance gain of about 2.6% in request rate in keep-alive mode. There should be no gain in http-server-close() since we don't use this faster path.	2013-12-31 23:32:12 +01:00
Willy Tarreau	b44c873d61	MEDIUM: session: prepare to support earlier transitions to the established state At the moment it is possible in sess_prepare_conn_req() to switch to the established state when the target is an applet. But sess_update_stream_int() will soon also have the ability to set the established state via connect_server() when a connection is reused, leading to a synchronous connect. So prepare the code to handle this SI_ST_ASS -> SI_ST_EST transition, which really matches what's done in the lower layers.	2013-12-31 23:16:50 +01:00
Willy Tarreau	0e37f1c40e	MINOR: session: factor out the connect time measurement Currently there are 3 places in the code where t_connect is set after switching to state SI_ST_EST, and a fourth one will soon come. Since all these places lead to an immediate call to sess_establish() to complete the session establishment, better move that measurement there.	2013-12-31 23:06:46 +01:00
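
The polling rework in commit f817e9f473 above describes each direction of an FD as a 3-bit state: a wish to access the FD, its readiness, and its subscription to the poller. A minimal C sketch of that idea follows. The bit values, the sk_ names, and the rule used to derive the polled bit are assumptions made for illustration; they are not taken from HAProxy's source.

```c
/* Hypothetical encoding of one direction (recv or send) of one FD.
 * The commit names the three notions but not their bit values. */
enum {
    SK_ACTIVE = 0x1, /* the upper layer wishes to access the FD */
    SK_READY  = 0x2, /* no EAGAIN seen since readiness was last reported */
    SK_POLLED = 0x4  /* the FD is subscribed to the poller */
};

/* Illustrative counterparts of fd_want_*, fd_stop_*, fd_cant_*, fd_may_*. */
static inline unsigned sk_want(unsigned st) { return st | SK_ACTIVE; }
static inline unsigned sk_stop(unsigned st) { return st & ~(unsigned)SK_ACTIVE; }
static inline unsigned sk_cant(unsigned st) { return st & ~(unsigned)SK_READY; }
static inline unsigned sk_may(unsigned st)  { return st | SK_READY; }

/* Assumed rule: subscribe to the poller only while access is wanted
 * and the FD is not known to be ready; unsubscribe otherwise. */
static inline unsigned sk_update_polled(unsigned st)
{
    if ((st & SK_ACTIVE) && !(st & SK_READY))
        return st | SK_POLLED;
    return st & ~(unsigned)SK_POLLED;
}
```

Because the ready bit is stored independently of the active bit, a stop followed by a want leaves the memory of the last EAGAIN intact, which is the property the commit message says the previous design lacked.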

3803 Commits