haproxy

mirror of http://git.haproxy.org/git/haproxy.git/ synced 2025-05-07 10:18:01 +00:00

Author	SHA1	Message	Date
Vincent Bernat	3c2f2f207f	CLEANUP: remove unneeded casts In C89, "void *" is automatically promoted to any pointer type. Casting the result of malloc/calloc to the type of the LHS variable is therefore unneeded. Most of this patch was built using this Coccinelle patch: @@ type T; @@ - (T *) (\(lua_touserdata\|malloc\|calloc\|SSL_get_app_data\|hlua_checkudata\|lua_newuserdata\)(...)) @@ type T; T *x; void *data; @@ x = - (T *) data @@ type T; T *x; T *data; @@ x = - (T *) data Unfortunately, either Coccinelle or I is too limited to detect situations where a complex RHS expression is of type "void *" and therefore casting is not needed. Those cases were manually examined and corrected.	2016-04-03 14:17:42 +02:00
Willy Tarreau	10146c9c51	CLEANUP: poll: move the conditions for waiting out of the poll functions The poll() functions have become a bit dirty because they now check the size of the signal queue, the FD cache and the number of tasks. It's not their job, this must be moved to the caller. In the end it simplifies the code because the expiration date is now set to now_ms if we must not wait, and this achieves exactly the same result and is cleaner. The change looks large due to the change of indent for blocks which were inside an "if" block.	2015-04-13 20:47:51 +02:00
Willy Tarreau	5be2f35231	MAJOR: polling: centralize calls to I/O callbacks In order for HTTP/2 not to eat too much memory, we'll have to support on-the-fly buffer allocation, since most streams will have an empty request buffer at some point. Supporting allocation on the fly means being able to sleep inside I/O callbacks if a buffer is not available. Till now, the I/O callbacks were called from two locations : - when processing the cached events - when processing the polled events from the poller This change cleans up the design a bit further than what was started in 1.5. It now ensures that we never call any iocb from the poller itself and that instead, events learned by the poller are put into the cache. The benefit is important in terms of stability : we don't have to care anymore about the risk that new events are added into the poller while processing its events, and we're certain that updates are processed at a single location. To achieve this, we now modify all the fd_* functions so that instead of creating updates, they add/remove the fd to/from the cache depending on its state, and only create an update when the polling status reaches a state where it will have to change. Since the pollers make use of these functions to notify readiness (using fd_may_recv/fd_may_send), the cache is always up to date with the poller. Creating updates only when the polling status needs to change saves a significant amount of work for the pollers : a benchmark showed that on a typical TCP proxy test, the amount of updates per connection dropped from 11 to 1 on average. This also means that the update list is smaller and has more chances of not thrashing too many CPU cache lines. The first observed benefit is a net 2% performance gain on the connection rate. 
A second benefit is that when a connection is accepted, it's only when we're processing the cache, and the recv event is automatically added into the cache after the current one, resulting in this event being processed immediately during the same loop. Previously we used to have a second run over the updates to detect if new events were added to catch them before waking up tasks. The next gain will be offered by the next steps on this subject consisting in implementing an I/O queue containing all cached events ordered by priority just like the run queue, and to be able to leave some events pending there as long as needed. That will allow us not to perform some FD processing if it's not the proper time for this (typically keep waiting for a buffer to be allocated if none is available for a recv()). And by only processing a small bunch of them, we'll allow priorities to take place even at the I/O level. As a result of this change, functions fd_alloc_or_release_cache_entry() and fd_process_polled_events() have disappeared, and the code dedicated to checking for new fd events after the callback during the poll() loop was removed as well. Despite the patch looking large, it's mostly a change of what function is called upon fd_*() and almost nothing was added.	2014-11-21 20:37:32 +01:00
Willy Tarreau	25002d206b	MINOR: polling: create function fd_compute_new_polled_status() This function is used to compute the new polling state based on the previous state. All pollers have to do this in their update loop, so better centralize the logic for it.	2014-01-26 00:42:32 +01:00
Willy Tarreau	e852545594	MEDIUM: polling: centralize polled events processing Currently, each poll loop handles the polled events the same way, resulting in a lot of duplicated, complex code. Additionally, epoll was the only one to handle newly created FDs immediately. So instead, let's move that code to fd.c in a new function dedicated to this task : fd_process_polled_events(). All pollers now use this function.	2014-01-26 00:42:32 +01:00
Willy Tarreau	f817e9f473	MAJOR: polling: rework the whole polling system This commit heavily changes the polling system in order to definitely fix the frequent breakage of SSL which needs to remember the last EAGAIN before deciding whether to poll or not. Now we have a state per direction for each FD, as opposed to a previous and current state previously. An FD can have up to 8 different states for each direction, each of which being the result of a 3-bit combination. These 3 bits indicate a wish to access the FD, the readiness of the FD and the subscription of the FD to the polling system. This means that it will now be possible to remember the state of a file descriptor across disable/enable sequences that generally happen during forwarding, where enabling reading on a previously disabled FD would result in forgetting the EAGAIN flag it met last time. Several new state manipulation functions have been introduced or adapted : - fd_want_{recv,send} : enable receiving/sending on the FD regardless of its state (sets the ACTIVE flag) ; - fd_stop_{recv,send} : stop receiving/sending on the FD regardless of its state (clears the ACTIVE flag) ; - fd_cant_{recv,send} : report a failure to receive/send on the FD corresponding to EAGAIN (clears the READY flag) ; - fd_may_{recv,send} : report the ability to receive/send on the FD as reported by poll() (sets the READY flag) ; Some functions are used to report the current FD status : - fd_{recv,send}_active - fd_{recv,send}_ready - fd_{recv,send}_polled Some functions were removed : - fd_ev_clr(), fd_ev_set(), fd_ev_rem(), fd_ev_wai() The POLLHUP/POLLERR flags are now reported as ready so that the I/O layers know they can try to access the file descriptor to get this information. In order to simplify the conditions to add/remove cache entries, a new function fd_alloc_or_release_cache_entry() was created to be used from pollers while scanning for updates. 
The following pollers have been updated : ev_select() : done, built, tested on Linux 3.10 ev_poll() : done, built, tested on Linux 3.10 ev_epoll() : done, built, tested on Linux 3.10 & 3.13 ev_kqueue() : done, built, tested on OpenBSD 5.2	2014-01-26 00:42:30 +01:00
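The per-direction state described in f817e9f473 can be illustrated with a minimal sketch. The flag names, bit values, struct layout and the st_* helpers below are hypothetical and are not taken from the HAProxy sources; the rule "poll only when active and not ready" is an assumption made for illustration.

```c
#include <assert.h>

/* Hypothetical flag names and values: one 3-bit state per direction. */
enum {
    EV_ACTIVE = 1u << 0,  /* the caller wishes to access the FD */
    EV_READY  = 1u << 1,  /* the FD is believed ready (no EAGAIN since) */
    EV_POLLED = 1u << 2   /* the FD is subscribed to the poller */
};

enum { DIR_RD = 0, DIR_WR = 1 };

struct fd_state {
    unsigned int dir[2];  /* independent state for recv and send */
};

/* Illustrative counterparts of fd_want_*, fd_stop_*, fd_cant_*, fd_may_*. */
static void st_want(struct fd_state *s, int d) { s->dir[d] |= EV_ACTIVE; }
static void st_stop(struct fd_state *s, int d) { s->dir[d] &= ~(unsigned int)EV_ACTIVE; }
static void st_cant(struct fd_state *s, int d) { s->dir[d] &= ~(unsigned int)EV_READY; }
static void st_may(struct fd_state *s, int d)  { s->dir[d] |= EV_READY; }

/* Assumed rule: polling is only useful when access is wanted
 * and the last attempt hit EAGAIN (READY is cleared). */
static int st_needs_polling(const struct fd_state *s, int d)
{
    return (s->dir[d] & (EV_ACTIVE | EV_READY)) == EV_ACTIVE;
}
```

Because READY is stored separately from ACTIVE, a stop/want sequence does not lose the readiness information, which is the property the commit message relies on.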
Willy Tarreau	899d95757e	REORG: polling: rename the cache allocation functions - alloc_spec_entry() becomes fd_alloc_cache_entry() - release_spec_entry() becomes fd_release_cache_entry()	2014-01-26 00:42:29 +01:00
Willy Tarreau	16f649c82c	REORG: polling: rename "fd_spec" to "fd_cache" So fd_spec was renamed "fd_cache" as it's becoming an event cache, and fd_nbspec becomes fd_cache_num.	2014-01-26 00:42:29 +01:00
Willy Tarreau	15a4dec87e	REORG: polling: rename "spec_e" to "state" and "spec_p" to "cache" We're completely changing the way FDs will be polled. There will be no more speculative I/O since we'll know the exact FD state, so these will only be cached events. First, let's fix a few field names which become confusing. "spec_e" was used to store a speculative I/O event state. Now we'll store the whole R/W states for the FD there. "spec_p" was used to store a speculative I/O cache position. Now let's clearly call it "cache".	2014-01-26 00:42:29 +01:00
Willy Tarreau	69a41fa8a3	CLEANUP: polling: rename "spec_e" to "state" We're completely changing the way FDs will be polled. First, let's fix a few field names which become confusing. "spec_e" was used to store a speculative I/O event state. Now we'll store the whole R/W states for the FD there.	2014-01-26 00:42:28 +01:00
Willy Tarreau	39ebef82aa	BUG/MINOR: poll: the I/O handler was called twice for polled I/Os When a polled I/O event is detected, the event is added to the updates list and the I/O handler is called. Upon return, if the event handler did not experience an EAGAIN, the event remains in the updates list so that it will be processed later. But if the event was already in the spec list, its state is updated and it will be called again immediately upon exit, by fd_process_spec_events(), so this creates unfairness between speculative events and polled events. So don't call the I/O handler upon I/O detection when the FD already is in the spec list. The fd events are still updated so that the spec list is up to date with the possible I/O change.	2012-12-14 00:17:03 +01:00
Willy Tarreau	26d7cfce32	BUG/MAJOR: polling: do not set speculative events on ERR nor HUP Errors and Hangups are sticky events, which means that once they're detected, we never clear them, allowing them to be handled later if needed. Till now when an error was reported, it used to register a speculative I/O event for both recv and send. Since the connection had not requested such events, it was not able to detect a change and did not clear them, so the events were called in loops until a timeout caused their owner task to die. So this patch does two things : - stop registering spec events when no I/O activity was requested, so that we don't end up with non-disablable polling state ; - keep the sticky polling flags (ERR and HUP) when leaving the connection handler so that an error notification doesn't magically become a normal recv() or send() report once the event is converted to a spec event. It is normally not needed to make the connection handler emit an error when it detects POLL_ERR because either a registered data handler will have done it, or the event will be disabled by the wake() callback.	2012-12-07 00:09:43 +01:00
Willy Tarreau	70c6fd82c3	MAJOR: polling: remove unused callbacks from the poller struct Since no poller uses poller->{set,clr,wai,is_set,rem} anymore, let's remove them and remove the associated pointer tests in proto/fd.h.	2012-11-11 21:02:34 +01:00
Willy Tarreau	4a22627672	MAJOR: ev_kqueue: make the poller support speculative events The poller was updated to support speculative events. We'll need this to fully support SSL. As a side effect, the code has become much simpler and much more efficient, by taking advantage of the nice kqueue API which supports batched updates. All references to fd_sets have disappeared, and only the fdtab[].spec_e fields are used to decide about file descriptor state.	2012-11-11 20:53:29 +01:00
Willy Tarreau	babd05a6c6	MEDIUM: fd: add fd_poll_{recv,send} for use when explicit polling is required The old EV_FD_SET() macro was confusing, as it would enable receipt but there was no way to indicate that EAGAIN was received, hence the recently added FD_WAIT_* flags. They're not enough as we're still facing a conflict between EV_FD_* and FD_WAIT_*. So let's offer I/O functions what they need to explicitly request polling.	2012-09-02 21:53:11 +02:00
Willy Tarreau	3788e4c874	MEDIUM: fd: remove the EV_FD_COND_* primitives These primitives were initially introduced so that callers were able to conditionally set/disable polling on a file descriptor and check in return what the state was. It's been long since we last had an "if" on this, and all pollers' functions were the same for cond_* and their systematic counterparts, except that this required a check and a specific return value that are not always necessary. So let's simplify the FD API by removing this now unused distinction and by making all specific functions return void.	2012-09-02 21:53:10 +02:00
Willy Tarreau	076be25ab8	CLEANUP: remove the now unused fdtab direct I/O callbacks They were all left to NULL since last commit so we can safely remove them all now and remove the temporary dual polling logic in pollers.	2012-09-02 21:51:29 +02:00
Willy Tarreau	9845e75d23	MEDIUM: polling: prepare to call the iocb() function when defined. We will need this to centralize I/O callbacks. Nobody sets it right now so the code should have no impact.	2012-09-02 21:51:27 +02:00
Willy Tarreau	db3b32610f	REORG/MEDIUM: fd: remove FD_STCLOSE from struct fdtab In an attempt to get rid of fdtab[].state, and to move the relevant parts to the connection struct, we remove the FD_STCLOSE state which can easily be deduced from the <owner> pointer as there is a 1:1 match.	2012-09-02 21:51:25 +02:00
Willy Tarreau	491c498d97	BUG/MINOR: polling: some events were not set in various pollers fdtab[].ev was only set in ev_sepoll. Unfortunately, some I/O handling functions now rely on this, so depending on the polling mechanism, some useless operations might have been performed, such as performing a useless recv() when a HUP was reported. This is a very old issue, the flags were only added to the fdtab and not propagated into any poller. Then they were used in ev_sepoll which needed them for the cache. It is unsure whether a backport to 1.4 is appropriate or not.	2012-07-31 07:55:31 +02:00
Willy Tarreau	45a1251515	[MEDIUM] poll: add a measurement of idle vs work time We now measure the work and idle times in order to report the idle time in the stats. It's expected that we'll be able to use it at other places later.	2011-09-10 18:01:41 +02:00
Willy Tarreau	d79e79b436	[BUG] O(1) pollers should check their FD before closing it epoll, sepoll and kqueue pollers should check that their fd is not closed before attempting to close it, otherwise we can end up with multiple closes of fd #0 upon exit, which is harmless but dirty.	2009-05-10 10:18:54 +02:00
Willy Tarreau	332740dab2	[MEDIUM] pollers: don't wait if a signal is pending If an asynchronous signal is received outside of the poller, we don't want the poller to wait for a timeout to occur before processing it, so we set its timeout to zero, just like we do with pending tasks in the run queue.	2009-05-10 09:57:21 +02:00
Willy Tarreau	a534fea478	[CLEANUP] remove 65 useless NULL checks before free The C specification clearly states that free(NULL) is a no-op. So remove useless checks before calling free.	2008-08-03 20:48:50 +02:00
Willy Tarreau	ec6c5df018	[CLEANUP] remove many #include <types/xxx> from C files It should be stated as a rule that a C file should never include types/xxx.h when proto/xxx.h exists, as it gives less exposure to declaration conflicts (one of which was caught and fixed here) and it complicates the file headers for nothing. Only types/global.h, types/capture.h and types/polling.h have been found to be valid includes from C files.	2008-07-16 10:30:42 +02:00
Willy Tarreau	0c303eec87	[MAJOR] convert all expiration timers from timeval to ticks This is the first attempt at moving all internal parts from using struct timeval to integer ticks. These provide simpler and faster code due to simplified operations, and this change also saved about 64 bytes per session. A new header file has been added : include/common/ticks.h. It is possible that some functions should finally not be inlined because they're used quite a lot (eg: tick_first, tick_add_ifset and tick_is_expired). More measurements are required in order to decide whether this is interesting or not. Some function and variable names are still subject to change for better overall logic.	2008-07-07 00:09:58 +02:00
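As a rough illustration of why integer ticks are simpler than struct timeval, here is a minimal sketch of wrapping millisecond timers. The sk_* names, the use of 0 as a "no timer" sentinel and the 32-bit width are assumptions made for this example; the actual contents of include/common/ticks.h may differ.

```c
#include <assert.h>
#include <stdint.h>

#define SK_TICK_NONE 0u  /* assumption: 0 means "no timer set" */

/* Returns now + timeout_ms, skipping the reserved 0 value on wrap-around. */
static uint32_t sk_tick_add(uint32_t now, uint32_t timeout_ms)
{
    uint32_t t = now + timeout_ms;
    return t ? t : 1u;
}

/* A timer is expired when it is set and not after <now>. The signed
 * difference keeps the comparison valid across counter wrap-around
 * (relies on the usual two's complement conversion). */
static int sk_tick_is_expired(uint32_t timer, uint32_t now)
{
    if (timer == SK_TICK_NONE)
        return 0;
    return (int32_t)(timer - now) <= 0;
}

/* Returns the earlier of two timers, ignoring unset ones. */
static uint32_t sk_tick_first(uint32_t a, uint32_t b)
{
    if (a == SK_TICK_NONE)
        return b;
    if (b == SK_TICK_NONE)
        return a;
    return ((int32_t)(a - b) <= 0) ? a : b;
}
```

Each operation is one or two integer instructions, whereas the struct timeval equivalent has to handle seconds and microseconds separately.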
Willy Tarreau	b0b37bcd65	[MEDIUM] further improve monotonic clock by checking forward jumps The first implementation of the monotonic clock did not verify forward jumps. The consequence is that a fast changing time may expire a lot of tasks. While it does seem minor, in fact it is problematic because most machines which boot with a wrong date are in the past and suddenly see their time jump by several years in the future. The solution is to check if we spent more apparent time in a poller than allowed (with a margin applied). The margin is currently set to 1000 ms. It should be large enough for any poll() to complete. Tests with randomly jumping clock show that the result is quite accurate (error less than 1 second at every change of more than one second).	2008-06-23 14:00:57 +02:00
Willy Tarreau	b7f694f20e	[MEDIUM] implement a monotonic internal clock If the system date is set backwards while haproxy is running, some scheduled events are delayed by the amount of time the clock went backwards. This is particularly problematic on systems where the date is set at boot, because it seldom happens that health-checks do not get sent for a few hours. Before switching to use clock_gettime() on systems which provide it, we can at least ensure that the clock is not going backwards and maintain two clocks : the "date" which represents what the user wants to see (mostly for logs), and an internal date stored in "now", used for scheduled events.	2008-06-22 17:18:02 +02:00
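The two-clock idea from b7f694f20e can be sketched as follows. This is only an illustration of the backward-jump case: the struct, the field names and the mono_update() helper are hypothetical, and the forward-jump check added later in b0b37bcd65 is not covered.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical state: "date" is the wall clock shown to the user,
 * the returned value plays the role of the internal "now". */
struct mono_clock {
    int64_t last_date_ms;  /* last wall-clock reading, in ms */
    int64_t offset_ms;     /* correction accumulated from backward jumps */
};

/* Feeds a new wall-clock reading and returns the internal time,
 * which never goes backwards even if the wall clock does. */
static int64_t mono_update(struct mono_clock *c, int64_t date_ms)
{
    if (date_ms < c->last_date_ms)
        c->offset_ms += c->last_date_ms - date_ms;  /* absorb the jump */
    c->last_date_ms = date_ms;
    return date_ms + c->offset_ms;
}
```

Scheduled events would be compared against the returned value, while logs keep using the unmodified date.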
Willy Tarreau	3a6281199a	[BUG] event pollers must not wait if a task exists in the run queue Under some circumstances, a task may already lie in the run queue (eg: inter-task wakeup). It is disastrous to wait for an event in this case because some processing gets delayed.	2008-06-20 15:05:56 +02:00
Willy Tarreau	1db37710dc	[MEDIUM] limit the number of events returned by poll By default, epoll/kqueue used to return as many events as possible. This could sometimes cause huge latencies (latencies of up to 400 ms have been observed with many thousands of fds at once). Limiting the number of events returned also reduces the latency by avoiding too much blind processing. The value is set to 200 by default and can be changed in the global section using the tune.maxpollevents parameter.	2007-06-03 17:16:49 +02:00
Willy Tarreau	79b8a62ff6	[BUG] ev_kqueue was forgotten during the switch to timeval	2007-05-14 03:15:46 +02:00
Willy Tarreau	ef1d1f859b	[MAJOR] auto-registering of pollers at load time Gcc provides __attribute__((constructor)) which is very convenient to execute functions at startup right before main(). All the pollers have been converted to have their register() function declared like this, so that it is not necessary anymore to call them from a centralized file.	2007-04-16 00:25:25 +02:00
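The auto-registration technique from ef1d1f859b can be sketched like this. It assumes GCC or Clang; the registry, the poller_desc struct, the function names and the preference values are hypothetical and only illustrate how constructors remove the need for a central registration call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_POLLERS 8

/* Hypothetical registry entry: a name and a preference score. */
struct poller_desc {
    const char *name;
    int pref;
};

static struct poller_desc registered[MAX_POLLERS];
static int nb_registered;

static void register_poller(const char *name, int pref)
{
    if (nb_registered < MAX_POLLERS) {
        registered[nb_registered].name = name;
        registered[nb_registered].pref = pref;
        nb_registered++;
    }
}

/* Each poller registers itself: the compiler arranges for these
 * functions to run before main(), with no centralized list. */
__attribute__((constructor))
static void register_select(void) { register_poller("select", 150); }

__attribute__((constructor))
static void register_kqueue(void) { register_poller("kqueue", 300); }

/* Picks the registered poller with the highest preference, or NULL. */
static const struct poller_desc *best_poller(void)
{
    const struct poller_desc *best = NULL;
    for (int i = 0; i < nb_registered; i++)
        if (!best || registered[i].pref > best->pref)
            best = &registered[i];
    return best;
}
```

The order in which constructors run is not something this sketch depends on, since selection is done by preference after all of them have executed.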
Willy Tarreau	258696f5d8	[MAJOR] missing tv_now in kqueue_poll() blocking timeouts A missing call to tv_now(&now) just after kevent() prevented the timeouts from expiring.	2007-04-10 02:31:54 +02:00
Willy Tarreau	40562cb00c	[MINOR] kqueue: use fd_clo() to close the fd fd_clo() does not call kevent() which is not needed during a close(). This one will be faster.	2007-04-09 20:38:57 +02:00
Willy Tarreau	2ff7622c0c	[MAJOR] delay registering of listener sockets at startup Some pollers such as kqueue lose their FD across fork(), meaning that the registered file descriptors are lost too. Now when the proxies are started by start_proxies(), the file descriptors are not registered yet, leaving enough time for the fork() to take place and to get a new pollfd. It will be the first call to maintain_proxies that will register them.	2007-04-09 19:29:56 +02:00
Willy Tarreau	8755285486	[MEDIUM] kqueue: do not manually remove fds FDs attached to a kevent are automatically removed after close(). Also, do not mark the FDs as EV_CLEAR. We want to stay informed about readiness.	2007-04-09 17:16:07 +02:00
Willy Tarreau	cd5ce2a514	[MAJOR] kqueue bug in handling infinite timeouts Calls to kevent() need to pass NULL when there is no timeout.	2007-04-09 16:25:46 +02:00
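The rule stated in cd5ce2a514 can be captured in a small portable helper. ms_to_timeout() is a hypothetical name, and treating a negative value as "infinite" is an assumption made for this sketch; the returned pointer is what would be passed as the last argument of kevent().

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Converts a wait time in milliseconds into the timeout argument
 * expected by kevent(). A negative value (assumed here to mean
 * "wait forever") yields NULL instead of a filled timespec. */
static const struct timespec *ms_to_timeout(int wait_ms, struct timespec *buf)
{
    if (wait_ms < 0)
        return NULL;  /* NULL makes kevent() block indefinitely */
    buf->tv_sec = wait_ms / 1000;
    buf->tv_nsec = (long)(wait_ms % 1000) * 1000000L;
    return buf;
}
```

Passing a zeroed timespec instead of NULL would make kevent() return immediately, which is the opposite of an infinite wait.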
Willy Tarreau	63455a9be5	[MINOR] use 'is_set' instead of 'isset' in struct poller 'isset' was defined as a macro in /usr/include/sys/param.h, and it breaks build on at least OpenBSD.	2007-04-09 15:34:49 +02:00
Willy Tarreau	69801b8e77	[MINOR] removed proto/polling.h which was not used anymore	2007-04-09 15:28:51 +02:00
Willy Tarreau	1e63130a37	[MAJOR] implemented support for FreeBSD's kqueue() polling mechanism It has not been tested yet, but at least it builds.	2007-04-09 12:03:06 +02:00

40 Commits