haproxy

mirror of http://git.haproxy.org/git/haproxy.git/ synced 2024-12-27 07:02:11 +00:00

Author	SHA1	Message	Date
Willy Tarreau	3e12de2cc6	CLEANUP: tcp: make use of sock_accept_conn() where relevant This allows to get rid of two getsockopt(SO_ACCEPTCONN).	2020-10-13 18:15:33 +02:00
Willy Tarreau	cc8b653483	MINOR: sockpair: implement the .rx_listening function For socket pairs we don't rely on a real listening socket but we need to have a properly connected UNIX stream socket. This is what the new sockpair_accept_conn() tries to report. Some corner cases like half shutdown will still not be detected but that should be sufficient for most cases we really care about.	2020-10-13 18:15:33 +02:00
Willy Tarreau	29185140db	MINOR: protocol: make proto_tcp & proto_uxst report listening sockets Now we introduce a new .rx_listening() function to report if a receiver is actually a listening socket. The reason for this is to help detect shared sockets that might have been broken by sibling processes.	2020-10-13 18:15:33 +02:00
Willy Tarreau	5ced3e8879	MINOR: sock: add sock_accept_conn() to test a listening socket At several places we need to check if a socket is still valid and still willing to accept connections. Instead of open-coding this, each time, let's add a new function for this.	2020-10-13 18:15:33 +02:00
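The check described in the commit above can be illustrated with a minimal sketch. It assumes the test boils down to asking the kernel for `SO_ACCEPTCONN`, which is the option the follow-up cleanup commit mentions; the helper name `accepts_connections` is hypothetical and Python is used only for brevity, since HAProxy's own `sock_accept_conn()` is C code that is not reproduced here.

```python
import socket


def accepts_connections(sock: socket.socket) -> bool:
    """Return True if the kernel reports this socket as listening.

    Hypothetical helper: it only illustrates the SO_ACCEPTCONN query,
    not HAProxy's actual sock_accept_conn() implementation.
    """
    try:
        # SO_ACCEPTCONN is non-zero once listen() has been called.
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_ACCEPTCONN) != 0
    except OSError:
        # A closed or otherwise invalid descriptor cannot accept anything.
        return False
```

Centralizing the query in one function lets every caller treat an invalid socket and a non-listening socket the same way, which is the point the commit message makes about not open-coding the test.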
Willy Tarreau	8b6fc3d10e	MINOR: proto-tcp: make use of connect(AF_UNSPEC) for the pause Currently the suspend/resume mechanism for listeners only works on Linux and we resort to a number of tricks involving shutdown+listen+shutdown to try to detect failures on other operating systems that do not support it. But on Linux connect(AF_UNSPEC) also works pretty well and is much cleaner. It still doesn't work on other operating systems but the error is easier to detect and appears safer. So let's switch to this.	2020-10-13 18:15:33 +02:00
Willy Tarreau	7c9f756dcc	MINOR: fd: report an error message when failing initial allocations When starting with a huge maxconn (say 1 billion), the only error seen is "No polling mechanism available". This doesn't help at all to resolve the problem. Let's add specific alerts for the failed mallocs. Now we can get this instead: [ALERT] 286/154439 (23408) : Not enough memory to allocate 2000000033 entries for fdtab! This may be backported as far as 2.0 as it helps debugging bad configurations.	2020-10-13 18:15:33 +02:00
Willy Tarreau	b1e600c9c5	BUG/MINOR: mux-h2: do not stop outgoing connections on stopping There are reports of a few "SC" in logs during reloads when H2 is used on the backend side. Christopher analysed this as being caused by the proxy disabled test in h2_process(). As the comment says, this was done for frontends only, and must absolutely not send a GOAWAY to the backend, as all it will result in is to make newly queued streams fail. The fix consists in simply testing the connection side before deciding to send the GOAWAY. This may be backported as far as 2.0, though for whatever reason it seems to manifest itself only since 2.2 (probably due to changes in the outgoing connection setup sequence).	2020-10-13 18:15:33 +02:00
Willy Tarreau	2bd0f8147b	BUG/MINOR: init: only keep rlim_fd_cur if max is unlimited On some operating systems, RLIM_INFINITY is set to -1 so that when the hard limit on the number of FDs is set to unlimited, taking the MAX of both values keeps rlim_fd_cur and everything works. But on other systems this value is defined as the highest positive integer. This is what was observed on a 32-bit AIX 5.1. The effect is that maxsock becomes 2^31-1 and that fdtab allocation fails. Note that a simple workaround consists in manually setting maxconn in the global section. Let's ignore unlimited as soon as we retrieve rlim_fd_max so that all systems behave consistently. This may be backported as far as 2.0, though it doesn't seem like it has annoyed anyone.	2020-10-13 15:36:08 +02:00
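The arithmetic behind this fix can be sketched as follows. This is an illustration under stated assumptions, not the code from the commit: the function name `effective_fd_limit` and its parameters are hypothetical, and the platform's `RLIM_INFINITY` is passed in explicitly so both representations mentioned in the message can be compared.

```python
def effective_fd_limit(rlim_cur: int, rlim_max: int, rlim_infinity: int) -> int:
    """Pick the FD limit to size tables from (illustrative sketch).

    If the hard limit is "unlimited", ignore it and keep the soft limit,
    whatever integer the platform uses to encode RLIM_INFINITY.
    """
    if rlim_max == rlim_infinity:
        # An unlimited hard limit carries no usable number.
        return rlim_cur
    return max(rlim_cur, rlim_max)
```

With `rlim_infinity = -1`, a plain `max(1024, -1)` already yields 1024, which is why the problem went unnoticed on such systems. With `rlim_infinity = 2**31 - 1`, the plain maximum would be 2^31-1, whereas the explicit test above still returns 1024.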
Ilya Shipitsin	6b736b4476	CI: travis-ci: replace not defined SSL_LIB, SSL_INC for BoringSSL builds after `73b520b958` variables SSL_LIB, SSL_INC are not set, but still used by BoringSSL builds. That leads to error (I wish we could stop on such errors) and using stock openssl instead of boringssl	2020-10-11 21:12:33 +02:00
Willy Tarreau	b7ffe1975a	[RELEASE] Released version 2.3-dev6 Released version 2.3-dev6 with the following main changes : - REGTESTS: use "command" instead of "which" for better POSIX compatibility - BUILD: makefile: Update feature flags for OpenBSD - DOC: agent-check: fix typo in "fail" word expected reply - DOC: crt: advise to move away from cert bundle - BUG/MINOR: ssl/crt-list: exit on warning out of crtlist_parse_line() - REGTEST: fix host part in balance-uri-path-only.vtc - REGTEST: make ssl_client_samples and ssl_server_samples requiret to 2.3 - REGTEST: the iif converter test requires 2.3 - REGTEST: make agent-check.vtc require 1.8 - REGTEST: make abns_socket.vtc require 1.8 - REGTEST: make map_regm_with_backref require 1.7 - BUILD: makefile: Update feature flags for FreeBSD - OPTIM: backend/random: never queue on the server, always on the backend - OPTIM: backend: skip LB when we know the backend is full - BUILD: makefile: Fix building with closefrom() support enabled - BUILD: makefile: add an EXTRAVERSION variable to ease local naming - MINOR: tools: support for word expansion of environment in parse_line - BUILD: tools: fix minor build issue on isspace() - BUILD: makefile: Enable closefrom() support on Solaris - CLEANUP: ssl: Use structured format for error line report during crt-list parsing - MINOR: ssl: Add error if a crt-list might be truncated - MINOR: ssl: remove uneeded check in crtlist_parse_file - BUG/MINOR: Fix several leaks of 'log_tag' in init(). 
- DOC: tcp-rules: Refresh details about L7 matching for tcp-request content rules - MEDIUM: tcp-rules: Warn if a track-sc* content rule doesn't depend on content - BUG/MINOR: tcpcheck: Set socks4 and send-proxy flags before the connect call - DOC: ssl: new "cert bundle" behavior - BUG/MEDIUM: queue: make pendconn_cond_unlink() really thread-safe - CLEANUP: ssl: "bundle" is not an OpenSSL wording - MINOR: counters: fix a typo in comment - BUG/MINOR: stats: fix validity of the json schema - REORG: stats: export some functions - MINOR: stats: add stats size as a parameter for csv/json dump - MINOR: stats: hide px/sv/li fields in applet struct - REORG: stats: extract proxy json dump - REORG: stats: extract proxies dump loop in a function - MINOR: hlua: Display debug messages on stderr only in debug mode - MINOR: stats: define the concept of domain for statistics - MINOR: stats: define additional flag px cap on domain - MEDIUM: stats: add delimiter for static proxy stats on csv - MEDIUM: stats: define an API to register stat modules - MEDIUM: stats: add abstract type to store counters - MEDIUM: stats: integrate static proxies stats in new stats - MINOR: stats: support clear counters for dynamic stats - MINOR: stats: display extra proxy stats on the html page - MINOR: stats: add config "stats show modules" - MINOR: dns/stats: integrate dns counters in stats - MINOR: stats: remove for loop declaration - DOC: ssl: fix typo about ocsp files - BUG/MINOR: peers: Inconsistency when dumping peer status codes. 
- DOC: update INSTALL with supported OpenBSD / FreeBSD versions - BUG/MINOR: proto_tcp: Report warning messages when listeners are bound - CLEANUP: cache: Fix leak of cconf->c.name during config check - CLEANUP: ssl: Release cached SSL sessions on deinit - BUG/MINOR: mux-h1: Be sure to only set CO_RFL_READ_ONCE for the first read - BUG/MINOR: mux-h1: Always set the session on frontend h1 stream - MINOR: mux-h1: Don't wakeup the H1C when output buffer become available - CLEANUP: sock-unix: Remove an unreachable goto clause - BUG/MINOR: proxy: inc req counter on new syslog messages. - BUG/MEDIUM: log: old processes with log foward section don't die on soft stop. - MINOR: stats: inc req counter on listeners. - MINOR: channel: new getword and getchar functions on channel. - MEDIUM: log: syslog TCP support on log forward section. - BUG/MINOR: proxy/log: frontend/backend and log forward names must differ - DOC: re-work log forward bind statement documentation. - DOC: fix a confusing typo on a regsub example - BUILD: Add a DragonFlyBSD target - BUG/MINOR: makefile: fix a tiny typo in the target list - BUILD: makefile: Update feature flags for NetBSD - CI: travis-ci: help Coverity to detect BUG_ON() as a real stop - DOC: Add missing stats fields in the management doc - BUG/MEDIUM: mux-fcgi: Don't handle pending read0 too early on streams - BUG/MEDIUM: mux-h2: Don't handle pending read0 too early on streams - DOC: Fix typos in configuration.txt - BUG/MINOR: http: Fix content-length of the default 500 error - BUG/MINOR: http-htx: Expect no body for 204/304 internal HTTP responses - REGTESTS: mark abns_socket as broken - MEDIUM: fd: always wake up one thread when enabling a foreing FD - MEDIUM: listeners: don't bounce listeners management between queues - MEDIUM: init: stop disabled proxies after initializing fdtab - MEDIUM: listeners: make unbind_listener() converge if needed - MEDIUM: deinit: close all receivers/listeners before scanning proxies - MEDIUM: listeners: remove 
the now unused ZOMBIE state - MINOR: listeners: do not uselessly try to close zombie listeners in soft_stop() - CLEANUP: proxy: remove the first_to_listen hack in zombify_proxy() - MINOR: listeners: introduce listener_set_state() - MINOR: proxy: maintain per-state counters of listeners - MEDIUM: proxy: remove the unused PR_STFULL state - MEDIUM: proxy: remove the PR_STERROR state - MEDIUM: proxy: remove state PR_STPAUSED - MINOR: startup: don't rely on PR_STNEW to check for listeners - CLEANUP: peers: don't use the PR_ST* states to mark enabled/disabled - MEDIUM: proxy: replace proxy->state with proxy->disabled - MEDIUM: proxy: remove start_proxies() - MEDIUM: proxy: merge zombify_proxy() with stop_proxy() - MINOR: listeners: check the current listener state in pause_listener() - MINOR: listeners: check the current listener earlier state in resume_listener() - MEDIUM: listener/proxy: make the listeners notify about proxy pause/resume - MINOR: protocol: introduce protocol_{pause,resume}_all() - MAJOR: signals: use protocol_pause_all() and protocol_resume_all() - CLEANUP: proxy: remove the now unused pause_proxies() and resume_proxies() - MEDIUM: proto_tcp: make the pause() more robust in multi-process - BUG/MEDIUM: listeners: correctly report pause() errors - MINOR: listeners: move fd_stop_recv() to the receiver's socket code - CLEANUP: protocol: remove the ->disable_all method - CLEANUP: listeners: remove unused disable_listener and disable_all_listeners - MINOR: listeners: export enable_listener() - MINOR: protocol: directly call enable_listener() from protocol_enable_all() - CLEANUP: protocol: remove the ->enable_all method - CLEANUP: listeners: remove the now unused enable_all_listeners() - MINOR: protocol: rename the ->listeners field to ->receivers - MINOR: protocol: replace ->pause(listener) with ->rx_suspend(receiver) - MINOR: protocol: implement an ->rx_resume() method - MINOR: listener: use the protocol's ->rx_resume() method when available - MINOR: sock: 
provide a set of generic enable/disable functions - MINOR: protocol: add a new pair of rx_enable/rx_disable methods - MINOR: protocol: add a new pair of enable/disable methods for listeners - MEDIUM: listeners: now use the listener's ->enable/disable - MINOR: listeners: split delete_listener() in two versions - MINOR: listeners: count unstoppable jobs on creation, not deletion - MINOR: listeners: add a new stop_listener() function - MEDIUM: proxy: make stop_proxy() now use stop_listener() - MEDIUM: proxy: add mode PR_MODE_PEERS to flag peers frontends - MEDIUM: proxy: centralize proxy status update and reporting - MINOR: protocol: add protocol_stop_now() to instant-stop listeners - MEDIUM: proxy: make soft_stop() stop most listeners using protocol_stop_now() - MEDIUM: udp: implement udp_suspend() and udp_resume() - MINOR: listener: add a few BUG_ON() statements to detect inconsistencies - MEDIUM: listeners: always close master vs worker listeners - BROKEN/MEDIUM: listeners: rework the unbind logic to make it idempotent - MEDIUM: listener: let do_unbind_listener() decide whether to close or not - CLEANUP: listeners: remove the do_close argument to unbind_listener() - MINOR: listeners: move the LI_O_MWORKER flag to the receiver - MEDIUM: receivers: add an rx_unbind() method in the protocols - MINOR: listeners: split do_unbind_listener() in two - MEDIUM: listeners: implement protocol level ->suspend/resume() calls - MEDIUM: config: mark "grace" as deprecated - MEDIUM: config: remove the deprecated and dangerous global "debug" directive - BUG/MINOR: proxy: respect the proper format string in sig_pause/sig_listen - MINOR: peers: heartbeat, collisions and handshake information for "show peers" command. - BUILD: makefile: Enable getaddrinfo() on OS/X	2020-10-10 10:45:13 +02:00
Brad Smith	ad5afbafea	BUILD: makefile: Enable getaddrinfo() on OS/X Enable getaddrinfo() on OS/X.	2020-10-10 10:09:29 +02:00
Frédéric Lécaille	3fc0fe05fd	MINOR: peers: heartbeat, collisions and handshake information for "show peers" command. This patch adds "coll" new counter and the heartbeat timer values to "show peers" command. It also adds the elapsed time since the last handshake to new "last_hdshk" new peer dump field.	2020-10-09 20:59:58 +02:00
Willy Tarreau	0a002df2c2	BUG/MINOR: proxy: respect the proper format string in sig_pause/sig_listen When factoring out the pause/resume error messages in commit `775e00158` ("MAJOR: signals: use protocol_pause_all() and protocol_resume_all()") I forgot that ha_warning() and send_log() take a format string and not just a const string. No backport is needed, this is 2.3-dev.	2020-10-09 19:26:27 +02:00
Willy Tarreau	ccf429960b	MEDIUM: config: remove the deprecated and dangerous global "debug" directive This one was scheduled for removal in 2.3 since 2.2-dev3 by commit `1b85785bc` ("MINOR: config: mark global.debug as deprecated"). Let's remove it now. It remains totally possible to use -d on the command line though.	2020-10-09 19:18:45 +02:00
Willy Tarreau	ab0a5192a8	MEDIUM: config: mark "grace" as deprecated This was introduced 15 years ago or so to delay the stopping of some services so that a monitoring device could detect its port being down before services were stopped. Since then, clean reloads were implemented and this doesn't cope well with reload at all, preventing the new process from seamlessly binding, and forcing processes to coexist with half-baked configurations. Now it has become a real problem because there's a significant code portion in the proxies that is solely dedicated to this obsolete feature, and dealing with its special cases eases the introduction of bugs in other places so it's about time that it goes. We could tentatively schedule its removal for 2.4 with a hard deadline for 2.5 in any case.	2020-10-09 19:07:01 +02:00
Willy Tarreau	e03204c8e1	MEDIUM: listeners: implement protocol level ->suspend/resume() calls Now we have ->suspend() and ->resume() for listeners at the protocol level. This means that it now becomes possible for a protocol to redefine its own way to suspend and resume. The default functions are provided for TCP, UDP and unix, and they are pass-through to the receiver equivalent as it used to be till now. Nothing was defined for sockpair since it does not need to suspend/resume during reloads, hence it will succeed.	2020-10-09 18:44:37 +02:00
Willy Tarreau	7b2febde1d	MINOR: listeners: split do_unbind_listener() in two The inner part now goes into the protocol and is used to decide how to unbind a given protocol's listener. The existing code which is able to also unbind the receiver was provided as a default function that we currently use everywhere. Some complex listeners like QUIC will use this to decide how to unbind without impacting existing connections, possibly by setting up other incoming paths for the traffic.	2020-10-09 18:44:37 +02:00
Willy Tarreau	f58b8db47b	MEDIUM: receivers: add an rx_unbind() method in the protocols This is used as a generic way to unbind a receiver at the end of do_unbind_listener(). This allows to considerably simplify that function since we can now let the protocol perform the cleanup. The generic code was moved to sock.c, along with the conditional rx_disable() call. Now the code also supports that the ->disable() function of the protocol which acts on the listener performs the close itself and adjusts the RX_F_BOUND flag accordingly.	2020-10-09 18:44:36 +02:00
Willy Tarreau	18c20d28d7	MINOR: listeners: move the LI_O_MWORKER flag to the receiver This listener flag indicates whether the receiver part of the listener is specific to the master or to the workers. In practice it's only used by the master's CLI right now. It's used to know whether or not the FD must be closed before forking the workers. For this reason it's way more of a receiver's property than a listener's property, so let's move it there under the name RX_F_MWORKER. The rest of the code remains unchanged.	2020-10-09 18:43:05 +02:00
Willy Tarreau	75c98d166e	CLEANUP: listeners: remove the do_close argument to unbind_listener() And also remove it from its callers. This subtle distinction was added as sort of a hack for the seamless reload feature but is not needed anymore since do_close became unused with the previous commit ("MEDIUM: listener: let do_unbind_listener() decide whether to close or not"). This also removes the unbind_listener_no_close() function.	2020-10-09 18:41:56 +02:00
Willy Tarreau	374e9af358	MEDIUM: listener: let do_unbind_listener() decide whether to close or not The listener contains all the information needed to decide to close on unbind or not. The rule is the following (when we're not stopping): - worker process unbinding from a worker's FD with socket transfer enabled => keep - master process unbinding from a master's inherited FD => keep - master process unbinding from a master's FD => close - master process unbinding from a worker's FD => close - worker process unbinding from a master's FD => close - worker process unbinding from a worker's FD => close Let's translate that into the function and stop using the do_close argument that is a bit obscure for callers. It was not yet removed to ease code testing.	2020-10-09 18:41:48 +02:00
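The six rules listed in the commit above reduce to two "keep" cases, with everything else closing. The sketch below encodes that table for the non-stopping case only; the function name and its boolean parameters are hypothetical and do not correspond to HAProxy's actual flags or structures.

```python
def should_close_on_unbind(
    process_is_master: bool,
    fd_is_master: bool,
    fd_inherited: bool = False,
    socket_transfer: bool = False,
) -> bool:
    """Decide whether unbinding must close the FD (illustrative sketch).

    Mirrors the rule table from the commit message, for the case where
    the process is not stopping.
    """
    # Worker unbinding from a worker's FD with socket transfer enabled => keep
    if not process_is_master and not fd_is_master and socket_transfer:
        return False
    # Master unbinding from a master's inherited FD => keep
    if process_is_master and fd_is_master and fd_inherited:
        return False
    # Every other combination => close
    return True
```

Because the decision depends only on properties of the process and of the listener, callers no longer need to pass a separate `do_close` hint, which is the simplification the commit describes.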
Willy Tarreau	87acd4e848	BROKEN/MEDIUM: listeners: rework the unbind logic to make it idempotent BROKEN: the failure rate on reg-tests/seamless-reload/abns_socket.vtc has significantly increased for no obvious reason. It fails 99% of the time vs 10% before. do_unbind_listener() is not logical and is not even idempotent. It must not touch the fd if already -1, which also means not touch the receiver. In addition, when performing a partial stop on a socket (not closing), we know the socket remains in the listening state yet it's marked as LI_ASSIGNED, which is confusing as it doesn't translate its real state. With this change, we make sure that FDs marked for close end up in ASSIGNED state and that those which are really bound and on which a listen() was made (i.e. not pause) remain in LISTEN state. This is what is closest to reality. Ideally this function should become a default proto->unbind() one but it may still keep a bit too much state logic to become generalized to other protocols (e.g. QUIC).	2020-10-09 18:29:04 +02:00
Willy Tarreau	d6afb53bdc	MEDIUM: listeners: always close master vs worker listeners Right now in enable_listener(), we used to start all enabled listeners then kill from the workers those that were for the master. But this is incomplete. We must also close from the master the listeners that are solely for workers, and do it before we even start them. Otherwise we end up with a master responding to the worker CLI connections if the listener remains in listen mode to translate the socket's real state. It doesn't seem like it could have caused bugs in the past because we used to aggressively mark disabled listeners as LI_ASSIGNED despite the fact that they were still bound and listening. If this patch were ever seen as a candidate solution for any obscure bug, be careful in that it subtly relies on the fact that fd_delete() doesn't close inherited FDs anymore, otherwise that could break the master's ability to pass inherited FDs on reloads.	2020-10-09 18:29:04 +02:00
Willy Tarreau	95a3460739	MINOR: listener: add a few BUG_ON() statements to detect inconsistencies We must not have an fd==-1 when switching to certain states. This will later disappear but for now it helps detecting inconsistencies.	2020-10-09 18:29:04 +02:00
Willy Tarreau	e122dc5316	MEDIUM: udp: implement udp_suspend() and udp_resume() In Linux kernel's net/ipv4/udp.c there's a udp_disconnect() function which is called when connecting to AF_UNSPEC, and which unhashes a "connection". This property, which is also documented in connect(2) both in Linux and Open Group's man pages for datagrams, is interesting because it allows to reverse a connect() which is in fact a filter on the source. As such we can suspend a receiver by making it connect to itself, which will cause it not to receive any traffic anymore, letting a new one receive it all, then resume it by breaking this connection. This was tested to work well on Linux, other operating systems should also be tested. Before this, sending a SIGTTOU to a process having a UDP syslog forwarder would cause this error: [WARNING] 280/194249 (3268) : Paused frontend GLOBAL. [WARNING] 280/194249 (3268) : Some proxies refused to pause, performing soft stop now. [WARNING] 280/194249 (3268) : Proxy GLOBAL stopped (cumulated conns: FE: 0, BE: 0). [WARNING] 280/194249 (3268) : Proxy sylog-loadb stopped (cumulated conns: FE: 0, BE: 0). With this change, it now proceeds just like with TCP listeners: [WARNING] 280/195503 (3885) : Paused frontend GLOBAL. [WARNING] 280/195503 (3885) : Paused frontend sylog-loadb. And SIGTTIN also works: [WARNING] 280/195507 (3885) : Resumed frontend GLOBAL. [WARNING] 280/195507 (3885) : Resumed frontend sylog-loadb. On Linux this also works with TCP listeners (which can then be resumed using listen()) and established TCP sockets (which we currently kill using setsockopt(so_linger)), both not being portable on other OSes. UNIX sockets and ABNS sockets do not support it however (connect always fails). This needs to be further explored to see if other OSes might benefit from this to perform portable and reliable resets particularly on the backend side.	2020-10-09 18:29:04 +02:00
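The suspend/resume trick described in the commit above can be tried from user space with a small sketch. It assumes Linux, and the helper names `udp_suspend` and `udp_resume` are only borrowed from the commit title for readability: this is not HAProxy's C implementation. Python's `socket.connect()` cannot express an `AF_UNSPEC` address, so the sketch calls the C library's `connect()` through `ctypes`.

```python
import ctypes
import socket
import struct

# Handle on the C library, needed to call connect() with an AF_UNSPEC address.
_libc = ctypes.CDLL(None, use_errno=True)


def udp_suspend(sock: socket.socket) -> None:
    """Connect a bound UDP socket to its own address.

    connect() on a datagram socket acts as a filter on the source, so
    the socket stops receiving traffic sent by other peers.
    """
    sock.connect(sock.getsockname())


def udp_resume(sock: socket.socket) -> None:
    """Dissolve the association by connecting to AF_UNSPEC."""
    # A 16-byte generic sockaddr whose family field is AF_UNSPEC.
    addr = struct.pack("H14x", socket.AF_UNSPEC)
    if _libc.connect(sock.fileno(), addr, len(addr)) != 0:
        err = ctypes.get_errno()
        raise OSError(err, "connect(AF_UNSPEC) failed")
```

While the socket is suspended, datagrams from other senders are no longer delivered to it, so another socket bound to the same address could pick them up; after `udp_resume()` the original socket receives again. The sketch assumes the socket was bound to an explicit address and port, since on Linux an automatically assigned port may be released when the association is dissolved.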
Willy Tarreau	626f3a7beb	MEDIUM: proxy: make soft_stop() stop most listeners using protocol_stop_now() One difficulty in soft-stopping is to make sure not to forget unlisted listeners. By first doing a pass using protocol_stop_now() we catch the vast majority of them. The few remaining ones are the ones belonging to a proxy having a grace period. For these ones, the proxy will arm its stop_time timer and emit a log message. Since neither UDP listeners nor peers use the grace period, we can already get rid of the special cases there since we know they will have been stopped by the protocols.	2020-10-09 18:29:04 +02:00
Willy Tarreau	02e8557e88	MINOR: protocol: add protocol_stop_now() to instant-stop listeners This will instantly stop all listeners except those which belong to a proxy configured with a grace time. This means that UDP listeners, and peers will also be stopped when called this way.	2020-10-09 18:29:04 +02:00
Willy Tarreau	acde152175	MEDIUM: proxy: centralize proxy status update and reporting There are multiple ways a proxy may switch to the disabled state, but now it's essentially once it loses its last listener. Instead of keeping duplicate code around and reporting the state change before actually seeing it, we now report it at the moment it's performed (from the last listener leaving) which allows to remove the message from all other places.	2020-10-09 18:29:04 +02:00
Willy Tarreau	a389c9e1e3	MEDIUM: proxy: add mode PR_MODE_PEERS to flag peers frontends For now we cannot easily distinguish a peers frontend from another one, which will be problematic to avoid reporting them when stopping their listeners. Let's add PR_MODE_PEERS for this. It's not supposed to cause any issue since all non-HTTP proxies are handled similarly now.	2020-10-09 18:28:21 +02:00
Willy Tarreau	322b9b94e9	MEDIUM: proxy: make stop_proxy() now use stop_listener() The function will stop the listeners using this method, which in turn will ping back once it finishes disabling the proxy.	2020-10-09 18:28:18 +02:00
Willy Tarreau	caa7df1296	MINOR: listeners: add a new stop_listener() function This function will be used to definitely stop a listener (e.g. during a soft_stop). This is actually tricky because it may be called for a proxy or for a protocol, both of which require locks and already hold some. The function takes booleans indicating which ones are already held, hoping this will be enough. It's not well defined whether proto->disable() and proto->rx_disable() are supposed to be called with any lock held, and they are used from do_unbind_listener() with all these locks. Some back annotations ought to be added on this point. The proxy's listeners count is updated, and the proxy is marked as disabled and woken up after the last one is gone. Note that a listener in listen state is already not attached anymore since it was disabled.	2020-10-09 18:27:48 +02:00
Willy Tarreau	455585e3cd	MINOR: listeners: count unstoppable jobs on creation, not deletion We have to count unstoppable jobs which correspond to worker sockpairs, in order to know when to count. However the way it's currently done is quite awkward because these are counted when stopping making the stop mechanism non-idempotent. This is definitely something we want to fix before stopping by protocol or our listeners count will quickly go wrong. Now they are counted when the listeners are created.	2020-10-09 18:25:14 +02:00
Willy Tarreau	b4c083f5bf	MINOR: listeners: split delete_listener() in two versions We'll need an already locked variant of this function so let's make __delete_listener() which will be called with the protocol lock held and the listener's lock held.	2020-10-09 11:27:30 +02:00
Willy Tarreau	4b51f42899	MEDIUM: listeners: now use the listener's ->enable/disable At each place we used to manipulate the FDs directly we can now call the listener protocol's enable/disable/rx_enable/rx_disable depending on whether the state changes on the listener or the receiver. One exception currently remains in listener_accept() which is a bit special and which should be split into 2 or 3 parts in the various protocol layers. The test of fd_updt in do_unbind_listener() that was added by commit `a51885621` ("BUG/MEDIUM: listeners: Don't call fd_stop_recv() if fd_updt is NULL.") could finally be removed since that part is correctly handled in the low-level disable() function. One disable() was added in resume_listener() before switching to LI_FULL because rx_resume() enables polling on the FD for the receiver while we want to disable it if the listener is full. There are different ways to clean this up in the future. One of them could be to consider that TCP receivers only act at the listener level. But in fact it does not translate reality. The reality is that only the receiver is paused and that the listener's state ought not be affected here. Ultimately the resume_listener() function should be split so that the part controlled by the protocols only acts on the receiver, and that the receiver itself notifies the upper listener about the change so that the listener protocol may decide to disable or enable polling. Conversely the listener should automatically update its receiver when they share the same state. Since there is no harm proceeding like this, let's keep this for now.	2020-10-09 11:27:30 +02:00
Willy Tarreau	5ddf1ce9c4	MINOR: protocol: add a new pair of enable/disable methods for listeners These methods will be used to enable/disable accepting new connections so that listeners do not play with FD directly anymore. Since all the currently supported protocols work on socket for now, these are identical to the rx_enable/rx_disable functions. However they were not defined in sock.c since it's likely that some will quickly start to differ. At the moment they're not used. We have to take care of fd_updt before calling fd_{want,stop}_recv() because it's allocated fairly late in the boot process and some such functions may be called very early (e.g. to stop a disabled frontend's listeners).	2020-10-09 11:27:30 +02:00
Willy Tarreau	686fa3db50	MINOR: protocol: add a new pair of rx_enable/rx_disable methods These methods will be used to enable/disable rx at the receiver level so that callers don't play with FDs directly anymore. All our protocols use the generic ones from sock.c at the moment. For now they're not used.	2020-10-09 11:27:30 +02:00
Willy Tarreau	e70c7977f2	MINOR: sock: provide a set of generic enable/disable functions These will be used on receivers, to enable or disable receiving on a listener, which most of the time just consists in enabling/disabling the file descriptor. We have to take care of the existence of fd_updt to know if we may or not call fd_{want,stop}_recv() since it's not permitted in very early boot.	2020-10-09 11:27:30 +02:00
Willy Tarreau	010fe151ce	MINOR: listener: use the protocol's ->rx_resume() method when available Instead of calling listen() for IPPROTO_TCP in resume_listener(), let's call the protocol's ->rx_resume() method when defined, which does the same. This removes another hard-dependency on the fd and underlying protocol from the generic functions.	2020-10-09 11:27:30 +02:00
Willy Tarreau	58e6b71bb0	MINOR: protocol: implement an ->rx_resume() method This one undoes ->rx_suspend(), it tries to restore an operational socket. It was only implemented for TCP since it's the only one we support right now.	2020-10-09 11:27:30 +02:00
Willy Tarreau	cb66ea60cf	MINOR: protocol: replace ->pause(listener) with ->rx_suspend(receiver) The ->pause method is inappropriate since it doesn't exactly "pause" a listener but rather temporarily disables it so that it's not visible at all to let another process take its place. The term "suspend" is more suitable, since the "pause" is actually what we'll need to apply to the FULL and LIMITED states which really need to make a pause in the accept process. And it goes well with the use of the "resume" function that will also need to be made per-protocol. Let's rename the function and make it act on the receiver since it's already what it essentially does, hence the prefix "_rx" to make it more explicit. The protocol struct was a bit reordered because it was becoming a real mess between the parts related to the listeners and those for the receivers.	2020-10-09 11:27:30 +02:00
Willy Tarreau	d7f331c8b8	MINOR: protocol: rename the ->listeners field to ->receivers Since the listeners were split into receiver+listener, this field ought to have been renamed because it's confusing. It really links receivers and not listeners, as most of the time it's used via rx.proto_list! The nb_listeners field was updated accordingly.	2020-10-09 11:27:30 +02:00
Willy Tarreau	dae0692717	CLEANUP: listeners: remove the now unused enable_all_listeners() It's not used anymore since previous commit. The good thing is that no more listener function now directly acts on a protocol.	2020-10-09 11:27:30 +02:00
Willy Tarreau	078e1c7102	CLEANUP: protocol: remove the ->enable_all method It's not used anymore, now the listeners are enabled from protocol_enable_all().	2020-10-09 11:27:30 +02:00
Willy Tarreau	5b95ae6b32	MINOR: protocol: directly call enable_listener() from protocol_enable_all() protocol_enable_all() calls proto->enable_all() for all protocols, which is always equal to enable_all_listeners() which in turn simply is a generic loop calling enable_listener() always returning ERR_NONE. Let's clean this madness by first calling enable_listener() directly from protocol_enable_all().	2020-10-09 11:27:30 +02:00
Willy Tarreau	7834a3f70f	MINOR: listeners: export enable_listener() we'll soon call it from outside.	2020-10-09 11:27:30 +02:00
Willy Tarreau	d008009958	CLEANUP: listeners: remove unused disable_listener and disable_all_listeners These ones have never been called, they were referenced by the protocol's disable_all for some protocols but there are no traces of their use, so in addition to not being sure the code works, it has never been tested. Let's remove a bit of complexity starting from there.	2020-10-09 11:27:30 +02:00
Willy Tarreau	fb4ead8e8a	CLEANUP: protocol: remove the ->disable_all method This one has never been used, is only referenced by proto_uxst and proto_sockpair, and it's not even certain it works at all. Let's get rid of it.	2020-10-09 11:27:30 +02:00
Willy Tarreau	e53608b2cd	MINOR: listeners: move fd_stop_recv() to the receiver's socket code fd_stop_recv() has nothing to do in the generic listener code, it's per protocol as some don't need it. For instance with abns@ it could even lead to fd_stop_recv(-1). And later with QUIC we don't want to touch the fd at all! It used to be that since commit `f2cb169487` delegating fd manipulation to their respective threads it wasn't possible to call it down there but it's not the case anymore, so let's perform the action in the protocol-specific code.	2020-10-09 11:27:30 +02:00
Willy Tarreau	fb76bd5ca6	BUG/MEDIUM: listeners: correctly report pause() errors By using the same "ret" variable in the "if" block to test the return value of pause(), the second one shadows the first one and when forcing the result to zero in case of an error, it doesn't do anything. The problem is that some listeners used to fail to pause in multi-process mode and this was not reported, but their failure was automatically resolved by the last process to pause. By properly checking for errors we might now possibly report a race once in a while so we may have to roll this back later if some users meet it. The test on ==0 is wrong too since technically speaking a total stop validates the need for a pause, but stops the listener so it's just the resume that won't work anymore. We could switch to stopped but it's an involuntary switch and the user will not know. Better then mark it as paused and let the resume continue to fail so that only the resume will eventually report an error (e.g. abns@). This must not be backported as there is a risk of side effect by fixing this bug, given that it hides other bugs itself.	2020-10-09 11:27:30 +02:00
Willy Tarreau	91c614dd0e	MEDIUM: proto_tcp: make the pause() more robust in multi-process In multi-process, the TCP pause is very brittle and we never noticed it because the error was lost in the upper layers. The problem is that shutdown() may fail if another process already did it, and will cause a process to fail to pause. What we do here in case of error is that we double-check the socket's state to verify if it's still accepting connections, and if not, we can conclude that another process already did the job in parallel. The difficulty here is that we're trying to eliminate false positives where some OSes will silently report a success on shutdown() while they don't shut the socket down, hence this dance of shutw/listen/shutr that only keeps the compatible ones. Probably that a new approach relying on connect(AF_UNSPEC) would provide better results.	2020-10-09 11:27:30 +02:00


13029 Commits