haproxy

mirror of http://git.haproxy.org/git/haproxy.git/ synced 2024-12-12 22:44:32 +00:00

Author	SHA1	Message	Date
Christopher Faulet	85db3212b8	MINOR: spoe: Use the sample context to pass frag_ctx info during encoding This simplifies the API and hide the details in the sample. This way, only string and binary are aware of these info, because other types cannot be partially encoded. This patch may be backported to 1.9 and 1.8.	2019-04-29 16:02:05 +02:00
Kevin Zhu	f7f54280c8	BUG/MEDIUM: spoe: arg len encoded in previous frag frame but len changed Fragmented arg will do fetch at every encode time, each fetch may get different result if SMP_F_MAY_CHANGE, for example res.payload, but the length already encoded in first fragment of the frame, that will cause SPOA decode failed and waste resources. This patch must be backported to 1.9 and 1.8.	2019-04-29 16:02:05 +02:00
Christopher Faulet	1907ccc2f7	BUG/MINOR: http: Call stream_inc_be_http_req_ctr() only one time per request The function stream_inc_be_http_req_ctr() is called at the beginning of the analysers AN_REQ_HTTP_PROCESS_FE/BE. It as an effect only on the backend. But we must be careful to call it only once. If the processing of HTTP rules is interrupted in the middle, when the analyser is resumed, we must not call it again. Otherwise, the tracked counters of the backend are incremented several times. This bug was reported in github. See issue #74. This fix should be backported as far as 1.6.	2019-04-29 16:01:47 +02:00
Willy Tarreau	97215ca284	BUG/MEDIUM: mux-h2: properly deal with too large headers frames In h2c_decode_headers(), now that we support CONTINUATION frames, we try to defragment all pending frames at once before processing them. However if the first is exactly full and the second cannot be parsed, we don't detect the problem and we wait for the next part forever due to an incorrect check on exit; we must abort the processing as soon as the current frame remains full after defragmentation as in this case there is no way to make forward progress. Thanks to Yves Lafon for providing traces exhibiting the problem. This must be backported to 1.9.	2019-04-29 10:20:21 +02:00
David CARLIER	4de0eba848	MEDIUM: da: HTX mode support. The DeviceAtlas module now can support both the legacy mode and the new HTX's with the known set of support headers for the latter.	2019-04-26 17:06:32 +02:00
David Carlier	0470d704a7	BUILD/MEDIUM: contrib: Dummy DeviceAtlas API. Creating a "mocked" version mainly for testing purposes.	2019-04-26 17:06:32 +02:00
Willy Tarreau	4ad574fbe2	MEDIUM: streams: measure processing time and abort when detecting bugs On some occasions we've had loops happening when processing actions (e.g. a yield not being well understood) resulting in analysers being called in loops until the analysis timeout without incrementing the stream's call count, thus this type of bug cannot be caught by the current protection system. What this patch proposes is to start to measure the time spent in analysers when profiling is enabled on the thread, in order to detect if a stream is really misbehaving. In this case we measured the consumed CPU time, not the wall clock time, so as not to be affected by possible noisy neighbours sharing the same CPU. When more than 100ms are spent in an analyser, we trigger the stream_dump_and_crash() function to report the anomaly. The choice of 100ms comes from the fact that regular calls only take around 1 microsecond and it seems reasonable to accept a degradation factor of 100000, which covers very slow machines such as home gateways running on sub-ghz processors, with extremely heavy configurations. Some complete tests show that even this common bogus map_regm() entry supposedly designed to extract a port from an IP:port entry does not trigger the timeout (25 ms evaluation time for a 4kB header, exercise left to the reader to spot the mistake) : ([0-9]{0,3}).([0-9]{0,3}).([0-9]{0,3}).([0-9]{0,3}):([0-9]{0,5}) \5 However this one purposely designed to kill haproxy definitely dies as it manages to completely freeze the whole process for more than one second on a 4 GHz CPU for only 120 bytes in : (.{0,20})(.{0,20})(.{0,20})(.{0,20})(.{0,20})b \1 This protection will definitely help during the code stabilization period and may possibly be left enabled later depending on reported issues or not. If you've noticed that your workload is affected by this patch, please report it as you have very likely found a bug. And in the mean time you can turn profiling off to disable it.	2019-04-26 14:30:59 +02:00
Willy Tarreau	3d07a16f14	MEDIUM: stream/debug: force a crash if a stream spins over itself forever If a stream is caught spinning over itself at more than 100000 loops per second and for more than one second, the process will be aborted and the offender reported on the console and logs. Typical figures usually are just a few tens to hundreds per second over a very short time so there is a huge margin here. Using even higher values could also work but there is the risk of not being able to catch offenders if multiple ones start to bug at the same time and share the load. This code should ideally be disabled for stable releases, though in theory nothing should ever trigger it.	2019-04-26 13:16:14 +02:00
Willy Tarreau	dcb0e1d37d	MEDIUM: appctx/debug: force a crash if an appctx spins over itself forever If an appctx is caught spinning over itself at more than 100000 loops per second and for more than one second, the process will be aborted and the offender reported on the console and logs. Typical figures usually are just a few tens to hundreds per second over a very short time so there is a huge margin here. Using even higher values could also work but there is the risk of not being able to catch offenders if multiple ones start to bug at the same time and share the load. This code should ideally be disabled for stable releases, though in theory nothing should ever trigger it.	2019-04-26 13:15:56 +02:00
Willy Tarreau	71c07ac65a	MINOR: stream/debug: make a stream dump and crash function During 1.9 development (and even a bit after) we've started to face a significant number of situations where streams were abusively spinning due to an uncaught error flag or complex conditions that couldn't be correctly identified. Sometimes streams wake appctx up and conversely as well. More importantly when this happens the only fix is to restart. This patch adds a new function to report a serious error, some relevant info and to crash the process using abort() so that a core dump is available. The purpose will be for this function to be called in various situations where the process is unfixable. It will help detect these issues much earlier during development and may even help fixing test platforms which are able to automatically restart when such a condition happens, though this is not the primary purpose. This patch only provides the function and doesn't use it yet.	2019-04-26 13:15:56 +02:00
Willy Tarreau	5e6a5b3a6e	MINOR: connection: make the debugging helper functions safer We have various functions like conn_get_ctrl_name() to retrieve some information reported in "show sess" for debugging, which assume that the connection is valid. This is really not convenient in code aimed at debugging and is error-prone. Let's add a validity test first.	2019-04-25 18:35:49 +02:00
Willy Tarreau	5e370daa52	BUG/MINOR: proto_http: properly reset the stream's call rate on keep-alive The stream's call rate measurement was added by commit `2e9c1d296` ("MINOR: stream: measure and report a stream's call rate in "show sess"") but it forgot to reset it in case of HTTP keep-alive (legacy mode), resulting in incorrect measurements. No backport is needed, unless the patch above is backported.	2019-04-25 18:33:37 +02:00
Willy Tarreau	d5ec4bfe85	CLEANUP: standard: use proper const to addr_to_str() and port_to_str() The input parameter was not marked const, making it painful for some calls.	2019-04-25 17:48:16 +02:00
Willy Tarreau	d2d3348acb	MINOR: activity: enable automatic profiling turn on/off Instead of having to manually turn task profiling on/off in the configuration, by default it will work in "auto" mode, which automatically turns on on any thread experiencing sustained loop latencies over one millisecond averaged over the last 1024 samples. This may happen with configs using lots of regex (thing map_reg for example, which is the lazy way to convert Apache's rewrite rules but must not be abused), and such high latencies affect all the process and the problem is most often intermittent (e.g. hitting a map which is only used for certain host names). Thus now by default, with profiling set to "auto", it remains off all the time until something bad happens. This also helps better focus on the issues when looking at the logs as well as in "show sess" output. It automatically turns off when the average loop latency over the last 1024 calls goes below 990 microseconds (which typically takes a while when in idle). This patch could be backported to stable versions after a bit more exposure, as it definitely improves observability and the ability to quickly spot the culprit. In this case, previous patch ("MINOR: activity: make the profiling status per thread and not global") must also be taken.	2019-04-25 17:26:46 +02:00
Willy Tarreau	d9add3acc8	MINOR: activity: make the profiling status per thread and not global In order to later support automatic profiling turn on/off, we need to have it per-thread. We're keeping the global option to know whether to turn it or on off, but the profiling status is now set per thread. We're updating the status in activity_count_runtime() which is called before entering poll(). The reason is that we'll extend this with run time measurement when deciding to automatically turn it on or off.	2019-04-25 17:26:19 +02:00
Willy Tarreau	d636675137	BUG/MINOR: activity: always initialize the profiling variable It happens it was only set if present in the configuration. It's harmless anyway but can still cause doubts when comparing logs and configurations so better correctly initialize it. This should be backported to 1.9.	2019-04-25 17:26:19 +02:00
Willy Tarreau	a0abc8f2be	BUILD: travis: remove the "allow_failures" entry Now that OSX passes all regtests as well, remove this temporary entry.	2019-04-25 08:58:02 +02:00
Willy Tarreau	084354f0be	REGTEST: exclude OSX and generic targets from abns_socket.vtc This one relies on Linux's abstract namespace sockets which are not available there. FreeBSD used to already be excluded.	2019-04-25 08:50:25 +02:00
Willy Tarreau	4fd376d51d	REGTEST: relax the IPv6 address format checks in converters_ipmask_concat_strcmp_field_word In Travis build https://travis-ci.com/haproxy/haproxy/jobs/195477767 we can see that OSX tends to pad zeroes at a different position than Linux in compact IPv6 addresses, resulting in a failure in the checks which were developped on Linux. This patch uses [0:]* in holes and [0:]+ at the end of addresses to allow the different variants. It will unfortunately also accept impossible addresses but there is no reason that we have to care about for such crap to be emitted.	2019-04-25 08:47:15 +02:00
Willy Tarreau	03c6ab0cbb	REGTEST: exclude osx and generic targets for 40be_2srv_odd_health_checks As explained in the commit below, this test relies on TCP_DEFER_ACCEPT which is not available everywhere, and as such fails on OSX as well : `15685c791` ("REGTEST: Exclude freebsd target for some reg tests.")	2019-04-25 08:39:48 +02:00
Tim Duesterhus	88c63a6e55	BUILD: extend Travis CI config to support more platforms This commit extends the Travis CI configuration to build HAProxy with gcc on Linux, clang on Mac and cleans up the build flag configuration to be easier extendable. Note: At the moment HAProxy fails on Travis for configurations on OS X	2019-04-25 08:24:29 +02:00
Willy Tarreau	22d63a24d9	MINOR: applet: measure and report an appctx's call rate in "show sess" Very similarly to previous commit doing the same for streams, we now measure and report an appctx's call rate. This will help catch applets which do not consume all their data and/or which do not properly report that they're waiting for something else. Some of them like peers might theorically be able to exhibit some occasional peeks when teaching a full table to a nearby peer (e.g. the new replacement process), but nothing close to what a bogus service can do so there is no risk of confusion.	2019-04-24 16:04:23 +02:00
Willy Tarreau	2e9c1d2960	MINOR: stream: measure and report a stream's call rate in "show sess" Quite a few times some bugs have made a stream task incorrectly handle a complex combination of events, which was often reported as "100% CPU", and was usually caused by the event not being properly identified and flushed, and the stream's handler called in loops. This patch adds a call rate counter to the stream struct. It's not huge, it's really inexpensive (especially compared to the rest of the processing function) and will easily help spot such tasks in "show sess" output, possibly even allowing to kill them. A future patch should probably consist in alerting when they're above a certain threshold, possibly sending a dump and killing them. Some options could also consist in aborting in order to get an analyzable core dump and let a service manager restart a fresh new process.	2019-04-24 16:04:23 +02:00
Willy Tarreau	0212fadd65	MINOR: tasks/activity: report the context switch and task wakeup rates It's particularly useful to spot runaway tasks to see this. The context switch rate covers all tasklet calls (tasks and I/O handlers) while the task wakeups only covers tasks picked from the run queue to be executed. High values there will indicate either an intense traffic or a bug that mades a task go wild.	2019-04-24 16:04:23 +02:00
Willy Tarreau	69b5a7f1a3	CLEANUP: task: report calls as unsigned in show sess The "show sess" output used signed ints to report the number of calls, which is confusing for runaway tasks where the call count can turn negative.	2019-04-24 16:04:23 +02:00
Christopher Faulet	4904058661	BUG/MINOR: htx: Exclude TCP proxies when the HTX mode is handled during startup When tests are performed on the HTX mode during HAProxy startup, only HTTP proxies are considered. It is important because, since the commit `1d2b586cd` ("MAJOR: htx: Enable the HTX mode by default for all proxies"), the HTX is enabled on all proxies by default. But for TCP proxies, it is "deactivated". This patch must be backported to 1.9.	2019-04-24 15:40:02 +02:00
Christopher Faulet	c1918d1a8f	BUG/MAJOR: muxes: Use the HTX mode to find the best mux for HTTP proxies only Since the commit `1d2b586cd` ("MAJOR: htx: Enable the HTX mode by default for all proxies"), the HTX is enabled by default for all proxies, HTTP and TCP, but also CLI and HEALTH proxies. But when the best mux is retrieved, only HTTP and TCP modes are checked. If the TCP mode is not explicitly set, it is considered as an HTTP proxy. It is an hidden bug introduced when the option "http-use-htx" was added. It has no effect until the commit `1d2b586cd`. But now, when a stats socket is created for the master process, the mux h1 is installed on all incoming connections to the CLI proxy, leading to segfaults because HTX operations are performed on raw buffers. So to fix the buf, when a mux is installed, all proxies are considered as TCP proxies, except HTTP ones. This way, CLI and HEALTH proxies will be handled as TCP proxies. This patch must be backported to 1.9 although it has no effect. It is safer to not keep hidden bugs.	2019-04-24 15:40:02 +02:00
Willy Tarreau	274ba67862	BUG/MAJOR: lb/threads: fix AB/BA locking issue in round-robin LB An occasional divide by zero in the round-robin scheduler was addressed in commit `9df86f997` ("BUG/MAJOR: lb/threads: fix insufficient locking on round-robin LB") by grabing the server's lock in fwrr_get_server_from_group(). But it happens that this is not the correct approach as it introduces a case of AB/BA deadlock reported by Maksim Kupriianov. This happens when a server weight changes from/to zero while another thread extracts this server from the tree. The reason is that the functions used to manipulate the state work under the server's lock and grab the LB lock while the ones used in LB work under the LB lock and grab the server's lock when needed. This commit mostly reverts the changes above and instead further completes the locking analysis performed on this code to identify areas that really need to be protected by the server's lock, since this is the only algorithm which happens to have this requirement. This audit showed that in fact all locations which require the server's lock are already protected by the LB lock. This was not noticed the first time due to the server's lock being taken instead and due to some functions misleadingly using atomic ops to modify server fields which are under the LB lock protection (these ones were now removed). The change consists in not taking the server's lock anymore here, and instead making sure that the aforementioned function which used to suffer from the server's weight becoming zero only uses a copy of the weight which was preliminary verified to be non-null (when the weight is null, the server will be removed from the tree anyway so there is no need to recalculate its position). With this change, the code survived an injection at 200k req/s split on two servers with weights changing 50 times a second. This commit must be backported to 1.9 only.	2019-04-24 14:23:40 +02:00
Olivier Houchard	a28454ee21	BUG/MEDIUM: ssl: Return -1 on recv/send if we got EAGAIN. In ha_ssl_read()/ha_ssl_write(), if we couldn't send/receive data because we got EAGAIN, return -1 and not 0, as older SSL versions expect that. This should fix the problems with OpenSSL < 1.1.0.	2019-04-24 12:06:08 +02:00
Christopher Faulet	371723b0c2	BUG/MINOR: spoe: Don't systematically wakeup SPOE stream in the applet handler This can lead to wakeups in loop between the SPOE stream and the SPOE applets waiting to receive agent messages (mainly AGENT-HELLO and AGENT-DISCONNECT). This patch must be backported to 1.9 and 1.8.	2019-04-23 21:20:47 +02:00
Christopher Faulet	5e1a9d715e	BUG/MEDIUM: stream: Fix the way early aborts on the client side are handled A regression was introduced with the commit c9aecc8ff ("BUG/MEDIUM: stream: Don't request a server connection if a shutw was scheduled"). Among other this, it breaks the CLI when the shutr on the client side is handled with the client data. To depend on the flag CF_SHUTW_NOW to not establish the server connection when an error on the client side is detected is the right way to fix the bug, because this flag may be set without any error on the client side. So instead, we abort the request where the error is handled and only when the backend stream-interface is in the state SI_ST_INI. This way, there is no ambiguity on the reason why the abort accurred. The stream-interface is also switched to the state SI_ST_CLO. This patch must be backported to 1.9. If the commit c9aecc8ff is backported to previous versions, this one MUST also be backported. Otherwise, it MAY be backported to older versions that 1.9 with caution.	2019-04-23 21:20:47 +02:00
Frédéric Lécaille	bed883abe8	BUG/MAJOR: stream: Missing DNS context initializations. Fix some missing initializations wich came with `333939c` commit (MINOR: action: new '(http-request|tcp-request content) do-resolve' action). The DNS contexts of streams which were allocated were not initialized by stream_new(). This leaded to accesses to non-allocated memory when freeing these contexts with stream_free().	2019-04-23 20:24:11 +02:00
Willy Tarreau	ca8df4c074	REGTEST: make the "run-regtests" script search for tests in reg-tests by default It happens almost daily to me that make regtests fails because the script found a temporary, old, or broken VTC file that was lying in my work dir, leaving me no place to hide it. This is a real pain as some tests take ages to fail, so let's make this script only look up for tests where they are expected to be stored, under reg-tests only. It remains possible to force the location on the command line though.	2019-04-23 16:09:50 +02:00
Frédéric Lécaille	b894f9230c	REGTEST: adapt some reg tests after renaming. Some reg tests and their dependencies have been renamed. They may be referenced by the .vtc files. So, this patch modifies also the references to these dependencies.	2019-04-23 15:37:11 +02:00
Frédéric Lécaille	d7a8f14145	REGTEST: rename the reg test files. We rename all the VTC files to avoid name collisions when importing/backporting.	2019-04-23 15:37:03 +02:00
Frédéric Lécaille	dc1a3bd999	REGTEST: replace LEVEL option by a more human readable one. This patch replaces LEVEL variable by REGTESTS_TYPES variable which is more mnemonic and human readable. It is uses as a filter to run the reg tests scripts where a commented #REGTEST_TYPE may be defined to designate their types. Running the following command: $ REGTESTS_TYPES=slow,default will start all the reg tests where REGTEST_TYPE is defines as 'slow' or 'default'. Note that 'default' is also the default value of REGTEST_TYPE when not specified dedicated to run all the current h.vtc files. When REGTESTS_TYPES is not specified there is no filter at all. All the tests are run. This patches also defines REGTEST_TYPE with 'slow' value for all the s.vtc files, 'bug' value for al the b.vtc files, 'broken' value for all the k.vtc files.	2019-04-23 15:14:52 +02:00
Frédéric Lécaille	0bad840b4d	MINOR: log: Extract some code to send syslog messages. This patch extracts the code of __send_log() responsible of sending a syslog message to a syslog destination represented as a logsrv struct to define __do_send_log() function. __send_log() calls __do_send_log() for each syslog destination of a proxy after having prepared some of its parameters.	2019-04-23 14:16:51 +02:00
Baptiste Assmann	333939c2ee	MINOR: action: new '(http-request|tcp-request content) do-resolve' action The 'do-resolve' action is an http-request or tcp-request content action which allows to run DNS resolution at run time in HAProxy. The name to be resolved can be picked up in the request sent by the client and the result of the resolution is stored in a variable. The time the resolution is being performed, the request is on pause. If the resolution can't provide a suitable result, then the variable will be empty. It's up to the admin to take decisions based on this statement (return 503 to prevent loops). Read carefully the documentation concerning this feature, to ensure your setup is secure and safe to be used in production. This patch creates a global counter to track various errors reported by the action 'do-resolve'.	2019-04-23 11:41:52 +02:00
Baptiste Assmann	0b9ce82dfa	MINOR: obj_type: new object type for struct stream This patch creates a new obj_type for the struct stream in HAProxy.	2019-04-23 11:35:56 +02:00
Baptiste Assmann	db4c8521ca	MINOR: dns: move callback affection in dns_link_resolution() In dns.c, dns_link_resolution(), each type of dns requester is managed separately, that said, the callback function is affected globaly (and points to server type callbacks only). This design prevents the addition of new dns requester type and this patch aims at fixing this limitation: now, the callback setting is done directly into the portion of code dedicated to each requester type.	2019-04-23 11:34:11 +02:00
Baptiste Assmann	dfd35fd71a	MINOR: dns: dns_requester structures are now in a memory pool dns_requester structure can be allocated at run time when servers get associated to DNS resolution (this happens when SRV records are used in conjunction with service discovery). Well, this memory allocation is safer if managed in an HAProxy pool, furthermore with upcoming HTTP action which can perform DNS resolution at runtime. This patch moves the memory management of the dns_requester structure into its own pool.	2019-04-23 11:33:48 +02:00
paulborile	cd9b9bd3e4	MINOR: contrib: dummy wurfl library This is dummy version of the Scientiamobile WURFL C API that can be used to successfully build/run haproxy compiled with USE_WURFL=1. It is marked as version 1.11.2.100 to distinguish it from any real version of the lib. It has no external dependencies so it should work out of the box by building it like this : $ make -C contrib/wurfl In order to use it, simply reference this directory as the WURFL include and library paths : $ make TARGET=<target> USE_WURFL=1 WURFL_INC=$PWD/contrib/wurfl WURFL_LIB=$PWD/contrib/wurfl	2019-04-23 11:00:23 +02:00
paulborile	7714b12604	MINOR: wurfl: enabled multithreading mode Initially excluded multithreaded mode is completely supported (libwurfl is fully MT safe). Internal tests now are run also with multithreading enabled.	2019-04-23 11:00:23 +02:00
paulborile	c81b9bf7b4	DOC: wurfl: added point of contact in MAINTAINERS file	2019-04-23 11:00:23 +02:00
paulborile	bad132c384	CLEANUP: wurfl: removed deprecated methods last 2 major releases of libwurfl included a complete review of engine options with the result of deprecating many features. The patch removes unecessary code and fixes the documentation. Can be backported on any version of haproxy. [wt: must not be backported since it removes config keywords and would thus break existing configurations] Signed-off-by: Willy Tarreau <w@1wt.eu>	2019-04-23 11:00:23 +02:00
paulborile	59d50145dc	BUILD: wurfl: build fix for 1.9/2.0 code base This applies the required changes for the new buffer API that came in 1.9. This patch must be backported to 1.9.	2019-04-23 11:00:23 +02:00
Willy Tarreau	b518823f1b	MINOR: wurfl: indicate in haproxy -vv the wurfl version in use It also explicitly mentions that the library is the dummy one when it is detected. We have this output now : $ ./haproxy -vv |grep -i wurfl Built with WURFL support (dummy library version 1.11.2.100)	2019-04-23 11:00:23 +02:00
Willy Tarreau	1426198efb	BUILD: add USE_WURFL to the list of known build options Since the removal of WURFL we've improved the build system to report known build options, let's reference this option there as well.	2019-04-23 11:00:23 +02:00
Willy Tarreau	b3cc9f2887	Revert "CLEANUP: wurfl: remove dead, broken and unmaintained code" This reverts commit `8e5e1e7bf0`. The following patches will fix this code and may be backported.	2019-04-23 10:34:43 +02:00
Emeric Brun	d0e095c2aa	MINOR: ssl/cli: async fd io-handlers printable on show fd This patch exports the async fd iohandlers and make them printable doing a 'show fd' on cli.	2019-04-19 17:27:01 +02:00
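
Commit d2d3348acb in the list above describes an "auto" profiling mode with two thresholds: profiling turns on when the loop latency averaged over the last 1024 samples exceeds one millisecond, and turns off again when that average drops below 990 microseconds. The gap between the two values gives hysteresis, so the state does not flap around a single threshold. The sketch below only models that behaviour and is not HAProxy code: the class name `AutoProfiler`, the method `record()` and the plain moving average over a fixed window are assumptions made for illustration, and the average is taken over the samples seen so far while the window is still filling.

```python
from collections import deque


class AutoProfiler:
    """Hypothetical model of the on/off decision; names are illustrative."""

    ON_THRESHOLD_US = 1000   # turn on above 1 ms average (from the commit message)
    OFF_THRESHOLD_US = 990   # turn off below 990 us average (from the commit message)
    WINDOW = 1024            # number of samples averaged (from the commit message)

    def __init__(self):
        # Assumption: a plain moving average over a fixed-size window.
        self.samples = deque(maxlen=self.WINDOW)
        self.enabled = False

    def record(self, loop_latency_us):
        """Record one loop latency sample and return the profiling state."""
        self.samples.append(loop_latency_us)
        avg = sum(self.samples) / len(self.samples)
        if not self.enabled and avg > self.ON_THRESHOLD_US:
            self.enabled = True
        elif self.enabled and avg < self.OFF_THRESHOLD_US:
            self.enabled = False
        # Averages between the two thresholds leave the state unchanged.
        return self.enabled


if __name__ == "__main__":
    profiler = AutoProfiler()
    for latency_us in [500] * 1024 + [3000] * 1024 + [100] * 1024:
        profiler.record(latency_us)
    print("profiling enabled:", profiler.enabled)
```

In this model a short latency spike is diluted by the other samples in the window and does not switch profiling on, while a sustained slowdown does, which matches the intent stated in the commit message of staying off "until something bad happens".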


9783 Commits