Upload requests carrying neither a content-length header nor tunnelled data
must be sent chunked-encoded over HTTP/1. The code was planned but somehow
forgotten during the implementation, causing such payloads to be sent as
tunnelled data.
Browsers always emit a content-length in uploads so this problem doesn't
happen for most sites. However, some applications may send data frames
after a request without having announced it earlier.
The only way to detect that a client will need to send data is that the
HEADERS frame doesn't hold the ES bit. In this case it's wise to look
for the content-length header. If it's absent, we're either in tunnel mode
(CONNECT method) or in chunked-encoding (other methods).
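In pseudo-code, the detection looks like this (an illustrative sketch; the
helper names are not the actual h2.c symbols):

    /* illustrative sketch, not the actual h2.c code */
    if (!(flags & H2_F_HEADERS_END_STREAM)) {
        /* the client will send a body after this HEADERS frame */
        if (!header_present(list, "content-length")) {
            if (method_is_connect)
                ; /* tunnel: forward the payload as-is */
            else
                /* announce the body on the HTTP/1 side */
                h1_add_header(out, "transfer-encoding", "chunked");
        }
    }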
This patch implements this.
The following request is sent using content-length :

    curl --http2 -sk https://127.0.0.1:4443/s2 -XPOST -T /large/file

and these ones using chunked-encoding :

    curl --http2 -sk https://127.0.0.1:4443/s2 -XPUT -T /large/file
    curl --http2 -sk https://127.0.0.1:4443/s2 -XPUT -T - < /dev/urandom
Thanks to Robert Samuel Newson for raising this issue with details.
This fix must be backported to 1.8.
We'll need this in order to support uploading chunks. The h2 to h1
converter checks for the presence of the content-length header field
as well as the CONNECT method and returns this information to the
caller. The caller indicates whether or not a body is detected for
the message (presence of END_STREAM or not). No transfer-encoding
header is emitted yet.
PiBa-NL reported that haproxy crashes with a segmentation fault
if a function registered using `core.register_task` returns.
An example Lua script that reproduces the bug is:
    mytask = function()
        core.Info("Stopping task")
    end
    core.register_task(mytask)
The Valgrind output is as follows:
==6759== Process terminating with default action of signal 11 (SIGSEGV)
==6759== Access not within mapped region at address 0x20
==6759== at 0x5B60AA9: lua_sethook (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
==6759== by 0x430264: hlua_ctx_resume (hlua.c:1009)
==6759== by 0x43BB68: hlua_process_task (hlua.c:5525)
==6759== by 0x4FED0A: process_runnable_tasks (task.c:231)
==6759== by 0x4B2256: run_poll_loop (haproxy.c:2397)
==6759== by 0x4B2256: run_thread_poll_loop (haproxy.c:2459)
==6759== by 0x41A7E4: main (haproxy.c:3049)
Add the missing `task = NULL` for the `HLUA_E_OK` case. The error cases
have been fixed as of 253e53e661, which was first included in haproxy
v1.8-dev3. This bugfix should be backported to haproxy 1.8.
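For illustration, the shape of the fix in hlua_process_task() (a simplified
sketch, not the exact hunk):

    case HLUA_E_OK:
        hlua_ctx_destroy(hlua);
        task_delete(task);
        task_free(task);
        task = NULL; /* the function returns `task`, so the scheduler
                      * must never see the freed pointer */
        break;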
If TCP content inspection is used, msg_state can be >= HTTP_MSG_ERROR
the first time http_wait_for_request is called. t_idle was being left
unset in that case.
In the example below :
    stick-table type string len 64 size 100k expire 60s
    tcp-request inspect-delay 1s
    tcp-request content track-sc1 hdr(X-Session)
%Ti will always be -1, because the msg_state is already at HTTP_MSG_BODY
when http_wait_for_request is called for the first time.
This patch should be backported to 1.8 and 1.7.
When haproxy is compiled using GCC <= 3.x or >= 5.x, the `unlikely`
macro performs a comparison with zero: `(x) != 0`, thus returning
either 0 or 1.
In `int co_getline_nc()` this macro was accidentally applied to
the variable `retcode` itself, instead of to the result of the
comparison `retcode <= 0`. As a result, any negative `retcode`
is converted to `1` for the purpose of the comparison, so the
branch (which exits the function) is never taken for negative
values.
This in turn leads to reads of uninitialized memory in the for-loop
below:
==12141== Conditional jump or move depends on uninitialised value(s)
==12141== at 0x4EB6B4: co_getline_nc (channel.c:346)
==12141== by 0x421CA4: hlua_socket_receive_yield (hlua.c:1713)
==12141== by 0x421F6F: hlua_socket_receive (hlua.c:1896)
==12141== by 0x529B08F: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
==12141== by 0x52A7EFC: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
==12141== by 0x529B497: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
==12141== by 0x529711A: lua_pcallk (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
==12141== by 0x52ABDF0: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
==12141== by 0x529B08F: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
==12141== by 0x52A7EFC: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
==12141== by 0x529A9F1: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
==12141== by 0x529B523: lua_resume (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
==12141==
==12141== Use of uninitialised value of size 8
==12141== at 0x4EB6B9: co_getline_nc (channel.c:346)
==12141== by 0x421CA4: hlua_socket_receive_yield (hlua.c:1713)
==12141== by 0x421F6F: hlua_socket_receive (hlua.c:1896)
==12141== by 0x529B08F: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
==12141== by 0x52A7EFC: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
==12141== by 0x529B497: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
==12141== by 0x529711A: lua_pcallk (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
==12141== by 0x52ABDF0: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
==12141== by 0x529B08F: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
==12141== by 0x52A7EFC: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
==12141== by 0x529A9F1: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
==12141== by 0x529B523: lua_resume (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
==12141==
==12141== Invalid read of size 1
==12141== at 0x4EB6B9: co_getline_nc (channel.c:346)
==12141== by 0x421CA4: hlua_socket_receive_yield (hlua.c:1713)
==12141== by 0x421F6F: hlua_socket_receive (hlua.c:1896)
==12141== by 0x529B08F: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
==12141== by 0x52A7EFC: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
==12141== by 0x529B497: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
==12141== by 0x529711A: lua_pcallk (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
==12141== by 0x52ABDF0: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
==12141== by 0x529B08F: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
==12141== by 0x52A7EFC: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
==12141== by 0x529A9F1: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
==12141== by 0x529B523: lua_resume (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
==12141== Address 0x8637171e928bb500 is not stack'd, malloc'd or (recently) free'd
Fix this bug by correctly applying the `unlikely` macro to the result of the comparison.
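In other words (a minimal before/after sketch, assuming the classic
`#define unlikely(x) (__builtin_expect((x) != 0, 0))` definition):

    /* before: unlikely(retcode) yields 0 or 1, so "<= 0" is only
     * true when retcode == 0, never for negative values */
    if (unlikely(retcode) <= 0)
        goto out;

    /* after: the whole condition is passed to the macro */
    if (unlikely(retcode <= 0))
        goto out;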
This bug exists as of commit ca16b03813
which is the first commit adding this function.
v1.6-dev1 is the first tag containing this commit, the fix should
be backported to haproxy 1.6 and newer.
pat_ref_newid() is lacking a spinlock init. It was probably forgotten
in b5997f740b ("MAJOR: threads/map: Make acls/maps thread safe").
Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
The links were still stuck to version 1.6. Let's update them.
The patch needs to be carefully backported to 1.8 and 1.7 after
editing the respective version (replace 1.9dev with 1.8 or 1.7).
The incoming H2 frame length was checked against the max_frame_size
setting instead of being checked against the bufsize. The max_frame_size
only applies to outgoing traffic, not to incoming traffic, so if a large
enough frame size is advertised in the SETTINGS frame, a wrapped frame
will be defragmented into a temporarily allocated buffer where the second
fragment may overflow the heap by up to 16 kB.
It is very unlikely that this can be exploited for code execution given
that buffers are very short-lived and their addresses are not realistically
predictable in production, but an immediate crash is absolutely certain.
This fix must be backported to 1.8.
Many thanks to Jordan Zebor from F5 Networks for reporting this issue
in a responsible way.
Recent commit 9631a28 ("MEDIUM: sample: Extend functionality for field/word
converters") introduced this minor build warning that this patch addresses :
src/sample.c: In function 'sample_conv_word':
src/sample.c:2108:8: warning: suggest explicit braces to avoid ambiguous 'else' [-Wparentheses]
src/sample.c:2137:8: warning: suggest explicit braces to avoid ambiguous 'else' [-Wparentheses]
No backport is needed.
When adding new events using kevent(), if there's an error because we're
trying to delete an event that wasn't there, or because the fd has already
been closed, kevent() will either add an error event to the eventlist array
if there's enough room for it and keep on handling other events, or stop
and return -1.
We want it to process all the events, so give it a large enough array to
store any errors.
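A sketch of the idea, assuming `fd_nbupdt` pending changes (the array names
are illustrative, not the exact poller code):

    /* one output slot per change is enough to absorb an error event
     * for every single update, so kevent() never stops early */
    struct kevent *kev_out = calloc(fd_nbupdt, sizeof(*kev_out));
    struct timespec ts = { 0, 0 };
    int ret = kevent(kq, changelist, fd_nbupdt, kev_out, fd_nbupdt, &ts);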
Special thanks to PiBa-NL for diagnosing the root cause of this bug.
This should be backported to 1.8.
Export localpeer as the environment variable $HAPROXY_LOCALPEER,
allowing this variable to be used in the configuration file.
This is useful in the case of a configuration synchronized between
peers.
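For instance (a hypothetical snippet; haproxy expands "${VAR}" in the
configuration, and the variable reflects the name set with -L):

    peers mycluster
        peer "${HAPROXY_LOCALPEER}" 127.0.0.1:10000
        peer node2 192.168.0.2:10000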
When using the CLI_ST_PRINT_FREE state, always output something back
if the faulty function did not fill the 'err' variable.
The map/acl code could lead to a crash whereas the SSL code was silently
failing.
Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
Some error paths (especially those followed when running out of memory)
can set the error message to NULL. In order to avoid a crash, use a
generic message ("Out of memory") when this case arises.
It should be backported to 1.8.
Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
This patch adds the ability to fetch the frontend's default backend name
in your logic, so it can be used later to derive other backend names and
make routing decisions.
In proxy mode, the result of url2sa() is never checked, so when the function
fails to resolve the destination server from the URL, we continue anyway.
Depending on the internal state of the connection, we get different
behaviours. With a newly allocated connection, the field <addr.to> is not
set, so we get an HTTP error. The status code is 503 instead of 400, but
it's not really critical. However, if it's a recycled connection, we reuse
the previous value of <addr.to>, opening a connection to an unexpected
server.
To fix the bug, we return an error when url2sa() fails.
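The shape of the fix (a hedged sketch; the real call site, arguments and
labels differ):

    /* url2sa() returns a negative value when it cannot resolve the
     * destination address from the URL */
    if (url2sa(ctx.req.uri, ctx.req.uri_len, &conn->addr.to, NULL) < 0) {
        txn->status = 400;
        goto return_bad_req;
    }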
This patch should be backported to all versions from 1.5 onward.
In some cases, we call cs_destroy() very early, so early that the connection
doesn't yet have a mux and we can't call mux->detach(). In this case,
just destroy the associated connection.
This should be backported to 1.8.
With gcc < 4.7, when HAProxy is built with threads, the macros
HA_ATOMIC_CAS/XCHG/STORE rely on the legacy __sync builtins. These macros
are slightly more complicated than the versions relying on the '__atomic'
builtins. Internally, some local variables are defined, prefixed with '__'
to avoid name clashes with the caller.
On the other hand, the macros HA_ATOMIC_UPDATE_MIN/MAX call HA_ATOMIC_CAS.
Some local variables are also defined in these macros, following the same
naming rule as above. The problem is that the '__new' variable is used both
in HA_ATOMIC_UPDATE_MIN/MAX and in HA_ATOMIC_CAS. Obviously, the behaviour
is undefined because '__new' in HA_ATOMIC_CAS is left uninitialized.
Unfortunately gcc fails to detect this error.
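A minimal standalone illustration of the clash (simplified stand-ins, not
the real macros):

    #include <stdio.h>

    #define MY_CAS(val, old, new) ({                                   \
        typeof(*(val)) __new = (new); /* shadows the caller's __new */ \
        __sync_bool_compare_and_swap((val), *(old), __new);            \
    })

    #define MY_UPDATE_MAX(val, new) ({                                 \
        typeof(*(val)) __new = (new);                                  \
        typeof(*(val)) __old = *(val);                                 \
        /* in MY_CAS below, the inner initializer expands to           \
         * "__new = (__new)": it reads the freshly declared,           \
         * uninitialized inner variable, not the caller's value */     \
        while (__old < __new && !MY_CAS((val), &__old, __new))         \
            __old = *(val);                                            \
    })

    int main(void)
    {
        int v = 1;
        MY_UPDATE_MAX(&v, 42); /* undefined: may or may not store 42 */
        printf("%d\n", v);
        return 0;
    }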
To fix the problem, all variables internal to these macros are now suffixed
with the name of the macro to avoid clashes (for instance, '__new_cas' in
HA_ATOMIC_CAS).
This patch must be backported in 1.8.
The 'server-template' directive doesn't support the same name alphabet as
the 'server' directive. This patch allows the usage of the chars [0-9] in
those names.
[wt: let's backport this to 1.8 to apply the principle of least surprise
to people migrating to server templates]
Recent commit 5bd37fa ("BUG/MAJOR: cache: fix random crashes caused by
incorrect delete() on non-first blocks") addressed an issue where
dangling objects could be deleted in the cache, but even after this fix
some similar segfaults were reported at the same place (cache_free_blocks()).
The tree was always corrupted as well. Placing some traces revealed
that this time it's caused by a missing initialization in
http_action_store_cache() : while object->eb.key is used to note that
the object is not in the tree, the first retrieved block may contain
random data and is not initialized. Further, this entry can be updated
later without the object being inserted into the tree. Thus, if at the
end the object is not stored and the blocks are put back to the avail
list, the next attempt to use them will find eb.key != 0 and will try
to delete the uninitialized block, will see that eb.node.leaf_p is not
NULL (random data), and will dereference it as well as a few other
uninitialized pointers. It was harder to trigger than the previous one,
despite being very closely related. This time the following config
was used :
    listen l1
        mode http
        bind :8888
        http-request cache-use c1
        http-response cache-store c1
        server s1 127.0.0.1:8000

    cache c1
        total-max-size 4
        max-age 10
Httpterm was running on port 8000.
And it was stressed this way :
$ inject -o 1 -u 500 -P 1 -G '127.0.0.1:8888/?s=4097&p=1&x=%s'
... wait 5 seconds then Ctrl-C ...
# wait 3 seconds doing nothing
$ inject -o 1 -u 500 -P 1 -G '127.0.0.1:8888/?s=4097&p=1&x=%s'
=> segfault
Other values don't work well. The size and the small pieces in the
responses (p=1) are critical to make it work.
Here the fix consists in pre-zeroing object->eb.key AND object->eb.node.leaf_p
just after the object is allocated so as to stay consistent with the other
locations. Ideally this could be simplified later by relying only on
eb.node.leaf_p everywhere, since in the end the key alone is not a reliable
indicator, so that we use a single indicator of being part of the tree or
not.
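A sketch of the initialization (simplified; the surrounding code is
illustrative, the field names follow the ebtree layout):

    /* right after the first block of the new object is retrieved */
    object = (struct cache_entry *)first->data;
    object->eb.key = 0;            /* "not in the tree" marker */
    object->eb.node.leaf_p = NULL; /* consistent detached-node state */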
This fix needs to be backported to 1.8.
In addition to the metrics about the time spent in the SPOE, the following
counters have been added:

  * applets      : number of SPOE applets.
  * idles        : number of idle applets.
  * nb_sending   : number of streams waiting to send data.
  * nb_waiting   : number of streams waiting for an ack.
  * nb_processed : number of events/groups processed by the SPOE (from the
                   stream point of view).
  * nb_errors    : number of errors during the processing (from the stream
                   point of view).

Log messages have been updated to report these counters. The following
pattern has been added at the end of the log message:

    ... <idles>/<applets> <nb_sending>/<nb_waiting> <nb_errors>/<nb_processed>
Now it is possible to configure a logger in a spoe-agent section using a
"log" line, as for a proxy. "no log", "log global" and "log <address> ..."
syntaxes are supported.
With a "log global" line, the global list of loggers is copied into the
proxy's struct. The list coming from the default section is also copied when
a frontend or a backend section is parsed. So it is possible to have
duplicate entries in the proxy's list. For instance, with the following
config, all messages will be logged twice:
    global
        log 127.0.0.1 local0 debug
        daemon

    defaults
        mode http
        log global
        option httplog

    frontend front-http
        log global
        bind *:8888
        default_backend back-http

    backend back-http
        server www 127.0.0.1:8000
Now, the function parse_logsrv() should be used to parse a "log" line. This
function updates the list of loggers passed as argument. It can release all
log servers when a "no log" line is parsed (by the caller), or it can parse
"log global" or "log <address> ..." lines. It takes care of checking the
caller's context (global or not) to prohibit "log global" usage in the
global section.
"set-process-time" and "set-total-time" options have been added to store
processing times in the transaction scope, at each event and group processing,
the current one and the total one. So it is possible to get them.
TODO: documentation
The following metrics are added for each event or group of messages
processed in the SPOE:

  * processing time : the delay to process the event or the group. From the
                      stream point of view, it is the latency added by the
                      SPOE processing.
  * request time    : the encoding time. It includes ACLs processing, if
                      any. For fragmented frames, it is the sum of all
                      fragments.
  * queue time      : the delay before the request gets out of the sending
                      queue. For fragmented frames, it is the sum of all
                      fragments.
  * waiting time    : the delay before the response is received. No
                      fragmentation supported here.
  * response time   : the delay to process the response. No fragmentation
                      supported here.
  * total time      : (unused for now). It is the sum of all events or
                      groups processed by the SPOE for a specific thread.
Log messages have been updated. Before, only errors were logged
(status_code != 0). Now every processing is logged, following this format:

    SPOE: [AGENT] <TYPE:NAME> sid=STREAM-ID st=STATUS-CODE reqT/qT/wT/resT/pT

where:

    AGENT              is the agent name
    TYPE               is EVENT or GROUP
    NAME               is the event or the group name
    STREAM-ID          is an integer, the unique id of the stream
    STATUS-CODE        is the processing's status code
    reqT/qT/wT/resT/pT are the delays described above
For all these delays, -1 means the processing was interrupted before the end. So
-1 for the queue time means the request was never dequeued. For fragmented
frames it is harder to know when the interruption happened.
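For example, a successfully processed event could be logged as follows
(hypothetical values):

    SPOE: [my-agent] <EVENT:on-frontend-http-request> sid=100 st=0 0/0/1/1/2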
For now, messages are logged using the same logger as the backend of the
stream that initiated the request.
In async or pipelining mode, we count the number of NOTIFY frames sent that
are waiting for their corresponding ACK frames. This is a way to evaluate
the "load" of an SPOE applet. In pipelining mode, it is easy to make the
link between a NOTIFY frame and its ACK, because exchanges are done over the
same TCP connection. In async mode, it is harder, because an ACK frame can
be received on a different connection than the one that sent the NOTIFY
frame. So to decrement the fpa of the right applet, we need to keep a
reference to it in the SPOE context. Most of the time this works, except
when the processing is interrupted by the stream, because of a timeout.
This patch fixes this issue. If an SPOE applet is still linked to an SPOE
context when the processing is interrupted by the stream, the applet's fpa
is decremented. This is only done for unfragmented frames.
Variables referenced in HAProxy's configuration file are registered during
the configuration parsing (during the parsing of "var", "set-var" or
"unset-var" keywords). For the SPOE, you can use the "register-var-names"
directive to explicitly register variable names. All unknown variables will
be rejected (unless you set the "force-set-var" option). But the variable
set when an error occurs (when the "set-on-error" option is defined) should
also be registered by default. This is what this patch does.
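For instance, with an agent section like this one (hypothetical names), the
'err_code' variable no longer needs to be listed by hand:

    spoe-agent my-agent
        messages           check-client-ip
        option var-prefix  my
        option set-on-error err_code
        register-var-names ip_score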
It is better to let spoe_stop_processing() release this buffer because, in
the .check_timeouts callback, we lack the information to know whether it
should be released or not. For instance, if the processing timeout is
reached while the SPOE applet is receiving the reply, it is preferable to
ignore the timeout and process the result.
This patch should be backported in 1.8.
Some initializations must be done at the beginning of parse_spoe_flt() to
avoid a segmentation fault when the first errors are caught, when the
"filter spoe" line is parsed.
This patch must be backported in 1.8.
[cf: the variable "curvars" doesn't exist in 1.8. So the patch must be adapted.]
Several segfaults were reported in the cache, each time in eb_delete()
called from cache_free_blocks() itself called from shctx_row_reserve_hot().
Each time the tree node was corrupted with random cached data (often JS or
HTML contents).
The problem comes from an incompatibility between the cache's expectations
and the recycling algorithm used in the shctx. The shctx allocates and
releases a chain of blocks at once. And when it needs to allocate N blocks
from the avail list while a chain of M>N is found, it picks the first N
from the list, moves them to the hot list, and marks all remaining M-N
blocks as isolated blocks (chains of 1).
For each such released block, the shctx->free_block() callback is used
and passed a pointer to the first and current block of the chain. For
the cache, it's cache_free_blocks(). What this function does is check
that the current block is the first one, and in this case delete the
object from the tree and mark it as not in tree by setting key to zero.
The problem this causes is that when M>N, the tail blocks become first
blocks for the next call to shctx_row_reserve_hot(); these will be passed
to cache_free_blocks() as list heads and sent to eb_delete() despite
containing only cached data.
The simplest solution for now is to mark each block as holding no cache
object by setting its key to zero all the time. It keeps the principle used
elsewhere in the code. The SSL code is not subject to this problem because
it relies on the block's len not being null, which happens immediately
after a block is released. It was uncertain however whether this method is
suitable for the cache. It is not critical though, since this code is going
to change soon in 1.9 to dynamically allocate only the number of required
blocks.
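A sketch of the resulting callback (simplified from the description above;
the struct accesses are illustrative):

    static void cache_free_blocks(struct shared_block *first,
                                  struct shared_block *block)
    {
        struct cache_entry *object = (struct cache_entry *)block->data;

        if (first == block && object->eb.key)
            eb32_delete(&object->eb);

        /* always mark the block as holding no cache object, even for
         * tail blocks that may be promoted to list heads later */
        object->eb.key = 0;
    }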
This fix must be backported to 1.8. Thanks to Thierry for providing
exploitable cores.
The "show cache" command used to dump the header for each entry into into
the handler loop, making it repeated every ~16kB of output data. Additionally
chunk_appendf() was used instead of chunk_printf(), causing the output to
repeat already emitted lines, and the output size to grow in O(n^2). It used
to take several minutes to report tens of millions of objects from a small
cache containing only a few thousands. There was no more impact though.
This fix must be backported to 1.8.
Since commit 2f3a56b4f ("BUG/MINOR: tcp-check: use the server's service port
as a fallback"), email alerts stopped working because the mailer's port was
overridden by the server's port. Remember, email alerts are defined as
checks with specific tcp-check rules and triggered on demand to send alerts.
So to send an email, a check is executed. Because no specific port was
defined, the server's one was used.
To fix the bug, the ports used for checks attached to an email alert are
explicitly set to the mailer's port. This port will be used instead of the
server's one.
In this patch, the assignment of a default port (587) when an email alert is
defined has been removed. Indeed, when a mailer is defined, its port must be
defined, so the default port was never used.
This patch must be backported in 1.8.
Clearing the update_mask bit in fd_insert() may lead to a duplicate
insertion of the fd in fd_updt, which could lead to a write past the end of
the array. Instead, make sure the update_mask bit is cleared by the pollers
no matter what.
This should be backported to 1.8.
[wt: warning: 1.8 doesn't have the lockless fdcache changes and will
require some careful changes in the pollers]
Since commit 9aaf778 ("MAJOR: connection : Split struct connection into
struct connection and struct conn_stream."), the checks use a conn_stream
and not directly the connection anymore. However wake_srv_chk() still used
to verify the connection's readiness instead of the conn_stream's. Due to
the existence of a mux, the connection is always waiting to receive
something and doesn't reflect the changes made in event_srv_chk_{r,w}(),
causing the connection to appear as not ready yet, and the check to be
validated only after its timeout. The difference is only visible when
sending pure TCP checks, and simply adding a "tcp-check connect" line
is enough to work around it.
This fix must be backported to 1.8.
When a stream blocks on a mux buffer full/unallocated or on connection
flow control, a flag among H2_SF_MUX_M* is set, but the stream is not
always added to the connection's list. It's properly done when the
operations are performed from the connection handler but not always when
done from the stream handler. For instance, a simple shutr or shutw may
fail by lack of room. If it's immediately followed by a call to h2_detach(),
the stream remains lying around in no list at all, and prevents the
connection from ending. This problem is actually quite difficult to
trigger and seems to require some large objects and low server-side
timeouts.
This patch covers all identified paths. Some are redundant but since the
code will change and will be simplified in 1.9, it's better to stay on
the safe side here for now. It must be backported to 1.8.
Commit e3f36cd ("MINOR: h2: implement a basic "show_fd" function")
accidentally brought one surrounding debugging part that was in the same
context. No backport needed.
The purpose here is to dump some information regarding an H2 connection,
and a few statistics about its streams. The output looks like this :
35 : st=0x55(R:PrA W:PrA) ev=0x00(heopi) [lc] cache=0 owner=0x7ff49ee15e80 iocb=0x588a61(conn_fd_handler) tmask=0x1 umask=0x0 cflg=0x00201366 fe=decrypt mux=H2 mux_ctx=0x7ff49ee16f30 st0=2 flg=0x00000002 fctl_cnt=0 send_cnt=33 tree_cnt=33 orph_cnt=0
- st0 is the connection's state (FRAME_H here)
- flg is the connection's flags (MUX_MFULL here)
- fctl_cnt is the number of streams in the fctl_list
- send_cnt is the number of streams in the send_list
- tree_cnt is the number of streams in the streams_by_id tree
- orph_cnt is the number of orphaned streams (cs==0) in the tree
This function will be called from the CLI's "show fd" command to append some
extra mux-specific information that only the mux handler can decode. This is
supposed to help collect various hints about what is happening when facing
certain anomalies.
Interrupting an h2load test shows that some connections remain active till
the client timeout. This is due to the fact that h2_detach() immediately
returns if the h2s flags indicate that the h2s is still waiting for some
buffer room in the output mux (possibly to emit a response or to send some
window updates). If the connection is broken, these data will never leave
and must not prevent the stream from being terminated nor the connection
from being released.
This fix must be backported to 1.8.
Currently, h2_release() will release all resources assigned to the h2
connection, including the timeout task if any. But since the multi-threaded
scheduler, the timeout task could very well be queued in the thread-local
list of running tasks without any way to remove it, so task_delete() will
have no effect and task_free() will cause this undefined object to be
dereferenced.
In order to prevent this from happening, we never release the task in
h2_release(), instead we wake it up after marking its context NULL so that
the task handler can release the task.
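A sketch of the approach (simplified; locking and the actual 1.8 task API
details may differ):

    /* in h2_release(): hand the task over to its own handler */
    if (h2c->task) {
        h2c->task->context = NULL;
        task_wakeup(h2c->task, TASK_WOKEN_OTHER);
        h2c->task = NULL;
    }

    /* in the timeout task handler */
    static struct task *h2_timeout_task(struct task *t)
    {
        struct h2c *h2c = t->context;

        if (!h2c) {
            /* the connection is gone: the handler owns the task */
            task_delete(t);
            task_free(t);
            return NULL;
        }
        /* ... normal timeout processing ... */
        return t;
    }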
Future improvements could consist in modifying the scheduler so that a
task_wakeup() has to be done on any task having to be killed, letting
the scheduler take care of it.
This fix must be backported to 1.8. This bug was apparently not reported
so far.
Since these two functions are always used together, let's simplify
the code by having a single one perform both operations. It also ensures
we don't leave wandering elements that might leak later.
The code is safer and more robust this way, and it avoids multiple paths.
This is made possible by the idempotence of LIST_DEL() and eb32_delete(),
which are called in h2s_detach().