haproxy

mirror of http://git.haproxy.org/git/haproxy.git/ synced 2024-12-16 08:24:42 +00:00

Author	SHA1	Message	Date
Willy Tarreau	2da742933d	REGTESTS: unbreak http-check-send.vtc As noticed by Christopher, I messed up the version fix in commit `cb4ed02ef` ("REGTESTS: mark http-check-send.vtc as 2.4-only"), as while looking up the commit introducing the change I accidentally reverted it. Let's reinsert the contents of the file prior to that fix, except the version, of course.	2021-02-05 10:13:15 +01:00
Willy Tarreau	a84986ae4f	BUG/MINOR: ssl: do not try to use early data if not configured The CO_FL_EARLY_SSL_HS flag was unconditionally set on the connection, resulting in SSL_read_early_data() always being used first in handshake calculations. While this seems to work well (probably because there are fallback paths inside openssl), it's particularly confusing and makes the debugging quite complicated. It possibly is not optimal by the way. This flag ought to be set only when early_data is configured on the bind line. Apparently there used to be a good reason for doing it this way in 1.8 times, but it really does not make sense anymore. It may be OK to backport this to 2.3 if this helps with troubleshooting, but better not go too far as it's unlikely to fix any real issue while it could introduce some in old versions.	2021-02-05 08:04:02 +01:00
Willy Tarreau	23296f92f4	REGTESTS: mark sample_fetches/hashes.vtc as 2.4-only Commit `9eea56009` ("REGTESTS: add tests for the xxh3 converter") introduced the xxh3 to the tests thus made it incompatible with 2.3 and older, let's upgrade the version requirement.	2021-02-04 18:07:59 +01:00
Willy Tarreau	cb4ed02ef0	REGTESTS: mark http-check-send.vtc as 2.4-only Since commit `39ff8c519` ("REGTESTS: complete http-check test"), it breaks on pre-2.4, let's update the required version.	2021-02-04 18:06:13 +01:00
Willy Tarreau	4acb99f867	BUG/MINOR: xxhash: make sure armv6 uses memcpy() There was a special case made to allow ARMv6 to use unaligned accesses via a cast in xxHash when __ARM_FEATURE_UNALIGNED is defined. But while ARMv6 (and v7) does support unaligned accesses, it's only for 32-bit pointers, not 64-bit ones, leading to bus errors when the compiler emits an ldrd instruction and the input (e.g. a pattern) is not aligned, as in issue #1035. Note that v7 was properly using the packed approach here and was safe, however haproxy versions 2.3 and older use the old r39 xxhash code which has the same issue for armv7. A slightly different fix is required there, by using a different definition of packed for 32 and 64 bits. The problem is really visible when running v7 code on a v8 kernel because such kernels do not implement alignment trap emulation, and the process dies when this happens. This is why in the issue above it was only detected under lxc. The emulation could have been disabled on v7 as well by writing zero to /proc/cpu/alignment though. This commit is a backport of xxhash commit a470f2ef ("update default memory access for armv6"). Thanks to @srkunze for the report and tests, @stgraber for his help on setting up an easy reproducer outside of lxc, and @Cyan4973 for the discussion around the best way to fix this. Details and alternate patches available on https://github.com/Cyan4973/xxHash/issues/490.	2021-02-04 17:14:58 +01:00
Christopher Faulet	a8979a9b59	DOC: server: Add missing params in comment of the server state line parsing srv_use_ssl and srv_check_port parameters were not mentioned in the comment of the function parsing a server state line.	2021-02-04 14:00:43 +01:00
William Dauchy	4858fb2e18	MEDIUM: check: align agentaddr and agentport behaviour in the same manner of agentaddr, we now: - permit to set agentport through `port` keyword, like it is the case for agentaddr through `addr` - set the priority on `agent-port` keyword when used - add a flag to be able to test when the value is set like for agentaddr it makes the behaviour between `addr` and `port` more consistent. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-04 14:00:38 +01:00
William Dauchy	1c921cd748	BUG/MINOR: check: consitent way to set agentaddr small consistency problem with `addr` and `agent-addr` options: for both options, the last one parsed is always used to set the agent-check addr. Thus these two lines don't have the same behavior: server ... addr <addr1> agent-addr <addr2> server ... agent-addr <addr2> addr <addr1> After this patch `agent-addr` will always be the priority option over `addr`. It means we test the flag before setting agentaddr. We also fix all the places where we did not set the flag to be coherent everywhere. I was not really able to determine where this issue is coming from. So it is probable we may backport it to all stable versions where the agent is supported. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-04 13:55:04 +01:00
William Dauchy	fe03e7d045	MEDIUM: server: adding support for check_port in server state We can currently change the check-port using the cli command `set server check-port` but there is a consistency issue when using server state. This patch aims to fix this problem but will be also a good preparation work to get rid of checkport flag, so we are able to know when checkport was set by config. I am fully aware this is not making github #953 moving forward, I however think this might be acceptable while waiting for a proper solution and resolve consistency problem faced with port settings. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-04 10:46:52 +01:00
William Dauchy	69f118d7b6	MEDIUM: check: remove checkport checkaddr flag While trying to fix some consistency problem with the config file/cli (e.g. check-port cli command does not set the flag), we realised checkport flag was not necessarily needed. Indeed tcpcheck uses service port as the last choice if check.port is zero. So we can assume if check.port is zero, it means it was never set by the user, regardless if it is by the cli or config file. In the longterm this will avoid to introduce a new consistency issue if we forget to set the flag. in the same manner of checkport flag, we don't really need checkaddr flag. We can assume if checkaddr is not set, it means it was never set by the user or config. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-04 10:43:00 +01:00
Christopher Faulet	21ca3dfc3a	MINOR: dns: Don't set the check port during a server dns resolution When a server dns resolution is performed, there is no reason to set an unconfigured check port with the server port. Because by default, if the check port is not set, the server's one is used. Thus we can remove this useless assignment. It is mandatory for next improvements.	2021-02-04 10:42:52 +01:00
Christopher Faulet	99497d7dba	MINOR: server: Don't set the check port during the update from a state file When the server state is loaded from a server-state file, there is no reason to set an unconfigured check port with the server port. Because by default, if the check port is not set, the server's one is used. Thus we can remove this useless assignment. It is mandatory for next improvements.	2021-02-04 10:42:45 +01:00
William Dauchy	446db718cb	BUG/MINOR: cli: fix set server addr/port coherency with health checks while reading `update_server_addr_port` I found out some things which can be seen as incoherency. I hope I did not overlook anything: - one comment is stating check's address should be updated if it uses the server one; however the condition checks if `SRV_F_CHECKADDR` is set; this flag is set when a check address is set; result is that we override the check address where I was not expecting it. In fact we don't need to update anything here as server addr is used when check addr is not set. - same goes for check agent addr - for port, it is a bit different, we update the check port if it is unset. This is harmless because we also use server port if check port is unset. However it creates some incoherency before/after using this command, as check port should stay unset throughout the life of the process unless it is set by `set server check-port` command. quite hard to locate the origin of this issue but the function was introduced in commit `d458adcc52` ("MINOR: new update_server_addr_port() function to change both server's ADDR and service PORT"). I was however not able to determine whether this is due to a change of behavior along the years. So this patch can potentially be backported up to v1.8 but we must be careful while doing so, as the code has changed a lot. That being said, the bug being not very impacting I would be fine keeping it for 2.4 only. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-04 09:06:04 +01:00
William Lallemand	e0de0a6b32	MINOR: ssl/cli: flush the server session cache upon 'commit ssl cert' Flush the SSL session cache when updating a certificate which is used on a server line. This prevents connections from being established with a cached session which was using the previous SSL_CTX. This patch also replaces the ha_barrier with a thread_isolate() since there are more operations to do. The reg-test was also updated to remove the 'no-ssl-reuse' keyword which is now unneeded.	2021-02-03 18:51:01 +01:00
Amaury Denoyelle	377d8786a7	BUG/MINOR: mux_h2: fix incorrect stat titles Duplicate titles for the stats H2_ST_{OPEN,TOTAL}_{CONN,STREAM}. These entries are used on csv for the heading. This must be backported up to 2.3. This fixes the github issue #1102.	2021-02-03 17:50:45 +01:00
Willy Tarreau	0630038e77	BUG/MEDIUM: ssl: check a connection's status before computing a handshake As spotted in issue #822, we're having a problem with error detection in the SSL layer. The problem is that on an overwhelmed machine, accepted connections can start to pile up, each of them requiring a slow handshake, and during all this time if the client aborts, the handshake will still be calculated. The error controls are properly placed, it's just that the SSL layer reads records exactly of the advertised size, without having the ability to encounter a pending connection error. As such if injecting many TLS connections to a listener with a huge backlog, it's fairly possible to meet this situation: 12:50:48.236056 accept4(8, {sa_family=AF_INET, sin_port=htons(62794), sin_addr=inet_addr("127.0.0.1")}, [128->16], SOCK_NONBLOCK) = 1109 12:50:48.236071 setsockopt(1109, SOL_TCP, TCP_NODELAY, [1], 4) = 0 (process other connections' handshakes) 12:50:48.257270 getsockopt(1109, SOL_SOCKET, SO_ERROR, [ECONNRESET], [4]) = 0 (proof that error was detectable there but this code was added for the PoC) 12:50:48.257297 recvfrom(1109, "\26\3\1\2\0", 5, 0, NULL, NULL) = 5 12:50:48.257310 recvfrom(1109, "\1\0\1\3"..., 512, 0, NULL, NULL) = 512 (handshake calculation taking 700us) 12:50:48.258004 sendto(1109, "\26\3\3\0z"..., 1421, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = -1 EPIPE (Broken pipe) 12:50:48.258036 close(1109) = 0 The situation was amplified by the multi-queue accept code, as it resulted in many incoming connections to be accepted long before they could be handled. Prior to this they would have been accepted and the handshake immediately started, which would have resulted in most of the connections waiting in the system's accept queue, and dying there when the client aborted, thus the error would have been detected before even trying to pass them to the handshake code.
As a result, with a listener running on a very large backlog, it's possible to quickly accept tens of thousands of connections and waste time slowly running their handshakes while they get replaced by other ones. This patch adds an SO_ERROR check on the connection's FD before starting the handshake. This is not pretty as it requires to access the FD, but it does the job. Some improvements should be made over the long term so that the transport layers can report extra information with their ->rcv_buf() call, or at the very least, implement a ->get_conn_status() function to report various flags such as shutr, shutw, error at various stages, allowing an upper layer to inquire for the relevance of engaging into a long operation if it's known the connection is not usable anymore. An even simpler step could probably consist in implementing this in the control layer. This patch is simple enough to be backported as far as 2.0. Many thanks to @ngaugler for his numerous tests with detailed feedback.	2021-02-02 15:55:53 +01:00
William Lallemand	8695ce0bae	BUG/MEDIUM: ssl/cli: abort ssl cert is freeing the old store The "abort ssl cert" command is buggy and removes the current ckch store, and instances, leading to SNI removal. It must only remove the new one. This patch also adds a check in set_ssl_cert.vtc and set_ssl_server_cert.vtc. Must be backported as far as 2.2.	2021-02-01 17:58:21 +01:00
Christopher Faulet	040b1195f7	BUG/MINOR: contrib/prometheus-exporter: Restart labels dump at the right pos For some metrics, several lines are produced per entity, one per label value. For instance, the health-check status (ST_F_CHECK_STATUS) or the entity status (ST_F_STATUS). The dump may be stopped in the middle of the labels processing if the output buffer is full. This means the next time, we must take care to restart on the right label value. For now, this part is buggy and we always restart to dump all the label values again from the beginning. To be sure to restart at the right position, the field <ctx.stats.st_code> in the applet context is used to save the last position. Of course, we take care to reset this value when necessary. This fix is specific for 2.4. No backport needed.	2021-02-01 15:21:55 +01:00
Christopher Faulet	32ef48e984	BUG/MINOR: contrib/prometheus-exporter: Add missing label for ST_F_HRSP_1XX Since the labels are dynamically created for each metric, the "code" label of the ST_F_HRSP_1XX field is missing. To fix the bug, this metric is handled in the same way the other ST_F_HRSP_* field are. We only take care to dump the metric header only once. This bug was introduced by the commit `5a2f93873` ("MEDIUM: contrib/prometheus-exporter: Use dynamic labels instead of static ones"). No backport needed.	2021-02-01 15:16:33 +01:00
Christopher Faulet	1a68cd0689	DOC: contrib/prometheus-exporter: Add missing metrics in README Some metrics were missing (haproxy_process_uptime_seconds and haproxy_process_build_info). To ease the review against the service output, the same order is used in the README.	2021-02-01 15:16:33 +01:00
William Dauchy	4b7bf7eccd	CLEANUP: contrib/prometheus-exporter: remove description in README Now that we got rid of description in prometheus code, let's assume we no longer need to maintain it in README, and direct users to the output of prometheus to get more info. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-01 15:16:33 +01:00
William Dauchy	df9a05db6a	CLEANUP: contrib/prometheus-exporter: align and reorder fields - align safe_idle_connections_current field fix minor typo added in commit `37286a5ac5` ("MEDIUM: contrib/prometheus-exporter: Rework matrices defining Promex metrics") - reorder info fields to be able to compare them easily - add missing ignored info fields as comment Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-01 15:16:33 +01:00
William Dauchy	99066dd47f	CLEANUP: contrib/prometheus-exporter: remove unused includes unless I'm wrong, those includes are no longer needed. The only recent one I remember is ssl-sock include since commit `5d9b8f3c93` ("MINOR: contrib/prometheus-exporter: use fill_info for process dump") where we make use of the code from stats.c Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-01 15:16:33 +01:00
William Dauchy	7741c33779	MINOR: contrib/prometheus-exporter: add recv logs_logs_total field this field was added by commit `45c457a629` ("MINOR: log: adds counters on received syslog messages.") Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-01 15:16:33 +01:00
William Dauchy	e5a26a250d	MINOR: contrib/prometheus-exporter: add uweight field this field was added in commit `bd71510024` ("MINOR: stats: report server's user-configured weight next to effective weight") Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-01 15:16:33 +01:00
William Dauchy	82b2ce2f96	MINOR: contrib/prometheus-exporter: use stats desc when possible It is a followup work of commit `a191b77e54` ("MINOR: contrib/prometheus-exporter: merge info description from stats") but for all other stats fields; we however keep a way to override them when needed (e.g. units, specific cases) this is another step which will avoid duplicating work between stats.c and prometheus. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-01 15:16:33 +01:00
William Dauchy	19f7cfc8c3	MINOR: stats: improve max stats descriptions In order to unify prometheus and stats description, we need to remove some field reference which are specific to stats implementation: - `scur` in max current sessions (also reword current session) - `rate` in max sessions - `req_rate` in max requests - `conn_rate` in max connections Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-01 15:16:33 +01:00
William Dauchy	eedb9b13f4	MINOR: stats: improve pending connections description In order to unify prometheus and stats description, we need to clarify the description for pending connections. - remove the BE reference in counters struct, as it is also used in servers - remove reference of `qcur` field in description as it is specific to stats implementation - try to reword cur and max pending connections description Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-01 15:16:33 +01:00
William Dauchy	a1da7bab1a	MINOR: contrib/prometheus-exporter: improve service status description field Since we changed the behaviour of this metric, improve the description to better explain what is the meaning of the new gauge value; it also reflects the description we did for health check status. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-01 15:16:33 +01:00
William Dauchy	de3c326389	MAJOR: contrib/prometheus-exporter: move health check status to labels this patch is a breaking change between v2.3 and v2.4: we move from using gauge value for health check states to labels values. The diff is quite small thanks to the preparation work from Christopher to allow more flexibility in labels, see commit `5a2f938732` ("MEDIUM: contrib/prometheus-exporter: Use dynamic labels instead of static ones") this is a follow up of commit `c6464591a3` ("MAJOR: contrib/prometheus-exporter: move ftd/bkd/srv states to labels"). The main goal being to be better aligned with prometheus use cases in terms of queries. More specifically to health checks, Pierre C. mentioned the possible quirks he had to put in place in order to make use of those metrics through prometheus: <aggregator_function> by(proxy, check_status) (count_values by(proxy, instance) ("check_status", haproxy_server_check_status)) I am perfectly aware this introduces a lot more metrics but I don't see how we can improve the usability without it. The main issue remains in the cardinality of the states which are > 20. Prometheus recommends to stay below a cardinality of 10 for a given metric but I consider our case very specific, because highly linked to the level of precision haproxy exposes. Even before this patch I saw several large production setups (a few hundreds of MB in output) which are making use of the scope parameter to simply ignore the server metrics, so that the scraping can be faster, and memory consumed on client side not too high. So I believe we should eventually continue in that direction and offer more granularity of filtering of the output. That being said it is already possible to filter out the data on prometheus client side. this is related to github issue #1029 Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-01 15:16:33 +01:00
Christopher Faulet	7aa3271439	MINOR: checks: Add function to get the result code corresponding to a status The function get_check_status_result() can now be used to get the result code (CHK_RES_) corresponding to a check status (HCHK_STATUS_). It will be used by the Prometheus exporter when reporting the check status of a server.	2021-02-01 15:16:33 +01:00
William Lallemand	ff97edac3e	REGTESTS: set_ssl_server_cert: cleanup the SSL caching option Replace the tune.ssl.cachesize 0 and the no-tls-tickets by a no-ssl-reuse option on the server line.	2021-02-01 14:57:31 +01:00
William Lallemand	a870a9cfdb	REGTESTS: set_ssl_server_cert.vtc: remove SSL caching and set as working In a previous commit this test was disabled because I thought the feature was broken, but in fact this is the test which is broken. Indeed the connection between the server and the client was not renegotiated and was using the SSL cache or a ticket. To work correctly these 2 features must be disabled or a new connection must be established after the ticket timeout, which is too long for a regtest. Also a "nbthread 1" was added as it was easier to reproduce the problem with it.	2021-02-01 14:50:17 +01:00
Willy Tarreau	75f72338df	BUG/MINOR: activity: take care of late wakeups in "show tasks" During the call to thread_isolate(), some other threads might have performed some task_wakeup() which will have a call date past the one we retrieved. It could be avoided by taking the current date once we're alone but this would significantly affect the latency measurements by adding the isolation time. Instead we're now only accounting positive times, so that late wakeups normally appear with a zero latency. No backport is needed, this is 2.4.	2021-01-29 15:07:07 +01:00
Willy Tarreau	d597ec2718	MINOR: listener: export manage_global_listener_queue() This one pops up in tasks lists when running against a saturated listener.	2021-01-29 14:29:57 +01:00
Christopher Faulet	5a2f938732	MEDIUM: contrib/prometheus-exporter: Use dynamic labels instead of static ones Instead of using static labels for metrics, we now use an array of labels, filled for each metrics if necessary and passed to the dump function. This way, it is easier to extend the promex service. For now, there are at most 8 labels per metrics. This limit may be raised by changing PROMEX_MAX_LABELS value. And to ease labels addition, a label is defined as a key/value pair. The formatting is handled by the dump function. For the proxies and servers, the first entry of the array is always the proxy name. In addition, for the servers, the second entry is always the server name.	2021-01-29 13:42:43 +01:00
William Dauchy	c6464591a3	MAJOR: contrib/prometheus-exporter: move ftd/bkd/srv states to labels this patch is a breaking change between v2.3 and v2.4: we move from using gauge value for frontend/backend/servers states to labels values. the main motivation being I realised it is very difficult to make use of it without hard coded quirks on prometheus client side; especially because the main use is often to group by state, which is harder when the state is the value of the metric. in order to achieve that we iterate on the status metric to generate labels, and so as many metrics. this is the first step to resolve github issue #1029 A second step should address health check states. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-01-29 13:42:43 +01:00
William Dauchy	5493821fe6	MINOR: contrib/prometheus-exporter: declare states for objects in preparation to change state gauge values as labels, declare them as enum associated with the string definition Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-01-29 13:42:39 +01:00
Christopher Faulet	c29b4bf946	MINOR: mux-h2: Slightly improve request HEADERS frames sending In h2s_bck_make_req_headers() function, in the loop on the HTX blocks, the most common blocks, the headers, are now handled first, before the start-line. The same change was already performed on the response HEADERS frames. Thus the code is more consistent now.	2021-01-29 13:28:43 +01:00
Christopher Faulet	564981369b	MINOR: mux-h2: Don't tests the start-line when sending HEADERS frame When a HEADERS frame is sent, it is always when an HTX start-line block is found. Thus, in h2s_bck_make_req_headers() and h2s_frt_make_resp_headers() functions, it is useless to test the start-line. Instead of being too defensive, we use BUG_ON() now because it must not happen and must be handled as a bug. This patch should fix the issue #1086.	2021-01-29 13:27:57 +01:00
Christopher Faulet	3702f78cf9	MINOR: ssl-sample: Don't check if argument list is set in sample fetches The list is always defined by definition. Thus there is no reason to test it.	2021-01-29 13:26:24 +01:00
Christopher Faulet	e6e7a585e9	MINOR: sample: Don't check if argument list is set in sample fetches The list is always defined by definition. Thus there is no reason to test it.	2021-01-29 13:26:13 +01:00
Christopher Faulet	72dbcfe66d	MINOR: http-conv: Don't check if argument list is set in sample converters The list is always defined by definition. Thus there is no reason to test it.	2021-01-29 13:26:02 +01:00
Christopher Faulet	623af93722	MINOR: http-fetch: Don't check if argument list is set in sample fetches The list is always defined by definition. Thus there is no reason to test it. There is also plenty of checks on arguments types while it is already validated during the configuration parsing. But one thing at a time. This patch should fix the issue #1087.	2021-01-29 13:25:34 +01:00
Christopher Faulet	bdbd5db2a5	BUG/MINOR: stick-table: Always call smp_fetch_src() with a valid arg list The sample fetch functions must always be called with a valid argument list. When called by hand, if there is no argument to pass, empty_arg_list must be used. In the stick-table code, there are some calls to smp_fetch_src() with NULL as argument list. It is changed to use empty_arg_list instead. It is not really a bug because smp_fetch_src() does not use the argument list. But it is an API bug. This patch may be backported to all stable branches as a cleanup.	2021-01-29 13:24:16 +01:00
Christopher Faulet	1faeb4c710	MINOR: mux-h1: Remove first useless test on count in h1_process_output() h1_process_output() function is never called with no data to send (count == 0). Thus, the first test on count, at the beginning of the function is useless and may be removed. This way, by reading the code, it is obvious the <chn_htx> variable is always defined. This patch should fix the issue #1085.	2021-01-29 13:16:32 +01:00
Willy Tarreau	5c25daa170	MINOR: stick-tables: export process_table_expire() This handler can take quite some time as it deletes a large number of entries under a lock, let's export it so that it's immediately visible in "show profiling".	2021-01-29 12:39:32 +01:00
Willy Tarreau	f6c88421b7	MINOR: peers: export process_peer_sync() to improve traces This one will probably pop up from time to time in "show profiling", better have it resolve.	2021-01-29 12:38:42 +01:00
Willy Tarreau	025fc71b47	MINOR: checks: export a few functions that appear often in trace dumps The check I/O handler, process_chk_conn and server_warmup are often present in complex backtraces as they're impacted by locking or I/O issues. Let's export them so that they resolve cleanly.	2021-01-29 12:35:24 +01:00
Willy Tarreau	ac6322dd36	MINOR: muxes: export the timeout and shutr task handlers These ones appear often in "show tasks" so it's handy to make them resolve.	2021-01-29 12:33:46 +01:00
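
Note on commit `4acb99f867` ("BUG/MINOR: xxhash: make sure armv6 uses memcpy()"): the message contrasts reading through a pointer cast with reading through memcpy(). The following is a minimal sketch of that contrast, not the xxHash code itself; the function names `read_u64_unaligned()` and `read_u64_cast()` are hypothetical and exist only for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 64-bit value from an address that may not
 * be 8-byte aligned. Going through memcpy() makes no alignment promise
 * to the compiler, so it has to emit loads that are valid for any
 * address. */
static inline uint64_t read_u64_unaligned(const void *p)
{
	uint64_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}

/* Hypothetical counter-example, shown for contrast only: the cast tells
 * the compiler the pointer is suitably aligned for a 64-bit access, which
 * is what allows a 64-bit load such as ldrd to be emitted. Do not call it
 * with a misaligned pointer. */
static inline uint64_t read_u64_cast(const void *p)
{
	return *(const uint64_t *)p;
}
```

A caller hashing an arbitrary input buffer would use only the memcpy() variant, since the buffer's alignment is not known in advance.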

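
Note on commit `0630038e77` ("BUG/MEDIUM: ssl: check a connection's status before computing a handshake"): the message describes an SO_ERROR check on the connection's FD before the handshake starts. A minimal sketch of such a check follows, assuming a plain socket file descriptor; `fd_pending_error()` is a hypothetical name and this is not the haproxy implementation.

```c
#include <errno.h>
#include <sys/socket.h>

/* Hypothetical helper: return 0 if no error is pending on <fd>,
 * otherwise the pending error code (e.g. ECONNRESET). If getsockopt()
 * itself fails, its errno is returned instead. Note that reading
 * SO_ERROR also clears the pending error on the socket. */
static int fd_pending_error(int fd)
{
	int err = 0;
	socklen_t len = sizeof(err);

	if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
		return errno;
	return err;
}
```

A caller would skip the expensive handshake computation whenever the helper returns a non-zero value, which corresponds to the ECONNRESET case visible in the strace output quoted in the commit message.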

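
Note on commits `69f118d7b6`, `21ca3dfc3a` and `99497d7dba`: all three rely on the rule that a check port of zero means "never set", in which case the server's port is used. The sketch below shows only that fallback rule under simplified assumptions; `effective_check_port()` is a hypothetical name, and the real tcpcheck code considers other sources before falling back to the service port.

```c
#include <stdint.h>

/* Hypothetical helper: a check port of 0 is treated as "never set by the
 * user", whether through the configuration file or the CLI, so the
 * server's own port is used as the last choice. No separate "was set"
 * flag is needed. */
static inline uint16_t effective_check_port(uint16_t check_port,
                                            uint16_t srv_port)
{
	return check_port ? check_port : srv_port;
}
```

With this convention, code that resolves a server address or loads a state file can leave an unset check port at zero instead of copying the server port into it.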
13952 Commits