haproxy

mirror of http://git.haproxy.org/git/haproxy.git/ synced 2024-12-18 01:14:38 +00:00

Author	SHA1	Message	Date
Willy Tarreau	e1f1ba13c2	WIP/MEDIUM: thread: order CPUs before picking the first ones WIP: still not true because the number of threads is not necessarily equal to the number of CPUs we're bound to. For now the node id remains the sole criterion used on this test. Also, we should only bind CPUs that are either: - not excluded at boot when !cpu_map_configured() - mapped when cpu_map_configured() and not offline of course. Now when starting a default number of threads, instead of limiting ourselves to the first 32 or 64 CPUs that appear in the list, we'll take the same number after having sorted them by capacity and vicinity. This means that setups which have a single declared NUMA node and CPUs spread between sockets before threads will at least have a chance to bind only to threads of related cores and to avoid using the second package by default. The goal will now be to improve on this selection.	2023-07-21 17:53:05 +02:00
Willy Tarreau	d7e21999c8	MINOR: cpuset: implement a sorting mechanism for CPU perf/topo The new function cpu_optimize_topology() sorts a topology array by advertised capacity (cpu_capacity, SMT count) and vicinity (cluster ID, L3 cache ID etc). The purpose is to place the big cores first on heterogeneous architectures, and then sort them so that groups can be linearly composed from adjacent CPUs.	2023-07-21 17:53:05 +02:00
Willy Tarreau	ec8eb37361	WIP/MEDIUM: cfgparse: remote numa & thread-count detection Not needed anymore since already done before landing here. NOTE that the cmp_hw_cpus() function is here!	2023-07-21 17:53:05 +02:00
Willy Tarreau	404ff1c9d0	WIP: haproxy: call thread_detect_count() even without cpu affinity	2023-07-21 17:53:05 +02:00
Willy Tarreau	2fcfbf7873	WIP: thread: make thread_detect_count() also work without CPU affinity	2023-07-21 17:53:05 +02:00
Willy Tarreau	a2aee8c495	WIP: start to move numa detection we reimplement automatic binding to the first NUMA node.	2023-07-21 17:53:05 +02:00
Willy Tarreau	a31da151e3	WIP: reimplement automatic thread setting We continue to default to the max number of available threads and 1 tgroup by default, with the limit. This normally allows to get rid of that test in check_config_validity().	2023-07-21 17:53:05 +02:00
Willy Tarreau	263b2228e2	EXP: thrd: start to detect thread groups and threads min/max By mutually refining the thread count and group count, we can try to detect the most suitable setup for the current machine. Taskset is implicitly handled correctly. tgroups automatically adapt to the configured number of threads. cpu-map manages to limit tgroups to the smallest supported value. In fact this could even work without the rest of the changes for now, just ignoring CPU locations. Also it could be useful to calculate a "distance" to next CPU for each of them so that when trying to create multiple groups we could cut around the largest ones and evaluate for certain distances how many groups that would provide. Last, the per-core capacity and number of siblings would be useful to figure which groups to consider first.	2023-07-21 17:53:05 +02:00
Willy Tarreau	b253fd556b	WIP: cfgparse: comment out detection	2023-07-21 17:53:05 +02:00
Willy Tarreau	83b0887e10	WIP: cfgparse: fill the NUMA node IDs The ID is the one retrieved from the directory name. Now the current situation makes no sense anymore. Let's rework it: - call cpu_detect_topology() from haproxy.c:2251, just before calling check_config_validity(). At this point we know tgroups, threads, and cpumap if they were explicitly set. - the function must use cpumap[] if set to indicate bound CPUs, otherwise fall back to getaffinity(). NOTE: better set UNBOUND rather than BOUND so that when we don't know they're all there ? - the function can now check the online map only for bound CPUs. Should probably also set OFFLINE rather than online. - can now check the CPU+NUMA topology for all bound+online Then config_check_validity() should call thread_adjust_to_cpu() which will consider the number of threads, of groups etc.	2023-07-21 17:53:05 +02:00
Willy Tarreau	3280195d0c	WIP: cfgparse: detect CPU topology and order all CPUs by vicinity This detects package, NUMA node, l3, core cluster, L2, thread set, L1, and orders them accordingly. For now groups are not made yet, but it should be possible to compose them using a linear scan now. Note there's still a DEBUG dump of what was discovered at boot! TODO: integrate NUMA node numbering (ignored for now) wip: cfgparse: use ha_cpu_topo now wip: cfgparse: no need to clear HA_CPU_F_BOUND anymore already done in cpuset at boot. wip: cfgparse: drop useless err from parse_cpu_set WIP: cfgparse: temporarily hide online detect code and list all cpus	2023-07-21 17:53:05 +02:00
Willy Tarreau	6a62a480b4	WIP: DEBUG: use ./sys/devices temporarily	2023-07-21 17:53:05 +02:00
Willy Tarreau	9c316a528c	MINOR: cfgparse: use already known offline CPU information No need to reparse cpu/online, let's just rely on the info we learned previously about offline CPUs.	2023-07-21 17:53:05 +02:00
Willy Tarreau	0aba375027	MINOR: cfgparse: move the binding detection into numa_detect_topology() For now the function refrains from detecting the CPU topology when a restrictive taskset or cpu-map was already performed on the process, and it's documented as such, the reason being that until we're able to automatically create groups, better not change user settings. But we'll need to be able to detect bound CPUs and to process them as desired by the user, so we now need to move that detection into the function itself. It changes nothing to the logic, just gives more freedom to the function.	2023-07-21 17:53:05 +02:00
Willy Tarreau	f064e524d4	MINOR: cpuset: add NUMA node identification to CPUs on FreeBSD With this patch we're also assigning NUMA node IDs to each CPU when the info is found. The code is highly inspired from the one in commit `f5d48f8b3` ("MEDIUM: cfgparse: numa detect topology on FreeBSD."), the difference being that we're just setting the value in ha_cpu_topo[].	2023-07-21 17:53:05 +02:00
Willy Tarreau	6a839c3dfc	MINOR: cpuset: add NUMA node identification to CPUs on Linux With this patch we're also assigning NUMA node IDs to each CPU when one is found. The code is highly inspired from the one in commit `b56a7c89a` ("MEDIUM: cfgparse: detect numa and set affinity if needed") that already did the job, except that it could be simplified since we're just collecting info to fill the ha_cpu_topo[] array.	2023-07-21 17:53:05 +02:00
Willy Tarreau	0c3a4c9733	MINOR: cpuset: add CPU topology detection for linux This uses the publicly available information from /sys to figure the cache and package arrangements between logical CPUs and fill ha_cpu_topo[], as well as their SMT capabilities and relative capacity for those which expose this. The functions clearly have to be OS-specific.	2023-07-21 17:52:36 +02:00
Willy Tarreau	ef5d4ab8da	MINOR: cpuset: try to detect offline cpus at boot When possible, the offline CPUs are detected at boot and their OFFLINE flag is set in the ha_cpu_topo[] array. When the detection is not possible (e.g. not linux, /sys not mounted etc), we just mark none of them as being offline, as we don't want to infer wrong info that could hinder automatic CPU placement detection.	2023-07-21 16:25:07 +02:00
Willy Tarreau	fb2311518f	MINOR: cpuset: add detection of online CPUs on FreeBSD On FreeBSD we can detect online CPUs at least by doing the bitwise-OR of the CPUs of all domains, so we're using this and adding this detection to ha_cpuset_detect_online(). If we find simpler later, we can always rework it, but it's reasonably inexpensive since we only check existing domains.	2023-07-21 16:25:07 +02:00
Willy Tarreau	6bef0d7298	MINOR: cpuset: add detection of online CPUs on Linux This adds a generic function ha_cpuset_detect_online() which for now only supports linux via /sys. It fills a cpuset with the list of online CPUs that were detected (or returns a failure).	2023-07-21 16:25:07 +02:00
Willy Tarreau	3a0e03bd4c	MINOR: thread: rely on the cpuset functions to count bound CPUs Let's just clean up the thread_cpus_enabled() code a little bit by removing the OS-specific code and relying on ha_cpuset_detect_bound() instead. On macos we continue to use sysconf() for now.	2023-07-21 16:25:07 +02:00
Willy Tarreau	3f83b78961	MINOR: cpuset: update CPU topology from excluded CPUs at boot Now before trying to resolve the thread assignment to groups, we detect which CPUs are not bound at boot so that we can mark them with HA_CPU_F_EXCLUDED. This will be useful to better know on which CPUs we can count later. Note that we purposely ignore cpu-map here as we don't know how threads and groups will map to cpu-map entries, hence which CPUs will really be used. It's important to proceed this way so that when we have no info we assume they're all available.	2023-07-21 16:25:07 +02:00
Willy Tarreau	e9fd787b96	MINOR: cpuset: allocate and initialize the ha_cpu_topo array. This does the bare minimum to allocate and initialize a global ha_cpu_topo array for the number of supported CPUs and release it at deinit time.	2023-07-21 16:25:06 +02:00
Willy Tarreau	b6556995bf	MINOR: cpuset: add ha_cpu_topo definition This structure will be used to store information about each CPU's topology (package ID, L3 cache ID, NUMA node ID etc). This will be used in conjunction with CPU affinity setting to try to perform a mostly optimal binding between threads and CPU numbers by default.	2023-07-21 16:25:03 +02:00
Willy Tarreau	f14975c74a	MINOR: cpuset: centralize bound cpu detection Till now the CPUs that were bound were only retrieved in thread_cpus_enabled() in order to count the number of CPUs allowed, and it relied on arch-specific code. Let's slightly arrange this into ha_cpuset_detect_bound() that reuses the ha_cpuset struct and the accompanying code. This makes the code much clearer without having to carry along some arch-specific stuff out of this area. Note that the macos-specific code used in thread.c to only count online CPUs but not retrieve a mask, so for now we can't infer anything from it and can't implement it.	2023-07-20 15:36:11 +02:00
Willy Tarreau	b5809a4a52	REORG: cpuset: move parse_cpu_set() and parse_cpumap() to cpuset.c These ones were still in cfgparse.c but they're not specific to the config at all and may actually be used even when parsing cpu list entries in /sys. Better move them where they can be reused.	2023-07-20 15:36:11 +02:00
Willy Tarreau	69d2ff7078	MINOR: cpuset: add ha_cpuset_or() to bitwise-OR two CPU sets This operation was not implemented and will be needed later.	2023-07-20 15:36:11 +02:00
Willy Tarreau	f2af7a5d20	MINOR: cpuset: add ha_cpuset_isset() to check for the presence of a CPU in a set This function will be convenient to test for the presence of a given CPU in a set.	2023-07-20 15:36:11 +02:00
Willy Tarreau	c982e4ca4f	MINOR: cpuset: dynamically allocate cpu_map cpu_map is 8.2kB/entry and there's one such entry per group, that's ~520kB total. In addition, the init code is still in haproxy.c enclosed in ifdefs. Let's make this a dynamically allocated array in the cpuset code and remove that init code. Later we may even consider reallocating it once the number of threads and groups is known, in order to shrink it a little bit, as the typical setup with a single group will only need 8.2kB, thus saving half a MB of RAM. This would require that the upper bound is placed in a variable though.	2023-07-20 15:36:11 +02:00
Willy Tarreau	c11d09ac81	MINOR: cfgparse: use read_line_from_trash() to read from /sys It's easier to use this function now to natively support variable fields in the file's path. This also removes read_file_from_trash() that was only used here and was static.	2023-07-20 15:36:11 +02:00
Willy Tarreau	3c3261c676	MINOR: tools: add function read_line_to_trash() to read a line of a file This function takes on input a printf format for the file name, making it particularly suitable for /proc or /sys entries which take a lot of numbers. It also automatically trims the trailing CR and/or LF chars.	2023-07-20 15:36:11 +02:00
Willy Tarreau	c5c1ea6930	MEDIUM: init: initialize the trash earlier More and more utility functions rely on the trash while most of the init code doesn't have access to it because it's initialized very late (in PRE_CHECK for the initial one). It's a pool, and it purposely supports being reallocated, so let's initialize it in STG_POOL so that early STG_INIT code can at least use it.	2023-07-20 15:36:11 +02:00
Willy Tarreau	8a4466bd15	MEDIUM: cfgparse: assign NUMA affinity to cpu-maps Do not force affinity on the process, instead let's just apply it to cpu-map, it will automatically be used later in the init process. We can do this because we know that cpu-map was not set when we're using this detection code. This is much saner, as we don't need to manipulate the process' affinity at this point in time, and just update the info that the user omitted to set by themselves, which guarantees a better long-term consistency with the documented feature.	2023-07-20 15:33:18 +02:00
Willy Tarreau	6ecabb3f35	CLEANUP: config: make parse_cpu_set() return documented values parse_cpu_set() stopped returning the undocumented -1 which was a leftover from an earlier attempt, changed from ulong to int since it only returns a success/failure and no longer a mask. Thus it must not return -1 and its callers must only test for != 0, as is documented.	2023-07-20 11:01:09 +02:00
Willy Tarreau	f54d8c6457	CLEANUP: cpuset: remove the unused proc_t1 field in cpu_map This field used to store the cpumap of the first thread in a group, and was used till 2.4 to hold some default settings, after which it was no longer used. Let's just drop it.	2023-07-20 11:01:09 +02:00
Willy Tarreau	c955659906	BUG/MINOR: init: set process' affinity even in foreground The per-process CPU affinity settings are only applied during forking, which means that cpu-map are ignored when running in foreground (e.g. haproxy started with -db). This is historic due to the original semantics of a process array, but isn't documented and causes surprises when trying to debug affinity settings. Let's make sure the setting is applied to the workers themselves even in foreground. This may be backported to 2.6 though it is really not important. If backported, it also depends on previous commit: BUG/MINOR: cpuset: remove the bogus "proc" from the cpu_map struct	2023-07-20 11:01:09 +02:00
Willy Tarreau	151f9a2808	BUG/MINOR: cpuset: remove the bogus "proc" from the cpu_map struct We're currently having a problem with the porting from cpu_map from processes to thread-groups as it happened in 2.7 with commit `5b09341c0` ("MEDIUM: cpu-map: replace the process number with the thread group number"), though it seems that it has deeper roots even in 2.0 and that it was progressively made wrong over time. The issue stems from the way the per-process and per-thread cpu-sets were employed over time. Originally only processes were supported. Then threads were added after an optional "/" and it was documented that "cpu-map 1" is exactly equivalent to "cpu-map 1/all" (this was clarified in 2.5 by commit `317804d28` ("DOC: update references to process numbers in cpu-map and bind-process")). The reality is different: when processes were still supported, setting "cpu-map 1" would apply the mask to the process itself (and only when run in the background, which is not documented either and is also a bug for another fix), and would be combined with any possible per-thread mask when calculating the threads' affinity, possibly resulting in empty sets. However, "cpu-map 1/all" would only set the mask for the threads and not the process. As such the following: `cpu-map 1 odd` `cpu-map 1/1-8 even` would leave no CPU while doing: `cpu-map 1/all odd` `cpu-map 1/1-8 even` would allow all CPUs. While such configs are very unlikely to ever be met (which is why this bug is tagged minor), this is becoming quite more visible while testing automatic CPU binding during 2.9 development because due to this bug it's much more common to end up with incorrect bindings. This patch fixes it by simply removing the .proc entry from cpu_map and always setting all threads' maps. The process is no longer arbitrarily bound to the group 1's mask, but in case threads are disabled, we'll use thread 1's mask since it contains the configured CPUs.
This fix should be backported at least to 2.6, but no need to insist if it resists as it's easier to break cpu-map than to fix an unlikely issue.	2023-07-20 11:01:09 +02:00
Willy Tarreau	67c99db0a7	BUG/MINOR: config: do not detect NUMA topology when cpu-map is configured As documented, the NUMA auto-detection is not supposed to be used when the CPU affinity was set either by taskset (already checked) or by a cpu-map directive. However this check was missing, so that configs having cpu-map entries would still first bind to a single node. In practice it has no impact on correct configs since bindings will be replaced. However for those where the cpu-map directives are not exhaustive it will have the impact of binding those threads to one node, which disagrees with the doc (and makes future evolutions significantly more complicated). This could be backported to 2.4 where numa-cpu-mapping was added, though if nobody encountered this by then maybe we should only focus on recent versions that are more NUMA-friendly (e.g. 2.8 only). This patch depends on this previous commit that brings the function we rely on: MINOR: cpuset: add cpu_map_configured() to know if a cpu-map was found	2023-07-20 11:01:09 +02:00
Willy Tarreau	7134417613	MINOR: cpuset: add cpu_map_configured() to know if a cpu-map was found Since we'll soon want to adjust the "thread-groups" degree of freedom based on the presence of cpu-map, we first need to be able to detect if cpu-map was used. This function scans all cpu-map sets to detect if any is present, and returns true accordingly.	2023-07-20 11:01:09 +02:00
Daan van Gorkum	f034139bc0	MINOR: lua: Allow reading "proc." scoped vars from LUA core. This adds the "core.get_var()" method to allow reading "proc." scoped variables outside of TXN or HTTP/TCPApplet. Fixes: #2212 Signed-off-by: Daan van Gorkum <djvg@djvg.net>	2023-07-20 10:55:28 +02:00
Christopher Faulet	083f917fe2	BUG/MINOR: h1-htx: Return the right reason for 302 FCGI responses A FCGI response may contain a "Location" header with no status code. In this case a 302-Found HTTP response must be returned to the client. However, while the status code is indeed 302, the reason is wrong. "Found" must be set instead of "Moved Temporarily". This patch must be backported as far as 2.2. With the commit `e3e4e0006` ("BUG/MINOR: http: Return the right reason for 302"), this should fix the issue #2208.	2023-07-20 09:51:00 +02:00
firexinghe	bfff46f411	BUG/MINOR: hlua: add check for lua_newstate When hlua_init_state() calls lua_newstate() to initialize the main Lua stack, the return value may be NULL (for example in case of OOM). In this case, L will be NULL, and a crash then happens in lua_getextraspace(). So, we add a check on the return value of lua_newstate(). This should be backported at least to 2.4, maybe further.	2023-07-19 10:16:14 +02:00
Emeric Brun	c0456f45c8	BUILD: quic: fix warning during compilation using gcc-6.5 Building with gcc-6.5: src/quic_conn.c: In function 'send_retry': src/quic_conn.c:6554:2: error: dereferencing type-punned pointer will break strict-aliasing rules [-Werror=strict-aliasing] ((uint32_t )((unsigned char *)&buf[i])) = htonl(qv->num); This patch uses write_n32 to set the value. This could be backported as far as v2.6.	2023-07-19 08:58:55 +02:00
Frédéric Lécaille	e5a17b0bc0	BUG/MINOR: quic: Unchecked encryption levels availability This bug arrived with this commit: MEDIUM: quic: Dynamic allocations of QUIC TLS encryption levels It is possible that haproxy receives a late Initial packet after it has released its Initial or Handshake encryption levels. In this case it must not try to retransmit packets from such encryption levels to speed up the handshake completion. No need to backport.	2023-07-18 11:50:31 +02:00
Ilya Shipitsin	f7dcceccc9	CI: explicitly highlight VTest result section if there's something It turned out that people miss the VTest result section because it is not highlighted, let us fix that.	2023-07-17 15:56:53 +02:00
Ilya Shipitsin	ddedefcaaa	CI: add naming convention documentation branches "haproxy-" stand for stable branches, otherwise development	2023-07-17 15:56:52 +02:00
Mariam John	00b7b49a46	MEDIUM: ssl: new sample fetch method to get curve name Adds a new sample fetch method to get the curve name used in the key agreement to enable better observability. In OpenSSLv3, the function `SSL_get_negotiated_group` returns the NID of the curve and from the NID, we get the curve name by passing the NID to OBJ_nid2sn. This was not available in v1.1.1. SSL_get_curve_name(), which returns the curve name directly was merged into OpenSSL master branch last week but will be available only in its next release.	2023-07-17 15:45:41 +02:00
Christopher Faulet	e3e4e00063	BUG/MINOR: http: Return the right reason for 302 Because of a cut/paste error, the wrong reason was returned for 302 code. The 301 reason was returned instead. Thus now, "Found" is returned for 302, instead of "Moved Permanently". This patch should fix the issue 2208. It must be backported to all stable versions.	2023-07-17 11:14:10 +02:00
Christopher Faulet	b982fc2177	BUG/MINOR: sample: Fix wrong overflow detection in add/sub converters When "add" or "sub" converters are used, an overflow detection is performed. When 2 negative integers are added (or a positive integer is subtracted from a negative one), we take care to not exceed the low limit (LLONG_MIN) and when 2 positive integers are added, we take care to not exceed the high limit (LLONG_MAX). However, because of a missing 'else' statement, if there is no overflow in the first case, we fall back on the second check (the one for positive adds) and LLONG_MAX is returned. It means that most of the time, when 2 negative integers are added (or a positive integer is subtracted from a negative one), LLONG_MAX is returned. This patch should solve the issue #2216. It must be backported to all stable versions.	2023-07-17 11:14:10 +02:00
Christopher Faulet	46e5876035	DOC: config: Fix fc_src description to state the source address is returned A typo in the "fc_src" description was fixed. This sample returns the original source IP address and not the destination one. This patch should be backported as far as 2.6.	2023-07-17 11:11:39 +02:00
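
The overflow handling described in commit `b982fc2177` can be illustrated with a minimal C sketch. This is not the HAProxy code: the function names `sat_add()` and `sat_sub()` are hypothetical and chosen only for this illustration. The point it shows is the `else` that keeps the negative-operand check and the positive-operand check mutually exclusive, so a non-overflowing negative addition falls through to the plain sum instead of hitting the LLONG_MAX branch.

```c
#include <limits.h>

/* Hypothetical helper: saturating addition on long long.
 * Clamps to LLONG_MIN / LLONG_MAX instead of overflowing.
 */
static long long sat_add(long long a, long long b)
{
	if (a < 0 && b < 0) {
		/* two negatives: only the low limit can be exceeded */
		if (a < LLONG_MIN - b)
			return LLONG_MIN;
	}
	else if (a > 0 && b > 0) {
		/* two positives: only the high limit can be exceeded */
		if (a > LLONG_MAX - b)
			return LLONG_MAX;
	}
	/* mixed signs or no overflow: the sum is representable */
	return a + b;
}

/* Hypothetical helper: saturating subtraction (a - b). */
static long long sat_sub(long long a, long long b)
{
	if (a < 0 && b > 0) {
		/* negative minus positive: may go below the low limit */
		if (a < LLONG_MIN + b)
			return LLONG_MIN;
	}
	else if (a >= 0 && b < 0) {
		/* non-negative minus negative: may go above the high limit */
		if (a > LLONG_MAX + b)
			return LLONG_MAX;
	}
	return a - b;
}
```

Each comparison is arranged so that the intermediate expression (`LLONG_MIN - b`, `LLONG_MAX + b`, and so on) is itself representable, which avoids relying on signed overflow to detect signed overflow.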

20438 Commits