haproxy

mirror of http://git.haproxy.org/git/haproxy.git/ synced 2025-04-24 12:06:57 +00:00

Author	SHA1	Message	Date
David Carlier	7adf8f35df	OPTIM: regex: PCRE2 use JIT match when JIT optimisation occurred. When a regex has been successfully compiled by the JIT pass, it is better to use the related match function, which thankfully has the same signature, for better performance. Signed-off-by: David Carlier <devnexen@gmail.com>	2020-08-14 07:53:40 +02:00
Ilya Shipitsin	46a030cdda	CLEANUP: assorted typo fixes in the code and comments This is 11th iteration of typo fixes	2020-07-06 14:34:32 +02:00
Willy Tarreau	f278eec37a	BUILD: tree-wide: cast arguments to tolower/toupper to unsigned char NetBSD apparently uses macros for tolower/toupper and complains about the use of char for array subscripts. Let's properly cast all of them to unsigned char where they are used. This is needed to fix issue #729.	2020-07-05 21:50:02 +02:00
Willy Tarreau	36979d9ad5	REORG: include: move the error reporting functions from log.h to errors.h Most of the files dealing with error reports have to include log.h in order to access ha_alert(), ha_warning() etc. But while these functions don't depend on anything, log.h depends on a lot of stuff because it deals with log-formats and samples. As a result it's impossible not to embark long dependencies when using ha_warning() or qfprintf(). This patch moves these low-level functions to errors.h, which already defines the error codes used at the same places. About half of the users of log.h could be adjusted, sometimes revealing other issues such as missing tools.h. Interestingly the total preprocessed size shrunk by 4%.	2020-06-11 10:18:59 +02:00
Willy Tarreau	dfd3de8826	REORG: include: move stream.h to haproxy/stream{,-t}.h This one was not easy because it was embarking many includes with it, which other files would automatically find. At least global.h, arg.h and tools.h were identified. 93 total locations were identified, 8 additional includes had to be added. In the rare files where it was possible to finalize the sorting of includes by adjusting only one or two extra lines, it was done. But all files would need to be rechecked and cleaned up now. It was the last set of files in types/ and proto/ and these directories must not be reused anymore.	2020-06-11 10:18:58 +02:00
Willy Tarreau	aeed4a85d6	REORG: include: move log.h to haproxy/log{,-t}.h The current state of the logging is a real mess. The main problem is that almost all files include log.h just in order to have access to the alert/warning functions like ha_alert() etc, and don't care about logs. But log.h also deals with real logging as well as log-format and depends on stream.h and various other things. As such it forces a few heavy files like stream.h to be loaded early and to hide missing dependencies depending where it's loaded. Among the missing ones is syslog.h which was often automatically included resulting in no less than 3 users missing it. Among 76 users, only 5 could be removed, and probably 70 don't need the full set of dependencies. A good approach would consist in splitting that file in 3 parts: - one for error output ("errors" ?). - one for log_format processing - and one for actual logging.	2020-06-11 10:18:58 +02:00
Willy Tarreau	f268ee8795	REORG: include: split global.h into haproxy/global{,-t}.h global.h was one of the messiest files, it has accumulated tons of implicit dependencies and declares many globals that make almost all other files include it. It managed to silence a dependency loop between server.h and proxy.h by being well placed to pre-define the required structs, forcing struct proxy and struct server to be forward-declared in a significant number of files. It was split in two, one which is the global struct definition and the few macros and flags, and the rest containing the functions prototypes. The UNIX_MAX_PATH definition was moved to compat.h.	2020-06-11 10:18:58 +02:00
Willy Tarreau	48fbcae07c	REORG: tools: split common/standard.h into haproxy/tools{,-t}.h And also rename standard.c to tools.c. The original split between tools.h and standard.h dates from version 1.3-dev and was mostly an accident. This patch moves the files back to what they were expected to be, and takes care of not changing anything else. However this time tools.h was split between functions and types, because it contains a small number of commonly used macros and structures (e.g. name_desc) which in turn cause the massive list of includes of tools.h to conflict with the callers. They remain the ugliest files of the whole project and definitely need to be cleaned and split apart. A few types are defined there only for functions provided there, and some parts are even OS-specific and should move somewhere else, such as the symbol resolution code.	2020-06-11 10:18:57 +02:00
Willy Tarreau	7cd8b6e3a4	REORG: include: split common/regex.h into haproxy/regex{,-t}.h Regex are essentially included for myregex_t but it turns out that several of the C files didn't include it directly, relying on the one included by their own .h. This has been cleanly addressed so that only the type is included by H files which need it, and adding the missing includes for the other ones.	2020-06-11 10:18:57 +02:00
Willy Tarreau	4c7e4b7738	REORG: include: update all files to use haproxy/api.h or api-t.h if needed All files that were including one of the following include files have been updated to only include haproxy/api.h or haproxy/api-t.h once instead: - common/config.h - common/compat.h - common/compiler.h - common/defaults.h - common/initcall.h - common/tools.h The choice is simple: if the file only requires type definitions, it includes api-t.h, otherwise it includes the full api.h. In addition, in these files, explicit includes for inttypes.h and limits.h were dropped since these are now covered by api.h and api-t.h. No other change was performed, given that this patch is large and affects 201 files. At least one (tools.h) was already freestanding and didn't get the new one added.	2020-06-11 10:18:42 +02:00
Willy Tarreau	39bd740d00	CLEANUP: regex: remove outdated support for regex actions The support for reqrep and friends was removed in 2.1 but the chain_regex() function and the "action" field in the regex struct was still there. This patch removes them. One point worth mentioning though. There is a check_replace_string() function whose purpose was to validate the replacement strings passed to reqrep. It should also be used for other replacement regex, but is never called. Callers of exp_replace() should be checked and a call to this function should be added to detect the error early.	2020-06-02 17:17:13 +02:00
Dragan Dosen	2674303912	MEDIUM: regex: modify regex_comp() to atomically allocate/free the my_regex struct Now we atomically allocate the my_regex struct within function regex_comp() and compile the regex or free both in case of failure. The pointer to the allocated my_regex struct is returned directly. The my_regex* argument to regex_comp() is removed. Function regex_free() was modified so that it systematically frees the my_regex entry. The function does nothing when called with a NULL as argument (like free()). It will avoid existing risk of not properly freeing the initialized area. Other structures are also updated in order to be compatible (the ones related to Lua and action rules).	2019-05-07 06:58:15 +02:00
Willy Tarreau	8071338c78	MINOR: initcall: apply initcall to all register_build_opts() calls Most register_build_opts() calls use static strings. These ones were replaced with a trivial REGISTER_BUILD_OPTS() statement adding the string and its call to the STG_REGISTER section. A dedicated section could be made for this if needed, but there are very few such calls for this to be worth it. The calls made with computed strings however, like those which retrieve OpenSSL's version or zlib's version, were moved to a dedicated function to guarantee they are called late in the process. For example, the SSL call probably requires that SSL_library_init() has been called first.	2018-11-26 19:50:32 +01:00
Joseph Herlant	eda75484a8	CLEANUP: Fix typos in the regex subsystem Fix typos in the code comment of the regex subsystem.	2018-11-18 22:26:42 +01:00
Christopher Faulet	767a84bcc0	CLEANUP: log: Rename Alert/Warning in ha_alert/ha_warning	2017-11-24 17:19:12 +01:00
Emeric Brun	272e252e61	MINOR: threads/regex: Change Regex trash buffer into a thread local variable	2017-10-31 13:58:31 +01:00
David Carlier	f2592b29f1	MEDIUM: regex: pcre2 support This adds support for the newest pcre2 library, more secure than its older sibling at the cost of a more complex API. It works pretty similarly to pcre's part to keep the overall change smooth, except : - we define the string class supported at compile time. - after matching the ovec data is properly sized, although we do not take advantage of it here. - the lack of jit support is treated less 'dramatically' as pcre2_jit_compile in this case is 'no-op'.	2016-12-28 12:51:51 +01:00
Willy Tarreau	7a9ac6dac6	CLEANUP: regex: use the build options list to report the regex type This removes 3 #ifdef from haproxy.c.	2016-12-21 21:30:54 +01:00
Vincent Bernat	02779b6263	CLEANUP: uniformize last argument of malloc/calloc Instead of repeating the type of the LHS argument (sizeof(struct ...)) in calls to malloc/calloc, we directly use the pointer name (sizeof(...)). The following Coccinelle patch was used: @@ type T; T *x; @@ x = malloc( - sizeof(T) + sizeof(*x) ) @@ type T; T *x; @@ x = calloc(1, - sizeof(T) + sizeof(*x) ) When the LHS is not just a variable name, no change is made. Moreover, the following patch was used to ensure that "1" is consistently used as a first argument of calloc, not the last one: @@ @@ calloc( + 1, ... - ,1 )	2016-04-03 14:17:42 +02:00
Willy Tarreau	15a53a4384	MEDIUM: regex: add support for passing regex flags to regex_exec_match() This function (and its sister regex_exec_match2()) abstract the regex execution but make it impossible to pass flags to the regex engine. Currently we don't use them but we'll need to support REG_NOTBOL soon (to indicate that we're not at the beginning of a line). So let's add support for this flag and update the API accordingly.	2015-01-22 14:24:53 +01:00
Christian Ruppert	de898712a0	MEDIUM: regex: Use pcre_study always when PCRE is used, regardless of JIT pcre_study() has been around long before JIT has been added. It also seems to affect the performance in some cases (positive). Below I've attached some test results. The test is based on http://sljit.sourceforge.net/regex_perf.html (see bottom). It has been modified to just test pcre_study vs. no pcre_study. Note: This test does not try to match specific header it's instead run over a larger text with more and less complex patterns to make the differences more clear. % ./runtest 'mark.txt' loaded. (Length: 19665221 bytes) ----------------- Regex: 'Twain' [pcre-nostudy] time: 14 ms (2388 matches) [pcre-study] time: 21 ms (2388 matches) ----------------- Regex: '^Twain' [pcre-nostudy] time: 109 ms (100 matches) [pcre-study] time: 109 ms (100 matches) ----------------- Regex: 'Twain$' [pcre-nostudy] time: 14 ms (127 matches) [pcre-study] time: 16 ms (127 matches) ----------------- Regex: 'Huck[a-zA-Z]+\|Finn[a-zA-Z]+' [pcre-nostudy] time: 695 ms (83 matches) [pcre-study] time: 26 ms (83 matches) ----------------- Regex: 'a[^x]{20}b' [pcre-nostudy] time: 90 ms (12495 matches) [pcre-study] time: 91 ms (12495 matches) ----------------- Regex: 'Tom\|Sawyer\|Huckleberry\|Finn' [pcre-nostudy] time: 1236 ms (3015 matches) [pcre-study] time: 34 ms (3015 matches) ----------------- Regex: '.{0,3}(Tom\|Sawyer\|Huckleberry\|Finn)' [pcre-nostudy] time: 5696 ms (3015 matches) [pcre-study] time: 5655 ms (3015 matches) ----------------- Regex: '[a-zA-Z]+ing' [pcre-nostudy] time: 1290 ms (95863 matches) [pcre-study] time: 1167 ms (95863 matches) ----------------- Regex: '^[a-zA-Z]{0,4}ing[^a-zA-Z]' [pcre-nostudy] time: 136 ms (4507 matches) [pcre-study] time: 134 ms (4507 matches) ----------------- Regex: '[a-zA-Z]+ing$' [pcre-nostudy] time: 1334 ms (5360 matches) [pcre-study] time: 1214 ms (5360 matches) ----------------- Regex: '^[a-zA-Z ]{5,}$' [pcre-nostudy] time: 198 ms (26236 matches) [pcre-study] time: 197 ms (26236 matches) ----------------- Regex: '^.{16,20}$' [pcre-nostudy] time: 173 ms (4902 matches) [pcre-study] time: 175 ms (4902 matches) ----------------- Regex: '([a-f](.[d-m].){0,2}[h-n]){2}' [pcre-nostudy] time: 1242 ms (68621 matches) [pcre-study] time: 690 ms (68621 matches) ----------------- Regex: '([A-Za-z]awyer\|[A-Za-z]inn)[^a-zA-Z]' [pcre-nostudy] time: 1215 ms (675 matches) [pcre-study] time: 952 ms (675 matches) ----------------- Regex: '"[^"]{0,30}[?!\.]"' [pcre-nostudy] time: 27 ms (5972 matches) [pcre-study] time: 28 ms (5972 matches) ----------------- Regex: 'Tom.{10,25}river\|river.{10,25}Tom' [pcre-nostudy] time: 705 ms (2 matches) [pcre-study] time: 68 ms (2 matches) In some cases it's more or less the same, but when it's faster, it is by a huge margin. It always depends on the pattern, the string(s) to match against etc. Signed-off-by: Christian Ruppert <c.ruppert@babiel.com>	2014-11-18 13:26:18 +01:00
Christian Ruppert	955f4613cb	BUG/MEDIUM: regex: fix pcre_study error handling pcre_study() may return NULL even though it succeeded. In this case error is NULL otherwise error is not NULL. Also see man 3 pcre_study. Previously an ACL pattern of e.g. ".*" would cause an error because pcre_study did not find anything to speed up matching and returned regex->extra = NULL and error = NULL which in this case was a false-positive. That happened only when PCRE_JIT was enabled for HAProxy but libpcre has been built without JIT. Signed-off-by: Christian Ruppert <c.ruppert@babiel.com> [wt: this needs to be backported to 1.5 as well]	2014-10-29 17:44:31 +01:00
Thierry FOURNIER	26202760a4	MINOR: regex: Use native PCRE API. The pcreposix layer (in the pcre project) executes strlen to find the length of the string. When we are using the function "regex_exec*2", the length is used to add a final \0, and when pcreposix is executed a strlen is executed to compute the length. If we are using the native PCRE API, the length is provided as an argument, and these operations disappear. This is useful because PCRE regexes are more used than POSIX regexes.	2014-06-18 15:14:00 +02:00
Thierry FOURNIER	09af0d6d43	MEDIUM: regex: replace all standard regex function by own functions This patch removes all references to standard regex in haproxy. The last remaining references are only in the regex.[ch] files. In the file src/checks.c, the original function uses a "pmatch" array. In fact this array is unused. This patch removes it.	2014-06-18 15:07:57 +02:00
Thierry FOURNIER	b8f980cc19	MINOR: regex: Create JIT compatible function that return match strings This patch renames "regex_exec" to "regex_exec2". It adds new "regex_exec", "regex_exec_match" and "regex_exec_match2" functions. These functions can match a regex and return an array containing the matching parts. Otherwise, they use the compiled method (JIT or PCRE or POSIX). JIT requires a subject with length. PCREPOSIX and native POSIX regex require a null-terminated subject. The regex_exec* functions are split in two versions. The first version takes a null-terminated string, but it executes strlen() on the subject if it is compiled with JIT. The second version (terminated by "2") takes the subject and the length. This version adds a null character in the subject if it is compiled with PCREPOSIX or native POSIX functions. The documentation of posix regex and pcreposix says that the function returns 0 if the string matches, otherwise it returns REG_NOMATCH. The REG_NOMATCH macro takes the value 1 with posix regex and the value 17 with pcreposix. According to the documentation of the native pcre API (used with JIT), it returns a negative number if there is no match, otherwise it returns 0 or a positive number. This patch also fixes the return codes of the regex_exec* functions. Now, these functions return true if the string matches, otherwise they return false.	2014-06-18 15:07:50 +02:00
Willy Tarreau	c874653bb4	BUILD: don't use type "uint" which is not portable Dmitry Sivachenko reported that "uint" doesn't build on FreeBSD 10. On Linux it's defined in sys/types.h and indicated as "old". Just get rid of the very few occurrences.	2014-05-28 23:05:07 +02:00
Sasha Pachev	c600204ddf	BUG/MEDIUM: regex: fix risk of buffer overrun in exp_replace() Currently exp_replace() (which is used in reqrep/reqirep) is vulnerable to a buffer overrun. I have been able to reproduce it using the attached configuration file and issuing the following command: wget -O - -S -q http://localhost:8000/`perl -e 'print "a"x4000'`/cookie.php Str was being checked only in while (str) and it was possible to read past that when more than one character was being accessed in the loop. WT: Note that this bug is only marked MEDIUM because configurations capable of triggering this bug are very unlikely to exist at all due to the fact that most rewrites consist in static string additions that largely fit into the reserved area (8kB by default). This fix should also be backported to 1.4 and possibly even 1.3 since it seems to have been present since 1.1 or so. Config: ------- global maxconn 500 stats socket /tmp/haproxy.sock mode 600 defaults timeout client 1000 timeout connect 5000 timeout server 5000 retries 1 option redispatch listen stats bind :8080 mode http stats enable stats uri /stats stats show-legends listen tcp_1 bind :8000 mode http maxconn 400 balance roundrobin reqrep ^([^\ :])\ /(.)/(.)\.php(.) \1\ /\3.php?arg=\2\2\2\2\2\2\2\2\2\2\2\2\2\4 server srv1 127.0.0.1:9000 check port 9000 inter 1000 fall 1 server srv2 127.0.0.1:9001 check port 9001 inter 1000 fall 1	2014-05-27 14:36:06 +02:00
Thierry FOURNIER	0b6d15fdc8	MINOR: regex: The pointer regstr in the struct regex is no longer used. The pointer <regstr> is only used to compare and identify the original regex string with the patterns. Now the patterns have a reference map containing this original string. It is useless to store this value two times.	2014-03-17 18:06:08 +01:00
Thierry FOURNIER	39e258fcee	MINOR: regex: Copy the original regex expression into string. This is useful for the debug or for search regex in maps.	2013-12-12 15:43:34 +01:00
Thierry FOURNIER	799c042daa	MINOR: regex: Change the struct containing regex This change permits to remove the typedef. The original regex structs are set in haproxy's struct.	2013-12-12 15:42:58 +01:00
Thierry FOURNIER	ed5a4aefae	CLEANUP: regex: Create regex_comp function that compiles regex using compilation options The current file "regex.h" defines an abstraction for the regex. It provides the same struct name and the same "regexec" function for the 3 regex types supported: standard libc, basic pcre and jit pcre. The regex compilation function is not provided by this file. If the developer wants to use regex, he must write regex compilation code containing "#define JIT". This patch provides a unique regex compilation function according to the compilation options. In addition, the "regex.h" file checks the presence of the "#define PCRE_CONFIG_JIT" when "USE_PCRE_JIT" is enabled. If this flag is not present, the pcre lib doesn't support JIT and "#error" is emitted.	2013-10-14 14:42:50 +02:00
Willy Tarreau	f4f04125d4	[MINOR] prepare req_/rsp_ to receive a condition It will be very handy to be able to pass conditions to req_* and rsp_*. For now, we just add the pointer to the condition in the affected structs.	2010-01-28 18:10:50 +01:00
Willy Tarreau	8f8e645066	[CLEANUP] shut warnings 'is*' macros from ctype.h on solaris Solaris visibly uses an array for is*, which returns warnings about the use of signed chars as indexes. Good opportunity to put casts everywhere.	2007-06-17 21:51:38 +02:00
Willy Tarreau	b17916e89b	[CLEANUP] add a few "const char *" where appropriate As suggested by Markus Elfring, a few "const char *" have replaced some "char *" declarations where a function is not expected to modify a value. It does not change the code but it helps detecting coding errors.	2006-10-15 15:17:57 +02:00
Willy Tarreau	e3ba5f0aaa	[CLEANUP] included common/version.h everywhere	2006-06-29 18:54:54 +02:00
Willy Tarreau	2dd0d4799e	[CLEANUP] renamed include/haproxy to include/common	2006-06-29 17:53:05 +02:00
Willy Tarreau	baaee00406	[BIGMOVE] exploded the monolithic haproxy.c file into multiple files. The files are now stored under : - include/haproxy for the generic includes - include/types.h for the structures needed within prototypes - include/proto.h for function prototypes and inline functions - src/*.c for the C files Most include files are now covered by LGPL. A last move still needs to be done to put inline functions under GPL and not LGPL. Version has been set to 1.3.0 in the code but some control still needs to be done before releasing.	2006-06-26 02:48:02 +02:00

37 Commits