Commit Graph

19 Commits

Author SHA1 Message Date
Thierry FOURNIER 26202760a4 MINOR: regex: Use native PCRE API.
The pcreposix layer (in the pcre projetc) execute strlen to find
thlength of the string. When we are using the function "regex_exex*2",
the length is used to add a final \0, when pcreposix is executed a
strlen is executed to compute the length.

If we are using a native PCRE api, the length is provided as an
argument, and these operations disappear.

This is useful because PCRE regex are more used than POSIC regex.
2014-06-18 15:14:00 +02:00
Thierry FOURNIER 09af0d6d43 MEDIUM: regex: replace all standard regex function by own functions
This patch remove all references of standard regex in haproxy. The last
remaining references are only in the regex.[ch] files.

In the file src/checks.c, the original function uses a "pmatch" array.
In fact this array is unused. This patch remove it.
2014-06-18 15:07:57 +02:00
Thierry FOURNIER b8f980cc19 MINOR: regex: Create JIT compatible function that return match strings
This patchs rename the "regex_exec" to "regex_exec2". It add a new
"regex_exec", "regex_exec_match" and "regex_exec_match2" function. This
function can match regex and return array containing matching parts.
Otherwise, this function use the compiled method (JIT or PCRE or POSIX).

JIT require a subject with length. PCREPOSIX and native POSIX regex
require a null terminted subject. The regex_exec* function are splited
in two version. The first version take a null terminated string, but it
execute strlen() on the subject if it is compiled with JIT. The second
version (terminated by "2") take the subject and the length. This
version adds a null character in the subject if it is compiled with
PCREPOSIX or native POSIX functions.

The documentation of posix regex and pcreposix says that the function
returns 0 if the string matche otherwise it returns REG_NOMATCH. The
REG_NOMATCH macro take the value 1 with posix regex and the value 17
with the pcreposix. The documentaion of the native pcre API (used with
JIT) returns a negative number if no match, otherwise, it returns 0 or a
positive number.

This patch fix also the return codes of the regex_exec* functions. Now,
these function returns true if the string match, otherwise it returns
false.
2014-06-18 15:07:50 +02:00
Willy Tarreau c874653bb4 BUILD: don't use type "uint" which is not portable
Dmitry Sivachenko reported that "uint" doesn't build on FreeBSD 10.
On Linux it's defined in sys/types.h and indicated as "old". Just
get rid of the very few occurrences.
2014-05-28 23:05:07 +02:00
Sasha Pachev c600204ddf BUG/MEDIUM: regex: fix risk of buffer overrun in exp_replace()
Currently exp_replace() (which is used in reqrep/reqirep) is
vulnerable to a buffer overrun. I have been able to reproduce it using
the attached configuration file and issuing the following command:

  wget -O - -S -q http://localhost:8000/`perl -e 'print "a"x4000'`/cookie.php

Str was being checked only in in while (str) and it was possible to
read past that when more than one character was being accessed in the
loop.

WT:
   Note that this bug is only marked MEDIUM because configurations
   capable of triggering this bug are very unlikely to exist at all due
   to the fact that most rewrites consist in static string additions
   that largely fit into the reserved area (8kB by default).

   This fix should also be backported to 1.4 and possibly even 1.3 since
   it seems to have been present since 1.1 or so.

Config:
-------

  global
        maxconn         500
        stats socket /tmp/haproxy.sock mode 600

  defaults
        timeout client      1000
        timeout connect      5000
        timeout server      5000
        retries         1
        option redispatch

  listen stats
        bind :8080
        mode http
        stats enable
        stats uri /stats
        stats show-legends

  listen  tcp_1
        bind :8000
        mode            http
        maxconn 400
        balance roundrobin
        reqrep ^([^\ :]*)\ /(.*)/(.*)\.php(.*) \1\ /\3.php?arg=\2\2\2\2\2\2\2\2\2\2\2\2\2\4
        server  srv1 127.0.0.1:9000 check port 9000 inter 1000 fall 1
        server  srv2 127.0.0.1:9001 check port 9001 inter 1000 fall 1
2014-05-27 14:36:06 +02:00
Thierry FOURNIER 0b6d15fdc8 MINOR: regex: The pointer regstr in the struc regex is no longer used.
The pointer <regstr> is only used to compare and identify the original
regex string with the patterns. Now the patterns have a reference map
containing this original string. It is useless to store this value two
times.
2014-03-17 18:06:08 +01:00
Thierry FOURNIER 39e258fcee MINOR: regex: Copy the original regex expression into string.
This is useful for the debug or for search regex in maps.
2013-12-12 15:43:34 +01:00
Thierry FOURNIER 799c042daa MINOR: regex: Change the struct containing regex
This change permits to remove the typedef. The original regex structs
are set in haproxy's struct.
2013-12-12 15:42:58 +01:00
Thierry FOURNIER ef37a66628 CLEANUP: The function "regex_exec" needs the string length but in many case they expect null terminated char.
If haproxy is compiled with the USE_PCRE_JIT option, the length of the
string is used. If it is compiled without this option the function doesn't
use the length and expects a null terminated string.

The prototype of the function is ambiguous, and depends on the
compilation option. The developer can think that the length is always
used, and many bugs can be created.

This patch makes sure that the length is used. The regex_exec function
adds the final '\0' if it is needed.
2013-10-23 12:19:51 +02:00
Thierry FOURNIER ed5a4aefae CLEANUP: regex: Create regex_comp function that compiles regex using compilation options
The current file "regex.h" define an abstraction for the regex. It
provides the same struct name and the same "regexec" function for the
3 regex types supported: standard libc, basic pcre and jit pcre.

The regex compilation function is not provided by this file. If the
developper wants to use regex, he must write regex compilation code
containing "#define *JIT*".

This patch provides a unique regex compilation function according to
the compilation options.

In addition, the "regex.h" file checks the presence of the "#define
PCRE_CONFIG_JIT" when "USE_PCRE_JIT" is enabled. If this flag is not
present, the pcre lib doesn't support JIT and "#error" is emitted.
2013-10-14 14:42:50 +02:00
Thierry FOURNIER e28f1ecf2b BUILD/MINOR: missing header file
In the header file "common/regex.h", the C keyword NULL is used. This
keyword is referenced into the header file "stdlib.h", but this is not
included.
2013-10-10 11:38:35 +02:00
Willy Tarreau dd11293e84 BUG/MINOR: acl: fix a double free during exit when using PCRE_JIT
When freeing ACL regex, we don't want to perform the free() in regex_free()
as it's already performed in free_pattern(). The double free only happens
when using PCRE_JIT when freeing everything during exit so it's harmless
but exhibits libc errors during a reload/restart.

Bug reported by Seri.
2013-05-08 18:14:18 +02:00
Hiroaki Nakamura 7035132349 MEDIUM: regex: Use PCRE JIT in acl
This is a patch for using PCRE JIT in acl.

I notice regex are used in other places, but they are more complicated
to modify to use PCRE APIs. So I focused to acl in the first try.

BTW, I made a simple benchmark program for PCRE JIT beforehand.
https://github.com/hnakamur/pcre-jit-benchmark

I read the manual for PCRE JIT
http://www.manpagez.com/man/3/pcrejit/

and wrote my benchmark program.
https://github.com/hnakamur/pcre-jit-benchmark/blob/master/test-pcre.c
2013-04-02 00:02:54 +02:00
Willy Tarreau f4f04125d4 [MINOR] prepare req_*/rsp_* to receive a condition
It will be very handy to be able to pass conditions to req_* and rsp_*.
For now, we just add the pointer to the condition in the affected
structs.
2010-01-28 18:10:50 +01:00
Willy Tarreau a496b6042b [MAJOR] merged the 'setbe' actions to switch the backend on a regex
Sin Yu's patch to permit to change the proxy from a regex was merged
with little changes :
  - req_cap/rsp_cap are not reassigned to the new proxy, they stay
    attached to the frontend

  - the actions have been renamed "reqsetbe" and "reqisetbe" for
    "set BackEnd".

  - the buffer is not reset after the switch, instead, the headers are
    parsed again by the backend

  - in Sin's patch, it was theorically possible to switch multiple times,
    but the switching track was lost, making it impossible to apply
    server responsesin the reverse order. Now switching is limited to
    1 action (separation between frontend and backend) but the filters
    remain.

Now it will be extremely easy to add other switching conditions, such
as host matching, URI matching, etc...

There's still a hard work to be done on the logs and stats.
2006-12-17 23:15:24 +01:00
Willy Tarreau b17916e89b [CLEANUP] add a few "const char *" where appropriate
As suggested by Markus Elfring, a few "const char *" have replaced
some "char *" declarations where a function is not expected to
modify a value. It does not change the code but it helps detecting
coding errors.
2006-10-15 15:17:57 +02:00
Willy Tarreau b8750a82a2 [MEDIUM] added the "reqtarpit" and "reqitarpit" features
It is now possible to tarpit connections based on regex matches.
The tarpit timeout is equal to the contimeout. A 500 server error
response is faked, and the logs show the status flags as "PT" which
indicate the connection has been tarpitted.
2006-09-03 09:56:00 +02:00
Willy Tarreau e3ba5f0aaa [CLEANUP] included common/version.h everywhere 2006-06-29 18:54:54 +02:00
Willy Tarreau 2dd0d4799e [CLEANUP] renamed include/haproxy to include/common 2006-06-29 17:53:05 +02:00