haproxy/include/common
Christian Ruppert de898712a0 MEDIUM: regex: Use pcre_study always when PCRE is used, regardless of JIT
pcre_study() has been around long before JIT has been added. It also seems to
affect the performance in some cases (positive).

Below I've attached some test restults. The test is based on
http://sljit.sourceforge.net/regex_perf.html (see bottom). It has been modified
to just test pcre_study vs. no pcre_study. Note: This test does not try to
match specific header it's instead run over a larger text with more and less
complex patterns to make the differences more clear.

% ./runtest
'mark.txt' loaded. (Length: 19665221 bytes)
-----------------
Regex: 'Twain'
[pcre-nostudy] time:    14 ms (2388 matches)
[pcre-study] time:    21 ms (2388 matches)
-----------------
Regex: '^Twain'
[pcre-nostudy] time:   109 ms (100 matches)
[pcre-study] time:   109 ms (100 matches)
-----------------
Regex: 'Twain$'
[pcre-nostudy] time:    14 ms (127 matches)
[pcre-study] time:    16 ms (127 matches)
-----------------
Regex: 'Huck[a-zA-Z]+|Finn[a-zA-Z]+'
[pcre-nostudy] time:   695 ms (83 matches)
[pcre-study] time:    26 ms (83 matches)
-----------------
Regex: 'a[^x]{20}b'
[pcre-nostudy] time:    90 ms (12495 matches)
[pcre-study] time:    91 ms (12495 matches)
-----------------
Regex: 'Tom|Sawyer|Huckleberry|Finn'
[pcre-nostudy] time:  1236 ms (3015 matches)
[pcre-study] time:    34 ms (3015 matches)
-----------------
Regex: '.{0,3}(Tom|Sawyer|Huckleberry|Finn)'
[pcre-nostudy] time:  5696 ms (3015 matches)
[pcre-study] time:  5655 ms (3015 matches)
-----------------
Regex: '[a-zA-Z]+ing'
[pcre-nostudy] time:  1290 ms (95863 matches)
[pcre-study] time:  1167 ms (95863 matches)
-----------------
Regex: '^[a-zA-Z]{0,4}ing[^a-zA-Z]'
[pcre-nostudy] time:   136 ms (4507 matches)
[pcre-study] time:   134 ms (4507 matches)
-----------------
Regex: '[a-zA-Z]+ing$'
[pcre-nostudy] time:  1334 ms (5360 matches)
[pcre-study] time:  1214 ms (5360 matches)
-----------------
Regex: '^[a-zA-Z ]{5,}$'
[pcre-nostudy] time:   198 ms (26236 matches)
[pcre-study] time:   197 ms (26236 matches)
-----------------
Regex: '^.{16,20}$'
[pcre-nostudy] time:   173 ms (4902 matches)
[pcre-study] time:   175 ms (4902 matches)
-----------------
Regex: '([a-f](.[d-m].){0,2}[h-n]){2}'
[pcre-nostudy] time:  1242 ms (68621 matches)
[pcre-study] time:   690 ms (68621 matches)
-----------------
Regex: '([A-Za-z]awyer|[A-Za-z]inn)[^a-zA-Z]'
[pcre-nostudy] time:  1215 ms (675 matches)
[pcre-study] time:   952 ms (675 matches)
-----------------
Regex: '"[^"]{0,30}[?!\.]"'
[pcre-nostudy] time:    27 ms (5972 matches)
[pcre-study] time:    28 ms (5972 matches)
-----------------
Regex: 'Tom.{10,25}river|river.{10,25}Tom'
[pcre-nostudy] time:   705 ms (2 matches)
[pcre-study] time:    68 ms (2 matches)

In some cases it's more or less the same but when it's faster than by a huge margin.
It always depends on the pattern, the string(s) to match against etc.

Signed-off-by: Christian Ruppert <c.ruppert@babiel.com>
2014-11-18 13:26:18 +01:00
..
accept4.h BUILD: syscalls: remove improper inline statement in front of syscalls 2014-05-08 22:38:02 +02:00
appsession.h [MINOR] Make appsess{,ion}_refresh static 2011-06-25 21:07:01 +02:00
base64.h [MINOR] add encode/decode function for 30-bit integers from/to base64 2010-10-30 19:04:33 +02:00
buffer.h CLEANUP: buffers: remove unused function buffer_contig_space_with_res() 2014-04-24 17:19:22 +02:00
cfgparse.h MEDIUM: config: report it when tcp-request rules are misplaced 2014-09-16 15:43:24 +02:00
chunk.h MINOR: chunks: centralize the trash chunk allocation 2012-12-23 21:46:07 +01:00
compat.h BUILD: fix dependencies between config and compat.h 2014-07-15 19:09:36 +02:00
compiler.h CLEANUP: ebtree: clarify licence and update to 6.0.6 2011-12-02 17:09:49 +01:00
config.h BUILD: fix dependencies between config and compat.h 2014-07-15 19:09:36 +02:00
debug.h [MINOR] term_trace: add better instrumentations to trace the code 2008-08-16 14:55:08 +02:00
defaults.h MEDIUM: stick-table: make it easier to register extra data types 2014-07-15 19:14:52 +02:00
epoll.h MAJOR: polling: replace epoll with sepoll and remove sepoll 2012-11-11 20:53:30 +01:00
errors.h [MINOR] errors: provide new status codes for config parsing functions 2010-08-10 14:01:15 +02:00
hash.h BUG/MEDIUM: backend: Update hash to use unsigned int throughout 2014-07-08 22:00:21 +02:00
memory.h MINOR: cli: add the new "show pools" command 2014-01-28 16:50:35 +01:00
mini-clist.h BUG/MEDIUM: prevent gcc from moving empty keywords lists into BSS 2013-06-21 23:29:02 +02:00
rbtree.h
regex.h MEDIUM: regex: Use pcre_study always when PCRE is used, regardless of JIT 2014-11-18 13:26:18 +01:00
sessionhash.h
splice.h BUILD: syscalls: remove improper inline statement in front of syscalls 2014-05-08 22:38:02 +02:00
standard.h MINOR: sample: add "json" converter 2014-10-26 06:41:12 +01:00
syscall.h BUILD: syscalls: remove improper inline statement in front of syscalls 2014-05-08 22:38:02 +02:00
template.h
ticks.h [MEDIUM] scheduler: get rid of the 4 trees thanks and use ebtree v4.1 2009-03-21 10:25:14 +01:00
time.h BUILD: time: adapt the type of TV_ETERNITY to the local system 2013-12-13 09:22:23 +01:00
tools.h [MINOR] tools: add two macros MID_RANGE and MAX_RANGE 2011-03-28 15:55:43 +02:00
uri_auth.h [REORG] http: move the http-request rules to proto_http 2011-03-13 22:00:24 +01:00
version.h DOC: stop referencing the slow git repository in the README 2014-05-10 11:04:39 +02:00