Commit Graph

4577 Commits

Author SHA1 Message Date
Willy Tarreau
5ecb069c58 MEDIUM: stream: move all the session-specific stuff of stream_accept() earlier
Since the tcp-request connection rules don't need the stream anymore, we
can safely move the session-specific stuff earlier and prepare for a split
of session and stream initialization. Some work remains to be done.
2015-04-06 11:37:31 +02:00
Willy Tarreau
1df0cc6466 MEDIUM: stream: don't call stream_store_counters() in kill_mini_session() nor session_accept()
This one is not needed anymore since we cannot track the stream counters
prior to reaching these locations. Only session counters may be tracked
and they're properly committed during session_free().
2015-04-06 11:37:31 +02:00
Willy Tarreau
e73ef85a63 MAJOR: tcp: make tcp_exec_req_rules() only rely on the session
It passes a NULL wherever a stream was needed (acl_exec_cond() and
action_ptr mainly). It can still track the connection rate correctly
and block based on ACLs.
2015-04-06 11:37:31 +02:00
Willy Tarreau
70f454e8fa MEDIUM: proto_tcp: track the session's counters in the connection ruleset
The tcp-request connection ruleset now only tracks session counters and
not stream counters. Thus it does not need access to the stream anymore.
2015-04-06 11:37:31 +02:00
Willy Tarreau
bb2ef12a60 MEDIUM: session: update the session's stick counters upon session_free()
Whenever session_free() is called, any possible stick counter stored in
the session will be synchronized.
2015-04-06 11:37:31 +02:00
Willy Tarreau
8b7f8688ee MEDIUM: streams: support looking up stkctr in the session
In order to support sessions tracking counters, we first ensure that there
is no overlap between streams' stkctr and sessions', and we allow an
automatic lookup into the session's counters when the stream doesn't
have a counter or when the stream doesn't exist during an access via
a sample fetch. The functions used to update the stream counters only
update them and not the session counters however.
2015-04-06 11:37:31 +02:00
Willy Tarreau
7698c9080a REORG: stktable: move the stkctr_* functions from stream to sticktable
These ones are not stream-specific at all and will be needed outside of
stream, so let's move them to stick_tables where struct stkctr is defined.
2015-04-06 11:37:30 +02:00
Willy Tarreau
b2bf8331fb MINOR: session: add stick counters to the struct session
The stick counters in the session will be used for everything not related
to contents, hence the connections / concurrent sessions / etc. They will
be usable by "tcp-request connection" rules even without a stream. For now
they're just allocated and initialized.
2015-04-06 11:37:30 +02:00
Willy Tarreau
11c3624c32 MINOR: session: implement session_free() and use it everywhere
We want to call this one everywhere we have to kill a session so
that future parts we move to the session can be released from there.
2015-04-06 11:37:30 +02:00
Willy Tarreau
8d2eca73eb MINOR: session: don't rely on s->logs.logwait in embryonic sessions
The embryonic session now only stores the task (to expire the session)
and the stick counters.
2015-04-06 11:37:30 +02:00
Willy Tarreau
7ea671b914 MINOR: session: store the session's accept date
Doing so ensures we don't need to use the stream anymore to prepare the
log information to report a failed handshake on an embryonic session.
Thus, prepare_mini_sess_log_prefix() now takes a session in argument.
2015-04-06 11:37:30 +02:00
Willy Tarreau
15b5e14faa MINOR: stream: move session initialization before the stream's
In an effort to completely separate stream from session, this patch
further separates the two by initializing the session before the stream.
2015-04-06 11:37:29 +02:00
Willy Tarreau
1f52bb27ec CLEANUP: stream: don't set ->target to the incoming connection anymore
Now that we have sess->origin to carry that information along, we don't
need to put that into strm->target anymore, so we remove one dependence
on the stream in embryonic connections.
2015-04-06 11:37:29 +02:00
Willy Tarreau
d0d8da989b MINOR: stream: provide a few helpers to retrieve frontend, listener and origin
Expressions are quite long when using strm_sess(strm)->whatever, so let's
provide a few helpers : strm_fe(), strm_li(), strm_orig().
2015-04-06 11:37:29 +02:00
Willy Tarreau
192252e2d8 MAJOR: sample: pass a pointer to the session to each sample fetch function
Many such function need a session, and till now they used to dereference
the stream. Once we remove the stream from the embryonic session, this
will not be possible anymore.

So as of now, sample fetch functions will be called with this :

   - sess = NULL,  strm = NULL                     : never
   - sess = valid, strm = NULL                     : tcp-req connection
   - sess = valid, strm = valid, strm->txn = NULL  : tcp-req content
   - sess = valid, strm = valid, strm->txn = valid : http-req / http-res
2015-04-06 11:37:25 +02:00
Willy Tarreau
fdcd2ae1f8 CLEANUP: lua: don't pass http_txn anymore to hlua_request_act_wrapper()
It doesn't use it anymore at all.
2015-04-06 11:35:53 +02:00
Willy Tarreau
987e3fb868 MEDIUM: http: remove the now useless http_txn from {req/res} rules
The registerable http_req_rules / http_res_rules used to require a
struct http_txn at the end. It's redundant with struct stream and
propagates very deep into some parts (ie: it was the reason for lua
requiring l7). Let's remove it now.
2015-04-06 11:35:53 +02:00
Willy Tarreau
f3bf3050a1 CLEANUP: lua: remove unused hlua_smp->l7 and hlua_txn->l7
Since last commit, we don't retrieve the HTTP transaction from there
anymore, so these entries can go.
2015-04-06 11:35:53 +02:00
Willy Tarreau
15e91e1b36 MAJOR: sample: don't pass l7 anymore to sample fetch functions
All of them can now retrieve the HTTP transaction *if it exists* from
the stream and be sure to get NULL there when called with an embryonic
session.

The patch is a bit large because many locations were touched (all fetch
functions had to have their prototype adjusted). The opportunity was
taken to also uniformize the call names (the stream is now always "strm"
instead of "l4") and to fix indent where it was broken. This way when
we later introduce the session here there will be less confusion.
2015-04-06 11:35:53 +02:00
Willy Tarreau
eee5b51248 MAJOR: http: move http_txn out of struct stream
Now this one is dynamically allocated. It means that 280 bytes of memory
are saved per TCP stream, but more importantly that it will become
possible to remove the l7 pointer from fetches and converters since
it will be deduced from the stream and will support being null.

A lot of care was taken because it's easy to forget a test somewhere,
and the previous code used to always trust s->txn for being valid, but
all places seem to have been visited.

All HTTP fetch functions check the txn first so we shouldn't have any
issue there even when called from TCP. When branching from a TCP frontend
to an HTTP backend, the txn is properly allocated at the same time as the
hdr_idx.
2015-04-06 11:35:52 +02:00
Willy Tarreau
63986c72c8 MINOR: http: create a dedicated pool for http_txn
This one will not necessarily be allocated for each stream, and we want
to use the fact that it equals null to know it's not present so that we
can always deduce its presence from the stream pointer.

This commit only creates the new pool.
2015-04-06 11:35:52 +02:00
Willy Tarreau
cb7dd015be MEDIUM: http: move header captures from http_txn to struct stream
The header captures are now general purpose captures since tcp rules
can use them to capture various contents. That removes a dependency
on http_txn that appeared in some sample fetch functions and in the
order by which captures and http_txn were allocated.

Interestingly the reset of the header captures were done at too many
places as http_init_txn() used to do it while it was done previously
in every call place.
2015-04-06 11:35:52 +02:00
Willy Tarreau
53c9b4db41 CLEANUP: sample: remove useless tests in fetch functions for l4 != NULL
The stream may never be null given that all these functions are called
from sample_process(). Let's remove this now confusing test which
sometimes happens after a dereference was already done.
2015-04-06 11:35:52 +02:00
Willy Tarreau
9ad7bd48d2 MEDIUM: session: use the pointer to the origin instead of s->si[0].end
When s->si[0].end was dereferenced as a connection or anything in
order to retrieve information about the originating session, we'll
now use sess->origin instead so that when we have to chain multiple
streams in HTTP/2, we'll keep accessing the same origin.
2015-04-06 11:34:29 +02:00
Willy Tarreau
40606ab976 MINOR: session: add a pointer to the session's origin
A session's origin is the entity that was responsible for creating
the session. It can be an applet or a connection for now.
2015-04-06 11:23:58 +02:00
Willy Tarreau
e36cbcb3b0 MEDIUM: stream: move the frontend's pointer to the session
Just like for the listener, the frontend is session-wide so let's move
it to the session. There are a lot of places which were changed but the
changes are minimal in fact.
2015-04-06 11:23:58 +02:00
Willy Tarreau
fb0afa77c9 MEDIUM: stream: move the listener's pointer to the session
The listener is session-specific, move it there.
2015-04-06 11:23:57 +02:00
Willy Tarreau
feb764040d MEDIUM: stream: allocate the session when a stream is created
This is where we'll put some session-wide information.
2015-04-06 11:23:57 +02:00
Willy Tarreau
b1ec8c4a59 MINOR: session: start to reintroduce struct session
There is now a pointer to the session in the stream, which is NULL
for now. The session pool is created as well. Some parts will move
from the stream to the session now.
2015-04-06 11:23:57 +02:00
Willy Tarreau
e7dff02dd4 REORG/MEDIUM: stream: rename stream flags from SN_* to SF_*
This is in order to keep things consistent.
2015-04-06 11:23:57 +02:00
Willy Tarreau
87b09668be REORG/MAJOR: session: rename the "session" entity to "stream"
With HTTP/2, we'll have to support multiplexed streams. A stream is in
fact the largest part of what we currently call a session, it has buffers,
logs, etc.

In order to catch any error, this commit removes any reference to the
struct session and tries to rename most "session" occurrences in function
names to "stream" and "sess" to "strm" when that's related to a session.

The files stream.{c,h} were added and session.{c,h} removed.

The session will be reintroduced later and a few parts of the stream
will progressively be moved overthere. It will more or less contain
only what we need in an embryonic session.

Sample fetch functions and converters will have to change a bit so
that they'll use an L5 (session) instead of what's currently called
"L4" which is in fact L6 for now.

Once all changes are completed, we should see approximately this :

   L7 - http_txn
   L6 - stream
   L5 - session
   L4 - connection | applet

There will be at most one http_txn per stream, and a same session will
possibly be referenced by multiple streams. A connection will point to
a session and to a stream. The session will hold all the information
we need to keep even when we don't yet have a stream.

Some more cleanup is needed because some code was already far from
being clean. The server queue management still refers to sessions at
many places while comments talk about connections. This will have to
be cleaned up once we have a server-side connection pool manager.
Stream flags "SN_*" still need to be renamed, it doesn't seem like
any of them will need to move to the session.
2015-04-06 11:23:56 +02:00
Willy Tarreau
7073c471a1 CLEANUP: lua: get rid of the last two "*hs" for hlua_smp
The two last occurrences were in hlua_fetches_new() and hlua_converters_new().
Now they're called hsmp as in other places.
2015-04-06 11:23:23 +02:00
Willy Tarreau
da5f10827b CLEANUP: lua: rename variable "sc" for struct hlua_smp
It's unclear where this name comes from, but better rename it now to "hsmp"
to be in line with the rest of the code.
2015-04-06 11:23:23 +02:00
Willy Tarreau
bcb39cc339 CLEANUP: lua: rename last occurrences of "*s" to "*htxn" for hlua_txn
These ones were found in the actions to set the query/path/method/uri.
Where it's used, "s" makes one think about session or something like
this, especially when mixed with http_txn.
2015-04-06 11:23:23 +02:00
Willy Tarreau
9a8ad86648 CLEANUP: lua: get rid of the last "*ht" for struct hlua_txn.
All other ones are called "htxn", call it similarly in hlua_http_new(),
this will make copy-paste safer.
2015-04-06 11:14:06 +02:00
Willy Tarreau
b2ccb5644b CLEANUP: hlua: stop using variable name "s" alternately for hlua_txn and hlua_smp
hlua_run_sample_fetch() uses "struct hlua_smp *s" which starts to become
confusing when "s->s" is used, then hlua_txn_close() uses this for struct
hlua_txn with the same "s->s" everywhere. Let's uniformize everything with
htxn and hsmp as in other places.
2015-04-06 11:11:15 +02:00
Willy Tarreau
de49138b9c CLEANUP: lua: fix confusing local variable naming in hlua_txn_new()
Struct hlua_txn is called "htxn" or "ht" everywhere, while here it's
called "hs" which is the name used everywhere for struct "hlua_smp".
Such confusion participate to the dangers of copy-pasting code, so
let's fix the name here.
2015-04-06 11:04:28 +02:00
Willy Tarreau
07081fe6b7 CLEANUP: lua: remove hard-coded sizeof() in object creations and mallocs
Last bug was an example of a side effect of abuse of copy-paste, but
there are other places at risk, so better fix all occurrences of sizeof
to really reference the object size in order to limit the risks in the
future.
2015-04-06 10:59:20 +02:00
Willy Tarreau
08ef3d055d BUG/MAJOR: lua: use correct object size when initializing a new converter
In hlua_converters_new(), we used to allocate the size of an hlua_txn
instead of hlua_smp, resulting in random crashes with one integer being
randomly overwritten at the end, even when no converter is being used.
2015-04-06 10:54:36 +02:00
Willy Tarreau
482564f309 CLEANUP: lua: remove the unused hlua_sleep memory pool
Commit d44731f ("MEDIUM: lua: change the sleep function core") removed
the use for this pool but forgot to remove the pool which is still
created.
2015-04-06 10:43:23 +02:00
Willy Tarreau
cb703b0352 BUG/MAJOR: http: null-terminate the http actions keywords list
Commit a0dc23f ("MEDIUM: http: implement http-request set-{method,path,query,uri}")
forgot to null-terminate the list, resulting in crashes when these actions
are used if the platform doesn't pad the struct with nulls.

Thanks to Gunay Arslan for reporting a detailed trace showing the
origin of this bug.

No backport to 1.5 is needed.
2015-04-03 09:58:02 +02:00
Willy Tarreau
601a4d1741 BUG/MEDIUM: http: hdr_cnt would not count any header when called without name
It's documented that these sample fetch functions should count all headers
and/or all values when called with no name but in practice it's not what is
being done as a missing name causes an immediate return and an absence of
result.

This bug is present in 1.5 as well and must be backported.
2015-04-01 19:16:09 +02:00
Willy Tarreau
418b8c0c41 MAJOR: compression: integrate support for libslz
This library is designed to emit a zlib-compatible stream with no
memory usage and to favor resource savings over compression ratio.
While zlib requires 256 kB of RAM per compression context (and can only
support 4000 connections per GB of RAM), the stateless compression
offered by libslz does not need to retain buffers between subsequent
calls. In theory this slightly reduces the compression ratio but in
practice it does not have that much of an effect since the zlib
window is limited to 32kB.

Libslz is available at :

      http://git.1wt.eu/web?p=libslz.git

It was designed for web compression and provides a lot of savings
over zlib in haproxy. Here are the preliminary results on a single
core of a core2-quad 3.0 GHz in 32-bit for only 300 concurrent
sessions visiting the home page of www.haproxy.org (76 kB) with
the default 16kB buffers :

          BW In      BW Out     BW Saved   Ratio   memory VSZ/RSS
zlib      237 Mbps    92 Mbps   145 Mbps   2.58     84M /  69M
slz       733 Mbps   380 Mbps   353 Mbps   1.93    5.9M / 4.2M

So while the compression ratio is lower, the bandwidth savings are
much more important due to the significantly lower compression cost
which allows to consume even more data from the servers. In the
example above, zlib became the bottleneck at 24% of the output
bandwidth. Also the difference in memory usage is obvious.

More tests run on a single core of a core i5-3320M, with 500 concurrent
users and the default 16kB buffers :

At 100% CPU (no limit) :
          BW In      BW Out     BW Saved   Ratio   memory VSZ/RSS  hits/s
zlib      480 Mbps   188 Mbps   292 Mbps   2.55     130M / 101M     744
slz      1700 Mbps   810 Mbps   890 Mbps   2.10    23.7M / 9.7M    2382

At 85% CPU (limited) :
          BW In      BW Out     BW Saved   Ratio   memory VSZ/RSS  hits/s
zlib     1240 Mbps   976 Mbps   264 Mbps   1.27     130M / 100M    1738
slz      1600 Mbps   976 Mbps   624 Mbps   1.64    23.7M / 9.7M    2210

The most important benefit really happens when the CPU usage is
limited by "maxcompcpuusage" or the BW limited by "maxcomprate" :
in order to preserve resources, haproxy throttles the compression
ratio until usage is within limits. Since slz is much cheaper, the
average compression ratio is much higher and the input bandwidth
is quite higher for one Gbps output.

Other tests made with some reference files :

                           BW In     BW Out    BW Saved  Ratio  hits/s
daniels.html       zlib  1320 Mbps  163 Mbps  1157 Mbps   8.10    1925
                   slz   3600 Mbps  580 Mbps  3020 Mbps   6.20    5300

tv.com/listing     zlib   980 Mbps  124 Mbps   856 Mbps   7.90     310
                   slz   3300 Mbps  553 Mbps  2747 Mbps   5.97    1100

jquery.min.js      zlib   430 Mbps  180 Mbps   250 Mbps   2.39     547
                   slz   1470 Mbps  764 Mbps   706 Mbps   1.92    1815

bootstrap.min.css  zlib   790 Mbps  165 Mbps   625 Mbps   4.79     777
                   slz   2450 Mbps  650 Mbps  1800 Mbps   3.77    2400

So on top of saving a lot of memory, slz is constantly 2.5-3.5 times
faster than zlib and results in providing more savings for a fixed CPU
usage. For links smaller than 100 Mbps, zlib still provides a better
compression ratio, at the expense of a much higher CPU usage.

Larger input files provide slightly higher bandwidth for both libs, at
the expense of a bit more memory usage for zlib (it converges to 256kB
per connection).
2015-03-29 03:32:06 +02:00
Willy Tarreau
7b21877888 CLEANUP: compression: remove unused reset functions
It's unclear what purpose these functions used to server, however they
are not used anywhere, one good reason to remove them.
2015-03-28 22:08:25 +01:00
Willy Tarreau
9787efa97c MEDIUM: compression: split deflate_flush() into flush and finish
This function used to take a zlib-specific flag as argument to indicate
whether a buffer flush or end of contents was met, let's split it in two
so that we don't depend on zlib anymore.
2015-03-28 19:17:31 +01:00
Willy Tarreau
c91840aa33 MEDIUM: compression: add new "raw-deflate" compression algorithm
This algorithm is exactly the same as "deflate" without the zlib wrapper,
and used as an alternative when the browser wants "deflate". All major
browsers understand it and despite violating the standards, it is known
to work better than "deflate", at least on MSIE and some versions of
Safari. Do not use it in conjunction with "deflate", use either one or
the other since both react to the same Accept-Encoding token. Note that
the lack of Adler32 checksum makes it slightly faster.
2015-03-28 17:01:30 +01:00
Willy Tarreau
615105e7e8 MEDIUM: compression: add a distinction between UA- and config- algorithms
Thanks to MSIE/IIS, the "deflate" name is ambigous. According to the RFC
it's a zlib-wrapped deflate stream, but IIS used to send only a raw deflate
stream, which is the only format MSIE understands for "deflate". The other
widely used browsers do support both formats. For this reason some people
prefer to emit a raw deflate stream on "deflate" to serve more users even
it that means violating the standards. Haproxy only follows the standard,
so they cannot do this.

This patch makes it possible to have one algorithm name in the configuration
and another one in the protocol. This will make it possible to have a new
configuration token to add a different algorithm so that users can decide if
they want a raw deflate or the standard one.
2015-03-28 16:46:38 +01:00
Willy Tarreau
9f640a1eab CLEANUP: compression: statify all algo-specific functions
There's no reason for exporting identity_* nor deflate_*, they're only
used in the same file. Mark them static, it will make it easier to add
other algorithms.
2015-03-28 15:46:00 +01:00
Willy Tarreau
e7e49a8d0b MINOR: http: check the algo name "identity" instead of the function pointer
Next patch will statity all compression functions, so let's stop relying
on a function pointer comparison and use the algo name instead.
2015-03-28 15:43:17 +01:00
Willy Tarreau
2aee2215c9 BUG/MINOR: compression: consider the expansion factor in init
When checking if the buffer is large enough, we used to rely on a fixed
size that was "apparently" enough. We need to consider the expansion
factor of deflate-encoded streams instead, which is of 5 bytes per 32kB.
The previous value was OK till 128kB buffers but became wrong past that.
It's totally harmless since we always keep the reserve when compressiong,
so there's 1kB or so available, which is enough for buffers as large as
6.5 MB, but better fix the check anyway.

This fix could be backported into 1.5 since compression was added there.
2015-03-28 12:23:35 +01:00