Commit Graph

665 Commits

Author SHA1 Message Date
Willy Tarreau
b145c78623 MINOR: channel: add the date of last read in the channel
We store the time stamp of last read in the channel in order to
be able to measure some bit rate and pause lengths. We only use
16 bits which were unused for this. We don't need more, as it
allows us to measure with a millisecond precision for up to 65s.
2014-02-12 11:45:59 +01:00
Willy Tarreau
8f39dcdc8d BUG/MINOR: channel: initialize xfer_small/xfer_large on new buffers
These ones are only reset during transfers. There is a low but non-null
risk that a first full read causes the previous value to be reused and
immediately to immediately set the CF_STREAMER flag. The impact is only
to increase earlier than expected the SSL record size and to use splice().

This bug was already present in 1.4, so a backport is possible.
2014-02-12 11:45:45 +01:00
Bhaskar Maddala
a20cb85eba MINOR: stats: Enhancement to stats page to provide information of last session time.
Summary:
Track and report last session time on the stats page for each server
in every backend, as well as the backend.

This attempts to address the requirement in the ROADMAP

  - add a last activity date for each server (req/resp) that will be
    displayed in the stats. It will be useful with soft stop.

The stats page reports this as time elapsed since last session. This
change does not adequately address the requirement for long running
session (websocket, RDP... etc).
2014-02-08 01:19:58 +01:00
Willy Tarreau
cc08d2c9ff MEDIUM: counters: stop relying on session flags at all
Till now, we had one flag per stick counter to indicate if it was
tracked in a backend or in a frontend. We just had to add another
flag per stick-counter to indicate if it relies on contents or just
connection. These flags are quite painful to maintain and tend to
easily conflict with other flags if their number is changed.

The correct solution consists in moving the flags to the stkctr struct
itself, but currently this struct is made of 2 pointers, so adding a
new entry there to store only two bits will cause at least 16 more bytes
to be eaten per counter due to alignment issues, and we definitely don't
want to waste tens to hundreds of bytes per session just for things that
most users don't use.

Since we only need to store two bits per counter, an intermediate
solution consists in replacing the entry pointer with a composite
value made of the original entry pointer and the two flags in the
2 unused lower bits. If later a need for other flags arises, we'll
have to store them in the struct.

A few inline functions have been added to abstract the retrieval
and assignment of the pointers and flags, resulting in very few
changes. That way there is no more dependence on the number of
stick-counters and their position in the session flags.
2014-01-28 23:34:45 +01:00
Willy Tarreau
f3338349ec BUG/MEDIUM: counters: flush content counters after each request
One year ago, commit 5d5b5d8 ("MEDIUM: proto_tcp: add support for tracking
L7 information") brought support for tracking L7 information in tcp-request
content rules. Two years earlier, commit 0a4838c ("[MEDIUM] session-counters:
correctly unbind the counters tracked by the backend") used to flush the
backend counters after processing a request.

While that earliest patch was correct at the time, it became wrong after
the second patch was merged. The code does what it says, but the concept
is flawed. "TCP request content" rules are evaluated for each HTTP request
over a single connection. So if such a rule in the frontend decides to
track any L7 information or to track L4 information when an L7 condition
matches, then it is applied to all requests over the same connection even
if they don't match. This means that a rule such as :

     tcp-request content track-sc0 src if { path /index.html }

will count one request for index.html, and another one for each of the
objects present on this page that are fetched over the same connection
which sent the initial matching request.

Worse, it is possible to make the code do stupid things by using multiple
counters:

     tcp-request content track-sc0 src if { path /foo }
     tcp-request content track-sc1 src if { path /bar }

Just sending two requests first, one with /foo, one with /bar, shows
twice the number of requests for all subsequent requests. Just because
both of them persist after the end of the request.

So the decision to flush backend-tracked counters was not the correct
one. In practice, what is important is to flush countent-based rules
since they are the ones evaluated for each request.

Doing so requires new flags in the session however, to keep track of
which stick-counter was tracked by what ruleset. A later change might
make this easier to maintain over time.

This bug is 1.5-specific, no backport to stable is needed.
2014-01-28 21:40:28 +01:00
Willy Tarreau
91b843d0d2 REORG: stats: move the stats socket states to dumpstats.c
There is no more usage of these values outside of dumpstats.c, and
they're easier to maintain there. Also replace the #defines with an
enum.
2014-01-28 16:28:21 +01:00
Willy Tarreau
71b734c307 MINOR: cli: add more information to the "show info" output
In addition to previous outputs, we also emit the cumulated number of
connections, the cumulated number of requests, the maximum allowed
SSL connection concurrency, the current number of SSL connections and
the cumulated number of SSL connections. This will help troubleshoot
systems which experience memory shortage due to SSL.
2014-01-28 15:19:44 +01:00
Willy Tarreau
25002d206b MINOR: polling: create function fd_compute_new_polled_status()
This function is used to compute the new polling state based on
the previous state. All pollers have to do this in their update
loop, so better centralize the logic for it.
2014-01-26 00:42:32 +01:00
Willy Tarreau
e852545594 MEDIUM: polling: centralize polled events processing
Currently, each poll loop handles the polled events the same way,
resulting in a lot of duplicated, complex code. Additionally, epoll
was the only one to handle newly created FDs immediately.

So instead, let's move that code to fd.c in a new function dedicated
to this task : fd_process_polled_events(). All pollers now use this
function.
2014-01-26 00:42:32 +01:00
Willy Tarreau
6c11bd2f89 OPTIM: raw-sock: don't speculate after a short read if polling is enabled
This is the reimplementation of the "done" action : when we experience
a short read, we're almost certain that we've exhausted the system's
buffers and that we'll meet an EAGAIN if we attempt to read again. If
the FD is not yet polled, the stream interface already takes care of
stopping the speculative read. When the FD is already being polled, we
have two options :
  - either we're running from a level-triggered poller, in which case
    we'd rather report that we've reached the end so that we don't
    speculate over the poller and let it report next time data are
    available ;

  - or we're running from an edge-triggered poller in which case we
    have no choice and have to see the EAGAIN to re-enable events.

At the moment we don't have any edge-triggered poller, so it's desirable
to avoid speculative I/O that we know will fail.

Note that this must not be ported to SSL since SSL hides the real
readiness of the file descriptor.

Thanks to this change, we observe no EAGAIN anymore during keep-alive
transfers, and failed recvfrom() are reduced by half in http-server-close
mode (the client-facing side is always being polled and the second recv
can be avoided). Doing so results in about 5% performance increase in
keep-alive mode. Similarly, we used to have up to about 1.6% of EAGAIN
on accept() (1/maxaccept), and these have completely disappeared under
high loads.
2014-01-26 00:42:32 +01:00
Willy Tarreau
aad69387ac CLEANUP: connection: use conn_xprt_ready() instead of checking the flag
It's easier and safer to rely on conn_xprt_ready() everywhere than to
check the flag itself. It will also simplify adding extra checks later
if needed. Some useless controls for !xprt have been removed, as the
XPRT_READY flag itself guarantees xprt is set.
2014-01-26 00:42:31 +01:00
Willy Tarreau
3c72872da1 CLEANUP: connection: use conn_ctrl_ready() instead of checking the flag
It's easier and safer to rely on conn_ctrl_ready() everywhere than to
check the flag itself. It will also simplify adding extra checks later
if needed. Some useless controls for !ctrl have been removed, as the
CTRL_READY flag itself guarantees ctrl is set.
2014-01-26 00:42:31 +01:00
Willy Tarreau
e1f50c4b02 MEDIUM: connection: remove conn_{data,sock}_poll_{recv,send}
We simply remove these functions and replace their calls with the
appropriate ones :

  - if we're in the data phase, we can simply report wait on the FD
  - if we're in the socket phase, we may also have to signal the
    desire to read/write on the socket because it might not be
    active yet.
2014-01-26 00:42:30 +01:00
Willy Tarreau
310987a038 MAJOR: connection: remove the CO_FL_WAIT_{RD,WR} flags
These flags were used to report the readiness of the file descriptor.
Now this readiness is directly checked at the file descriptor itself.
This removes the need for constantly synchronizing updates between the
file descriptor and the connection and ensures that all layers share
the same level of information.

For now, the readiness is updated in conn_{sock,data}_poll_* by directly
touching the file descriptor. This must move to the lower layers instead
so that these functions can disappear as well. In this state, the change
works but is incomplete. It's sensible enough to avoid making it more
complex.

Now the sock/data updates become much simpler because they just have to
enable/disable access to a file descriptor and not to care anymore about
its readiness.
2014-01-26 00:42:30 +01:00
Willy Tarreau
f817e9f473 MAJOR: polling: rework the whole polling system
This commit heavily changes the polling system in order to definitely
fix the frequent breakage of SSL which needs to remember the last
EAGAIN before deciding whether to poll or not. Now we have a state per
direction for each FD, as opposed to a previous and current state
previously. An FD can have up to 8 different states for each direction,
each of which being the result of a 3-bit combination. These 3 bits
indicate a wish to access the FD, the readiness of the FD and the
subscription of the FD to the polling system.

This means that it will now be possible to remember the state of a
file descriptor across disable/enable sequences that generally happen
during forwarding, where enabling reading on a previously disabled FD
would result in forgetting the EAGAIN flag it met last time.

Several new state manipulation functions have been introduced or
adapted :
  - fd_want_{recv,send} : enable receiving/sending on the FD regardless
    of its state (sets the ACTIVE flag) ;

  - fd_stop_{recv,send} : stop receiving/sending on the FD regardless
    of its state (clears the ACTIVE flag) ;

  - fd_cant_{recv,send} : report a failure to receive/send on the FD
    corresponding to EAGAIN (clears the READY flag) ;

  - fd_may_{recv,send}  : report the ability to receive/send on the FD
    as reported by poll() (sets the READY flag) ;

Some functions are used to report the current FD status :

  - fd_{recv,send}_active
  - fd_{recv,send}_ready
  - fd_{recv,send}_polled

Some functions were removed :
  - fd_ev_clr(), fd_ev_set(), fd_ev_rem(), fd_ev_wai()

The POLLHUP/POLLERR flags are now reported as ready so that the I/O layers
knows it can try to access the file descriptor to get this information.

In order to simplify the conditions to add/remove cache entries, a new
function fd_alloc_or_release_cache_entry() was created to be used from
pollers while scanning for updates.

The following pollers have been updated :

   ev_select() : done, built, tested on Linux 3.10
   ev_poll()   : done, built, tested on Linux 3.10
   ev_epoll()  : done, built, tested on Linux 3.10 & 3.13
   ev_kqueue() : done, built, tested on OpenBSD 5.2
2014-01-26 00:42:30 +01:00
Willy Tarreau
033cd9d78c REORG: polling: rename "fd_process_spec_events()" to "fd_process_cached_events()"
This is in order to be coherent with the rest.
2014-01-26 00:42:29 +01:00
Willy Tarreau
899d95757e REORG: polling: rename the cache allocation functions
- alloc_spec_entry() becomes fd_alloc_cache_entry()
- release_spec_entry() becomes fd_release_cache_entry()
2014-01-26 00:42:29 +01:00
Willy Tarreau
16f649c82c REORG: polling: rename "fd_spec" to "fd_cache"
So fd_spec was renamed "fd_cache" as it's becoming an event cache, and
fd_nbspec becomes fd_cache_num.
2014-01-26 00:42:29 +01:00
Willy Tarreau
15a4dec87e REORG: polling: rename "spec_e" to "state" and "spec_p" to "cache"
We're completely changing the way FDs will be polled. There will be no
more speculative I/O since we'll know the exact FD state, so these will
only be cached events.

First, let's fix a few field names which become confusing. "spec_e" was
used to store a speculative I/O event state. Now we'll store the whole
R/W states for the FD there. "spec_p" was used to store a speculative
I/O cache position. Now let's clearly call it "cache".
2014-01-26 00:42:29 +01:00
Willy Tarreau
69a41fa8a3 CLEANUP: polling: rename "spec_e" to "state"
We're completely changing the way FDs will be polled. First, let's fix
a few field names which become confusing. "spec_e" was used to store a
speculative I/O event state. Now we'll store the whole R/W states for
the FD there.
2014-01-26 00:42:28 +01:00
Willy Tarreau
45b34e8abc MINOR: connection: add more error codes to report connection errors
It is quite often that an connection error only reports "socket error" with
no more information. This is especially problematic with health checks where
many causes are possible, including resource exhaustion which do not lead to
a valid errno code. So let's add explicit codes to cover these cases.
2014-01-24 16:15:04 +01:00
Thierry FOURNIER
e7ba23633b MINOR: pattern: move functions for grouping pat_match_* and pat_parse_* and add documentation. 2014-01-21 22:14:21 +01:00
Willy Tarreau
2aefad5df7 MINOR: connection: add a new conn_drain() function
Till now there was no way to know from a connection if a previous
call to drain() had done any change. This function is used to drain
incoming data and to update the connection's flags at the same time.
It also correctly sets the polling flags on the connection if the
drain function indicates inability to receive. This function will
be used preferably over ctrl->drain() when a connection is used.
2014-01-20 22:27:16 +01:00
Willy Tarreau
8663105095 BUG: Revert "OPTIM: poll: restore polling after a poll/stop/want sequence"
This reverts commit 1208266356.

It randomly breaks SSL. What happens is that if the SSL response is
read at once by the SSL stack and is partially delivered to the buffer,
then there's no way to read the next parts because we wait for some
polling first.

So we'll fix this after the polling rework.
2014-01-13 11:34:42 +01:00
Willy Tarreau
9fe7aae6eb MINOR: checks: use an inline function for health_adjust()
This function is called twice per request, and does almost always nothing.
Better use an inline version to avoid entering it when we can.

About 0.5% additional performance was gained this way.
2013-12-31 23:47:37 +01:00
Willy Tarreau
9e5a3aacf4 MEDIUM: stream-int: make si_connect() return an established state when possible
si_connect() used to only return SI_ST_CON. But it already detect the
connection reuse and is the function which avoids calling connect().
So it already knows the connection is valid and reuse. Thus we make it
return SI_ST_EST when a connection is reused. This means that
connect_server() can return this state and sess_update_stream_int()
as well.

Thanks to this change, we don't need to leave process_session() in
SI_ST_CON state to immediately enter it again to switch to SI_ST_EST.
Implementing this removes one call to process_session() per request
in keep-alive mode. We're now at 2 calls per request, which is the
minimum (one for the request and another one for the response). The
number of calls to http_wait_for_response() has also dropped from 2
to one.

Tests indicate a performance gain of about 2.6% in request rate in
keep-alive mode. There should be no gain in http-server-close() since
we don't use this faster path.
2013-12-31 23:32:12 +01:00
Willy Tarreau
51437d2c59 Revert "MEDIUM: stats: add support for HTTP keep-alive on the stats page"
This reverts commit f3221f99ac.

Igor reported some very strange breakage of his stats page which is
clearly caused by the chunking, though I don't see at first glance
what could be wrong. Better revert it for now.
2013-12-29 00:43:40 +01:00
Willy Tarreau
f3221f99ac MEDIUM: stats: add support for HTTP keep-alive on the stats page
In theory the principle is simple as we just need to send HTTP chunks
if the client is 1.1 compatible. In practice it's harder because we
have to append a CR LF after each block of data and we're never sure
to have the room for this. In order not to have to deal with this, we
instead send the CR LF prior to each chunk size. The only issue is for
the first chunk and for this reason we avoid to send the empty header
line when using chunked encoding.
2013-12-28 21:40:16 +01:00
Willy Tarreau
1208266356 OPTIM: poll: restore polling after a poll/stop/want sequence
If a file descriptor is being polled, and it stopped (eg: buffer full
or end of response), then re-enabled, currently what happens is that
the polling is disabled, then the fd is enabled in speculative mode,
an I/O attempt is made, it loses (otherwise the FD would surely not
have been polled), and the polled is enabled again.

This is too bad, especially with HTTP keep-alive on the server side
where all operations are performed at once before going back to the
poll loop.

Now we improve the behaviour by ensuring that if an fd is still being
polled, when it's enabled after having been disabled, we re-enable the
polling. Doing so saves a number of syscalls and useless wakeups, and
results in a significant performance gain on HTTP keep-alive. A 11%
increase has been observed on the HTTP request rate in keep-alive
thanks to this.

It could be considered as a bug fix, but there was no harm with the
current behaviour, except extra syscalls.
2013-12-27 20:18:52 +01:00
Willy Tarreau
2737562e43 MEDIUM: stream-int: implement a very simplistic idle connection manager
Idle connections are not monitored right now. So if a server closes after
a response without advertising it, it won't be detected until a next
request wants to use the connection. This is a bit problematic because
it unnecessarily maintains file descriptors and sockets in an idle
state.

This patch implements a very simple idle connection manager for the stream
interface. It presents itself as an I/O callback. The HTTP engine enables
it when it recycles a connection. If a close or an error is detected on the
underlying socket, it tries to drain as much data as possible from the socket,
detect the close and responds with a close as well, then detaches from the
stream interface.
2013-12-17 00:00:28 +01:00
Willy Tarreau
e38feed966 BUG/MINOR: stats: correctly report throttle rate of low weight servers
The throttling of low weight servers (<16) could mistakenly be reported
as > 100% due to a rounding that was performed before a multiply by 100
instead of after. This was introduced in 1.5-dev20 when fixing a previous
reporting issue by commit d32c399 (MINOR: stats: report correct throttling
percentage for servers in slowstart).

It should be backported if the patch above is backported.
2013-12-16 18:04:57 +01:00
Willy Tarreau
b490b4e5ad MAJOR: stream-int: handle the connection reuse in si_connect()
This is the best place to reuse a connection. We centralize all
connection requests and we're at the best place to know exactly
what the current state of the underlying connection is. If the
connection is reused, we just enable polling for send() in order
to be able to emit the request.
2013-12-16 02:23:53 +01:00
Willy Tarreau
9471b8ced9 MEDIUM: connection: inform si_alloc_conn() whether existing conn is OK or not
When allocating a new connection, only the caller knows whether it's
acceptable to reuse the previous one or not. Let's pass this information
to si_alloc_conn() which will do the cleanup if the connection is not
acceptable.
2013-12-16 02:23:53 +01:00
Willy Tarreau
ad38acedaa MEDIUM: connection: centralize handling of nolinger in fd management
Right now we see many places doing their own setsockopt(SO_LINGER).
Better only do it just before the close() in fd_delete(). For this
we add a new flag on the file descriptor, indicating if it's safe or
not to linger. If not (eg: after a connect()), then the setsockopt()
call is automatically performed before a close().

The flag automatically turns to safe when receiving a read0.
2013-12-16 02:23:52 +01:00
Willy Tarreau
d02cdd23be MINOR: connection: add simple functions to report connection readiness
conn_xprt_ready() reports if the transport layer is ready.
conn_ctrl_ready() reports if the control layer is ready.

The stream interface uses si_conn_ready() to report that the
underlying connection is ready. This will be used for connection
reuse in keep-alive mode.
2013-12-16 02:23:52 +01:00
Willy Tarreau
975c1784c8 MINOR: sample: make sample_parse_expr() use memprintf() to report parse errors
Doing so ensures that we're consistent between all the functions in the whole
chain. This is important so that we can extract the argument parsing from this
function.
2013-12-12 23:16:54 +01:00
Thierry FOURNIER
c0e0d7b7cf MEDIUM: map: dynamic manipulation of maps
This patch adds map manipulation commands to the socket interface.

add map <map> <key> <value>
  Add the value <value> in the map <map>, at the entry corresponding to
  the key <key>. This command does not verify if the entry already
  exists.

clear map <map>
  Remove entries from the map <map>

del map <map> <key>
  Delete all the map entries corresponding to the <key> value in the map
  <map>.

set map <map> <key> <value>
  Modify the value corresponding to each key <key> in a map <map>. The
  new value is <value>.

show map [<map>]
  Dump info about map converters. Without argument, the list of all
  available maps are returned. If a <map> is specified, is content is
  dumped.
2013-12-12 15:58:30 +01:00
Thierry FOURNIER
01cdcd4a62 MINOR: pattern: add function to lookup a specific entry in pattern list
This is used to dynamically delete or update map entry.
2013-12-12 15:50:01 +01:00
Thierry FOURNIER
b0c0a0f940 MINOR: map: export parse output sample functions
This export is used to identify the parser used
2013-12-12 15:44:05 +01:00
Thierry FOURNIER
7609064fc3 MINOR: pattern: make the pattern matching function return a pointer to the matched element
This feature will be used by the CLI to look up keys.
2013-12-12 15:44:05 +01:00
Thierry FOURNIER
0b2fe4a5cd MINOR: pattern: add support for compiling patterns for lookups
With this patch, patterns can be compiled for two modes :
  - match
  - lookup

The match mode is used for example in ACLs or maps. The lookup mode
is used to lookup a key for pattern maintenance. For example, looking
up a network is different from looking up one address belonging to
this network.

A special case is made for regex. In lookup mode they return the input
regex string and do not compile the regex.
2013-12-12 15:44:02 +01:00
Thierry FOURNIER
7148ce6ef4 MEDIUM: pattern: Extract the index process from the pat_parse_*() functions
Now, the pat_parse_*() functions parses the incoming data. The input
"pattern" struct can be preallocated. If the parser needs to add some
buffers, it allocates memory.

The function pattern_register() runs the call to the parser, process
the key indexation and associate the "sample_storage" used by maps.
2013-12-12 15:42:11 +01:00
Thierry FOURNIER
e3ded59706 MEDIUM: acl: Last patch change the output type
This patch remove the compatibility check from the input type and the
match method. Now, it checks if a casts from the input type to output
type exists and the pattern_exec_match() function apply casts before
each pattern matching.
2013-12-12 15:42:11 +01:00
Thierry FOURNIER
cc0e0b3dbb MINOR: pattern: Each pattern sets the expected input type
This is used later for increasing the compability with incoming
sample types. When multiple compatible types are supported, one
is arbitrarily used (eg: UINT).
2013-12-12 11:07:33 +01:00
Thierry FOURNIER
2d4771ba17 MINOR: map: export map_get_reference() function
This function is used to identify map with his reference into the CLI
functions.
2013-12-11 22:05:03 +01:00
Willy Tarreau
3770f23a3a MINOR: http: switch the http state to an enum
This reduces its size which is not reused by anything else. However it
will significantly improve the debugger's output since we'll now get
real state values.

The default case had to be enabled in the parsers because gcc tries
to optimize the switch/case and noticed some values were missing from
the enums and emitted a warning.
2013-12-09 16:06:22 +01:00
Willy Tarreau
4171e9eef0 MEDIUM: stats: delay appctx initialization
Now that the session handler can automatically initialize the appctx,
let's not do it in stats_accept() anymore.
2013-12-09 15:40:23 +01:00
Willy Tarreau
0a23bcb8be MAJOR: stream-interface: dynamically allocate the applet context
From now on, a call to stream_int_register_handler() causes a call
to si_alloc_appctx() and returns an initialized appctx for the
current stream interface. If one was previously allocated, it is
released. If the stream interface was attached to a connection, it
is released as well.

The appctx are allocated from the same pools as the connections, because
they're substantially smaller in size, and we can't have both a connection
and an appctx on an interface at any moment.

In case of memory shortage, the call may return NULL, which is already
handled by all consumers of stream_int_register_handler().

The field appctx was removed from the stream interface since we only
rely on the endpoint now. On 32-bit, the stream_interface size went down
from 108 to 44 bytes. On 64-bit, it went down from 144 to 64 bytes. This
represents a memory saving of 160 bytes per session.

It seems that a later improvement could be to move the call to
stream_int_register_handler() to session.c for most cases.
2013-12-09 15:40:23 +01:00
Willy Tarreau
1fbe1c9ec8 MEDIUM: stream-int: return the allocated appctx in stream_int_register_handler()
The task returned by stream_int_register_handler() is never used, however we
always need to access the appctx afterwards. So make it return the appctx
instead. We already plan for it to fail, which is the reason for the addition
of a few tests and the possibility for the HTTP analyser to return a status
code 500.
2013-12-09 15:40:23 +01:00
Willy Tarreau
7b4b499fde MEDIUM: stream-int: replace occurrences of si->appctx with si_appctx()
We're about to remove si->appctx, so first let's replace all occurrences
of its usage with a dynamic extract from si->end. A lot of code was changed
by search-n-replace, but the behaviour was intentionally not altered.

The code surrounding calls to stream_int_register_handler() was slightly
changed since we can only use si->end *after* the registration.
2013-12-09 15:40:23 +01:00