Commit Graph

1694 Commits

Author SHA1 Message Date
Willy Tarreau
ed72d82827 MEDIUM: time: measure the time stolen by other threads
The purpose is to detect if threads or processes are competing for the
same CPU. This can happen when threads are incorrectly bound, or after a
reload if the previous process still has significant activity. With
threads this situation is problematic because a preempted thread holding
a lock will block other ones waiting for this lock to be released.

A first attempt consisted in measuring the cumulated lost time more
precisely but the system's scheduler is smart enough to try to limit the
thread preemption rate by mostly context switching during poll()'s blank
periods, so most of the time lost is not seen. In essence this is good
because it means a thread is not preempted with a lock held, and even
regarding the rendez-vous point it cannot prevent the other ones from
making progress. But still it happens tens to hundreds of times per
second that a thread might be preempted, so it's still possible to detect
that the situation is happening, thus it's interesting to measure and
report its frequency.

Each time we enter the poller, we check the CPU time spent working and
see if we've lost time doing something else. To limit false positives,
we're only interested in losses of 500 microseconds or more (i.e. half
a clock tick on a 1 kHz system). If so, it indicates that some time was
stolen by another thread or process. Note that we purposely store some
sub-millisecond counters so that under heavy traffic with a 1 kHz clock,
it's still possible to measure something without being subject to the
risk of rounding errors (i.e. if exactly 1 ms is stolen it's possible
that the time difference could often be slightly lower).

This counter of lost CPU time is reported in "show activity"
in numbers of milliseconds of CPU lost per second, per 15s, and total
over the process' life. By definition, the per-second counter cannot
report values larger than 1000 per thread per second and the 15s one
will be limited to 15000/s in the worst case, but it's possible that
peak values exceed such thresholds after long pauses.
2018-10-19 08:51:59 +02:00
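
The detection described in the commit above can be pictured with a minimal sketch. It assumes the caller supplies wall-clock and per-thread CPU timestamps in microseconds; the struct, the function names and the way timestamps are obtained are hypothetical and are not taken from the HAProxy sources. Only the 500 microsecond threshold and the sub-millisecond accounting come from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-thread bookkeeping, for illustration only. */
struct steal_stats {
    uint64_t work_start_wall_us; /* wall clock when the thread left the poller */
    uint64_t work_start_cpu_us;  /* thread CPU time when it left the poller */
    uint64_t stolen_us;          /* cumulated stolen time, kept in microseconds */
};

/* Losses below half a tick of a 1 kHz clock are ignored. */
#define STEAL_THRESHOLD_US 500

/* Record the start of a working period (the thread just left the poller). */
void steal_leave_poller(struct steal_stats *st, uint64_t wall_us, uint64_t cpu_us)
{
    assert(st != NULL);
    st->work_start_wall_us = wall_us;
    st->work_start_cpu_us = cpu_us;
}

/* Called when entering the poller again. Compares the elapsed wall-clock
 * time with the CPU time actually consumed while working; the difference
 * is time spent doing something else. Returns the amount accounted as
 * stolen for this working period (0 if below the threshold).
 */
uint64_t steal_enter_poller(struct steal_stats *st, uint64_t wall_us, uint64_t cpu_us)
{
    uint64_t wall_delta, cpu_delta, lost;

    assert(st != NULL);
    wall_delta = wall_us - st->work_start_wall_us;
    cpu_delta = cpu_us - st->work_start_cpu_us;
    lost = wall_delta > cpu_delta ? wall_delta - cpu_delta : 0;

    if (lost < STEAL_THRESHOLD_US)
        return 0;
    st->stolen_us += lost;
    return lost;
}
```

Keeping the counter in microseconds rather than milliseconds mirrors the commit's remark about rounding: a loss of slightly less than 1 ms is still counted instead of being truncated to zero.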
Bertrand Jacquin
d5e4de8e5f DOC: Fix a few typos
These are mostly spelling mistakes; some of them might be candidates for
backporting as well.
2018-10-15 19:38:15 +02:00
Christopher Faulet
25da9e34f1 MINOR: h1: Add the flag H1_MF_NO_PHDR to not add pseudo-headers during parsing
Some pseudo-headers are added during the headers parsing, mainly for the mux
H2. With this flag, it is possible to not add them. This avoids some tedious
filtering in the mux H1.
2018-10-12 16:15:18 +02:00
Christopher Faulet
1dc2b49556 MINOR: h1: Change the union h1_sl to use indirect strings to store infos
Instead of using offsets relative to the parsed buffer to store start line
info, we now use indirect strings. As a result, this info remains valid only as
long as the original buffer remains untouched. This is not a real problem because
this union is used during the parsing and never stored for later use.
2018-10-12 16:14:57 +02:00
Christopher Faulet
08088e77c6 MINOR: conn-stream: Add CL_FL_NOT_FIRST flag
This flag will be used by multiplexers to warn a conn-stream (and, by
transitivity, a stream) it is not the first one created by the mux. It will help
mux H1 to handle keep-alive connections.
2018-10-12 16:09:26 +02:00
Christopher Faulet
315b39c391 MINOR: http: Use same flag for httpclose and forceclose options
Since keep-alive mode is the default mode, the passive close has disappeared,
and in the code, httpclose and forceclose options are handled the same way:
connections with the client and the server are closed as soon as the request and
the response are received, and the missing "Connection: close" header is added in
each direction.

So to make things clearer, forceclose is now an alias for httpclose. And
httpclose is explicitly an active close. So the old passive close does not exist
anymore. Internally, the flag PR_O_HTTP_PCL has been removed and PR_O_HTTP_FCL
has been replaced by PR_O_HTTP_CLO. In HTTP analyzers, the checks done to find
the right mode to use, depending on proxies options and "Connection: " header
value, have been simplified.

This should only be a cleanup and no changes are expected.
2018-10-12 16:07:56 +02:00
Olivier Houchard
fa8aa867b9 MEDIUM: connections: Change struct wait_list to wait_event.
When subscribing, we don't need to provide a list element; only the h2 mux
needs it. So instead, add a list element to struct h2s, and use it when a
list is needed.
This forces us to use the unsubscribe method, since we can't just unsubscribe
by using LIST_DEL anymore.
This patch is larger than it should be because it includes some renaming.
2018-10-11 15:34:39 +02:00
Olivier Houchard
83a0cd8a36 MINOR: connections: Introduce an unsubscribe method.
As we don't know how subscriptions are handled, we can't just assume we can
use LIST_DEL() to unsubscribe, so introduce a new method to mux and connections
to do so.
2018-10-11 15:34:21 +02:00
Dirkjan Bussink
415150f764 MEDIUM: ssl: add support for ciphersuites option for TLSv1.3
OpenSSL released support for TLSv1.3. It also added a separate function
SSL_CTX_set_ciphersuites that is used to set the ciphers used in the
TLS 1.3 handshake. This change adds support for that new configuration
option by adding a ciphersuites configuration variable that works
essentially the same as the existing ciphers setting.

Note that it should likely be backported to 1.8 in order to ease usage
of the now released openssl-1.1.1.
2018-10-08 19:20:13 +02:00
Willy Tarreau
61c112aa5b REORG: http: move HTTP rules parsing to http_rules.c
These ones are mostly called from cfgparse.c for the parsing and do
not depend on the HTTP representation. The functions' prototypes
were moved to proto/http_rules.h, making this file work exactly like
tcp_rules. Ideally we should stop calling these functions directly
from cfgparse and register keywords, but there are a few cases where
that wouldn't work (stats http-request) so it's probably not worth
trying to go this far.
2018-10-02 18:28:05 +02:00
Adis Nezirovic
8878f8eb3d MEDIUM: lua: Add stick table support for Lua.
This adds support for accessing stick tables from Lua. The supported
operations are reading general table info, lookup by string/IP key, and
dumping the table.

Similar to "show table", a data filter is available during dump, and as
an improvement over "show table" it's possible to use up to 4 filter
expressions instead of just one (with implicit AND clause binding the
expressions). Dumping with/without filters can take a long time for
large tables, and should be used sparingly.
2018-09-29 20:15:01 +02:00
Willy Tarreau
2557f6a3e2 MEDIUM: h1: better handle transfer-encoding vs content-length
The transfer-encoding header processing was a bit lenient in this part
because it was made to read messages already validated by haproxy. We
absolutely need to reinstate the strict processing defined in RFC7230
as is currently being done in proto_http.c. That is, transfer-encoding
presence alone is enough to cancel content-length, and must be
terminated by the "chunked" token, except in the response where we
can fall back to the close mode if it's not last.

For this we now use a specific parsing function which updates the
flags and we introduce a new flag H1_MF_XFER_ENC indicating that the
transfer-encoding header is present.

Last, if such a header is found, we delete all content-length header
fields found in the message.
2018-09-14 17:40:35 +02:00
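
The framing rule stated in the commit above can be summarized in a small decision function. This is a sketch under stated assumptions: the struct, the enum and the function name are hypothetical and do not reproduce the HAProxy parser; only the rule itself (transfer-encoding cancels content-length, must end with "chunked", and responses may fall back to close mode) comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of what a header parser has seen so far. */
struct msg_info {
    bool is_response;     /* message is a response, not a request */
    bool has_te;          /* a Transfer-Encoding header is present */
    bool te_last_chunked; /* its last token is "chunked" */
    bool has_clen;        /* a Content-Length header is present */
};

enum body_mode {
    BODY_CLEN_OR_NONE, /* no Transfer-Encoding: Content-Length (if any) applies */
    BODY_CHUNKED,      /* body is chunk-encoded */
    BODY_UNTIL_CLOSE,  /* response body runs until the connection closes */
    BODY_ERROR         /* request whose Transfer-Encoding does not end with "chunked" */
};

/* Decide how the body is framed. The mere presence of Transfer-Encoding
 * cancels Content-Length, which is reflected by clearing has_clen.
 */
enum body_mode decide_body_mode(struct msg_info *m)
{
    assert(m != NULL);

    if (!m->has_te)
        return BODY_CLEN_OR_NONE;

    m->has_clen = false; /* Content-Length is ignored from now on */

    if (m->te_last_chunked)
        return BODY_CHUNKED;

    /* "chunked" is not last: only a response may fall back to close mode */
    return m->is_response ? BODY_UNTIL_CLOSE : BODY_ERROR;
}
```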
Christopher Faulet
c4e53f4ad7 MINOR: h1: Add H1_MF_XFER_LEN flag
This flag is useful to handle cases where there is no body, regardless of CL or
TE headers (for instance, responses to HEAD requests). It will not be set by the
parser itself.
2018-09-14 16:02:40 +02:00
Willy Tarreau
98f5cf7a59 MINOR: h1: parse the Connection header field
The new function h1_parse_connection_header() is called when facing a
connection header in the generic parser, and it will set up to 3 bits
in h1m->flags indicating if at least one "close", "keep-alive" or "upgrade"
token was seen.
2018-09-13 14:52:31 +02:00
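
As an illustration of the idea in the commit above, the following sketch walks the comma-separated tokens of a Connection header value and sets one bit per recognized token. It is not the code of h1_parse_connection_header(); the flag names, their values and the function name are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag bits, one per token of interest. */
#define CONN_F_CLOSE 0x1u /* "close" seen */
#define CONN_F_KAL   0x2u /* "keep-alive" seen */
#define CONN_F_UPG   0x4u /* "upgrade" seen */

/* Case-insensitive comparison of a token with a lower-case name. */
static int token_is(const char *tok, size_t len, const char *name)
{
    size_t i;

    if (len != strlen(name))
        return 0;
    for (i = 0; i < len; i++)
        if (tolower((unsigned char)tok[i]) != name[i])
            return 0;
    return 1;
}

/* Return the set of flags found in a Connection header value. */
unsigned int parse_connection_value(const char *value)
{
    unsigned int flags = 0;
    const char *p = value;

    assert(value != NULL);
    while (*p) {
        const char *start, *end;

        /* skip separators and leading whitespace */
        while (*p == ',' || *p == ' ' || *p == '\t')
            p++;
        start = p;
        while (*p && *p != ',')
            p++;
        end = p;
        /* trim trailing whitespace */
        while (end > start && (end[-1] == ' ' || end[-1] == '\t'))
            end--;

        if (token_is(start, (size_t)(end - start), "close"))
            flags |= CONN_F_CLOSE;
        else if (token_is(start, (size_t)(end - start), "keep-alive"))
            flags |= CONN_F_KAL;
        else if (token_is(start, (size_t)(end - start), "upgrade"))
            flags |= CONN_F_UPG;
    }
    return flags;
}
```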
Willy Tarreau
ba5fbca33f MINOR: h1: report in the h1m struct if the HTTP version is 1.1 or above
This will be needed for the mux to know how to process the Connection
header, and will save it from having to re-parse the request line since
it's captured on the fly.
2018-09-13 14:34:09 +02:00
Willy Tarreau
175a2bb507 MINOR: connection: pass the proxy when creating a connection
Till now it was very difficult for a mux to know what proxy it was
working for. Let's pass the proxy when the mux is instantiated at
init() time. It's not yet used but the H1 mux will definitely need
it, just like the H2 mux when dealing with backend connections.
2018-09-12 17:39:22 +02:00
Willy Tarreau
eb528db60b MINOR: h1: add H1_MF_TOLOWER to decide when to turn header names to lower case
The h1 parser used to systematically turn header field names to lower
case because it was designed for H2. Let's add a flag which is off by
default to condition this behaviour so that when using it from an H1
parser it will not affect the message.
2018-09-12 17:38:26 +02:00
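
A minimal sketch of a flag-conditioned lower-casing step, in the spirit of the commit above. The flag name EX_MF_TOLOWER, its value and the helper are hypothetical stand-ins; the real H1_MF_TOLOWER value and the place where the parser applies it are not shown in this log.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag: off by default, as described in the commit. */
#define EX_MF_TOLOWER 0x01u

/* Copy a header field name of <len> bytes into <dst> (which must hold
 * len + 1 bytes), lower-casing it only when the flag is set.
 */
void copy_header_name(char *dst, const char *src, size_t len, unsigned int flags)
{
    size_t i;

    assert(dst != NULL && src != NULL);
    if (flags & EX_MF_TOLOWER) {
        for (i = 0; i < len; i++)
            dst[i] = (char)tolower((unsigned char)src[i]);
    } else {
        memcpy(dst, src, len); /* leave the name untouched for H1 */
    }
    dst[len] = '\0';
}
```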
Willy Tarreau
11da5674c3 MINOR: h1: remove the HTTP status from the H1M struct
It has nothing to do there and is not used from there anymore, let's
get rid of it.
2018-09-12 17:38:25 +02:00
Willy Tarreau
001823c304 MEDIUM: h1: remove the useless H1_MSG_BODY state
This state was only a delimiter between headers and body but it now
causes more harm than good because it requires someone to change it.
Since the H1 parser knows if we're in DATA or CHUNK_SIZE, simply let
it set the right next state so that h1m->state constantly matches
what is expected afterwards.
2018-09-12 17:38:25 +02:00
Willy Tarreau
a41393fc61 MEDIUM: h1: make the parser support a pointer to a start line
This will allow the parser to fill some extra fields like the method or
status without having to store them permanently in the HTTP message. At
this point however the parser cannot restart from an interrupted read.
2018-09-12 17:38:25 +02:00
Willy Tarreau
bbf3823f82 MINOR: h1: properly pre-initialize err_pos to -2
This way we maintain the old mechanism stating that -2 means we block
on errors, -1 means we only capture them, and a positive value indicates
the position of the first error.
2018-09-12 17:38:25 +02:00
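
The three meanings of err_pos listed in the commit above can be sketched as a small helper that a parser might call when it meets a suspicious byte. The helper name and its return convention are assumptions made for illustration; only the -2 / -1 / positive-value semantics come from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Report a parsing error found at offset <pos>.
 *   *err_pos == -2 : errors are blocking, the caller must stop parsing
 *   *err_pos == -1 : errors are only captured, remember the first one
 *   *err_pos >=  0 : a first error position is already recorded, keep it
 * Returns 1 if parsing must stop, 0 if it may continue.
 */
int note_parse_error(int *err_pos, int pos)
{
    assert(err_pos != NULL && pos >= 0);

    if (*err_pos == -2)
        return 1;       /* block on errors */
    if (*err_pos == -1)
        *err_pos = pos; /* capture the first error only */
    return 0;
}
```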
Willy Tarreau
ccaf233741 MINOR: h1: add a message flag to indicate that a message carries a response
This flag is H1_MF_RESP. It will be used by the parser during restarts when
it supports requests.
2018-09-12 17:38:25 +02:00
Willy Tarreau
acc295cab3 MINOR: h1: remove the unused states from h1m_state
States ERROR, 100_SENT, ENDING, CLOSE, CLOSING are not used at all for
the parsers. It's possible that a few others may disappear as well.
2018-09-12 17:38:25 +02:00
Willy Tarreau
b3b0152b6f MINOR: h1: add the restart offsets into struct h1m
Currently the only user of struct h1m is the h2 mux when it has to parse
an H1 message coming from the channel. Unfortunately this is not enough
to efficiently parse HTTP/1 messages like those coming from the network
as we don't want to restart from scratch at every byte received.

This patch reintroduces the "next" offset into the H1 message so that any
H1 parser can use it to restart when called with a state that is not the
initial state.
2018-09-12 17:38:25 +02:00
Willy Tarreau
801250e07d REORG: h1: create a new h1m_state
This is the *parsing* state of an HTTP/1 message. Currently the h1_state
is composite as it's made both of parsing and control (100SENT, BODY,
DONE, TUNNEL, ENDING etc). The purpose here is to have a purely H1 state
that can be used by H1 parsers. For now it's equivalent to h1_state.
2018-09-12 17:38:25 +02:00
Olivier Houchard
71384551fe MINOR: conn_streams: Remove wait_list from conn_streams.
The conn_streams won't be used for subscribing/waiting for I/O events, after
all, so just remove its wait_list, and send/recv/_wait_list.
2018-09-12 17:37:55 +02:00
Olivier Houchard
26e1a8f2bf MINOR: checks: Give checks their own wait_list.
Instead of (ab)using the conn_stream's wait_list, which should disappear,
give the checks their own wait_list.
2018-09-12 17:37:55 +02:00
Olivier Houchard
cb1f49ff93 MINOR: connections: Add a "handle" field to wait_list.
Add a new field to struct wait_list, "handle", that can be used by the
entity in charge of subscribing.
2018-09-12 17:37:55 +02:00
Olivier Houchard
af4021e680 MEDIUM: connections: Get rid of the recv() method.
Remove the recv() method from mux and conn_stream.
The goal is to always receive from the upper layers, instead of waiting
for the connection later. For now, recv() is still called from the wake()
method, but that should change soon.
2018-09-12 17:37:55 +02:00
Olivier Houchard
4cf7fb148f MEDIUM: connections/mux: Add a recv and a send+recv wait list.
For struct connection, struct conn_stream, and for the h2 mux, add 2 new
lists, one that handles waiters for recv, and one that handles waiters for
recv and send. That way we can ask to subscribe for either recv or send.
2018-09-12 17:37:55 +02:00
William Lallemand
2d3f8a411f MEDIUM: protocol: use a custom AF_MAX to help protocol parser
It's possible to have several protocols per family which is a problem
with the current way the protocols are stored.

This allows registering a new protocol in HAProxy which is not a
protocol in the strict socket definition. It will be used to register a
SOCK_STREAM protocol using socketpair().
2018-09-12 07:12:27 +02:00
Willy Tarreau
35b51c6e5b REORG: http: move the HTTP semantics definitions to http.h/http.c
It's a bit painful to have to deal with HTTP semantics for each protocol
version (H1 and H2), and working on the version-agnostic code further
emphasizes the problem.

This patch creates http.h and http.c which are agnostic to the version
in use, and which borrow a few parts from proto_http and from h1. For
example the once thought h1-specific h1_char_classes array is in fact
dictated by RFC7231 and is used to parse HTTP headers. A few changes
were made to a few files which were including proto_http.h while they
only needed http.h.

Certain string definitions pre-dated the introduction of indirect
strings (ist) so some were used to simplify the definition of the known
HTTP methods. The current lookup code saves 2 kB of a heavily used table
and is faster than the previous table based lookup (typ. 14 ns vs 16
before).
2018-09-11 10:30:25 +02:00
William Lallemand
e22f11ff47 MINOR: mworker: keep and clean the listeners
Keep the listeners that should be used in the master process and clean
them in the workers.
2018-09-11 10:23:24 +02:00
Willy Tarreau
4bc7d90d3b MEDIUM: snapshot: merge the captured data after the descriptor
Instead of having a separate area for the captured data, we now have a
contiguous block made of the descriptor and the data. At the moment, since
the area is dynamically allocated, we can adjust its size to what is
needed, but the idea is to quickly switch to a pool and an LRU list.
2018-09-07 20:07:17 +02:00
Willy Tarreau
c55015ee5b MEDIUM: snapshots: dynamically allocate the snapshots
Now upon error we dynamically allocate the snapshot instead of overwriting
it. This way there is no more memory wasted in the proxy to hold the two
error snapshot descriptors. Also an appreciable side effect of this is that
the proxy's lock is only taken during the pointer swap, no more while copying
the buffer's contents. This saves 480 bytes of memory per proxy.
2018-09-07 19:59:58 +02:00
Willy Tarreau
7ccdd8dad9 MEDIUM: snapshot: implement a show() callback and use it for HTTP
The HTTP dumps are now configurable in the code : "show errors" now
calls a protocol-specific function to emit the decoded output. For
now only HTTP is implemented.
2018-09-07 18:36:01 +02:00
Willy Tarreau
7480f323ff MINOR: snapshot: split the error snapshots into common and proto-specific parts
The idea will be to make the error snapshot feature accessible to other
protocols than just HTTP. This patch only introduces an "http_snapshot"
structure and renames a few fields to make things more explicit. The
HTTP part was installed inside a union so that we can easily add more
protocols in the future.
2018-09-07 16:13:45 +02:00
Willy Tarreau
5865a8fe69 MINOR: snapshot: restart on the event ID and not the stream ID
The snapshots have the ability to restart a partial dump and they use
the stream ID as the restart point. Since it's purely HTTP, let's use
the event ID instead.
2018-09-07 15:00:43 +02:00
Willy Tarreau
590a0514f2 BUG/MEDIUM: session: fix reporting of handshake processing time in the logs
The handshake processing time used to be stored per stream, which was
valid when there was exactly one stream per session. With H2 and
multiplexing it's not the case anymore and the reported handshake times
are wrong in the logs as it's computed between the TCP accept() and the
stream creation. Let's first move the handshake where it belongs, which
is the session.

However, this is not enough because we don't want to report an excessive
idle time either for H2 (since many requests use the connection).

So the solution used here is to have the stream retrieve sess->tv_accept
and the handshake duration when the stream is created, and let the mux
immediately reset them. This way, the handshake time becomes zero for the
second and subsequent requests in H2 (which was already the case in H1),
and the idle time exactly counts how long the connection remained unused
while it could be used, so in H1 it runs from the end of the previous
response and in H2 it runs from the end of the previous request since the
channel is already available.

This patch will need to be backported to 1.8.
2018-09-05 16:30:23 +02:00
Baptiste Assmann
6d0f38f00d BUG/MEDIUM: dns/server: fix incompatibility between SRV resolution and server state file
Server state file has no indication that a server is currently managed
by a DNS SRV resolution.
Thus, both features (DNS SRV resolution and server state), when used
together, do not provide the expected behavior: a smooth experience...

This patch introduces the "SRV record name" in the server state file and
loads and applies it if found and wherever required.

This patch applies to haproxy-dev branch only. For backport, a specific patch
is provided for 1.8.
2018-09-04 17:40:22 +02:00
Olivier Houchard
8f0b4c66f5 MINOR: stream_interface: Give stream_interface its own wait_list.
Instead of just using the conn_stream wait_list, give the stream_interface
its own. Once the conn_stream has its own buffers, the stream_interface
may have to wait on it.
2018-08-16 17:29:54 +02:00
Olivier Houchard
e1c6dbcd70 MINOR: connections/mux: Add the wait reason(s) to wait_list.
Add a new element to the wait_list, that lets us know which event(s) we are
waiting on.
2018-08-16 17:29:53 +02:00
Olivier Houchard
ed0f207ef5 MINOR: connections: Get rid of txbuf.
Remove txbuf from conn_stream. It is not used yet, and its only user will
probably be the mux_h2, so it will be better suited in the struct h2s.
2018-08-16 17:29:51 +02:00
Olivier Houchard
638b799b09 MINOR: connections: Move rxbuf from the conn_stream to the h2s.
As the mux_h2 is the only user of rxbuf, move it to the struct h2s, instead
of conn_stream.
2018-08-16 17:28:11 +02:00
Patrick Hemmer
268a707a3d MEDIUM: add set-priority-class and set-priority-offset
This adds the set-priority-class and set-priority-offset actions to
http-request and tcp-request content. At this point they are not used
yet, which is the purpose of the next commit, but all the logic to
set and clear the values is there.
2018-08-10 15:06:31 +02:00
Patrick Hemmer
0355dabd7c MINOR: queue: replace the linked list with a tree
We'll need trees to manage the queues by priorities. This change replaces
the list with a tree based on a single key. It's effectively a list but
allows us to get rid of the list management right now.
2018-08-10 15:06:27 +02:00
Patrick Hemmer
da282f4a8f MINOR: queue: store the queue index in the stream when enqueuing
We store the queue index in the stream and check it on dequeueing to
figure how many entries were processed in between. This way we'll be
able to count the elements that may later be added before ours.
2018-08-10 15:06:25 +02:00
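
The index-based accounting described in the commit above can be sketched as follows. The struct and function names are hypothetical and the queue itself (list or tree) is left out; the sketch only shows how storing the queue index at enqueue time lets the dequeuing side compute how many entries were processed in between.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical queue state: a counter bumped on every dequeue. */
struct ex_queue {
    unsigned int idx; /* number of entries dequeued so far (may wrap) */
};

/* Hypothetical per-stream state. */
struct ex_stream {
    unsigned int queue_idx; /* value of the queue's idx when enqueued */
};

/* Remember the queue index at the moment the stream is enqueued. */
void ex_enqueue(const struct ex_queue *q, struct ex_stream *s)
{
    assert(q != NULL && s != NULL);
    s->queue_idx = q->idx;
}

/* Dequeue the stream and return how many entries were processed between
 * its enqueue and its dequeue. Unsigned arithmetic keeps the difference
 * correct even if the counter wrapped in the meantime.
 */
unsigned int ex_dequeue(struct ex_queue *q, const struct ex_stream *s)
{
    unsigned int processed;

    assert(q != NULL && s != NULL);
    processed = q->idx - s->queue_idx;
    q->idx++;
    return processed;
}
```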
Patrick Hemmer
ffe5e8c638 MINOR: stream: rename {srv,prx}_queue_size to *_queue_pos
The current name is misleading as it implies a queue size, but the value
instead indicates a position in the queue.
The value is only the queue size at the exact moment the element is enqueued.
Soon we will gain the ability to insert anywhere into the queue, upon which
clarity of the name is more important.
2018-08-10 15:04:14 +02:00
Christopher Faulet
8ed0a3e32a MINOR: mux/server: Add 'proto' keyword to force the multiplexer's protocol
For now, it is parsed but not used. Tests are done on it to check if the side
and the mode are compatible with the server's definition.
2018-08-08 10:42:08 +02:00
Christopher Faulet
a717b99284 MINOR: mux/frontend: Add 'proto' keyword to force the mux protocol
For now, it is parsed but not used. Tests are done on it to check if the side
and the mode are compatible with the proxy's definition.
2018-08-08 10:41:11 +02:00