When no filter is attached to the stream, the CPU footprint due to the calls to
filters_* functions is huge, especially for chunk-encoded messages. Using macros
to check if we have some filters or not is a great improvement.
Furthermore, instead of checking the filter list emptiness, we introduce a flag
to know if filters are attached or not to a stream.
HTTP compression has been rewritten to use the filter API. This is more a PoC
than other thing for now. It allocates memory to work. So, if only for that, it
should be rewritten.
In the mean time, the implementation has been refactored to allow its use with
other filters. However, there are limitations that should be respected:
- No filter placed after the compression one is allowed to change input data
(in 'http_data' callback).
- No filter placed before the compression one is allowed to change forwarded
data (in 'http_forward_data' callback).
For now, these limitations are informal, so you should be careful when you use
several filters.
About the configuration, 'compression' keywords are still supported and must be
used to configure the HTTP compression behavior. In absence of a 'filter' line
for the compression filter, it is added in the filter chain when the first
compression' line is parsed. This is an easy way to do when you do not use other
filters. But another filter exists, an error is reported so that the user must
explicitly declare the filter.
For example:
listen tst
...
compression algo gzip
compression offload
...
filter flt_1
filter compression
filter flt_2
...
HTTP compression will be moved in a true filter. To prepare the ground, some
functions have been moved in a dedicated file. Idea is to keep everything about
compression algos in compression.c and everything related to the filtering in
flt_http_comp.c.
For now, a header has been added to help during the transition. It will be
removed later.
Unused empty ACL keyword list was removed. The "compression" keyword
parser was moved from cfgparse.c to flt_http_comp.c.
When all callbacks have been called for all filters registered on a stream, if
we are waiting for the next HTTP request, we must reset stream analyzers. But it
is useless to do so if the client has already closed the connection.
This patch adds the support of filters in HAProxy. The main idea is to have a
way to "easely" extend HAProxy by adding some "modules", called filters, that
will be able to change HAProxy behavior in a programmatic way.
To do so, many entry points has been added in code to let filters to hook up to
different steps of the processing. A filter must define a flt_ops sutrctures
(see include/types/filters.h for details). This structure contains all available
callbacks that a filter can define:
struct flt_ops {
/*
* Callbacks to manage the filter lifecycle
*/
int (*init) (struct proxy *p);
void (*deinit)(struct proxy *p);
int (*check) (struct proxy *p);
/*
* Stream callbacks
*/
void (*stream_start) (struct stream *s);
void (*stream_accept) (struct stream *s);
void (*session_establish)(struct stream *s);
void (*stream_stop) (struct stream *s);
/*
* HTTP callbacks
*/
int (*http_start) (struct stream *s, struct http_msg *msg);
int (*http_start_body) (struct stream *s, struct http_msg *msg);
int (*http_start_chunk) (struct stream *s, struct http_msg *msg);
int (*http_data) (struct stream *s, struct http_msg *msg);
int (*http_last_chunk) (struct stream *s, struct http_msg *msg);
int (*http_end_chunk) (struct stream *s, struct http_msg *msg);
int (*http_chunk_trailers)(struct stream *s, struct http_msg *msg);
int (*http_end_body) (struct stream *s, struct http_msg *msg);
void (*http_end) (struct stream *s, struct http_msg *msg);
void (*http_reset) (struct stream *s, struct http_msg *msg);
int (*http_pre_process) (struct stream *s, struct http_msg *msg);
int (*http_post_process) (struct stream *s, struct http_msg *msg);
void (*http_reply) (struct stream *s, short status,
const struct chunk *msg);
};
To declare and use a filter, in the configuration, the "filter" keyword must be
used in a listener/frontend section:
frontend test
...
filter <FILTER-NAME> [OPTIONS...]
The filter referenced by the <FILTER-NAME> must declare a configuration parser
on its own name to fill flt_ops and filter_conf field in the proxy's
structure. An exemple will be provided later to make it perfectly clear.
For now, filters cannot be used in backend section. But this is only a matter of
time. Documentation will also be added later. This is the first commit of a long
list about filters.
It is possible to have several filters on the same listener/frontend. These
filters are stored in an array of at most MAX_FILTERS elements (define in
include/types/filters.h). Again, this will be replaced later by a list of
filters.
The filter API has been highly refactored. Main changes are:
* Now, HA supports an infinite number of filters per proxy. To do so, filters
are stored in list.
* Because filters are stored in list, filters state has been moved from the
channel structure to the filter structure. This is cleaner because there is no
more info about filters in channel structure.
* It is possible to defined filters on backends only. For such filters,
stream_start/stream_stop callbacks are not called. Of course, it is possible
to mix frontend and backend filters.
* Now, TCP streams are also filtered. All callbacks without the 'http_' prefix
are called for all kind of streams. In addition, 2 new callbacks were added to
filter data exchanged through a TCP stream:
- tcp_data: it is called when new data are available or when old unprocessed
data are still waiting.
- tcp_forward_data: it is called when some data can be consumed.
* New callbacks attached to channel were added:
- channel_start_analyze: it is called when a filter is ready to process data
exchanged through a channel. 2 new analyzers (a frontend and a backend)
are attached to channels to call this callback. For a frontend filter, it
is called before any other analyzer. For a backend filter, it is called
when a backend is attached to a stream. So some processing cannot be
filtered in that case.
- channel_analyze: it is called before each analyzer attached to a channel,
expects analyzers responsible for data sending.
- channel_end_analyze: it is called when all other analyzers have finished
their processing. A new analyzers is attached to channels to call this
callback. For a TCP stream, this is always the last one called. For a HTTP
one, the callback is called when a request/response ends, so it is called
one time for each request/response.
* 'session_established' callback has been removed. Everything that is done in
this callback can be handled by 'channel_start_analyze' on the response
channel.
* 'http_pre_process' and 'http_post_process' callbacks have been replaced by
'channel_analyze'.
* 'http_start' callback has been replaced by 'http_headers'. This new one is
called just before headers sending and parsing of the body.
* 'http_end' callback has been replaced by 'channel_end_analyze'.
* It is possible to set a forwarder for TCP channels. It was already possible to
do it for HTTP ones.
* Forwarders can partially consumed forwardable data. For this reason a new
HTTP message state was added before HTTP_MSG_DONE : HTTP_MSG_ENDING.
Now all filters can define corresponding callbacks (http_forward_data
and tcp_forward_data). Each filter owns 2 offsets relative to buf->p, next and
forward, to track, respectively, input data already parsed but not forwarded yet
by the filter and parsed data considered as forwarded by the filter. A any time,
we have the warranty that a filter cannot parse or forward more input than
previous ones. And, of course, it cannot forward more input than it has
parsed. 2 macros has been added to retrieve these offets: FLT_NXT and FLT_FWD.
In addition, 2 functions has been added to change the 'next size' and the
'forward size' of a filter. When a filter parses input data, it can alter these
data, so the size of these data can vary. This action has an effet on all
previous filters that must be handled. To do so, the function
'filter_change_next_size' must be called, passing the size variation. In the
same spirit, if a filter alter forwarded data, it must call the function
'filter_change_forward_size'. 'filter_change_next_size' can be called in
'http_data' and 'tcp_data' callbacks and only these ones. And
'filter_change_forward_size' can be called in 'http_forward_data' and
'tcp_forward_data' callbacks and only these ones. The data changes are the
filter responsability, but with some limitation. It must not change already
parsed/forwarded data or data that previous filters have not parsed/forwarded
yet.
Because filters can be used on backends, when we the backend is set for a
stream, we add filters defined for this backend in the filter list of the
stream. But we must only do that when the backend and the frontend of the stream
are not the same. Else same filters are added a second time leading to undefined
behavior.
The HTTP compression code had to be moved.
So it simplifies http_response_forward_body function. To do so, the way the data
are forwarded has changed. Now, a filter (and only one) can forward data. In a
commit to come, this limitation will be removed to let all filters take part to
data forwarding. There are 2 new functions that filters should use to deal with
this feature:
* flt_set_http_data_forwarder: This function sets the filter (using its id)
that will forward data for the specified HTTP message. It is possible if it
was not already set by another filter _AND_ if no data was yet forwarded
(msg->msg_state <= HTTP_MSG_BODY). It returns -1 if an error occurs.
* flt_http_data_forwarder: This function returns the filter id that will
forward data for the specified HTTP message. If there is no forwarder set, it
returns -1.
When an HTTP data forwarder is set for the response, the HTTP compression is
disabled. Of course, this is not definitive.
The csv stats format breaks when agent changes server state to drain.
Tools like hatop, metric or check agents will fail due to this. This
should be backported to 1.6.
The serial number for a generated certificate was computed using the requested
servername, without any variable/random part. It is not a problem from the
moment it is not regenerated.
But if the cache is disabled or when the certificate is evicted from the cache,
we may need to regenerate it. It is important to not reuse the same serial
number for the new certificate. Else clients (especially browsers) trigger a
warning because 2 certificates issued by the same CA have the same serial
number.
So now, the serial is a static variable initialized with now_ms (internal date
in milliseconds) and incremented at each new certificate generation.
(Ref MPS-2031)
When working on the previous bug, it appeared that it the case that was
triggering the bug would also work between two backends, one of which
doesn't support http-reuse. The reason is that while the idle connection
is moved to the private pool, upon reuse we only check if it holds the
CO_FL_PRIVATE flag. And we don't set this flag when there's no reuse.
So let's always set it in this case, it will guarantee that no undesired
connection sharing may happen.
This fix must be backported to 1.6.
There is a bug in connect_server() : we use si_attach_conn() to offer
the current session's connection to the session we're stealing the
connection from. Unfortunately, si_attach_conn() uses the standard data
connection operations while here we need to use the idle connection
operations.
This results in a situation where when the server's idle timeout strikes,
the read0 is silently ignored, causes the response channel to be shut down
for reads, and the connection remains attached. Next attempt to send a
request when using this connection simply results in nothing being done
because we try to send over an already closed connection. Worse, if the
client aborts, then no timeout remains at all and the session waits
forever and remains assigned to the server.
A more-or-less easy way to reproduce this bug is to have two concurrent
streams each connecting to a different server with "http-reuse aggressive",
typically a cache farm using a URL hash :
stream1: GET /1 HTTP/1.1
stream2: GET /2 HTTP/1.1
stream1: GET /2 HTTP/1.1
wait for the server 1's connection to timeout
stream2: GET /1 HTTP/1.1
The connection hangs here, and "show sess all" shows a closed connection
with a SHUTR on the response channel.
The fix is very simple though not optimal. It consists in calling
si_idle_conn() again after attaching the connection. But in practise
it should not be done like this. The real issue is that there's no way
to cleanly attach a connection to a stream interface without changing
the connection's operations. So the API clearly needs to be revisited
to make such operations easier.
Many thanks to Yves Lafon from W3C for providing lots of useful dumps
and testing patches to help figure the root cause!
This fix must be backported to 1.6.
Michał Pasierb reported doc inconsistencies regarding the old default
HTTP tunnel mode.
This patch fixes a few of those inconsistencies and should be backported
to both 1.6 and 1.5.
in function 'si_connect', an existing connection is reused (and considered as
established) only when there are some pending data in the output channel.
This can be problem when filters are used, because a filter can choose to not
forward data immediatly. So when we try to initiate a connection to a server,
the output channel can be empty. In this situation, if the connection already
exists, it is not considered as established and nothing happens. If the stream
interface is in the state SI_ST_ASS, this leads to an infinite loop in
process_stream because it remains in this state.
This patch fixes this problem. Now, in 'si_connect', we always reuse an existing
connection, whether or not there are pending data in the output channel.
After a POST on the stats admin page, a 303 is emitted. Unfortunately
this 303 doesn't contain a content-length, which forces the connection
to be closed and reopened. Let's simply add a content-length: 0 to solve
this.
Commit 16f649c ("REORG: polling: rename "fd_spec" to "fd_cache"")
missed the server-facing connection during the rename, so the old
names are still in used and add a bit of confusion during the
debugging.
This should be backported to 1.6 and 1.5.
When a connect() to a unix socket returns EAGAIN we talk about
"no free ports" in the error/debug message, which only makes
sense when using TCP.
Explain connect() failure and suggest troubleshooting server
backlog size.
These two actions don't touch the table so the entry will expire and
the values will not be pushed to other peers. Also in the case of gpc0,
the gpc0_rate counter must be updated. The issue was reported by
Ruoshan Huang.
This fix needs to be backported to 1.6.
Arkadiy Kulev noticed that if a server is marked down while a connection
is being trying to establish, we still insist on performing retries on
the same server, which is absurd. Better perform the redispatch if we
already know the server is down. Because of this, it's likely that the
observe-l4 and sudden-death mechanisms are not optimal an cannot help
much the connection which was used to detect the problem.
The fix should be backported to 1.6 and 1.5 at least.
When users request 16384 bytes for a buffer, they get 16392 after
rounding up. This is problematic for SSL as it systematically
causes a small 8-bytes message to be appended after the first 16kB
message and costs about 15% of performance.
Let's add MEM_F_EXACT to use exactly the size we need. This requires
previous patch (MEDIUM: pools: add a new flag to avoid rounding pool
size up).
This issue was introduced in 1.6 and causes trouble there, so this
fix must be backported.
This is issue was reported by Gary Barrueto and diagnosed by Cyril Bont.
Usually it's desirable to merge similarly sized pools, which is the
reason why their size is rounded up to the next multiple of 16. But
for the buffers this is problematic because we add the size of
struct buffer to the user-requested size, and the rounding results
in 8 extra bytes that are usable in the end. So the user gets more
bytes than asked for, and in case of SSL it results in short writes
for the extra bytes that are sent above multiples of 16 kB.
So we add a new flag MEM_F_EXACT to request that the size is not
rounded up when creating the entry. Thus it doesn't disable merging.
The function channel_recv_limit() relies on channel_reserved() which
itself relies on channel_in_transit(). Individually they're OK but
combined they're doing the wrong thing.
The problem is that we refrain from filling buffers while to_forward
is even much larger than the buffer because of a semantic issue along
the call chain. This is particularly visible when offloading SSL on
moderately large files (1 MB), though it is also visible on clear text.
Twice the number of recv() calls are made compared to what is needed,
and the typical performance drops by 15-20% in SSL in 1.6 and later,
and no directly measurable drop in 1.5 except when using strace.
There's no need for all these intermediate functions, so let's get
rid of them and reimplement channel_recv_limit() from scratch in a
safer way.
This fix needs to be backported to 1.6 and 1.5 (at least). Note that in
1.5 the function is called buffer_recv_limit() and it may differ a bit.
Gregor Kovač reported that http_date() did not return the right day of the
week. For example "Sat, 22 Jan 2016 17:43:38 GMT" instead of "Fri, 22 Jan
2016 17:43:38 GMT". Indeed, gmtime() returns a 'struct tm' result, where
tm_wday begins on Sunday, whereas the code assumed it began on Monday.
This patch must be backported to haproxy 1.5 and 1.6.
Servers state function save and apply server IP when DNS resolution is
enabled on a server.
Purpose is to prevent switching traffic from one server to an other one
when multiple IPs are returned by the DNS server for the A or AAAA
record.
That said, a bug in current code lead to erase the service port while
copying the IP found in the file into the server structure in HAProxy's
memory.
This patch fix this bug.
The bug was reported on the ML by Robert Samuel Newson and fix proposed
by Nenad Merdanovic.
Thank you both!!!
backport: can be backported to 1.6
Erez reported a bug on discourse.haproxy.org about DNS resolution not
occuring when no port is specified on the nameserver directive.
This patch prevent this behavior by returning an error explaining this
issue when parsing the configuration file.
That said, later, we may want to force port 53 when client did not
provide any.
backport: 1.6
This function should return a 16-bit type as that is the type for
dns header id.
Also because it is doing an uint16 unpack big-endian operation.
Backport: can be backported to 1.6
Signed-off-by: Thiago Farina <tfarina@chromium.org>
Signed-off-by: Baptiste Assmann <bedis9@gmail.com>
Building RPMs from the provided haproxy.spec fails due to references to
config files that do not exist.
I should point out, this patch will inevitably create an empty /etc/haproxy
dir and I'm not sure that's desirable/intended.
Changes to if statements do not affect code operation, just layout of
the code. Type casts from malloc returns have been removed as this cast
happens automatically from the void* type.
This may be backported to 1.6.
Parameters provided to 51Degrees methods that have changed to require
const pointers are now cast to avoid compiler warnings.
This should be backported to 1.6.
Malloc continues to be used for the creation of cache entries. The
implementation has been enhanced ready for production deployment. A new
method to free cache entries created in 51d.c has been added to ensure
memory is released correctly.
This should be backported to 1.6.
Args pointer is now used as the LRU cache domain to ensure the cache
distinguishes between multiple fetch and conv configurations.
This should be backported to 1.6.
Introduction of a new function in the LRU cache source file.
Purpose of this function is to be used to delete a number of entries in
the cache. 'number' is defined by the caller and the key removed are
taken at the tail of the tree
csv_enc_append() returns a pointer to the beginning of the encoded
string, which makes it convenient to use in printf(). However it's not
convenient for use in chunks as it may leave an unused byte at the
beginning depending on the automatic quoting. Let's modify it to work
in two passes. First it looks for a character that requires escaping
using strpbrk(), and second it encodes the string. This way it
guarantees to always start at the first available byte of the chunk.
Additionally it made the code quite simpler.
We have csv_enc() but there's no way to append some CSV-encoded data
to an existing chunk, so here we modify the existing function for this
and create an inlined version of csv_enc() which first resets the output
chunk. It will be handy to append data to an existing chunk without
having to use an extra temporary chunk, or to encode multiple strings
into a single chunk with chunk_newstr().
The patch is quite small, in fact most changes are typo fixes in the
comments.
chunk_initstr() prepares a read-only chunk from a string of
fixed length. Thus it must be prepared to accept a read-only
string on the input, otherwise the caller has to force-cast
some const char* and that's not a good idea.
These two new functions will make it easier to manipulate small strings
from within functions, because at many places, multiple short strings
are needed which do not deserve a malloc() nor a free(), and alloca()
is often discouraged. Since we already have trash chunks, it's convenient
to be able to allocate substrings from a chunk and use them later since
our functions already perform all the length checks. chunk_newstr() adds
a trailing zero at the end of a chunk and returns the pointer to the next
character, which can be used as an independant string. chunk_strcat()
does what it says.
Since thus function bears the name of a well-known string function, it
must at least promise compatible semantics. Here it means always adding
the trailing zero so that anyone willing to use chunk->str as a regular
string can do it. Of course the zero is not counted in the chunk's length.
chunk_dup() was affected by two bugs at once related to dst->size :
- first, it didn't check dst->size to know if it could free(dst->str),
so using it on a statically allocated chunk would cause a free(constant)
and crash the process ;
- second, it didn't properly set dst->size, possibly causing smaller
strings not to be properly reported in a chunk that was previously
used for something else.
Fortunately, neither of these situations ever happened since the function
is rarely used.
In the process of doing this, we even allocate one more byte for a
trailing zero if the input chunk was not full, so that the copied
string can safely be reused by standard string functions.
The bug was introduced in 1.3.4 nine years ago with this commit :
0f77253 ("[MINOR] store HTTP error messages into a chunk array")
It's better to backport this fix in case a future fix relies on it.
The function http_reply_and_close has been added in proto_http.c to wrap calls
to stream_int_retnclose. This functions will be modified when the filters will
be added.
When the response body is forwarded, if the server closes the input before the
end, an error is thrown. But if the data processing is too slow, all data could
already be received and pending in the input buffer. So this is a bug to stop
processing in this context. The server doesn't really closed the input before
the end.
As an example, this could happen when HAProxy is configured to do compression
offloading. If the server closes the connection explicitly after the response
(keep-alive disabled by the server) and if HAProxy receives the data faster than
they are compressed, then the response could be truncated.
This patch fixes the bug by checking if some pending data remain in the input
buffer before returning an error. If yes, the processing continues.
Several cases of "<=" instead of "<" were found in the url_param parser,
mostly affecting the case where the parameter is wrapping. They shouldn't
affect header operations, just body parsing in a wrapped pipelined request.
The code is a bit complicated with certain operations done multiple times
in multiple functions, so it's not sure others are not left. This code
must be re-audited.
It should only be backported to 1.6 once carefully tested, because it is
possible that other bugs relied on these ones.
The applet can't have access to the session private data. This patch
fix this problem. Now an applet can use private data stored by actions
and fecthes.