This reduces its size which is not reused by anything else. However it
will significantly improve the debugger's output since we'll now get
real state values.
The default case had to be enabled in the parsers because gcc tries
to optimize the switch/case and noticed some values were missing from
the enums and emitted a warning.
We handle "http-request redirect" with a log-format string now, but we
leave "redirect" unaffected.
Note that the control of the special "/" case is move from the runtime
execution to the configuration parsing. If the format rule list is
empty, the build_logline() function does nothing.
Remove event_accept() in include/proto/proto_http.h and use correct function
name in other two files instead of event_accept().
Signed-off-by: Godbach <nylzhaowei@gmail.com>
Some actions were clearly missing to process response headers. This
patch adds a new "http-response" ruleset which provides the following
actions :
- allow : stop evaluating http-response rules
- deny : stop and reject the response with a 502
- add-header : add a header in log-format mode
- set-header : set a header in log-format mode
The req.hdr and res.hdr fetch methods do not work well on headers which
are allowed to contain commas, such as User-Agent, Date or Expires.
More specifically, full-length matching is impossible if a comma is
present.
This patch introduces 4 new fetch functions which are designed to work
with these full-length headers :
- req.fhdr, req.fhdr_cnt
- res.fhdr, res.fhdr_cnt
These ones do not stop at commas and permit to return full-length header
values.
We now have http_apply_redirect_rule() which does all the redirect-specific
job instead of having this inside http_process_req_common().
Also one of the benefit gained from uniformizing this code is that both
keep-alive and close response do emit the PR-- flags. The fix for the
flags could probably be backported to 1.4 though it's very minor.
The previous function http_perform_redirect() was becoming confusing
so it was renamed http_perform_server_redirect() since it only applies
to server-based redirection.
Sample conversions rely on two alternative buffers which were previously
allocated as static bufs of size BUFSIZE. Now they're initialized to the
global buffer size. It was the same for HTTP authentication. Note that it
seems that none of them was prone to any mistake when dealing with the
buffer size, but better stay on the safe side by maintaining the old
assumption that a trash buffer is always "large enough".
This is a massive rename. We'll then split channel and buffer.
This change needs a lot of cleanups. At many locations, the parameter
or variable is still called "buf" which will become ambiguous. Also,
the "struct channel" is still defined in buffers.h.
HTTP header fetch is now done using smp_fetch_hdr() for both ACLs and
patterns. This one also supports an occurrence number, making it possible
to specify explicit occurrences for ACLs and patterns.
The buffer's pointer <lr> was only used by HTTP parsers which also use a
struct http_msg to keep track of the parser's state. We've reached a point
where it makes no sense to keep ->lr in the buffer, as the split between
buffer and msg is only arbitrary for historical reasons.
This change ensures that touching buffers will not impact HTTP messages
anymore, making the buffers more content-agnostic. However, it becomes
very important not to forget to update msg->next when some data get
forwarded or moved (and in general each time buf->p is updated).
The new pointer in http_msg becomes relative to buffer->p so that
parsing multiple messages becomes easier. It is possible that at one
point ->som and ->next will be merged.
Note: http_parse_reqline() and http_parse_stsline() have been temporarily
modified to know the message starting point in the buffer (->p).
New option "http-send-name-header" specifies the name of a header which
will hold the server name in outgoing requests. This is the name of the
server the connection is really sent to, which means that upon redispatches,
the header's value is updated so that it always matches the server's name.
The new function does not return IP addresses but header values instead,
so that the caller is free to make what it want of them. The conversion
is not quite clean yet, as the previous test which considered that address
0.0.0.0 meant "no address" is still used. A different IP parsing function
should be used to take this into account.
And also rename "req_acl_rule" "http_req_rule". At the beginning that
was a bit confusing to me, especially the "req_acl" list which in fact
holds what we call rules. After some digging, it appeared that some
part of the code is 100% HTTP and not just related to authentication
anymore, so let's move that part to HTTP and keep the auth-only code
in auth.c.
This is in fact where those parts belong to. The old data_state was replaced
by applet.state and is now initialized when the applet is registered. It's
worth noting that the applet does not need to know the session nor the
buffer anymore since everything is brought by the stream interface.
It is possible that having a separate applet struct would simplify the
code but that's not a big deal.
Debugging parsing errors can be greatly improved if we know what the parser
state was and what the buffer flags were (especially for closed inputs/outputs
and full buffers). Let's add that to the error snapshots.
Using get_ip_from_hdr2() we can look for occurrence #X or #-X and
extract the IP it contains. This is typically designed for use with
the X-Forwarded-For header.
Using "usesrc hdr_ip(name,occ)", it becomes possible to use the IP address
found in <name>, and possibly specify occurrence number <occ>, as the
source to connect to a server. This is possible both in a server and in
a backend's source statement. This is typically used to use the source
IP previously set by a upstream proxy.
Support the new syntax (http-request allow/deny/auth) in
http stats.
Now it is possible to use the same syntax is the same like in
the frontend/backend http-request access control:
acl src_nagios src 192.168.66.66
acl stats_auth_ok http_auth(L1)
stats http-request allow if src_nagios
stats http-request allow if stats_auth_ok
stats http-request auth realm LB
The old syntax is still supported, but now it is emulated
via private acls and an aditional userlist.
Just as for the req* rules, we can now condition rsp* rules with ACLs.
ACLs match on response, so volatile request information cannot be used.
A warning is emitted if a configuration contains such an anomaly.
From now on, if request filters have ACLs defined, these ACLs will be
evaluated to condition the filter. This will be used to conditionally
remove/rewrite headers based on ACLs.
Calling this function after http_find_header2() automatically deletes
the current value of the header, and removes the header itself if the
value is the only one. The context is automatically adjusted for a
next call to http_find_header2() to return the next header. No other
change nor test should be made on the transient context though.
The body parser will be used in close and keep-alive modes. It follows
the stream to keep in sync with both the request and the response message.
Both chunked transfer-coding and content-length are supported according to
RFC2616.
The multipart/byterange encoding has not yet been implemented and if not
seconded by any of the two other ones, will be forwarded till the close,
as requested by the specification.
Both the request and the response analysers converge into an HTTP_MSG_DONE
state where it will be possible to force a close (option forceclose) or to
restart with a fresh new transaction and maintain keep-alive.
This change is important. All tests are OK but any possible behaviour
change with "option httpclose" might find its root here.
This code really belongs to the http part since it's transaction-specific.
This will also make it easier to later reinitialize a transaction in order
to support keepalive.
To sum up :
- len : it's now the max number of characters for the value, preventing
garbaged results.
- a new option "prefix" is added, this allows to use dynamic cookie
names (e.g. ASPSESSIONIDXXX).
Previously in the thread, I wanted to use the value found with
"capture cookie" but when i started to update the documentation, I
found this solution quite weird. I've made a small rework to not
depend on "capture cookie".
- There's the posssiblity to define the URL parser mode (path parameters
or query string).
We now set msg->col and msg->sov to the first byte of non-header.
They will be used later when parsing chunks. A new macro was added
to perform size additions on an http_msg in order to limit the risks
of copy-paste in the long term.
During this operation, it appeared that the http_msg struct was not
optimal on 64-bit, so it was re-ordered to fill the holes.
The code part which waits for an HTTP response has been extracted
from the old function. We now have two analysers and the second one
may re-enable the first one when an 1xx response is encountered.
This has been tested and works.
The calls to stream_int_return() that were remaining in the wait
analyser have been converted to stream_int_retnclose().
This patch has 2 goals :
1. I wanted to test the appsession feature with a small PHP code,
using PHPSESSID. The problem is that when PHP gets an unknown session
id, it creates a new one with this ID. So, when sending an unknown
session to PHP, persistance is broken : haproxy won't see any new
cookie in the response and will never attach this session to a
specific server.
This also happens when you restart haproxy : the internal hash becomes
empty and all sessions loose their persistance (load balancing the
requests on all backend servers, creating a new session on each one).
For a user, it's like the service is unusable.
The patch modifies the code to make haproxy also learn the persistance
from the client : if no session is sent from the server, then the
session id found in the client part (using the URI or the client cookie)
is used to associated the server that gave the response.
As it's probably not a feature usable in all cases, I added an option
to enable it (by default it's disabled). The syntax of appsession becomes :
appsession <cookie> len <length> timeout <holdtime> [request-learn]
This helps haproxy repair the persistance (with the risk of losing its
session at the next request, as the user will probably not be load
balanced to the same server the first time).
2. This patch also tries to reduce the memory usage.
Here is a little example to explain the current behaviour :
- Take a Tomcat server where /session.jsp is valid.
- Send a request using a cookie with an unknown value AND a path
parameter with another unknown value :
curl -b "JSESSIONID=12345678901234567890123456789012" http://<haproxy>/session.jsp;jsessionid=00000000000000000000000000000001
(I know, it's unexpected to have a request like that on a live service)
Here, haproxy finds the URI session ID and stores it in its internal
hash (with no server associated). But it also finds the cookie session
ID and stores it again.
- As a result, session.jsp sends a new session ID also stored in the
internal hash, with a server associated.
=> For 1 request, haproxy has stored 3 entries, with only 1 which will be usable
The patch modifies the behaviour to store only 1 entry (maximum).
Doing this, we can remove the last BF_HIJACK user and remove
produce_content(). s->data_source could also be removed but
it is currently used to detect if the stats or a server was
used.
The HTTP processing has been splitted into 7 steps, one of which
is not anymore HTTP-specific (content-switching). That way, it
becomes possible to use "use_backend" rules in TCP mode. A new
"use_server" directive should follow soon.
Some stream analysers might become generic enough to be called
for several bits. So we cannot have the analyser bit hard coded
into the analyser itself. Let's make the caller inform the callee.
We want to split several steps in HTTP processing so that
we can call individual analysers depending on what processing
we want to perform. The first step consists in splitting the
part that waits for a request from the rest.
It's useful to be able to accept an invalid header name in a request
or response but still be able to monitor further such errors. Now,
when an invalid request/response is received and accepted due to
an "accept-invalid-http-{request|response}" option, the invalid
request will be captured for later analysis with "show errors" on
the stats socket.
Instead of calling a hard-coded function to produce data, let's
reference this function into the buffer and call it from there
when BF_HIJACK is set. This goes in the direction of more generic
session management code.
It was a bit awkward to have session.c call return_srv_error() for
HTTP error messages related to servers. The function has been adapted
to be passed a pointer to the faulty stream interface, and is now a
pointer in the session. It is possible that in the future, it will
become a callback in the stream interface itself.
The new function looks like the previous one except that it operates
at the stream interface level and assumes an already closed SI.
Also remove some old unused occurrences of srv_close_with_err().
proto_http.c was not suitable for session-related processing, it was
just convenient for the tranformation.
Some more splitting must occur: process_request/response in proto_http.c
must be split again per protocol, and the caller must run a list.
Some functions should be directly attached to the session or the buffer
(eg: perform_http_redirect, return_srv_error, http_sess_log).
srv_state has been removed from HTTP state machines, and states
have been split in either TCP states or analyzers. For instance,
the TARPIT state has just become a simple analyzer.
New flags have been added to the struct buffer to compensate this.
The high-level stream processors sometimes need to force a disconnection
without touching a file-descriptor (eg: report an error). But if
they touched BF_SHUTW or BF_SHUTR, the file descriptor would not
be closed. Thus, the two SHUT?_NOW flags have been added so that
an application can request a forced close which the stream interface
will be forced to obey.
During this change, a new BF_HIJACK flag was added. It will
be used for data generation, eg during a stats dump. It
prevents the producer on a buffer from sending data into it.
BF_SHUTR_NOW /* the producer must shut down for reads ASAP */
BF_SHUTW_NOW /* the consumer must shut down for writes ASAP */
BF_HIJACK /* the producer is temporarily replaced */
BF_SHUTW_NOW has precedence over BF_HIJACK. BF_HIJACK has
precedence over BF_MAY_FORWARD (so that it does not need it).
New functions buffer_shutr_now(), buffer_shutw_now(), buffer_abort()
are provided to manipulate BF_SHUT* flags.
A new type "stream_interface" has been added to describe both
sides of a buffer. A stream interface has states and error
reporting. The session now has two stream interfaces (one per
side). Each buffer has stream_interface pointers to both
consumer and producer sides.
The server-side file descriptor has moved to its stream interface,
so that even the buffer has access to it.
process_srv() has been split into three parts :
- tcp_get_connection() obtains a connection to the server
- tcp_connection_failed() tests if a previously attempted
connection has succeeded or not.
- process_srv_data() only manages the data phase, and in
this sense should be roughly equivalent to process_cli.
Little code has been removed, and a lot of old code has been
left in comments for now.