Also add some thoughts about the existing and new design. Note: an earlier design used the names "head" and "tail" for both sides of the buffer, but it appears awkward as these words may be understood in two forms (feed by head, output by tail, or make the newcomers wait at the tail of the queue). Also there were already a few functions in the code making use of either terminology. So better avoid this terminology and use "input" and "output" instead.
2012/02/27 - redesigning buffers for better simplicity - w@1wt.eu

1) Analysis
-----------

Buffer handling becomes complex because buffers are circular but many of their
users don't support wrapping operations (eg: HTTP parsing). Due to this fact,
some buffer operations automatically realign buffers as soon as possible when
the buffer is empty, which makes it very hard to track buffer pointers outside
of the buffer struct itself. The buffer contains a pointer to last processed
data (buf->lr) which is automatically realigned with such operations. But in
the end, its semantics are often unclear and whether it's safe or not to use it
isn't always obvious, as it has acquired multiple roles over time.

A "struct buffer" is declared this way :

    struct buffer {
        unsigned int flags;             /* BF_* */
        int rex;                        /* expiration date for a read, in ticks */
        int wex;                        /* expiration date for a write or connect, in ticks */
        int rto;                        /* read timeout, in ticks */
        int wto;                        /* write timeout, in ticks */
        unsigned int l;                 /* data length */
        char *r, *w, *lr;               /* read ptr, write ptr, last read */
        unsigned int size;              /* buffer size in bytes */
        unsigned int send_max;          /* number of bytes the sender can consume from this buffer, <= l */
        unsigned int to_forward;        /* number of bytes to forward after send_max without a wake-up */
        unsigned int analysers;         /* bit field indicating what to do on the buffer */
        int analyse_exp;                /* expiration date for current analysers (if set) */
        void (*hijacker)(struct session *, struct buffer *); /* alternative content producer */
        unsigned char xfer_large;       /* number of consecutive large xfers */
        unsigned char xfer_small;       /* number of consecutive small xfers */
        unsigned long long total;       /* total data read */
        struct stream_interface *prod;  /* producer attached to this buffer */
        struct stream_interface *cons;  /* consumer attached to this buffer */
        struct pipe *pipe;              /* non-NULL only when data present */
        char data[0];                   /* <size> bytes */
    };

In order to address this, a struct http_msg was created with other pointers to
the buffer. The issue is that some of these pointers are absolute and others
are relative, sometimes one to another, sometimes to the beginning of the
buffer, which doesn't help at all for the case where buffers get realigned.

A "struct http_msg" is defined this way :

    struct http_msg {
        unsigned int msg_state;
        unsigned int flags;
        unsigned int col, sov;          /* current header: colon, start of value */
        unsigned int eoh;               /* End Of Headers, relative to buffer */
        char *sol;                      /* start of line, also start of message when fully parsed */
        char *eol;                      /* end of line */
        unsigned int som;               /* Start Of Message, relative to buffer */
        int err_pos;                    /* err handling: -2=block, -1=pass, 0+=detected */
        union {                         /* useful start line pointers, relative to ->sol */
            struct {
                int l;                  /* request line length (not including CR) */
                int m_l;                /* METHOD length (method starts at ->som) */
                int u, u_l;             /* URI, length */
                int v, v_l;             /* VERSION, length */
            } rq;                       /* request line : field, length */
            struct {
                int l;                  /* status line length (not including CR) */
                int v_l;                /* VERSION length (version starts at ->som) */
                int c, c_l;             /* CODE, length */
                int r, r_l;             /* REASON, length */
            } st;                       /* status line : field, length */
        } sl;                           /* start line */
        unsigned long long chunk_len;
        unsigned long long body_len;
        char **cap;
    };


The first immediate observation is that nothing in a buffer should be relative
to the beginning of the storage area, everything should be relative to the
buffer's origin as a floating location. Right now the buffer's origin is equal
to (buf->w + buf->send_max). It is the place where the first byte of data not
yet scheduled for being forwarded is found.

  - buf->w is an absolute pointer, just like buf->data.
  - buf->send_max is a relative value which oscillates between 0 when nothing
    has to be forwarded, and buf->l when the whole buffer must be forwarded.


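As an illustration, locating the origin in the current design could look like
the sketch below. The reduced "struct old_buf" and the old_buf_origin() helper
are hypothetical and only keep the fields involved (with a plain pointer for
the storage area instead of the trailing array). Since the buffer is circular,
the sum may pass the end of the storage area, so a wrapping test is assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced, hypothetical view of the current buffer: only the fields needed
 * to locate the origin are kept. Field names follow "struct buffer".
 */
struct old_buf {
    char *data;             /* start of the storage area */
    unsigned int size;      /* storage size in bytes */
    unsigned int l;         /* data length */
    unsigned int send_max;  /* bytes scheduled for forwarding, <= l */
    char *w;                /* write ptr: first byte still to be sent */
};

/* Returns the buffer's origin, i.e. the first byte not yet scheduled for
 * forwarding. The sum may pass the end of the storage area because the
 * buffer is circular, hence the hand-made modulo.
 */
static char *old_buf_origin(const struct old_buf *b)
{
    size_t ofs = (size_t)(b->w - b->data) + b->send_max;

    assert(b->send_max <= b->l && b->l <= b->size);
    if (ofs >= b->size)
        ofs -= b->size;
    return b->data + ofs;
}
```

Every user needing the origin has to redo this addition and wrapping test,
which is the cost the proposal below tries to avoid.
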
2) Proposal
-----------

By having such an origin, we could have everything in http_msg relative to this
origin. This would resist buffer realigns much better than right now.

At the moment we have msg->som which is relative to buf->data and which points
to the beginning of the message. The beginning of the message should *always*
be the buffer's origin. If data are to be skipped in the message, just wait for
send_max to become zero and move the origin forwards ; this would definitely get
rid of msg->som. This is already what is done in the HTTP parser except that it
has to move both buf->lr and msg->som.

Following the same principle, we should then have a relative pointer in
http_msg to replace buf->lr. It would be relative to the buffer's origin and
would simply recall what location was last visited.

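A minimal sketch of why origin-relative positions resist realigns is given
here. All names (struct rel_buf, struct rel_msg, the "last" field and both
helpers) are hypothetical, and the data are assumed not to wrap. The realign
only touches the origin, and the offset kept in the message remains valid.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical reduced types: <origin> is the floating origin discussed
 * above, <last> is an offset relative to it (a stand-in for buf->lr).
 */
struct rel_buf {
    char *origin;       /* absolute pointer to the first unforwarded byte */
    unsigned int len;   /* bytes of pending data after the origin */
    char data[64];      /* storage area */
};

struct rel_msg {
    unsigned int last;  /* last visited location, relative to buf->origin */
};

/* Moves pending data back to the start of the storage area. Only the
 * origin changes; offsets stored in the message stay valid.
 * Assumes the data do not wrap.
 */
static void rel_buf_realign(struct rel_buf *b)
{
    assert(b->origin >= b->data);
    assert(b->origin + b->len <= b->data + sizeof(b->data));
    memmove(b->data, b->origin, b->len);
    b->origin = b->data;
}

/* Resolves a message-relative offset into an absolute pointer. */
static char *rel_msg_ptr(const struct rel_buf *b, unsigned int ofs)
{
    return b->origin + ofs;
}
```

An absolute pointer such as msg->sol would have to be fixed up after the
memmove(), while the relative offset needs no adjustment.
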
Doing all this could result in more complex operations where more time is spent
adding buf->w to buf->send_max and then to msg->anything. It would probably make
more sense to define the buffer's origin as an absolute pointer and to have
both the buf->h (head) and buf->t (tail) pointers be positive and negative
positions relative to this origin. Operating on the buffer would then look like
this :

  - no buf->l anymore. buf->l is replaced by (head + tail)
  - no buf->lr anymore. Use origin + msg->last for instance
  - recv() : head += recv(origin + head);
  - send() : tail -= send(origin - tail, tail);
    thus, the tail (buf->o, for "output") effectively replaces buf->send_max.
  - forward(N) : tail += N; head -= N; origin += N;
  - realign() : origin = data
  - detect risk of wrapping of input : origin + head > data + size

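The operations listed above can be sketched as follows. This is only an
illustration under stated assumptions: struct new_buf and the nb_*() helpers
are hypothetical names, memory copies stand in for the real recv() and send()
calls, and wrapping is ignored. forward() also decrements head here, which is
inferred from the need to keep (head + tail) constant, and realign() is only
attempted once nothing remains to be sent.

```c
#include <assert.h>
#include <string.h>

#define NB_SIZE 32

/* Hypothetical layout: input lies in [origin, origin + head), output lies
 * in [origin - tail, origin). Wrapping is deliberately ignored here.
 */
struct new_buf {
    char *origin;        /* absolute pointer separating output from input */
    unsigned int head;   /* input bytes after the origin */
    unsigned int tail;   /* output bytes before the origin */
    char data[NB_SIZE];  /* storage area */
};

static void nb_init(struct new_buf *b)
{
    b->origin = b->data;
    b->head = b->tail = 0;
}

/* recv() : head += recv(origin + head). <src> stands in for the socket. */
static unsigned int nb_recv(struct new_buf *b, const char *src, unsigned int len)
{
    unsigned int room = (unsigned int)(b->data + NB_SIZE - (b->origin + b->head));

    if (len > room)
        len = room;
    memcpy(b->origin + b->head, src, len);
    b->head += len;
    return len;
}

/* forward(N) : schedule N input bytes for sending by moving the origin. */
static void nb_forward(struct new_buf *b, unsigned int n)
{
    assert(n <= b->head);
    b->tail += n;
    b->head -= n;
    b->origin += n;
}

/* send() : tail -= send(origin - tail, tail). <dst> stands in for the socket. */
static unsigned int nb_send(struct new_buf *b, char *dst, unsigned int max)
{
    unsigned int len = b->tail < max ? b->tail : max;

    memcpy(dst, b->origin - b->tail, len);
    b->tail -= len;
    return len;
}

/* realign() : origin = data. Only done here once nothing is left to send. */
static void nb_realign(struct new_buf *b)
{
    assert(b->tail == 0);
    memmove(b->data, b->origin, b->head);
    b->origin = b->data;
}

/* Total data length, which replaces buf->l. */
static unsigned int nb_len(const struct new_buf *b)
{
    return b->head + b->tail;
}
```

Note that none of these helpers needs to add three values together to find a
byte: everything is one addition or subtraction away from the origin.
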
In general it looks like fewer pointers are manipulated for common operations
and that maybe an additional wrapping test (hand-made modulo) will have to be
added to send() and recv() operations.


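A possible shape for such a wrapping test is sketched below. It works on
offsets from the start of the storage area rather than on pointers, which is
an assumption made to keep the example short, and the three helper names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Wraps an offset computed from the start of the storage area back into
 * [0, size). Valid as long as ofs < 2 * size, which avoids a division.
 */
static size_t wrap_ofs(size_t ofs, size_t size)
{
    assert(ofs < 2 * size);
    return ofs >= size ? ofs - size : ofs;
}

/* Tells whether <head> bytes of input starting at offset <origin> would
 * pass the end of a storage area of <size> bytes.
 */
static int input_wraps(size_t origin, size_t head, size_t size)
{
    return origin + head > size;
}

/* Returns how many bytes a single recv() may store contiguously after the
 * current input, given <tail> bytes of output pending before the origin.
 */
static size_t recv_room(size_t origin, size_t head, size_t tail, size_t size)
{
    size_t pos, room;

    assert(origin < size && head + tail <= size);
    pos = wrap_ofs(origin + head, size);
    room = size - head - tail;
    return room < size - pos ? room : size - pos;
}
```

With a 16-byte area, an origin at offset 12 and 6 bytes of input, the input
wraps and the next recv() would start at offset 2.
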
3) Caveats
----------

The first caveat is that the elements to modify appear in a very large number
of places.