BUG/MEDIUM: session: fix reporting of handshake processing time in the logs

The handshake processing time used to be stored per stream, which was
valid when there was exactly one stream per session. With H2 and
multiplexing it's not the case anymore and the reported handshake times
are wrong in the logs as it's computed between the TCP accept() and the
stream creation. Let's first move the handshake where it belongs, which
is the session.

However, this is not enough because we don't want to report an excessive
idle time either for H2 (since many requests use the connection).

So the solution used here is to have the stream retrieve sess->tv_accept
and the handshake duration when the stream is created, and let the mux
immediately reset them. This way, the handshake time becomes zero for the
second and subsequent requests in H2 (which was already the case in H1),
and the idle time exactly counts how long the connection remained unused
while it could be used, so in H1 it runs from the end of the previous
response and in H2 it runs from the end of the previous request since the
channel is already available.

This patch will need to be backported to 1.8.
This commit is contained in:
Willy Tarreau 2018-09-05 11:56:48 +02:00
parent 90a7c03ec0
commit 590a0514f2
5 changed files with 37 additions and 18 deletions

View File

@ -15958,7 +15958,10 @@ Detailed fields description :
- "accept_date" is the exact date when the connection was received by haproxy
(which might be very slightly different from the date observed on the
network if there was some queuing in the system's backlog). This is usually
the same date which may appear in any upstream firewall's log.
the same date which may appear in any upstream firewall's log. When used in
HTTP mode, the accept_date field will be reset to the first moment the
connection is ready to receive a new request (end of previous response for
HTTP/1, immediately after previous request for HTTP/2).
- "frontend_name" is the name of the frontend (or listener) which received
and processed the connection.
@ -16158,24 +16161,25 @@ Detailed fields description :
request could be received or the a bad request was received. It should
always be very small because a request generally fits in one single packet.
Large times here generally indicate network issues between the client and
haproxy or requests being typed by hand. See "Timers" below for more details.
haproxy or requests being typed by hand. See section 8.4 "Timing Events"
for more details.
- "Tw" is the total time in milliseconds spent waiting in the various queues.
It can be "-1" if the connection was aborted before reaching the queue.
See "Timers" below for more details.
See section 8.4 "Timing Events" for more details.
- "Tc" is the total time in milliseconds spent waiting for the connection to
establish to the final server, including retries. It can be "-1" if the
request was aborted before a connection could be established. See "Timers"
below for more details.
request was aborted before a connection could be established. See section
8.4 "Timing Events" for more details.
- "Tr" is the total time in milliseconds spent waiting for the server to send
a full HTTP response, not counting data. It can be "-1" if the request was
aborted before a complete response could be received. It generally matches
the server's processing time for the request, though it may be altered by
the amount of data sent by the client to the server. Large times here on
"GET" requests generally indicate an overloaded server. See "Timers" below
for more details.
"GET" requests generally indicate an overloaded server. See section 8.4
"Timing Events" for more details.
- "Ta" is the time the request remained active in haproxy, which is the total
time in milliseconds elapsed between the first byte of the request was
@ -16184,7 +16188,7 @@ Detailed fields description :
one exception, if "option logasap" was specified, then the time counting
stops at the moment the log is emitted. In this case, a '+' sign is
prepended before the value, indicating that the final one will be larger.
See "Timers" below for more details.
See section 8.4 "Timing Events" for more details.
- "status_code" is the HTTP status code returned to the client. This status
is generally set by the server, but it might also be set by haproxy when
@ -16590,16 +16594,20 @@ Timings events in TCP mode:
may indicate that the client only pre-established the connection without
speaking, that it is experiencing network issues preventing it from
completing a handshake in a reasonable time (e.g. MTU issues), or that an
SSL handshake was very expensive to compute.
SSL handshake was very expensive to compute. Please note that this time is
reported only before the first request, so it is safe to average it over
all request to calculate the amortized value. The second and subsequent
request will always report zero here.
- Ti: is the idle time before the HTTP request (HTTP mode only). This timer
counts between the end of the handshakes and the first byte of the HTTP
request. When dealing with a second request in keep-alive mode, it starts
to count after the end of the transmission the previous response. Some
browsers pre-establish connections to a server in order to reduce the
latency of a future request, and keep them pending until they need it. This
delay will be reported as the idle time. A value of -1 indicates that
nothing was received on the connection.
to count after the end of the transmission the previous response. When a
multiplexed protocol such as HTTP/2 is used, it starts to count immediately
after the previous request. Some browsers pre-establish connections to a
server in order to reduce the latency of a future request, and keep them
pending until they need it. This delay will be reported as the idle time. A
value of -1 indicates that nothing was received on the connection.
- TR: total time to get the client request (HTTP mode only). It's the time
elapsed between the first bytes received and the moment the proxy received

View File

@ -46,6 +46,7 @@ struct session {
struct stkctr stkctr[MAX_SESS_STKCTR]; /* stick counters for tcp-connection */
struct vars vars; /* list of variables for the session scope. */
struct task *task; /* handshake timeout processing */
long t_handshake; /* handshake duration, -1 = not completed */
};
#endif /* _TYPES_SESSION_H */

View File

@ -649,6 +649,7 @@ static void h2s_destroy(struct h2s *h2s)
*/
static struct h2s *h2c_stream_new(struct h2c *h2c, int id)
{
struct session *sess = h2c->conn->owner;
struct conn_stream *cs;
struct h2s *h2s;
@ -684,6 +685,15 @@ static struct h2s *h2c_stream_new(struct h2c *h2c, int id)
if (stream_create_from_cs(cs) < 0)
goto out_free_cs;
/* We want the accept date presented to the next stream to be the one
* we have now, the handshake time to be null (since the next stream
* is not delayed by a handshake), and the idle time to count since
* right now.
*/
sess->accept_date = date;
sess->tv_accept = now;
sess->t_handshake = 0;
/* OK done, the stream lives its own life now */
if (h2_has_too_many_cs(h2c))
h2c->flags |= H2_CF_DEM_TOOMANY;

View File

@ -52,6 +52,7 @@ struct session *session_new(struct proxy *fe, struct listener *li, enum obj_type
memset(sess->stkctr, 0, sizeof(sess->stkctr));
vars_init(&sess->vars, SCOPE_SESS);
sess->task = NULL;
sess->t_handshake = -1; /* handshake not done yet */
HA_ATOMIC_UPDATE_MAX(&fe->fe_counters.conn_max,
HA_ATOMIC_ADD(&fe->feconn, 1));
if (li)
@ -405,6 +406,8 @@ static int conn_complete_session(struct connection *conn)
{
struct session *sess = conn->owner;
sess->t_handshake = tv_ms_elapsed(&sess->tv_accept, &now);
conn_clear_xprt_done_cb(conn);
/* Verify if the connection just established. */

View File

@ -114,10 +114,7 @@ struct stream *stream_new(struct session *sess, enum obj_type *origin)
s->logs.level = 0;
s->logs.accept_date = sess->accept_date; /* user-visible date for logging */
s->logs.tv_accept = sess->tv_accept; /* corrected date for internal use */
/* This function is called just after the handshake, so the handshake duration is
* between the accept time and now.
*/
s->logs.t_handshake = tv_ms_elapsed(&sess->tv_accept, &now);
s->logs.t_handshake = sess->t_handshake;
s->logs.t_idle = -1;
tv_zero(&s->logs.tv_request);
s->logs.t_queue = -1;