mirror of http://git.haproxy.org/git/haproxy.git/ (synced 2025-04-10 11:11:37 +00:00)
MINOR: ssl: mark the SSL handshake tasklet as heavy
There's a fairness issue between SSL and clear text. A full end-to-end
cleartext connection can require up to ~7.7 wakeups on average, plus 3.3
for the SSL tasklet, one of which is particularly expensive. So if we
accept to process many handshakes taking 1ms each, we significantly
increase the processing time of regular tasks just by adding an extra
delay between their calls. Ideally, in order to be fair, we should have
a 1:18 call ratio, but this requires a bit more accounting. With very
little effort we can mark the SSL handshake tasklet as TASK_HEAVY until
the handshake completes, and remove it once done.

Doing so reduces from 14 ms to 3.0 ms the total response time
experienced by HTTP clients running in parallel to 1000 SSL clients
doing full handshakes in loops. Better, when tune.sched.low-latency is
set to "on", the latency further drops to 1.8 ms.

The tasks latency distribution explains pretty well what is happening:

Without the patch:
  $ socat - /tmp/sock1 <<< "show profiling"
  Per-task CPU profiling              : on      # set profiling tasks {on|auto|off}
  Tasks activity:
    function                           calls   cpu_tot   cpu_avg   lat_tot   lat_avg
    ssl_sock_io_cb                   2785375    19.35m   416.9us    5.401h   6.980ms
    h1_io_cb                         1868949    9.853s   5.271us    4.829h   9.302ms
    process_stream                   1864066    7.582s   4.067us    2.058h   3.974ms
    si_cs_io_cb                      1733808    1.932s   1.114us    26.83m   928.5us
    h1_timeout_task                   935760         -         -    1.033h   3.975ms
    accept_queue_process              303606    4.627s   15.24us    16.65m   3.291ms
    srv_cleanup_toremove_connections     452   64.31ms   142.3us    2.447s   5.415ms
    task_run_applet                       47   5.149ms   109.6us   57.09ms   1.215ms
    srv_cleanup_idle_connections          34   2.210ms   65.00us   87.49ms   2.573ms

With the patch:
  $ socat - /tmp/sock1 <<< "show profiling"
  Per-task CPU profiling              : on      # set profiling tasks {on|auto|off}
  Tasks activity:
    function                           calls   cpu_tot   cpu_avg   lat_tot   lat_avg
    ssl_sock_io_cb                   3000365    21.08m   421.6us    20.30h   24.36ms
    h1_io_cb                         2031932    9.278s   4.565us    46.70m   1.379ms
    process_stream                   2010682    7.391s   3.675us    22.83m   681.2us
    si_cs_io_cb                      1702070    1.571s   922.0ns    8.732m   307.8us
    h1_timeout_task                  1009594         -         -    17.63m   1.048ms
    accept_queue_process              339595    4.792s   14.11us    3.714m   656.2us
    srv_cleanup_toremove_connections     779   75.42ms   96.81us   438.3ms   562.6us
    srv_cleanup_idle_connections          48   2.498ms   52.05us   178.1us   3.709us
    task_run_applet                       17   1.738ms   102.3us   11.29ms   663.9us
    other                                  1   947.8us   947.8us   202.6us   202.6us

=> h1_io_cb() and process_stream() are divided by 6 while
   ssl_sock_io_cb() is multiplied by 4.

And with low-latency on:
  $ socat - /tmp/sock1 <<< "show profiling"
  Per-task CPU profiling              : on      # set profiling tasks {on|auto|off}
  Tasks activity:
    function                           calls   cpu_tot   cpu_avg   lat_tot   lat_avg
    ssl_sock_io_cb                   3000565    20.96m   419.1us    20.74h   24.89ms
    h1_io_cb                         2019702    9.294s   4.601us    49.22m   1.462ms
    process_stream                   2009755    6.570s   3.269us    1.493m   44.57us
    si_cs_io_cb                      1997820    1.566s   783.0ns    2.985m   89.66us
    h1_timeout_task                  1009742         -         -    1.647m   97.86us
    accept_queue_process              494509    4.697s   9.498us    1.240m   150.4us
    srv_cleanup_toremove_connections    1120   92.32ms   82.43us   463.0ms   413.4us
    srv_cleanup_idle_connections          70   2.703ms   38.61us   204.5us   2.921us
    task_run_applet                       13   1.303ms   100.3us   85.12us   6.548us

=> process_stream() is divided by 100 while ssl_sock_io_cb() is
   multiplied by 4.

Interestingly, the total HTTPS response time doesn't increase and even
very slightly decreases, with an overall ~1% higher request rate. The
net effect here is a redistribution of the CPU resources between
internal tasks, and in the case of SSL, handshakes wait a bit more but
everything after completes faster.
This was made simple enough to be backportable if it helps some users suffering from high latencies in mixed traffic.
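For anyone wanting to reproduce these measurements, both knobs used
above are standard HAProxy facilities: the tune.sched.low-latency
global option and per-task CPU profiling queried over the stats
socket. A minimal configuration sketch follows; the socket path
/tmp/sock1 simply matches the one used in the captures above:

  global
      stats socket /tmp/sock1 level admin
      tune.sched.low-latency on    # optional: favor task latency over batching

  # then, at runtime:
  #   $ socat - /tmp/sock1 <<< "set profiling tasks on"
  #   $ socat - /tmp/sock1 <<< "show profiling"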
parent 74dea8caea
commit 9205ab31d2
@@ -5243,6 +5243,7 @@ static int ssl_sock_init(struct connection *conn, void **xprt_ctx)
 	}
 	ctx->wait_event.tasklet->process = ssl_sock_io_cb;
 	ctx->wait_event.tasklet->context = ctx;
+	ctx->wait_event.tasklet->state |= TASK_HEAVY; // assign it to the bulk queue during handshake
 	ctx->wait_event.events = 0;
 	ctx->sent_early_data = 0;
 	ctx->early_buf = BUF_NULL;
@@ -5820,8 +5821,13 @@ struct task *ssl_sock_io_cb(struct task *t, void *context, unsigned short state)
 		conn_delete_from_tree(&conn->hash_node->node);
 		HA_SPIN_UNLOCK(IDLE_CONNS_LOCK, &idle_conns[tid].idle_conns_lock);
 	/* First if we're doing an handshake, try that */
-	if (ctx->conn->flags & CO_FL_SSL_WAIT_HS)
+	if (ctx->conn->flags & CO_FL_SSL_WAIT_HS) {
 		ssl_sock_handshake(ctx->conn, CO_FL_SSL_WAIT_HS);
+		if (!(ctx->conn->flags & CO_FL_SSL_WAIT_HS)) {
+			/* handshake completed, leave the bulk queue */
+			_HA_ATOMIC_AND(&tl->state, ~TASK_SELF_WAKING);
+		}
+	}
 	/* If we had an error, or the handshake is done and I/O is available,
 	 * let the upper layer know.
 	 * If no mux was set up yet, then call conn_create_mux()
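The scheduler side that consumes this flag is not part of the hunks
above. Purely as an illustration of the principle (this is not
HAProxy's actual scheduler code, and all names below are hypothetical),
the sketch shows how a wakeup can route flagged tasklets to a separate
"heavy" class and ration them per polling loop, which is what bounds
the latency of the cheap tasklets:

  #include <stdio.h>

  #define TASK_HEAVY 0x1  /* hypothetical flag: callback is expensive (~1ms) */

  struct tasklet {
  	unsigned int state;
  	const char *name;
  	struct tasklet *next;
  };

  /* two illustrative classes: cheap I/O tasklets vs. heavy ones */
  static struct tasklet *queue_normal;
  static struct tasklet *queue_heavy;

  static void enqueue(struct tasklet **q, struct tasklet *t)
  {
  	t->next = *q;
  	*q = t;
  }

  /* route the tasklet by its class, as the TASK_HEAVY bit does */
  static void tasklet_wakeup(struct tasklet *t)
  {
  	if (t->state & TASK_HEAVY)
  		enqueue(&queue_heavy, t);
  	else
  		enqueue(&queue_normal, t);
  }

  /* one scheduler pass: drain the cheap queue, but cap the number of
   * heavy callbacks so that a flood of handshakes cannot insert a long
   * delay between two calls of a cheap tasklet
   */
  static void run_one_poll_loop(int heavy_budget)
  {
  	while (queue_normal) {
  		struct tasklet *t = queue_normal;
  		queue_normal = t->next;
  		printf("cheap: %s\n", t->name);
  	}
  	while (queue_heavy && heavy_budget-- > 0) {
  		struct tasklet *t = queue_heavy;
  		queue_heavy = t->next;
  		printf("heavy: %s\n", t->name);
  	}
  }

  int main(void)
  {
  	struct tasklet hs = { TASK_HEAVY, "ssl_handshake", NULL };
  	struct tasklet io = { 0, "h1_io_cb", NULL };

  	tasklet_wakeup(&hs);
  	tasklet_wakeup(&io);
  	run_one_poll_loop(1);   /* cheap work runs first, at most 1 heavy call */

  	/* once the handshake completes, the patch clears the flag so the
  	 * tasklet rejoins the normal class ("leave the bulk queue")
  	 */
  	hs.state &= ~TASK_HEAVY;
  	tasklet_wakeup(&hs);
  	run_one_poll_loop(1);
  	return 0;
  }

In the real patch the flag is set once at tasklet creation in
ssl_sock_init() and cleared atomically in ssl_sock_io_cb() as soon as
CO_FL_SSL_WAIT_HS goes away, so only the handshake phase is treated as
heavy work.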