MEDIUM: stick-table: never learn the "conn_cur" value from peers

There have been a large number of issues reported with conn_cur
synchronization because the concept is wrong. In an active-passive
setup, pushing the local connections count from the active node to
the passive one will result in the passive node to have a higher
counter than the real number of connections. Due to this, after a
switchover, it will never be able to close enough connections to
go down to zero. The same commonly happens on reloads since the new
process preloads its values from the old process, and if no connection
happens for a key after the value is learned, it is impossible to reset
the previous ones. In active-active setups it's a bit different, as the
number of connections reflects the number on the peer that pushed last.

This patch solves this by marking the "conn_cur" local and preventing
it from being learned from peers. It is still pushed, however, so that
any monitoring system that collects values from the peers will still
see it.

The patch is tiny and trivially backportable. While a change of behavior
in stable branches is never welcome, it remains possible to fix issues
if reports become frequent.
This commit is contained in:
Willy Tarreau 2021-10-08 17:53:12 +02:00
parent e3f4d7496d
commit db2ab8218c
4 changed files with 22 additions and 8 deletions

View File

@ -3013,12 +3013,21 @@ user <username> [password|insecure-password <password>]
It is possible to propagate entries of any data-types in stick-tables between
several HAProxy instances over TCP connections in a multi-master fashion. Each
instance pushes its local updates and insertions to remote peers. The pushed
values overwrite remote ones without aggregation. Interrupted exchanges are
automatically detected and recovered from the last known point.
In addition, during a soft restart, the old process connects to the new one
using such a TCP connection to push all its entries before the new process
tries to connect to other peers. That ensures very fast replication during a
reload, it typically takes a fraction of a second even for large tables.
values overwrite remote ones without aggregation. As an exception, the data
type "conn_cur" is never learned from peers, as it is supposed to reflect local
values. Earlier versions used to synchronize it and to cause negative values in
active-active setups, and always-growing values upon reloads or active-passive
switches because the local value would reflect more connections than locally
present. This information, however, is pushed so that monitoring systems can
watch it.
Interrupted exchanges are automatically detected and recovered from the last
known point. In addition, during a soft restart, the old process connects to
the new one using such a TCP connection to push all its entries before the new
process tries to connect to other peers. That ensures very fast replication
during a reload, it typically takes a fraction of a second even for large
tables.
Note that Server IDs are used to identify servers remotely, so it is important
that configurations look similar or at least that the same IDs are forced on
each server on all participants.

View File

@ -125,7 +125,8 @@ struct stktable_data_type {
const char *name; /* name of the data type */
int std_type; /* standard type we can use for this data, STD_T_* */
int arg_type; /* type of optional argument, ARG_T_* */
int is_array;
int is_array:1; /* this is an array of gpc/gpt */
int is_local:1; /* this is local only and never learned */
};
/* stick table keyword type */

View File

@ -1778,6 +1778,10 @@ static int peer_treat_updatemsg(struct appctx *appctx, struct peer *p, int updt,
if (!((1ULL << data_type) & st->remote_data))
continue;
if (stktable_data_types[data_type].is_local)
continue;
if (stktable_data_types[data_type].is_array) {
/* in case of array all elements
* use the same std_type and they

View File

@ -1145,7 +1145,7 @@ struct stktable_data_type stktable_data_types[STKTABLE_DATA_TYPES] = {
[STKTABLE_DT_GPC0_RATE] = { .name = "gpc0_rate", .std_type = STD_T_FRQP, .arg_type = ARG_T_DELAY },
[STKTABLE_DT_CONN_CNT] = { .name = "conn_cnt", .std_type = STD_T_UINT },
[STKTABLE_DT_CONN_RATE] = { .name = "conn_rate", .std_type = STD_T_FRQP, .arg_type = ARG_T_DELAY },
[STKTABLE_DT_CONN_CUR] = { .name = "conn_cur", .std_type = STD_T_UINT },
[STKTABLE_DT_CONN_CUR] = { .name = "conn_cur", .std_type = STD_T_UINT, .is_local = 1 },
[STKTABLE_DT_SESS_CNT] = { .name = "sess_cnt", .std_type = STD_T_UINT },
[STKTABLE_DT_SESS_RATE] = { .name = "sess_rate", .std_type = STD_T_FRQP, .arg_type = ARG_T_DELAY },
[STKTABLE_DT_HTTP_REQ_CNT] = { .name = "http_req_cnt", .std_type = STD_T_UINT },