MEDIUM: log/balance: merge tcp/http algo with log ones

The "log-balance" directive was recently introduced to configure the
balancing algorithm to use in a log backend. However, it is confusing
and it causes issues when used in a defaults section.

In this patch, we take another approach: we remove the "log-balance"
directive and instead rely on the existing "balance" directive to
configure log load balancing in a log backend.

Some algorithms such as roundrobin can be used as-is in a log backend.
Log-only algorithms are implemented as "log-$name" values of the
"balance" directive.

The documentation was updated accordingly.
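
As a rough illustration of the migration, here is a sketch pieced together
from the documentation and regtest changes in this patch. It is not an
exhaustive mapping, and the backend name and server addresses are
placeholders chosen for the example:

```
# removed syntax               replacement
# log-balance roundrobin   ->  balance roundrobin
# log-balance random       ->  balance random
# log-balance sticky       ->  balance log-sticky
# log-balance hash <cnv>   ->  balance log-hash <cnv>

backend mylog-failover
    mode log
    balance log-sticky              # was: log-balance sticky
    server s1 udp@127.0.0.1:514     # placeholder address
    server s2 udp@127.0.0.1:1514    # placeholder address
```

Algorithms documented as not usable in LOG mode (static-rr, leastconn,
hash, source) have no log equivalent in this mapping.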
Aurelien DARRAGON 2023-11-15 11:15:50 +01:00 committed by Willy Tarreau
parent f42dfaa214
commit b61147fd2a
7 changed files with 77 additions and 165 deletions

@@ -4710,7 +4710,8 @@ balance url_param <param> [check_post]
requests for it to be re-integrated into the farm and start
receiving traffic. This is normal, though very rare. It is
indicated here in case you would have the chance to observe
it, so that you don't worry.
it, so that you don't worry. Note: weights are ignored for
backends in LOG mode.
static-rr Each server is used in turns, according to their weights.
This algorithm is as similar to roundrobin except that it is
@@ -4719,7 +4720,7 @@ balance url_param <param> [check_post]
limitation on the number of servers, and when a server goes
up, it is always immediately reintroduced into the farm, once
the full map is recomputed. It also uses slightly less CPU to
run (around -1%).
run (around -1%). This algorithm is not usable in LOG mode.
leastconn The server with the lowest number of connections receives the
connection. Round-robin is performed within groups of servers
@@ -4730,7 +4731,8 @@ balance url_param <param> [check_post]
algorithm is dynamic, which means that server weights may be
adjusted on the fly for slow starts for instance. It will
also consider the number of queued connections in addition to
the established ones in order to minimize queuing.
the established ones in order to minimize queuing. This
algorithm is not usable in LOG mode.
first The first server with available connection slots receives the
connection. The servers are chosen from the lowest numeric
@@ -4760,7 +4762,8 @@ balance url_param <param> [check_post]
is not available, round robin will apply. This algorithm is
static by default, which means that changing a server's
weight on the fly will have no effect, but this can be
changed using "hash-type".
changed using "hash-type". This algorithm is not usable for
backends in LOG mode, please use "log-hash" instead.
source The source IP address is hashed and divided by the total
weight of the running servers to designate which server will
@@ -4775,6 +4778,7 @@ balance url_param <param> [check_post]
static by default, which means that changing a server's
weight on the fly will have no effect, but this can be
changed using "hash-type". See also the "hash" option above.
This algorithm is not usable for backends in LOG mode.
uri This algorithm hashes either the left part of the URI (before
the question mark) or the whole URI (if the "whole" parameter
@@ -4882,6 +4886,13 @@ balance url_param <param> [check_post]
the Power of Two Random Choices and is described here :
http://www.eecs.harvard.edu/~michaelm/postscripts/handbook2001.pdf
For backends in LOG mode, the number of draws is ignored and
a single random is picked since there is no notion of server
load. Random log balancing can be useful with large farms or
when servers are frequently added or removed from the pool of
available servers as it may avoid the hammering effect that
could result from roundrobin in this situation.
rdp-cookie
rdp-cookie(<name>)
The RDP cookie <name> (or "mstshash" if omitted) will be
@@ -4903,20 +4914,39 @@ balance url_param <param> [check_post]
but this can be changed using "hash-type". See also the
"hash" option above.
log-hash Takes a comma-delimited list of converters in argument. These
converters are applied in sequence to the input log message,
and the result will be cast as a string then hashed according
to the configured hash-type. The resulting hash will be used
to select the destination server among the ones declared in
the log backend. The goal of this algorithm is to be able to
extract a key within the final log message using string
converters and then be able to stick to the same server thanks
to the hash. Only "map-based" hashes are supported for now.
This algorithm is only usable for backends in LOG mode, for
others, please use "hash" instead.
log-sticky Tries to stick to the same server as much as possible. The
first server in the list of available servers receives all
the log messages. When the server goes DOWN, the next server
in the list takes its place. When a previously DOWN server
goes back UP it is added at the end of the list so that the
sticky server doesn't change until it becomes DOWN.
<arguments> is an optional list of arguments which may be needed by some
algorithms. Right now, only "url_param" and "uri" support an
optional argument.
algorithms. Right now, only "url_param", "uri" and "log-hash"
support an optional argument.
The load balancing algorithm of a backend is set to roundrobin when no other
algorithm, mode nor option have been set. The algorithm may only be set once
for each backend.
for each backend. In backends in LOG mode, server "weight" is always ignored.
With authentication schemes that require the same connection like NTLM, URI
based algorithms must not be used, as they would cause subsequent requests
to be routed to different backend servers, breaking the invalid assumptions
NTLM relies on.
Examples :
TCP/HTTP Examples :
balance roundrobin
balance url_param userid
balance url_param session_id check_post 64
@@ -4927,6 +4957,30 @@ balance url_param <param> [check_post]
balance hash var(req.client_id)
balance hash req.hdr_ip(x-forwarded-for,-1),ipmask(24)
LOG backend examples:
global
log backend@mylog-rrb local0 # send all logs to mylog-rrb backend
log backend@mylog-hash local0 # send all logs to mylog-hash backend
backend mylog-rrb
mode log
balance roundrobin
server s1 udp@127.0.0.1:514 # will receive 50% of log messages
server s2 udp@127.0.0.1:514
backend mylog-hash
mode log
# extract "METHOD URL PROTO" at the end of the log message,
# and let haproxy hash it so that log messages generated from
# similar requests get sent to the same syslog server:
balance log-hash 'field(-2,\")'
# server list here
server s1 127.0.0.1:514
#...
Note: the following caveats and limitations on using the "check_post"
extension with "url_param" must be considered :
@@ -8919,82 +8973,6 @@ no log
# level and send in tcp
log "${LOCAL_SYSLOG}:514" local0 notice # send to local server
log-balance <algorithm> [ <arguments> ]
Define the load balancing algorithm to be used in a log backend.
("mode log" enabled)
May be used in sections : defaults | frontend | listen | backend
yes | no | yes | yes
Arguments :
<algorithm> is the algorithm used to select a server when doing load
balancing. This only applies when no persistence information
is available, or when a connection is redispatched to another
server. <algorithm> may be one of the following :
roundrobin Each server is used in turns. This is the smoothest and
fairest algorithm when the server's processing time remains
equally distributed.
sticky The first server in the list of available servers receives all
the log messages. When the server goes DOWN, the next server
in the list takes its place. When a previously DOWN server
goes back UP it is added at the end of the list so that the
sticky server doesn't change until it becomes DOWN.
random A random number will be used as the key for the server
lookup. Random log balancing can be useful with large farms
or when servers are frequently added or removed from the
pool of available servers as it may avoid the hammering
effect that could result from roundrobin in this situation.
hash <arguments> should be found in the form: <cnv_list>
e.g.: log-balance hash <cnv_list>
Each log message will be passed to the converter list
specified in <cnv_list> (ie: "cnv1,cnv2..."), and it will
then be passed to haproxy hashing function according to
"hash-type" settings. The resulting hash will be used to
select the destination server among the ones declared in the
log backend. The goal of this algorithm is to be able to
extract a key within the final log message using string
converters and then be able to stick to the same server thanks
to the hash. Only "map-based" hashes are supported for now.
<arguments> is an optional list of arguments which may be needed by some
algorithms.
The load balancing algorithm of a log backend is set to roundrobin when
no other algorithm has been set. The algorithm may only be set once for each
log backend. The above algorithms support the "backup" server option and the
"allbackups" proxy option. However server "weight" is not supported and will
be ignored.
Examples :
global
log backend@mylog-rrb local0 # send all logs to mylog-rrb backend
log backend@mylog-hash local0 # send all logs to mylog-hash backend
backend mylog-rrb
mode log
log-balance roundrobin
server s1 udp@127.0.0.1:514 # will receive 50% of log messages
server s2 udp@127.0.0.1:514
backend mylog-hash
mode log
# extract "METHOD URL PROTO" at the end of the log message,
# and let haproxy hash it so that log messages generated from
# similar requests get sent to the same syslog server:
log-balance hash 'field(-2,\")'
# server list here
server s1 127.0.0.1:514
#...
log-format <string>
Specifies the log format string to use for traffic logs
May be used in sections: defaults | frontend | listen | backend

@@ -65,6 +65,7 @@
#define BE_LB_NEED_ADDR 0x00000100 /* only source address needed */
#define BE_LB_NEED_DATA 0x00000200 /* some payload is needed */
#define BE_LB_NEED_HTTP 0x00000400 /* an HTTP request is needed */
#define BE_LB_NEED_LOG 0x00000800 /* LOG backend required */
#define BE_LB_NEED 0x0000FF00 /* mask to get/clear dependencies */
/* Algorithm */
@@ -89,6 +90,8 @@
#define BE_LB_ALGO_HH (BE_LB_KIND_HI | BE_LB_NEED_HTTP | BE_LB_HASH_HDR) /* hash: HTTP header value */
#define BE_LB_ALGO_RCH (BE_LB_KIND_HI | BE_LB_NEED_DATA | BE_LB_HASH_RDP) /* hash: RDP cookie value */
#define BE_LB_ALGO_SMP (BE_LB_KIND_HI | BE_LB_NEED_DATA | BE_LB_HASH_SMP) /* hash: sample expression */
#define BE_LB_ALGO_LH (BE_LB_KIND_HI | BE_LB_NEED_LOG | BE_LB_HASH_SMP) /* log hash: sample expression */
#define BE_LB_ALGO_LS (BE_LB_KIND_CB | BE_LB_NEED_LOG | BE_LB_CB_FAS) /* log sticky */
#define BE_LB_ALGO (BE_LB_KIND | BE_LB_NEED | BE_LB_PARM ) /* mask to clear algo */
/* Higher bits define how a given criterion is mapped to a server. In fact it
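
The new BE_LB_ALGO_LH and BE_LB_ALGO_LS compositions differ from the TCP/HTTP
ones by carrying the BE_LB_NEED_LOG dependency bit, which is what lets the log
code test for them under the BE_LB_ALGO mask. The following is a minimal
standalone sketch of that idea. Only BE_LB_NEED_DATA, BE_LB_NEED_LOG,
BE_LB_NEED and the four compositions are taken from the hunks above; every
other numeric value and both helper functions are assumptions made for the
example and may not match HAProxy's actual definitions.

```c
#include <assert.h>

/* Values shown in the hunks above */
#define BE_LB_NEED_DATA 0x00000200
#define BE_LB_NEED_LOG  0x00000800
#define BE_LB_NEED      0x0000FF00

/* ASSUMED values: not part of this diff, illustrative only */
#define BE_LB_HASH_SMP  0x00000005
#define BE_LB_CB_FAS    0x00000001
#define BE_LB_PARM      0x000000FF
#define BE_LB_KIND_CB   0x00020000
#define BE_LB_KIND_HI   0x00030000
#define BE_LB_KIND      0x00070000

/* Compositions copied from the hunks above */
#define BE_LB_ALGO_SMP (BE_LB_KIND_HI | BE_LB_NEED_DATA | BE_LB_HASH_SMP)
#define BE_LB_ALGO_LH  (BE_LB_KIND_HI | BE_LB_NEED_LOG  | BE_LB_HASH_SMP)
#define BE_LB_ALGO_LS  (BE_LB_KIND_CB | BE_LB_NEED_LOG  | BE_LB_CB_FAS)
#define BE_LB_ALGO     (BE_LB_KIND | BE_LB_NEED | BE_LB_PARM)

/* Hypothetical helper: replace the algorithm bits while keeping all other
 * bits, mirroring "algo &= ~BE_LB_ALGO; algo |= <new algo>;" in the parser.
 */
static unsigned int set_algo(unsigned int cur, unsigned int algo)
{
	/* the new algorithm must fit entirely inside the algorithm mask */
	assert((algo & ~(unsigned int)BE_LB_ALGO) == 0);
	return (cur & ~(unsigned int)BE_LB_ALGO) | algo;
}

/* Hypothetical helper: does the selected algorithm require a LOG backend? */
static int algo_needs_log(unsigned int algo)
{
	return (algo & BE_LB_NEED_LOG) != 0;
}
```

With these assumed values, switching a proxy from BE_LB_ALGO_SMP to
BE_LB_ALGO_LH through set_algo() changes only the dependency bits, so a test
such as (algo & BE_LB_ALGO) == BE_LB_ALGO_LH no longer matches a TCP/HTTP
"hash" backend.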

@@ -45,7 +45,6 @@ void back_handle_st_cer(struct stream *s);
const char *backend_lb_algo_str(int algo);
int backend_parse_balance(const char **args, char **err, struct proxy *curproxy);
int backend_parse_log_balance(const char **args, char **err, struct proxy *curproxy);
int tcp_persist_rdp_cookie(struct stream *s, struct channel *req, int an_bit);
int be_downtime(struct proxy *px);

@@ -67,7 +67,7 @@ haproxy h1 -conf {
mode log
# extract id (integer) from URL in the form "GET /id" and use it as hash key
log-balance hash 'field(-2,\"),field(2,/),field(1, )'
balance log-hash 'field(-2,\"),field(2,/),field(1, )'
hash-type map-based none
server s1 udp@${Slg1_addr}:${Slg1_port} # syslog 1 only receives "GET /0" requests
@@ -79,7 +79,7 @@ haproxy h1 -conf {
backend mylog-failover
mode log
log-balance sticky
balance log-sticky
server s1 udp@${Slg21_addr}:${Slg21_port} # only receives "GET /srv1" request
server s2 udp@${Slg22_addr}:${Slg22_port} # only receives "GET /srv2" request

@@ -2823,55 +2823,23 @@ int backend_parse_balance(const char **args, char **err, struct proxy *curproxy)
return -1;
}
}
else {
memprintf(err, "only supports 'roundrobin', 'static-rr', 'leastconn', 'source', 'uri', 'url_param', 'hdr(name)' and 'rdp-cookie(name)' options.");
return -1;
}
return 0;
}
/* This function parses a "balance" statement in a log backend section
* describing <curproxy>. It returns -1 if there is any error, otherwise zero.
* If it returns -1, it will write an error message into the <err> buffer which
* will automatically be allocated and must be passed as NULL. The trailing '\n'
* will not be written. The function must be called with <args> pointing to the
* first word after "balance".
*/
int backend_parse_log_balance(const char **args, char **err, struct proxy *curproxy)
{
if (!*(args[0])) {
/* if no option is set, use round-robin by default */
curproxy->lbprm.algo &= ~BE_LB_ALGO;
curproxy->lbprm.algo |= BE_LB_ALGO_RR;
return 0;
}
if (strcmp(args[0], "roundrobin") == 0) {
curproxy->lbprm.algo &= ~BE_LB_ALGO;
curproxy->lbprm.algo |= BE_LB_ALGO_RR;
}
else if (strcmp(args[0], "sticky") == 0) {
curproxy->lbprm.algo &= ~BE_LB_ALGO;
/* we use ALGO_FAS as "sticky" mode in log-balance context */
curproxy->lbprm.algo |= BE_LB_ALGO_FAS;
}
else if (strcmp(args[0], "random") == 0) {
curproxy->lbprm.algo &= ~BE_LB_ALGO;
curproxy->lbprm.algo |= BE_LB_ALGO_RND;
}
else if (strcmp(args[0], "hash") == 0) {
else if (strcmp(args[0], "log-hash") == 0) {
if (!*args[1]) {
memprintf(err, "%s requires a converter list.", args[0]);
return -1;
}
curproxy->lbprm.algo &= ~BE_LB_ALGO;
curproxy->lbprm.algo |= BE_LB_ALGO_SMP;
curproxy->lbprm.algo |= BE_LB_ALGO_LH;
ha_free(&curproxy->lbprm.arg_str);
curproxy->lbprm.arg_str = strdup(args[1]);
}
else if (strcmp(args[0], "log-sticky") == 0) {
curproxy->lbprm.algo &= ~BE_LB_ALGO;
curproxy->lbprm.algo |= BE_LB_ALGO_LS;
}
else {
memprintf(err, "only supports 'roundrobin', 'sticky', 'random', 'hash' options");
memprintf(err, "only supports 'roundrobin', 'static-rr', 'leastconn', 'source', 'uri', 'url_param', 'hash', 'hdr(name)', 'rdp-cookie(name)', 'log-hash' and 'log-sticky' options.");
return -1;
}
return 0;

@@ -554,15 +554,6 @@ int cfg_parse_listen(const char *file, int linenum, char **args, int kwm)
err_code |= ERR_ALERT | ERR_FATAL;
goto out;
}
/* mode log shares lbprm struct with other modes, but makes a different use of it,
* thus, we must ensure that defproxy settings cannot persist between incompatibles
* modes at this point.
*/
if ((curr_defproxy->mode == PR_MODE_SYSLOG && curproxy->mode != PR_MODE_SYSLOG) ||
(curr_defproxy->mode != PR_MODE_SYSLOG && curproxy->mode == PR_MODE_SYSLOG)) {
/* lbprm settings from incompatible defproxy, back to defaults */
memset(&curproxy->lbprm, 0, sizeof(curproxy->lbprm));
}
}
else if (strcmp(args[0], "id") == 0) {
struct eb32_node *node;
@@ -2536,33 +2527,12 @@ int cfg_parse_listen(const char *file, int linenum, char **args, int kwm)
if (warnifnotcap(curproxy, PR_CAP_BE, file, linenum, args[0], NULL))
err_code |= ERR_WARN;
if (curproxy->mode != PR_MODE_TCP && curproxy->mode != PR_MODE_HTTP) {
ha_alert("parsing [%s:%d] : '%s' requires TCP or HTTP mode.\n", file, linenum, args[0]);
err_code |= ERR_ALERT | ERR_FATAL;
goto out;
}
if (backend_parse_balance((const char **)args + 1, &errmsg, curproxy) < 0) {
ha_alert("parsing [%s:%d] : %s %s\n", file, linenum, args[0], errmsg);
err_code |= ERR_ALERT | ERR_FATAL;
goto out;
}
}
else if (strcmp(args[0], "log-balance") == 0) { /* set log-balancing with optional algorithm */
if (warnifnotcap(curproxy, PR_CAP_BE, file, linenum, args[0], NULL))
err_code |= ERR_WARN;
if (curproxy->mode != PR_MODE_SYSLOG) {
ha_alert("parsing [%s:%d] : %s %s\n", file, linenum, args[0], "only available for log backends");
err_code |= ERR_ALERT | ERR_FATAL;
goto out;
}
if (backend_parse_log_balance((const char **)args + 1, &errmsg, curproxy) < 0) {
ha_alert("parsing [%s:%d] : %s %s\n", file, linenum, args[0], errmsg);
err_code |= ERR_ALERT | ERR_FATAL;
goto out;
}
}
else if (strcmp(args[0], "hash-type") == 0) { /* set hashing method */
/**
* The syntax for hash-type config element is
@@ -2572,12 +2542,6 @@ int cfg_parse_listen(const char *file, int linenum, char **args, int kwm)
*/
curproxy->lbprm.algo &= ~(BE_LB_HASH_TYPE | BE_LB_HASH_FUNC | BE_LB_HASH_MOD);
if (curproxy->mode != PR_MODE_TCP && curproxy->mode != PR_MODE_HTTP && curproxy->mode != PR_MODE_SYSLOG) {
ha_alert("parsing [%s:%d] : '%s' requires TCP, HTTP or LOG mode.\n", file, linenum, args[0]);
err_code |= ERR_ALERT | ERR_FATAL;
goto out;
}
if (warnifnotcap(curproxy, PR_CAP_BE, file, linenum, args[0], NULL))
err_code |= ERR_WARN;

@@ -910,7 +910,7 @@ static int postcheck_log_backend(struct proxy *be)
be->srv_bck = 0;
/* "log-balance hash" needs to compile its expression */
if ((be->lbprm.algo & BE_LB_ALGO) == BE_LB_ALGO_SMP) {
if ((be->lbprm.algo & BE_LB_ALGO) == BE_LB_ALGO_LH) {
struct sample_expr *expr;
char *expr_str = NULL;
char *err_str = NULL;
@@ -2198,7 +2198,7 @@ static inline void __do_send_log_backend(struct proxy *be, struct log_header hdr
*/
targetid = HA_ATOMIC_FETCH_ADD(&be->lbprm.log.lastid, 1) % nb_srv;
}
else if ((be->lbprm.algo & BE_LB_ALGO) == BE_LB_ALGO_FAS) {
else if ((be->lbprm.algo & BE_LB_ALGO) == BE_LB_ALGO_LS) {
/* sticky mode: use first server in the pool, which will always stay
* first during dequeuing and requeuing, unless it becomes unavailable
* and will be replaced by another one
@@ -2209,7 +2209,7 @@ static inline void __do_send_log_backend(struct proxy *be, struct log_header hdr
/* random mode */
targetid = statistical_prng() % nb_srv;
}
else if ((be->lbprm.algo & BE_LB_ALGO) == BE_LB_ALGO_SMP) {
else if ((be->lbprm.algo & BE_LB_ALGO) == BE_LB_ALGO_LH) {
struct sample result;
/* log-balance hash */