MEDIUM: backend: add new "balance hash <expr>" algorithm

Almost all of our hash-based LB algorithms are implemented as special
cases of something that can now be achieved using sample expressions,
and some of them have adopted some options to adapt their behavior in
ways that could also be achieved using converters.

There are users who want to hash other parameters that are combined
into variables, and who set headers from these values and use
"balance hdr(name)" for this.

Instead of constantly implementing specific options and having users
hack around when they want a real hash, let's implement a native hash
mode that applies to a standard sample expression. This way, any
fetchable element (including variables) may be used to construct the
hash, even modified by any converter if desired.
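
As a rough illustration of the workaround described above and of what the new
mode is meant to replace, here is a minimal configuration sketch. The backend
names, the "X-Client-Id" header, the "clientid" cookie, the variable name and
the server addresses are all assumptions chosen for the example, not taken
from any real setup:

```haproxy
# Before: the value is copied into a header only so that "balance hdr()"
# has something to hash (hypothetical names throughout).
backend app_old
    mode http
    http-request set-var(req.client_id) req.cook(clientid)
    http-request set-header X-Client-Id %[var(req.client_id)]
    balance hdr(X-Client-Id)
    server s1 192.0.2.10:80
    server s2 192.0.2.11:80

# After: the sample expression is hashed directly, no helper header needed.
backend app_new
    mode http
    http-request set-var(req.client_id) req.cook(clientid)
    balance hash var(req.client_id)
    server s1 192.0.2.10:80
    server s2 192.0.2.11:80
```

In the second backend the intermediate header is gone, and a converter could
be appended to the expression if the key needs to be normalized first.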
Willy Tarreau 2022-04-25 10:25:34 +02:00
parent b9f30f398b
commit 7c9a0fe2a6
6 changed files with 129 additions and 6 deletions

@ -3986,6 +3986,19 @@ balance url_param <param> [check_post]
turn new servers on when the queue inflates. Alternatively,
using "http-check send-state" may inform servers on the load.
hash Takes a regular sample expression in argument. The expression
is evaluated for each request and hashed according to the
configured hash-type. The result of the hash is divided by
the total weight of the running servers to designate which
server will receive the request. This can be used in place of
"source", "uri", "hdr()", "url_param()", "rdp-cookie" to make
use of a converter, refine the evaluation, or be used to
extract data from local variables for example. When the data
is not available, round robin will apply. This algorithm is
static by default, which means that changing a server's
weight on the fly will have no effect, but this can be
changed using "hash-type".
source The source IP address is hashed and divided by the total
weight of the running servers to designate which server will
receive the request. This ensures that the same client IP
@ -3998,7 +4011,7 @@ balance url_param <param> [check_post]
to clients which refuse session cookies. This algorithm is
static by default, which means that changing a server's
weight on the fly will have no effect, but this can be
changed using "hash-type".
changed using "hash-type". See also the "hash" option above.
uri This algorithm hashes either the left part of the URI (before
the question mark) or the whole URI (if the "whole" parameter
@ -4030,7 +4043,8 @@ balance url_param <param> [check_post]
A "path-only" parameter indicates that the hashing key starts
at the first '/' of the path. This can be used to ignore the
authority part of absolute URIs, and to make sure that HTTP/1
and HTTP/2 URIs will provide the same hash.
and HTTP/2 URIs will provide the same hash. See also the
"hash" option above.
url_param The URL parameter specified in argument will be looked up in
the query string of each HTTP GET request.
@ -4058,7 +4072,8 @@ balance url_param <param> [check_post]
applied. Note that this algorithm may only be used in an HTTP
backend. This algorithm is static by default, which means
that changing a server's weight on the fly will have no
effect, but this can be changed using "hash-type".
effect, but this can be changed using "hash-type". See also
the "hash" option above.
hdr(<name>) The HTTP header <name> will be looked up in each HTTP
request. Just as with the equivalent ACL 'hdr()' function,
@ -4073,7 +4088,8 @@ balance url_param <param> [check_post]
This algorithm is static by default, which means that
changing a server's weight on the fly will have no effect,
but this can be changed using "hash-type".
but this can be changed using "hash-type". See also the
"hash" option above.
random
random(<draws>)
@ -4121,7 +4137,8 @@ balance url_param <param> [check_post]
This algorithm is static by default, which means that
changing a server's weight on the fly will have no effect,
but this can be changed using "hash-type".
but this can be changed using "hash-type". See also the
"hash" option above.
<arguments> is an optional list of arguments which may be needed by some
algorithms. Right now, only "url_param" and "uri" support an
@ -4143,6 +4160,9 @@ balance url_param <param> [check_post]
balance hdr(User-Agent)
balance hdr(host)
balance hdr(Host) use_domain_only
balance hash req.cookie(clientid)
balance hash var(req.client_id)
balance hash req.hdr_ip(x-forwarded-for,-1),ipmask(24)
Note: the following caveats and limitations on using the "check_post"
extension with "url_param" must be considered :

@ -47,6 +47,7 @@
#define BE_LB_HASH_PRM 0x00002 /* hash HTTP URL parameter */
#define BE_LB_HASH_HDR 0x00003 /* hash HTTP header value */
#define BE_LB_HASH_RDP 0x00004 /* hash RDP cookie value */
#define BE_LB_HASH_SMP 0x00005 /* hash a sample expression */
#define BE_LB_HASH_RND 0x00008 /* hash a random value */
/* BE_LB_RR_* is used with BE_LB_KIND_RR */
@ -89,6 +90,7 @@
#define BE_LB_ALGO_PH (BE_LB_KIND_HI | BE_LB_NEED_HTTP | BE_LB_HASH_PRM) /* hash: HTTP URL parameter */
#define BE_LB_ALGO_HH (BE_LB_KIND_HI | BE_LB_NEED_HTTP | BE_LB_HASH_HDR) /* hash: HTTP header value */
#define BE_LB_ALGO_RCH (BE_LB_KIND_HI | BE_LB_NEED_DATA | BE_LB_HASH_RDP) /* hash: RDP cookie value */
#define BE_LB_ALGO_SMP (BE_LB_KIND_HI | BE_LB_NEED_DATA | BE_LB_HASH_SMP) /* hash: sample expression */
#define BE_LB_ALGO (BE_LB_KIND | BE_LB_NEED | BE_LB_PARM ) /* mask to clear algo */
/* Higher bits define how a given criterion is mapped to a server. In fact it
@ -152,6 +154,7 @@ struct lbprm {
int wmult; /* ratio between user weight and effective weight */
int wdiv; /* ratio between effective weight and user weight */
int hash_balance_factor; /* load balancing factor * 100, 0 if disabled */
struct sample_expr *expr; /* sample expression for "balance hash" */
char *arg_str; /* name of the URL parameter/header/cookie used for hashing */
int arg_len; /* strlen(arg_str), computed only once */
int arg_opt1; /* extra option 1 for the LB algo (algo-specific) */

@ -524,6 +524,40 @@ static struct server *get_server_rch(struct stream *s, const struct server *avoi
return map_get_server_hash(px, hash);
}
/* sample expression HASH. Returns NULL if the sample is not found or if there
* are no server, relying on the caller to fall back to round robin instead.
*/
static struct server *get_server_expr(struct stream *s, const struct server *avoid)
{
struct proxy *px = s->be;
struct sample *smp;
unsigned int hash = 0;
if (px->lbprm.tot_weight == 0)
return NULL;
/* note: no need to hash if there's only one server left */
if (px->lbprm.tot_used == 1)
goto hash_done;
smp = sample_fetch_as_type(px, s->sess, s, SMP_OPT_DIR_REQ | SMP_OPT_FINAL, px->lbprm.expr, SMP_T_BIN);
if (!smp)
return NULL;
/* We have the desired data. Let's hash it according to the configured
* options and algorithm.
*/
hash = gen_hash(px, smp->data.u.str.area, smp->data.u.str.data);
if ((px->lbprm.algo & BE_LB_HASH_MOD) == BE_LB_HMOD_AVAL)
hash = full_hash(hash);
hash_done:
if ((px->lbprm.algo & BE_LB_LKUP) == BE_LB_LKUP_CHTREE)
return chash_get_server_hash(px, hash, avoid);
else
return map_get_server_hash(px, hash);
}
/* random value */
static struct server *get_server_rnd(struct stream *s, const struct server *avoid)
{
@ -760,6 +794,11 @@ int assign_server(struct stream *s)
srv = get_server_rch(s, prev_srv);
break;
case BE_LB_HASH_SMP:
/* sample expression hashing */
srv = get_server_expr(s, prev_srv);
break;
default:
/* unknown balancing algorithm */
err = SRV_STATUS_INTERNAL;
@ -2578,6 +2617,8 @@ const char *backend_lb_algo_str(int algo) {
return "hdr";
else if (algo == BE_LB_ALGO_RCH)
return "rdp-cookie";
else if (algo == BE_LB_ALGO_SMP)
return "hash";
else if (algo == BE_LB_ALGO_NONE)
return "none";
else
@ -2707,6 +2748,23 @@ int backend_parse_balance(const char **args, char **err, struct proxy *curproxy)
}
}
}
else if (strcmp(args[0], "hash") == 0) {
if (!*args[1]) {
memprintf(err, "%s requires a sample expression.", args[0]);
return -1;
}
curproxy->lbprm.algo &= ~BE_LB_ALGO;
curproxy->lbprm.algo |= BE_LB_ALGO_SMP;
ha_free(&curproxy->lbprm.arg_str);
curproxy->lbprm.arg_str = strdup(args[1]);
curproxy->lbprm.arg_len = strlen(args[1]);
if (*args[2]) {
memprintf(err, "%s takes no other argument (got '%s').", args[0], args[2]);
return -1;
}
}
else if (!strncmp(args[0], "hdr(", 4)) {
const char *beg, *end;

@ -3294,6 +3294,47 @@ out_uri_auth_compat:
curproxy->conf.args.line = 0;
}
/* "balance hash" needs to compile its expression */
if ((curproxy->lbprm.algo & BE_LB_ALGO) == BE_LB_ALGO_SMP) {
int idx = 0;
const char *args[] = {
curproxy->lbprm.arg_str,
NULL,
};
err = NULL;
curproxy->conf.args.ctx = ARGC_USRV; // same context as use_server.
curproxy->lbprm.expr =
sample_parse_expr((char **)args, &idx,
curproxy->conf.file, curproxy->conf.line,
&err, &curproxy->conf.args, NULL);
if (!curproxy->lbprm.expr) {
ha_alert("%s '%s' [%s:%d]: failed to parse 'balance hash' expression '%s' in : %s.\n",
proxy_type_str(curproxy), curproxy->id,
curproxy->conf.file, curproxy->conf.line,
curproxy->lbprm.arg_str, err);
ha_free(&err);
cfgerr++;
}
else if (!(curproxy->lbprm.expr->fetch->val & SMP_VAL_BE_SET_SRV)) {
ha_alert("%s '%s' [%s:%d]: error detected while parsing 'balance hash' expression '%s' "
"which requires information from %s, which is not available here.\n",
proxy_type_str(curproxy), curproxy->id,
curproxy->conf.file, curproxy->conf.line,
curproxy->lbprm.arg_str, sample_src_names(curproxy->lbprm.expr->fetch->use));
cfgerr++;
}
else if (curproxy->mode == PR_MODE_HTTP && (curproxy->lbprm.expr->fetch->use & SMP_USE_L6REQ)) {
ha_warning("%s '%s' [%s:%d]: L6 sample fetch <%s> will be ignored in 'balance hash' expression in HTTP mode.\n",
proxy_type_str(curproxy), curproxy->id,
curproxy->conf.file, curproxy->conf.line,
curproxy->lbprm.arg_str);
}
else
curproxy->http_needed |= !!(curproxy->lbprm.expr->fetch->use & SMP_USE_HTTP_ANY);
}
/* only now we can check if some args remain unresolved.
* This must be done after the users and groups resolution.
*/

@ -154,6 +154,7 @@ void free_proxy(struct proxy *p)
free(p->cookie_domain);
free(p->cookie_attrs);
free(p->lbprm.arg_str);
release_sample_expr(p->lbprm.expr);
free(p->server_state_file_name);
free(p->capture_name);
istfree(&p->monitor_uri);

@ -1307,7 +1307,7 @@ int smp_resolve_args(struct proxy *p, char **err)
case ARGC_SRV: where = "in server directive in"; break;
case ARGC_SPOE: where = "in spoe-message directive in"; break;
case ARGC_UBK: where = "in use_backend expression in"; break;
case ARGC_USRV: where = "in use-server expression in"; break;
case ARGC_USRV: where = "in use-server or balance expression in"; break;
case ARGC_HERR: where = "in http-error directive in"; break;
case ARGC_OT: where = "in ot-scope directive in"; break;
case ARGC_TCO: where = "in tcp-request connection expression in"; break;