BUG/MAJOR: dns: disabled servers through SRV records never recover

A regression was introduced by commit 13a9232ebc
when I added support for the Additional section of SRV responses.

Basically, when a server is managed through the Additional section of SRV
records and it gets disabled (because its associated Additional record has
disappeared), it never leaves its MAINT state and so never comes back into
production.
This patch updates the "snr_update_srv_status()" function to clear the
MAINT status when the server has an IP address again, and it also ensures
this function is called when parsing Additional records (and associating
them with new servers).

This can cause a severe outage for people using HAProxy + consul (or any
other service registry) through DNS service discovery.

This should fix issue #793.
This should be backported to 2.2.
Author: Baptiste Assmann, 2020-08-04 10:57:21 +02:00, committed by Christopher Faulet
parent cde83033d0
commit 87138c3524
2 changed files with 9 additions and 0 deletions

src/dns.c

@@ -648,6 +648,9 @@ static void dns_check_dns_response(struct dns_resolution *res)
 			if (msg)
 				send_log(srv->proxy, LOG_NOTICE, "%s", msg);
 
+			/* now we have an IP address associated to this server, we can update its status */
+			snr_update_srv_status(srv, 0);
+
 			srv->svc_port = item->port;
 			srv->flags &= ~SRV_F_MAPPORTS;
 			if ((srv->check.state & CHK_ST_CONFIGURED) &&

src/server.c

@@ -3733,6 +3733,12 @@ int snr_update_srv_status(struct server *s, int has_no_ip)
 	/* If resolution is NULL we're dealing with SRV records Additional records */
 	if (resolution == NULL) {
+		/* since this server has an IP, it can go back in production */
+		if (has_no_ip == 0) {
+			srv_clr_admin_flag(s, SRV_ADMF_RMAINT);
+			return 1;
+		}
+
 		if (s->next_admin & SRV_ADMF_RMAINT)
 			return 1;