From 67957bd59e3ccd7be1174b50a5bf402bd676ecb0 Mon Sep 17 00:00:00 2001
From: Christopher Faulet
Date: Wed, 27 Sep 2017 11:00:59 +0200
Subject: [PATCH] MAJOR: dns: Refactor the DNS code

This is a huge patch with many changes, all about the DNS. Initially, the idea
was to update the DNS part to ease the integration of thread support. But
quickly, I started to refactor some parts. And after several iterations, it was
impossible for me to commit the different parts atomically. So, instead of
adding tens of patches, often reworking the same parts, it was easier to merge
all my changes into a single patch. Here are all the changes made to the DNS.

First, the DNS initialization has been refactored. The DNS configuration
parsing remains untouched, in cfgparse.c. But all checks have been moved into a
post-check callback. In the function dns_finalize_config, for each resolvers
section, the nameservers configuration is tested and the task used to manage
DNS resolutions is created. The links between the backend's servers and the
resolvers are also created at this step. No connections are kept alive here,
so there is no longer any need to reopen them after HAProxy forks. Connections
used to send DNS queries will be opened on demand.

Then, the way DNS requesters are linked to a DNS resolution has been reworked.
The resolution used by a requester is now referenced in the dns_requester
structure and the resolution pointers in the server and dns_srvrq structures
have been removed. The wait and curr lists of requesters for a DNS resolution
have been replaced by a single list. Finally, the way a requester is removed
from a DNS resolution has been simplified. Now everything is done in
dns_unlink_resolution.

The srv_set_fqdn function has been simplified. Now, there is only 1 way to set
the server's FQDN, whether it is set from the CLI or when a SRV record is
resolved.

The static DNS resolutions pool has been replaced by a dynamic pool. This part
was modified by Baptiste Assmann.

The way the DNS resolutions are triggered by the task or by a health-check has
been totally refactored. Now, all timeouts are respected, especially
hold.valid. The default frequency to wake up a resolvers section is now
configurable using the "timeout resolve" parameter.

Now, as documented, as long as invalid responses are received, we really wait
for the responses from all name servers before retrying.

As far as possible, resources allocated during DNS configuration parsing are
released when HAProxy shuts down.

Besides all these changes, the code has been cleaned to ease code review and
the doc has been updated.
---
 doc/configuration.txt  |  101 +-
 include/proto/dns.h    |   44 +-
 include/proto/server.h |    1 -
 include/types/dns.h    |  406 +++--
 include/types/global.h |    1 -
 include/types/proxy.h  |    1 -
 include/types/server.h |    1 -
 src/cfgparse.c         |  113 +-
 src/checks.c           |    3 +-
 src/dns.c              | 3201 ++++++++++++++++------------------------
 src/haproxy.c          |    9 +-
 src/proxy.c            |    1 -
 src/server.c           |  308 ++--
 13 files changed, 1645 insertions(+), 2545 deletions(-)

diff --git a/doc/configuration.txt b/doc/configuration.txt
index f5cf60305..59bbc566f 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -11475,10 +11475,6 @@ resolve-net [,
   Points to an existing "resolvers" section to resolve current server's
   hostname.
-  In order to be operational, DNS resolution requires that health check is
-  enabled on the server. Actually, health checks triggers the DNS resolution.
-  You must precise one 'resolvers' parameter on each server line where DNS
-  resolution is required.

   Example:

@@ -11712,21 +11708,20 @@ different steps of the process life:
      host name. It uses libc functions to get the host name resolved. This
      resolution relies on /etc/resolv.conf file.

-  2. at run time, when HAProxy gets prepared to run a health check on a server,
-     it verifies if the current name resolution is still considered as valid.
-     If not, it processes a new resolution, in parallel of the health check.
+  2. at run time, HAProxy performs periodically name resolutions for servers
+     requiring DNS resolutions.

 A few other events can trigger a name resolution at run time:
   - when a server's health check ends up in a connection timeout: this may be
     because the server has a new IP address. So we need to trigger a name
     resolution to know this new IP.

-When using resolvers, the server name can either be a hostname, or s SRV label.
-HAProxy considers anything that starts with an underscore a SRV label.
-If a SRV label is specified, then the corresponding SRV records will be
-retrieved from the DNS server, and the provided hostnames will be used. The
-SRV label will be checked periodically, and if any server are added or removed,
-haproxy will automatically do the same.
+When using resolvers, the server name can either be a hostname, or a SRV label.
+HAProxy considers anything that starts with an underscore as a SRV label. If a
+SRV label is specified, then the corresponding SRV records will be retrieved
+from the DNS server, and the provided hostnames will be used. The SRV label
+will be checked periodically, and if any server are added or removed, haproxy
+will automatically do the same.

 A few things important to notice:
   - all the name servers are queried in the mean time. HAProxy will process the
@@ -11740,9 +11735,8 @@ A few things important to notice:
 ----------------------------

 This section is dedicated to host information related to name resolution in
-HAProxy.
-There can be as many as resolvers section as needed. Each section can contain
-many name servers.
+HAProxy. There can be as many as resolvers section as needed. Each section can
+contain many name servers.

 When multiple name servers are configured in a resolvers section, then HAProxy
 uses the first valid response. In case of invalid responses, only the last one
@@ -11750,43 +11744,40 @@ is treated. Purpose is to give the chance to a slow server to deliver a valid
 answer after a fast faulty or outdated server.

 When each server returns a different error type, then only the last error is
-used by HAProxy to decide what type of behavior to apply.
+used by HAProxy. The following processing is applied on this error:

-Two types of behavior can be applied:
-  1. stop DNS resolution
-  2. replay the DNS query with a new query type
-     In such case, the following types are applied in this exact order:
-      1. ANY query type
-      2. query type corresponding to family pointed by resolve-prefer
-         server's parameter
-      3. remaining family type
+ 1. HAProxy retries the same DNS query with a new query type. The A queries are
+    switch to AAAA or the opposite. SRV queries are not concerned here. Timeout
+    errors are also excluded.

-HAProxy stops DNS resolution when the following errors occur:
-  - invalid DNS response packet
-  - wrong name in the query section of the response
-  - NX domain
-  - Query refused by server
-  - CNAME not pointing to an IP address
+ 2. When the fallback on the query type was done (or not applicable), HAProxy
+    retries the original DNS query, with the preferred query type.

-HAProxy tries a new query type when the following errors occur:
-  - no Answer records in the response
-  - DNS response truncated
-  - Error in DNS response
-  - No expected DNS records found in the response
-  - name server timeout
+ 3. HAProxy retries previous steps times. If no valid
+    response is received after that, it stops the DNS resolution and reports
+    the error.

-For example, with 2 name servers configured in a resolvers section:
-  - first response is valid and is applied directly, second response is ignored
-  - first response is invalid and second one is valid, then second response is
-    applied;
-  - first response is a NX domain and second one a truncated response, then
-    HAProxy replays the query with a new type;
-  - first response is truncated and second one is a NX Domain, then HAProxy
-    stops resolution.
+For example, with 2 name servers configured in a resolvers section, the
+following scenarios are possible:
+
+  - First response is valid and is applied directly, second response is
+    ignored
+
+  - First response is invalid and second one is valid, then second response is
+    applied
+
+  - First response is a NX domain and second one a truncated response, then
+    HAProxy retries the query with a new type
+
+  - First response is a NX domain and second one is a timeout, then HAProxy
+    retries the query with a new type
+
+  - Query timed out for both name servers, then HAProxy retries it with the
+    same query type

 As a DNS server may not answer all the IPs in one DNS request, haproxy keeps
 a cache of previous answers, an answer will be considered obsolete after
-"hold obsolete" seconds without the IP returned.
+ seconds without the IP returned.


 resolvers
@@ -11796,7 +11787,7 @@ A resolvers section accept the following parameters:

 accepted_payload_size
   Defines the maxium payload size accepted by HAProxy and announced to all the
-  naeservers configured in this resolvers section.
+  name servers configured in this resolvers section.
   is in bytes. If not set, HAProxy announces 512. (minimal value defined
   by RFC 6891)

@@ -11822,11 +11813,7 @@ hold

   Default value is 10s for "valid", 0s for "obsolete" and 30s for others.

-  Note: since the name resolution is triggered by the health checks, a new
-        resolution is triggered after modulo the parameter of
-        the healch check.
-
-resolution_pool_size
+resolution_pool_size (deprecated)
   Defines the number of resolutions available in the pool for this resolvers.
   If not defines, it defaults to 64. If your configuration requires more than
   , then HAProxy will return an error when parsing the configuration.
@@ -11844,9 +11831,12 @@ timeout