haproxy/doc/internals/hashing.txt

84 lines
4.4 KiB
Plaintext
Raw Permalink Normal View History

2013/11/20 - How hashing works internally in haproxy - maddalab@gmail.com
This document describes how HAProxy implements hashing both map-based and
consistent hashing, both prior to versions 1.5 and the motivation and tests
[RELEASE] Released version 2.4-dev19 Released version 2.4-dev19 with the following main changes : - BUG/MINOR: hlua: Don't rely on top of the stack when using Lua buffers - BUG/MEDIUM: cli: prevent memory leak on write errors - BUG/MINOR: ssl/cli: fix a lock leak when no memory available - MINOR: debug: add a new "debug dev sym" command in expert mode - MINOR: pools/debug: slightly relax DEBUG_DONT_SHARE_POOLS - CI: Github Actions: switch to LibreSSL-3.3.3 - MINOR: srv: close all idle connections on shutdown - MINOR: connection: move session_list member in a union - MEDIUM: mux_h1: release idling frontend conns on soft-stop - MEDIUM: connection: close front idling connection on soft-stop - MINOR: tools: add functions to retrieve the address of a symbol - CLEANUP: activity: mark the profiling and task_profiling_mask __read_mostly - MINOR: activity: add a "memory" entry to "profiling" - MINOR: activity: declare the storage for memory usage statistics - MEDIUM: activity: collect memory allocator statistics with USE_MEMORY_PROFILING - MINOR: activity: clean up the show profiling io_handler a little bit - MINOR: activity: make "show profiling" support a few arguments - MINOR: activity: make "show profiling" also dump the memoery usage - MINOR: activity: add the profiling.memory global setting - BUILD: makefile: add new option USE_MEMORY_PROFILING - MINOR: channel: Rely on HTX version if appropriate in channel_may_recv() - BUG/MINOR: stream-int: Don't block reads in si_update_rx() if chn may receive - MINOR: conn-stream: Force mux to wait for read events if abortonclose is set - MEDIUM: mux-h1: Don't block reads when waiting for the other side - BUG/MEDIUM: mux-h1: Properly report client close if abortonclose option is set - REGTESTS: Add script to test abortonclose option - MINOR: mux-h1: clean up conditions to enabled and disabled splicing - MINOR: mux-h1: Subscribe for sends if output buffer is not empty in h1_snd_pipe - MINOR: mux-h1: Always subscribe for reads when splicing is disabled - MEDIUM: mux-h1: Wake H1 stream when both sides a synchronized - CLEANUP: mux-h1: rename WAIT_INPUT/WAIT_OUTPUT flags - MINOR: mux-h1: Manage processing blocking flags on the H1 stream - BUG/MINOR: stream: Decrement server current session counter on L7 retry - BUG/MINOR: config: fix uninitialized initial state in ".if" block evaluator - BUG/MINOR: config: add a missing "ELIF_TAKE" test for ".elif" condition evaluator - BUG/MINOR: config: .if/.elif should also accept negative integers - MINOR: config: centralize the ".if"/".elif" condition parser and evaluator - MINOR: config: keep up-to-date current file/line/section in the global struct - MINOR: config: support some pseudo-variables for file/line/section - BUILD: activity: do not include malloc.h - MINOR: arg: improve the error message on missing closing parenthesis - MINOR: global: export the build features string list - MINOR: global: add version comparison functions - MINOR: config: improve .if condition error reporting - MINOR: config: make cfg_eval_condition() support predicates with arguments - MINOR: config: add predicate "defined()" to conditional expression blocks - MINOR: config: add predicates "streq()" and "strneq()" to conditional expressions - MINOR: config: add predicate "feature" to detect certain built-in features - MINOR: config: add predicates "version_atleast" and "version_before" to cond blocks - BUG/MINOR: activity: use the new pointer to calculate the new size in realloc() - BUG/MINOR: stream: properly clear the previous error mask on L7 retries - MEDIUM: log: slightly refine the output format of alerts/warnings/etc - MINOR: config: add a new message directive: .diag - CLEANUP: cli/tree-wide: properly re-align the CLI commands' help messages - BUG/MINOR: stream: Reset stream final state and si error type on L7 retry - BUG/MINOR: checks: Handle synchronous connect when a tcpcheck is started - BUG/MINOR: checks: Reschedule check on observe mode only if fastinter is set - MINOR: global: define tainted flag - MINOR: cfgparse: add a new field flags in cfg_keyword - MINOR: cfgparse: implement experimental config keywords - MINOR: action: replace match_pfx by a keyword flags field - MINOR: action: implement experimental actions - MINOR: cli: set tainted when using CLI expert/experimental mode - MINOR: stats: report tainted on show info - MINOR: http_act: mark normalize-uri as experimental - BUILD: fix usage of ha_alert without format string - MINOR: proxy: define PR_CAP_LB - BUG/MINOR: server: do not report diag for peer servers with null weight - DOC: ssl: Extra files loading now works for backends too - ADDONS: make addons/ discoverable by git via .gitignore - DOC: ssl: Add information about crl-file option - MINOR: sample: improve error reporting on missing arg to strcmp() converter - DOC: management: mention that some fields may be emitted as floats - MINOR: tools: implement trimming of floating point numbers - MINOR: tools: add a float-to-ascii conversion function - MINOR: freq_ctr: add new functions to report float measurements - MINOR: stats: avoid excessive padding of float values with trailing zeroes - MINOR: stats: add the HTML conversion for float types - MINOR: stats: pass the appctx flags to stats_fill_info() - MINOR: stats: support an optional "float" option to "show info" - MINOR: stats: use tv_remain() to precisely compute the uptime - MINOR: stats: report uptime and start time as floats with subsecond resolution - MINOR: stats: make "show info" able to report rates as floats when asked - MINOR: config: mark tune.fd.edge-triggered as experimental - REORG: vars: move the "proc" scope variables out of the global struct - REORG: threads: move all_thread_mask() to thread.h - BUILD: wdt: include signal-t.h - BUILD: auth: include missing list.h - REORG: mworker: move proc_self from global to mworker - BUILD: ssl: ssl_utils requires chunk.h - BUILD: config: cfgparse-ssl.c needs tools.h - BUILD: wurfl: wurfl.c needs tools.h - BUILD: spoe: flt_spoe.c needs tools.h - BUILD: promex: service-prometheus.c needs tools.h - BUILD: resolvers: include tools.h - BUILD: config: include tools.h in cfgparse-listen.c - BUILD: htx: include tools.h in http_htx.c - BUILD: proxy: include tools.h in proxy.c - BUILD: session: include tools.h in session.c - BUILD: cache: include tools.h in cache.c - BUILD: sink: include tools.h in sink.c - BUILD: connection: include tools.h in connection.c - BUILD: server-state: include tools.h from server_state.c - BUILD: dns: include tools.h in dns.c - BUILD: payload: include tools.h in payload.c - BUILD: vars: include tools.h in vars.c - BUILD: compression: include tools.h in compression.c - BUILD: mworker: include tools.h from mworker.c - BUILD: queue: include tools.h from queue.c - BUILD: udp: include tools.h from proto_udp.c - BUILD: stick-table: include freq_ctr.h from stick_table.h - BUILD: server: include tools.h from server.c - BUILD: server: include missing proxy.h in server.c - BUILD: sink: include proxy.h in sink.c - BUILD: mworker: include proxy.h in mworker.c - BUILD: filters: include proxy.h in filters.c - BUILD: fcgi-app: include proxy.h in fcgi-app.c - BUILD: connection: move list_mux_proto() to connection.c - REORG: stick-table: uninline stktable_alloc_data_type() - REORG: stick-table: move composite address functions to stick_table.h - REORG: config: uninline warnifnotcap() and failifnotcap() - BUILD: task: remove unused includes from task.c - MINOR: task: stop including stream.h from task.c - BUILD: connection: stop including listener-t.h - BUILD: hlua: include proxy.h from hlua.c - BUILD: mux-h1: include proxy.h from mux-h1.c - BUILD: mux-fcgi: include proxy.h from mux-fcgi.c - BUILD: listener: include proxy.h from listener.c - BUILD: http-rules: include proxy.h from http_rules.c - BUILD: thread: include log.h from thread.c - BUILD: comp: include proxy.h from flt_http_comp.c - BUILD: fd: include log.h from fd.c - BUILD: config: do not include proxy.h nor errors.h anymore in cfgparse.h - BUILD: makefile: reorder object files by build time - DOC: Fix a few grammar/spelling issues and casing of HAProxy - REGTESTS: run-regtests: match both "HAProxy" and "HA-Proxy" in the version - MINOR: version: report "HAProxy" not "HA-Proxy" in the version output - DOC: remove last occurrences of "HA-Proxy" syntax - DOC: peers: fix the protocol tag name in the doc - ADMIN: netsnmp: report "HAProxy" and not "Haproxy" in output descriptions - MEDIUM: mailers: use "HAProxy" nor "HAproxy" in the subject of messages - DOC: fix a few remainig cases of "Haproxy" and "HAproxy" in doc and comments - MINOR: tools/rnd: compute the result outside of the CAS loop - BUILD: http_fetch: address a few aliasing warnings with older compilers - BUILD: ssl: define HAVE_CRYPTO_memcmp() based on the library version - BUILD: errors: include stdarg in errors.h - REGTESTS: disable inter-thread idle connection sharing on sensitive tests - MINOR: cli: make "help" support a command in argument - MINOR: cli: sort the output of the "help" keywords - CLEANUP: cli/mworker: properly align the help messages - BUILD: memprof: make the old caller pointer a const in get_prof_bin() - BUILD: compat: include malloc_np.h for USE_MEMORY_PROFILING on FreeBSD - CI: Github Actions: enable USE_QUIC=1 for BoringSSL builds - BUG/MEDIUM: quic: fix null deref on error path in qc_conn_init() - BUILD: cli: appease a null-deref warning in cli_gen_usage_msg()
2021-05-10 05:50:26 +00:00
that were done when providing additional options starting in version 2.4
A note on hashing in general, hash functions strive to have little
correlation between input and output. The heart of a hash function is its
mixing step. The behavior of the mixing step largely determines whether the
hash function is collision-resistant. Hash functions that are collision
resistant are more likely to have an even distribution of load.
The purpose of the mixing function is to spread the effect of each message
bit throughout all the bits of the internal state. Ideally every bit in the
hash state is affected by every bit in the message. And we want to do that
as quickly as possible simply for the sake of program performance. A
function is said to satisfy the strict avalanche criterion if, whenever a
single input bit is complemented (toggled between 0 and 1), each of the
output bits should change with a probability of one half for an arbitrary
selection of the remaining input bits.
To guard against a combination of hash function and input that results in
high rate of collisions, haproxy implements an avalanche algorithm on the
result of the hashing function. In all versions 1.4 and prior avalanche is
always applied when using the consistent hashing directive. It is intended
to provide quite a good distribution for little input variations. The result
is quite suited to fit over a 32-bit space with enough variations so that
a randomly picked number falls equally before any server position, which is
ideal for consistently hashed backends, a common use case for caches.
In all versions 1.4 and prior HAProxy implements the SDBM hashing function.
However tests show that alternatives to SDBM have a better cache
distribution on different hashing criteria. Additional tests involving
alternatives for hash input and an option to trigger avalanche, we found
different algorithms perform better on different criteria. DJB2 performs
well when hashing ascii text and is a good choice when hashing on host
header. Other alternatives perform better on numbers and are a good choice
when using source ip. The results also vary by use of the avalanche flag.
The results of the testing can be found under the tests folder. Here is
a summary of the discussion on the results on 1 input criteria and the
methodology used to generate the results.
A note of the setup when validating the results independently, one
would want to avoid backend server counts that may skew the results. As
an example with DJB2 avoid 33 servers. Please see the implementations of
the hashing function, which can be found in the links under references.
The following was the set up used
(a) hash-type consistent/map-based
(b) avalanche on/off
(c) balanche host(hdr)
(d) 3 criteria for inputs
- ~ 10K requests, including duplicates
- ~ 46K requests, unique requests from 1 MM requests were obtained
- ~ 250K requests, including duplicates
(e) 17 servers in backend, all servers were assigned the same weight
Result of the hashing were obtained across the server via monitoring log
files for haproxy. Population Standard deviation was used to evaluate the
efficacy of the hashing algorithm. Lower standard deviation, indicates
a better distribution of load across the backends.
On 10K requests, when using consistent hashing with avalanche on host
headers, DJB2 significantly out performs SDBM. Std dev on SDBM was 48.95
and DJB2 was 26.29. This relationship is inverted with avalanche disabled,
however DJB2 with avalanche enabled out performs SDBM with avalanche
disabled.
On map-based hashing SDBM out performs DJB2 irrespective of the avalanche
option. SDBM without avalanche is marginally better than with avalanche.
DJB2 performs significantly worse with avalanche enabled.
Summary: The results of the testing indicate that there isn't a hashing
algorithm that can be applied across all input criteria. It is necessary
to support alternatives to SDBM, which is generally the best option, with
algorithms that are better for different inputs. Avalanche is not always
applicable and may result in less smooth distribution.
References:
Mixing Functions/Avalanche: https://papa.bretmulvey.com/post/124027987928/hash-functions
Hash Functions: http://www.cse.yorku.ca/~oz/hash.html