From 08be63572ed0b063b28abb406c16f4392bafb90e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Christian=20G=C3=B6ttsche?=
Date: Wed, 16 Aug 2023 14:38:43 +0200
Subject: [PATCH] libsepol/cil: use DJB2a string hash function
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The hash table implementation uses `& (h->size - 1)` to truncate
generated hashes to the number of buckets.  This operation is equal to
`% h->size` if and only if the size is a power of two (which always
seems to be the case).  Because the mask is then a power of two minus
one (and the size is probably small, <= 2048), the bitwise AND discards
all higher bits of the hash.  Thus a hash function with a good avalanche
effect is needed, which the current one lacks.

Benchmark of building dssp5:

    # Current
    Time (mean ± σ):      1.347 s ±  0.065 s    [User: 1.207 s, System: 0.138 s]
    Range (min … max):    1.274 s …  1.436 s    10 runs

    # Patch
    Time (mean ± σ):      1.336 s ±  0.029 s    [User: 1.195 s, System: 0.140 s]
    Range (min … max):    1.303 s …  1.376 s    10 runs

Signed-off-by: Christian Göttsche
Acked-by: James Carter
---
 libsepol/cil/src/cil_strpool.c | 15 ++++++---------
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/libsepol/cil/src/cil_strpool.c b/libsepol/cil/src/cil_strpool.c
index e32ee4e9..beea5c9d 100644
--- a/libsepol/cil/src/cil_strpool.c
+++ b/libsepol/cil/src/cil_strpool.c
@@ -47,16 +47,13 @@ static hashtab_t cil_strpool_tab = NULL;
 
 static unsigned int cil_strpool_hash(hashtab_t h, const_hashtab_key_t key)
 {
-	const char *p;
-	size_t size;
-	unsigned int val;
+	unsigned int hash = 5381;
+	unsigned char c;
 
-	val = 0;
-	size = strlen(key);
-	for (p = key; ((size_t) (p - key)) < size; p++)
-		val =
-		    (val << 4 | (val >> (8 * sizeof(unsigned int) - 4))) ^ (*p);
-	return val & (h->size - 1);
+	while ((c = *(unsigned const char *)key++))
+		hash = ((hash << 5) + hash) ^ c;
+
+	return hash & (h->size - 1);
 }
 
 static int cil_strpool_compare(hashtab_t h __attribute__ ((unused)), const_hashtab_key_t key1, const_hashtab_key_t key2)
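
Illustration only, not part of the patch: a minimal standalone sketch of the two ideas in the commit message, namely the DJB2a update step (`hash * 33 ^ c`, seeded with 5381) and the reduction of the hash to a bucket index with a power-of-two mask. The names `djb2a`, `is_power_of_two`, and `bucket_index` are hypothetical and do not exist in libsepol; the assertion on the bucket count is likewise an assumption added for the sketch.

```c
#include <assert.h>

/* DJB2a (xor variant): hash = (hash * 33) ^ c, starting from 5381. */
static unsigned int djb2a(const char *key)
{
	const unsigned char *p = (const unsigned char *)key;
	unsigned int hash = 5381;

	while (*p != '\0')
		hash = ((hash << 5) + hash) ^ *p++;

	return hash;
}

/* True if n has exactly one bit set. */
static int is_power_of_two(unsigned int n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Hypothetical helper: reduce a hash to a bucket index.
 * Masking equals `hash % nbuckets` only when nbuckets is a power of
 * two, and it keeps just the low bits of the hash.
 */
static unsigned int bucket_index(unsigned int hash, unsigned int nbuckets)
{
	assert(is_power_of_two(nbuckets));
	return hash & (nbuckets - 1);
}
```

With a 1024-bucket table, for example, `bucket_index(djb2a("abc"), 1024)` keeps only the low 10 bits of the hash, which is why the mixing quality of those low bits matters.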