libavfilter/signature_lookup: fix jaccard distance

The Jaccard distance is defined as D = 1 - intersect / union; the
previous code omitted the "1 -" term. Additionally, the distance is
compared against a constant that must lie between 0 and 1, which is not
the case here. Together, these two issues caused the function to always
report a matching coarse signature. To leave the constant intact and to
avoid floating-point computation, this commit scales the distance by
1 << 16, making the constant effectively 9000 / (1 << 16) =~ 0.14.

Reported-by: Sachin Tilloo <sachin.tilloo@gmail.com>
Reviewed-by: Sachin Tilloo <sachin.tilloo@gmail.com>
Tested-by: Sachin Tilloo <sachin.tilloo@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Gerion Entrup 2024-06-02 14:02:53 +02:00 committed by Michael Niedermayer
parent 3152c684cb
commit 300df41c30
1 changed file with 2 additions and 1 deletion


@@ -127,9 +127,10 @@ static int get_jaccarddist(SignatureContext *sc, CoarseSignature *first, CoarseS
 {
     int jaccarddist, i, composdist = 0, cwthcount = 0;
     for (i = 0; i < 5; i++) {
-        if ((jaccarddist = intersection_word(first->data[i], second->data[i])) > 0) {
+        if ((jaccarddist = (1 << 16) * intersection_word(first->data[i], second->data[i])) > 0) {
             jaccarddist /= union_word(first->data[i], second->data[i]);
         }
+        jaccarddist = (1 << 16) - jaccarddist;
         if (jaccarddist >= sc->thworddist) {
             if (++cwthcount > 2) {
                 /* more than half (5/2) of distances are too wide */
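
The fixed-point arithmetic described in the commit message can be tried in isolation with a minimal sketch like the one below. It is not the libavfilter code: `popcount8`, `count_intersection`, `count_union`, `jaccard_dist_fp` and `FP_ONE` are hypothetical names, and the bit counting is assumed to work bytewise as a stand-in for `intersection_word()` and `union_word()`.

```c
#include <assert.h>
#include <stdint.h>

#define FP_ONE (1 << 16) /* fixed-point representation of 1.0 (assumed name) */

/* Hypothetical helper: number of set bits in one byte. */
static int popcount8(uint8_t v)
{
    int n = 0;
    while (v) {
        n += v & 1;
        v >>= 1;
    }
    return n;
}

/* Hypothetical stand-in for intersection_word(): bits set in both inputs. */
static int count_intersection(const uint8_t *a, const uint8_t *b, int len)
{
    int n = 0;
    for (int i = 0; i < len; i++)
        n += popcount8(a[i] & b[i]);
    return n;
}

/* Hypothetical stand-in for union_word(): bits set in either input. */
static int count_union(const uint8_t *a, const uint8_t *b, int len)
{
    int n = 0;
    for (int i = 0; i < len; i++)
        n += popcount8(a[i] | b[i]);
    return n;
}

/* Jaccard distance D = 1 - intersect / union, scaled by FP_ONE.
 * Returns 0 for identical non-empty bit sets and FP_ONE when the
 * intersection is empty. */
static int jaccard_dist_fp(const uint8_t *a, const uint8_t *b, int len)
{
    int dist;

    /* FP_ONE * (8 * len) must fit in a signed 32-bit int. */
    assert(len > 0 && len < 4096);

    dist = FP_ONE * count_intersection(a, b, len);
    if (dist > 0) /* a non-empty intersection implies a non-empty union */
        dist /= count_union(a, b, len);
    return FP_ONE - dist;
}
```

For example, with inputs `{0xFF, 0x00}` and `{0x0F, 0x00}` the intersection has 4 bits and the union 8, so the scaled distance is (1 << 16) / 2 = 32768. Against the constant 9000 mentioned in the commit message, that distance would count as too wide, while two identical words give 0 and pass.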