Commit Graph

6 Commits

Author SHA1 Message Date
Lynne bbe95f7353
x86: replace explicit REP_RETs with RETs
From x86inc:
> On AMD cpus <=K10, an ordinary ret is slow if it immediately follows either
> a branch or a branch target. So switch to a 2-byte form of ret in that case.
> We can automatically detect "follows a branch", but not a branch target.
> (SSSE3 is a sufficient condition to know that your cpu doesn't have this problem.)

x86inc can automatically determine whether to use REP_RET rather than
RET in most of these cases, so impact is minimal. Additionally, a few
REP_RETs were used unnecessarily, despite the return being nowhere near a
branch.

The only CPUs affected were AMD K10s, made between 2007 and 2011, that
is, between 16 and 12 years ago.

In the future, everyone involved with x86inc should consider dropping
REP_RETs altogether.
2023-02-01 04:23:55 +01:00
Paul B Mahol dae95b3ffd avfilter/vf_maskedmerge: fix rounding when masking 2022-03-03 09:57:53 +01:00
James Almer ce4c85de6a x86/vf_maskedmerge: make ff_maskedmerge8_sse2 work on x86_32
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-12-24 13:05:18 -03:00
Michael Niedermayer e42e0b11f1 avfilter/x86/vf_maskedmerge: Clear upper part of width
Fixes crash
Fixes: Ticket5055

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-12-23 22:38:15 +01:00
Paul B Mahol 45938f0301 avfilter/x86/vf_maskedmerge: move %define out of .nextrow
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2015-12-10 09:52:04 +01:00
Paul B Mahol 160556c9ad avfilter/vf_maskedmerge: add SIMD for maskedmerge with 8 bit depth input
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2015-10-02 17:40:57 +02:00