* commit '711781d7a1714ea4eb0217eb1ba04811978c43d1':
x86: checkasm: check for or handle missing cleanup after MMX instructions
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
These aren't quite as helpful as the ones in 8bpp, since over there,
we can use pmulhrsw, but here the coefficients have too many bits to
be able to take advantage of pmulhrsw. However, we can still skip
cols for which all coefs are 0, and instead just zero the input data
for the row itx. This helps a few % on overall decoding speed.
The randomize_buffer() implementation assures that "most of the time",
we'll do a good mix of wide16/wide8/hev/regular/no filters for complete
code coverage. However, this is not mathematically assured because that
would make the code either much more complex, or much less random.