Some function had exceed 30 inline assembly register oprands limiation
when using LOONGSON2 version of MMI macros. We can avoid that by take
$at, which is register reserved for assembler, as temporary register.
As none of instructions used in these macros is pseudo, it is safe to
utilize $at here.
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
{SAVE,RECOVER}_REG will be available for Loongson2 again,
also comment about the magic.
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Optimize put_hevc_qpel_hv_8 with mmi in the case width=4/8/12/16/24/32/48/64.
This optimization improved HEVC decoding performance 11%(1.81x to 2.01x, tested on loongson 3A3000).
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Performance of mpeg4 decoding improved about 23%(from 128fps to 158fps, tested on loongson 3A3000).
Reoptimized following functions with mmi.
1. ff_simple_idct_put_8_mmi
2. ff_simple_idct_add_8_mmi
3. ff_simple_idct_8_mmi
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
1.mmiutils.h defined MMI_ load/store macros for loongson2e/2f/3a
2.mmiutils.h defined some mmi assembly macors
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>