Commit Graph

10 Commits

Author SHA1 Message Date
Jiaxun Yang a1cd62883f avutil/mips: Use $at as MMI macro temporary register
Some function had exceed 30 inline assembly register oprands limiation
when using LOONGSON2 version of MMI macros. We can avoid that by take
$at, which is register reserved for assembler, as temporary register.

As none of instructions used in these macros is pseudo, it is safe to
utilize $at here.

Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-07-28 23:31:48 +02:00
Jiaxun Yang b868272d7e avutil/mips: Use MMI_{L, S}QC1 macro in {SAVE, RECOVER}_REG
{SAVE,RECOVER}_REG will be available for Loongson2 again,
also comment about the magic.

Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-07-28 23:31:48 +02:00
Anton Khirnov c8c2dfbc37 lavu: move LOCAL_ALIGNED from internal.h to mem_internal.h
That is a more appropriate place for it.
2021-01-01 14:11:01 +01:00
Shiyou Yin 11f99a9a45 avutil/mips: Avoid instruction exception caused by gssqc1/gslqc1.
Ensure the address accesed by gssqc1/gslqc1 are 16-byte aligned.
2019-08-02 19:01:51 +02:00
gxw 4571c7c05d avcodec/mips: [loongson] mmi optimizations for VP9 put and avg functions
VP9 decoding speed improved about 60.5%(from 38fps to 61fps, tested on loongson 3A3000).

Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2019-02-27 01:51:40 +01:00
Shiyou Yin 6d19164811 avcodec/mips: [loongson] optimize put_hevc_qpel_hv_8 with mmi.
Optimize put_hevc_qpel_hv_8 with mmi in the case width=4/8/12/16/24/32/48/64.
This optimization improved HEVC decoding performance 11%(1.81x to 2.01x, tested on loongson 3A3000).

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2019-01-22 00:46:36 +01:00
Shiyou Yin 5161f7bcfd avutil/mips: [loongson] simplify macro TRANSPOSE_4H and TRANSPOSE_8B
Simplify macro TRANSPOSE_4H in mmiutils.h and add TRANSPOSE_8B as a common macro.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2018-09-09 12:01:07 +02:00
gxw 090647da84 avcodec/mips: [loongson] optimize vp8 decoding in vp8dsp.
Optimize vp8 loop filter with mmi, four functions optimized:
1. ff_vp8_h_loop_filter8uv_mmi.
2. ff_vp8_v_loop_filter8uv_mmi.
3. ff_vp8_h_loop_filter16_mmi.
4. ff_vp8_v_loop_filter16_mmi.

Vp8 decoding speed improved about 50%(from 73fps to 110fps, Tested on loongson 3A3000).

Signed-off-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2018-09-09 12:01:07 +02:00
Shiyou Yin df13b75aa1 avcodec/mips: [loongson] reoptimize simple idct with mmi.
Performance of mpeg4 decoding improved about 23%(from 128fps to 158fps, tested on loongson 3A3000).
Reoptimized following functions with mmi.
1. ff_simple_idct_put_8_mmi
2. ff_simple_idct_add_8_mmi
3. ff_simple_idct_8_mmi

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2018-09-02 03:37:32 +02:00
Zhou Xiaoyong b9cd922660 avutil/mips: loongson add mmi utils header file
1.mmiutils.h defined MMI_ load/store macros for loongson2e/2f/3a
2.mmiutils.h defined some mmi assembly macors

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2016-10-23 03:23:09 +02:00