Commit 1b7f1263 authored by Martin Storsjö

arm32: looprestoration: Apply simplifications to align with C code

This applies the same simplifications that were done for the C code
and the x86 assembly in 4613d3a5, and the arm64 assembly in ce80e6da,
to the arm32 implementation.

This gives a minor speedup, around a couple of percent in most cases.

Before:             Cortex A7         A8        A53        A72        A73
sgr_3x3_8bpc_neon:   926600.0   753468.3   553704.1   399379.1   369674.4
sgr_5x5_8bpc_neon:   621722.9   540412.7   357275.9   274474.3   254996.0
sgr_mix_8bpc_neon:  1529715.1  1171282.5   894982.9   659996.6   610407.2
After:
sgr_3x3_8bpc_neon:   899020.3   697278.6   541569.9   382824.3   353891.8
sgr_5x5_8bpc_neon:   602183.2   498322.9   348974.5   264833.9   243837.7
sgr_mix_8bpc_neon:  1497870.8  1182121.3   880470.9   635939.3   590909.3
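
To make the per-core change easier to read, the relative difference can be computed from the figures above. This is a minimal sketch, not part of the commit: it assumes the numbers are runtimes where lower is better, and the names `percent_saved` and `speedup_table` are hypothetical helpers introduced only for illustration.

```python
# Benchmark figures copied verbatim from the Before/After tables above.
CORES = ["Cortex A7", "A8", "A53", "A72", "A73"]
BEFORE = {
    "sgr_3x3_8bpc_neon": [926600.0, 753468.3, 553704.1, 399379.1, 369674.4],
    "sgr_5x5_8bpc_neon": [621722.9, 540412.7, 357275.9, 274474.3, 254996.0],
    "sgr_mix_8bpc_neon": [1529715.1, 1171282.5, 894982.9, 659996.6, 610407.2],
}
AFTER = {
    "sgr_3x3_8bpc_neon": [899020.3, 697278.6, 541569.9, 382824.3, 353891.8],
    "sgr_5x5_8bpc_neon": [602183.2, 498322.9, 348974.5, 264833.9, 243837.7],
    "sgr_mix_8bpc_neon": [1497870.8, 1182121.3, 880470.9, 635939.3, 590909.3],
}


def percent_saved(before, after):
    """Relative reduction in percent.

    Positive means the 'after' run is faster, under the assumption
    that the figures are runtimes (lower is better).
    """
    return (before - after) / before * 100.0


def speedup_table():
    """Map each function name to its per-core percentage reduction."""
    return {
        name: [percent_saved(b, a) for b, a in zip(BEFORE[name], AFTER[name])]
        for name in BEFORE
    }


if __name__ == "__main__":
    for name, row in speedup_table().items():
        cells = "  ".join(f"{core} {pct:+.1f}%" for core, pct in zip(CORES, row))
        print(f"{name}: {cells}")
```

Under that assumption, every entry improves except sgr_mix on the A8, which comes out slightly slower after the change.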
parent c43debf1
1 merge request: !1765 arm32: looprestoration: Rewrite the SGR functions