arm: looprestoration: NEON optimized wiener filter
The relative speedup compared to C code is around 4-8x:

                      Cortex A7     A8     A9    A53    A72    A73
wiener_luma_8bpc_neon:     4.00   7.54   4.74   6.84   4.91   8.01
src/arm/32/looprestoration.S
0 → 100644