
arm32: mc: NEON implementation of warp8x8 for 16 bpc

Martin Storsjö requested to merge mstorsjo/dav1d:arm32-warp16 into master

Checkasm benchmarks:

                    Cortex A7      A8     A53     A72     A73
warp_8x8_16bpc_neon:   4062.6  2109.4  2462.0  1338.9  1391.1
warp_8x8t_16bpc_neon:  3996.3  2102.4  2412.0  1273.8  1368.9

Corresponding numbers for arm64, for comparison:

                                   Cortex A53     A72     A73
warp_8x8_16bpc_neon:                   2037.0  1148.8  1222.0
warp_8x8t_16bpc_neon:                  2008.0  1120.4  1200.9
