arm64: mc: NEON implementation of warp8x8{,t}

Martin Storsjö requested to merge mstorsjo/dav1d:arm64-warp into master

Relative speedup vs C code:

                      Cortex A53    A72    A73
warp_8x8_8bpc_neon:         3.19   2.60   3.66
warp_8x8t_8bpc_neon:        3.09   2.50   3.58

@gramner I'm making the warp filter table order conditional in tables.c/mc_tmpl.c here, which effectively reverts a0692eb8 for architectures other than x86. The order that benefits x86 SIMD does not benefit other architectures.

In a NEON implementation of the warp filter, shuffling the filter coefficients back into the original order took 1/4 of the filter runtime.

