arm64: mc: NEON implementation of emu_edge for 8bpc

Martin Storsjö requested to merge mstorsjo/dav1d:arm64-emuedge into master

Relative speedups over C code:

                     Cortex A53    A72    A73
emu_edge_w4_8bpc_neon:     3.82   2.93   2.41
emu_edge_w8_8bpc_neon:     3.28   2.86   2.51
emu_edge_w16_8bpc_neon:    3.58   3.27   2.63
emu_edge_w32_8bpc_neon:    3.04   1.68   2.12
emu_edge_w64_8bpc_neon:    2.58   1.45   1.48
emu_edge_w128_8bpc_neon:   1.79   1.02   1.57

The benchmark numbers for the larger sizes on the A72 fluctuate heavily between runs and thus seem unreliable.
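
For readers unfamiliar with what emu_edge computes, here is a minimal scalar sketch of edge emulation for 8bpc. It is not the dav1d C reference or the NEON code in this merge request: the function name `emu_edge_sketch`, the helper `clamp_coord`, and the parameter order are hypothetical and chosen for illustration only. It assumes the operation fills a `bw` x `bh` block whose top-left corner sits at `(x, y)` relative to an `iw` x `ih` reference image, replicating the nearest edge pixel for any coordinate that falls outside the image.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamp v into the inclusive range [lo, hi]. */
static ptrdiff_t clamp_coord(ptrdiff_t v, ptrdiff_t lo, ptrdiff_t hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Hypothetical scalar sketch (not dav1d's actual function or signature).
 * Fills a bw x bh block in dst from an iw x ih reference image, where
 * (x, y) is the block's top-left position relative to the image origin.
 * Coordinates outside the image are clamped, which replicates the
 * nearest edge pixel. */
static void emu_edge_sketch(uint8_t *dst, ptrdiff_t dst_stride,
                            const uint8_t *ref, ptrdiff_t ref_stride,
                            ptrdiff_t bw, ptrdiff_t bh,
                            ptrdiff_t iw, ptrdiff_t ih,
                            ptrdiff_t x, ptrdiff_t y)
{
    assert(iw > 0 && ih > 0);

    for (ptrdiff_t j = 0; j < bh; j++) {
        /* Pick the source row, clamped to the image's vertical range. */
        const uint8_t *src_row =
            ref + clamp_coord(y + j, 0, ih - 1) * ref_stride;
        uint8_t *dst_row = dst + j * dst_stride;

        for (ptrdiff_t i = 0; i < bw; i++) {
            /* Pick the source column, clamped horizontally. */
            dst_row[i] = src_row[clamp_coord(x + i, 0, iw - 1)];
        }
    }
}
```

A per-pixel clamp like this is the simplest way to express the behaviour. An optimized version would more likely split each row into left padding, a straight copy, and right padding, then replicate whole rows above and below the image.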