ARM64: Port msac improvements to more functions

Kyle Siefring requested to merge KyleSiefring/dav1d:arm64_msac_impr2 into master

This required changing the tests to zero-pad the CDFs.
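A minimal sketch of what zero-padding a CDF buffer can look like, assuming the padding is needed because the NEON routines load a whole vector regardless of the symbol count. The helper name `cdf_zero_pad` and the constant `CDF_PAD_LEN` are hypothetical and are not taken from dav1d's test code.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical buffer length: large enough for the widest (adapt16) case. */
enum { CDF_PAD_LEN = 16 };

/* Hypothetical helper, not dav1d's test code: copy an n-entry CDF into a
 * fixed-size buffer and zero every entry after it, so that a routine
 * loading a full vector past the last entry reads defined zeros. */
static void cdf_zero_pad(uint16_t dst[CDF_PAD_LEN], const uint16_t *src, size_t n)
{
    if (n > CDF_PAD_LEN)
        n = CDF_PAD_LEN;                        /* clamp to the buffer size */
    memset(dst, 0, CDF_PAD_LEN * sizeof(*dst)); /* zero the whole buffer */
    memcpy(dst, src, n * sizeof(*src));         /* then copy the real entries */
}
```

A test would call this once per generated CDF before handing the buffer to both the C and the NEON implementation, so the two see identical input.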

Results - Neoverse N1

Old:
msac_decode_symbol_adapt4_c:         41.4 ( 1.00x)
msac_decode_symbol_adapt4_neon:      31.0 ( 1.34x)
msac_decode_symbol_adapt8_c:         54.5 ( 1.00x)
msac_decode_symbol_adapt8_neon:      32.2 ( 1.69x)
msac_decode_symbol_adapt16_c:        85.6 ( 1.00x)
msac_decode_symbol_adapt16_neon:     37.5 ( 2.28x)

New:
msac_decode_symbol_adapt4_c:         41.5 ( 1.00x)
msac_decode_symbol_adapt4_neon:      27.7 ( 1.50x)
msac_decode_symbol_adapt8_c:         55.7 ( 1.00x)
msac_decode_symbol_adapt8_neon:      30.1 ( 1.85x)
msac_decode_symbol_adapt16_c:        82.4 ( 1.00x)
msac_decode_symbol_adapt16_neon:     35.2 ( 2.34x)
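
The speedup columns above compare each NEON function against its C counterpart. To compare the old NEON code against the new NEON code directly, the timings can be divided as in the sketch below. The struct and function names are illustrative only; the numbers are copied from the two result blocks.

```c
#include <stddef.h>

/* Illustrative only: old and new NEON timings taken from the results above. */
struct neon_timing {
    const char *name;
    double old_time;
    double new_time;
};

static const struct neon_timing neon_timings[] = {
    { "msac_decode_symbol_adapt4_neon",  31.0, 27.7 },
    { "msac_decode_symbol_adapt8_neon",  32.2, 30.1 },
    { "msac_decode_symbol_adapt16_neon", 37.5, 35.2 },
};

/* Ratio of old to new time; values above 1.0 mean the new code is faster. */
static double neon_gain(const struct neon_timing *t)
{
    return t->old_time / t->new_time;
}
```

With these figures the ratios come out at roughly 1.12x for adapt4 and 1.07x for both adapt8 and adapt16.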
Edited by Kyle Siefring
