x86: Add 6-tap variants of 8bpc mc AVX2 functions

Henrik Gramner requested to merge gramner/dav1d:x86_6tap_mc_8bpc_avx2 into master

Overall decoding performance increases by up to 10% depending on the input when using AVX2.

Checkasm numbers on Zen 4 (lower is better):

          8-tap    6-tap
w2_v       18.2     16.0
w2_hv      32.3     29.7

w4_v       17.5     14.9
w4_hv      36.9     33.6

w8_h       21.5     17.1
w8_v       19.1     16.9
w8_hv      65.6     51.5

w16_h      48.1     37.4
w16_v      37.2     31.1
w16_hv    170.8    134.1

w32_h     130.9     96.8
w32_v     107.6     89.9
w32_hv    509.0    400.1

w64_h     462.5    343.3
w64_v     368.5    305.8
w64_hv   1738.7   1375.8

w128_h   1314.2    977.5
w128_v   1068.6    903.2
w128_hv  4874.8   3866.1
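
To make the table easier to compare across block sizes, the per-function speedup can be derived as the 8-tap time divided by the 6-tap time. The snippet below is only an illustrative sketch and not part of the merge request: the names `TIMINGS` and `speedups` are made up here, and the values are copied verbatim from the table above.

```python
# Checkasm timings copied from the table above, as (8-tap, 6-tap) pairs.
# The dictionary and helper names are hypothetical, chosen for this sketch.
TIMINGS = {
    "w2_v": (18.2, 16.0),
    "w2_hv": (32.3, 29.7),
    "w4_v": (17.5, 14.9),
    "w4_hv": (36.9, 33.6),
    "w8_h": (21.5, 17.1),
    "w8_v": (19.1, 16.9),
    "w8_hv": (65.6, 51.5),
    "w16_h": (48.1, 37.4),
    "w16_v": (37.2, 31.1),
    "w16_hv": (170.8, 134.1),
    "w32_h": (130.9, 96.8),
    "w32_v": (107.6, 89.9),
    "w32_hv": (509.0, 400.1),
    "w64_h": (462.5, 343.3),
    "w64_v": (368.5, 305.8),
    "w64_hv": (1738.7, 1375.8),
    "w128_h": (1314.2, 977.5),
    "w128_v": (1068.6, 903.2),
    "w128_hv": (4874.8, 3866.1),
}


def speedups(timings):
    """Return {function: 8-tap time / 6-tap time}; values above 1 mean 6-tap is faster."""
    return {name: old / new for name, (old, new) in timings.items()}


ratios = speedups(TIMINGS)
for name, ratio in ratios.items():
    # Also show the same result as the share of time saved by the 6-tap variant.
    saved = (1 - 1 / ratio) * 100
    print(f"{name:8s} {ratio:.2f}x  ({saved:.1f}% less time)")
```

With these numbers the ratios range from roughly 1.09x (w2_hv) to roughly 1.35x (w32_h), with the horizontal-only cases at larger widths benefiting the most.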
