x86: add AVX512-IceLake implementation of HBD 64x32 DCT^2

Ronald S. Bultje requested to merge rbultje/dav1d:itx-avx512icl-hbd-64x32 into master
inv_txfm_add_64x32_dct_dct_0_10bpc_c:           1760.6 ( 1.00x)
inv_txfm_add_64x32_dct_dct_0_10bpc_sse4:         271.1 ( 6.49x)
inv_txfm_add_64x32_dct_dct_0_10bpc_avx2:         121.3 (14.52x)
inv_txfm_add_64x32_dct_dct_0_10bpc_avx512icl:    116.3 (15.14x)
inv_txfm_add_64x32_dct_dct_1_10bpc_c:          66507.4 ( 1.00x)
inv_txfm_add_64x32_dct_dct_1_10bpc_sse4:        3712.4 (17.91x)
inv_txfm_add_64x32_dct_dct_1_10bpc_avx2:        1830.5 (36.33x)
inv_txfm_add_64x32_dct_dct_1_10bpc_avx512icl:    805.4 (82.58x)
inv_txfm_add_64x32_dct_dct_2_10bpc_c:          66491.6 ( 1.00x)
inv_txfm_add_64x32_dct_dct_2_10bpc_sse4:        5325.3 (12.49x)
inv_txfm_add_64x32_dct_dct_2_10bpc_avx2:        2578.5 (25.79x)
inv_txfm_add_64x32_dct_dct_2_10bpc_avx512icl:   1394.5 (47.68x)
inv_txfm_add_64x32_dct_dct_3_10bpc_c:          66490.2 ( 1.00x)
inv_txfm_add_64x32_dct_dct_3_10bpc_sse4:        6418.5 (10.36x)
inv_txfm_add_64x32_dct_dct_3_10bpc_avx2:        3305.6 (20.11x)
inv_txfm_add_64x32_dct_dct_3_10bpc_avx512icl:   2571.5 (25.86x)
inv_txfm_add_64x32_dct_dct_4_10bpc_c:          66508.6 ( 1.00x)
inv_txfm_add_64x32_dct_dct_4_10bpc_sse4:        8671.2 ( 7.67x)
inv_txfm_add_64x32_dct_dct_4_10bpc_avx2:        4054.2 (16.40x)
inv_txfm_add_64x32_dct_dct_4_10bpc_avx512icl:   2691.6 (24.71x)
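
The ratios in parentheses are each implementation's timing relative to the C reference for the same variant. As a minimal sketch (not part of the patch), the Python below recomputes those ratios from the figures above and also derives the AVX-512 gain over AVX2, which the output does not list directly. The dictionary layout and the helper names `speedup` and `avx512_vs_avx2` are assumptions made for illustration; the keys are simply the index that appears in each function name.

```python
# Timings copied from the checkasm output above (lower is faster).
# Outer keys are the index embedded in the function name (0..4);
# this layout is hypothetical and chosen only for illustration.
timings = {
    0: {"c": 1760.6, "sse4": 271.1, "avx2": 121.3, "avx512icl": 116.3},
    1: {"c": 66507.4, "sse4": 3712.4, "avx2": 1830.5, "avx512icl": 805.4},
    2: {"c": 66491.6, "sse4": 5325.3, "avx2": 2578.5, "avx512icl": 1394.5},
    3: {"c": 66490.2, "sse4": 6418.5, "avx2": 3305.6, "avx512icl": 2571.5},
    4: {"c": 66508.6, "sse4": 8671.2, "avx2": 4054.2, "avx512icl": 2691.6},
}


def speedup(baseline, candidate):
    """Return how many times faster `candidate` is than `baseline`."""
    return baseline / candidate


def avx512_vs_avx2(row):
    """Return the AVX-512 speedup relative to AVX2 for one variant."""
    return speedup(row["avx2"], row["avx512icl"])


if __name__ == "__main__":
    for index, row in sorted(timings.items()):
        vs_c = speedup(row["c"], row["avx512icl"])
        vs_avx2 = avx512_vs_avx2(row)
        print(f"variant {index}: {vs_c:.2f}x vs C, {vs_avx2:.2f}x vs AVX2")
```

Run as a script, this prints one line per variant. The "vs C" column should match the parenthesised avx512icl ratios above up to rounding.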