Add SSSE3 implementation for the 4x4 blocks in itx

Liwei Wang requested to merge liwei/dav1d:x86_ssse3 into master

Cycle times:

inv_txfm_add_4x4_adst_adst_0_8bpc_c : 446.2
inv_txfm_add_4x4_adst_adst_0_8bpc_ssse3 : 31.4
inv_txfm_add_4x4_adst_adst_1_8bpc_c : 663.1
inv_txfm_add_4x4_adst_adst_1_8bpc_ssse3 : 55.0
inv_txfm_add_4x4_adst_dct_0_8bpc_c : 472.2
inv_txfm_add_4x4_adst_dct_0_8bpc_ssse3 : 24.1
inv_txfm_add_4x4_adst_dct_1_8bpc_c : 469.5
inv_txfm_add_4x4_adst_dct_1_8bpc_ssse3 : 54.5
inv_txfm_add_4x4_adst_flipadst_0_8bpc_c : 454.9
inv_txfm_add_4x4_adst_flipadst_0_8bpc_ssse3 : 23.9
inv_txfm_add_4x4_adst_flipadst_1_8bpc_c : 457.2
inv_txfm_add_4x4_adst_flipadst_1_8bpc_ssse3 : 53.1
inv_txfm_add_4x4_adst_identity_0_8bpc_c : 454.3
inv_txfm_add_4x4_adst_identity_0_8bpc_ssse3 : 44.1
inv_txfm_add_4x4_adst_identity_1_8bpc_c : 411.6
inv_txfm_add_4x4_adst_identity_1_8bpc_ssse3 : 53.1
inv_txfm_add_4x4_dct_adst_0_8bpc_c : 468.4
inv_txfm_add_4x4_dct_adst_0_8bpc_ssse3 : 23.2
inv_txfm_add_4x4_dct_adst_1_8bpc_c : 467.4
inv_txfm_add_4x4_dct_adst_1_8bpc_ssse3 : 47.8
inv_txfm_add_4x4_dct_dct_0_8bpc_c : 644.1
inv_txfm_add_4x4_dct_dct_0_8bpc_ssse3 : 23.2
inv_txfm_add_4x4_dct_dct_1_8bpc_c : 494.3
inv_txfm_add_4x4_dct_dct_1_8bpc_ssse3 : 48.2
inv_txfm_add_4x4_dct_flipadst_0_8bpc_c : 688.2
inv_txfm_add_4x4_dct_flipadst_0_8bpc_ssse3 : 24.7
inv_txfm_add_4x4_dct_flipadst_1_8bpc_c : 478.5
inv_txfm_add_4x4_dct_flipadst_1_8bpc_ssse3 : 49.5
inv_txfm_add_4x4_dct_identity_0_8bpc_c : 436.1
inv_txfm_add_4x4_dct_identity_0_8bpc_ssse3 : 26.6
inv_txfm_add_4x4_dct_identity_1_8bpc_c : 434.6
inv_txfm_add_4x4_dct_identity_1_8bpc_ssse3 : 36.6
inv_txfm_add_4x4_flipadst_adst_0_8bpc_c : 460.5
inv_txfm_add_4x4_flipadst_adst_0_8bpc_ssse3 : 24.0
inv_txfm_add_4x4_flipadst_adst_1_8bpc_c : 459.9
inv_txfm_add_4x4_flipadst_adst_1_8bpc_ssse3 : 53.1
inv_txfm_add_4x4_flipadst_dct_0_8bpc_c : 481.4
inv_txfm_add_4x4_flipadst_dct_0_8bpc_ssse3 : 23.7
inv_txfm_add_4x4_flipadst_dct_1_8bpc_c : 492.2
inv_txfm_add_4x4_flipadst_dct_1_8bpc_ssse3 : 51.7
inv_txfm_add_4x4_flipadst_flipadst_0_8bpc_c : 495.7
inv_txfm_add_4x4_flipadst_flipadst_0_8bpc_ssse3 : 24.0
inv_txfm_add_4x4_flipadst_flipadst_1_8bpc_c : 461.2
inv_txfm_add_4x4_flipadst_flipadst_1_8bpc_ssse3 : 52.6
inv_txfm_add_4x4_flipadst_identity_0_8bpc_c : 421.2
inv_txfm_add_4x4_flipadst_identity_0_8bpc_ssse3 : 43.3
inv_txfm_add_4x4_flipadst_identity_1_8bpc_c : 420.0
inv_txfm_add_4x4_flipadst_identity_1_8bpc_ssse3 : 44.7
inv_txfm_add_4x4_identity_adst_0_8bpc_c : 408.1
inv_txfm_add_4x4_identity_adst_0_8bpc_ssse3 : 41.8
inv_txfm_add_4x4_identity_adst_1_8bpc_c : 409.3
inv_txfm_add_4x4_identity_adst_1_8bpc_ssse3 : 42.3
inv_txfm_add_4x4_identity_dct_0_8bpc_c : 435.1
inv_txfm_add_4x4_identity_dct_0_8bpc_ssse3 : 25.4
inv_txfm_add_4x4_identity_dct_1_8bpc_c : 436.0
inv_txfm_add_4x4_identity_dct_1_8bpc_ssse3 : 37.9
inv_txfm_add_4x4_identity_flipadst_0_8bpc_c : 426.0
inv_txfm_add_4x4_identity_flipadst_0_8bpc_ssse3 : 56.3
inv_txfm_add_4x4_identity_flipadst_1_8bpc_c : 428.2
inv_txfm_add_4x4_identity_flipadst_1_8bpc_ssse3 : 52.1
inv_txfm_add_4x4_identity_identity_0_8bpc_c : 376.6
inv_txfm_add_4x4_identity_identity_0_8bpc_ssse3 : 29.6
inv_txfm_add_4x4_identity_identity_1_8bpc_c : 382.5
inv_txfm_add_4x4_identity_identity_1_8bpc_ssse3 : 28.4
inv_txfm_add_4x4_wht_wht_0_8bpc_c : 268.3
inv_txfm_add_4x4_wht_wht_0_8bpc_ssse3 : 34.4
inv_txfm_add_4x4_wht_wht_1_8bpc_c : 373.5
inv_txfm_add_4x4_wht_wht_1_8bpc_ssse3 : 34.5
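
To read the list above as speedups rather than raw cycle counts, each C figure can be divided by its SSSE3 counterpart. The following is a minimal sketch of that arithmetic for three of the entries. The `CYCLES` table and the `speedup` helper are hypothetical names introduced here for illustration and are not part of dav1d or checkasm.

```python
# Cycle counts copied from three entries of the list above: (C, SSSE3).
# The dictionary and function names are illustrative only.
CYCLES = {
    "adst_adst_0": (446.2, 31.4),
    "dct_dct_0": (644.1, 23.2),
    "wht_wht_0": (268.3, 34.4),
}


def speedup(c_cycles, simd_cycles):
    """Return the ratio of C cycles to SIMD cycles (higher is faster)."""
    return c_cycles / simd_cycles


for name, (c, ssse3) in CYCLES.items():
    print(f"{name}: {speedup(c, ssse3):.2f}x")
```

The same ratio can be computed for any other pair of lines in the list by adding it to the table.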
Edited by Ronald S. Bultje