
Add SSSE3 implementation for the 16x16 blocks in itx

Liwei Wang requested to merge liwei/dav1d:x86_itx_ssse3 into master

Cycle times:

inv_txfm_add_16x16_adst_adst_0_8bpc_c: 19643.8
inv_txfm_add_16x16_adst_adst_0_8bpc_ssse3: 870.0
inv_txfm_add_16x16_adst_adst_1_8bpc_c: 19611.7
inv_txfm_add_16x16_adst_adst_1_8bpc_ssse3: 870.3
inv_txfm_add_16x16_adst_adst_2_8bpc_c: 19554.2
inv_txfm_add_16x16_adst_adst_2_8bpc_ssse3: 869.9
inv_txfm_add_16x16_adst_dct_0_8bpc_c: 19499.2
inv_txfm_add_16x16_adst_dct_0_8bpc_ssse3: 761.1
inv_txfm_add_16x16_adst_dct_1_8bpc_c: 19819.1
inv_txfm_add_16x16_adst_dct_1_8bpc_ssse3: 760.9
inv_txfm_add_16x16_adst_dct_2_8bpc_c: 19684.5
inv_txfm_add_16x16_adst_dct_2_8bpc_ssse3: 761.4
inv_txfm_add_16x16_adst_flipadst_0_8bpc_c: 19309.3
inv_txfm_add_16x16_adst_flipadst_0_8bpc_ssse3: 877.2
inv_txfm_add_16x16_adst_flipadst_1_8bpc_c: 19374.3
inv_txfm_add_16x16_adst_flipadst_1_8bpc_ssse3: 876.8
inv_txfm_add_16x16_adst_flipadst_2_8bpc_c: 19548.6
inv_txfm_add_16x16_adst_flipadst_2_8bpc_ssse3: 879.4
inv_txfm_add_16x16_dct_adst_0_8bpc_c: 19715.3
inv_txfm_add_16x16_dct_adst_0_8bpc_ssse3: 757.6
inv_txfm_add_16x16_dct_adst_1_8bpc_c: 19586.6
inv_txfm_add_16x16_dct_adst_1_8bpc_ssse3: 756.8
inv_txfm_add_16x16_dct_adst_2_8bpc_c: 19447.3
inv_txfm_add_16x16_dct_adst_2_8bpc_ssse3: 757.2
inv_txfm_add_16x16_dct_dct_0_8bpc_c: 19188.0
inv_txfm_add_16x16_dct_dct_0_8bpc_ssse3: 64.3
inv_txfm_add_16x16_dct_dct_1_8bpc_c: 19230.1
inv_txfm_add_16x16_dct_dct_1_8bpc_ssse3: 649.1
inv_txfm_add_16x16_dct_dct_2_8bpc_c: 19276.7
inv_txfm_add_16x16_dct_dct_2_8bpc_ssse3: 649.5
inv_txfm_add_16x16_dct_flipadst_0_8bpc_c: 19967.8
inv_txfm_add_16x16_dct_flipadst_0_8bpc_ssse3: 761.1
inv_txfm_add_16x16_dct_flipadst_1_8bpc_c: 19665.7
inv_txfm_add_16x16_dct_flipadst_1_8bpc_ssse3: 761.0
inv_txfm_add_16x16_dct_flipadst_2_8bpc_c: 19766.2
inv_txfm_add_16x16_dct_flipadst_2_8bpc_ssse3: 760.6
inv_txfm_add_16x16_dct_identity_0_8bpc_c: 13874.5
inv_txfm_add_16x16_dct_identity_0_8bpc_ssse3: 97.3
inv_txfm_add_16x16_dct_identity_1_8bpc_c: 13931.8
inv_txfm_add_16x16_dct_identity_1_8bpc_ssse3: 76.3
inv_txfm_add_16x16_dct_identity_2_8bpc_c: 13801.5
inv_txfm_add_16x16_dct_identity_2_8bpc_ssse3: 454.6
inv_txfm_add_16x16_flipadst_adst_0_8bpc_c: 18900.6
inv_txfm_add_16x16_flipadst_adst_0_8bpc_ssse3: 884.6
inv_txfm_add_16x16_flipadst_adst_1_8bpc_c: 19180.2
inv_txfm_add_16x16_flipadst_adst_1_8bpc_ssse3: 886.7
inv_txfm_add_16x16_flipadst_adst_2_8bpc_c: 19320.8
inv_txfm_add_16x16_flipadst_adst_2_8bpc_ssse3: 884.6
inv_txfm_add_16x16_flipadst_dct_0_8bpc_c: 19399.7
inv_txfm_add_16x16_flipadst_dct_0_8bpc_ssse3: 775.0
inv_txfm_add_16x16_flipadst_dct_1_8bpc_c: 19345.0
inv_txfm_add_16x16_flipadst_dct_1_8bpc_ssse3: 774.6
inv_txfm_add_16x16_flipadst_dct_2_8bpc_c: 19426.2
inv_txfm_add_16x16_flipadst_dct_2_8bpc_ssse3: 775.6
inv_txfm_add_16x16_flipadst_flipadst_0_8bpc_c: 19457.6
inv_txfm_add_16x16_flipadst_flipadst_0_8bpc_ssse3: 887.8
inv_txfm_add_16x16_flipadst_flipadst_1_8bpc_c: 19413.8
inv_txfm_add_16x16_flipadst_flipadst_1_8bpc_ssse3: 885.3
inv_txfm_add_16x16_flipadst_flipadst_2_8bpc_c: 19425.6
inv_txfm_add_16x16_flipadst_flipadst_2_8bpc_ssse3: 886.3
inv_txfm_add_16x16_identity_dct_0_8bpc_c: 14150.7
inv_txfm_add_16x16_identity_dct_0_8bpc_ssse3: 104.3
inv_txfm_add_16x16_identity_dct_1_8bpc_c: 14041.5
inv_txfm_add_16x16_identity_dct_1_8bpc_ssse3: 104.2
inv_txfm_add_16x16_identity_dct_2_8bpc_c: 13917.7
inv_txfm_add_16x16_identity_dct_2_8bpc_ssse3: 459.7
inv_txfm_add_16x16_identity_identity_0_8bpc_c: 8761.7
inv_txfm_add_16x16_identity_identity_0_8bpc_ssse3: 263.3
inv_txfm_add_16x16_identity_identity_1_8bpc_c: 8669.5
inv_txfm_add_16x16_identity_identity_1_8bpc_ssse3: 263.4
inv_txfm_add_16x16_identity_identity_2_8bpc_c: 8282.1
inv_txfm_add_16x16_identity_identity_2_8bpc_ssse3: 263.3
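
As a reading aid, the per-function speedup can be derived by dividing each `_c` cycle count by its matching `_ssse3` count. The following is a minimal sketch, assuming the `name_impl: cycles` line format shown above. The helper name `speedups` and the `BENCH` subset are illustrative only and are not part of dav1d or its test tooling.

```python
import re

# A few lines copied from the cycle times above; paste the full list
# to cover every transform combination.
BENCH = """\
inv_txfm_add_16x16_adst_adst_0_8bpc_c: 19643.8
inv_txfm_add_16x16_adst_adst_0_8bpc_ssse3: 870.0
inv_txfm_add_16x16_dct_dct_0_8bpc_c: 19188.0
inv_txfm_add_16x16_dct_dct_0_8bpc_ssse3: 64.3
inv_txfm_add_16x16_identity_identity_0_8bpc_c: 8761.7
inv_txfm_add_16x16_identity_identity_0_8bpc_ssse3: 263.3
"""

# Split each line into function name, implementation suffix, and cycle count.
LINE = re.compile(r"^(?P<name>\w+)_(?P<impl>c|ssse3):\s*(?P<cycles>[\d.]+)$")


def speedups(text):
    """Return {function name: C cycles / SSSE3 cycles} (hypothetical helper)."""
    cycles = {}
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if m:
            cycles.setdefault(m["name"], {})[m["impl"]] = float(m["cycles"])
    # Keep only functions that have both a C and an SSSE3 measurement.
    return {
        name: impls["c"] / impls["ssse3"]
        for name, impls in cycles.items()
        if "c" in impls and "ssse3" in impls
    }


for name, ratio in speedups(BENCH).items():
    print(f"{name}: {ratio:.1f}x")
```

Ratios computed this way only compare the two implementations on the machine that produced the numbers. They are not absolute performance figures.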
