    Add SSSE3 implementation for the {16, 32, 64}x64 and 64x{16, 32} blocks in itx · 589e96a1
    Liwei Wang authored
    Cycle times:
    inv_txfm_add_16x64_dct_dct_0_8bpc_c: 3973.5
    inv_txfm_add_16x64_dct_dct_0_8bpc_ssse3: 185.7
    inv_txfm_add_16x64_dct_dct_1_8bpc_c: 37869.1
    inv_txfm_add_16x64_dct_dct_1_8bpc_ssse3: 2103.1
    inv_txfm_add_16x64_dct_dct_2_8bpc_c: 37822.9
    inv_txfm_add_16x64_dct_dct_2_8bpc_ssse3: 2099.1
    inv_txfm_add_16x64_dct_dct_3_8bpc_c: 37871.7
    inv_txfm_add_16x64_dct_dct_3_8bpc_ssse3: 2663.5
    inv_txfm_add_16x64_dct_dct_4_8bpc_c: 38002.9
    inv_txfm_add_16x64_dct_dct_4_8bpc_ssse3: 2589.7
    inv_txfm_add_32x64_dct_dct_0_8bpc_c: 8319.2
    inv_txfm_add_32x64_dct_dct_0_8bpc_ssse3: 376.9
    inv_txfm_add_32x64_dct_dct_1_8bpc_c: 85956.8
    inv_txfm_add_32x64_dct_dct_1_8bpc_ssse3: 4298.1
    inv_txfm_add_32x64_dct_dct_2_8bpc_c: 89906.2
    inv_txfm_add_32x64_dct_dct_2_8bpc_ssse3: 4291.3
    inv_txfm_add_32x64_dct_dct_3_8bpc_c: 83710.9
    inv_txfm_add_32x64_dct_dct_3_8bpc_ssse3: 5589.5
    inv_txfm_add_32x64_dct_dct_4_8bpc_c: 87733.5
    inv_txfm_add_32x64_dct_dct_4_8bpc_ssse3: 5658.4
    inv_txfm_add_64x16_dct_dct_0_8bpc_c: 3895.9
    inv_txfm_add_64x16_dct_dct_0_8bpc_ssse3: 179.5
    inv_txfm_add_64x16_dct_dct_1_8bpc_c: 51375.2
    inv_txfm_add_64x16_dct_dct_1_8bpc_ssse3: 3859.2
    inv_txfm_add_64x16_dct_dct_2_8bpc_c: 52562.9
    inv_txfm_add_64x16_dct_dct_2_8bpc_ssse3: 4044.1
    inv_txfm_add_64x16_dct_dct_3_8bpc_c: 51347.0
    inv_txfm_add_64x16_dct_dct_3_8bpc_ssse3: 5259.5
    inv_txfm_add_64x16_dct_dct_4_8bpc_c: 49642.2
    inv_txfm_add_64x16_dct_dct_4_8bpc_ssse3: 4008.4
    inv_txfm_add_64x32_dct_dct_0_8bpc_c: 7196.4
    inv_txfm_add_64x32_dct_dct_0_8bpc_ssse3: 355.8
    inv_txfm_add_64x32_dct_dct_1_8bpc_c: 106588.4
    inv_txfm_add_64x32_dct_dct_1_8bpc_ssse3: 4965.3
    inv_txfm_add_64x32_dct_dct_2_8bpc_c: 106230.7
    inv_txfm_add_64x32_dct_dct_2_8bpc_ssse3: 4772.0
    inv_txfm_add_64x32_dct_dct_3_8bpc_c: 107427.0
    inv_txfm_add_64x32_dct_dct_3_8bpc_ssse3: 7146.9
    inv_txfm_add_64x32_dct_dct_4_8bpc_c: 111785.7
    inv_txfm_add_64x32_dct_dct_4_8bpc_ssse3: 7156.2
    inv_txfm_add_64x64_dct_dct_0_8bpc_c: 14512.4
    inv_txfm_add_64x64_dct_dct_0_8bpc_ssse3: 674.2
    inv_txfm_add_64x64_dct_dct_1_8bpc_c: 173246.3
    inv_txfm_add_64x64_dct_dct_1_8bpc_ssse3: 8790.8
    inv_txfm_add_64x64_dct_dct_2_8bpc_c: 174264.6
    inv_txfm_add_64x64_dct_dct_2_8bpc_ssse3: 8767.6
    inv_txfm_add_64x64_dct_dct_3_8bpc_c: 170047.3
    inv_txfm_add_64x64_dct_dct_3_8bpc_ssse3: 10784.9
    inv_txfm_add_64x64_dct_dct_4_8bpc_c: 170182.2
    inv_txfm_add_64x64_dct_dct_4_8bpc_ssse3: 10795.6
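
The listing above pairs each C reference timing with its SSSE3 counterpart, so the speedup per benchmark is the ratio of the two cycle counts. A minimal sketch of that arithmetic follows; the helper name `speedups` and the regular expression are illustrative assumptions and not part of the commit, and the sample lines are copied verbatim from the listing.

```python
import re

# A few lines copied verbatim from the cycle-time listing above.
SAMPLE = """\
inv_txfm_add_16x64_dct_dct_0_8bpc_c: 3973.5
inv_txfm_add_16x64_dct_dct_0_8bpc_ssse3: 185.7
inv_txfm_add_64x64_dct_dct_1_8bpc_c: 173246.3
inv_txfm_add_64x64_dct_dct_1_8bpc_ssse3: 8790.8
"""

# Hypothetical pattern: "<benchmark name>_<c|ssse3>: <cycles>".
LINE_RE = re.compile(r"^\s*(\w+)_(c|ssse3):\s*([0-9.]+)\s*$")


def speedups(text):
    """Return {benchmark name: C cycles / SSSE3 cycles} for paired entries."""
    cycles = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            name, impl, value = match.groups()
            cycles.setdefault(name, {})[impl] = float(value)
    # Keep only benchmarks that have both a C and an SSSE3 measurement.
    return {
        name: impls["c"] / impls["ssse3"]
        for name, impls in cycles.items()
        if "c" in impls and "ssse3" in impls
    }


if __name__ == "__main__":
    for name, ratio in sorted(speedups(SAMPLE).items()):
        print(f"{name}: {ratio:.1f}x")
```

On the sample lines this gives a ratio of about 21x for `inv_txfm_add_16x64_dct_dct_0` and a little under 20x for `inv_txfm_add_64x64_dct_dct_1`; feeding it the full listing would give the ratio for every block size and variant.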