    arm: mc: Port the ARM64 warp filter to arm32 · 61442bee
    Martin Storsjö authored
    Relative speedup over C code:
                      Cortex A7     A8     A9    A53    A72    A73
    warp_8x8_8bpc_neon:    2.79   5.45   4.18   3.96   4.16   4.51
    warp_8x8t_8bpc_neon:   2.79   5.33   4.18   3.98   4.22   4.25
    
    Comparison to original ARM64 assembly:
    
    ARM64:            Cortex A53     A72     A73
    warp_8x8_8bpc_neon:   1854.6  1072.5  1102.5
    warp_8x8t_8bpc_neon:  1839.6  1069.4  1089.5
    ARM32:
    warp_8x8_8bpc_neon:   2132.5  1160.3  1218.0
    warp_8x8t_8bpc_neon:  2113.7  1148.0  1209.1