    arm64: mc: NEON implementation of emu_edge for 8bpc · ea54dbe2
    Martin Storsjö authored
    Relative speedups over C code:
                         Cortex A53    A72    A73
    emu_edge_w4_8bpc_neon:     3.82   2.93   2.41
    emu_edge_w8_8bpc_neon:     3.28   2.86   2.51
    emu_edge_w16_8bpc_neon:    3.58   3.27   2.63
    emu_edge_w32_8bpc_neon:    3.04   1.68   2.12
    emu_edge_w64_8bpc_neon:    2.58   1.45   1.48
    emu_edge_w128_8bpc_neon:   1.79   1.02   1.57
    
    The benchmark numbers for the larger sizes on A72 fluctuate
    considerably between runs and thus seem unreliable.
mc.S 118 KB