    aarch64: Optimize various intra_predict asm functions · aec81efd
    Janne Grunau authored
    Make them at least as fast as the compiled C version (tested on
    cortex-a53 vs. gcc 4.9.2).
    
                            C     NEON (before)   NEON (after)
    intra_predict_4x4_dc:   260   335             260
    intra_predict_4x4_dct:  210   265             200
    intra_predict_8x8c_dc:  497   548             493
    intra_predict_8x8c_v:   232   309             179 (arm64)
    intra_predict_8x16c_dc: 795   830             790
predict.h 2.72 KB