    AArch64: Specialise Neon convolutions for 6-tap filters · e51f4377
    Arpad Panyik authored
    The 8-tap sub-pel filters used for motion vector interpolation are
    regular, smooth and sharp. The regular and smooth filter kernels are
    zero-padded, so they are effectively 6-tap filters (some of them are
    5-tap or even 4-tap).
    
    This patch specialises the put_8tap_neon and prep_8tap_neon functions
    for 6-tap filters, avoiding a lot of redundant multiplications by zero
    and additions of zero. Wherever the sharp filter is used, the 8-tap
    path will always be selected.
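
A minimal scalar sketch of why the specialisation is safe, assuming a kernel whose two outer taps are zero. The names example_kernel, filter_8tap and filter_6tap are hypothetical, and the coefficients are illustrative rather than copied from the real filter tables; the actual patch is written in Neon assembly.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative zero-padded kernel (NOT taken from the real tables):
 * the outer taps are zero and the coefficients sum to 128. */
static const int8_t example_kernel[8] = { 0, 2, -10, 76, 72, -14, 2, 0 };

/* Generic path: eight multiply-accumulates per output sample. */
static int filter_8tap(const uint8_t *src, const int8_t *k)
{
    int sum = 0;
    for (int i = 0; i < 8; i++)
        sum += src[i] * k[i];
    return sum;
}

/* Specialised path: skips taps 0 and 7, so six multiply-accumulates.
 * Only valid when both outer taps are zero. */
static int filter_6tap(const uint8_t *src, const int8_t *k)
{
    assert(k[0] == 0 && k[7] == 0);
    int sum = 0;
    for (int i = 1; i < 7; i++)
        sum += src[i] * k[i];
    return sum;
}
```

For any 8-sample window both functions return the same sum with this kernel, which is why only kernels with non-zero outer taps (the sharp filter) need the full 8-tap path.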
    
    Benchmarking this on a broad range of recent CPUs shows a 7-15% FPS
    uplift.
    
    Get raw sample video:
    https://ultravideo.fi/video/Bosphorus_1920x1080_120fps_420_8bit_YUV_RAW.7z
    
    Encode using:
    aomenc --good --cpu-used=5 -w 1920 -h 1080 --bit-depth=8 --ivf -o Bosphorus_1080p_8bit.ivf Bosphorus_1920x1080_120fps_420_8bit_YUV.y4m