Cfl ac simd

Ronald S. Bultje requested to merge rbultje/dav1d:cfl-ac-simd into master

Continuation of !441 (closed):

  • per-width versions;
  • use aligned reads/writes in sub_loop;
  • integrate sum_loop into the main loop;
  • special case w=8/16 if wpad != 0.
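For readers unfamiliar with the function being optimized, here is a minimal scalar sketch of what a 4:2:0 `cfl_ac` computes, and of what "integrate sum_loop into the main loop" means in principle. It is not dav1d's code and not the AVX2 implementation from this merge request: the function names, the `log2_pow2` helper, and the parameter layout are hypothetical, and it assumes 8bpc input, power-of-two block sizes, padding expressed in units of 4 chroma pixels, and a `<< 1` scaling of each 2x2 luma sum.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: log2 of a power-of-two value. */
static int log2_pow2(int v) {
    int n = 0;
    while (v > 1) { v >>= 1; n++; }
    return n;
}

/* Variant 1: fill the block, then sum it in a separate pass, then subtract. */
static void cfl_ac_420_separate(int16_t *ac, const uint8_t *ypx, ptrdiff_t stride,
                                int w_pad, int h_pad, int width, int height)
{
    assert(width > 4 * w_pad && height > 4 * h_pad);
    const int vis_w = width - 4 * w_pad, vis_h = height - 4 * h_pad;
    int x, y;
    for (y = 0; y < vis_h; y++) {
        const uint8_t *row = ypx + 2 * y * stride;
        int16_t *dst = ac + y * width;
        for (x = 0; x < vis_w; x++)   /* 2x2 luma sum, scaled by 2 */
            dst[x] = (int16_t)((row[2 * x] + row[2 * x + 1] +
                                row[2 * x + stride] + row[2 * x + 1 + stride]) << 1);
        for (; x < width; x++)        /* right padding: replicate last column */
            dst[x] = dst[x - 1];
    }
    for (; y < height; y++)           /* bottom padding: replicate last row */
        memcpy(ac + y * width, ac + (y - 1) * width, width * sizeof(*ac));

    int sum = (width * height) >> 1;  /* rounding bias */
    for (x = 0; x < width * height; x++)   /* the separate "sum loop" */
        sum += ac[x];
    const int dc = sum >> (log2_pow2(width) + log2_pow2(height));
    for (x = 0; x < width * height; x++)   /* the "sub loop" */
        ac[x] = (int16_t)(ac[x] - dc);
}

/* Variant 2: accumulate the sum while the block is being written. */
static void cfl_ac_420_fused(int16_t *ac, const uint8_t *ypx, ptrdiff_t stride,
                             int w_pad, int h_pad, int width, int height)
{
    assert(width > 4 * w_pad && height > 4 * h_pad);
    const int vis_w = width - 4 * w_pad, vis_h = height - 4 * h_pad;
    int x, y, row_sum = 0;
    int sum = (width * height) >> 1;  /* rounding bias */
    for (y = 0; y < vis_h; y++) {
        const uint8_t *row = ypx + 2 * y * stride;
        int16_t *dst = ac + y * width;
        row_sum = 0;
        for (x = 0; x < vis_w; x++) {
            dst[x] = (int16_t)((row[2 * x] + row[2 * x + 1] +
                                row[2 * x + stride] + row[2 * x + 1 + stride]) << 1);
            row_sum += dst[x];
        }
        for (; x < width; x++) {
            dst[x] = dst[x - 1];
            row_sum += dst[x];
        }
        sum += row_sum;
    }
    for (; y < height; y++) {         /* padded rows repeat the last row's sum */
        memcpy(ac + y * width, ac + (y - 1) * width, width * sizeof(*ac));
        sum += row_sum;
    }
    const int dc = sum >> (log2_pow2(width) + log2_pow2(height));
    for (x = 0; x < width * height; x++)
        ac[x] = (int16_t)(ac[x] - dc);
}
```

Both variants produce the same output; the second simply avoids one extra pass over the `ac` buffer, which is the general idea behind folding the summation into the main loop.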

before:

    cfl_ac_420_w4_8bpc_c: 367.4
    cfl_ac_420_w4_8bpc_avx2: 72.8
    cfl_ac_420_w8_8bpc_c: 621.6
    cfl_ac_420_w8_8bpc_avx2: 85.1
    cfl_ac_420_w16_8bpc_c: 983.4
    cfl_ac_420_w16_8bpc_avx2: 141.0

after:

    cfl_ac_420_w4_8bpc_c: 376.2
    cfl_ac_420_w4_8bpc_avx2: 28.5
    cfl_ac_420_w8_8bpc_c: 607.2
    cfl_ac_420_w8_8bpc_avx2: 29.9
    cfl_ac_420_w16_8bpc_c: 962.1
    cfl_ac_420_w16_8bpc_avx2: 48.8
Edited by Ronald S. Bultje