1. 05 Dec, 2018 1 commit
    • Change type of MC intermediates from coef to int16_t · 2e6c8a92
      Ronald S. Bultje authored
      Coef was originally chosen to accommodate 12 bits/component with
      4 extra precision intermediates + some under/overflow range, but
      it turns out that 12 bits/component only uses 2 extra precision
      intermediates, so we don't need coef.
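The bit budget behind this change can be checked with a back-of-the-envelope sketch. This is not code from the commit: the helper names are hypothetical, and it assumes an intermediate needs roughly "bit depth + extra precision bits" of magnitude plus some headroom for filter under/overshoot, against the 15 magnitude bits of an int16_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: magnitude bits needed by an MC intermediate,
 * assuming it carries the sample bit depth plus extra precision bits. */
static int intermediate_bits_needed(int bitdepth, int extra_precision_bits)
{
    assert(bitdepth > 0 && extra_precision_bits >= 0);
    return bitdepth + extra_precision_bits;
}

/* Returns 1 if the intermediate fits in int16_t while leaving
 * `headroom_bits` spare for filter under/overshoot, 0 otherwise.
 * int16_t offers 15 magnitude bits plus a sign bit. */
static int fits_in_int16(int bitdepth, int extra_precision_bits,
                         int headroom_bits)
{
    const int magnitude_bits = 8 * (int)sizeof(int16_t) - 1;
    return intermediate_bits_needed(bitdepth, extra_precision_bits)
           + headroom_bits <= magnitude_bits;
}
```

Under these assumptions, `fits_in_int16(12, 4, 0)` is 0, which matches the original motivation for a wider type, while `fits_in_int16(12, 2, 1)` is 1, which matches the commit's conclusion that int16_t is enough.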
  2. 16 Nov, 2018 1 commit
  3. 13 Nov, 2018 1 commit
  4. 10 Nov, 2018 1 commit
    • Split MC blend · 58fc5165
      Henrik Gramner authored
      The mstride == 0, mstride == 1, and mstride == w cases are very different
      from each other, and splitting them into separate functions makes it easier
      to optimize them.
      
      Also add some further optimizations to the AVX2 asm that became possible
      after this change.
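A minimal C sketch of what such a split might look like follows. It is an illustration, not the code from this commit: the function names are hypothetical, `mstride` is assumed to be the per-row step of the mask pointer, the mstride == 1 case is assumed to apply one weight to a whole row, `tmp` is assumed to be packed with a stride of `w`, and the 6-bit rounding blend formula is an assumption as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed blend formula: 6-bit weight m in [0, 64], with rounding. */
static inline uint8_t blend_px(int a, int b, int m)
{
    assert(m >= 0 && m <= 64);
    return (uint8_t)((a * (64 - m) + b * m + 32) >> 6);
}

/* mstride == w: full 2D mask, one weight per pixel. */
static void blend_2d(uint8_t *dst, ptrdiff_t dst_stride, const uint8_t *tmp,
                     int w, int h, const uint8_t *mask)
{
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++)
            dst[x] = blend_px(dst[x], tmp[x], mask[x]);
        dst += dst_stride;
        tmp += w;
        mask += w;
    }
}

/* mstride == 0: the same row of w weights is reused for every row,
 * so the weight depends only on the column. */
static void blend_cols(uint8_t *dst, ptrdiff_t dst_stride, const uint8_t *tmp,
                       int w, int h, const uint8_t *mask)
{
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++)
            dst[x] = blend_px(dst[x], tmp[x], mask[x]);
        dst += dst_stride;
        tmp += w;
    }
}

/* mstride == 1 (assumed meaning): one weight per row, so the weight
 * can be loaded once outside the inner loop. */
static void blend_rows(uint8_t *dst, ptrdiff_t dst_stride, const uint8_t *tmp,
                       int w, int h, const uint8_t *mask)
{
    for (int y = 0; y < h; y++) {
        const int m = mask[y];
        for (int x = 0; x < w; x++)
            dst[x] = blend_px(dst[x], tmp[x], m);
        dst += dst_stride;
        tmp += w;
    }
}
```

With the three cases separated, each inner loop has a fixed mask access pattern and no per-pixel branch on the stride, which is the kind of structure that tends to be easier to vectorize.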
  5. 07 Nov, 2018 1 commit
  6. 06 Nov, 2018 1 commit
  7. 03 Nov, 2018 1 commit
  8. 20 Oct, 2018 1 commit
    • arm64/mc: add 8-bit neon asm for avg, w_avg and mask · 80e47425
      Janne Grunau authored
      checkasm --bench on a Qualcomm Kryo (Snapdragon 820):
      nop: 33.0
      avg_w4_8bpc_c: 450.5
      avg_w4_8bpc_neon: 20.1
      avg_w8_8bpc_c: 438.6
      avg_w8_8bpc_neon: 45.2
      avg_w16_8bpc_c: 1003.7
      avg_w16_8bpc_neon: 112.8
      avg_w32_8bpc_c: 3249.6
      avg_w32_8bpc_neon: 429.9
      avg_w64_8bpc_c: 7213.3
      avg_w64_8bpc_neon: 1299.4
      avg_w128_8bpc_c: 16791.3
      avg_w128_8bpc_neon: 2978.4
      w_avg_w4_8bpc_c: 605.7
      w_avg_w4_8bpc_neon: 30.9
      w_avg_w8_8bpc_c: 545.8
      w_avg_w8_8bpc_neon: 72.9
      w_avg_w16_8bpc_c: 1430.1
      w_avg_w16_8bpc_neon: 193.5
      w_avg_w32_8bpc_c: 4876.3
      w_avg_w32_8bpc_neon: 715.3
      w_avg_w64_8bpc_c: 11338.0
      w_avg_w64_8bpc_neon: 2147.0
      w_avg_w128_8bpc_c: 26822.0
      w_avg_w128_8bpc_neon: 4596.3
      mask_w4_8bpc_c: 604.6
      mask_w4_8bpc_neon: 37.2
      mask_w8_8bpc_c: 654.8
      mask_w8_8bpc_neon: 96.0
      mask_w16_8bpc_c: 1663.0
      mask_w16_8bpc_neon: 272.4
      mask_w32_8bpc_c: 5707.6
      mask_w32_8bpc_neon: 1028.9
      mask_w64_8bpc_c: 12735.3
      mask_w64_8bpc_neon: 2533.2
      mask_w128_8bpc_c: 31027.6
      mask_w128_8bpc_neon: 6247.2
  9. 27 Sep, 2018 1 commit
  10. 22 Sep, 2018 1 commit
    • Initial decoder implementation. · e2892ffa
      Ronald S. Bultje authored
      With minor contributions from:
       - Jean-Baptiste Kempf <jb@videolan.org>
       - Marvin Scholz <epirat07@gmail.com>
       - Hugo Beauzée-Luyssen <hugo@videolan.org>