1. 10 Nov, 2018 1 commit
    • Split MC blend · 58fc5165
      Henrik Gramner authored
      The mstride == 0, mstride == 1, and mstride == w cases are very different
      from each other, and splitting them into separate functions makes it easier
      to optimize them.
      
      Also add some further optimizations to the AVX2 asm that became possible
      after this change.
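
The three mask-stride cases named in this commit message can be sketched in plain C as follows. This is an illustration only, not the dav1d source: the function names, the 8-bit pixel type, and the 6-bit weight range (0 to 64) are assumptions made for the example. The combined function picks the mask addressing at run time, while each split function has a fixed, simpler inner loop.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed 6-bit weights: m is in [0, 64]; 64 selects b entirely. */
static uint8_t blend_px(int a, int b, int m) {
    return (uint8_t)((a * (64 - m) + b * m + 32) >> 6);
}

/* Combined form (hypothetical): the mask stride decides addressing at run time. */
static void blend_any(uint8_t *dst, ptrdiff_t stride, const uint8_t *tmp,
                      int w, int h, const uint8_t *mask, ptrdiff_t mstride)
{
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++)
            dst[x] = blend_px(dst[x], tmp[x], mask[mstride == 1 ? 0 : x]);
        dst += stride;
        tmp += w;
        mask += mstride;
    }
}

/* mstride == w: a full w*h mask, one weight per pixel. */
static void blend_full(uint8_t *dst, ptrdiff_t stride, const uint8_t *tmp,
                       int w, int h, const uint8_t *mask)
{
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++)
            dst[x] = blend_px(dst[x], tmp[x], mask[x]);
        dst += stride;
        tmp += w;
        mask += w;
    }
}

/* mstride == 0: a single row of w weights, reused for every row. */
static void blend_per_column(uint8_t *dst, ptrdiff_t stride, const uint8_t *tmp,
                             int w, int h, const uint8_t *mask)
{
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++)
            dst[x] = blend_px(dst[x], tmp[x], mask[x]);
        dst += stride;
        tmp += w;
    }
}

/* mstride == 1: one weight per row, constant across that row. */
static void blend_per_row(uint8_t *dst, ptrdiff_t stride, const uint8_t *tmp,
                          int w, int h, const uint8_t *mask)
{
    for (int y = 0; y < h; y++) {
        const int m = mask[y];
        for (int x = 0; x < w; x++)
            dst[x] = blend_px(dst[x], tmp[x], m);
        dst += stride;
        tmp += w;
    }
}
```

In the per-row variant the weight is loaded once per row, and in the per-column variant the mask pointer never moves, which is the kind of simplification that is harder to exploit when all three cases share one loop.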
  2. 08 Nov, 2018 1 commit
  3. 07 Nov, 2018 3 commits
  4. 06 Nov, 2018 3 commits
  5. 03 Nov, 2018 1 commit
  6. 01 Nov, 2018 1 commit
  7. 31 Oct, 2018 1 commit
  8. 30 Oct, 2018 1 commit
    • Rewrite msac.c · 33d16d81
      Rostislav Pehlivanov authored
      This rewrites msac.c to the point of there being no libaom project
      code left, hence changing the license of the file to the dav1d
      project's license.
      
      The rewrite greatly simplifies and optimizes entropy decoding.
      Some encoder-specific code had also remained, such as tell_offs,
      which reports the fractional number of bits left and which the
      decoder does not need.
      
      ctx_refill is much simpler and has a tighter loop with fewer
      instructions, which on some CPUs can actually be run in one cycle.
      The old mechanism which checked to see if the buffer reached the
      end to disable calling ctx_refill is gone, as all it saved was
      a mostly well predicted branch.
      The optimizations regarding this function enabled us to use
      an ec_win of 64 bits whilst improving performance. This was not
      possible with the old needlessly robust system.
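
As a rough illustration of the refill idea described above, here is a minimal sketch of a 64-bit window topped up one byte at a time. It is a generic MSB-first bit reader, not the msac.c range decoder; the struct, field names, and functions are hypothetical. The only end-of-buffer handling is one well-predicted comparison per byte that feeds zeros once the input is exhausted, rather than a separate mechanism that disables refilling.

```c
#include <stdint.h>

/* Hypothetical reader state; names are illustrative, not from msac.c. */
typedef struct {
    const uint8_t *pos;  /* next unread input byte */
    const uint8_t *end;  /* one past the last input byte */
    uint64_t win;        /* bit window; valid bits sit at the top */
    int cnt;             /* number of valid bits in win */
} BitWindow;

/* Top up the window with whole bytes while at least 8 bits are free.
 * Reads past the end of the input yield zero bytes. */
static void refill(BitWindow *s) {
    while (s->cnt <= 56) {
        const uint64_t byte = s->pos < s->end ? *s->pos++ : 0;
        s->win |= byte << (56 - s->cnt);
        s->cnt += 8;
    }
}

/* Take the top n bits of the window (1 <= n <= 32). */
static unsigned take_bits(BitWindow *s, int n) {
    if (s->cnt < n)
        refill(s);
    const unsigned v = (unsigned)(s->win >> (64 - n));
    s->win <<= n;
    s->cnt -= n;
    return v;
}
```

With a 64-bit window, one refill can buffer up to eight bytes, so the refill path is entered far less often than with a narrower window.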
      
      Some msac-specific API changes were made - msac_decode_bool now
      takes a scaled value directly rather than doing scaling itself.
      This saves a shift in most use cases as the function is mainly
      used to read equiprobable bools rather than ones with specific
      probabilities.
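
The calling-convention change can be sketched as below. Everything here is hypothetical and stateless: the function names, the 8-bit input probability, the 15-bit precision, and the split-point arithmetic are assumptions chosen to show where the shift goes, not the arithmetic used in msac.c. When the caller reads an equiprobable bool, the scaled value is a compile-time constant, so no shift is executed per call.

```c
#include <stdint.h>

#define PROB_BITS 15  /* assumed probability precision */

/* Hypothetical split-point test shared by both variants. */
static int bool_from_scaled(uint32_t dif, uint32_t rng, unsigned scaled) {
    const uint32_t split = (uint32_t)(((uint64_t)rng * scaled) >> PROB_BITS);
    return dif >= split;
}

/* Old-style entry point: takes an 8-bit probability and scales it on every call. */
static int decode_bool_unscaled(uint32_t dif, uint32_t rng, unsigned prob8) {
    return bool_from_scaled(dif, rng, prob8 << (PROB_BITS - 8));
}

/* New-style entry point: the caller passes the already scaled value. */
static int decode_bool_scaled(uint32_t dif, uint32_t rng, unsigned scaled) {
    return bool_from_scaled(dif, rng, scaled);
}

/* Equiprobable read: the scaled probability is a constant expression. */
static int decode_equiprobable(uint32_t dif, uint32_t rng) {
    return decode_bool_scaled(dif, rng, 1u << (PROB_BITS - 1));
}
```

Callers that do have a specific probability would scale it themselves before the call, which costs the same shift as before but only in the less common case.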
      
      There's still room for optimizations, mainly in that update_cdf
      could be SIMD'd. This commit prepares for some of them by
      moving the init function to the bottom of the file.
      
      Overall decoder speedup seems to be around 3%-5%, depending on
      bitrate and encoder, as expected.
  9. 25 Oct, 2018 1 commit
  10. 22 Oct, 2018 3 commits
  11. 19 Oct, 2018 1 commit
  12. 14 Oct, 2018 4 commits
  13. 08 Oct, 2018 2 commits
  14. 05 Oct, 2018 1 commit
  15. 04 Oct, 2018 2 commits
  16. 01 Oct, 2018 1 commit
  17. 29 Sep, 2018 3 commits
  18. 27 Sep, 2018 1 commit
  19. 25 Sep, 2018 3 commits
  20. 24 Sep, 2018 2 commits
  21. 22 Sep, 2018 1 commit
    • Initial decoder implementation. · e2892ffa
      Ronald S. Bultje authored
      With minor contributions from:
       - Jean-Baptiste Kempf <jb@videolan.org>
       - Marvin Scholz <epirat07@gmail.com>
       - Hugo Beauzée-Luyssen <hugo@videolan.org>