  1. 13 Feb, 2019 1 commit
  2. 19 Jan, 2019 1 commit
  3. 13 Jan, 2019 1 commit
  4. 25 Nov, 2018 1 commit
  5. 19 Nov, 2018 1 commit
    • film_grain: implement film grain synthesis · cfa986fe
      Niklas Haas authored
      This is using a slightly adapted version of my GPU-based algorithm. The
      major difference to the algorithm suggested by the spec (and implemented
      in libaom) is that instead of using a line buffer to hold the previous
      row's film grain blocks, we compute each row/block fully independently.
      
      This opens up the door to exploit parallelism in the future, since we
      don't have any left->right or top->down dependency except for the PRNG
      state. (Which we could pre-compute for a massively parallel / GPU
      implementation)
      
      That being said, it's probably somewhat slower than using a line buffer
      for the serial / single CPU case, although most likely not by much
      (since the areas with the most redundant work get progressively smaller,
      down to a single 2x2 square for the worst case).
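
The idea of pre-computing the PRNG state so that blocks can be processed in any order can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `NUM_BLOCKS`, `block_offset()` and the helper names are hypothetical, and the LFSR tap positions are an assumption modelled on the 16-bit generator usually described for AV1 film grain.

```c
#include <assert.h>
#include <stdint.h>

enum { NUM_BLOCKS = 8 }; /* hypothetical number of blocks in one row */

/* One step of a 16-bit LFSR. The tap positions (0, 1, 3, 12) are an
 * assumption made for this sketch. */
static uint16_t lfsr_next(uint16_t state) {
    const unsigned bit =
        (state ^ (state >> 1) ^ (state >> 3) ^ (state >> 12)) & 1;
    return (uint16_t)((state >> 1) | (bit << 15));
}

/* Cheap serial pass: record the PRNG state each block will consume. */
static void precompute_states(uint16_t seed, uint16_t states[NUM_BLOCKS]) {
    for (int i = 0; i < NUM_BLOCKS; i++) {
        seed = lfsr_next(seed);
        states[i] = seed;
    }
}

/* Hypothetical per-block work: it depends only on the block's own state,
 * never on a neighbouring block's output. */
static unsigned block_offset(uint16_t state) {
    return state >> 8;
}

/* Reference: advance the PRNG and process the blocks strictly in order. */
static void offsets_serial(uint16_t seed, unsigned out[NUM_BLOCKS]) {
    for (int i = 0; i < NUM_BLOCKS; i++) {
        seed = lfsr_next(seed);
        out[i] = block_offset(seed);
    }
}

/* With the states pre-computed, blocks can run in any order (here:
 * reversed), which is what a parallel or GPU version would rely on. */
static void offsets_any_order(uint16_t seed, unsigned out[NUM_BLOCKS]) {
    uint16_t states[NUM_BLOCKS];
    precompute_states(seed, states);
    for (int i = NUM_BLOCKS - 1; i >= 0; i--)
        out[i] = block_offset(states[i]);
}
```

Both functions fill `out` with the same values for a given seed; only the order in which the blocks are visited differs.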
  6. 16 Nov, 2018 1 commit
  7. 08 Nov, 2018 1 commit
  8. 07 Nov, 2018 1 commit
    • recon: fix bilinear entry in dav1d_filter_dir table · 55d512c7
      Janne Grunau authored
      Fixes #152, #153. Fixes a global buffer overflow in obmc() with
      clusterfuzz-testcase-dav1d_fuzzer_mt-5702455078158336 and a UBSan
      index-out-of-bounds error in dav1d_recon_b_inter_8bpc() with
      clusterfuzz-testcase-minimized-dav1d_fuzzer_mt-5688109887389696. Credits
      to oss-fuzz.
  9. 05 Nov, 2018 2 commits
    • Reorder the mc warp filter array · a0692eb8
      Henrik Gramner authored
      Required to be able to use pmaddubsw without overflow in the x86 SIMD.
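
To see why the pairing of coefficients matters, here is a scalar model of a single 16-bit lane of `pmaddubsw`: it multiplies two unsigned bytes by two signed bytes and adds the products with signed saturation. The helper name `maddubs_lane` and the tap values used below are hypothetical; the actual ordering chosen by the commit is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one output lane of x86 pmaddubsw:
 * (unsigned byte * signed byte) + (unsigned byte * signed byte),
 * saturated to the signed 16-bit range. */
static int16_t maddubs_lane(uint8_t a0, int8_t b0, uint8_t a1, int8_t b1) {
    int32_t sum = (int32_t)a0 * b0 + (int32_t)a1 * b1;
    if (sum > INT16_MAX) sum = INT16_MAX; /* saturate instead of wrapping */
    if (sum < INT16_MIN) sum = INT16_MIN;
    return (int16_t)sum;
}
```

With pixels at 255, a lane that pairs two large positive taps such as 30 and 108 would need 138 * 255 = 35190 and saturates to 32767, while pairing 108 with a negative tap such as -12 gives 96 * 255 = 24480, which fits. Arranging the array so that no lane combines two taps of that kind avoids the saturation.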
    • Add AVX2 implementation for SGR looprestoration · 4a499fd5
      Ronald S. Bultje authored
      Total decoding time for first 1000 frames of TwxVOYxoukU:
      after: 0m3.761s
      before: 0m6.868s
      
      Cycle times:
      selfguided_3x3_8bpc_c: 438865.8
      selfguided_3x3_8bpc_avx2: 112522.6
      selfguided_5x5_8bpc_c: 326938.3
      selfguided_5x5_8bpc_avx2: 75850.1
      selfguided_mix_8bpc_c: 755980.5
      selfguided_mix_8bpc_avx2: 195930.3
  10. 31 Oct, 2018 1 commit
    • Remove dav1d_sgr_one_by_x · 285d1b76
      Luc Trudeau authored
      Since n equals either 25 or 9, the dav1d_sgr_one_by_x table can be
      replaced with a ternary operation.
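
A sketch of what such a replacement could look like, assuming the table held rounded reciprocals of n with 12 fractional bits. That precision, and both function names, are assumptions for illustration rather than details taken from the commit.

```c
#include <assert.h>

/* Rounded fixed-point reciprocal of n with 12 fractional bits
 * (assumed precision): round(4096 / n). */
static unsigned one_by_x_generic(unsigned n) {
    return ((1u << 12) + n / 2) / n;
}

/* Since n is only ever 9 (3x3 window) or 25 (5x5 window), a table
 * lookup can collapse into a single ternary. */
static unsigned one_by_x(unsigned n) {
    assert(n == 9 || n == 25);
    return n == 25 ? 164 : 455;
}
```

Under that assumption the two constants agree with the generic formula, so the ternary gives the same result as a lookup for the only two values of n that occur.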
  11. 20 Oct, 2018 1 commit
  12. 05 Oct, 2018 1 commit
  13. 04 Oct, 2018 1 commit
  14. 27 Sep, 2018 1 commit
    • Downshift mc subpel multiplier constants · 14072e73
      Henrik Gramner authored
      Downshift all the constants by one, and reduce the rounding shift by one.
      This is mathematically equivalent since all constants are a multiple of two,
      but allows for using 16-bit intermediates in the 1st pass of the 8-tap filter.
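
The equivalence can be checked with a small scalar sketch. The tap row, the rounding constants (64 with a shift of 7, 32 with a shift of 6) and all function names below are illustrative assumptions, not values quoted from the commit. Since every tap is even, the full sum is exactly twice the halved sum, so floor((2d + 64) / 128) equals floor((d + 32) / 64).

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative 8-tap row: all taps are even and sum to 128 (assumed). */
static const int8_t taps_full[8] = { 0, 2, -6, 126, 8, -2, 0, 0 };
/* The same row with every tap divided by two. */
static const int8_t taps_half[8] = { 0, 1, -3, 63, 4, -1, 0, 0 };

static int dot8(const int8_t *f, const uint8_t *px) {
    int sum = 0;
    for (int i = 0; i < 8; i++)
        sum += f[i] * px[i];
    return sum;
}

/* floor(v / 2^sh), written to avoid right-shifting a negative value. */
static int floor_shift(int v, int sh) {
    return v >= 0 ? v >> sh : -((-v + (1 << sh) - 1) >> sh);
}

static int filter_full(const uint8_t *px) {
    return floor_shift(dot8(taps_full, px) + 64, 7);
}

static int filter_half(const uint8_t *px) {
    return floor_shift(dot8(taps_half, px) + 32, 6);
}

/* Largest magnitude the weighted sum can reach for 8-bit pixels. */
static int max_abs_sum(const int8_t *f) {
    int pos = 0, neg = 0;
    for (int i = 0; i < 8; i++) {
        if (f[i] > 0) pos += f[i];
        else          neg -= f[i];
    }
    return 255 * (pos > neg ? pos : neg);
}
```

For this row the worst-case sum drops from 34680, which does not fit in a signed 16-bit value, to 17340, which does. That is the kind of headroom that makes 16-bit intermediates possible.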
  15. 22 Sep, 2018 1 commit
    • Initial decoder implementation. · e2892ffa
      Ronald S. Bultje authored
      With minor contributions from:
       - Jean-Baptiste Kempf <jb@videolan.org>
       - Marvin Scholz <epirat07@gmail.com>
       - Hugo Beauzée-Luyssen <hugo@videolan.org>