  1. Jan 12, 2023
  2. Dec 14, 2022
  3. Dec 13, 2022
  4. Dec 09, 2022
  5. Dec 04, 2022
  6. Nov 21, 2022
  7. Nov 10, 2022
  8. Oct 30, 2022
  9. Oct 27, 2022
  10. Oct 26, 2022
  11. Oct 20, 2022
    • threading: Fix a race around frame completion (frame-mt) · 3e7886db
      Victorien Le Couviour--Tuffet authored
      If the first frame being decoded completes while an async reset
      request for that same frame is pending, the request becomes stale.
      Processing such a stale request is likely to result in a hang.
      
      One reason this happens is the skip condition at the beginning of
      reset_task_cur().
      => Consume the async request before that check.
      
      Another reason is several threads producing async reset requests in
      parallel: an async request for the first frame could cascade through the
      other threads (other frames) during completion of that frame, so it is
      not caught by the last synchronous reset_task_cur() call made after
      signaling the main thread and before releasing the lock.
      => To solve this, add protections at the racy locations: after
      increasing first, before returning from reset_task_cur_async(), and
      after consuming the async request.
      3e7886db
  12. Oct 10, 2022
    • Handle host_machine.system() 'ios' and 'tvos' the same way as 'darwin' · 5b07b425
      Sebastian Dröge authored
      Despite not being documented in Meson's list of canonical system names,
      Meson does accept 'ios', mostly as a synonym for 'darwin'.
      
      Using 'ios' instead of 'darwin' makes it possible to distinguish between
      the two in the cases where that is necessary. Therefore, within dav1d,
      accept 'ios' as an alias for 'darwin' as the system name, so that cross
      files that make this distinction can be used.
      
      Meson itself also allows 'tvos' in addition to 'ios' in its internal
      `is_darwin()` function; as such, all three are handled the same way here.
      5b07b425
  13. Sep 30, 2022
  14. Sep 28, 2022
  15. Sep 26, 2022
  16. Sep 19, 2022
    • arm: itx: Add clipping to row_clip_min/max in the 10 bpc codepaths · 345127a7
      Martin Storsjö authored
      This fixes conformance with the argon test samples, in particular
      with these samples:
          profile0_core/streams/test10100_579_8614.obu
          profile0_core/streams/test10218_6914.obu
      
      This gives a pretty notable slowdown to these transforms - some
      examples:
      
      Before:                                 Cortex A53       A72       A73    Apple M1
      inv_txfm_add_8x8_dct_dct_1_10bpc_neon:       365.7     290.2     299.8    0.3
      inv_txfm_add_16x16_dct_dct_2_10bpc_neon:    1865.2    1384.1    1457.5    2.6
      inv_txfm_add_64x64_dct_dct_4_10bpc_neon:   33976.3   26817.0   24864.2   40.4
      After:
      inv_txfm_add_8x8_dct_dct_1_10bpc_neon:       397.7     322.2     335.1    0.4
      inv_txfm_add_16x16_dct_dct_2_10bpc_neon:    2121.9    1336.7    1664.6    2.6
      inv_txfm_add_64x64_dct_dct_4_10bpc_neon:   38569.4   27622.6   28176.0   51.0
      
      Thus, for the transforms alone, it makes them around 10-13% slower
      (the Apple M1 measurements are too noisy to be conclusive here).
      
      Measured on actual full decoding, it makes decoding of 10 bpc
      Chimera roughly 1% slower on an Apple M1, which is close to measurement
      noise anyway.
      345127a7
    • Henrik Gramner's avatar
      9c74a9b0
    • x86: Fix overflows in 12bpc AVX2 DC-only IDCT · 49b1c3c5
      Henrik Gramner authored
      Using smaller immediates also results in a small code size reduction in
      some cases, so apply those changes to the (10bpc-only) SSE code as well.
      49b1c3c5
    • x86: Fix clipping in high bit-depth AVX2 4x16 IDCT · 0c8a3461
      Henrik Gramner authored
      Certain clips were incorrectly performed on negated values, which
      caused things to be off-by-one in both directions. Correct this by
      negating such values prior to clipping instead of afterwards.
      0c8a3461
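The off-by-one described in this commit can be reproduced in a few lines of C. This is only a sketch of the arithmetic, not the AVX2 code: the helper names and the 16-bit-style range are assumptions chosen for illustration. Because the clipping range is asymmetric, negating after the clip and negating before the clip give different results at both ends.

```c
#include <stdint.h>

/* Clamp v to the inclusive range [lo, hi]. */
static int32_t clip(int32_t v, int32_t lo, int32_t hi) {
    return v < lo ? lo : v > hi ? hi : v;
}

/* Assumed example range (asymmetric, like a signed 16-bit value).
 * The real IDCT code uses its own intermediate range. */
enum { CLIP_MIN = -32768, CLIP_MAX = 32767 };

/* Problematic order: clip first, negate afterwards.
 * The result can land one step outside or one step short of the range. */
static int32_t negate_after_clip(int32_t x) {
    return -clip(x, CLIP_MIN, CLIP_MAX);
}

/* Order described as the fix: negate first, then clip. */
static int32_t negate_before_clip(int32_t x) {
    return clip(-x, CLIP_MIN, CLIP_MAX);
}
```

With these assumed bounds, an input of -40000 yields 32768 with the first helper but 32767 with the second, and an input of 40000 yields -32767 versus -32768: off by one in both directions.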
  17. Sep 15, 2022
    • Don't use gas-preprocessor with clang-cl for arm targets · cc9651f5
      Martin Storsjö authored
      Since meson 0.58.0 (released in May 2021), meson accepts adding '.S'
      assembly files as source files to the clang-cl compiler.
      
      If using an older version of meson, keep using gas-preprocessor
      just like for MSVC builds.
      cc9651f5
    • Fix checking the reference dimensions for the projection process · d4a2b75d
      David Conrad authored
      Section 7.9.2 returns 0 "If RefMiRows[ srcIdx ] is not equal to MiRows,
      RefMiCols[ srcIdx ] is not equal to MiCols"
      
      dav1d was comparing pixel width/height, not block width/height.
      Compare block dimensions instead, to conform with the spec.
      d4a2b75d
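The difference between the two comparisons can be shown with a short sketch. It assumes the AV1 specification's derivation of the frame size in 4x4 units, MiCols = 2 * ((FrameWidth + 7) >> 3), and likewise for MiRows. The `FrameSize` type and the helper names are hypothetical and are not dav1d's.

```c
/* Frame dimension in 4x4 block units, following the AV1 spec's
 * MiCols/MiRows derivation: 2 * ((size + 7) >> 3). */
static int mi_units(int pixels) {
    return 2 * ((pixels + 7) >> 3);
}

/* Hypothetical frame size holder, for illustration only. */
typedef struct {
    int width;
    int height;
} FrameSize;

/* Pixel-based comparison (what the commit message says was being used). */
static int same_pixel_size(FrameSize a, FrameSize b) {
    return a.width == b.width && a.height == b.height;
}

/* Block-based comparison (what section 7.9.2 asks for). */
static int same_mi_size(FrameSize a, FrameSize b) {
    return mi_units(a.width) == mi_units(b.width) &&
           mi_units(a.height) == mi_units(b.height);
}
```

For example, a 1913x1080 reference and a 1920x1080 current frame differ in pixel size but both map to 480x270 in 4x4 units, so only the block-based check treats the reference as usable.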
    • Fix calculation of OBMC lap dimensions · eb25f00c
      David Conrad authored
      Individual OBMC lapped predictions have a max width of 64 pixels
      for the top laps and a max height of 64 pixels for the left laps.
      
      This is from 7.11.3.9, "Overlapped motion compensation process":
      step4 = Clip3( 2, 16, Num_4x4_Blocks_Wide[ candSz ] )
      
      dav1d wasn't clipping this as needed, which means that with scaled MC, the
      interpolation of the 2nd half of a 128 block was incorrect, since mx/my
      for subpel filter selection need to be reset at the 64 pixel boundary.
      eb25f00c
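The clipping quoted from 7.11.3.9 can be written out as a minimal sketch. `clip3` mirrors the spec's Clip3(lo, hi, x). The two `obmc_*` helpers are hypothetical names, and they take the candidate block width in 4x4 units directly instead of looking it up through Num_4x4_Blocks_Wide[ candSz ].

```c
/* Clip3(lo, hi, x) as used in the AV1 specification. */
static int clip3(int lo, int hi, int x) {
    return x < lo ? lo : x > hi ? hi : x;
}

/* Hypothetical helper: blocks_wide stands in for
 * Num_4x4_Blocks_Wide[ candSz ] (block width in 4x4 units). */
static int obmc_step4(int blocks_wide) {
    return clip3(2, 16, blocks_wide);
}

/* Width in pixels of one top lap: 4 pixels per 4x4 unit. */
static int obmc_lap_width_px(int blocks_wide) {
    return 4 * obmc_step4(blocks_wide);
}
```

A 128-pixel-wide candidate is 32 units wide, so the step is clipped to 16 units, that is 64 pixels, and the block is covered by two laps of 64 pixels each.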
    • Support film grain application whose only effect is clipping to video range · 10f5ce54
      David Conrad authored
      This is the parameter combination:
      num_y_points == 0 && num_cb_points == 0 && num_cr_points == 0 &&
      chroma_scaling_from_luma == 1 && clip_to_restricted_range == 1
      
      Film grain application has two effects: adding noise, and optionally
      clipping to video range.
      
      For luma, the spec skips film grain application if there's no noise
      (num_y_points == 0), but for chroma, it's only skipped if there's no
      chroma noise *and* chroma_scaling_from_luma is false.
      
      This means it's possible for there to be no noise (num_*_points == 0), but
      if clip_to_restricted_range is true then chroma pixels can be clipped to
      video range, if chroma_scaling_from_luma is true. Luma pixels, however,
      aren't clipped to video range unless there's noise to apply.
      
      dav1d currently skips applying film grain entirely if there is no noise,
      regardless of the secondary clipping.
      10f5ce54
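The skip conditions described in this commit message can be summarised as predicates. This is a sketch only: `GrainParams` is a hypothetical struct whose fields are named after the syntax elements mentioned above, and the helper functions are not dav1d's.

```c
#include <stdbool.h>

/* Hypothetical subset of the film grain parameters. */
typedef struct {
    int num_y_points;
    int num_cb_points;
    int num_cr_points;
    bool chroma_scaling_from_luma;
    bool clip_to_restricted_range;
} GrainParams;

/* Luma: film grain application is skipped when there is no luma noise. */
static bool luma_grain_applied(const GrainParams *p) {
    return p->num_y_points > 0;
}

/* Chroma: skipped only when there is no chroma noise for the plane
 * AND chroma_scaling_from_luma is false. */
static bool chroma_grain_applied(const GrainParams *p, int num_points) {
    return num_points > 0 || p->chroma_scaling_from_luma;
}

/* The combination from the commit message: no noise anywhere, yet the
 * chroma planes still go through the process and get clipped. */
static bool clip_only_case(const GrainParams *p) {
    return p->num_y_points == 0 && p->num_cb_points == 0 &&
           p->num_cr_points == 0 && p->chroma_scaling_from_luma &&
           p->clip_to_restricted_range;
}
```

In the clip-only case, `luma_grain_applied` is false while `chroma_grain_applied` is true for both chroma planes, which is why skipping the whole process when there is no noise misses the chroma clipping.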
    • Ignore T.35 metadata if the OBU contains no payload · 673ee248
      David Conrad authored
      The syntax of itu_t_t35_payload_bytes is not defined in the AV1
      specification, but it does state that decoders should ignore the
      entire OBU if they do not understand it.
      673ee248
    • Fix chroma deblock filter size calculation for lossless · 2152826b
      David Conrad authored
      In section 5.11.34, txSz is always set to TX_4X4 if Lossless is true.
      
      The chroma deblock filter size calculation needs to use this overridden
      txSz when lossless is enabled.
      2152826b
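As a minimal sketch of the override, the function below forces the transform size to TX_4X4 when lossless is on. The `TxSize` enum lists only the square sizes for brevity, and `effective_tx_size` and its `coded` parameter are hypothetical names, not dav1d's.

```c
#include <stdbool.h>

/* Square transform sizes only, for brevity. */
typedef enum { TX_4X4, TX_8X8, TX_16X16, TX_32X32, TX_64X64 } TxSize;

/* Hypothetical helper: 'coded' is the transform size that would be used
 * without the lossless override. With lossless on, the result is always
 * TX_4X4, and that value should feed the deblock filter size calculation. */
static TxSize effective_tx_size(bool lossless, TxSize coded) {
    return lossless ? TX_4X4 : coded;
}
```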
    • Fix rounding in the calculation of initialSubpelX · e202fa08
      David Conrad authored
      The spec divides err by two, rounding toward 0, instead of using >>1,
      which rounds toward negative infinity.
      e202fa08
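The two rounding behaviours only differ for negative odd values, which a short sketch makes visible. The function names are hypothetical. Note that right-shifting a negative signed integer is implementation-defined in C; the sketch assumes the arithmetic shift that mainstream compilers provide.

```c
/* Halve with rounding toward zero: C integer division truncates. */
static int half_toward_zero(int err) {
    return err / 2;
}

/* Halve with rounding toward negative infinity.
 * Assumes an arithmetic right shift for negative values, which is
 * implementation-defined in C but is what GCC, Clang and MSVC do. */
static int half_toward_neg_inf(int err) {
    return err >> 1;
}
```

For err = -3, the division gives -1 while the shift gives -2. For non-negative or even values the two agree.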