Spec fixes for Argon bitstreams

Merged David Conrad requested to merge dconrad/dav1d:spec-fixes into master
  1. Sep 15, 2022
    • Fix checking the reference dimensions for the projection process · d4a2b75d
      David Conrad authored
      Section 7.9.2 returns 0 "If RefMiRows[ srcIdx ] is not equal to MiRows,
      RefMiCols[ srcIdx ] is not equal to MiCols"
      
      dav1d was comparing pixel width/height, not block width/height,
      so compare the block dimensions to conform with the spec
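
The difference between the two comparisons can be sketched as follows. This is an illustration only, not dav1d's code: `mi_units` and `ref_dims_match` are hypothetical helper names, and the formula assumes the spec's derivation of MiCols/MiRows from the frame size (4x4 units, with the frame size rounded up to a multiple of 8 pixels).

```c
#include <assert.h>

/* Frame dimension in 4x4 "mode info" units, assuming the spec's
 * derivation MiCols = 2 * ((FrameWidth + 7) >> 3) (same for rows). */
static int mi_units(int pixels) {
    assert(pixels > 0);
    return 2 * ((pixels + 7) >> 3);
}

/* Hypothetical check: a reference is usable for projection when its
 * block dimensions match, even if its pixel dimensions differ. */
static int ref_dims_match(int ref_w, int ref_h, int cur_w, int cur_h) {
    return mi_units(ref_w) == mi_units(cur_w) &&
           mi_units(ref_h) == mi_units(cur_h);
}
```

Under these assumptions a 1918x1078 reference and a 1920x1080 frame have the same block dimensions, so a pixel comparison would reject a reference that the block comparison accepts.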
    • Fix calculation of OBMC lap dimensions · eb25f00c
      David Conrad authored
      Individual OBMC lapped predictions have a max width of 64 pixels
      for the top laps and a max height of 64 pixels for the left laps
      
      This is 7.11.3.9. Overlapped motion compensation process
      step4 = Clip3( 2, 16, Num_4x4_Blocks_Wide[ candSz ] )
      
      dav1d wasn't clipping this as needed, which means that with scaled MC, the
      interpolation of the 2nd half of a 128 block was incorrect, since mx/my
      for subpel filter selection need to be reset at the 64 pixel boundary
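
A minimal sketch of the clipping described above, assuming a simplified case where a single neighbouring candidate spans the whole block edge. `obmc_lap_px` and `obmc_lap_count` are hypothetical names used for illustration, not dav1d functions.

```c
#include <assert.h>

/* Clip3(lo, hi, x) as used in the spec. */
static int clip3(int lo, int hi, int x) {
    return x < lo ? lo : (x > hi ? hi : x);
}

/* Size in pixels of one lapped prediction for a candidate that is
 * cand_4x4 units of 4 pixels wide (top lap) or tall (left lap). */
static int obmc_lap_px(int cand_4x4) {
    assert(cand_4x4 > 0);
    return 4 * clip3(2, 16, cand_4x4);
}

/* Number of separate predictions needed to cover a block edge of
 * block_4x4 units when each step is clipped to at most 16 units. */
static int obmc_lap_count(int block_4x4, int cand_4x4) {
    int step4 = clip3(2, 16, cand_4x4);
    return (block_4x4 + step4 - 1) / step4;
}
```

For a 128-pixel edge (32 units) this gives two 64-pixel laps, which is where the subpel position has to be recomputed when scaled MC is in use.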
    • Support film grain application whose only effect is clipping to video range · 10f5ce54
      David Conrad authored
      This is the parameter combination:
      num_y_points == 0 && num_cb_points == 0 && num_cr_points == 0 &&
      chroma_scaling_from_luma == 1 && clip_to_restricted_range == 1
      
      Film grain application has two effects: adding noise, and optionally
      clipping to video range
      
      For luma, the spec skips film grain application if there's no noise
      (num_y_points == 0), but for chroma, it's only skipped if there's no
      chroma noise *and* chroma_scaling_from_luma is false
      
      This means it's possible for there to be no noise (num_*_points = 0), but
      if clip_to_restricted_range is true then chroma pixels can be clipped to
      video range, if chroma_scaling_from_luma is true. Luma pixels, however,
      aren't clipped to video range unless there's noise to apply.

      dav1d currently skips applying film grain entirely if there is no noise,
      regardless of the secondary clipping.
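
The per-plane skip conditions described above can be written as predicates. This is a sketch under the assumptions stated in the commit message; `GrainParams` is a hypothetical subset of the film grain parameters and none of the function names come from dav1d.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of the film grain parameters. */
typedef struct {
    int  num_y_points, num_cb_points, num_cr_points;
    bool chroma_scaling_from_luma;
    bool clip_to_restricted_range;
} GrainParams;

/* Luma is processed only when there is luma noise. */
static bool luma_grain_applied(const GrainParams *p) {
    assert(p);
    return p->num_y_points > 0;
}

/* Chroma is skipped only when there is no chroma noise *and*
 * chroma_scaling_from_luma is false. */
static bool cb_grain_applied(const GrainParams *p) {
    assert(p);
    return p->num_cb_points > 0 || p->chroma_scaling_from_luma;
}

static bool cr_grain_applied(const GrainParams *p) {
    assert(p);
    return p->num_cr_points > 0 || p->chroma_scaling_from_luma;
}

/* Film grain can be skipped entirely only if no plane is processed. */
static bool grain_has_any_effect(const GrainParams *p) {
    return luma_grain_applied(p) || cb_grain_applied(p) || cr_grain_applied(p);
}
```

With all three point counts at zero and chroma_scaling_from_luma set, the luma predicate is false but both chroma predicates are true, so the chroma planes still go through the clipping step.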
    • Ignore T.35 metadata if the OBU contains no payload · 673ee248
      David Conrad authored
      The syntax of itu_t_t35_payload_bytes is not defined in the AV1
      specification, but the specification does state that decoders should
      ignore the entire OBU if they do not understand it.
    • Fix chroma deblock filter size calculation for lossless · 2152826b
      David Conrad authored
      In section 5.11.34, txSz is always set to TX_4X4 if Lossless is true
      
      Chroma deblock filter size calculation needs to use this overridden txSz
      when lossless is enabled
    • Fix rounding in the calculation of initialSubpelX · e202fa08
      David Conrad authored
      The spec divides err by two, rounding towards zero, rather than using >>1,
      which rounds towards negative infinity
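
The two roundings only differ for negative odd values, which a short sketch makes concrete. The helper names are hypothetical, and the shift variant relies on arithmetic right shift of negative values, which is implementation-defined in C but is what common compilers provide.

```c
#include <assert.h>

/* Integer division by two: C truncates towards zero. */
static int half_toward_zero(int err) {
    return err / 2;
}

/* Right shift by one: rounds towards negative infinity, assuming the
 * compiler implements >> on negative ints as an arithmetic shift. */
static int half_floor(int err) {
    assert((-1 >> 1) == -1); /* sanity check of the assumption above */
    return err >> 1;
}
```

For err = -3 the division yields -1 while the shift yields -2; for non-negative or even values the two agree.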
    • Fix overflow when saturating dequantized coefficients clipped to 0 · ee98592b
      David Conrad authored
      It's possible to encode a large coefficient that becomes 0 after
      the clipping in dequant (Abs( dq ) & 0xFFFFFF), e.g. 0x1000000
      After that &0xFFFFFF, coeffs are saturated in the range of
      [-(1 << (bitdepth+7)), 1 << (bitdepth+7))
      
      dav1d implements this saturation via umin(dq - sign, cf_max), then applies
      the sign afterwards via xor. However, for dq = 0 and sign = 1, this step
      evaluates to umin(UINT_MAX, cf_max) == cf_max instead of the expected 0.
      
      So instead, do unsigned saturate as umin(dq, cf_max + sign),
      then apply sign via (sign ? -dq : dq)
      On arm this is the same number of instructions, since cneg exists and is used
      On x86 this requires an additional instruction, but this isn't a
      latency-critical path
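
The before and after forms can be compared in isolation. This is a sketch of the two expressions quoted above, not the surrounding dav1d code: `saturate_old` and `saturate_new` are hypothetical wrappers, and cf_max is assumed to be (1 << (bitdepth + 7)) - 1.

```c
#include <assert.h>

static unsigned umin(unsigned a, unsigned b) {
    return a < b ? a : b;
}

/* Old form: saturate (dq - sign), then apply the sign with xor.
 * For dq == 0 and sign == 1, dq - sign wraps around to UINT_MAX. */
static int saturate_old(unsigned dq, int sign, unsigned cf_max) {
    assert(sign == 0 || sign == 1);
    unsigned s = (unsigned)sign;
    return (int)(umin(dq - s, cf_max) ^ (0u - s));
}

/* New form: saturate dq against (cf_max + sign), then negate. */
static int saturate_new(unsigned dq, int sign, unsigned cf_max) {
    assert(sign == 0 || sign == 1);
    int v = (int)umin(dq, cf_max + (unsigned)sign);
    return sign ? -v : v;
}
```

With cf_max = 32767 (8-bit), saturate_old(0, 1, cf_max) returns -32768 while saturate_new(0, 1, cf_max) returns 0; both forms agree for nonzero dq.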
    • Fix overflow in 8-bit NEON ADST · 1bdb776c
      David Conrad authored
      In 8-bit adst, it's possible that the final Round2(x[0], 12) can exceed
      16-bits signed
      
      Specifically, in 7.13.2.6. Inverse ADST4 process, the precision requirement is:
      "It is a requirement of bitstream conformance that all values stored in the
      s and x arrays by this process are representable by a signed integer using
      r + 12 bits of precision."
      
      For 8 bits, r is 16 for both row and column, so x[] can be 28-bit signed.
      For values [134215680, 134217727] (within 2047 of the maximum 28-bit value),
      the final Round2(x[0], 12) evaluates to 32768, exceeding 16-bits signed.
      
      So switch to using sqrshrn, which saturates to 16-bits signed
      
      This is a continuation of: Commit b53ff29d
      arm: itx: Do clipping in all narrowing downshifts
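
The overflow range quoted above can be checked with scalar arithmetic. This sketch models the rounding shift and a saturating narrow in plain C; `round2` and `round2_sat16` are hypothetical helpers meant to mirror what a non-saturating versus a saturating (sqrshrn-style) narrowing shift would produce, not the NEON code itself.

```c
#include <assert.h>
#include <stdint.h>

/* Round2(x, n) for the non-negative inputs used here. */
static int32_t round2(int32_t x, int n) {
    assert(n > 0 && n < 31);
    return (x + (1 << (n - 1))) >> n;
}

/* Rounding shift followed by saturation to the int16_t range. */
static int16_t round2_sat16(int32_t x, int n) {
    int32_t r = round2(x, n);
    if (r > INT16_MAX) r = INT16_MAX;
    if (r < INT16_MIN) r = INT16_MIN;
    return (int16_t)r;
}
```

round2(134215680, 12) is 32768, one past INT16_MAX, while round2(134215679, 12) is 32767; the saturating variant returns 32767 across the whole affected range.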