Spec fixes for Argon bitstreams (!1450) · Merge requests · VideoLAN / dav1d

Merged David Conrad requested to merge dconrad/dav1d:spec-fixes into master 2 years ago

Sep 15, 2022

Fix checking the reference dimensions for the projection process · d4a2b75d

David Conrad authored 2 years ago

Section 7.9.2 returns 0 "If RefMiRows[ srcIdx ] is not equal to MiRows,
RefMiCols[ srcIdx ] is not equal to MiCols"

dav1d was comparing pixel width/height, not block width/height,
so compare the block dimensions to conform with the spec

d4a2b75d
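
A minimal sketch of the dimension check described in this commit, not dav1d's actual code. The helper names are hypothetical; the block-unit formula follows the spec's MiCols = 2 * ((FrameWidth + 7) >> 3). Only the dimension part of the 7.9.2 test is modelled here.

```c
/* Hypothetical helper: frame size in 4x4 "mode info" units, following the
 * spec's MiCols = 2 * ((FrameWidth + 7) >> 3); MiRows uses the same form. */
static int mi_units(int pixels) {
    return 2 * ((pixels + 7) >> 3);
}

/* Dimension part of the 7.9.2 check only: nonzero when the reference
 * frame matches the current frame when measured in block units. */
static int ref_dims_match(int ref_w, int ref_h, int cur_w, int cur_h) {
    return mi_units(ref_w) == mi_units(cur_w) &&
           mi_units(ref_h) == mi_units(cur_h);
}
```

Under these assumptions a 1916-pixel-wide reference still matches a 1920-pixel-wide frame, since both round up to the same number of block columns, whereas a pixel comparison would reject it.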

Fix calculation of OBMC lap dimensions · eb25f00c

David Conrad authored 2 years ago

Individual OBMC lapped predictions have a max width of 64 pixels
for the top lap and have a max height of 64 for the left laps

This is 7.11.3.9. Overlapped motion compensation process
step4 = Clip3( 2, 16, Num_4x4_Blocks_Wide[ candSz ] )

dav1d wasn't clipping this as needed, which means that with scaled MC, the
interpolation of the 2nd half of a 128 block was incorrect, since mx/my
for subpel filter selection need to be reset at the 64 pixel boundary

eb25f00c
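
The clipping quoted from 7.11.3.9 can be sketched as follows. This is an illustration only; `clip3` and `obmc_step4` are hypothetical names, and the argument is the candidate block width in 4x4 units.

```c
/* Clip3(lo, hi, x) as defined in the AV1 spec. */
static int clip3(int lo, int hi, int x) {
    return x < lo ? lo : (x > hi ? hi : x);
}

/* Width of one top lap in 4x4 units: step4 = Clip3(2, 16, num4x4Wide).
 * The upper bound of 16 units is what caps a single lap at 64 pixels. */
static int obmc_step4(int cand_w4) {
    return clip3(2, 16, cand_w4);
}
```

For a 128-pixel-wide neighbour (32 units) the step is 16 units, so the prediction is built as two 64-pixel laps rather than one 128-pixel lap.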

Support film grain application whose only effect is clipping to video range · 10f5ce54

David Conrad authored 2 years ago

This is the parameter combination:
num_y_points == 0 && num_cb_points == 0 && num_cr_points == 0 &&
chroma_scaling_from_luma == 1 && clip_to_restricted_range == 1

Film grain application has two effects: adding noise, and optionally
clipping to video range

For luma, the spec skips film grain application if there's no noise
(num_y_points == 0), but for chroma, it's only skipped if there's no
chroma noise *and* chroma_scaling_from_luma is false

This means it's possible for there to be no noise (num_*_points = 0), but
if clip_to_restricted_range is true then chroma pixels can be clipped to
video range, if chroma_scaling_from_luma is true. Luma pixels, however,
aren't clipped to video range unless there's noise to apply.
dav1d currently skips applying film grain entirely if there is no noise,
regardless of the secondary clipping.

10f5ce54
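
The skip conditions described in this commit message can be sketched as predicates. The struct and function names below are hypothetical and only mirror the spec's syntax element names; they are not dav1d's data structures.

```c
/* Hypothetical container for the syntax elements named in the message. */
struct fg_params {
    int num_y_points;
    int num_cb_points;
    int num_cr_points;
    int chroma_scaling_from_luma;
    int clip_to_restricted_range;
};

/* Luma is processed only when there is luma noise. */
static int fg_applies_to_luma(const struct fg_params *p) {
    return p->num_y_points > 0;
}

/* Chroma is skipped only if there is no chroma noise *and*
 * chroma_scaling_from_luma is false. */
static int fg_applies_to_cb(const struct fg_params *p) {
    return p->num_cb_points > 0 || p->chroma_scaling_from_luma;
}

static int fg_applies_to_cr(const struct fg_params *p) {
    return p->num_cr_points > 0 || p->chroma_scaling_from_luma;
}

/* Film grain must run if any plane is processed, even when the only
 * visible effect would be clipping to the restricted range. */
static int fg_needed(const struct fg_params *p) {
    return fg_applies_to_luma(p) || fg_applies_to_cb(p) || fg_applies_to_cr(p);
}
```

With the parameter combination from the message, `fg_needed` is true even though no plane has noise points, which is the case the commit adds support for.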

Ignore T.35 metadata if the OBU contains no payload · 673ee248

David Conrad authored 2 years ago

The syntax of itu_t_t35_payload_bytes is not defined in the AV1
specification, but it does state that decoders should ignore the
entire OBU if they do not understand it.

673ee248

Fix chroma deblock filter size calculation for lossless · 2152826b

David Conrad authored 2 years ago

In section 5.11.34 txSz is always defined to TX_4X4 if Lossless is true

Chroma deblock filter size calculation needs to use this overridden txSz
when lossless is enabled

2152826b

Fix rounding in the calculation of initialSubpelX · e202fa08

David Conrad authored 2 years ago

The spec divides err by two, rounding to 0, instead of >>1,
which rounds towards negative infinity

e202fa08
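
The difference between the two roundings only shows up for negative odd values. The sketch below is illustrative; the function names are hypothetical, and the floor variant is written without `>>` because right-shifting a negative value is implementation-defined in C (it is an arithmetic shift on common compilers).

```c
/* Division by two as the spec does it: C integer division truncates
 * toward zero. */
static int half_toward_zero(int err) {
    return err / 2;
}

/* What an arithmetic err >> 1 computes: rounding toward negative
 * infinity. Subtract one when the truncated result was rounded up,
 * i.e. when err is negative and odd. */
static int half_toward_neg_inf(int err) {
    return err / 2 - (err % 2 < 0);
}
```

For example, an err of -3 gives -1 when divided and -2 when shifted, while non-negative and even inputs agree.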

Fix overflow when saturating dequantized coefficients clipped to 0 · ee98592b

David Conrad authored 2 years ago

It's possible to encode a large coefficient that becomes 0 after
the clipping in dequant (Abs( dq ) & 0xFFFFFF), e.g. 0x1000000
After that &0xFFFFFF, coeffs are saturated in the range of
[-(1 << (bitdepth+7)), 1 << (bitdepth+7))

dav1d implements this saturation via umin(dq - sign, cf_max), then applies
the sign afterwards via xor. However, for dq = 0 and sign = 1, this step
evaluates to umin(UINT_MAX, cf_max) == cf_max instead of the expected 0.

So instead, do unsigned saturate as umin(dq, cf_max + sign),
then apply sign via (sign ? -dq : dq)
On arm this is the same number of instructions, since cneg exists and is used
On x86 this requires an additional instruction, but this isn't a
latency-critical path

ee98592b
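
A scalar sketch of the two saturation schemes described in this commit message, assuming `sign` is 0 or 1 and `cf_max` is the positive coefficient limit. The function names are hypothetical and this is not the dav1d source; the old variant applies the sign arithmetically, which matches the xor with -sign on two's complement targets.

```c
#include <stdint.h>

static uint32_t umin(uint32_t a, uint32_t b) {
    return a < b ? a : b;
}

/* Old scheme: umin(dq - sign, cf_max), then sign applied via xor.
 * ~v equals -v - 1 in two's complement, so the xor is written that way.
 * For dq == 0 and sign == 1, dq - sign wraps to UINT32_MAX. */
static int32_t saturate_old(uint32_t dq, int sign, uint32_t cf_max) {
    const uint32_t v = umin(dq - (uint32_t)sign, cf_max);
    return sign ? -(int32_t)v - 1 : (int32_t)v;
}

/* New scheme: umin(dq, cf_max + sign), then (sign ? -dq : dq). */
static int32_t saturate_new(uint32_t dq, int sign, uint32_t cf_max) {
    const uint32_t v = umin(dq, cf_max + (uint32_t)sign);
    return sign ? -(int32_t)v : (int32_t)v;
}
```

With `cf_max` set to 0x7FFF, the old scheme turns a negative-signed zero into the most negative coefficient, while the new scheme returns 0 and still saturates large magnitudes to the same bounds.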

Fix overflow in 8-bit NEON ADST · 1bdb776c

David Conrad authored 2 years ago

In 8-bit adst, it's possible that the final Round2(x[0], 12) can exceed
16-bits signed

Specifically, in 7.13.2.6. Inverse ADST4 process, the precision requirement is:
"It is a requirement of bitstream conformance that all values stored in the
s and x arrays by this process are representable by a signed integer using
r + 12 bits of precision."

For 8 bits, r is 16 for both row and column, so x[] can be 28-bit signed.
For values [134215680, 134217727] (within 2047 of the maximum 28-bit value),
the final Round2(x[0], 12) evaluates to 32768, exceeding 16-bits signed.

So switch to using sqrshrn, which saturates to 16-bits signed

This is a continuation of: Commit b53ff29d
arm: itx: Do clipping in all narrowing downshifts

1bdb776c
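
The overflow can be reproduced in scalar C. The sketch below is an illustration under stated assumptions: `round2` follows the spec's Round2( x, n ) for the non-negative inputs of interest, and `sat_round_narrow16` is a hypothetical scalar model of what a saturating rounding narrowing shift such as sqrshrn does per lane.

```c
#include <stdint.h>

/* Round2(x, n) from the spec, used here for non-negative x only. */
static int32_t round2(int32_t x, int n) {
    return (x + (1 << (n - 1))) >> n;
}

/* Hypothetical scalar model of a saturating, rounding, narrowing shift:
 * round first, then clamp the result to the signed 16-bit range. */
static int16_t sat_round_narrow16(int32_t x, int n) {
    const int32_t r = round2(x, n);
    if (r > INT16_MAX) return INT16_MAX;
    if (r < INT16_MIN) return INT16_MIN;
    return (int16_t)r;
}
```

Here `round2(134215680, 12)` yields 32768, one past the signed 16-bit maximum, whereas the saturating variant returns 32767 for every value in the range quoted above.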
