src/film_grain.h · b6bb8536ad299d52a5ff49a4f0317b923ce6b8bb · Xuefeng Jiang / dav1d

film_grain: implement film grain synthesis · cfa986fe

Niklas Haas authored Nov 13, 2018 and Ronald S. Bultje committed Nov 19, 2018

This uses a slightly adapted version of my GPU-based algorithm. The
major difference from the algorithm suggested by the spec (and
implemented in libaom) is that instead of using a line buffer to hold
the previous row's film grain blocks, we compute each row/block fully
independently.

This opens the door to exploiting parallelism in the future, since we
don't have any left->right or top->down dependency except for the PRNG
state (which we could pre-compute for a massively parallel / GPU
implementation).
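
To illustrate the pre-computation idea, here is a minimal sketch, not
the code from this commit. It assumes a 16-bit LFSR in the style of the
AV1 film grain generator and a per-row seed derived only from the frame
seed and the row index. The names next_random, row_seed and
precompute_row_seeds, as well as the mixing constants, are hypothetical
and chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 16-bit LFSR step (hypothetical helper, modelled on the AV1
 * film grain generator). Advances *state and returns its top `bits` bits. */
static inline int next_random(const int bits, uint16_t *const state) {
    assert(bits >= 1 && bits <= 16);
    const unsigned r = *state;
    const unsigned bit = ((r >> 0) ^ (r >> 1) ^ (r >> 3) ^ (r >> 12)) & 1;
    *state = (uint16_t)((r >> 1) | (bit << 15));
    return (int)((*state >> (16 - bits)) & ((1u << bits) - 1));
}

/* Assumed seed derivation: the starting state of a block row depends only
 * on the frame seed and the row index, never on a neighbouring row. */
static inline uint16_t row_seed(const uint16_t frame_seed, const int row) {
    uint16_t s = frame_seed;
    s ^= (uint16_t)((((unsigned)row * 37 + 178) & 0xFF) << 8);
    s ^= (uint16_t)(((unsigned)row * 173 + 105) & 0xFF);
    return s;
}

/* Fill out[0..n_rows) up front; afterwards every row can be synthesized
 * in any order, or concurrently, from its own starting state. */
static void precompute_row_seeds(const uint16_t frame_seed, const int n_rows,
                                 uint16_t *const out) {
    for (int row = 0; row < n_rows; row++)
        out[row] = row_seed(frame_seed, row);
}
```

With the table filled in, a worker handling row N would start from
out[N] and call next_random() locally, so under these assumptions no
PRNG state has to be passed between rows.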

That being said, it's probably somewhat slower than using a line buffer
for the serial / single CPU case, although most likely not by much
(since the areas with the most redundant work get progressively smaller,
down to a single 2x2 square for the worst case).
