      Decreases runtime of decoding first 1000 frames of Chimera (1080p, 8-bit)
      from 12.227s to 12.075s (average of 6 runs) after changing decode.c, and
      further down to 12.027s (1.67%) with the changes to recon_tmpl.c included.
      After the changes to lf_mask.c, it goes down to 11.842s.