Fiona Glaser authored
This reduces overhead and lets us use less branchy code for zigzag, dequant, decimate, and so on. Reorganize and optimize a lot of macroblock_encode using this new function. ~1-2% faster overall. Includes NEON and x86 versions of the new function. Using larger merged functions like this will also make wider SIMD, like AVX2, more effective.
993c81e9