Changes

Ronald S. Bultje · 91188a57
--- a/task-list.md
+++ b/task-list.md
@@ -16,8 +16,7 @@ Multi-threading:

 Algorithmic optimizations:
 - prevent per-plane `memcpy()` if some (but not all) planes have film grain. We tried this before but it had to be reverted (#426, !1522). This probably needs per-plane `allocator_data`.
- the identity_* inverse transforms are stored transposed (as are all other coefficient tables). In all other cases, this saves a transpose in assembly, but for those, it actually means we have to transpose, even though in theory we wouldn't have to at all. Therefore, a potential optimization would be to have a special untransposed zigzag coefficient table and remove the transpose from the assembly, which would make those inverse transforms slightly faster;
- early exits in C inverse transform code if eob is small (e.g. identity^2 - although this applies to all types): !1682.
+- the identity_* inverse transforms are stored transposed (as are all other coefficient tables). In all other cases, this saves a transpose in assembly, but for those, it actually means we have to transpose, even though in theory we wouldn't have to at all. Therefore, a potential optimization would be to have a special untransposed zigzag coefficient table and remove the transpose from the assembly, which would make those inverse transforms slightly faster.

 Cleanups:
 - lfmask and l/a ctx zero can be done in tile instead of frame context for better distribution.