... | ... | @@ -16,8 +16,7 @@ Multi-threading: |
|
|
|
|
|
Algorithmic optimizations:
|
|
|
- prevent per-plane `memcpy()` if some (but not all) planes have film grain. We tried this before but it had to be reverted (#426, !1522). This probably needs per-plane `allocator_data`.
|
|
|
- the identity_* inverse transforms are stored transposed (as are all other coefficient tables). In all other cases, this saves a transpose in assembly, but for those, it actually means we have to transpose, even though in theory we wouldn't have to at all. Therefore, a potential optimization would be to have a special untransposed zigzag coefficient table and remove the transpose from the assembly, which would make those inverse transforms slightly faster;
|
|
|
- early exits in C inverse transform code if eob is small (e.g. identity^2 - although this applies to all types): !1682.
|
|
|
- the identity_* inverse transforms are stored transposed (as are all other coefficient tables). In all other cases, this saves a transpose in assembly, but for those, it actually means we have to transpose, even though in theory we wouldn't have to at all. Therefore, a potential optimization would be to have a special untransposed zigzag coefficient table and remove the transpose from the assembly, which would make those inverse transforms slightly faster.
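
To illustrate the early-exit item above, here is a minimal sketch, not the code from !1682. It assumes a 4x4 identity-style transform in which each output residual depends only on the coefficient at the same position, and it assumes `eob` is the scan index of the last non-zero coefficient, so `eob == 0` means only DC is set. The function names, `scale_coef()` and its rounding are hypothetical placeholders and not the AV1 arithmetic.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Placeholder per-coefficient scale (NOT the AV1 identity scaling).
 * It maps 0 to 0, so zero coefficients leave the pixel unchanged. */
static int scale_coef(const int c) { return (c * 3 + 1) >> 1; }

static uint8_t clip_pixel(const int v) {
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t) v;
}

/* Reference path: visits all 16 coefficients, then clears them. */
static void inv_identity4x4_add_full(uint8_t *const dst, const ptrdiff_t stride,
                                     int16_t *const coef)
{
    for (int y = 0; y < 4; y++)
        for (int x = 0; x < 4; x++)
            dst[y * stride + x] =
                clip_pixel(dst[y * stride + x] + scale_coef(coef[y * 4 + x]));
    memset(coef, 0, 16 * sizeof(*coef));
}

/* Early-exit path (hypothetical): with eob == 0 only the DC coefficient is
 * non-zero, so only dst[0] can change and the 4x4 loop can be skipped. */
static void inv_identity4x4_add(uint8_t *const dst, const ptrdiff_t stride,
                                int16_t *const coef, const int eob)
{
    if (eob == 0) {
        dst[0] = clip_pixel(dst[0] + scale_coef(coef[0]));
        coef[0] = 0;
        return;
    }
    inv_identity4x4_add_full(dst, stride, coef);
}
```

For non-identity transform types a DC-only block affects every output pixel, so the shortcut there would instead skip the all-zero rows or columns of the first pass; the sketch only covers the identity case.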
|
|
|
|
|
|
Cleanups:
|
|
|
- zeroing of lfmask and l/a ctx can be done in tile instead of frame context for better distribution.
|
... | ... | |