... | @@ -16,6 +16,7 @@ Multi-threading: |
|
- show_existing_frame is placed in the frame output queue as an entry that keeps a frame thread busy, so in such cases the frame thread momentarily stalls. This is partially required to prevent the output queue from overflowing, or from growing without bound on garbage input. For the regular use case, however, it makes sense to decouple the input and output queues so that show_existing_frame does not affect how many frames are actively being processed.
|
|
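The decoupling described above can be sketched as follows. This is a minimal, single-threaded illustration and not dav1d's actual data structures: `Decoder`, `OutQueue`, `OUT_CAP`, `submit_frame()`, `submit_show_existing()` and `finish_frame()` are all hypothetical names. The assumption is that a regular frame needs both a free worker and an output slot, while a show_existing_frame entry needs only an output slot, and that the output queue stays bounded so garbage input cannot grow it indefinitely.

```c
#include <assert.h>
#include <stdbool.h>

enum { OUT_CAP = 8 }; /* hypothetical bound on the output queue */

typedef struct {
    int frame_id[OUT_CAP]; /* ring buffer of frames awaiting output */
    int head, count;
} OutQueue;

typedef struct {
    int n_workers;    /* number of frame threads */
    int busy_workers; /* frame threads currently decoding */
    OutQueue out;
} Decoder;

/* Append to the bounded output queue; fails when the queue is full. */
static bool out_push(OutQueue *q, int id) {
    if (q->count == OUT_CAP) return false;
    q->frame_id[(q->head + q->count) % OUT_CAP] = id;
    q->count++;
    return true;
}

/* Remove the oldest entry from the output queue; fails when empty. */
static bool out_pop(OutQueue *q, int *id) {
    if (q->count == 0) return false;
    *id = q->frame_id[q->head];
    q->head = (q->head + 1) % OUT_CAP;
    q->count--;
    return true;
}

/* A regular frame claims a worker and reserves an output slot. */
static bool submit_frame(Decoder *d, int id) {
    if (d->busy_workers == d->n_workers || d->out.count == OUT_CAP)
        return false;
    d->busy_workers++;
    return out_push(&d->out, id);
}

/* show_existing_frame decodes nothing, so it only takes an output slot. */
static bool submit_show_existing(Decoder *d, int id) {
    return out_push(&d->out, id);
}

/* Called when a worker finishes decoding its frame. */
static void finish_frame(Decoder *d) {
    assert(d->busy_workers > 0);
    d->busy_workers--;
}
```

In this sketch the caller still gets back-pressure when `out_push()` fails, which keeps the overflow protection, but show_existing_frame entries no longer reduce the number of frames that can be decoded concurrently.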
|
|
|
|
|
Algorithmic optimizations:
|
|
|
|
|
- avoid the per-plane `memcpy()` if some (but not all) planes have film grain. We tried this before but it had to be reverted (#426, !1522). This probably needs per-plane `allocator_data`.
|
|
- the coefficients of the identity_* inverse transforms are stored transposed (as are all other coefficient tables). In all other cases this saves a transpose in assembly, but for the identity transforms it means we have to transpose even though, in theory, no transpose would be needed at all. A potential optimization is therefore to use a special untransposed zigzag coefficient table and remove the transpose from the assembly, which would make those inverse transforms slightly faster;
|
|
|
- add early exits to the C inverse transform code when eob is small (e.g. for identity^2, although this applies to all transform types).
|
|
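To illustrate the identity_* item above: an identity transform only scales each coefficient in place, so with transposed storage the transpose is pure overhead. The toy sketch below assumes a 4x4 block and an illustrative scale factor of 2; `identity2d()`, `inv_from_transposed()` and `inv_from_untransposed()` are hypothetical helpers, not dav1d functions. Both paths should produce the same result.

```c
#include <assert.h>
#include <stdint.h>

enum { TX = 4 }; /* toy block size */

/* Toy identity^2 "transform": an element-wise scale (factor is illustrative). */
static void identity2d(int32_t *out, const int16_t *in) {
    for (int i = 0; i < TX * TX; i++)
        out[i] = in[i] * 2;
}

/* Plain out-of-place transpose of a TX x TX block. */
static void transpose(int16_t *out, const int16_t *in) {
    for (int y = 0; y < TX; y++)
        for (int x = 0; x < TX; x++)
            out[x * TX + y] = in[y * TX + x];
}

/* Path A: coefficients arrive transposed, so they must be transposed
 * back before the element-wise scale can be applied. */
static void inv_from_transposed(int32_t *out, const int16_t *coef_t) {
    int16_t tmp[TX * TX];
    transpose(tmp, coef_t);
    identity2d(out, tmp);
}

/* Path B: coefficients arrive untransposed, so no transpose is needed. */
static void inv_from_untransposed(int32_t *out, const int16_t *coef) {
    identity2d(out, coef);
}
```

Path B performs the same arithmetic with one fewer pass over the block, which is the saving an untransposed coefficient table would make possible.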
|
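The early-exit idea for small eob can be sketched in C as below. This is a toy under stated assumptions, not dav1d's code: it assumes a 4x4 identity^2 block, a row-major scan order (so every coefficient after index `eob` is zero), and an illustrative scale factor of 2. `inv_identity_add_full()` and `inv_identity_add_eob()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { BLK = 4 }; /* toy block size */

static uint8_t clip_u8(int v) {
    return (uint8_t)(v < 0 ? 0 : v > 255 ? 255 : v);
}

/* Reference path: scales every coefficient and adds it to dst. */
static void inv_identity_add_full(uint8_t *dst, ptrdiff_t stride,
                                  const int16_t *coef)
{
    for (int y = 0; y < BLK; y++)
        for (int x = 0; x < BLK; x++)
            dst[y * stride + x] =
                clip_u8(dst[y * stride + x] + coef[y * BLK + x] * 2);
}

/* Early-exit path: eob is the index of the last non-zero coefficient.
 * With the assumed row-major scan, rows after eob / BLK hold only zeros
 * and would leave dst unchanged, so they are skipped entirely. */
static void inv_identity_add_eob(uint8_t *dst, ptrdiff_t stride,
                                 const int16_t *coef, int eob)
{
    assert(eob >= 0 && eob < BLK * BLK);
    const int last_row = eob / BLK;
    for (int y = 0; y <= last_row; y++)
        for (int x = 0; x < BLK; x++)
            dst[y * stride + x] =
                clip_u8(dst[y * stride + x] + coef[y * BLK + x] * 2);
}
```

With a real scan order, the mapping from eob to the set of rows or columns that may contain non-zero coefficients would differ per transform type, but the principle of bounding the loops by eob stays the same.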
|
|
|