| ... | ... | @@ -25,6 +25,7 @@ Removing redundancies: |
|
|
|
- bonus points for merging the CDEF backup and LR backup together so LR backs up nothing at all;
|
|
|
|
- OBMC blend masks end in a run of zeroes covering one quarter of their length, so would there be gains if we processed only 0.75 of the current height (for mc and/or blend)? Would this affect SIMD design in some unwanted way?
|
|
|
|
- for inter, test whether 128x128 prediction should be done in 64x64 subblocks to improve cache efficiency (and also simplify SIMD);
|
|
|
|
- the coefficients for the identity_identity inverse transforms are stored transposed (as are all other coefficient tables). In all other cases this saves a transpose in assembly, but for identity^2 it forces a transpose that in theory would not be needed at all. A potential optimization is therefore a special untransposed zigzag coefficient table for identity^2, with the transpose removed from the assembly, which would make identity^2 inverse transforms slightly faster.
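
To illustrate the identity^2 item above, here is a minimal sketch in C. It assumes a toy identity pass that only scales each coefficient (the real scaling and rounding differ per transform size), and all function names are hypothetical rather than taken from the codebase. Because an elementwise pass never mixes neighbouring coefficients, it commutes with a transpose, so reading untransposed coefficients and skipping the transpose should give the same block as the current transposed path.

```c
#include <stdint.h>
#include <string.h>

#define N 4 /* toy block size, chosen only for illustration */

/* Toy stand-in for one identity pass: each coefficient is scaled on its
 * own, so no coefficient ever mixes with a neighbour. */
static void identity_pass(int32_t *blk, int scale) {
    for (int i = 0; i < N * N; i++)
        blk[i] *= scale;
}

static void transpose(int32_t *dst, const int32_t *src) {
    for (int y = 0; y < N; y++)
        for (int x = 0; x < N; x++)
            dst[x * N + y] = src[y * N + x];
}

/* Path A: coefficients arrive transposed, so the output must be
 * transposed back after both passes. */
static void inv_identity2_transposed(int32_t *out, const int32_t *coef_t) {
    int32_t tmp[N * N];
    memcpy(tmp, coef_t, sizeof(tmp));
    identity_pass(tmp, 2); /* first pass */
    identity_pass(tmp, 2); /* second pass */
    transpose(out, tmp);
}

/* Path B: coefficients arrive untransposed, so no transpose is needed. */
static void inv_identity2_untransposed(int32_t *out, const int32_t *coef) {
    memcpy(out, coef, sizeof(int32_t) * N * N);
    identity_pass(out, 2);
    identity_pass(out, 2);
}

/* Returns 1 when both paths produce the same block. */
static int identity2_paths_match(void) {
    int32_t coef[N * N], coef_t[N * N], a[N * N], b[N * N];
    for (int i = 0; i < N * N; i++)
        coef[i] = i - 5; /* arbitrary test coefficients */
    transpose(coef_t, coef);
    inv_identity2_transposed(a, coef_t);
    inv_identity2_untransposed(b, coef);
    return memcmp(a, b, sizeof(a)) == 0;
}
```

Calling `identity2_paths_match()` compares the two paths on one block. The sketch says nothing about the actual speed gain, which would have to be measured on the real assembly.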
|
|
|
|
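
For the OBMC mask item in the list above, the following C sketch shows why stopping at 0.75 of the height would be lossless for the blend step. It assumes a 6-bit blend of the form `(a * (64 - m) + b * m + 32) >> 6`, one mask weight per row, and made-up mask values whose last quarter is zero; the helper names and the buffer layout are hypothetical. With `m == 0` the blend returns the destination pixel unchanged, so the trailing rows can be skipped.

```c
#include <stdint.h>
#include <string.h>

/* 6-bit blend: m is the weight of b out of 64. With m == 0 the result
 * is exactly a, since (64 * a + 32) >> 6 == a. */
static inline uint8_t blend_px(uint8_t a, uint8_t b, int m) {
    return (uint8_t)((a * (64 - m) + b * m + 32) >> 6);
}

/* Blend h rows of width w, using one mask weight per row
 * (assumed layout, for illustration only). */
static void blend_rows(uint8_t *dst, const uint8_t *tmp,
                       const uint8_t *mask, int w, int h) {
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++)
            dst[y * w + x] = blend_px(dst[y * w + x], tmp[y * w + x], mask[y]);
}

/* Returns 1 when blending 3/4 of the rows matches blending all rows. */
static int three_quarter_height_matches(void) {
    enum { W = 8, H = 8 };
    /* Illustrative weights only; the last quarter (2 of 8) is zero. */
    static const uint8_t mask[H] = { 28, 22, 16, 11, 7, 3, 0, 0 };
    uint8_t full[W * H], part[W * H], tmp[W * H];
    for (int i = 0; i < W * H; i++) {
        full[i] = part[i] = (uint8_t)(i * 3);
        tmp[i] = (uint8_t)(255 - i);
    }
    blend_rows(full, tmp, mask, W, H);            /* all rows */
    blend_rows(part, tmp, mask, W, (H * 3) >> 2); /* first 3/4 only */
    return memcmp(full, part, sizeof(full)) == 0;
}
```

This only covers correctness of the blend. Whether the mc step can also stop early, and whether a height that is not a power of two complicates SIMD loops, remains the open question from the list item.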
|
|
|
|
Other speed optimizations:
|
|
|
|
- film-grain GL shader (like [placebo](https://github.com/haasn/libplacebo/blob/master/src/shaders/av1.c));
|
| ... | ... | |
| ... | ... | |