... | ... | @@ -21,7 +21,7 @@ Missing software features: |
|
|
- eliminate triggerable assert() from the library.
|
|
|
|
|
|
Performance optimizations:
|
|
|
- it may make sense to copy one row (8px+2x2px edges) of pre-cdef data in `uint16_t` at a time so we don't need to extend buffers or add edge data inside the SIMD. This may make the code both simpler *and* faster;
|
|
|
- it may make sense to copy one row (8px+2x2px edges) of pre-cdef data in `uint16_t` at a time so we don't need to extend buffers or add edge data inside the SIMD. This may make the code both simpler *and* faster. The same is true for looprestoration. In one far-fetched design, cdef/LR might be able to share the top buffers, thus reducing the amount of `memcpy` between the two;
|
|
|
- simd for any function already in a ${anything}DSPContext, for any platform (see #78 for AVX2);
|
|
|
- move emu_edge to dsp for simd;
|
|
|
- move dequant from `decode_coeffs()` to itx;
|
... | ... | |
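
To make the row-copy idea in the cdef item concrete, here is a minimal sketch of what it could look like. It is not code from the library. The names `copy_row_u16`, `BLOCK_W`, `EDGE`, `ROW_W` and `EDGE_FILL` are all hypothetical, it assumes 8-bit source pixels, and it reads "8px+2x2px edges" as 8 block pixels plus 2 edge pixels on each side. The sentinel value used for missing edges is likewise only an assumption for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: 8 block pixels plus 2 edge pixels on each side. */
enum { BLOCK_W = 8, EDGE = 2, ROW_W = BLOCK_W + 2 * EDGE };

/* Hypothetical sentinel for pixels that lie outside the frame. It is larger
 * than any 8-bit pixel value, so a filter can recognise and skip it. */
#define EDGE_FILL ((uint16_t)0x8000)

/* Copy one row of pre-filter pixels into a 16-bit scratch row.
 * `src` points at the first of the 8 block pixels; when have_left or
 * have_right is set, the 2 pixels on that side must be readable.
 * Missing edges are filled here, so the filter itself never has to
 * extend buffers or special-case the frame border. */
static void copy_row_u16(uint16_t dst[ROW_W], const uint8_t *src,
                         int have_left, int have_right)
{
    for (int x = -EDGE; x < BLOCK_W + EDGE; x++) {
        const int outside = (x < 0 && !have_left) ||
                            (x >= BLOCK_W && !have_right);
        dst[x + EDGE] = outside ? EDGE_FILL : src[x];
    }
}
```

The point of the sketch is that all edge handling lives in one small scalar copy, so a SIMD kernel could then load from `dst` unconditionally. Whether that is actually faster would need to be measured.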