... | @@ -29,6 +29,7 @@ Performance optimizations: |
|
- change coef contexting (hi/lo_ctx) to be diagonal-oriented for dsp/simd;
|
|
- change multi-symbol coding `read_symbol()` symbol discovery loop and adaptivity to be simd'ed [Rostislav expressed interest in this];
|
|
- project_motion_field in `ref_mvs.c` can be SIMD'ed;
|
|
- `memset()` calls for context setting in coefficient decoding (`decode_coeffs()` in `recon.c`) and block decoding (`decode_b()` in `decode.c`) can be optimized, similar to ffvp9, by using `switch`/`case` pairs that perform constant-size writes instead of calling `memset()`. For an example, see `SPLAT_CTX()` in `vp9block.c` in FFmpeg;
|
|
- postfilter threading;
|
|
- threading can become a generic worker queue (one tile_sbrow symbol parsing/recon, one sbrow postfilter(s)) and then use a generic single threadpool instead of separate tile/frame[/postfilter?] ones.
|
... | | ... | |
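The `memset()` item above can be illustrated with a minimal sketch. It is not the decoder's code and not FFmpeg's `SPLAT_CTX()` macro: the helper name `splat_ctx()` and the set of handled sizes are assumptions made for illustration. The idea is that context runs are usually short and have a small number of possible lengths, so a `switch` on the length lets the compiler emit one or two fixed-width stores per case instead of a general `memset()` call.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of dav1d or FFmpeg): write `n` copies of
 * `val` into a context array. Common small sizes get their own case so
 * that each one uses copies of a compile-time constant size. */
static inline void splat_ctx(uint8_t *dst, uint8_t val, int n)
{
    /* Replicate the byte into all eight lanes of a 64-bit word. */
    const uint64_t v64 = val * 0x0101010101010101ULL;

    switch (n) {
    case 1:
        dst[0] = val;
        break;
    case 2: {
        const uint16_t v16 = (uint16_t)v64;
        memcpy(dst, &v16, 2); /* constant size: typically one 16-bit store */
        break;
    }
    case 4: {
        const uint32_t v32 = (uint32_t)v64;
        memcpy(dst, &v32, 4); /* constant size: typically one 32-bit store */
        break;
    }
    case 8:
        memcpy(dst, &v64, 8);
        break;
    case 16:
        memcpy(dst, &v64, 8);
        memcpy(dst + 8, &v64, 8);
        break;
    default:
        /* Uncommon sizes fall back to the generic routine. */
        memset(dst, val, (size_t)n);
        break;
    }
}
```

A call such as `splat_ctx(ctx + offset, value, width)` would stand in for `memset(ctx + offset, value, width)`, where `ctx`, `offset`, `value`, and `width` are placeholder names. The fixed-size `memcpy()` calls are used rather than pointer casts so the stores stay valid for unaligned destinations. Whether this beats the platform's `memset()` would need to be measured.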