@@ -13,7 +13,7 @@ Missing support for weird header bit features (please provide samples):
 - OBU without length field;
 - super_res;
 - tile_ext;
-- scalable references (#121);
+- scalable references (#121/!342);
 - frame_ref_short signaling.

 Missing software features:
@@ -28,7 +28,7 @@ Performance optimizations:
 - change coef contexting (hi/lo_ctx) to be diagonal-oriented for dsp/simd;
 - change multi-symbol coding `read_symbol()` symbol discovery loop and adaptivity to be simd'ed [Rostislav expressed interest in this];
 - project_motion_field in `ref_mvs.c` can be SIMD'ed;
-- `cfl_ac` should take size (`w`/`h`) as function arguments rather than as function LUT indices, so that only subsampling (`420`, `422`, `444`) is a LUT entry;
+- `cfl_ac` should take size (`w`/`h`) as function arguments rather than as function LUT indices, so that only subsampling (`420`, `422`, `444`) is a LUT entry (!340);
 - `backup_lpf()` in `lr_apply_tmpl.c` backs up 4 lines per 64 pixels per plane, and copies bottom to top per superblock (each 128 or 64 pixels). Most of this is unnecessary. Using a flippable index means we don't need the second copy, and using 64-pixel instead of sb (64 or 128) pixel cdef runs (and then running LR, and then optionally the second cdef and second LR) means we only need to copy the pre-cdef top pixels, not the bottom ones, saving 50% copies. CDEF backup already does all of this. Bonus points for merging the CDEF backup and LR backup together so LR backs up nothing at all;
 - postfilter threading;
 - threading can become a generic worker queue (one tile_sbrow symbol parsing/recon, one sbrow postfilter(s)) and then use a generic single threadpool instead of separate tile/frame[/postfilter?] ones;
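For readers unfamiliar with the `cfl_ac` item touched by this hunk, the idea can be sketched roughly as follows: the function table is indexed by chroma subsampling only, and the block size is passed as `w`/`h` arguments. This is a minimal illustration and not the code from !340; every name here (`cfl_ac_fn`, `cfl_ac_generic`, `cfl_ac_tbl`, the `SS_*` enum) is hypothetical, the sketch assumes 8-bit luma, and it ignores the edge padding a real decoder needs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subsampling indices: the only dimension of the table. */
enum { SS_420, SS_422, SS_444, N_SS };

/* Hypothetical signature: size arrives as arguments, not as a table index. */
typedef void (*cfl_ac_fn)(int16_t *ac, const uint8_t *luma,
                          ptrdiff_t stride, int w, int h);

/* w/h are the chroma block dimensions; ss_hor/ss_ver are 1 when luma is
 * subsampled in that direction. Output is 8x the averaged luma with the
 * block mean removed. */
static void cfl_ac_generic(int16_t *ac, const uint8_t *y, ptrdiff_t stride,
                           int w, int h, int ss_hor, int ss_ver)
{
    assert(w > 0 && h > 0);
    int sum = 0;
    for (int r = 0; r < h; r++) {
        for (int x = 0; x < w; x++) {
            const int lx = x << ss_hor;
            int v = y[lx];
            if (ss_hor) v += y[lx + 1];
            if (ss_ver) {
                v += y[stride + lx];
                if (ss_hor) v += y[stride + lx + 1];
            }
            /* 1, 2 or 4 samples were summed; scale each case to 8x average */
            v <<= 3 - ss_hor - ss_ver;
            ac[r * w + x] = (int16_t)v;
            sum += v;
        }
        y += stride << ss_ver;
    }
    /* subtract the rounded mean so only the AC component remains */
    const int n = w * h;
    const int avg = (sum + (n >> 1)) / n;
    for (int i = 0; i < n; i++)
        ac[i] = (int16_t)(ac[i] - avg);
}

static void cfl_ac_420(int16_t *ac, const uint8_t *y, ptrdiff_t s, int w, int h)
{ cfl_ac_generic(ac, y, s, w, h, 1, 1); }
static void cfl_ac_422(int16_t *ac, const uint8_t *y, ptrdiff_t s, int w, int h)
{ cfl_ac_generic(ac, y, s, w, h, 1, 0); }
static void cfl_ac_444(int16_t *ac, const uint8_t *y, ptrdiff_t s, int w, int h)
{ cfl_ac_generic(ac, y, s, w, h, 0, 0); }

/* Three entries instead of one entry per (subsampling, size) pair. */
static const cfl_ac_fn cfl_ac_tbl[N_SS] = {
    [SS_420] = cfl_ac_420,
    [SS_422] = cfl_ac_422,
    [SS_444] = cfl_ac_444,
};
```

A caller would then pick the entry once per frame layout and pass the block size at each call, for example `cfl_ac_tbl[SS_420](ac, luma, stride, 8, 8)`. Under this assumption, SIMD versions only need one entry point per subsampling mode and can loop over `w`/`h` internally.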