... | ... | @@ -32,6 +32,7 @@ Performance optimizations: |
|
|
- postfilter threading;
|
|
|
- threading can become a generic worker queue (one tile_sbrow symbol parsing/recon, one sbrow postfilter(s)) and then use a generic single threadpool instead of separate tile/frame[/postfilter?] ones;
|
|
|
- obmc blend masks have one quarter of zeroes at their tail, so would there be gains if we set height to be 0.75 of what it currently is (for mc and/or blend)? Does this impact SIMD design in some unwanted way?
|
|
|
- for inter, test if prediction at 128x128 should be done at 64x64 subblocks to improve cache efficiency (and also simplify SIMD).
|
|
|
|
|
|
Cleanups:
|
|
|
- LR/MC intermediate 2d buffers in C dsp can be reduced by doing windowed like in SIMD;
|
... | ... | |