1. 03 Feb, 2009 2 commits
  2. 01 Feb, 2009 1 commit
  3. 30 Jan, 2009 1 commit
      Massive overhaul of nnz/cbp calculation · e394bd60
      Fiona Glaser authored
      Modify quantization to also calculate array_non_zero.
      PPC assembly changes by gpoirior.
      New quant asm includes some small tweaks to quant and SSE4 versions using ptest for the array_non_zero.
      Use this new feature of quant to merge nnz/cbp calculation directly with encoding and avoid many unnecessary calls to dequant/zigzag/decimate/etc.
      Also add new i16x16 DC-only iDCT with asm.
      Since intra encoding now directly calculates nnz, skip_intra now backs up nnz/cbp as well.
      Output should be equivalent except when using p4x4+RDO because of a subtlety involving old nnz values lying around.
      Performance increase in macroblock_encode: ~18% with dct-decimate, 30% without at CRF 25.
      Overall performance increase 0-6% depending on encoding settings.
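The central idea of this commit, having quantization itself report whether any coefficient survived, can be sketched in scalar C. This is only an illustration under stated assumptions, not the code from the commit: the function name `quant_4x4_sketch`, the single multiplier `mf`, the single `bias`, and the fixed 16-bit shift are all hypothetical simplifications.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical sketch: quantize 16 coefficients in place and return 1 if
 * any quantized coefficient is nonzero, 0 otherwise. The caller can use the
 * return value as the block's nnz flag without rescanning the array. */
static int quant_4x4_sketch(int16_t coef[16], uint16_t mf, uint16_t bias)
{
    assert(coef != NULL);
    int nz = 0;
    for (int i = 0; i < 16; i++) {
        int sign = coef[i] < 0 ? -1 : 1;
        /* Quantize the magnitude, then restore the sign. */
        uint32_t level = ((uint32_t)abs(coef[i]) * mf + bias) >> 16;
        coef[i] = (int16_t)(sign * (int)level);
        /* Accumulate all results; the OR is nonzero iff any level is. */
        nz |= coef[i];
    }
    return nz != 0;
}
```

With a flag like this available, a caller can skip dequantization, zigzag, and similar steps for blocks that quantize to all zeros.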
  4. 29 Jan, 2009 2 commits
  5. 28 Jan, 2009 3 commits
  6. 27 Jan, 2009 1 commit
      Much faster chroma encoding and other opts · 83d805fe
      Fiona Glaser authored
      ~15% faster chroma encode by reorganizing CBP calculation and adding special-case idct_dc function, since most coded chroma blocks are DC-only.
      Small optimization in cache_save (skip_bp)
      Fix array_non_zero to not violate strict aliasing (should eliminate miscompilation issues in the future)
      Add in automatic substitutions for some asm instructions that have an equivalent smaller representation.
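To illustrate why a special-case DC-only inverse transform is cheap: when only the DC coefficient is nonzero, the inverse transform yields the same residual at every pixel, so the full transform reduces to one rounded shift and a clipped add. The sketch below assumes a 4x4 block and the usual H.264 final rounding of `(x + 32) >> 6`; the names `add4x4_idct_dc_sketch` and `clip_uint8`, and the explicit `stride` parameter, are hypothetical and not taken from the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamp an integer to the 8-bit pixel range. */
static uint8_t clip_uint8(int x)
{
    return x < 0 ? 0 : x > 255 ? 255 : (uint8_t)x;
}

/* Hypothetical sketch: add a DC-only residual to a 4x4 block of pixels.
 * Assumes an arithmetic right shift for negative values, which is what
 * mainstream compilers provide. */
static void add4x4_idct_dc_sketch(uint8_t *dst, int stride, int dc)
{
    assert(dst != NULL && stride >= 4);
    int residual = (dc + 32) >> 6;  /* same value for all 16 pixels */
    for (int y = 0; y < 4; y++)
        for (int x = 0; x < 4; x++)
            dst[y * stride + x] = clip_uint8(dst[y * stride + x] + residual);
}
```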
  7. 26 Jan, 2009 1 commit
  8. 23 Jan, 2009 2 commits
  9. 20 Jan, 2009 2 commits
  10. 19 Jan, 2009 1 commit
  11. 18 Jan, 2009 2 commits
  12. 17 Jan, 2009 1 commit
  13. 14 Jan, 2009 6 commits
  14. 08 Jan, 2009 1 commit
      Fix regression in r1066 · d7d1d37f
      Fiona Glaser authored
      With some combinations of video width and other settings, the scratch buffer was slightly too small.
      This caused heap corruption on some systems.
      Also prevent merange from being raised during encoding with esa/tesa through encoder_reconfig, as this no longer works.
  15. 06 Jan, 2009 1 commit
  16. 05 Jan, 2009 2 commits
  17. 02 Jan, 2009 1 commit
  18. 31 Dec, 2008 4 commits
  19. 30 Dec, 2008 2 commits
  20. 28 Dec, 2008 1 commit
      Much faster CABAC RDO · 406a40dc
      Fiona Glaser authored
      Since RDO doesn't care about what order bit costs are calculated, merge sigmap and level coding into the same loop in RDO.
      This is bit-exact for 4x4dct but slightly incorrect for 8x8dct due to the sigmap containing duplicated contexts.
      However, the PSNR penalty of this is extremely small (~0.001 dB).
      Speed benefit is about 15% in 4x4dct and 30% in 8x8dct residual bit cost calculation at QP20.
      Overall encoding speed benefit is up to 5%, depending on encoding settings.
      Also remove an old unnecessary CABAC table that hasn't been used for years.
  21. 26 Dec, 2008 1 commit
      VLC table optimizations · 131d066e
      Fiona Glaser authored
      Slightly reorganize VLC tables for ~2% faster block_residual_write_cavlc.
      Also a small optimization in p8x8 CAVLC.
  22. 25 Dec, 2008 1 commit
  23. 24 Dec, 2008 1 commit
      Optimize variance asm + minor changes · 9fe6e5e6
      Fiona Glaser authored
      Remove SAD argument from var, not needed anymore.
      Speed up var asm a bit by eliminating psadbw and instead HADDWing at end.
      Eliminate all remaining warnings on gcc 3.4 on cygwin
      Port another minor optimization from lavc (pskip)
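For reference, the arithmetic behind a block variance function can be sketched in scalar C: accumulate the pixel sum and the sum of squares, then combine them once at the end. This is a minimal sketch under assumptions, not the commit's assembly. The name `pixel_var_16x16_sketch`, the 16x16 block size, and the unnormalized return value (256 times the variance) are illustrative choices, and the scalar loop does not model the SIMD details such as the final horizontal add.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: returns 256 * variance of a 16x16 block, computed as
 * sum(p^2) - sum(p)^2 / 256. The largest possible sum is 255 * 256 = 65280,
 * so sum * sum still fits in 32 bits. */
static uint32_t pixel_var_16x16_sketch(const uint8_t *pix, int stride)
{
    assert(pix != NULL && stride >= 16);
    uint32_t sum = 0, sqr = 0;
    for (int y = 0; y < 16; y++) {
        for (int x = 0; x < 16; x++) {
            uint32_t p = pix[y * stride + x];
            sum += p;
            sqr += p * p;
        }
    }
    return sqr - ((sum * sum) >> 8);
}
```

No SAD term is needed anywhere in this computation, which is consistent with dropping the SAD argument from the function.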