1. 11 Feb, 2009 2 commits
  2. 10 Feb, 2009 1 commit
      fix 10l in 75b495f2723fcb77f · 65304078
      Manuel Rommel authored
      Original thread:
      date: Mon, Feb 9, 2009 at 9:37 PM
      subject: [x264-devel] commit: Spare a vec_perm and a vec_mergeh though using a LUT of permutation vectors . (Guillaume Poirier )
      65304078
  3. 09 Feb, 2009 5 commits
  4. 08 Feb, 2009 1 commit
  5. 04 Feb, 2009 2 commits
  6. 03 Feb, 2009 1 commit
      Faster 8x8dct+CAVLC interleave · ded3e28c
      Fiona Glaser authored
      Integrate array_non_zero with the CAVLC 8x8dct interleave function.
      Roughly 1.5-2x faster than the original separate array_non_zero method.
      ded3e28c
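As a rough illustration of what folding the non-zero check into the interleave can look like, here is a minimal C sketch. It is not the code from this commit: the function name, the flat `nnz[4]` output layout, and the `int16_t` coefficient type are assumptions made for the example.

```c
#include <stdint.h>

/* Hypothetical sketch: split the 64 coefficients of an 8x8 block into four
 * interleaved 16-coefficient blocks for CAVLC and, in the same pass, record
 * whether each 16-coefficient block contains any non-zero value. */
static void interleave_8x8_cavlc_sketch(int16_t dst[64], const int16_t src[64],
                                        uint8_t nnz[4])
{
    for (int blk = 0; blk < 4; blk++) {
        int nz = 0;
        for (int j = 0; j < 16; j++) {
            int16_t c = src[blk + j * 4]; /* every 4th coefficient goes to this block */
            nz |= c;                      /* accumulate the non-zero flag while copying */
            dst[blk * 16 + j] = c;
        }
        nnz[blk] = (uint8_t)(nz != 0);
    }
}
```

Because the flag is accumulated while the coefficients are already being read, no second pass over the block is needed.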
  7. 01 Feb, 2009 1 commit
  8. 30 Jan, 2009 1 commit
      Massive overhaul of nnz/cbp calculation · e394bd60
      Fiona Glaser authored
      Modify quantization to also calculate array_non_zero.
      PPC assembly changes by gpoirier.
      New quant asm includes some small tweaks to quant and SSE4 versions using ptest for the array_non_zero.
      Use this new feature of quant to merge nnz/cbp calculation directly with encoding and avoid many unnecessary calls to dequant/zigzag/decimate/etc.
      Also add new i16x16 DC-only iDCT with asm.
      Since intra encoding now directly calculates nnz, skip_intra now backs up nnz/cbp as well.
      Output should be equivalent except when using p4x4+RDO because of a subtlety involving old nnz values lying around.
      Performance increase in macroblock_encode: ~18% with dct-decimate, 30% without at CRF 25.
      Overall performance increase 0-6% depending on encoding settings.
      e394bd60
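The idea of having quantization report whether anything survived can be sketched in plain C as follows. This is an illustrative scalar version only, not the assembly from this commit: the function name, the 16-bit fixed-point `mf` and `bias` tables, and the coefficient ranges are assumptions for the example.

```c
#include <stdint.h>

/* Hypothetical sketch: quantize a 4x4 block in place and return 1 if any
 * quantized coefficient is non-zero, 0 otherwise. mf[] is a multiplier in
 * 16-bit fixed point and bias[] a rounding offset; both are assumed inputs.
 * Assumes coefficient ranges small enough that results fit in int16_t. */
static int quant_4x4_sketch(int16_t dct[16], const uint16_t mf[16],
                            const uint16_t bias[16])
{
    int nz = 0;
    for (int i = 0; i < 16; i++) {
        int32_t c = dct[i];
        if (c > 0)
            c = (int32_t)(((int64_t)(bias[i] + c) * mf[i]) >> 16);
        else
            c = -(int32_t)(((int64_t)(bias[i] - c) * mf[i]) >> 16);
        dct[i] = (int16_t)c;
        nz |= c; /* non-zero flag comes for free alongside quantization */
    }
    return nz != 0;
}
```

A caller can use the return value to skip dequant, zigzag, and similar work for blocks that quantize to all zeros.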
  9. 29 Jan, 2009 1 commit
  10. 28 Jan, 2009 1 commit
  11. 27 Jan, 2009 1 commit
      Much faster chroma encoding and other opts · 83d805fe
      Fiona Glaser authored
      ~15% faster chroma encode by reorganizing CBP calculation and adding special-case idct_dc function, since most coded chroma blocks are DC-only.
      Small optimization in cache_save (skip_bp)
      Fix array_non_zero to not violate strict aliasing (should eliminate miscompilation issues in the future)
      Add in automatic substitutions for some asm instructions that have an equivalent smaller representation.
      83d805fe
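To show why a DC-only special case is cheap, here is a minimal C sketch of a DC-only inverse transform for one 4x4 block. When only the DC coefficient is non-zero, the H.264 4x4 inverse transform reduces to adding one rounded constant to every pixel. The function names, the 8-bit pixel type, and the single-block interface are assumptions for the example, not the commit's actual code.

```c
#include <stdint.h>

/* Clamp an intermediate value to the 8-bit pixel range. */
static uint8_t clip_pixel_sketch(int v)
{
    return (uint8_t)(v < 0 ? 0 : v > 255 ? 255 : v);
}

/* Hypothetical sketch: add the residual of a DC-only 4x4 block to dst.
 * dc is the dequantized DC coefficient. Assumes >> on a negative int is an
 * arithmetic shift, as it is on mainstream compilers. */
static void add4x4_idct_dc_sketch(uint8_t *dst, int stride, int dc)
{
    int offset = (dc + 32) >> 6; /* same rounding as the full inverse transform */
    for (int y = 0; y < 4; y++)
        for (int x = 0; x < 4; x++)
            dst[y * stride + x] = clip_pixel_sketch(dst[y * stride + x] + offset);
}
```

Compared with a full inverse transform, this needs no butterflies at all, only one shift and sixteen clamped additions per block.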
  12. 26 Jan, 2009 1 commit
  13. 23 Jan, 2009 2 commits
  14. 20 Jan, 2009 2 commits
  15. 19 Jan, 2009 1 commit
  16. 18 Jan, 2009 2 commits
  17. 17 Jan, 2009 1 commit
  18. 14 Jan, 2009 3 commits
  19. 08 Jan, 2009 1 commit
      Fix regression in r1066 · d7d1d37f
      Fiona Glaser authored
      With some combinations of video width and other settings, the scratch buffer was slightly too small.
      This caused heap corruption on some systems.
      Also prevent merange from being raised during encoding with esa/tesa through encoder_reconfig, as this no longer works.
      d7d1d37f
  20. 05 Jan, 2009 2 commits
  21. 02 Jan, 2009 1 commit
  22. 31 Dec, 2008 4 commits
  23. 30 Dec, 2008 2 commits
  24. 28 Dec, 2008 1 commit
      Much faster CABAC RDO · 406a40dc
      Fiona Glaser authored
      Since RDO doesn't care about what order bit costs are calculated, merge sigmap and level coding into the same loop in RDO.
      This is bit-exact for 4x4dct but slightly incorrect for 8x8dct due to the sigmap containing duplicated contexts.
      However, the PSNR penalty of this is extremely small (~0.001 dB).
      Speed benefit is about 15% in 4x4dct and 30% in 8x8dct residual bit cost calculation at QP20.
      Overall encoding speed benefit is up to 5%, depending on encoding settings.
      Also remove an old unnecessary CABAC table that hasn't been used for years.
      406a40dc