1. 30 Mar, 2009 2 commits
  2. 27 Mar, 2009 1 commit
  3. 19 Mar, 2009 1 commit
  4. 17 Mar, 2009 1 commit
    • SSE2 zigzag_interleave · d25d50c9
      Fiona Glaser authored
      Replace the PHADD flag with FastShuffle, which is a more accurate name.
      This flag marks asm functions that rely on fast SSE2 shuffle units and are therefore only faster on Phenom, Nehalem, and Penryn CPUs.
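      The flag-gated selection described in this commit can be sketched roughly as follows. This is a minimal illustration, not x264's code: the flag names and values (`CPU_SSE2`, `CPU_FAST_SHUFFLE`), the function names, and the simple 4-way interleave pattern are all assumptions, and the "fast" routine is a C placeholder standing in for assembly.

      ```c
      #include <stdint.h>

      /* Hypothetical flag bits; the real constants and their values differ. */
      #define CPU_SSE2         (1u << 0)
      #define CPU_FAST_SHUFFLE (1u << 1)

      typedef void (*interleave_fn)(int16_t dst[64], const int16_t src[64]);

      /* Portable reference: gather every 4th coefficient into 4 runs of 16. */
      static void interleave_c(int16_t dst[64], const int16_t src[64])
      {
          for (int i = 0; i < 4; i++)
              for (int j = 0; j < 16; j++)
                  dst[i * 16 + j] = src[i + j * 4];
      }

      /* Placeholder for a shuffle-heavy SSE2 routine. A real build would link
       * assembly here; this stand-in only has to produce identical output. */
      static void interleave_sse2_shuffle(int16_t dst[64], const int16_t src[64])
      {
          interleave_c(dst, src);
      }

      /* Pick the shuffle-based version only when the CPU both supports SSE2
       * and is flagged as having a fast shuffle unit. */
      static interleave_fn select_interleave(uint32_t cpu_flags)
      {
          const uint32_t need = CPU_SSE2 | CPU_FAST_SHUFFLE;
          if ((cpu_flags & need) == need)
              return interleave_sse2_shuffle;
          return interleave_c;
      }
      ```

      Keeping the capability flag separate from the instruction-set flag lets a CPU that supports SSE2 but shuffles slowly fall back to the version that is faster for it.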
  5. 10 Mar, 2009 1 commit
  6. 09 Mar, 2009 1 commit
  7. 08 Mar, 2009 1 commit
  8. 07 Mar, 2009 3 commits
    • SSSE3 hpel_filter_v · f701ebc8
      Fiona Glaser authored
      Optimized using the same method as in r1122. Patch partially by Holger.
      ~8% faster hpel filter on 64-bit Nehalem.
    • Update some asm copyright headers · 936f76e0
      Fiona Glaser authored
    • Vastly faster SATD/SA8D/Hadamard_AC/SSD/DCT/IDCT · 54e38917
      Holger Lubitz authored
      Heavily optimized for Core 2 and Nehalem, but performance should improve on all modern x86 CPUs.
      16x16 SATD speedups: +18% on K8 (64-bit), +22% on K10 (32-bit), +42% on Penryn (64-bit), +44% on Nehalem (64-bit), +50% on P4 (32-bit), +98% on Conroe (64-bit).
      Similar performance boosts in SATD-like functions (SA8D, hadamard_ac) and somewhat less in DCT/IDCT/SSD.
      Overall performance boost is up to ~15% on 64-bit Conroe.
  9. 06 Mar, 2009 1 commit
  10. 04 Mar, 2009 4 commits
  11. 03 Mar, 2009 1 commit
  12. 26 Feb, 2009 1 commit
  13. 16 Feb, 2009 1 commit
  14. 14 Feb, 2009 1 commit
  15. 11 Feb, 2009 2 commits
  16. 10 Feb, 2009 1 commit
    • fix 10l in 75b495f2723fcb77f · 65304078
      Manuel Rommel authored
      Original thread:
      date: Mon, Feb 9, 2009 at 9:37 PM
      subject: [x264-devel] commit: Spare a vec_perm and a vec_mergeh though using a LUT of permutation vectors. (Guillaume Poirier)
  17. 09 Feb, 2009 7 commits
  18. 08 Feb, 2009 1 commit
  19. 04 Feb, 2009 2 commits
  20. 03 Feb, 2009 2 commits
  21. 01 Feb, 2009 1 commit
  22. 30 Jan, 2009 1 commit
    • Massive overhaul of nnz/cbp calculation · e394bd60
      Fiona Glaser authored
      Modify quantization to also calculate array_non_zero.
      PPC assembly changes by gpoirier.
      New quant asm includes some small tweaks to quant and SSE4 versions using ptest for the array_non_zero.
      Use this new feature of quant to merge nnz/cbp calculation directly with encoding and avoid many unnecessary calls to dequant/zigzag/decimate/etc.
      Also add new i16x16 DC-only iDCT with asm.
      Since intra encoding now directly calculates nnz, skip_intra now backs up nnz/cbp as well.
      Output should be equivalent except when using p4x4+RDO because of a subtlety involving old nnz values lying around.
      Performance increase in macroblock_encode: ~18% with dct-decimate and 30% without, at CRF 25.
      Overall performance increase: 0-6%, depending on encoding settings.
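      The core idea of this commit, having quantization report whether any coefficient survived so the caller can build nnz/cbp without a second pass, might look roughly like this in plain C. It is a sketch under stated assumptions: the function names, the scalar `mf`/`bias` parameters, and the rounding formula are hypothetical simplifications, not x264's actual quant routines or their SSE4 `ptest` variants.

      ```c
      #include <stdint.h>
      #include <stdlib.h>

      /* Hypothetical sketch: quantize one 4x4 block in place and return 1 if
       * any quantized coefficient is nonzero, 0 otherwise. The rounding scheme
       * (|coef| * mf + bias) >> shift is an illustrative assumption. */
      static int quant_4x4_nz(int16_t dct[16], uint16_t mf, uint16_t bias, int shift)
      {
          int nz = 0;
          for (int i = 0; i < 16; i++) {
              int sign = dct[i] < 0 ? -1 : 1;
              uint32_t mag = (uint32_t)abs(dct[i]);
              int level = (int)((mag * mf + bias) >> shift);
              dct[i] = (int16_t)(sign * level);
              nz |= dct[i];              /* accumulate any set bit */
          }
          return nz != 0;
      }

      /* Quantize several blocks and collect one "has nonzero" bit per block,
       * so the caller gets the flags as a by-product of quantization. */
      static unsigned quant_blocks_mask(int16_t blocks[][16], int count,
                                        uint16_t mf, uint16_t bias, int shift)
      {
          unsigned mask = 0;
          for (int b = 0; b < count; b++)
              mask |= (unsigned)quant_4x4_nz(blocks[b], mf, bias, shift) << b;
          return mask;
      }
      ```

      A caller can skip dequant, zigzag, and similar steps for any block whose bit in the returned mask is clear, which is the kind of saving the commit message describes.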
  23. 29 Jan, 2009 2 commits
  24. 28 Jan, 2009 1 commit