1. 23 Feb, 2015 1 commit
  2. 12 Dec, 2014 1 commit
  3. 26 Aug, 2014 1 commit
  4. 22 Apr, 2014 1 commit
  5. 12 Mar, 2014 1 commit
    • x86inc: Support arbitrary stack alignments · 7c860f07
      Henrik Gramner authored
      If the stack is known to be at least 32-byte aligned we can safely store ymm
      registers on the stack without doing manual alignment.
      
      Change ALLOC_STACK to always align the stack before allocating stack space for
      consistency. Previously alignment would occur either before or after allocating
      stack space depending on whether manual alignment was required or not.
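The "align first, then allocate" order described above can be modelled in plain C. This is only a sketch under stated assumptions: the real change lives in the ALLOC_STACK assembler macro, and the helper names `align_down` and `alloc_aligned_stack`, as well as the rounding of the requested size, are hypothetical choices made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Round an address down to a multiple of `alignment`,
 * which must be a non-zero power of two. */
static uintptr_t align_down(uintptr_t addr, uintptr_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return addr & ~(alignment - 1);
}

/* Hypothetical model of a downward-growing stack:
 * 1. align the stack pointer,
 * 2. then subtract the requested size.
 * The size is rounded up to a multiple of the alignment (an assumption of
 * this sketch) so that the resulting pointer stays aligned. */
static uintptr_t alloc_aligned_stack(uintptr_t stack_ptr, uintptr_t size,
                                     uintptr_t alignment)
{
    uintptr_t aligned = align_down(stack_ptr, alignment);
    uintptr_t padded  = (size + alignment - 1) & ~(alignment - 1);
    return aligned - padded;
}
```

With a 32-byte alignment the returned address is always a multiple of 32, which is the property needed to store ymm registers with aligned moves.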
  6. 24 Feb, 2014 1 commit
  7. 08 Jan, 2014 1 commit
  8. 23 Aug, 2013 1 commit
  9. 05 Jul, 2013 1 commit
    • x86: Remove X264_CPU_SSE_MISALIGN functions · ff41804e
      Henrik Gramner authored
      Prevents a crash if the misaligned exception mask bit is cleared for some reason.
      
      Misaligned SSE functions are only used on AMD Phenom CPUs and the benefit is minuscule.
      They also require modifying the MXCSR control register and by removing those functions
      we can get rid of that complexity altogether.
      
      VEX-encoded instructions also support unaligned memory operands. I tried adding AVX
      implementations of all removed functions but there were no performance improvements on
      Ivy Bridge. pixel_sad_x3 and pixel_sad_x4 had significant code size reductions though
      so I kept them and added some minor cosmetic fixes and tweaks.
  10. 17 May, 2013 1 commit
  11. 26 Feb, 2013 1 commit
    • x86: detect Bobcat, improve Atom optimizations, reorganize flags · 5d60b9c9
      Fiona Glaser authored
      The Bobcat has a 64-bit SIMD unit reminiscent of the Athlon 64; detect this
      and apply the appropriate flags.
      
      It also has an extremely slow palignr instruction; create a flag for this to
      avoid massive penalties on palignr-heavy functions.
      
      Improve Atom function selection and document exactly what the SLOW_ATOM flag
      covers.
      
      Add Atom-optimized SATD/SA8D/hadamard_ac functions: simply combine the ssse3
      optimizations with the sse2 algorithm to avoid pmaddubsw, which is slow on
      Atom along with other SIMD multiplies.
      
      Drop TBM detection; it'll probably never be useful for x264.
      
      Invert FastShuffle to SlowShuffle; it only ever applied to one CPU (Conroe).
      
      Detect CMOV, to fail more gracefully when run on a chip with MMX2 but no CMOV.
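The CMOV check mentioned above can be illustrated by decoding the feature bits that CPUID leaf 1 returns in EDX (CMOV is bit 15, MMX bit 23, SSE bit 25). This is a minimal sketch, not x264's detection code: the `SKETCH_CPU_*` flags and both functions are hypothetical names, and the EDX value is passed in as a parameter so the logic can be exercised without executing CPUID.

```c
#include <stdint.h>

/* CPUID leaf 1, EDX feature bit positions. */
#define CPUID_EDX_CMOV (1u << 15)
#define CPUID_EDX_MMX  (1u << 23)
#define CPUID_EDX_SSE  (1u << 25)

/* Hypothetical flag names for this sketch; not x264's actual constants. */
enum {
    SKETCH_CPU_CMOV = 1,
    SKETCH_CPU_MMX  = 2,
    SKETCH_CPU_MMX2 = 4
};

/* Translate a raw EDX value into the sketch's flag set. */
static unsigned sketch_cpu_flags(uint32_t edx)
{
    unsigned flags = 0;
    if (edx & CPUID_EDX_CMOV)
        flags |= SKETCH_CPU_CMOV;
    if (edx & CPUID_EDX_MMX)
        flags |= SKETCH_CPU_MMX;
    /* SSE-capable CPUs also provide the MMX extensions ("MMX2"). */
    if ((edx & CPUID_EDX_MMX) && (edx & CPUID_EDX_SSE))
        flags |= SKETCH_CPU_MMX2;
    return flags;
}

/* Returns 1 only if both MMX2 and CMOV are present, so that a caller can
 * fall back to C code instead of hitting an illegal instruction. */
static int sketch_can_use_asm(unsigned flags)
{
    return (flags & SKETCH_CPU_MMX2) && (flags & SKETCH_CPU_CMOV);
}
```

A chip that reports MMX and SSE but not CMOV yields 0 from `sketch_can_use_asm`, which is the situation the commit aims to handle gracefully. How x264 itself reacts in that case is not stated in the message.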
  12. 25 Feb, 2013 1 commit
  13. 09 Jan, 2013 1 commit
  14. 06 Dec, 2012 1 commit
  15. 07 Nov, 2012 1 commit
  16. 04 Feb, 2012 2 commits
  17. 22 Oct, 2011 1 commit
  18. 09 Aug, 2011 1 commit
  19. 05 Aug, 2011 2 commits
  20. 21 Jul, 2011 1 commit
  21. 24 Mar, 2011 2 commits
  22. 18 Feb, 2011 1 commit
  23. 07 Feb, 2011 1 commit
  24. 27 Jan, 2011 2 commits
  25. 25 Jan, 2011 2 commits
    • Initial AVX support · 68cda11b
      Fiona Glaser authored
      Automatically handle 3-operand instructions and abstraction between SSE and AVX.
      Implement one function with this (denoise_dct) as an initial test.
      x264 can't make much use of the 256-bit support of AVX (as it's float-only), but 3-operand could give some small benefits.
    • Bump dates to 2011 · ee9bc136
      Sean McGovern authored
  26. 14 Dec, 2010 1 commit
  27. 31 Oct, 2010 1 commit
  28. 10 Oct, 2010 1 commit
  29. 18 Sep, 2010 1 commit
    • Update source file headers · 213a99d0
      Fiona Glaser authored
      Update dates, improve file descriptions, make things more consistent.
      Also add information about commercial licensing.
  30. 09 Jun, 2010 1 commit
  31. 26 May, 2010 1 commit
    • Detect Atom CPU, enable appropriate asm functions · 57729402
      Fiona Glaser authored
      I'm not going to actually optimize for this pile of garbage unless someone pays me.
      But it can't hurt to at least enable the correct functions based on benchmarks.
      
      Also save some cache on Intel CPUs that don't need the decimate LUT due to having fast bsr/bsf.
  32. 06 May, 2010 2 commits
    • More cosmetics · 54e784fd
      Anton Mitrofanov authored
    • Deduplicate asm constants, automate name prefixing · 311c4bb1
      Fiona Glaser authored
      Auto-prefix global constants with x264_ in cextern.
      Eliminate x264_ prefix from asm files; automate it in cglobal.
      Deduplicate asm constants wherever possible to save data cache (move them to a new const-a.asm).
      Remove x264_emms() entirely on non-x86 (don't even call an empty function).
      Add cextern_naked for a non-prefixed cextern (used in checkasm).
  33. 05 Apr, 2010 1 commit
    • Massive cosmetic and syntax cleanup · 58d2349d
      Fiona Glaser authored
      Convert all applicable loops to use C99 loop index syntax.
      Clean up most inconsistent syntax in ratecontrol.c, visualize, ppc, etc.
      Replace log(x)/log(2) constructs with log2, and similar with log10.
      Fix all -Wshadow violations.
      Fix visualize support.
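As an illustration of the loop conversion listed above, the following sketch shows the same summation written with a function-scope index and with a C99 loop-scoped index. The function names are hypothetical and are not taken from the x264 sources.

```c
#include <stddef.h>

/* Older style: the index is declared at function scope, where it stays
 * visible after the loop and can collide with other uses of the name. */
static int sum_function_scope_index(const int *v, size_t n)
{
    size_t i;
    int sum = 0;
    for (i = 0; i < n; i++)
        sum += v[i];
    return sum;
}

/* C99 style: the index is declared in the for statement and is scoped to
 * the loop only. */
static int sum_c99_index(const int *v, size_t n)
{
    int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return sum;
}
```

Both functions compute the same result. Keeping the index local to the loop narrows its scope, which makes it easier to spot and remove genuinely shadowed names of the kind -Wshadow reports.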
  34. 27 Mar, 2010 1 commit