- 09 Jan, 2013 1 commit
-
-
Loren Merritt authored
-
- 06 Dec, 2012 1 commit
-
-
Sean McGovern authored
Solaris responds correctly to the same value as Cygwin, so let's use that.
-
- 07 Nov, 2012 1 commit
-
-
David Wolstencroft authored
The Apple A6 CPU doesn't support performance counters, so this test caused a crash.
-
- 04 Feb, 2012 2 commits
-
-
Fiona Glaser authored
TBM and BMI1 are supported by Trinity/Piledriver. The others (and BMI1) will probably appear in Intel's upcoming Haswell. Also update x86inc with AVX2 stuff.
-
Hii authored
-
- 22 Oct, 2011 1 commit
-
-
Fiona Glaser authored
~10% faster Hadamard functions (SATD/SA8D/hadamard_ac) plus other improvements.
-
- 09 Aug, 2011 1 commit
-
-
Loren Merritt authored
-
- 05 Aug, 2011 2 commits
-
-
Loren Merritt authored
Previously required "--asm sse2fast,fastshuffle,sse4.2,avx".
-
Anton Mitrofanov authored
-
- 21 Jul, 2011 1 commit
-
-
Rafaël Carré authored
The cpu_set_t structure is considered opaque. Also handle sched_getaffinity() error case if "cpusetsize is smaller than the size of the affinity mask used by the kernel."
-
- 24 Mar, 2011 2 commits
-
-
Steven Walters authored
Also fix broken thread detection on Cygwin.
-
Kieran Kunhya authored
-
- 18 Feb, 2011 1 commit
-
-
Anton Mitrofanov authored
Luckily didn't affect anything due to C signedness rules.
-
- 07 Feb, 2011 1 commit
-
-
Fiona Glaser authored
-
- 27 Jan, 2011 2 commits
-
-
Fiona Glaser authored
-
Fiona Glaser authored
Even if not using ymm registers, AVX operations will cause SIGILLs on unsupported OSs. On Windows, AVX is only available on Windows 7 SP1 or later.
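A minimal sketch of the kind of check this implies, not x264's actual code. On x86, CPUID leaf 1 reports AVX support (ECX bit 28) and whether the OS has enabled XSAVE (ECX bit 27); XGETBV then reports whether the OS saves the SSE and AVX register state (XCR0 bits 1 and 2). The name `avx_usable` is hypothetical, and the sketch assumes GCC or Clang for `<cpuid.h>` and inline asm; on any other target it conservatively returns 0.

```c
#include <stdint.h>

#if (defined(__i386__) || defined(__x86_64__)) && defined(__GNUC__)
#include <cpuid.h>
#define EXAMPLE_X86_GNUC 1
#else
#define EXAMPLE_X86_GNUC 0
#endif

/* Hypothetical helper: returns 1 only if both the CPU and the OS support AVX. */
int avx_usable(void)
{
#if EXAMPLE_X86_GNUC
    unsigned int eax, ebx, ecx, edx;

    /* __get_cpuid returns 0 if the requested leaf is not available. */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;

    /* Bit 27: OSXSAVE (OS has enabled XSAVE). Bit 28: CPU supports AVX. */
    if (!(ecx & (1u << 27)) || !(ecx & (1u << 28)))
        return 0;

    /* XGETBV with ECX=0 reads XCR0; emitted as raw bytes for old assemblers. */
    uint32_t lo, hi;
    __asm__ volatile(".byte 0x0f, 0x01, 0xd0" : "=a"(lo), "=d"(hi) : "c"(0));
    (void)hi;

    /* Bits 1 and 2: the OS saves XMM and YMM state on context switch. */
    return (lo & 0x6) == 0x6;
#else
    return 0;
#endif
}
```

XGETBV is only executed after OSXSAVE has been confirmed, since the instruction itself faults when the OS has not enabled XSAVE.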
-
- 25 Jan, 2011 2 commits
-
-
Fiona Glaser authored
Automatically handle 3-operand instructions and abstraction between SSE and AVX. Implement one function with this (denoise_dct) as an initial test. x264 can't make much use of the 256-bit support of AVX (as it's float-only), but 3-operand could give some small benefits.
-
Sean McGovern authored
-
- 14 Dec, 2010 1 commit
-
-
Steven Walters authored
Patch originally by Pegasys Inc.
-
- 31 Oct, 2010 1 commit
-
-
Fiona Glaser authored
Technically, such functions should be declared with (void), not ().
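For illustration only (the function names below are hypothetical, not taken from x264): before C23, an empty parameter list in a C declaration leaves the parameters unspecified, so the compiler cannot reject calls that pass arguments. Writing `(void)` declares a real prototype that takes none.

```c
/* Old style: before C23 this says nothing about the parameters,
 * so calls with stray arguments are not diagnosed. */
int example_cpu_count_old();

/* Prototype: the function takes no arguments, and the compiler enforces it. */
int example_cpu_count(void);

int example_cpu_count_old() { return 1; }
int example_cpu_count(void) { return 1; }
```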
-
- 10 Oct, 2010 1 commit
-
-
Anton Mitrofanov authored
Exorcise some CamelCase.
-
- 18 Sep, 2010 1 commit
-
-
Fiona Glaser authored
Update dates, improve file descriptions, make things more consistent. Also add information about commercial licensing.
-
- 09 Jun, 2010 1 commit
-
-
Steven Walters authored
Unify input/output defines to HAVE_* format. Define values as 1 to simplify conditionals.
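A small sketch of why defining such macros as 1 helps, using made-up macro names rather than x264's real ones: a macro with a value can be combined in ordinary `#if` expressions, and an undefined macro simply evaluates to 0 there, so `#ifdef`/`#ifndef` nesting is no longer needed.

```c
/* Hypothetical feature macros; a configure script would normally emit these. */
#define HAVE_EXAMPLE_INPUT 1
/* HAVE_EXAMPLE_OUTPUT is deliberately left undefined: it reads as 0 in #if. */

/* Returns a bitmask describing which example branches were compiled in. */
int example_features(void)
{
    int mask = 0;
#if HAVE_EXAMPLE_INPUT
    mask |= 1;
#endif
#if HAVE_EXAMPLE_INPUT && !HAVE_EXAMPLE_OUTPUT
    mask |= 2; /* combined condition without nested #ifdef blocks */
#endif
#if HAVE_EXAMPLE_OUTPUT
    mask |= 4;
#endif
    return mask;
}
```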
-
- 26 May, 2010 1 commit
-
-
Fiona Glaser authored
I'm not going to actually optimize for this pile of garbage unless someone pays me. But it can't hurt to at least enable the correct functions based on benchmarks. Also save some cache on Intel CPUs that don't need the decimate LUT due to having fast bsr/bsf.
-
- 06 May, 2010 2 commits
-
-
Anton Mitrofanov authored
-
Fiona Glaser authored
Auto-prefix global constants with x264_ in cextern. Eliminate x264_ prefix from asm files; automate it in cglobal. Deduplicate asm constants wherever possible to save data cache (move them to a new const-a.asm). Remove x264_emms() entirely on non-x86 (don't even call an empty function). Add cextern_naked for a non-prefixed cextern (used in checkasm).
-
- 05 Apr, 2010 1 commit
-
-
Fiona Glaser authored
Convert all applicable loops to use C99 loop index syntax. Clean up most inconsistent syntax in ratecontrol.c, visualize, ppc, etc. Replace log(x)/log(2) constructs with log2, and similar with log10. Fix all -Wshadow violations. Fix visualize support.
-
- 27 Mar, 2010 1 commit
-
-
Fiona Glaser authored
-
- 30 Jan, 2010 1 commit
-
-
Loren Merritt authored
r1413 caused crashes on any system with malloc.h. Also switch to std=c99 or std=gnu99 if supported by the compiler. Fix visualize support.
-
- 21 Jan, 2010 1 commit
-
-
Fiona Glaser authored
Apparently these CPUs have SSE4a, but not misaligned SSE.
-
- 20 Aug, 2009 1 commit
-
-
David Conrad authored
x264 will detect which ARM core it's building for and only build NEON asm if the target is ARMv6 or above, then enable NEON at runtime.
-
- 17 Mar, 2009 1 commit
-
-
Fiona Glaser authored
Replace PHADD with FastShuffle (more accurate naming). This flag represents asm functions that rely on fast SSE2 shuffle units, and thus are only faster on Phenom, Nehalem, and Penryn CPUs.
-
- 19 Jan, 2009 1 commit
-
-
Brad Smith authored
-
- 31 Dec, 2008 1 commit
-
-
Fiona Glaser authored
Significantly speeds up coeff_last and coeff_level_run on Phenom CPUs for faster CAVLC and CABAC. Also a small tweak to coeff_level_run asm.
-
- 29 Nov, 2008 1 commit
-
-
Fiona Glaser authored
-
- 25 Nov, 2008 1 commit
-
-
Fiona Glaser authored
Do satd 4x8 by transposing the two blocks' positions and running satd 8x4. Use pinsrd (SSE4) for faster width4 SSD. Globally replace movlhps with punpcklqdq (it seems to be faster on Conroe). Move mask_misalign declaration to cpu.h to avoid warning in encoder.c. These optimizations help on Nehalem, Phenom, and Penryn CPUs.
-
- 23 Nov, 2008 1 commit
-
-
Fiona Glaser authored
Faster hpel_filter by using unaligned loads instead of emulated PALIGNR. Faster hpel_filter on 64-bit by using the 32-bit version (the cost of emulated PALIGNR is high enough that the savings from caching intermediate values is not worth it). Add support for misaligned_mask on Phenom: ~2% faster hpel_filter, ~4% faster width16 multisad, 7% faster width20 get_ref. Replace width12 mmx with width16 sse on Phenom and Nehalem: 32% faster width12 get_ref on Phenom. Merge cpu-32.asm and cpu-64.asm. Thanks to Easy123 for contributing a Phenom box for a weekend so I could write these optimizations.
-
- 05 Nov, 2008 1 commit
-
-
Fiona Glaser authored
movaps/movups are no longer equivalent to their integer counterparts on Nehalem, so that substitution is removed. Nehalem has a much lower cacheline split penalty than previous Intel CPUs, so cacheline workarounds are no longer necessary. Thanks to Intel for providing Avail Media with the pre-release Nehalem CPU needed to prepare these (and other not-yet-committed) optimizations. Overall speed improvement with Nehalem vs Penryn at the same clock speed is around 40%.
-
- 04 Jul, 2008 1 commit
-
-
Fiona Glaser authored
Update "Authors" lists based on actual authorship; highest is most important. Update copyright notices and remove old CVS tags from file headers. Add file headers to GTK and other sections missing them. Update FSF address. Other header-related cosmetics.
-
- 29 Jun, 2008 1 commit
-
-
Gabriel Bouvigne authored
-