- 21 Jan, 2014 2 commits
-
-
Kieran Kunhya authored
-
James Weaver authored
Assembly based on code by Henrik Gramner and Loren Merritt.
-
- 08 Jan, 2014 1 commit
-
-
Henrik Gramner authored
Also update AUTHORS file and my e-mail address in the headers of various files.
-
- 30 Oct, 2013 2 commits
-
-
Anton Mitrofanov authored
It probably hasn't been used or maintained for the last few years.
-
Fiona Glaser authored
Allows generation of hard-CBR streams without using NAL HRD. Useful if you want to be able to reconfigure the bitrate (which you can't do with NAL HRD on).
-
- 23 Aug, 2013 4 commits
-
-
Henrik Gramner authored
Windows, unlike most other operating systems, uses UTF-16 for Unicode strings while x264 is designed for UTF-8. This patch does the following in order to handle things like Unicode filenames:
* Keep strings internally as UTF-8.
* Retrieve the CLI command line as UTF-16 and convert it to UTF-8.
* Always use Unicode versions of Windows API functions and convert strings to UTF-16 when calling them.
* Attempt to use legacy 8.3 short filenames for external libraries without Unicode support.
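The UTF-16 to UTF-8 step described above can be sketched portably as follows. This is an illustration only, not x264's code: the function name `utf16_to_utf8` and its error convention are hypothetical, and a real Windows build would more likely rely on the Windows API (for example `WideCharToMultiByte`) than on a hand-written converter.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from x264): convert a NUL-terminated UTF-16
 * string to NUL-terminated UTF-8. Returns the number of bytes written,
 * excluding the terminator, or -1 on an unpaired surrogate or if `out`
 * is too small. */
static int utf16_to_utf8(const uint16_t *in, char *out, size_t out_size)
{
    size_t o = 0;
    while (*in) {
        uint32_t cp = *in++;
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* High surrogate: must be followed by a low surrogate. */
            uint32_t lo = *in;
            if (lo < 0xDC00 || lo > 0xDFFF)
                return -1;
            in++;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            return -1; /* stray low surrogate */
        }
        size_t len = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (o + len >= out_size) /* keep one byte for the terminator */
            return -1;
        switch (len) {
        case 1:
            out[o++] = (char)cp;
            break;
        case 2:
            out[o++] = (char)(0xC0 | (cp >> 6));
            out[o++] = (char)(0x80 | (cp & 0x3F));
            break;
        case 3:
            out[o++] = (char)(0xE0 | (cp >> 12));
            out[o++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (char)(0x80 | (cp & 0x3F));
            break;
        default:
            out[o++] = (char)(0xF0 | (cp >> 18));
            out[o++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            out[o++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (char)(0x80 | (cp & 0x3F));
            break;
        }
    }
    if (o >= out_size)
        return -1;
    out[o] = '\0';
    return (int)o;
}
```

Keeping every internal string in one encoding means a conversion like this is only needed at the boundary with the operating system.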
-
Kieran Kunhya authored
This format has been reverse engineered and x264's output has almost exactly the same bitstream as Panasonic cameras and encoders produce. It therefore does not comply with SMPTE RP2027 since Panasonic themselves do not comply with their own specification. It has been tested in Avid, Premiere, Edius and Quantel. Parts of this patch were written by Fiona Glaser and some reverse engineering was done by Joseph Artsimovich.
-
Henrik Gramner authored
Combine frame and mb data mallocs into a single large malloc. Additionally, on Linux systems with hugepage support, ask for hugepages on large mallocs. This gives a small performance improvement (~0.2-0.9%) on systems without hugepage support, as well as a small memory footprint reduction. On recent Linux kernels with hugepage support enabled (set to madvise or always), it improves performance by up to 4% at the cost of about 7-12% more memory usage on typical settings. It may help even more on Haswell and other recent CPUs with improved 2MB page support in hardware.
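A minimal sketch of asking for hugepages on large allocations, assuming Linux with transparent hugepages. The name `alloc_maybe_huge` and the size cutoff are hypothetical choices for illustration, not values taken from the patch. Only the 2MB page size comes from the description above.

```c
#include <stdlib.h>
#include <stdint.h>
#ifdef __linux__
#include <sys/mman.h>
#endif

#define HUGE_PAGE_SIZE ((size_t)2 << 20)          /* 2MB pages */
#define HUGE_PAGE_THRESHOLD (HUGE_PAGE_SIZE / 2)  /* assumed cutoff, illustration only */

/* Hypothetical allocator: large requests are aligned and padded to the
 * hugepage size and the kernel is advised to back them with hugepages.
 * Small requests fall through to plain malloc. Release with free(). */
static void *alloc_maybe_huge(size_t size)
{
#if defined(__linux__) && defined(MADV_HUGEPAGE)
    if (size >= HUGE_PAGE_THRESHOLD && size <= SIZE_MAX - HUGE_PAGE_SIZE) {
        /* Round up to a whole number of hugepages. */
        size_t padded = (size + HUGE_PAGE_SIZE - 1) & ~(HUGE_PAGE_SIZE - 1);
        void *p = NULL;
        if (posix_memalign(&p, HUGE_PAGE_SIZE, padded) != 0)
            return NULL;
        /* Advisory only: the allocation stays valid if the kernel declines. */
        madvise(p, padded, MADV_HUGEPAGE);
        return p;
    }
#endif
    return malloc(size);
}
```

Rounding up to whole hugepages is one plausible source of the extra memory usage mentioned above, since each large allocation can be padded by up to one page.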
-
Anton Mitrofanov authored
-
- 03 Jul, 2013 1 commit
-
-
Fiona Glaser authored
Stops x264 from attempting to optimize global stream headers, ensuring that different segments of a video will have identical headers when used with identical encoding settings.
-
- 20 May, 2013 1 commit
-
-
Anton Mitrofanov authored
Autoload the OpenCL library at runtime so that it does not have to be installed in order to run an OpenCL-enabled build of x264. Update X264_BUILD, which should have been changed with the first patch.
-
- 23 Apr, 2013 4 commits
-
-
Fiona Glaser authored
AVX2 functions: mc_chroma intra_sad_x3_16x16 last64 ads hpel dct4 idct4 sub16x16_dct8 quant_4x4x4 quant_4x4 quant_4x4_dc quant_8x8 SAD_X3/X4 SATD var var2 SSD zigzag interleave weightp weightb intra_sad_8x8_x9 decimate integral hadamard_ac sa8d_satd sa8d lowres_init denoise
-
Steve Borho authored
OpenCL support is compiled in by default, but must be enabled at runtime by an --opencl command line flag. Compiling OpenCL support requires perl. To avoid the perl requirement use: configure --disable-opencl.
When enabled, the lookahead thread is mostly off-loaded to an OpenCL capable GPU device. Lowres intra cost prediction, lowres motion search (including subpel) and bidir cost predictions are all done on the GPU. MB-tree and final slice decisions are still done by the CPU. Presets which do not use a threaded lookahead will not use OpenCL at all (superfast, ultrafast).
Because of data dependencies, the GPU must use an iterative motion search which performs more total work than the CPU would do, so this is not work efficient or power efficient. But if there are GPU cycles to spare, it can often speed up the encode. Output quality with the OpenCL lookahead enabled is often very slightly worse than with the CPU lookahead (because of the same data dependencies).
x264 must compile its OpenCL kernels for your device before running them, and in order to avoid doing this every run it caches the compiled kernel binary in a file named x264_lookahead.clbin (--opencl-clbin FNAME to override). The cache file will be ignored if the device, driver, or OpenCL source are changed.
x264 will use the first GPU device which supports the cl_image features required by its kernels. Most modern discrete GPUs and all AMD integrated GPUs will work. Intel integrated GPUs (up to IvyBridge) do not support those necessary features. Use --opencl-device N to specify a number of capable GPUs to skip during device detection.
Switchable graphics environments (e.g. AMD Enduro) are currently not supported, as some have bugs in their OpenCL drivers that cause output to be silently incorrect.
Developed by MulticoreWare with support from AMD and Telestream.
-
Fiona Glaser authored
The H.264 spec technically has limits on the number of slices per frame. x264 normally ignores these limits, since most use-cases that require large numbers of slices prefer that it does. However, certain decoders may break with extremely large numbers of slices, as can occur with some slice-max-size/mbs settings. When set, x264 will refuse to create any slices beyond the maximum number, even if slice-max-size/mbs requires otherwise.
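The interaction between a per-slice macroblock limit and a cap on the slice count can be sketched as below. This is a simplified model, not x264's slicing code: `count_slices` and its parameter names are hypothetical, and it assumes that once the cap is reached the last slice simply absorbs all remaining macroblocks.

```c
#include <assert.h>

/* Hypothetical model: count the slices produced for a frame of
 * `total_mbs` macroblocks when each slice holds at most `slice_max_mbs`,
 * with at most `slices_max` slices per frame (0 means no cap). */
static int count_slices(int total_mbs, int slice_max_mbs, int slices_max)
{
    assert(total_mbs >= 0 && slice_max_mbs > 0 && slices_max >= 0);
    int slices = 0;
    int remaining = total_mbs;
    while (remaining > 0) {
        slices++;
        if (slices_max > 0 && slices == slices_max)
            remaining = 0; /* cap reached: this slice takes everything left */
        else
            remaining -= remaining < slice_max_mbs ? remaining : slice_max_mbs;
    }
    return slices;
}
```

For a 1080p frame (120x68 = 8160 macroblocks) with a 1000-macroblock limit, this model yields 9 slices without a cap and 4 slices with a cap of 4, the last of which exceeds the per-slice limit.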
-
Fiona Glaser authored
Works in conjunction with slice-max-mbs and/or slice-max-size to avoid overly small slices. Useful with certain decoders that barf on extremely small slices. If slice-min-mbs would be violated as a result of slice-max-size, x264 will exceed slice-max-size and print a warning.
-
- 26 Feb, 2013 1 commit
-
-
Fiona Glaser authored
The Bobcat has a 64-bit SIMD unit reminiscent of the Athlon 64; detect this and apply the appropriate flags. It also has an extremely slow palignr instruction; create a flag for this to avoid massive penalties on palignr-heavy functions. Improve Atom function selection and document exactly what the SLOW_ATOM flag covers. Add Atom-optimized SATD/SA8D/hadamard_ac functions: simply combine the ssse3 optimizations with the sse2 algorithm to avoid pmaddubsw, which is slow on Atom along with other SIMD multiplies. Drop TBM detection; it'll probably never be useful for x264. Invert FastShuffle to SlowShuffle; it only ever applied to one CPU (Conroe). Detect CMOV, to fail more gracefully when run on a chip with MMX2 but no CMOV.
-
- 25 Feb, 2013 1 commit
-
-
Mike Gorchak authored
-
- 09 Jan, 2013 1 commit
-
-
Loren Merritt authored
-
- 03 Jul, 2012 1 commit
-
-
Anton Mitrofanov authored
Fix some integer overflows and check input parameters better. Also fix incorrect type specifiers for demuxer info printing.
-
- 18 May, 2012 1 commit
-
-
Fiona Glaser authored
Split each lookahead frame analysis call into multiple threads. Has a small impact on quality, but does not seem to be consistently any worse. This helps alleviate bottlenecks with many cores and frame threads. In many cases, this massively increases performance on many-core systems. For example, over 100% faster 1080p encoding with --preset veryfast on a 12-core i7 system. Realtime 1080p30 at --preset slow should now be feasible on real systems. For sliced-threads, this patch should be faster regardless of settings (~10%). By default, lookahead threads are 1/6 of regular threads. This ratio isn't exact, but it seems to work well for all presets on real systems. With sliced-threads, it's the same as the number of encoding threads.
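The default described above can be sketched as a small helper. The function name is hypothetical, and both the integer rounding and the clamp to at least one thread are assumptions. The message only states the 1/6 ratio and the sliced-threads case.

```c
/* Hypothetical helper: default number of lookahead threads.
 * Assumes integer division and a minimum of one thread. */
static int default_lookahead_threads(int threads, int sliced_threads)
{
    if (sliced_threads)
        return threads;      /* same as the number of encoding threads */
    int n = threads / 6;     /* 1/6 of regular threads */
    return n < 1 ? 1 : n;
}
```

Under these assumptions, 12 regular threads give 2 lookahead threads, and any count below 6 gives 1.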
-
- 04 Feb, 2012 1 commit
-
-
Hii authored
-
- 22 Oct, 2011 1 commit
-
-
Henrik Gramner authored
Gives a slight speed increase and significant binary size reduction when only one chroma format is needed.
-
- 21 Sep, 2011 2 commits
-
-
Henrik Gramner authored
-
Loren Merritt authored
i4x4 analysis cycles (per partition):
penryn    sandybridge
184-> 75  157-> 54  preset=superfast (sad)
281->165  225->124  preset=faster (satd with early termination)
332->165  263->124  preset=medium
379->165  297->124  preset=slower (satd without early termination)
This is the first code in x264 that intentionally produces different behavior on different cpus: satd_x9 is implemented only on ssse3+ and checks all intra directions, whereas the old code (on fast presets) may early terminate after checking only some of them. There is no systematic difference on slow presets, though they still occasionally disagree about tiebreaks. For ease of debugging, add an option "--cpu-independent" to disable satd_x9 and any analogous future code.
-
- 05 Aug, 2011 1 commit
-
-
Loren Merritt authored
Previously required "--asm sse2fast,fastshuffle,sse4.2,avx".
-
- 22 Jul, 2011 2 commits
-
-
Dan Larkin authored
Necessary for a future trellis mode decision/motion estimation patch. Also add the slowest presets to the regression test.
-
Anton Mitrofanov authored
-
- 10 Jul, 2011 2 commits
-
-
xvidfan authored
Much less efficient than YUV444, but easy to support using the YUV444 framework.
-
Fiona Glaser authored
-
- 13 Jun, 2011 1 commit
-
-
Hii authored
-
- 13 Apr, 2011 1 commit
-
-
Fiona Glaser authored
This option is now required for Blu-ray compatibility. --open-gop bluray is now gone (using bluray-compat and open-gop implies a Blu-ray compatible open-gop). This option doesn't automatically enforce every aspect of Blu-ray compatibility (e.g. resolution, framerate, level).
-
- 12 Apr, 2011 1 commit
-
-
Fiona Glaser authored
-
- 24 Mar, 2011 1 commit
-
-
Steven Walters authored
Big thanks to David Rudie, the original author of this patch.
-
- 25 Jan, 2011 1 commit
-
-
Sean McGovern authored
-
- 10 Jan, 2011 2 commits
-
-
Anton Mitrofanov authored
-
Fiona Glaser authored
-
- 14 Dec, 2010 1 commit
-
-
Vittorio Giovara authored
-
- 07 Dec, 2010 1 commit
-
-
Fiona Glaser authored
-
- 25 Nov, 2010 2 commits
-
-
Alex Wright authored
Since fade analysis is now so fast, weightp 1 now does fade analysis but no reference duplication. This is the opposite of what it used to do (reference duplication but no fade analysis). This also gives weightp's better fade quality to faster presets (up to superfast).
-
Fiona Glaser authored
There's probably no real reason to keep it at 10 anymore, and lowering it allows AQ to pick lower quantizers in really flat areas. Might help on gradients at high quality levels. The previous value of 10 was arbitrary anyway.
-