  1. 06 Mar, 2019 1 commit
  2. 06 Aug, 2018 3 commits
  3. 17 Jan, 2018 1 commit
  4. 24 Dec, 2017 1 commit
  5. 21 May, 2017 3 commits
  6. 21 Jan, 2017 1 commit
  7. 01 Dec, 2016 1 commit
    • Cosmetics · b2b39dae
      Anton Mitrofanov authored
      Also make x264_weighted_reference_duplicate() static.
  8. 16 Jan, 2016 1 commit
  9. 25 Jul, 2015 2 commits
  10. 23 Feb, 2015 1 commit
  11. 20 Dec, 2014 1 commit
  12. 21 Jan, 2014 1 commit
  13. 08 Jan, 2014 1 commit
  14. 23 Aug, 2013 1 commit
    • Transparent hugepage support · fa1e2b74
      Henrik Gramner authored
      Combine frame and mb data mallocs into a single large malloc.
      Additionally, on Linux systems with hugepage support, ask for hugepages on
      large mallocs.
      
      This gives a small performance improvement (~0.2-0.9%) on systems without
      hugepage support, as well as a small memory footprint reduction.
      
      On recent Linux kernels with hugepage support enabled (set to madvise or
      always), it improves performance up to 4% at the cost of about 7-12% more
      memory usage on typical settings.
      
      It may help even more on Haswell and other recent CPUs with improved 2MB page
      support in hardware.
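
The two ideas in this commit message, one combined allocation and a hugepage hint for large blocks, can be sketched roughly as follows. This is a minimal illustration and not x264's actual code: `plan_layout`, `alloc_maybe_huge`, the 64-byte alignment, the 2 MB page size and the 7/8 threshold are all assumptions made for the example. `posix_memalign` and `madvise(..., MADV_HUGEPAGE)` are the standard POSIX/Linux calls.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* exposes MADV_HUGEPAGE on glibc */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#ifdef __linux__
#include <sys/mman.h>
#endif

#define HUGE_PAGE_SIZE      (2u * 1024 * 1024)        /* assumed 2 MB huge pages */
#define HUGE_PAGE_THRESHOLD (HUGE_PAGE_SIZE * 7 / 8)  /* illustrative cutoff */

/* Hypothetical layout of one combined block: frame data, then mb data. */
typedef struct { size_t frame_off, mb_off, total; } layout_t;

static size_t align_up(size_t n, size_t a)
{
    assert(a && (a & (a - 1)) == 0);   /* alignment must be a power of two */
    return (n + a - 1) & ~(a - 1);
}

/* Compute offsets so that both regions fit in a single allocation. */
static layout_t plan_layout(size_t frame_bytes, size_t mb_bytes)
{
    layout_t l;
    l.frame_off = 0;
    l.mb_off    = align_up(frame_bytes, 64);
    l.total     = l.mb_off + align_up(mb_bytes, 64);
    return l;
}

/* Allocate `size` bytes (size > 0); ask the kernel for huge pages if large. */
static void *alloc_maybe_huge(size_t size)
{
    void *p = NULL;
    if (size >= HUGE_PAGE_THRESHOLD) {
        size_t padded = align_up(size, HUGE_PAGE_SIZE);
        if (posix_memalign(&p, HUGE_PAGE_SIZE, padded))
            return NULL;
#if defined(__linux__) && defined(MADV_HUGEPAGE)
        madvise(p, padded, MADV_HUGEPAGE);  /* only a hint; failure is harmless */
#endif
        return p;
    }
    return posix_memalign(&p, 64, size) ? NULL : p;
}
```

A caller would compute the layout once, call `alloc_maybe_huge(layout.total)`, and derive the frame and mb pointers from the offsets; a single `free()` releases everything. Padding large requests up to a whole number of huge pages is one plausible source of the extra memory use the message mentions.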
  15. 20 May, 2013 1 commit
  16. 23 Apr, 2013 3 commits
    • x86: more AVX2 framework, AVX2 functions, plus some existing asm tweaks · 0ea5be85
      Fiona Glaser authored
      AVX2 functions:
      mc_chroma
      intra_sad_x3_16x16
      last64
      ads
      hpel
      dct4
      idct4
      sub16x16_dct8
      quant_4x4x4
      quant_4x4
      quant_4x4_dc
      quant_8x8
      SAD_X3/X4
      SATD
      var
      var2
      SSD
      zigzag interleave
      weightp
      weightb
      intra_sad_8x8_x9
      decimate
      integral
      hadamard_ac
      sa8d_satd
      sa8d
      lowres_init
      denoise
    • OpenCL lookahead · f49a1b2e
      Steve Borho authored
      OpenCL support is compiled in by default, but must be enabled at runtime by an
      --opencl command line flag. Compiling OpenCL support requires perl. To avoid
      the perl requirement use: configure --disable-opencl.
      
      When enabled, the lookahead thread is mostly off-loaded to an OpenCL capable GPU
      device.  Lowres intra cost prediction, lowres motion search (including subpel)
      and bidir cost predictions are all done on the GPU.  MB-tree and final slice
      decisions are still done by the CPU.  Presets which do not use a threaded
      lookahead will not use OpenCL at all (superfast, ultrafast).
      
      Because of data dependencies, the GPU must use an iterative motion search which
      performs more total work than the CPU would do, so this is not work efficient
      or power efficient. But if there are GPU cycles to spare, it can often speed
      up the encode. Output quality with OpenCL lookahead enabled is often very
      slightly worse than with the CPU lookahead (because of the same data
      dependencies).
      
      x264 must compile its OpenCL kernels for your device before running them, and in
      order to avoid doing this every run it caches the compiled kernel binary in a
      file named x264_lookahead.clbin (--opencl-clbin FNAME to override).  The cache
      file will be ignored if the device, driver, or OpenCL source are changed.
      
      x264 will use the first GPU device which supports the cl_image features
      required by its kernels. Most modern discrete GPUs and all AMD integrated GPUs
      will work.  Intel integrated GPUs (up to IvyBridge) do not support those
      features. Use --opencl-device N to specify a number of capable GPUs to skip
      during device detection.
      
      Switchable graphics environments (e.g. AMD Enduro) are currently not supported,
      as some have bugs in their OpenCL drivers that cause output to be silently
      incorrect.
      
      Developed by MulticoreWare with support from AMD and Telestream.
    • Add slices-max feature · 732e4f7e
      Fiona Glaser authored
      The H.264 spec technically has limits on the number of slices per frame. x264
      normally ignores these limits, since most use-cases that require large numbers
      of slices prefer that it do so. However, certain decoders may break with
      extremely large numbers of slices, as can occur with some
      slice-max-size/mbs settings.
      
      When set, x264 will refuse to create any slices beyond the maximum number,
      even if slice-max-size/mbs requires otherwise.
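
The interaction between a per-slice macroblock limit and a cap on the slice count can be sketched as below. This is an assumption-laden illustration rather than the encoder's real slicing logic: `plan_slices` and its parameters are hypothetical names, only a macroblock-count limit is modelled (not a byte-size limit), and 0 is taken to mean "no cap".

```c
#include <assert.h>
#include <stddef.h>

/* Split total_mbs macroblocks into slices of at most slice_max_mbs each,
 * but never create more than slices_max slices (0 = unlimited). Once the
 * cap is reached, the final slice absorbs all remaining macroblocks, even
 * though that exceeds slice_max_mbs. Returns the number of slices and
 * optionally reports the size of the last one. */
static int plan_slices(int total_mbs, int slice_max_mbs, int slices_max,
                       int *last_slice_mbs)
{
    assert(total_mbs > 0 && slice_max_mbs > 0 && slices_max >= 0);

    int slices = 0, remaining = total_mbs, size = 0;
    while (remaining > 0) {
        slices++;
        int at_cap = slices_max > 0 && slices == slices_max;
        size = (at_cap || remaining < slice_max_mbs) ? remaining
                                                     : slice_max_mbs;
        remaining -= size;
    }
    if (last_slice_mbs)
        *last_slice_mbs = size;
    return slices;
}
```

With 100 macroblocks and a limit of 10 per slice, the uncapped plan yields 10 slices. Capping at 4 yields three slices of 10 and a final slice of 70, which is the "even if slice-max-size/mbs requires otherwise" case.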
  17. 26 Feb, 2013 1 commit
    • x86: detect Bobcat, improve Atom optimizations, reorganize flags · 5d60b9c9
      Fiona Glaser authored
      The Bobcat has a 64-bit SIMD unit reminiscent of the Athlon 64; detect this
      and apply the appropriate flags.
      
      It also has an extremely slow palignr instruction; create a flag for this to
      avoid massive penalties on palignr-heavy functions.
      
      Improve Atom function selection and document exactly what the SLOW_ATOM flag
      covers.
      
      Add Atom-optimized SATD/SA8D/hadamard_ac functions: simply combine the ssse3
      optimizations with the sse2 algorithm to avoid pmaddubsw, which is slow on
      Atom along with other SIMD multiplies.
      
      Drop TBM detection; it'll probably never be useful for x264.
      
      Invert FastShuffle to SlowShuffle; it only ever applied to one CPU (Conroe).
      
      Detect CMOV, to fail more gracefully when run on a chip with MMX2 but no CMOV.
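
How "slow feature" flags of this kind typically steer function selection can be sketched as follows. The flag names, the `impl_t` values and both picker functions are local stand-ins invented for this example; they are not x264's real constants or dispatch tables.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical CPU capability and penalty flags. */
enum {
    CPU_SSE2         = 1 << 0,
    CPU_SSSE3        = 1 << 1,
    CPU_SLOW_PALIGNR = 1 << 2,  /* palignr exists but is very slow */
    CPU_SLOW_ATOM    = 1 << 3   /* SIMD multiplies such as pmaddubsw are slow */
};

typedef enum { IMPL_C, IMPL_SSE2, IMPL_SSSE3, IMPL_SSSE3_ATOM } impl_t;

/* A palignr-heavy routine: fall back to SSE2 when palignr is slow. */
static impl_t pick_palignr_heavy(uint32_t cpu)
{
    assert(!(cpu & CPU_SSSE3) || (cpu & CPU_SSE2)); /* SSSE3 implies SSE2 */
    if ((cpu & CPU_SSSE3) && !(cpu & CPU_SLOW_PALIGNR))
        return IMPL_SSSE3;
    return (cpu & CPU_SSE2) ? IMPL_SSE2 : IMPL_C;
}

/* SATD-like routine: on Atom, prefer a variant that keeps the SSSE3
 * tricks but avoids the slow multiply. */
static impl_t pick_satd(uint32_t cpu)
{
    assert(!(cpu & CPU_SSSE3) || (cpu & CPU_SSE2));
    if (cpu & CPU_SSSE3)
        return (cpu & CPU_SLOW_ATOM) ? IMPL_SSSE3_ATOM : IMPL_SSSE3;
    return (cpu & CPU_SSE2) ? IMPL_SSE2 : IMPL_C;
}
```

Expressing the penalty as a "slow" flag, as the message does when inverting FastShuffle to SlowShuffle, means the common case needs no flag at all and only the affected CPUs are special-cased.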
  18. 09 Jan, 2013 1 commit
  19. 26 Jul, 2012 1 commit
  20. 15 May, 2012 1 commit
  21. 24 Apr, 2012 1 commit
    • Add mb_info API for signalling constant macroblocks · 8e57a9a0
      Fiona Glaser authored
      Some use-cases of x264 involve encoding video with large constant areas of the frame.
      Sometimes, the caller knows which areas these are, and can tell x264.
      This API lets the caller do this and adds internal tracking of modifications to macroblocks to avoid problems.
      This is really only suitable without B-frames.
      An example use-case would be using x264 for VNC.
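
For a screen-sharing style caller, the per-macroblock "constant" map might be built along these lines. This is only a sketch of the caller's side: `mark_constant_mbs` and `MBINFO_CONSTANT` are hypothetical names defined locally, only a single 8-bit plane is compared, and how the resulting array is handed to the encoder should be taken from x264.h rather than from this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MBINFO_CONSTANT 1u   /* local stand-in for the encoder's flag */
#define MB_SIZE 16           /* H.264 macroblocks are 16x16 pixels */

/* Compare two frames of one 8-bit plane and flag every macroblock whose
 * pixels are identical in both. mb_info must hold one byte per macroblock,
 * in raster order. */
static void mark_constant_mbs(const uint8_t *prev, const uint8_t *cur,
                              int width, int height, int stride,
                              uint8_t *mb_info)
{
    assert(width > 0 && height > 0 && stride >= width);
    int mb_w = (width  + MB_SIZE - 1) / MB_SIZE;
    int mb_h = (height + MB_SIZE - 1) / MB_SIZE;

    for (int my = 0; my < mb_h; my++) {
        for (int mx = 0; mx < mb_w; mx++) {
            int x0 = mx * MB_SIZE, y0 = my * MB_SIZE;
            /* Clip partial macroblocks at the right and bottom edges. */
            int w = width  - x0 < MB_SIZE ? width  - x0 : MB_SIZE;
            int h = height - y0 < MB_SIZE ? height - y0 : MB_SIZE;
            int same = 1;
            for (int y = 0; y < h && same; y++) {
                size_t off = (size_t)(y0 + y) * (size_t)stride + (size_t)x0;
                same = memcmp(prev + off, cur + off, (size_t)w) == 0;
            }
            mb_info[my * mb_w + mx] = same ? MBINFO_CONSTANT : 0;
        }
    }
}
```

A real caller would also need to compare the chroma planes, and a VNC-style server that already tracks dirty rectangles could fill the map from those instead of comparing pixels.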
  22. 25 Mar, 2012 1 commit
  23. 07 Mar, 2012 1 commit
    • Sliced-threads: do hpel and deblock after returning · a155572e
      Fiona Glaser authored
      Lowers encoding latency by around 14% in sliced-threads mode with preset superfast.
      Additionally, even if there is no waiting time between frames, this improves parallelism, because hpel+deblock are done during the (single-threaded) lookahead.
      For ease of debugging, dump-yuv forces all of the threads to wait and finish instead of setting b_full_recon.
  24. 06 Mar, 2012 2 commits
    • Fix incorrect zero-extension assumptions in x86_64 asm · 3131a19c
      Henrik Gramner authored
      Some x264 asm assumed that the high 32 bits of registers containing "int" values would be zero.
      This is almost always the case, and it seems to work with gcc, but it is *not* guaranteed by the ABI.
      As a result, it breaks with some other compilers, like Clang, that take advantage of this in optimizations.
      Accordingly, fix all x86 code by using intptr_t instead of int or using movsxd where necessary.
      Also add a checkasm hack to detect when assembly functions incorrectly assume that 32-bit integers are zero-extended to 64-bit.
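
The failure mode can be modelled in plain C by treating a 64-bit register as a value whose low half carries the `int` argument and whose high half holds whatever the caller left there. Everything below is an illustrative model with invented names (`reg_with_garbage`, `use_full_register`, `sign_extend_low32`, `survives_garbage`); it is not the actual assembly or the actual checkasm code. The narrowing casts rely on two's complement wrapping, which mainstream compilers provide.

```c
#include <stdint.h>

/* Model a 64-bit register: low 32 bits hold the int argument, high 32 bits
 * hold arbitrary leftovers, which the ABI permits. */
static uint64_t reg_with_garbage(int32_t value, uint32_t garbage)
{
    return ((uint64_t)garbage << 32) | (uint32_t)value;
}

/* Buggy pattern: use the whole register, assuming the high bits are zero. */
static int64_t use_full_register(uint64_t reg)
{
    return (int64_t)reg;
}

/* Fixed pattern (what movsxd does): sign-extend only the low 32 bits. */
static int64_t sign_extend_low32(uint64_t reg)
{
    return (int64_t)(int32_t)(uint32_t)reg;
}

/* checkasm-style probe: a correct routine returns the same result whether
 * the high half is clean or deliberately filled with junk. */
static int survives_garbage(int64_t (*fn)(uint64_t), int32_t value)
{
    return fn(reg_with_garbage(value, 0)) == value
        && fn(reg_with_garbage(value, 0xDEADBEEFu)) == value;
}
```

In this model `survives_garbage(sign_extend_low32, -16)` holds while `survives_garbage(use_full_register, -16)` does not, which is also why negative strides are the classic victim: a zero-extended -16 becomes 4294967280.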
    • Fix RGB colorspace input · 0fc5acc6
      Anton Mitrofanov authored
      BGR/BGRA input was correct.
  25. 04 Feb, 2012 1 commit
  26. 01 Dec, 2011 1 commit
  27. 22 Oct, 2011 1 commit
  28. 15 Oct, 2011 1 commit
  29. 21 Sep, 2011 1 commit
  30. 24 Aug, 2011 3 commits