1. 19 May, 2017 1 commit
  2. 11 Oct, 2015 3 commits
  3. 25 Jul, 2015 6 commits
  4. 23 Feb, 2015 2 commits
  5. 16 Dec, 2014 2 commits
    • aarch64: cabac_encode_{decision,bypass,terminal}_asm · 59b9c252
      Janne Grunau authored
      benchmarks on a Nexus 9 (nvidia denver):
      101.3 cycles in x264_cabac_encode_decision_c,   67105369 runs, 3495 skips
       97.3 cycles in x264_cabac_encode_decision_asm, 67105493 runs, 3371 skips
      132.8 cycles in x264_cabac_encode_terminal_c,    1046950 runs, 1626 skips
      116.1 cycles in x264_cabac_encode_terminal_asm,  1048424 runs, 152 skips
       92.4 cycles in x264_cabac_encode_bypass_c,     16776192 runs, 1024 skips
       89.6 cycles in x264_cabac_encode_bypass_asm,   16776453 runs, 763 skips
      
      Cycle counts are not as stable as one would like. The dynamic code
      optimisation seems to produce different results for small changes in a
      binary. Repeated runs with the same binary produce stable results,
      though (ignoring the first run).
    • aarch64: nal_escape_neon · fa7e9d3d
      Janne Grunau authored
      3-4 times faster.
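For context, here is a minimal scalar sketch of the operation this commit speeds up, assuming nal_escape performs the standard H.264 emulation prevention: a 0x03 byte is inserted whenever two consecutive zero bytes would otherwise be followed by a byte in the range 0x00 to 0x03. The function name `nal_escape_sketch` is hypothetical and the code is not taken from x264. The NEON version in the commit is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar sketch of H.264 emulation prevention.
 * Copies src to dst, inserting 0x03 after any two consecutive zero
 * bytes that are followed by a byte <= 0x03.
 * dst needs room for src_len + src_len / 2 bytes in the worst case.
 * Returns the number of bytes written to dst. */
size_t nal_escape_sketch(uint8_t *dst, const uint8_t *src, size_t src_len)
{
    assert(dst != NULL && (src != NULL || src_len == 0));

    size_t out = 0;
    int zeros = 0;                  /* consecutive zero bytes already emitted */

    for (size_t i = 0; i < src_len; i++) {
        if (zeros >= 2 && src[i] <= 0x03) {
            dst[out++] = 0x03;      /* emulation prevention byte */
            zeros = 0;
        }
        dst[out++] = src[i];
        zeros = (src[i] == 0x00) ? zeros + 1 : 0;
    }
    return out;
}
```

The byte-at-a-time loop makes the data dependency obvious: most input contains no zero pairs, so a vectorised version can plausibly scan many bytes at once and fall back to this kind of loop only where an escape is needed.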
  6. 17 Oct, 2014 1 commit
  7. 26 Aug, 2014 7 commits
  8. 20 Jul, 2014 6 commits
  9. 30 Oct, 2013 2 commits
  10. 03 Sep, 2013 1 commit
  11. 20 May, 2013 1 commit
  12. 23 Apr, 2013 1 commit
    • OpenCL lookahead · f49a1b2e
      Steve Borho authored
      OpenCL support is compiled in by default, but must be enabled at runtime with the
      --opencl command line flag. Compiling OpenCL support requires perl. To avoid
      the perl requirement, use: configure --disable-opencl.
      
      When enabled, the lookahead thread is mostly off-loaded to an OpenCL capable GPU
      device.  Lowres intra cost prediction, lowres motion search (including subpel)
      and bidir cost predictions are all done on the GPU.  MB-tree and final slice
      decisions are still done by the CPU.  Presets which do not use a threaded
      lookahead will not use OpenCL at all (superfast, ultrafast).
      
      Because of data dependencies, the GPU must use an iterative motion search which
      performs more total work than the CPU would do, so this is neither work efficient
      nor power efficient. But if there are spare GPU cycles, it can often speed up the
      encode. Output quality with OpenCL lookahead enabled is often very slightly worse
      than with the CPU lookahead (because of the same data dependencies).
      
      x264 must compile its OpenCL kernels for your device before running them, and in
      order to avoid doing this every run it caches the compiled kernel binary in a
      file named x264_lookahead.clbin (--opencl-clbin FNAME to override).  The cache
      file will be ignored if the device, driver, or OpenCL source are changed.
      
      x264 will use the first GPU device which supports the cl_image features
      required by its kernels. Most modern discrete GPUs and all AMD integrated GPUs
      will work.  Intel integrated GPUs (up to IvyBridge) do not support those
      features. Use --opencl-device N to specify a number of capable GPUs to skip
      during device detection.
      
      Switchable graphics environments (e.g. AMD Enduro) are currently not supported,
      as some have bugs in their OpenCL drivers that cause output to be silently
      incorrect.
      
      Developed by MulticoreWare with support from AMD and Telestream.
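The device selection and cache rules described in the OpenCL lookahead message can be sketched roughly as follows. This is an illustration under stated assumptions, not x264's code: the types `gpu_info` and `clbin_header` and the functions `pick_device` and `cache_is_valid` are hypothetical, and the sketch does not call the OpenCL API. It only models the two documented behaviours: `--opencl-device N` skips N capable GPUs, and the cached binary is ignored when the device, driver, or kernel source changes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one enumerated GPU (not an OpenCL type). */
typedef struct {
    const char *name;
    bool has_image_support;   /* stands in for the required cl_image features */
} gpu_info;

/* Return the index of the device to use, or -1 if none qualifies.
 * `skip` models the documented meaning of --opencl-device N:
 * the number of capable GPUs to pass over before picking one. */
int pick_device(const gpu_info *devs, size_t count, int skip)
{
    assert(skip >= 0);
    for (size_t i = 0; i < count; i++) {
        if (!devs[i].has_image_support)
            continue;         /* incapable devices do not count toward skip */
        if (skip-- == 0)
            return (int)i;
    }
    return -1;
}

/* Hypothetical identity record stored with a cached kernel binary. */
typedef struct {
    char device[64];
    char driver[64];
    unsigned source_hash;     /* any digest of the OpenCL source text */
} clbin_header;

/* The cache is usable only if device, driver and source all match. */
bool cache_is_valid(const clbin_header *cached, const clbin_header *current)
{
    return strcmp(cached->device, current->device) == 0
        && strcmp(cached->driver, current->driver) == 0
        && cached->source_hash == current->source_hash;
}
```

With one incapable device followed by two capable ones, a skip count of 0 selects the first capable device and a skip count of 1 selects the second. Any mismatch in the identity record means the kernels are compiled again.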
  13. 23 Apr, 2012 1 commit
  14. 06 Mar, 2012 1 commit
  15. 04 Feb, 2012 2 commits
  16. 15 Jan, 2012 1 commit
    • CABAC trellis opts part 4: x86_64 asm · 7d804baf
      Loren Merritt authored
      Another 20% faster.
      18k->12k codesize.
      
      This patch series may have a large impact on encoding speed.
      For example, 24% faster at --preset slower --crf 23 with 720p parkjoy.
      Overall speed increase is proportional to the cost of trellis (which is proportional to bitrate, and much more with --trellis 2).
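The relationship between the trellis speedup and the overall encode speedup can be approximated with Amdahl's law. The sketch below is not from the commit: the function name `overall_speedup` is hypothetical, and the fraction of encode time spent in trellis is an input you would have to measure for your own settings.

```c
#include <assert.h>

/* Amdahl's law: overall speedup when a fraction of the run time
 * (trellis_fraction, in [0, 1]) becomes trellis_speedup times faster.
 * Both parameters are illustrative inputs, not values from the commit. */
double overall_speedup(double trellis_fraction, double trellis_speedup)
{
    assert(trellis_fraction >= 0.0 && trellis_fraction <= 1.0);
    assert(trellis_speedup > 0.0);
    return 1.0 / ((1.0 - trellis_fraction)
                  + trellis_fraction / trellis_speedup);
}
```

When trellis accounts for none of the run time the result is 1.0 (no gain), and as its share grows the result approaches the trellis speedup itself. This matches the message's point that the gain grows with the cost of trellis.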
  17. 12 Jan, 2012 1 commit
  18. 28 Nov, 2011 1 commit