  1. 06 Mar, 2019 1 commit
  2. 06 Aug, 2018 2 commits
  3. 17 Jan, 2018 1 commit
  4. 24 Dec, 2017 1 commit
  5. 24 Jun, 2017 1 commit
  6. 14 Jun, 2017 2 commits
  7. 21 May, 2017 2 commits
  8. 19 May, 2017 1 commit
    • osdep: Rework alignment macros · d13b4c3a
      Henrik Gramner authored
      Drop ALIGNED_N and ALIGNED_ARRAY_N in favor of using explicit alignment.
      
      This will allow us to increase the native alignment without unnecessarily
      increasing the alignment of everything that's currently 32-byte aligned.
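The difference between a "native" alignment macro and explicit per-declaration alignment can be sketched roughly as follows. This is only an illustration using C11 `_Alignas`; the macro names `SKETCH_ALIGNED_16` and `SKETCH_ALIGNED_32` and the helper functions are hypothetical and are not x264's actual definitions, which are compiler-specific.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical macros for illustration only. Each declaration states the
 * alignment it really needs instead of inheriting one global "native" value. */
#define SKETCH_ALIGNED_16(decl) _Alignas(16) decl
#define SKETCH_ALIGNED_32(decl) _Alignas(32) decl

/* Returns 1 if ptr is a multiple of align bytes, 0 otherwise. */
static int sketch_is_aligned(const void *ptr, uintptr_t align)
{
    assert(align != 0);
    return ((uintptr_t)ptr % align) == 0;
}

/* Declares two buffers with different explicit alignments and checks both. */
static int sketch_alignment_demo(void)
{
    SKETCH_ALIGNED_16(uint8_t pixel_buf[16]);
    SKETCH_ALIGNED_32(int16_t coeff_buf[64]);
    pixel_buf[0] = 0;
    coeff_buf[0] = 0;
    return sketch_is_aligned(pixel_buf, 16) && sketch_is_aligned(coeff_buf, 32);
}
```

With this style, raising a global native alignment later would not silently change buffers that only ever needed 16 or 32 bytes.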
  9. 21 Jan, 2017 1 commit
  10. 01 Dec, 2016 1 commit
    • Cosmetics · b2b39dae
      Anton Mitrofanov authored
      Also make x264_weighted_reference_duplicate() static.
  11. 20 Apr, 2016 1 commit
  12. 16 Jan, 2016 1 commit
  13. 25 Jul, 2015 1 commit
  14. 23 Feb, 2015 1 commit
  15. 12 Dec, 2014 1 commit
  16. 13 Mar, 2014 1 commit
    • Macroblock tree overhaul/optimization · b3fb7184
      Fiona Glaser authored
      Move the second core part of macroblock tree into an assembly function;
      SIMD-optimize roughly half of it (for x86). About 25-65% faster mbtree,
      depending on content.
      
      Slightly change how mbtree handles the tradeoff between range and precision
      for propagation.
      
      Overall a slight (but mostly negligible) effect on SSIM and ~2% faster.
  17. 11 Mar, 2014 1 commit
  18. 08 Jan, 2014 1 commit
  19. 23 Aug, 2013 1 commit
    • Transparent hugepage support · fa1e2b74
      Henrik Gramner authored
      Combine frame and mb data mallocs into a single large malloc.
      Additionally, on Linux systems with hugepage support, ask for hugepages on
      large mallocs.
      
      This gives a small performance improvement (~0.2-0.9%) on systems without
      hugepage support, as well as a small memory footprint reduction.
      
      On recent Linux kernels with hugepage support enabled (set to madvise or
      always), it improves performance up to 4% at the cost of about 7-12% more
      memory usage on typical settings.
      
      It may help even more on Haswell and other recent CPUs with improved 2MB page
      support in hardware.
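Asking for transparent hugepages on large allocations might look something like the sketch below. It is not the code from this commit: the function name `sketch_alloc`, the 2 MB page size, the 7/8 threshold, and the 64-byte alignment for small blocks are all assumptions chosen for illustration. `posix_memalign` and `madvise` with `MADV_HUGEPAGE` are the standard POSIX/Linux calls; the `madvise` call is only a hint, so its result is ignored.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* exposes MADV_HUGEPAGE in glibc headers */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#ifdef __linux__
#include <sys/mman.h>
#endif

/* Assumed values, not taken from the commit. */
#define SKETCH_HUGE_PAGE_SIZE ((size_t)2 << 20)                /* 2 MB */
#define SKETCH_HUGE_THRESHOLD (SKETCH_HUGE_PAGE_SIZE * 7 / 8)  /* hypothetical cutoff */
#define SKETCH_SMALL_ALIGN    ((size_t)64)

/* Allocates size bytes. Large requests are aligned to the huge page size and
 * the kernel is asked (not required) to back them with huge pages.
 * The returned pointer is released with free(). */
static void *sketch_alloc(size_t size)
{
    void *ptr = NULL;
    assert(size > 0);

    if (size >= SKETCH_HUGE_THRESHOLD) {
        /* Round up to whole huge pages; this padding costs extra memory. */
        size_t padded = (size + SKETCH_HUGE_PAGE_SIZE - 1)
                        & ~(SKETCH_HUGE_PAGE_SIZE - 1);
        if (posix_memalign(&ptr, SKETCH_HUGE_PAGE_SIZE, padded) != 0)
            return NULL;
#if defined(__linux__) && defined(MADV_HUGEPAGE)
        /* Advisory only: fails harmlessly if hugepages are unavailable. */
        (void)madvise(ptr, padded, MADV_HUGEPAGE);
#endif
        return ptr;
    }

    if (posix_memalign(&ptr, SKETCH_SMALL_ALIGN, size) != 0)
        return NULL;
    return ptr;
}
```

Rounding large requests up to whole huge pages is one plausible source of the extra memory use mentioned above, though the commit message does not say where that overhead comes from.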
  20. 23 Apr, 2013 2 commits
  21. 09 Jan, 2013 1 commit
  22. 18 May, 2012 1 commit
    • Threaded lookahead · df700eae
      Fiona Glaser authored
      Split each lookahead frame analysis call into multiple threads.  Has a small
      impact on quality, but does not seem to be consistently any worse.
      
      This helps alleviate bottlenecks with many cores and frame threads. In many
      cases, this massively increases performance on many-core systems.  For example,
      over 100% faster 1080p encoding with --preset veryfast on a 12-core i7 system.
      Realtime 1080p30 at --preset slow should now be feasible on real systems.
      
      For sliced-threads, this patch should be faster regardless of settings (~10%).
      
      By default, lookahead threads are 1/6 of regular threads.  This isn't exacting,
      but it seems to work well for all presets on real systems.  With sliced-threads,
      it's the same as the number of encoding threads.
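The default described above could be written as a small helper along these lines. The function name `sketch_lookahead_threads` is hypothetical, and clamping the result to at least one thread is an assumption; the commit message only gives the 1/6 ratio and the sliced-threads rule.

```c
#include <assert.h>

/* Hypothetical helper, not x264's actual code.
 * encode_threads: number of regular encoding threads (>= 1).
 * sliced_threads: nonzero if sliced-threads mode is enabled. */
static int sketch_lookahead_threads(int encode_threads, int sliced_threads)
{
    assert(encode_threads >= 1);

    /* With sliced-threads, use as many lookahead threads as encoding threads. */
    if (sliced_threads)
        return encode_threads;

    /* Otherwise use roughly 1/6 of the encoding threads.
     * The lower bound of 1 is an assumption of this sketch. */
    int n = encode_threads / 6;
    return n < 1 ? 1 : n;
}
```

Under these assumptions, 12 frame threads would give 2 lookahead threads, while 8 sliced threads would give 8.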
  23. 07 Mar, 2012 2 commits
    • Sliced-threads: do hpel and deblock after returning · a155572e
      Fiona Glaser authored
      Lowers encoding latency around 14% in sliced threads mode with preset superfast.
      Additionally, even if there is no waiting time between frames, this improves parallelism, because hpel+deblock are done during the (single-threaded) lookahead.
      For ease of debugging, dump-yuv forces all of the threads to wait and finish instead of setting b_full_recon.
    • Add row-reencoding support to VBV for improved accuracy · 2535ba17
      Fiona Glaser authored
      Extremely accurate, possibly 100% so (I can't get it to fail even with difficult VBVs).
      Does not yet support rows split on slice boundaries (occurs often with slice-max-size/mbs).
      Still inaccurate with sliced threads, but better than before.
  24. 06 Mar, 2012 1 commit
    • Fix incorrect zero-extension assumptions in x86_64 asm · 3131a19c
      Henrik Gramner authored
      Some x264 asm assumed that the high 32 bits of registers containing "int" values would be zero.
      This is almost always the case, and it seems to work with gcc, but it is *not* guaranteed by the ABI.
      As a result, it breaks with some other compilers, like Clang, that take advantage of this in optimizations.
      Accordingly, fix all x86 code by using intptr_t instead of int or using movsxd where necessary.
      Also add checkasm hack to detect when assembly functions incorrectly assume that 32-bit integers are zero-extended to 64-bit.
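The failure mode can be modelled in plain C by treating a `uint64_t` as a 64-bit register whose low 32 bits hold an `int` argument and whose upper 32 bits are unspecified. This is only a sketch of the idea: the function names are hypothetical, and the real fix lives in x86 assembly and function prototypes rather than in code like this.

```c
#include <assert.h>
#include <stdint.h>

/* Buggy pattern: uses the whole 64-bit register as if the upper 32 bits
 * were guaranteed to be zero. Any garbage up there corrupts the value. */
static int64_t sketch_use_full_register(uint64_t reg)
{
    return (int64_t)reg;
}

/* Fixed pattern: looks only at the low 32 bits and sign-extends them,
 * which is what the movsxd instruction does. The narrowing conversion to
 * int32_t is implementation-defined in C for values above INT32_MAX;
 * GCC and Clang wrap as two's complement. */
static int64_t sketch_use_movsxd(uint64_t reg)
{
    return (int64_t)(int32_t)(uint32_t)reg;
}

/* Builds a register image with the given int in the low half and
 * arbitrary bits in the high half, for demonstration. */
static uint64_t sketch_make_register(int value, uint32_t garbage_high)
{
    assert(sizeof(int) == 4);
    return ((uint64_t)garbage_high << 32) | (uint32_t)value;
}
```

Declaring the parameter as `intptr_t` instead avoids the problem at the call boundary, since the caller is then required to pass a full 64-bit value on x86_64.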
  25. 04 Feb, 2012 1 commit
  26. 22 Oct, 2011 1 commit
  27. 15 Oct, 2011 1 commit
  28. 21 Sep, 2011 2 commits
  29. 24 Aug, 2011 2 commits
  30. 09 Aug, 2011 1 commit
    • Remove some unused, broken, and/or useless functions · 52f287e8
      Loren Merritt authored
      Unused frame_sort.
      Unused x86_64 dequant_4x4dc_mmx2, predict_8x8_vr_mmx2.
      Unused and broken high_depth integral_init*h_sse4, optimize_chroma_*, dequant_flat_*, sub8x8_dct_dc_*, zigzag_sub_*.
      Useless high_depth dequant_sse4, dequant_dc_sse4.
  31. 22 Jul, 2011 3 commits