1. 26 Feb, 2009 1 commit
  2. 09 Feb, 2009 1 commit
  3. 20 Jan, 2009 1 commit
      Eliminate support for direct_8x8_inference=0 · 1f0e78d8
      Fiona Glaser authored
      The benefit, even in the most extreme contrived situation, was at most 0.001 dB PSNR, at the cost of slower decoding.
      As this option was basically useless, it was a waste of code and prevented some other useful optimizations.
      Remove some unused mc code related to sub-8x8 partitions.
      Small deblocking speedup when p4x4 is used.
      Also remove unused x264_nal_decode prototype from x264.h.
  4. 22 Dec, 2008 1 commit
  5. 29 Nov, 2008 1 commit
  6. 11 Nov, 2008 1 commit
  7. 10 Nov, 2008 1 commit
  8. 09 Nov, 2008 1 commit
      Faster b-adapt + adaptive quantization · 0c841de6
      Fiona Glaser authored
      Factor out pow to be only called once per macroblock.  Speeds up b-adapt, especially b-adapt 2, considerably.
      Speed boost is as high as 24% with b-adapt 2 + b-frames 16.
  9. 22 Oct, 2008 1 commit
  10. 02 Oct, 2008 1 commit
  11. 17 Sep, 2008 1 commit
  12. 16 Sep, 2008 1 commit
      Cache motion vectors in lowres lookahead · c299b7d8
      Fiona Glaser authored
      This vastly speeds up b-adapt 2, especially at large bframes values.
      This changes output because now MV prediction in lookahead only uses L0/L1 MVs, not bidir.  This isn't a problem, since the bidir prediction wasn't really correct to begin with, so the change in output is neither positive nor negative.
      This also allowed the removal of some unnecessary memsets, which should also give a small speed boost.
      Finally, this allows the use of the lowres motion vectors for predictors in some future patch.
  13. 15 Sep, 2008 1 commit
      Add optional more optimal B-frame decision method · 95ed2720
      Fiona Glaser authored
      This method (--b-adapt 2) uses a Viterbi algorithm somewhat similar to that used in trellis quantization.
      Note that it is not fully optimized and is very slow with large --bframes values.
      It also takes into account weightb, which should improve fade detection.
      Additionally, changes were made to cache lowres intra results for each frame to avoid recalculating them.  This should improve performance in both B-frame decision methods.
      This can also be done for motion vectors, which will dramatically improve b-adapt 2 performance when it is complete.
      This patch also reads b_adapt and scenecut settings from the first pass so that the x264 header information in the output file will have correct information (since frametype decision is only done on the first pass).
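The Viterbi-style search mentioned in the --b-adapt 2 commit can be illustrated with a small dynamic program. This is a simplified sketch, not x264's actual code: `choose_frame_types`, `gop_cost_fn`, and `toy_cost` are hypothetical names, and the cost model is invented purely for the example. It assumes frame 0 is an already-coded reference and that a callback returns the cost of coding frame `p1` as a P-frame predicted from `p0`, with every frame in between coded as a B-frame.

```c
#include <assert.h>
#include <limits.h>

#define MAX_FRAMES 64

/* Hypothetical cost callback: cost of coding frame p1 as a P-frame that
 * references frame p0, with frames p0+1 .. p1-1 coded as B-frames. */
typedef int (*gop_cost_fn)(int p0, int p1);

/* Made-up cost model for the example: longer runs of B-frames make the
 * closing P-frame progressively more expensive. */
static int toy_cost(int p0, int p1)
{
    int d = p1 - p0;
    return 20 + 4 * (d - 1) * (d - 1);
}

/* Chooses 'P' or 'B' for frames 1..n (frame 0 is the existing reference)
 * so that the summed cost is minimal. Returns the minimal total cost. */
static int choose_frame_types(int n, int max_bframes, gop_cost_fn cost, char *types)
{
    int best[MAX_FRAMES + 1]; /* best[j]: cheapest way to end a P-frame at j */
    int prev[MAX_FRAMES + 1]; /* prev[j]: previous P-frame on that path */

    assert(n >= 1 && n <= MAX_FRAMES && max_bframes >= 0);
    best[0] = 0;
    prev[0] = 0;

    for (int j = 1; j <= n; j++) {
        int first = j - 1 - max_bframes;
        if (first < 0)
            first = 0;
        best[j] = INT_MAX;
        for (int i = first; i < j; i++) {
            int c = best[i] + cost(i, j);
            if (c < best[j]) {
                best[j] = c;
                prev[j] = i;
            }
        }
    }

    /* Walk the best path backwards and label the frames. */
    for (int j = n; j > 0; j = prev[j]) {
        types[j] = 'P';
        for (int k = prev[j] + 1; k < j; k++)
            types[k] = 'B';
    }
    return best[n];
}
```

The inner loop visits up to `max_bframes + 1` predecessors per frame, and each visit needs a cost evaluation. That is consistent with the commit's note that the method is slow at large --bframes values unless intermediate results are cached.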
  14. 14 Sep, 2008 1 commit
      Move adaptive quantization to before ratecontrol, eliminate qcomp bias · 80458ffc
      Fiona Glaser authored
      This change improves VBV accuracy and improves bit distribution in CRF and 2pass.
      Instead of being applied after ratecontrol, AQ becomes part of the complexity measure that ratecontrol uses.
      This allows for modularity for changes to AQ; a new AQ algorithm can be introduced simply by introducing a new aq_mode and a corresponding if in adaptive_quant_frame.
      This also allows quantizer field smoothing, since quantizers are calculated beforehand rather than during encoding.
      Since there is no more reason for it, aq_mode 1 is removed.  The new mode 1 is in a sense a merger of the old modes 1 and 2.
      WARNING: This change redefines CRF when using AQ, so output bitrate for a given CRF may be significantly different from before this change!
  15. 21 Aug, 2008 2 commits
  16. 19 Aug, 2008 2 commits
  17. 16 Aug, 2008 1 commit
  18. 15 Aug, 2008 1 commit
      Faster deblocking · ddee314e
      Fiona Glaser authored
      Early termination for bS=0, alpha=0, beta=0
      Refactoring, various other optimizations
      About 30% faster deblocking overall.
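The early-termination idea in the deblocking commit follows from the H.264 filter conditions: a sample is filtered only if bS is nonzero and the differences |p0-q0|, |p1-p0|, and |q1-q0| are below alpha, beta, and beta respectively. With alpha or beta equal to zero those comparisons can never succeed, so the whole edge can be skipped. The sketch below illustrates that check; the function names are hypothetical and this is not x264's implementation.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical edge-level check: returns 0 when no sample on the edge can
 * possibly be modified, so the caller can skip the edge entirely. */
static int edge_needs_filtering(const uint8_t bs[4], int alpha, int beta)
{
    /* |x| < 0 is never true, so a zero threshold disables the filter. */
    if (alpha == 0 || beta == 0)
        return 0;
    /* All four boundary strengths zero: nothing to filter on this edge. */
    return (bs[0] | bs[1] | bs[2] | bs[3]) != 0;
}

/* Per-sample condition from the H.264 deblocking filter (bS != 0 assumed). */
static int sample_needs_filtering(int p1, int p0, int q0, int q1,
                                  int alpha, int beta)
{
    return abs(p0 - q0) < alpha &&
           abs(p1 - p0) < beta &&
           abs(q1 - q0) < beta;
}
```

Doing the edge-level test first means the per-sample comparisons only run for edges that might actually change.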
  19. 24 Jul, 2008 1 commit
  20. 04 Jul, 2008 1 commit
      Update file headers throughout x264 · bdbd4fe7
      Fiona Glaser authored
      Update "Authors" lists based on actual authorship; highest is most important
      Update copyright notices and remove old CVS tags from file headers
      Add file headers to GTK and other sections missing them
      Update FSF address
      Other header-related cosmetics
  21. 02 Jul, 2008 1 commit
      lowres_init asm · 04dc2536
      Loren Merritt authored
      rounding is changed for asm convenience. this makes the c version slower, but there's no way around that if all the implementations are to have the same results.
  22. 29 Jun, 2008 1 commit
  23. 24 Jun, 2008 1 commit
      Convert NNZ to raster order and other optimizations · ec3d0955
      Fiona Glaser authored
      Converting NNZ to raster order simplifies a lot of the load/store code and allows more use of write-combining.
      More use of write-combining throughout load/save code in common/macroblock.c
      GCC has aliasing issues in the case of stores to 8-bit heap-allocated arrays; dereferencing the pointer once avoids this problem and significantly increases performance.
      More manual loop unrolling and such.
      Move all packXtoY functions to macroblock.h so any function can use them.
      Add pack8to32.
      Minor optimizations to encoder/macroblock.c
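Two of the techniques named in the NNZ commit can be sketched together: packing four 8-bit values into one 32-bit word so a row is written with a single store, and reading a heap pointer into a local variable once so that byte stores do not force it to be reloaded. The struct and function names below are hypothetical, and the packing assumes little-endian byte order; this is an illustration rather than the code in common/macroblock.h.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Sketch of a pack8to32-style helper (little-endian layout assumed). */
static uint32_t pack8to32_sketch(uint32_t a, uint32_t b, uint32_t c, uint32_t d)
{
    return a + (b << 8) + (c << 16) + (d << 24);
}

/* Hypothetical context holding a heap-allocated array of 8-bit counts. */
typedef struct {
    uint8_t *nnz;
} mb_sketch_t;

/* Sets rows*4 consecutive entries to `value`, one 32-bit store per row. */
static void fill_nnz_rows(mb_sketch_t *mb, int rows, uint8_t value)
{
    /* Read the pointer once. uint8_t is normally a character type, which C
     * lets alias any object, so writing through mb->nnz[i] directly may make
     * the compiler reload mb->nnz after every store. */
    uint8_t *nnz = mb->nnz;
    uint32_t word = pack8to32_sketch(value, value, value, value);

    assert(rows >= 0);
    for (int y = 0; y < rows; y++)
        memcpy(nnz + 4 * y, &word, sizeof(word)); /* replaces four byte stores */
}
```

With the entries in raster order, the four values of a row are contiguous in memory, which is what makes the single wide store possible.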
  24. 15 Jun, 2008 1 commit
      Cosmetics and loop unrolling · dba0e5a2
      Fiona Glaser authored
      GCC is not very good at loop unrolling in cases where it can perform constant propagation, so the unrolling unfortunately has to be done manually.
  25. 11 Jun, 2008 1 commit
  26. 08 Jun, 2008 1 commit
      many changes to which asm functions are enabled on which cpus. · c0c0e1f4
      Loren Merritt authored
      with Phenom, 3dnow is no longer equivalent to "sse2 is slow", so make a new flag for that.
      some sse2 functions are useful only on Core2 and Phenom, so make a "sse2 is fast" flag for that.
      some ssse3 instructions didn't become useful until Penryn, so yet another flag.
      disable sse2 completely on Pentium M and Core1, because it's uniformly slower than mmx.
      enable some sse2 functions on Athlon64 that always were faster and we just didn't notice.
      remove mc_luma_sse3, because the only cpu that has lddqu (namely Pentium 4D) doesn't have "sse2 is fast".
      don't print mmx1, sse1, nor 3dnow in the detected cpuflags, since we don't really have any such functions. likewise don't print sse3 unless it's used (Pentium 4D).
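One way to read the flag scheme described in the CPU-flags commit is as three classes of SSE2 routine: those that always beat MMX, those that win unless the CPU is flagged "sse2 is slow", and those that win only when it is flagged "sse2 is fast". The sketch below models that selection. All flag names and values are hypothetical and are not taken from x264's headers; a CPU such as Pentium M, where SSE2 is disabled completely, would simply not have `CPU_SSE2` set.

```c
/* Hypothetical capability flags (values chosen for the sketch only). */
enum {
    CPU_MMX       = 1 << 0,
    CPU_SSE2      = 1 << 1,
    CPU_SSE2_SLOW = 1 << 2, /* SSE2 present but generally slower than MMX */
    CPU_SSE2_FAST = 1 << 3  /* SSE2 fast enough for the marginal routines */
};

typedef enum { IMPL_C, IMPL_MMX, IMPL_SSE2 } impl_t;

/* How a given routine's SSE2 version compares with its MMX version. */
typedef enum {
    SSE2_ALWAYS_WINS,
    SSE2_WINS_UNLESS_SLOW,
    SSE2_WINS_ONLY_IF_FAST
} sse2_class_t;

/* Picks the implementation to install for one routine. */
static impl_t pick_impl(unsigned cpu, sse2_class_t cls)
{
    impl_t fallback = (cpu & CPU_MMX) ? IMPL_MMX : IMPL_C;

    if (!(cpu & CPU_SSE2))
        return fallback;

    switch (cls) {
    case SSE2_ALWAYS_WINS:
        return IMPL_SSE2;
    case SSE2_WINS_UNLESS_SLOW:
        return (cpu & CPU_SSE2_SLOW) ? fallback : IMPL_SSE2;
    case SSE2_WINS_ONLY_IF_FAST:
        return (cpu & CPU_SSE2_FAST) ? IMPL_SSE2 : fallback;
    }
    return fallback;
}
```

Separate "slow" and "fast" flags leave a middle ground: a CPU with neither flag still gets the routines whose SSE2 versions win unless SSE2 is slow, but not the ones that need fast SSE2.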
  27. 02 Jun, 2008 1 commit
      2-pass VBV support and improved VBV handling · 56f2bc89
      Gabriel Bouvigne authored
      Dramatically improves 1-pass VBV ratecontrol (especially CBR) and provides support for VBV in 2-pass mode.  This consists of a series of functions that attempt to find overflows and underflows in the VBV from the first-pass statsfile and fix them before encoding.
      1-pass VBV code partially by Fiona Glaser.
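Finding VBV underflows from first-pass frame sizes can be pictured as a simple buffer simulation. The sketch below is a simplified model under stated assumptions, not the ratecontrol code from this commit: the function name and parameters are hypothetical, each frame is removed from the buffer in one step, the buffer then refills by a constant number of bits per frame interval, and overflow is handled by clamping to the buffer size.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the index of the first frame that underflows the buffer,
 * or -1 if the whole sequence fits. All sizes are in bits. */
static int find_first_underflow(const int64_t *frame_bits, int n,
                                int64_t buffer_size, int64_t initial_fill,
                                int64_t bits_in_per_frame)
{
    int64_t fill = initial_fill;

    assert(n >= 0 && buffer_size >= 0 && bits_in_per_frame >= 0);
    for (int i = 0; i < n; i++) {
        fill -= frame_bits[i];      /* the decoder removes the frame */
        if (fill < 0)
            return i;               /* frame i did not arrive in time */
        fill += bits_in_per_frame;  /* channel refills the buffer */
        if (fill > buffer_size)
            fill = buffer_size;     /* overflow: the buffer cannot hold more */
    }
    return -1;
}
```

A 2-pass encoder could run a check like this over the planned frame sizes and reduce the sizes leading up to the reported frame before encoding begins.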
  28. 18 May, 2008 1 commit
  29. 27 Apr, 2008 1 commit
  30. 17 Apr, 2008 1 commit
  31. 13 Apr, 2008 1 commit
  32. 16 Mar, 2008 1 commit
  33. 27 Jan, 2008 2 commits
  34. 20 Nov, 2007 1 commit
  35. 16 Nov, 2007 1 commit
  36. 15 Nov, 2007 1 commit
  37. 24 Sep, 2007 1 commit