1. 28 Jul, 2009 1 commit
      Faster bidir_rd plus some bugfixes · b08410d0
      Cache chroma MC during refine_bidir_rd and use both the luma and chroma caches to skip MC in macroblock_encode.
      Fix incorrect call to rd_cost_part; refine_bidir_rd output was incorrect for i8>0.
      Remove some redundant clips.
      ~12% faster refine_bidir_rd.
  2. 27 Jul, 2009 1 commit
      Fix two bugs in QPRD · 9ea7b69d
      fprofile settings now actually fprofile QPRD.
      Don't use i_mbrd before initializing it.
  3. 26 Jul, 2009 3 commits
      Fix 10l in QPRD · 11f50441
      Trellis used wrong lambda with trellis=1
      Fix a nondeterminism with threads and subme>7 · fa3b8139
      Also add a few more checks to eliminate the need for spel_border.
      Add QPRD support as subme=10 · 4304c427
      Refactor trellis lambda selection to be done in analyse_init instead of in trellis.
      This will make lambda easier to adapt later on; for now it allows a constant lambda across variable QPs.
      QPRD is only available with adaptive quantization enabled and generally improves SSIM and visual quality.
      Additionally, weight the SSD values from RD based on the relative QP offset for chroma; helps visually at high QPs where chroma has a lower QP than luma.
      This fixes some visual artifacts created by QPRD at high QPs.
      Note that this generally hurts PSNR and SSIM, and so is only on when psy-RD is on.
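      The chroma SSD weighting described in this commit can be illustrated with a minimal sketch. It assumes the RD lambda^2 scales as 2^(QP/3), so a chroma plane coded "offset" QP steps below luma has its SSD scaled by 2^(offset/3). The helper names, the 8.8 fixed-point table and the supported offset range are illustrative assumptions, not x264's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* 2^(k/3) for k = 0, 1, 2 in 8.8 fixed point (256 == 1.0). */
static const uint32_t frac_weight[3] = { 256, 323, 406 };

/* Hypothetical helper: weight applied to chroma SSD when the chroma QP
 * sits (qp_luma - qp_chroma) steps below the luma QP.
 * Assumption: lambda^2 scales as 2^(QP/3). */
static uint32_t chroma_ssd_weight(int qp_luma, int qp_chroma)
{
    int offset = qp_luma - qp_chroma;
    assert(offset >= 0 && offset <= 24);   /* range chosen for this sketch only */
    return frac_weight[offset % 3] << (offset / 3);
}

/* Combine luma and weighted chroma SSD into one distortion value. */
static uint64_t weighted_rd_ssd(uint64_t ssd_luma, uint64_t ssd_chroma,
                                int qp_luma, int qp_chroma)
{
    uint64_t w = chroma_ssd_weight(qp_luma, qp_chroma);
    return ssd_luma + ((ssd_chroma * w + 128) >> 8);   /* round to nearest */
}
```

      With equal QPs the weight is 1.0 and the result is the plain SSD sum; with chroma three QP steps below luma, chroma distortion counts double.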
      New AQ algorithm option · 2e1db1f6
      "Auto-variance" uses log(var)^2 instead of log(var) and attempts to adapt strength per-frame.
      Generates significantly better SSIM; on by default with --tune ssim.
      Whether it generates visually better quality is still up for debate.
      Available as --aq-mode 2.
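      As a rough illustration of the difference between the two AQ energy terms, the sketch below computes a per-block QP offset from log(var) (mode 1) or log(var)^2 (mode 2). Everything beyond those two terms is an assumption: the piecewise-linear log2 approximation, the +1 bias, and centring each offset on the frame mean (used here as a stand-in for per-frame adaptation) are not taken from x264.

```c
#include <assert.h>
#include <stdint.h>

/* Piecewise-linear log2 approximation for x >= 1 (exact at powers of two). */
static double approx_log2(uint32_t x)
{
    assert(x >= 1);
    int e = 0;
    while (e < 31 && (x >> (e + 1)) != 0)
        e++;
    return e + (double)(x - (1u << e)) / (double)(1u << e);
}

/* Energy term for one block: log(var) in mode 1, log(var)^2 in mode 2. */
static double aq_energy(uint32_t var, int aq_mode)
{
    uint32_t v = var < UINT32_MAX ? var + 1 : var;  /* +1 keeps var == 0 defined */
    double l = approx_log2(v);
    return aq_mode == 2 ? l * l : l;
}

/* Hypothetical per-frame pass: offsets are centred on the frame's mean
 * energy, so a uniform frame gets no QP change at all. */
static void aq_qp_offsets(const uint32_t *var, int n, int aq_mode,
                          double strength, double *qp_offset)
{
    assert(n > 0);
    double mean = 0.0;
    for (int i = 0; i < n; i++)
        mean += aq_energy(var[i], aq_mode);
    mean /= n;
    for (int i = 0; i < n; i++)
        qp_offset[i] = strength * (aq_energy(var[i], aq_mode) - mean);
}
```

      Squaring the log term widens the gap between flat and busy blocks, so the same strength moves QP further in mode 2 than in mode 1.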
      Totally new preset system for x264.c (not libx264), new defaults · 71b9d885
      Other new features include "tune" and "profile" settings; see --help for more details.
      Unlike most other settings, "preset" and "tune" act before all other options.
      However, "profile" acts afterwards, overriding all other options.
      Our defaults have also changed: new defaults are --subme 7 --bframes 3 --8x8dct --no-psnr --no-ssim --threads auto --ref 3 --mixed-refs --trellis 1 --weightb --crf 23 --progress.
      Users will hopefully find that these changes greatly improve usability.
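      The ordering described above (preset and tune first, explicit options next, profile last) can be sketched as a three-stage pipeline. The struct, function names and the "fast" preset values are hypothetical and only illustrate the precedence. The defaults mirror the list above, CABAC being on by default is an assumption, and the baseline restrictions (no B-frames, no 8x8 transform, no CABAC) follow the H.264 Baseline profile.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct { int subme, bframes, dct8x8, ref, trellis, cabac; } sketch_params;
typedef struct { const char *name; int value; } sketch_option;

static void params_default(sketch_params *p)
{
    assert(p != NULL);
    p->subme = 7; p->bframes = 3; p->dct8x8 = 1;
    p->ref = 3;   p->trellis = 1; p->cabac = 1;   /* cabac default: assumption */
}

/* Stage 1: presets (and tunes) rewrite the defaults. Values are made up. */
static void apply_preset(sketch_params *p, const char *preset)
{
    if (!strcmp(preset, "fast")) { p->subme = 5; p->ref = 2; }
}

/* Stage 2: explicit user options override whatever the preset chose. */
static void apply_option(sketch_params *p, const char *name, int value)
{
    if      (!strcmp(name, "subme"))   p->subme = value;
    else if (!strcmp(name, "bframes")) p->bframes = value;
    else if (!strcmp(name, "8x8dct"))  p->dct8x8 = value;
    else if (!strcmp(name, "ref"))     p->ref = value;
    else if (!strcmp(name, "trellis")) p->trellis = value;
}

/* Stage 3: the profile has the final word and clamps everything else. */
static void apply_profile(sketch_params *p, const char *profile)
{
    if (!strcmp(profile, "baseline")) { p->bframes = 0; p->dct8x8 = 0; p->cabac = 0; }
}

static sketch_params build_params(const char *preset, const sketch_option *opts,
                                  int n_opts, const char *profile)
{
    sketch_params p;
    params_default(&p);
    if (preset)  apply_preset(&p, preset);
    for (int i = 0; i < n_opts; i++)
        apply_option(&p, opts[i].name, opts[i].value);
    if (profile) apply_profile(&p, profile);
    return p;
}
```

      In this sketch, asking for 5 B-frames together with the baseline profile still yields 0 B-frames, because the profile stage runs after the explicit option.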
      Early termination for chroma encoding · 205a032c
      Faster chroma encoding by terminating early if heuristics indicate that the block will be DC-only.
      This works because the vast majority of inter chroma blocks have no coefficients at all, and those that do are almost always DC-only.
      Add two new helper DSP functions for this: dct_dc_8x8 and var2_8x8.  mmx/sse2/ssse3 versions of each.
      Early termination is disabled at very low QPs, where it is not useful.
      Performance increase is ~1-2% without trellis, up to 5-6% with trellis=2.
      Increase is greater with lower bitrates.
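      A minimal sketch of the kind of check this commit describes: measure how much AC energy the 8x8 chroma residual holds and, if it is small, compute only the DC terms of the four 4x4 sub-blocks. The helper names, the variance formula and the threshold are assumptions for illustration. They are plain C stand-ins and not the dct_dc_8x8 and var2_8x8 DSP functions themselves.

```c
#include <assert.h>
#include <stdint.h>

/* Variance-like score of the 8x8 residual (source minus prediction). */
static int residual_var_8x8(const uint8_t *src, int src_stride,
                            const uint8_t *pred, int pred_stride)
{
    int sum = 0, sqr = 0;
    for (int y = 0; y < 8; y++)
        for (int x = 0; x < 8; x++) {
            int d = src[y * src_stride + x] - pred[y * pred_stride + x];
            sum += d;
            sqr += d * d;
        }
    return sqr - (sum * sum) / 64;
}

/* Unnormalised DC term (residual sum) of each 4x4 sub-block. */
static void residual_dc_8x8(int dc[4], const uint8_t *src, int src_stride,
                            const uint8_t *pred, int pred_stride)
{
    for (int i = 0; i < 4; i++) {
        int bx = (i & 1) * 4, by = (i >> 1) * 4, sum = 0;
        for (int y = 0; y < 4; y++)
            for (int x = 0; x < 4; x++)
                sum += src[(by + y) * src_stride + bx + x]
                     - pred[(by + y) * pred_stride + bx + x];
        dc[i] = sum;
    }
}

/* Returns 1 and fills dc[] when the block looks DC-only; returns 0 when
 * the full transform path should run. The threshold is hypothetical and
 * would in practice depend on QP. */
static int chroma_dc_only(const uint8_t *src, int src_stride,
                          const uint8_t *pred, int pred_stride,
                          int threshold, int dc[4])
{
    assert(threshold >= 0);
    if (residual_var_8x8(src, src_stride, pred, pred_stride) >= threshold)
        return 0;
    residual_dc_8x8(dc, src, src_stride, pred, pred_stride);
    return 1;
}
```

      A residual that is a constant offset has zero variance, so it takes the cheap path; a single outlier pixel pushes the score over the threshold and falls back to the full path.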
      Various CABAC optimizations and cleanups · 90bec46b
      Faster CABAC CBF context calculation for inter blocks.
      Add x264_constant_p(), will probably be useful in the future as well.
      Simpler subpartition functions.
      Clean up and optimize mvd_cpn a bit more.
      Various other minor optimizations.
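      For readers unfamiliar with the idea behind a helper like x264_constant_p(), the sketch below shows one plausible shape: a thin wrapper over the GCC/Clang builtin __builtin_constant_p with a fallback of 0 elsewhere. The macro name sketch_constant_p and the example function are hypothetical, and whether x264 defines its helper this way is an assumption.

```c
#include <assert.h>

/* Evaluates to nonzero only when the compiler can prove x is a
 * compile-time constant; other compilers always take the generic path. */
#if defined(__GNUC__)
#define sketch_constant_p(x) __builtin_constant_p(x)
#else
#define sketch_constant_p(x) 0
#endif

/* Hypothetical use: pick a shift when the divisor is a known power of
 * two, otherwise fall back to a real division. Both paths return the
 * same value, so the choice affects only the generated code. */
static inline unsigned div_by(unsigned v, unsigned d)
{
    assert(d != 0);
    if (sketch_constant_p(d) && (d & (d - 1)) == 0) {
        unsigned shift = 0;
        while ((1u << shift) < d)
            shift++;
        return v >> shift;
    }
    return v / d;
}
```

      Because the result is identical on either path, the macro is safe to use as a pure optimization hint: code stays correct even when the compiler cannot prove the argument constant.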
      Various CABAC optimizations · 2bcc39fd
      Move calculation of b_intra out of the core residual loop and hardcode it where applicable.
      Inlining cabac_mb_mvd was unnecessary and wasted tremendous amounts of code size.  Inlining only cache_mvd is faster and significantly smaller.
      Faster CABAC RDO · be3c3d21
      Since the bypass case is quite unlikely, especially when doing merged sigmap/level coding,
      it's faster to use a branch than a cmov.
