  1. 15 Nov, 2009 1 commit
  2. 12 Nov, 2009 1 commit
    • Fix all aliasing violations · 03cb8c09
      Fiona Glaser authored
      New type-punning macros perform write/read-combining without aliasing violations per the second-to-last item of 6.5, paragraph 7 in the C99 specification.
      GCC 4.4, however, doesn't seem to have read this part of the spec and still warns about the violations.
      Regardless, it seems to fix all known aliasing miscompilations, so perhaps the GCC warning generator is just broken.
      As such, add -Wno-strict-aliasing to CFLAGS.
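A minimal sketch of union-based type punning of the kind described above. The names `pun32_t`, `PUN32`, and `copy4` are hypothetical and are not x264's actual macros; the sketch also assumes both buffers are 4-byte aligned.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names for illustration; not x264's actual macros. */
typedef union { uint32_t i; uint8_t c[4]; } pun32_t;

/* Access four bytes as one 32-bit value through the union type. */
#define PUN32(p) (((pun32_t *)(p))->i)

/* Copy four bytes with a single 32-bit load and store (write-combining).
 * Assumption: both pointers are 4-byte aligned. */
static void copy4(uint8_t *dst, const uint8_t *src)
{
    assert(((uintptr_t)dst & 3) == 0 && ((uintptr_t)src & 3) == 0);
    PUN32(dst) = PUN32(src);
}
```

The access goes through an lvalue of a union type that has the byte array among its members, which is the case the cited paragraph permits.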
  3. 09 Nov, 2009 3 commits
    • Weighted P-frame prediction · ccac8546
      Dylan Yudaken authored
      Merge Dylan's Google Summer of Code 2009 tree.
      Detect fades and use weighted prediction to improve compression and quality.
      "Blind" mode provides a small overall quality increase by using a -1 offset without doing any analysis, as described in JVT-AB033.
      "Smart", the default mode, also performs fade detection and decides weights accordingly.
      MB-tree takes into account the effects of "smart" analysis in lookahead, even further improving quality in fades.
      If psy is on, mbtree is on, interlaced is off, and weightp is off, fade detection will still be performed.
      However, it will be used to adjust quality instead of creating actual weights.
      This will improve quality in fades when encoding in Baseline profile.
      
      Doesn't add support for interlaced encoding with weightp yet.
      Only adds support for luma weights, not chroma weights.
      Internal code for chroma weights is in, but there's no analysis yet.
      Baseline profile requires that weightp be off.
      All weightp modes may cause minor breakage in non-compliant decoders that take shortcuts in deblocking reference frame checks.
      "Smart" may cause serious breakage in non-compliant decoders that take shortcuts in handling of duplicate reference frames.
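A rough sketch of how an explicit luma weight is applied to one 8-bit reference sample, following the general shape of H.264 explicit weighted prediction for a single reference. The function names are hypothetical, and treating "blind" mode as a unit weight (`weight == 1 << log2_denom`) with offset -1 is an assumption based on the description above, not x264's code.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp to the 8-bit sample range. */
static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Scale the sample by weight / 2^log2_denom with rounding, then add offset.
 * Hypothetical helper; weights are assumed non-negative in this sketch. */
static uint8_t weight_sample(uint8_t ref, int weight, int log2_denom, int offset)
{
    assert(weight >= 0 && log2_denom >= 0 && log2_denom <= 7);
    if (log2_denom >= 1)
        return clip_u8(((ref * weight + (1 << (log2_denom - 1))) >> log2_denom) + offset);
    return clip_u8(ref * weight + offset);
}
```

With `log2_denom = 5`, a weight of 32 leaves the sample unscaled and only the offset applies, while a weight of 16 halves it, as in a fade toward black.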
      
      Thanks to Google for sponsoring our most successful Summer of Code yet!
    • Steven Walters's avatar
    • Fix large file support, broken in r1302 · f3c9e6f3
      Anton Mitrofanov authored
  4. 25 Oct, 2009 1 commit
  5. 19 Oct, 2009 1 commit
    • Make B-pyramid spec-compliant · cf5ba813
      Lamont Alston authored
      The rules of the specification with regard to picture buffering for pyramid coding are widely ignored.
      x264's b-pyramid implementation, despite being practically identical to that proposed by the original paper, was technically not compliant.
      Now it is.
      Two modes are now available:
      1) strict b-pyramid, while worse for compression, follows the rule mandated by Blu-ray (no P-frames can reference B-frames)
      2) normal b-pyramid, which is like the old mode except fully compliant.
      This patch also adds MMCO support (necessary for compliant pyramid in some cases).
      MB-tree still doesn't support b-pyramid (but will soon).
  6. 12 Oct, 2009 2 commits
    • change all dct arrays to 1d. · 1fbba0ca
      Loren Merritt authored
      the C standard doesn't allow you to iterate 1-dimensionally over 2d arrays, and nothing other than the dsp functions themselves cares about the 2dness of dct.
      this fixes a miscompilation in x264_mb_optimize_chroma_dc.
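a small sketch of the difference, with hypothetical function names: a flat 16-entry array can be walked with a single index, whereas with `int16_t dct[4][4]` indexing `dct[0][4]` and beyond steps past the end of the first row, which the compiler is allowed to assume never happens.

```c
#include <assert.h>
#include <stdint.h>

/* Sum of absolute coefficients over a 4x4 block stored as a flat array.
 * Indexing dct[0..15] with one loop is well-defined for a 1d array. */
static int sum_abs_4x4(const int16_t dct[16])
{
    int sum = 0;
    for (int i = 0; i < 16; i++)
        sum += dct[i] < 0 ? -dct[i] : dct[i];
    return sum;
}

/* A 2d view is still available where row/column structure matters. */
static int16_t coef_at(const int16_t dct[16], int row, int col)
{
    assert(row >= 0 && row < 4 && col >= 0 && col < 4);
    return dct[row * 4 + col];
}
```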
    • Optimize exp2fix8 · 1a1b9c6f
      Anton Mitrofanov authored
      Slightly faster and more accurate rounding.
  7. 07 Oct, 2009 1 commit
    • Constrained intra prediction support · 7639d496
      Fiona Glaser authored
      Enable with --constrained-intra.  Significantly reduces compression, but required for the base layer of SVC encodes and maybe some other use-cases.
      
      Commit sponsored by a media streaming company that wishes to remain anonymous.
  8. 23 Sep, 2009 1 commit
  9. 21 Sep, 2009 2 commits
    • Add intra prediction modes to output stats · bbf573c7
      Fiona Glaser authored
      Also eliminate some NANs in stat output with intra-only encoding.
      Marginal speedup: disable stat calculation if log level is below X264_LOG_INFO.
      Various minor cosmetics.
    • Major API change: encapsulate NALs within libx264 · 7a0fbed7
      Fiona Glaser authored
      libx264 now returns NAL units instead of raw data.  x264_nal_encode is no longer a public function.
      See x264.h for full documentation of changes.
      New parameter: b_annexb, on by default.  If disabled, startcodes are replaced by sizes as in mp4.
      x264's VBV now works on a NAL level, taking into account escape codes.
      VBV will also take into account the bit cost of SPS/PPS, but only if b_repeat_headers is set.
      Add an overhead tracking system to VBV to better predict the constant overhead of frames (headers, NALU overhead, etc).
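A rough sketch of the two framings that b_annexb selects between, together with the escape codes that the NAL-level VBV now accounts for. This is not libx264's API: `nal_escape` and `nal_frame` are hypothetical names, and the 4-byte start code and 4-byte big-endian size prefix are assumptions of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Escape a NAL payload: after two consecutive zero bytes, a byte <= 3 is
 * preceded by an emulation prevention byte 0x03.
 * dst must hold at least src_len * 3 / 2 + 1 bytes. Returns bytes written. */
static size_t nal_escape(uint8_t *dst, const uint8_t *src, size_t src_len)
{
    size_t n = 0;
    int zeros = 0;
    for (size_t i = 0; i < src_len; i++) {
        if (zeros >= 2 && src[i] <= 3) {
            dst[n++] = 0x03;
            zeros = 0;
        }
        dst[n++] = src[i];
        zeros = (src[i] == 0) ? zeros + 1 : 0;
    }
    return n;
}

/* Frame one NAL with either an Annex B start code or a size prefix.
 * dst must hold 4 bytes plus the escaped payload. Returns total bytes. */
static size_t nal_frame(uint8_t *dst, const uint8_t *payload, size_t len, int annexb)
{
    size_t n = nal_escape(dst + 4, payload, len);
    assert(n <= 0xffffffffu);
    if (annexb) {
        dst[0] = 0; dst[1] = 0; dst[2] = 0; dst[3] = 1;   /* start code */
    } else {
        dst[0] = (uint8_t)(n >> 24);                      /* big-endian size */
        dst[1] = (uint8_t)(n >> 16);
        dst[2] = (uint8_t)(n >> 8);
        dst[3] = (uint8_t)n;
    }
    return n + 4;
}
```

The escaped size, not the raw payload size, is what ends up on the wire, which is why a rate model working at the NAL level has to count the inserted bytes.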
  10. 14 Sep, 2009 1 commit
    • Make MV costs global instead of static · b1eac265
      Fiona Glaser authored
      Fixes some extremely rare threading race conditions and makes the code cleaner.
      Downside: slightly higher memory usage when calling multiple encoders from the same application.
  11. 02 Sep, 2009 1 commit
    • Threaded lookahead · 6940dcae
      Steven Walters authored
      Move lookahead into a separate thread, set to higher priority than the other threads, for optimal performance.
      Reduces the amount that lookahead bottlenecks encoding, greatly increasing performance with lookahead-intensive settings (e.g. b-adapt 2) on many-core CPUs.
      Buffer size can be controlled with --sync-lookahead, which defaults to auto (threads+bframes buffer size).
      Note that this buffer is separate from the rc-lookahead value.
      Note also that this does not split lookahead itself into multiple threads yet; this may be added in the future.
      Additionally, split frames into "fdec" and "fenc" frame types and keep the two separate.
      This split greatly reduces memory usage, which helps compensate for the larger lookahead size.
      Extremely special thanks to Michael Kazmier and Alex Giladi of Avail Media, the original authors of this patch.
  12. 31 Aug, 2009 1 commit
    • Multi-slice encoding support · 4ccbb199
      Fiona Glaser authored
      Slicing support is available through three methods (which can be mixed):
      --slices sets a number of slices per frame and ensures rectangular slices (required for Blu-ray).  Overridden by either of the following options:
      --slice-max-mbs sets a maximum number of macroblocks per slice.
      --slice-max-size sets a maximum slice size, in bytes (includes NAL overhead).
      Implement macroblock re-encoding support to allow highly accurate slice size limitation.  Might be useful for other things in the future, too.
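One way the --slices behaviour could be sketched: divide the frame's macroblock rows among the requested number of slices so that every slice is a whole number of rows and therefore rectangular. The function names and the round-to-nearest split are assumptions for illustration, not necessarily what x264 does.

```c
#include <assert.h>

/* First macroblock row of slice `idx` when `mb_rows` rows are split among
 * `slices` slices as evenly as possible. idx == slices gives the end row. */
static int slice_first_row(int mb_rows, int slices, int idx)
{
    assert(slices > 0 && slices <= mb_rows && idx >= 0 && idx <= slices);
    return (mb_rows * idx + slices / 2) / slices;
}

/* Number of macroblocks in slice `idx` for a frame `mb_width` wide. */
static int slice_mb_count(int mb_width, int mb_rows, int slices, int idx)
{
    int first = slice_first_row(mb_rows, slices, idx);
    int end = slice_first_row(mb_rows, slices, idx + 1);
    return (end - first) * mb_width;
}
```

For a 1920x1080 frame (120x68 macroblocks) and four slices, each slice covers 17 rows, or 2040 macroblocks.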
  13. 27 Aug, 2009 2 commits
  14. 23 Aug, 2009 1 commit
    • GSOC merge part 2: ARM stack alignment · ca7da1ae
      David Conrad authored
      Neither GCC nor ARMCC supports 16-byte stack alignment, despite the fact that NEON loads require it.
      These macros only work for arrays, but fortunately that covers almost all instances of stack alignment in x264.
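A minimal sketch of the general technique such array macros can rely on: over-allocate the array by 15 bytes and round the start address up to the next multiple of 16. The names `align16` and `ALIGNED16_U8` are hypothetical, and the sketch is limited to byte arrays; it is not x264's actual macro.

```c
#include <assert.h>
#include <stdint.h>

/* Round a byte pointer up to the next 16-byte boundary. */
static uint8_t *align16(uint8_t *raw)
{
    return (uint8_t *)(((uintptr_t)raw + 15) & ~(uintptr_t)15);
}

/* Hypothetical macro: declare `name` as a 16-byte aligned array of `count`
 * bytes, backed by a stack array that is 15 bytes larger than requested. */
#define ALIGNED16_U8(name, count) \
    uint8_t name##_raw[(count) + 15]; \
    uint8_t *name = align16(name##_raw)

/* Example use: fill an aligned 16-byte scratch row and return its sum. */
static int sum_aligned_row(void)
{
    ALIGNED16_U8(row, 16);
    assert(((uintptr_t)row & 15) == 0);
    int sum = 0;
    for (int i = 0; i < 16; i++) {
        row[i] = (uint8_t)i;
        sum += row[i];
    }
    return sum;
}
```

Because `name` becomes a pointer into a larger backing array, this only works for arrays and not for scalar variables, which matches the limitation noted above.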
  15. 08 Aug, 2009 1 commit
  16. 07 Aug, 2009 1 commit
    • Macroblock-tree ratecontrol · 835ccc3c
      Fiona Glaser authored
      On by default; can be turned off with --no-mbtree.
      Uses a large lookahead to track temporal propagation of data and weight quality accordingly.
      Requires a very large separate statsfile (2 bytes per macroblock) in multi-pass mode.
      Doesn't work with b-pyramid yet.
      Note that MB-tree inherently measures quality differently from the standard qcomp method, so bitrates produced by CRF may change somewhat.
      This makes the "medium" preset a bit slower.  Accordingly, make "fast" slower as well, and introduce a new preset "faster" between "fast" and "veryfast".
      All presets "fast" and above will have MB-tree on.
      Add a new option, --rc-lookahead, to control the distance MB tree looks ahead to perform propagation analysis.
      Default is 40; larger values will be slower and require more memory but give more accurate results.
      This value will be used in the future to control ratecontrol lookahead (VBV).
      Add a new option, --no-psy, to disable all psy optimizations that don't improve PSNR or SSIM.
      This disables psy-RD/trellis, but also other more subtle internal psy optimizations that can't be controlled directly via external parameters.
      Quality improvement from MB-tree is about 2-70% depending on content.
      Strength of MB-tree adjustments can be tweaked using qcompress; higher values mean lower MB-tree strength.
      Note that MB-tree may perform slightly suboptimally on fades; this will be fixed by weighted prediction, which is coming soon.
  17. 26 Jul, 2009 1 commit
    • Add QPRD support as subme=10 · 4304c427
      Fiona Glaser authored
      Refactor trellis lambda selection to be done in analyse_init instead of in trellis.
      This will allow for easier adaptation of lambda later on; for now it allows constant lambda across variable QPs.
      QPRD is only available with adaptive quantization enabled and generally improves SSIM and visual quality.
      Additionally, weight the SSD values from RD based on the relative QP offset for chroma; helps visually at high QPs where chroma has a lower QP than luma.
      This fixes some visual artifacts created by QPRD at high QPs.
      Note that this generally hurts PSNR and SSIM, and so is only on when psy-RD is on.
  18. 19 Jun, 2009 1 commit
  19. 18 Apr, 2009 1 commit
    • Add "coded blocks" stat to output information. · 448ea688
      Fiona Glaser authored
      This measures the total percentage of blocks, intra and inter, which have nonzero coefficients.
      "y,uvDC,uvAC" refers to luma, chroma DC, and chroma AC blocks.
      Note that skip blocks are included in this stat.
  20. 04 Mar, 2009 1 commit
    • Remove non-pre scenecut · 42f27d04
      Fiona Glaser authored
      Add support for no-b-adapt + pre-scenecut (patch by BugMaster)
      Pre-scenecut was generally better than regular scenecut in terms of accuracy and regular scenecut didn't work in threaded mode anyways.
      Add no-scenecut option (scenecut=0 is now no scenecut; previously it was -1)
      Fix an incorrect bias towards P-frames near scenecuts with B-adapt 2.
      Simplify pre-scenecut code.
  21. 16 Feb, 2009 1 commit
  22. 04 Feb, 2009 1 commit
  23. 30 Jan, 2009 1 commit
    • Massive overhaul of nnz/cbp calculation · e394bd60
      Fiona Glaser authored
      Modify quantization to also calculate array_non_zero.
      PPC assembly changes by gpoirier.
      New quant asm includes some small tweaks to quant and SSE4 versions using ptest for the array_non_zero.
      Use this new feature of quant to merge nnz/cbp calculation directly with encoding and avoid many unnecessary calls to dequant/zigzag/decimate/etc.
      Also add new i16x16 DC-only iDCT with asm.
      Since intra encoding now directly calculates nnz, skip_intra now backs up nnz/cbp as well.
      Output should be equivalent except when using p4x4+RDO because of a subtlety involving old nnz values lying around.
      Performance increase in macroblock_encode: ~18% with dct-decimate, 30% without at CRF 25.
      Overall performance increase 0-6% depending on encoding settings.
  24. 20 Jan, 2009 1 commit
    • Eliminate support for direct_8x8_inference=0 · 1f0e78d8
      Fiona Glaser authored
      The benefit in the most extreme contrived situation was at most 0.001 dB PSNR, at the cost of slower decoding.
      As this option was basically useless, it was a waste of code and prevented some other useful optimizations.
      Remove some unused mc code related to sub-8x8 partitions.
      Small deblocking speedup when p4x4 is used.
      Also remove unused x264_nal_decode prototype from x264.h.
  25. 31 Dec, 2008 1 commit
  26. 22 Dec, 2008 1 commit
  27. 11 Dec, 2008 1 commit
    • Much faster CAVLC residual coding · 99448f6c
      Fiona Glaser authored
      Use a VLC table for common levelcodes instead of constructing them on-the-spot
      Branchless version of i_trailing calculation (2x faster on Nehalem)
      Completely remove array_non_zero_count and instead use the count calculated in level/run coding.  Note: this slightly changes output with subme > 7 due to different nonzero counts being stored during qpel RD.
  28. 28 Nov, 2008 1 commit
    • Significantly faster CABAC and CAVLC residual coding and bit cost calculation · c1d73389
      Fiona Glaser authored
      Early-terminate in residual writing using stored nnz counts
      To allow the above, store nnz counts for luma and chroma DC
      Add assembly functions to find the last nonzero coefficient in a block
      Overall ~1.9% faster at subme9+8x8dct+qp25 with CAVLC, ~0.7% faster with CABAC
      Note this changes output slightly with CABAC RDO because it requires always storing correct nnz values during RDO, which wasn't done before in cases it wasn't useful.
      CAVLC output should be equivalent.
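A plain C reference for what a "last nonzero coefficient" search computes, and how a stored nonzero count allows early termination. The names are hypothetical and the loop is only a scalar sketch, not the assembly added by this commit.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the last nonzero coefficient, or -1 if the block is all zero. */
static int coeff_last(const int16_t *coef, int count)
{
    assert(count >= 0);
    int i = count - 1;
    while (i >= 0 && coef[i] == 0)
        i--;
    return i;
}

/* Number of coefficients a residual writer has to visit: none when the
 * stored nonzero count is 0, otherwise everything up to the last nonzero. */
static int coeffs_to_visit(const int16_t *coef, int count, int stored_nnz)
{
    if (stored_nnz == 0)
        return 0;               /* early termination, no scan needed */
    return coeff_last(coef, count) + 1;
}
```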
  29. 28 Sep, 2008 1 commit
    • Replace High 4:4:4 profile lossless with High 4:4:4 Predictive. · a9e86d24
      Fiona Glaser authored
      This improves lossless compression by about 4-25% depending on source.
      The benefit is generally higher for intra-only compression.
      Also add support for 8x8dct and i8x8 blocks in lossless mode; this improves compression very slightly.
      In some rare cases 8x8dct can hurt compression in lossless mode, but it's usually helpful, albeit marginally.
      Note that 8x8dct is only available with CABAC as it is never useful with CAVLC.
      High 4:4:4 Predictive replaced the previous profile in a 2007 revision to the H.264 standard.
      The only known compliant decoder for this profile is the latest version of CoreAVC.
      As I write this, JM does not actually correctly decode this profile.
      Hopefully this lack of support will soon change with this commit, as x264 will be (to my knowledge) the first compliant encoder.
  30. 15 Sep, 2008 2 commits
    • Psychovisually optimized rate-distortion optimization and trellis · ecc9bfab
      Fiona Glaser authored
      The latter, psy-trellis, is disabled by default and is reserved as experimental; your mileage may vary.
      Default subme is raised to 6 so that psy RD is on by default.
    • Add optional more optimal B-frame decision method · 95ed2720
      Fiona Glaser authored
      This method (--b-adapt 2) uses a Viterbi algorithm somewhat similar to that used in trellis quantization.
      Note that it is not fully optimized and is very slow with large --bframes values.
      It also takes into account weightb, which should improve fade detection.
      Additionally, changes were made to cache lowres intra results for each frame to avoid recalculating them.  This should improve performance in both B-frame decision methods.
      This can also be done for motion vectors, which will dramatically improve b-adapt 2 performance when it is complete.
      This patch also reads b_adapt and scenecut settings from the first pass so that the x264 header information in the output file will have correct information (since frametype decision is only done on the first pass).
  31. 30 Aug, 2008 1 commit
  32. 21 Aug, 2008 2 commits
  33. 16 Aug, 2008 1 commit