- 09 Nov, 2009 7 commits
-
-
Dylan Yudaken authored
Merge Dylan's Google Summer of Code 2009 tree. Detect fades and use weighted prediction to improve compression and quality. "Blind" mode provides a small overall quality increase by using a -1 offset without doing any analysis, as described in JVT-AB033. "Smart", the default mode, also performs fade detection and decides weights accordingly. MB-tree takes into account the effects of "smart" analysis in lookahead, even further improving quality in fades. If psy is on, mbtree is on, interlaced is off, and weightp is off, fade detection will still be performed. However, it will be used to adjust quality rather than to create actual weights. This will improve quality in fades when encoding in Baseline profile. Doesn't add support for interlaced encoding with weightp yet. Only adds support for luma weights, not chroma weights. Internal code for chroma weights is in, but there's no analysis yet. Baseline profile requires that weightp be off. All weightp modes may cause minor breakage in non-compliant decoders that take shortcuts in deblocking reference frame checks. "Smart" may cause serious breakage in non-compliant decoders that take shortcuts in handling of duplicate reference frames. Thanks to Google for sponsoring our most successful Summer of Code yet!
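For readers unfamiliar with weighted prediction, the following is a minimal sketch of the explicit luma weighting formula from the H.264 specification, not x264's actual code. The helper names `clip_uint8` and `weight_luma_sketch` are hypothetical, and the sketch assumes a non-negative scale so the right shift is well defined. With `scale == 1 << log2_denom` and `offset == -1` it reproduces the "blind" case of an identity weight with a -1 offset.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp an intermediate value to the 8-bit pixel range. */
static uint8_t clip_uint8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Hypothetical helper: apply one explicit weight to a run of luma pixels.
 * dst = clip(((src * scale + round) >> log2_denom) + offset)
 * Assumption: scale >= 0 (negative weights would need extra care). */
static void weight_luma_sketch(uint8_t *dst, const uint8_t *src, int n,
                               int scale, int log2_denom, int offset)
{
    assert(scale >= 0 && log2_denom >= 0 && log2_denom <= 7);
    int round = log2_denom > 0 ? 1 << (log2_denom - 1) : 0;
    for (int i = 0; i < n; i++)
        dst[i] = clip_uint8(((src[i] * scale + round) >> log2_denom) + offset);
}
```

A fade to half brightness would correspond to something like `scale = 16, log2_denom = 5, offset = 0`; how the encoder picks those values is the job of the "smart" analysis and is not shown here.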
-
Steven Walters authored
-
David Conrad authored
Fix comment for mc_copy_neon. Fix memzero_aligned_neon prototype. Update NEON (i)dct_dc prototypes. Duplicate x86 behavior for global+hidden functions.
-
Fiona Glaser authored
Aliasing violation in spatial prediction caused nasty artifacts. Shut up two other GCC warnings while we're at it.
-
Anton Mitrofanov authored
-
Fiona Glaser authored
~10k of code size eliminated.
-
Loren Merritt authored
weirdly, valgrind reported this only with --no-asm.
-
- 29 Oct, 2009 2 commits
-
-
Fiona Glaser authored
Cacheline-aware in the same fashion as width8, but not conditional.
-
Fiona Glaser authored
Turning off inlining saves a whole boatload of code size for near-zero speed cost. Simplify offset calculation. Various other optimizations.
-
- 25 Oct, 2009 5 commits
-
-
Loren Merritt authored
-
Fiona Glaser authored
As the assembly abstraction layer is very useful in non-x264 projects, it is now ISC (simplified BSD) so that others, even in commercial projects, can use it as well.
-
Fiona Glaser authored
-
Fiona Glaser authored
And an errant space in common/macroblock.c
-
Henrik Gramner authored
-
- 19 Oct, 2009 1 commit
-
-
Lamont Alston authored
The rules of the specification with regard to picture buffering for pyramid coding are widely ignored. x264's b-pyramid implementation, despite being practically identical to that proposed by the original paper, was technically not compliant. Now it is. Two modes are now available: 1) strict b-pyramid, while worse for compression, follows the rule mandated by Blu-ray (no P-frames can reference B-frames) 2) normal b-pyramid, which is like the old mode except fully compliant. This patch also adds MMCO support (necessary for compliant pyramid in some cases). MB-tree still doesn't support b-pyramid (but will soon).
-
- 18 Oct, 2009 1 commit
-
-
Loren Merritt authored
-
- 12 Oct, 2009 3 commits
-
-
Loren Merritt authored
the C standard doesn't allow you to iterate 1-dimensionally over 2d arrays, and nothing other than the dsp functions themselves cares about the 2dness of dct. this fixes a miscompilation in x264_mb_optimize_chroma_dc.
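As an illustration of the issue described above (a sketch, not the actual x264 change), the commented-out function indexes row 0 of a 2D array past its four elements, which the compiler is allowed to assume never happens. Declaring the block as a flat array of 16 coefficients avoids the problem. The function names and the `coef_at` accessor are hypothetical.

```c
#include <stdint.h>

/* Problematic pattern: dct[0] has only 4 elements, so dct[0][i] with
 * i >= 4 is out of bounds even though the memory is contiguous.
 *
 * static int count_nonzero_2d(int16_t dct[4][4])
 * {
 *     int n = 0;
 *     for (int i = 0; i < 16; i++)
 *         n += dct[0][i] != 0;
 *     return n;
 * }
 */

/* Flat layout: iterating over all 16 coefficients is well defined. */
static int count_nonzero_flat(const int16_t dct[16])
{
    int n = 0;
    for (int i = 0; i < 16; i++)
        n += dct[i] != 0;
    return n;
}

/* Code that does care about rows and columns can compute the index. */
static int16_t coef_at(const int16_t dct[16], int row, int col)
{
    return dct[row * 4 + col];
}
```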
-
Anton Mitrofanov authored
Slightly faster and more accurate rounding.
-
Fiona Glaser authored
"Flashes" are defined as any scene which lasts a very short period before a previous scene returns. A common example of this is of course a camera flash. Accordingly, look ahead during scenecut analysis and rule out the possibility of certain frames being scenecuts. Also handles cases of tons of short scenes in sequence and avoids making those scenecuts as well. Can only catch flashes of 1 frame in length with b-adapt 1. With b-adapt 2, can catch flashes of length --bframes. Speed cost should be negligible.
-
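The flash handling in the 12 Oct commit above can be pictured with a toy model. This is a sketch under strong assumptions: frames carry an integer scene label, whereas the real encoder compares frame costs, and `is_real_scenecut_sketch` and `max_flash_len` are hypothetical names. A candidate cut is rejected if the previous scene returns within `max_flash_len` frames, or if the current frame is itself the return from such a short scene.

```c
/* Toy model: scene[i] is a label for the content of frame i.
 * Returns 1 if the transition into frame `cur` should be a scenecut,
 * 0 if there is no change or the change looks like a flash. */
static int is_real_scenecut_sketch(const int *scene, int n, int cur,
                                   int max_flash_len)
{
    if (cur <= 0 || cur >= n || scene[cur] == scene[cur - 1])
        return 0;

    /* Look ahead: if the previous scene comes back soon, frame `cur`
     * starts a flash rather than a new scene. */
    for (int j = cur + 1; j < n && j <= cur + max_flash_len; j++)
        if (scene[j] == scene[cur - 1])
            return 0;

    /* Look back: if `cur` matches a frame just before a short run,
     * it is the return from a flash, not a new scene. */
    for (int k = 1; k <= max_flash_len && cur - 1 - k >= 0; k++)
        if (scene[cur - 1 - k] == scene[cur])
            return 0;

    return 1;
}
```

In this model, `max_flash_len = 1` mirrors the b-adapt 1 limit of one-frame flashes, and a larger value mirrors the longer reach described for b-adapt 2.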
- 07 Oct, 2009 3 commits
-
-
Loren Merritt authored
-
Holger Lubitz authored
27->24 clocks on Nehalem. This is really just an excuse to use "movsd" in a real function. Add some comments to subsum-related macros in x86util.
-
Fiona Glaser authored
Enable with --constrained-intra. Significantly reduces compression, but required for the base layer of SVC encodes and maybe some other use-cases. Commit sponsored by a media streaming company that wishes to remain anonymous.
-
- 23 Sep, 2009 1 commit
-
-
Anton Mitrofanov authored
Avoid unnecessary cond_wait
-
- 21 Sep, 2009 2 commits
-
-
Fiona Glaser authored
Also eliminate some NaNs in stat output with intra-only encoding. Marginal speedup: disable stat calculation if log level is below X264_LOG_INFO. Various minor cosmetics.
-
Fiona Glaser authored
libx264 now returns NAL units instead of raw data. x264_nal_encode is no longer a public function. See x264.h for full documentation of changes. New parameter: b_annexb, on by default. If disabled, startcodes are replaced by sizes as in mp4. x264's VBV now works on a NAL level, taking into account escape codes. VBV will also take into account the bit cost of SPS/PPS, but only if b_repeat_headers is set. Add an overhead tracking system to VBV to better predict the constant overhead of frames (headers, NALU overhead, etc).
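The difference between the two `b_annexb` settings can be sketched as follows. This is an illustration, not libx264's code: `write_nal_sketch` is a hypothetical helper, the payload is assumed to be already escaped, and a 4-byte prefix is assumed in both modes (a 4-byte start code for Annex B, a 4-byte big-endian size for the mp4-style layout).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: write one NAL unit into dst with either an
 * Annex B start code or a big-endian length prefix.
 * dst must have room for size + 4 bytes. Returns bytes written. */
static size_t write_nal_sketch(uint8_t *dst, const uint8_t *payload,
                               uint32_t size, int annexb)
{
    if (annexb) {
        /* Start code 00 00 00 01 */
        dst[0] = 0;
        dst[1] = 0;
        dst[2] = 0;
        dst[3] = 1;
    } else {
        /* Payload size, most significant byte first */
        dst[0] = (uint8_t)(size >> 24);
        dst[1] = (uint8_t)(size >> 16);
        dst[2] = (uint8_t)(size >> 8);
        dst[3] = (uint8_t)size;
    }
    memcpy(dst + 4, payload, size);
    return (size_t)size + 4;
}
```

Because both prefixes have the same length in this sketch, switching modes does not change the byte count, which keeps size accounting simple.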
-
- 14 Sep, 2009 1 commit
-
-
Fiona Glaser authored
Fixes some extremely rare threading race conditions and makes the code cleaner. Downside: slightly higher memory usage when calling multiple encoders from the same application.
-
- 02 Sep, 2009 3 commits
-
-
David Conrad authored
-
Steven Walters authored
Instead of setting the lookahead thread to max priority, lower all the other threads' priorities instead. This is particularly useful when the "max priority" is "realtime", as in Windows, which can cause some problems.
-
Steven Walters authored
Move lookahead into a separate thread, set to higher priority than the other threads, for optimal performance. Reduces the amount that lookahead bottlenecks encoding, greatly increasing performance with lookahead-intensive settings (e.g. b-adapt 2) on many-core CPUs. Buffer size can be controlled with --sync-lookahead, which defaults to auto (threads+bframes buffer size). Note that this buffer is separate from the rc-lookahead value. Note also that this does not split lookahead itself into multiple threads yet; this may be added in the future. Additionally, split frames into "fdec" and "fenc" frame types and keep the two separate. This split greatly reduces memory usage, which helps compensate for the larger lookahead size. Extremely special thanks to Michael Kazmier and Alex Giladi of Avail Media, the original authors of this patch.
-
- 31 Aug, 2009 1 commit
-
-
Fiona Glaser authored
Slicing support is available through three methods (which can be mixed): 1) --slices sets a number of slices per frame and ensures rectangular slices (required for Blu-ray); it is overridden by either of the following options. 2) --slice-max-mbs sets a maximum number of macroblocks per slice. 3) --slice-max-size sets a maximum slice size, in bytes (includes NAL overhead). Implement macroblock re-encoding support to allow highly accurate slice size limitation. Might be useful for other things in the future, too.
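One way to picture the rectangular slices produced by --slices is to split the frame's macroblock rows as evenly as possible, so every slice starts at the beginning of a row. The sketch below assumes simple truncating division; x264's real rounding may differ, and `slice_first_row_sketch` is a hypothetical name.

```c
#include <assert.h>

/* Hypothetical helper: first macroblock row of slice k when mb_rows
 * rows are divided among `slices` slices. Passing k == slices gives
 * the end row of the last slice. */
static int slice_first_row_sketch(int mb_rows, int slices, int k)
{
    assert(slices > 0 && k >= 0 && k <= slices);
    return mb_rows * k / slices;
}
```

For a frame with 68 macroblock rows and 4 slices this yields boundaries at rows 0, 17, 34, 51 and 68.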
-
- 28 Aug, 2009 1 commit
-
-
Fiona Glaser authored
Correctly error out if the implied minimum chroma QP is too low. Add missing emms to checkasm macroblock_tree_propagate test.
-
- 27 Aug, 2009 2 commits
-
-
Fiona Glaser authored
Avoid an int->float conversion with a small table. Change lowres_inter_types to a bitfield; cut its size by 75%. Somewhat lower memory usage with lots of bframes. Make log2/exp2 tables global to avoid duplication.
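A 75% size reduction is what one gets by packing four 2-bit values into each byte instead of storing one value per byte. The sketch below shows that packing in general terms; the helper names are hypothetical and the exact layout used for lowres_inter_types is an assumption here.

```c
#include <stdint.h>

/* Store a 2-bit value at logical index idx: four entries per byte. */
static void packed2_set(uint8_t *buf, int idx, unsigned val)
{
    int shift = (idx & 3) * 2;
    unsigned cleared = buf[idx >> 2] & ~(3u << shift);
    buf[idx >> 2] = (uint8_t)(cleared | ((val & 3u) << shift));
}

/* Read back the 2-bit value at logical index idx. */
static unsigned packed2_get(const uint8_t *buf, int idx)
{
    return (buf[idx >> 2] >> ((idx & 3) * 2)) & 3u;
}
```

A table of N entries then needs (N + 3) / 4 bytes rather than N.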
-
Fiona Glaser authored
22->13 cycles on Core 2 with -mfpmath=sse
-
- 24 Aug, 2009 6 commits
-
-
David Conrad authored
4x4 dc/h/ddr/ddl, 8x8 dc/h, 8x8c h/v, 16x16 dc/h/v
-
David Conrad authored
Originally written for ffmpeg by Mans Rullgard; ported by David. Luma and chroma inter deblocking; no intra yet.
-
David Conrad authored
(de)quant 4x4, (de)quant 8x8, (de)quant DC, coeff_last
-
David Conrad authored
(i)dct4x4dc, (i)dct4x4, (i)dct8x8, (i)dct_dc, zigzag_scan_frame_4x4
-
David Conrad authored
prefetch, memcpy_aligned, memzero_aligned, avg, mc_luma, get_ref, mc_chroma, hpel_filter, frame_init_lowres
-
David Conrad authored
SAD, SADX3/X4, SSD, SATD, SA8D, Hadamard_AC, VAR, VAR2, SSIM
-
- 23 Aug, 2009 1 commit
-
-
David Conrad authored
Neither GCC nor ARMCC support 16 byte stack alignment despite the fact that NEON loads require it. These macros only work for arrays, but fortunately that covers almost all instances of stack alignment in x264.
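The general idea behind such a macro is to over-allocate the array on the stack and round a pointer up to the required boundary. The following is a simplified sketch for one-dimensional arrays, not x264's actual macro: `ALIGNED_ARRAY_SKETCH` and `sum_with_aligned_scratch` are hypothetical names, and the sketch assumes `align` is a power of two that is a multiple of `sizeof(type)`.

```c
#include <assert.h>
#include <stdint.h>

/* Declare `name` as a pointer to `count` elements of `type`, aligned to
 * `align` bytes, backed by an over-sized local array name##_raw.
 * Assumption: align is a power of two and a multiple of sizeof(type). */
#define ALIGNED_ARRAY_SKETCH(align, type, name, count) \
    type name##_raw[(count) + (align) / sizeof(type)]; \
    type *name = (type *)(((uintptr_t)name##_raw + ((align) - 1)) \
                          & ~(uintptr_t)((align) - 1))

/* Example use: copy into an aligned scratch buffer and sum it. */
static int sum_with_aligned_scratch(const int16_t *src, int n)
{
    ALIGNED_ARRAY_SKETCH(16, int16_t, tmp, 64);
    assert(n >= 0 && n <= 64);
    assert(((uintptr_t)tmp & 15) == 0); /* tmp is 16-byte aligned */

    int sum = 0;
    for (int i = 0; i < n; i++) {
        tmp[i] = src[i];
        sum += tmp[i];
    }
    return sum;
}
```

Because the name becomes a pointer rather than a true array, `sizeof(tmp)` no longer gives the array size, which is one reason a macro like this only suits array-style uses.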
-