- 11 Jul, 2008 1 commit
-
-
Fiona Glaser authored
Also remove display of "PCM" if PCM mode is never used in the encode. L1 reflist information will only show if pyramid coding is used.
-
- 10 Jul, 2008 5 commits
-
-
Fiona Glaser authored
In RD mode, always consider PCM as a macroblock mode possibility.
Fix bitstream writing for PCM blocks in CAVLC and CABAC, and a few other minor changes to make PCM work.
PCM macroblocks improve compression at very low QPs (1-5) and in lossless mode.
-
Loren Merritt authored
-
Loren Merritt authored
-
Fiona Glaser authored
-
Loren Merritt authored
-
- 06 Jul, 2008 2 commits
-
-
Fiona Glaser authored
Update AUTHORS file with Gabriel and me
Update XCHG macro to work correctly in if statements
Add new lookup tables for block_idx and fdec/fenc addresses
Slightly faster array_non_zero_count_mmx (patch by holger)
Eliminate branch in analyse_intra
Unroll loops in and clean up chroma encode
Convert some for loops to do/while loops for speed improvement
Do explicit write-combining on --me tesa mvsad_t struct
Shrink --me esa zero[] array
Speed up bime by reducing size of visited[][][] array
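The XCHG item above most likely refers to the usual hazard of multi-statement macros inside if/else. The following is a minimal sketch of the common do/while(0) remedy, not the actual x264 definition; the macro signature, the temporary name `xchg_tmp_`, and the helper `sort_pair` are assumptions made for illustration.

```c
#include <stdio.h>

/* Hypothetical swap macro. Wrapping the body in do { ... } while (0) makes the
 * expansion a single statement, so "if (c) XCHG(...); else ..." parses as
 * intended. With a bare { ... } body, the trailing ';' would end the if
 * statement and leave the else without a matching if. */
#define XCHG(type, a, b) \
    do { type xchg_tmp_ = (a); (a) = (b); (b) = xchg_tmp_; } while (0)

/* Puts the smaller value in *lo; returns 1 if a swap happened, 0 otherwise. */
static int sort_pair(int *lo, int *hi)
{
    if (*lo > *hi)
        XCHG(int, *lo, *hi);
    else
        return 0;
    return 1;
}
```

Calling `sort_pair(&a, &b)` with `a = 5, b = 2` swaps the two values; calling it again leaves them untouched.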
-
Fiona Glaser authored
In some cases, the mmx version of frame_init_lowres could leave the FPU uninitialized for use in ratecontrol, resulting in floating point exceptions. Since frame_init_lowres is such a time-consuming function, an emms was just put at the end, since it costs almost nothing compared to the total time of frame_init_lowres.
-
- 04 Jul, 2008 2 commits
-
-
Eric Petit authored
-
Fiona Glaser authored
Update "Authors" lists based on actual authorship; highest is most important
Update copyright notices and remove old CVS tags from file headers
Add file headers to GTK and other sections missing them
Update FSF address
Other header-related cosmetics
-
- 03 Jul, 2008 2 commits
-
-
Fiona Glaser authored
-
Loren Merritt authored
SWAP can now take mmregs directly, rather than just their numbers
-
- 02 Jul, 2008 3 commits
-
-
Fiona Glaser authored
In some cases adaptive quantization did not correctly calculate the variance. Bug reported by MasterNobody
-
Loren Merritt authored
rounding is changed for asm convenience. this makes the c version slower, but there's no way around that if all the implementations are to have the same results.
-
Fiona Glaser authored
If an i4x4 dct block has no coefficients, don't bother with dequant/zigzag/idct. Not useful for larger sizes because the odds of an empty block are much lower. Cosmetics in i16x16 to be more consistent with other similar functions. Add an SSD threshold for chroma in probe_skip to improve speed and minimize time spent on chroma skip analysis. Rename lambda arrays to lambda_tab for consistency.
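The early-out for empty i4x4 blocks can be illustrated roughly as follows. This is a sketch under stated assumptions, not x264's code: `block_has_coefs` and `reconstruct_i4x4` are hypothetical names, and the loop that adds `coef * qscale` is only a placeholder for the real dequant/zigzag/idct path.

```c
#include <stdint.h>

/* Returns 1 if any of the 16 quantized coefficients is nonzero. */
static int block_has_coefs(const int16_t coef[16])
{
    for (int i = 0; i < 16; i++)
        if (coef[i])
            return 1;
    return 0;
}

/* pix holds the intra prediction on entry and the reconstruction on exit.
 * Returns 1 if residual work was done, 0 if the block was skipped. */
static int reconstruct_i4x4(uint8_t pix[16], const int16_t coef[16], int qscale)
{
    if (!block_has_coefs(coef))
        return 0; /* zero residual: the prediction already is the reconstruction */

    /* Placeholder for dequant + inverse transform (NOT a real IDCT). */
    for (int i = 0; i < 16; i++) {
        int v = pix[i] + coef[i] * qscale;
        pix[i] = (uint8_t)(v < 0 ? 0 : v > 255 ? 255 : v);
    }
    return 1;
}
```

The check costs a scan of 16 coefficients, which is cheap relative to the work it avoids when a block is empty.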
-
- 29 Jun, 2008 1 commit
-
-
Gabriel Bouvigne authored
-
- 24 Jun, 2008 2 commits
-
-
Fiona Glaser authored
Additionally, instead of silently truncating the frame upon reaching the end of the buffer, reallocate a larger buffer.
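The grow-instead-of-truncate behaviour can be sketched as below. The `bitbuf_t` type, the `bitbuf_append` function, and the doubling policy are assumptions for illustration; the commit does not say how x264 sizes the new buffer.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable output buffer. */
typedef struct {
    uint8_t *data;
    size_t   size;     /* bytes written so far */
    size_t   capacity; /* bytes allocated */
} bitbuf_t;

/* Appends len bytes, growing the buffer when it would overflow.
 * Returns 0 on success, -1 if reallocation fails (buffer left unchanged).
 * Size overflow of size + len is not handled in this sketch. */
static int bitbuf_append(bitbuf_t *b, const uint8_t *src, size_t len)
{
    if (len > b->capacity - b->size) {
        size_t need = b->size + len;
        size_t new_cap = b->capacity ? b->capacity : 64;
        while (new_cap < need)
            new_cap *= 2; /* assumed policy: double until the data fits */
        uint8_t *p = realloc(b->data, new_cap);
        if (!p)
            return -1;
        b->data = p;
        b->capacity = new_cap;
    }
    memcpy(b->data + b->size, src, len);
    b->size += len;
    return 0;
}
```

Any pointers into the old buffer become invalid after a successful grow, so callers should keep offsets rather than raw pointers.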
-
Fiona Glaser authored
Converting NNZ to raster order simplifies a lot of the load/store code and allows more use of write-combining.
More use of write-combining throughout load/save code in common/macroblock.c
GCC has aliasing issues in the case of stores to 8-bit heap-allocated arrays; dereferencing the pointer once avoids this problem and significantly increases performance.
More manual loop unrolling and such.
Move all packXtoY functions to macroblock.h so any function can use them.
Add pack8to32.
Minor optimizations to encoder/macroblock.c
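As an illustration of the write-combining idea, a pack8to32-style helper might look like the sketch below. The signature and byte order are assumptions (the real x264 helper may differ, for example on big-endian targets), and `fill_row4` is a hypothetical caller.

```c
#include <stdint.h>
#include <string.h>

/* Packs four 8-bit values into one 32-bit word, first argument in the low
 * byte (assumed layout). */
static inline uint32_t pack8to32(uint32_t a, uint32_t b, uint32_t c, uint32_t d)
{
    return a | (b << 8) | (c << 16) | (d << 24);
}

/* Sets four adjacent bytes with one 32-bit store instead of four byte stores.
 * memcpy keeps the store free of alignment and aliasing problems; compilers
 * typically reduce it to a single move. */
static void fill_row4(uint8_t *dst, uint8_t v)
{
    uint32_t word = pack8to32(v, v, v, v);
    memcpy(dst, &word, sizeof word);
}
```

Because all four bytes are equal in `fill_row4`, the result is the same on either endianness.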
-
- 18 Jun, 2008 3 commits
-
-
Loren Merritt authored
-
Loren Merritt authored
-
Loren Merritt authored
-
- 15 Jun, 2008 3 commits
-
-
Fiona Glaser authored
x264 will now terminate gracefully rather than dying with SIGILL when run on a machine with no MMXEXT support. A configure option is now available to build x264 without assembly, for use on such old CPUs as the Pentium 2, K6, etc.
-
Fiona Glaser authored
-
Fiona Glaser authored
GCC is not very good at loop unrolling in cases where it can perform constant propagation, so the unrolling unfortunately has to be done manually.
-
- 12 Jun, 2008 3 commits
-
-
Fiona Glaser authored
i_mvc needs to be 64-bit when used with a 64-bit memory pointer
-
Fiona Glaser authored
Added inline MMX version of UMH's predictor difference test
Various cosmetics throughout me.c
Removed a C99-ism introduced in r878.
-
Fiona Glaser authored
r736 added intra RD refinement to B-frames; however, it is possible for subme=7 to be used without b-rdo. This means intra RD isn't run, and therefore it is possible for intra chroma analysis to not have been run, since update_cache was never called for an intra block, and chroma ME is not required even at subme=7. r801, which removed a memset, made this worse because previously the chroma prediction mode was at least initialized to zero; now it was not initialized at all. Therefore, --no-chroma-me, --subme 7, and no --b-rdo had the potential to crash. This change restricts intra RD refinement to only be run when --b-rdo is enabled (sensible to begin with), thus preventing a crash in this case.
-
- 11 Jun, 2008 3 commits
-
-
Fiona Glaser authored
Bug resulted in rare incorrect chroma encoding
-
Gabriel Bouvigne authored
-
Fiona Glaser authored
Use write-combining for predictor checking and other tweaks.
-
- 08 Jun, 2008 6 commits
-
-
Fiona Glaser authored
Inlining trellis into the 4x4/8x8 trellis wrappers increases trellis speed by about 5-10% through constant propagation.
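The effect described here can be sketched with a generic kernel and two thin wrappers. The function names below are hypothetical and the kernel is a trivial stand-in for trellis quantization; the point is only that each wrapper passes a compile-time constant, so an inlining compiler can specialize the loop per block size.

```c
#include <stdint.h>

/* Generic kernel: n is a runtime parameter as written, but once inlined into
 * a wrapper it becomes a known constant, enabling unrolling and the removal
 * of size-dependent branches. */
static inline int sum_abs_core(const int16_t *coef, int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += coef[i] < 0 ? -coef[i] : coef[i];
    return sum;
}

/* Wrappers fix the block size at compile time. */
static int sum_abs_4x4(const int16_t coef[16]) { return sum_abs_core(coef, 16); }
static int sum_abs_8x8(const int16_t coef[64]) { return sum_abs_core(coef, 64); }
```

Plain `inline` is only a hint in C, so whether the kernel is actually inlined depends on the compiler and its settings.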
-
Fiona Glaser authored
-
Fiona Glaser authored
-
Loren Merritt authored
with Phenom, 3dnow is no longer equivalent to "sse2 is slow", so make a new flag for that. some sse2 functions are useful only on Core2 and Phenom, so make a "sse2 is fast" flag for that. some ssse3 instructions didn't become useful until Penryn, so yet another flag. disable sse2 completely on Pentium M and Core1, because it's uniformly slower than mmx. enable some sse2 functions on Athlon64 that always were faster and we just didn't notice. remove mc_luma_sse3, because the only cpu that has lddqu (namely Pentium 4D) doesn't have "sse2 is fast". don't print mmx1, sse1, nor 3dnow in the detected cpuflags, since we don't really have any such functions. likewise don't print sse3 unless it's used (Pentium 4D).
-
Loren Merritt authored
-
Loren Merritt authored
-
- 05 Jun, 2008 2 commits
-
-
Fiona Glaser authored
-
Fiona Glaser authored
Cplxblur was originally intended to use a gaussian window, but in its current form did not. This change provides a tiny improvement to 2pass ratecontrol.
-
- 03 Jun, 2008 2 commits
-
-
Loren Merritt authored
-
Loren Merritt authored
-