  1. 30 Jul, 2008 2 commits
    • Fix regression in r922 · ff7639b0
      Fiona Glaser authored
      Set the chroma DC coefficients to zero for residual coding in qpel-rd.
      Fix a C99-ism.
    • Improve intra RD refine, speed up residual_write_cabac · 63b84fa4
      Fiona Glaser authored
      A do/while loop can be used for residual_write, but i8x8 had to be fixed so that it never calls residual_write with zero coefficients.
      Add proper nnz handling to CABAC intra RD refine.
      Add chroma cbp to 8x8 chroma RD.
      cbp was tested, but wasn't useful
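The do/while conversion mentioned above is only valid when the caller guarantees at least one iteration, which is why the zero-coefficient call from i8x8 had to be removed first. A minimal sketch of the trade-off follows; `sum_levels_for` and `sum_levels_dowhile` are hypothetical helpers for illustration, not x264's residual_write code.

```c
#include <assert.h>
#include <stdint.h>

/* for-loop form: safe for count == 0, but tests the bound before the first pass. */
static int sum_levels_for(const int16_t *coef, int count)
{
    int i, sum = 0;
    for (i = 0; i < count; i++)
        sum += coef[i] < 0 ? -coef[i] : coef[i];
    return sum;
}

/* do/while form: skips the initial bound test, so it is valid only when
   count >= 1. Callers must never pass an empty block. */
static int sum_levels_dowhile(const int16_t *coef, int count)
{
    int i = 0, sum = 0;
    assert(count > 0); /* precondition that the for-loop form did not need */
    do {
        sum += coef[i] < 0 ? -coef[i] : coef[i];
    } while (++i < count);
    return sum;
}
```

Both forms return the same result for any non-empty input; the do/while form simply moves the responsibility for the empty case to the caller.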
  2. 26 Jul, 2008 2 commits
  3. 24 Jul, 2008 1 commit
  4. 18 Jul, 2008 1 commit
  5. 16 Jul, 2008 1 commit
  6. 12 Jul, 2008 1 commit
  7. 11 Jul, 2008 2 commits
  8. 10 Jul, 2008 3 commits
  9. 06 Jul, 2008 1 commit
    • Various optimizations and cosmetics · c9c7edf3
      Fiona Glaser authored
      Update AUTHORS file with Gabriel and me
      Update XCHG macro to work correctly in if statements
      Add new lookup tables for block_idx and fdec/fenc addresses
      Slightly faster array_non_zero_count_mmx (patch by holger)
      Eliminate branch in analyse_intra
      Unroll loops in and clean up chroma encode
      Convert some for loops to do/while loops for speed improvement
      Do explicit write-combining on --me tesa mvsad_t struct
      Shrink --me esa zero[] array
      Speed up bime by reducing size of visited[][][] array
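The commit message does not say how the XCHG macro was changed. The usual way to make a multi-statement macro behave as a single statement inside `if`/`else` is the `do { ... } while (0)` idiom, so the sketch below assumes that approach; `order_pair` is a hypothetical caller added for illustration.

```c
/* A brace-only macro such as
 *     #define XCHG_BRACES(type, a, b) { type t = a; a = b; b = t; }
 * breaks `if (c) XCHG_BRACES(...); else ...` because the `;` after the
 * closing brace ends the if statement before the else is reached.
 * Wrapping the body in do { } while (0) makes the expansion one statement
 * that still requires the trailing semicolon. */
#define XCHG(type, a, b) \
    do { type xchg_tmp = (a); (a) = (b); (b) = xchg_tmp; } while (0)

/* Returns 1 if the pair was already ordered, 0 if it had to be swapped. */
static int order_pair(int *lo, int *hi)
{
    int already_ordered = 0;
    if (*lo > *hi)
        XCHG(int, *lo, *hi);
    else
        already_ordered = 1;
    return already_ordered;
}
```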
  10. 04 Jul, 2008 1 commit
    • Update file headers throughout x264 · bdbd4fe7
      Fiona Glaser authored
      Update "Authors" lists based on actual authorship; the author listed highest is the most important
      Update copyright notices and remove old CVS tags from file headers
      Add file headers to GTK and other sections missing them
      Update FSF address
      Other header-related cosmetics
  11. 03 Jul, 2008 1 commit
  12. 02 Jul, 2008 2 commits
    • Fix bug in adaptive quantization · 5b92682d
      Fiona Glaser authored
      In some cases adaptive quantization did not correctly calculate the variance.
      Bug reported by MasterNobody
    • Optimizations and cosmetics in macroblock.c · a59f4a7b
      Fiona Glaser authored
      If an i4x4 dct block has no coefficients, don't bother with dequant/zigzag/idct.  Not useful for larger sizes because the odds of an empty block are much lower.
      Cosmetics in i16x16 to be more consistent with other similar functions.
      Add an SSD threshold for chroma in probe_skip to improve speed and minimize time spent on chroma skip analysis.
      Rename lambda arrays to lambda_tab for consistency.
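The empty-block shortcut for i4x4 can be pictured as an early return before the reconstruction stage. The sketch below is an illustration only: `block_is_empty`, `reconstruct_4x4`, and `finish_block` are hypothetical names, and the reconstruction body is a placeholder that adds the residual to the prediction rather than performing x264's real dequant/zigzag/idct.

```c
#include <stdint.h>

static int reconstruct_calls = 0; /* counts how often the costly path runs */

/* Returns 1 when all 16 coefficients of the block are zero. */
static int block_is_empty(const int16_t coef[16])
{
    int i;
    for (i = 0; i < 16; i++)
        if (coef[i])
            return 0;
    return 1;
}

/* Placeholder for dequant/zigzag/idct: adds the residual to the prediction
   already stored in dst, clamping to the 8-bit pixel range. */
static void reconstruct_4x4(uint8_t dst[16], const int16_t coef[16])
{
    int i;
    for (i = 0; i < 16; i++) {
        int v = dst[i] + coef[i];
        dst[i] = (uint8_t)(v < 0 ? 0 : v > 255 ? 255 : v);
    }
    reconstruct_calls++;
}

/* With no coefficients, the prediction is already the final reconstruction,
   so the costly stage can be skipped entirely. */
static void finish_block(uint8_t dst[16], const int16_t coef[16])
{
    if (block_is_empty(coef))
        return;
    reconstruct_4x4(dst, coef);
}
```

As the commit notes, the check only pays off where empty blocks are common, which is why it is applied to 4x4 blocks and not to larger sizes.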
  13. 24 Jun, 2008 2 commits
    • Move bitstream end check to macroblock level · e9369576
      Fiona Glaser authored
      Additionally, rather than silently truncating the frame upon reaching the end of the buffer, reallocate a larger buffer.
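A grow-on-demand output buffer of the kind described above might look like the following. This is a sketch under assumptions, not x264's bitstream code: `byte_buffer`, `buffer_reserve`, and `buffer_put` are hypothetical names, and the doubling policy is an arbitrary choice for the example.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    uint8_t *start;    /* heap allocation, may move when the buffer grows */
    size_t   used;     /* bytes written so far (an offset, not a pointer) */
    size_t   capacity; /* bytes allocated */
} byte_buffer;

/* Ensure room for `extra` more bytes. Returns 0 on success, -1 on failure;
   on failure the old buffer and its contents remain valid. */
static int buffer_reserve(byte_buffer *b, size_t extra)
{
    size_t need = b->used + extra;
    size_t new_cap;
    uint8_t *p;
    if (need <= b->capacity)
        return 0;
    new_cap = b->capacity ? b->capacity : 16;
    while (new_cap < need)
        new_cap *= 2;
    p = realloc(b->start, new_cap);
    if (!p)
        return -1;
    b->start = p;
    b->capacity = new_cap;
    return 0;
}

/* Append n bytes, growing the buffer instead of truncating the output. */
static int buffer_put(byte_buffer *b, const uint8_t *src, size_t n)
{
    if (buffer_reserve(b, n) < 0)
        return -1;
    memcpy(b->start + b->used, src, n);
    b->used += n;
    return 0;
}
```

Checking once per macroblock, as the commit title describes, would correspond to calling something like `buffer_reserve` with a worst-case macroblock size before encoding each macroblock, so the inner write path needs no bounds test. Tracking the write position as an offset keeps it valid when `realloc` moves the allocation.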
    • Convert NNZ to raster order and other optimizations · ec3d0955
      Fiona Glaser authored
      Converting NNZ to raster order simplifies a lot of the load/store code and allows more use of write-combining.
      More use of write-combining throughout load/save code in common/macroblock.c
      GCC has aliasing issues in the case of stores to 8-bit heap-allocated arrays; dereferencing the pointer once avoids this problem and significantly increases performance.
      More manual loop unrolling and such.
      Move all packXtoY functions to macroblock.h so any function can use them.
      Add pack8to32.
      Minor optimizations to encoder/macroblock.c
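Write-combining here means replacing several byte stores with one wider store. With NNZ counts in raster order, the four counts of one row of 4x4 blocks are adjacent in memory, so a packed 32-bit value can fill the row in a single write. The sketch below assumes a little-endian target for the byte layout; `store_nnz_row` is a hypothetical helper, and the `pack8to32` body is an assumed implementation rather than a copy of the one in macroblock.h.

```c
#include <stdint.h>
#include <string.h>

/* Pack four 8-bit values into one 32-bit word. On a little-endian machine,
   `a` ends up in the lowest-addressed byte once the word is stored. */
static uint32_t pack8to32(unsigned a, unsigned b, unsigned c, unsigned d)
{
    return (uint32_t)(a & 0xff)
         | ((uint32_t)(b & 0xff) << 8)
         | ((uint32_t)(c & 0xff) << 16)
         | ((uint32_t)(d & 0xff) << 24);
}

/* Write one row of four NNZ counts with a single 32-bit store.
   memcpy avoids alignment and aliasing pitfalls; compilers typically
   lower a fixed 4-byte memcpy to one store instruction. */
static void store_nnz_row(uint8_t *row, uint32_t packed)
{
    memcpy(row, &packed, sizeof packed);
}
```

For a 4x4 grid of counts stored as `uint8_t nnz[16]`, row `r` starts at `nnz + 4 * r`, so a macroblock's luma counts can be written with four such stores instead of sixteen byte writes.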
  14. 18 Jun, 2008 1 commit
  15. 15 Jun, 2008 2 commits
  16. 12 Jun, 2008 2 commits
    • More tweaks to me.c · 52041128
      Fiona Glaser authored
      Added inline MMX version of UMH's predictor difference test
      Various cosmetics throughout me.c
      Removed a C99-ism introduced in r878.
    • Fix regression in r736 · d4e07786
      Fiona Glaser authored
      r736 added intra RD refinement to B-frames; however, it is possible for subme=7 to be used without b-rdo.
      This means intra RD isn't run, and therefore it is possible for intra chroma analysis to not have been run, since update_cache was never called for an intra block, and chroma ME is not required even at subme=7.
      r801, which removed a memset, made this worse because previously the chroma prediction mode was at least initialized to zero; now it was not initialized at all.
      Therefore, --no-chroma-me, --subme 7, and no --b-rdo had the potential to crash.
      This change restricts intra RD refinement to only be run when --b-rdo is enabled (sensible to begin with), thus preventing a crash in this case.
  17. 11 Jun, 2008 3 commits
  18. 08 Jun, 2008 3 commits
    • Partially inline trellis quantization · 9cc180ac
      Fiona Glaser authored
      Inlining trellis into the 4x4/8x8 trellis wrappers increases trellis speed by about 5-10% through constant propagation.
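The speedup from inlining into size-specific wrappers comes from the compiler seeing the block size as a compile-time constant in each copy, which lets it unroll loops and fold size-dependent branches. The sketch below illustrates the pattern with a simple thresholding kernel standing in for trellis quantization; `threshold_block`, `threshold_4x4`, and `threshold_8x8` are hypothetical names and the thresholds are arbitrary.

```c
#include <stdint.h>

/* Generic kernel: zero every coefficient whose magnitude is below `thresh`
   and return the number of coefficients that survive. Marked inline so each
   wrapper can get its own specialized copy. */
static inline int threshold_block(int16_t *coef, int size, int thresh)
{
    int i, nnz = 0;
    for (i = 0; i < size; i++) {
        int a = coef[i] < 0 ? -coef[i] : coef[i];
        if (a < thresh)
            coef[i] = 0;
        else
            nnz++;
    }
    return nnz;
}

/* Thin wrappers: `size` and `thresh` are constants here, so after inlining
   the compiler can propagate them into the loop bound and the comparison. */
static int threshold_4x4(int16_t coef[16]) { return threshold_block(coef, 16, 2); }
static int threshold_8x8(int16_t coef[64]) { return threshold_block(coef, 64, 3); }
```

The cost is code size: each wrapper carries its own copy of the kernel, which is presumably why the commit inlines only partially.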
    • Various cosmetic changes. · 3d9b6b3c
      Fiona Glaser authored
    • many changes to which asm functions are enabled on which cpus. · c0c0e1f4
      Loren Merritt authored
      with Phenom, 3dnow is no longer equivalent to "sse2 is slow", so make a new flag for that.
      some sse2 functions are useful only on Core2 and Phenom, so make a "sse2 is fast" flag for that.
      some ssse3 instructions didn't become useful until Penryn, so yet another flag.
      disable sse2 completely on Pentium M and Core1, because it's uniformly slower than mmx.
      enable some sse2 functions on Athlon64 that always were faster and we just didn't notice.
      remove mc_luma_sse3, because the only cpu that has lddqu (namely Pentium 4D) doesn't have "sse2 is fast".
      don't print mmx1, sse1, nor 3dnow in the detected cpuflags, since we don't really have any such functions. likewise don't print sse3 unless it's used (Pentium 4D).
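The flag scheme described above separates "the instruction set exists" from "the instruction set is worth using". A rough sketch of how such flags could drive function selection follows. All names here (`CPU_SSE2_IS_SLOW`, `CPU_SSE2_IS_FAST`, `pick_impl`, and so on) are illustrative assumptions, not x264's actual definitions, and the selection rule is simplified.

```c
/* Hypothetical capability and performance-hint flags. */
enum {
    CPU_MMX          = 1 << 0,
    CPU_SSE2         = 1 << 1,
    CPU_SSE2_IS_SLOW = 1 << 2, /* SSE2 present, but generally slower than MMX */
    CPU_SSE2_IS_FAST = 1 << 3  /* SSE2 fast enough for the marginal functions */
};

typedef enum { IMPL_C, IMPL_MMX, IMPL_SSE2 } impl_t;

/* Choose an implementation for one function.
   needs_fast_sse2 marks functions whose SSE2 version only wins on CPUs
   flagged as having fast SSE2. */
static impl_t pick_impl(unsigned cpu, int needs_fast_sse2)
{
    impl_t impl = IMPL_C;
    if (cpu & CPU_MMX)
        impl = IMPL_MMX;
    if ((cpu & CPU_SSE2) && !(cpu & CPU_SSE2_IS_SLOW)) {
        if (!needs_fast_sse2 || (cpu & CPU_SSE2_IS_FAST))
            impl = IMPL_SSE2;
    }
    return impl;
}
```

Disabling SSE2 completely on a given CPU family, as the commit does for Pentium M and Core1, would in this model amount to never setting `CPU_SSE2` during detection for those parts.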
  19. 05 Jun, 2008 2 commits
  20. 03 Jun, 2008 4 commits
  21. 02 Jun, 2008 3 commits