1. 06 Mar, 2019 1 commit
  2. 06 Aug, 2018 2 commits
  3. 17 Jan, 2018 1 commit
  4. 24 Dec, 2017 1 commit
      Unify 8-bit and 10-bit CLI and libraries · 71ed44c7
      Vittorio Giovara authored
      Add 'i_bitdepth' to x264_param_t with the corresponding '--output-depth' CLI
      option to set the bit depth at runtime.
      
      Drop the 'x264_bit_depth' global variable. Rather than hardcoding it to an
      incorrect value, it's preferable to induce a linking failure. If an application
      relies on this symbol, the failure makes it obvious where the problem is.
      
      Add Makefile rules that compile modules with different bit depths. Assembly
      on x86 is prefixed with the 'private_prefix' define, while all other archs
      modify their function prefix internally.
      
      Templatize the main C library, x86/x86_64 assembly, ARM assembly, AARCH64
      assembly, PowerPC assembly, and MIPS assembly.
      
      The depth and cache CLI filters heavily depend on bit depth size, so they
      need to be duplicated for each value. This means having to rename these
      filters, and adjust the callers to use the right version.
      
      Unfortunately, the threaded input CLI module inherits a common.h dependency
      (input/frame -> common/threadpool -> common/frame -> common/common) which
      is extremely complicated to address in a sensible way. Instead, duplicate
      the module and select the appropriate one at run time.
      
      Each bitdepth needs different checkasm compilation rules, so split the main
      checkasm target into two executables.
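The idea in the commit above, building the same code once per bit depth under distinct symbol prefixes and choosing a variant at run time, can be sketched in plain C as follows. This is only an illustration under stated assumptions: the `demo_` prefix, `depth_ops_t`, `DEFINE_DEPTH_OPS`, and `select_depth_ops` are hypothetical names and are not x264 identifiers, and x264 itself builds separate object files through its Makefile rather than using a single macro.

```c
#include <stddef.h>

/* Hypothetical per-depth function table (not an x264 type). */
typedef struct {
    int bit_depth;
    int (*pixel_max)(void);
    int (*clip)(int v);
} depth_ops_t;

/* Expand the same function bodies once per bit depth, giving each
 * expansion its own prefix so both variants can coexist in one binary. */
#define DEFINE_DEPTH_OPS(depth)                                        \
    static int demo_##depth##_pixel_max(void)                          \
    {                                                                  \
        return (1 << depth) - 1;                                       \
    }                                                                  \
    static int demo_##depth##_clip(int v)                              \
    {                                                                  \
        int max = demo_##depth##_pixel_max();                          \
        return v < 0 ? 0 : (v > max ? max : v);                        \
    }                                                                  \
    static const depth_ops_t demo_##depth##_ops = {                    \
        depth, demo_##depth##_pixel_max, demo_##depth##_clip           \
    };

DEFINE_DEPTH_OPS(8)
DEFINE_DEPTH_OPS(10)

/* Pick the variant at run time; NULL signals an unsupported depth. */
static const depth_ops_t *select_depth_ops(int bit_depth)
{
    switch (bit_depth) {
    case 8:  return &demo_8_ops;
    case 10: return &demo_10_ops;
    default: return NULL;
    }
}
```

A caller would look up the table once, for example from a runtime bit-depth setting, and then call through it, so the rest of the program never references a depth-specific symbol directly.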
  5. 26 Jun, 2017 1 commit
  6. 24 Jun, 2017 5 commits
  7. 21 May, 2017 5 commits
  8. 21 Jan, 2017 1 commit
  9. 01 Dec, 2016 4 commits
  10. 20 Sep, 2016 1 commit
  11. 12 Apr, 2016 1 commit
      x86: dct2x4dc asm · eeb9b66d
      Henrik Gramner authored
      Only used in 4:2:2. MMX2 version implemented for 8-bit, SSE2 and AVX
      versions implemented for high bit-depth.
      
      2.5x faster on 32-bit and 1.6x faster on 64-bit compared to C on Ivy Bridge.
  12. 16 Jan, 2016 1 commit
  13. 11 Oct, 2015 1 commit
  14. 25 Jul, 2015 1 commit
  15. 23 Feb, 2015 1 commit
  16. 16 Dec, 2014 2 commits
  17. 16 Sep, 2014 1 commit
  18. 26 Aug, 2014 1 commit
  19. 08 Jan, 2014 1 commit
  20. 20 May, 2013 1 commit
  21. 23 Apr, 2013 1 commit
  22. 26 Feb, 2013 1 commit
      x86: detect Bobcat, improve Atom optimizations, reorganize flags · 5d60b9c9
      Fiona Glaser authored
      The Bobcat has a 64-bit SIMD unit reminiscent of the Athlon 64; detect this
      and apply the appropriate flags.
      
      It also has an extremely slow palignr instruction; create a flag for this to
      avoid massive penalties on palignr-heavy functions.
      
      Improve Atom function selection and document exactly what the SLOW_ATOM flag
      covers.
      
      Add Atom-optimized SATD/SA8D/hadamard_ac functions: simply combine the ssse3
      optimizations with the sse2 algorithm to avoid pmaddubsw, which is slow on
      Atom along with other SIMD multiplies.
      
      Drop TBM detection; it'll probably never be useful for x264.
      
      Invert FastShuffle to SlowShuffle; it only ever applied to one CPU (Conroe).
      
      Detect CMOV, to fail more gracefully when run on a chip with MMX2 but no CMOV.
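The CMOV check mentioned at the end of the commit above can be illustrated with a small, portable sketch that decodes the EDX register of CPUID leaf 1. The bit positions (CMOV = 15, MMX = 23, SSE = 25) are the documented ones; everything else is an assumption made for illustration: the `DEMO_CPU_*` flag names and `demo_flags_from_cpuid1_edx` are hypothetical, and the policy of reporting no SIMD flags at all when CMOV is missing is one possible way to "fail gracefully", not necessarily what x264 does.

```c
#include <stdint.h>

/* Hypothetical capability flags (not x264's X264_CPU_* values). */
enum {
    DEMO_CPU_CMOV = 1u << 0,
    DEMO_CPU_MMX  = 1u << 1,
    DEMO_CPU_SSE  = 1u << 2
};

/* Feature bits in EDX as returned by CPUID leaf 1. */
#define CPUID1_EDX_CMOV (1u << 15)
#define CPUID1_EDX_MMX  (1u << 23)
#define CPUID1_EDX_SSE  (1u << 25)

/* Translate a raw EDX value into capability flags.
 * Assumed policy: without CMOV, report nothing, so callers fall back
 * to plain C instead of running SIMD code that relies on CMOV. */
static uint32_t demo_flags_from_cpuid1_edx(uint32_t edx)
{
    uint32_t flags = 0;

    if (!(edx & CPUID1_EDX_CMOV))
        return 0;
    flags |= DEMO_CPU_CMOV;

    if (edx & CPUID1_EDX_MMX)
        flags |= DEMO_CPU_MMX;
    if (edx & CPUID1_EDX_SSE)
        flags |= DEMO_CPU_SSE;

    return flags;
}
```

Taking the register value as a parameter keeps the sketch compilable on any architecture; on x86 the value would come from executing CPUID with EAX set to 1.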
  23. 09 Jan, 2013 1 commit
  24. 04 Feb, 2012 3 commits
  25. 15 Jan, 2012 1 commit