1. 06 Aug, 2018 7 commits
    • x86: AVX-512 plane_copy and plane_copy_swap · 3d9ec58f
      Henrik Gramner authored
      Avoid the scalar C wrapper by utilizing opmasks to prevent overreading the
      input buffer.
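The opmask idea can be illustrated with a portable scalar model. This is a minimal sketch, not the actual assembly: `tail_mask`, `masked_copy64` and `copy_row` are hypothetical names, and the byte loop only emulates what a masked 64-byte load and store would do. The relevant property is that AVX-512 masked loads do not access (or fault on) elements whose mask bit is clear, so the last partial vector of a row can be handled without a scalar fallback.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: bitmask with the low n bits set (n in 0..64),
 * i.e. the value an opmask register would hold for n valid bytes. */
static uint64_t tail_mask(unsigned n)
{
    assert(n <= 64);
    return n == 64 ? ~UINT64_C(0) : (UINT64_C(1) << n) - 1;
}

/* Scalar emulation of a masked 64-byte load followed by a masked store:
 * only bytes whose mask bit is set are read or written. */
static void masked_copy64(uint8_t *dst, const uint8_t *src, uint64_t mask)
{
    for (int i = 0; i < 64; i++)
        if ((mask >> i) & 1)
            dst[i] = src[i];
}

/* Copy one row of `width` bytes in 64-byte steps without ever reading
 * src[width] or beyond. */
static void copy_row(uint8_t *dst, const uint8_t *src, size_t width)
{
    size_t x = 0;
    for (; x + 64 <= width; x += 64)
        masked_copy64(dst + x, src + x, tail_mask(64));
    if (x < width)
        masked_copy64(dst + x, src + x, tail_mask((unsigned)(width - x)));
}
```

For a 100-byte row this performs one full 64-byte step and one step with a 36-bit mask, so nothing past the end of the row is touched.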
    • 4:0:0 (monochrome) encoding support · 698c5a32
      eruffaldi authored
      Virtually zero increase in compression efficiency compared to 4:2:0 with empty
      chroma planes. Encoding speed is better though, especially with fast settings.
    • Makefile improvements · 814e61e8
      Diego Biurrun authored
       * Coalesce some install recipe lines
      
       * Remove empty addition of GPLed filters
      
       * Install libdir in recipes that directly require it
      
       * Coalesce etags/TAGS rules
      
       * Simplify fprofiled rule
    • x86inc: Improve SAVE/LOAD_MM_PERMUTATION macros · 28e48798
      Henrik Gramner authored
      Use register numbers instead of copying the full register names. This makes it
      possible to change register widths in the middle of a function while keeping
      the mmreg permutations intact, which can be useful for code that only needs
      larger vectors for parts of the function, in combination with macros etc.
      
      Also change the LOAD_MM_PERMUTATION macro to use the same default name as the
      SAVE macro. This simplifies swapping from ymm to xmm registers or vice versa:
      
          SAVE_MM_PERMUTATION
          INIT_XMM <cpuflags>
          LOAD_MM_PERMUTATION
    • x86inc: Optimize VEX instruction encoding · 8badb910
      Henrik Gramner authored
      Most VEX-encoded instructions require an additional byte to encode when src2
      is a high register (e.g. x|ymm8..15). If the instruction is commutative we
      can swap src1 and src2 when doing so reduces the instruction length, e.g.
      
          vpaddw xmm0, xmm0, xmm8 -> vpaddw xmm0, xmm8, xmm0
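The reason the swap helps can be modeled in a few lines of C. This is a simplified sketch under stated assumptions, not the x86inc macro logic: it only covers register-register instructions in the 0F opcode map with VEX.W = 0 (which includes vpaddw), and the `operands` type and function names are hypothetical. In that case src1 is encoded in VEX.vvvv, which can name all 16 registers in either prefix form, while src2 goes in ModRM.rm, whose high bit (VEX.B) only exists in the 3-byte prefix.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model: VEX prefix length for a reg-reg instruction in the
 * 0F map with VEX.W = 0. The 2-byte prefix has no B bit, so a ModRM.rm
 * register numbered 8..15 forces the 3-byte prefix. */
static int vex_prefix_len(int rm_reg)
{
    assert(rm_reg >= 0 && rm_reg <= 15);
    return rm_reg >= 8 ? 3 : 2;
}

/* Hypothetical operand triple: dst -> ModRM.reg, src1 -> VEX.vvvv,
 * src2 -> ModRM.rm. */
typedef struct { int dst, src1, src2; } operands;

/* Swap src1 and src2 of a commutative instruction when src2 is a high
 * register and src1 is not, which shortens the encoding by one byte. */
static operands shorten(operands op, bool commutative)
{
    if (commutative && op.src2 >= 8 && op.src1 < 8) {
        int tmp = op.src1;
        op.src1 = op.src2;
        op.src2 = tmp;
    }
    return op;
}
```

Applied to the example above, `shorten` turns operands (0, 0, 8) into (0, 8, 0), and the modeled prefix length drops from 3 bytes to 2. Non-commutative instructions are left unchanged.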
    • x86inc: Fix VEX -> EVEX instruction conversion · 0a84d986
      Henrik Gramner authored
      There's an edge case that wasn't properly handled.
  2. 21 Jul, 2018 3 commits
  3. 29 Jun, 2018 1 commit
  4. 24 Jun, 2018 1 commit
  5. 02 Jun, 2018 1 commit
    • Fix clang stack alignment issues · 7737e6ad
      Henrik Gramner authored
      Clang emits aligned AVX stores for things like zeroing stack-allocated
      variables when using -mavx, even with -fno-tree-vectorize set, which can
      result in crashes if this occurs before we've realigned the stack.
      
      Previously we only ensured that the stack was realigned before calling
      assembly functions that access stack-allocated buffers, but this is
      not sufficient. Fix the issue by changing the stack realignment to
      instead occur immediately in all CLI, API and thread entry points.
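One way to realign the stack at an entry point with GCC or Clang on x86 is the `force_align_arg_pointer` function attribute. The sketch below assumes that mechanism; the commit message does not say how the realignment is implemented, and the macro name `ENTRY_REALIGN` and the function `entry_point` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical macro name. The attribute is an x86-specific GCC/Clang
 * feature, so it expands to nothing elsewhere. */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#define ENTRY_REALIGN __attribute__((force_align_arg_pointer))
#else
#define ENTRY_REALIGN
#endif

/* Stand-in for a CLI, API or thread entry point. The stack is realigned
 * in the prologue, before any compiler-generated stores to locals. */
ENTRY_REALIGN int entry_point(void)
{
    _Alignas(32) uint8_t buf[64];
    /* Zeroing a local is the kind of operation the compiler may turn
     * into aligned vector stores. */
    memset(buf, 0, sizeof buf);
    return (uintptr_t)buf % 32 == 0;
}
```

Placing the attribute on every externally reachable entry point means callers with a weaker stack alignment guarantee cannot cause a misaligned store further down the call chain.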
  6. 27 May, 2018 6 commits
  7. 31 Mar, 2018 2 commits
  8. 18 Jan, 2018 1 commit
  9. 17 Jan, 2018 7 commits
  10. 24 Dec, 2017 11 commits