  1. 07 Jun, 2019 1 commit
  2. 06 Jun, 2019 1 commit
  3. 05 Jun, 2019 2 commits
  4. 04 Jun, 2019 1 commit
    • meson: Fix nasm detection · 098a565c
      Marvin Scholz authored
      nasm -v can actually fail, for example on macOS, where nasm could be a
      stub executable that forwards commands to the real nasm but fails if
      the real nasm is not installed.
      This would lead to a confusing error message due to the out-of-bounds
      array access. To avoid that, explicitly check the exit code.
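The check described in this commit message can be illustrated outside of Meson. The following is a minimal Python sketch, not the actual build-system change: the helper names `parse_nasm_version` and `detect_nasm_version` are hypothetical, and the assumed output shape (`NASM version X.Y.Z ...`) is an assumption about what `nasm -v` prints. The point is only that the exit code is tested before the output is split and indexed.

```python
import subprocess


def parse_nasm_version(returncode, stdout):
    """Return the version field from `nasm -v` output, or None.

    Assumes output of the form "NASM version 2.14.02 compiled on ...".
    The exit code is checked first, so a failing stub executable never
    reaches the indexing step.
    """
    if returncode != 0:
        return None
    fields = stdout.split()
    # Guard the index as well, in case the output has an unexpected shape.
    if len(fields) < 3:
        return None
    return fields[2]


def detect_nasm_version(cmd="nasm"):
    """Run `<cmd> -v` and return its version string, or None on failure."""
    try:
        result = subprocess.run([cmd, "-v"], capture_output=True, text=True)
    except OSError:
        # The executable does not exist or cannot be started.
        return None
    return parse_nasm_version(result.returncode, result.stdout)
```

Without the exit-code check, an empty output from a failing stub would be split into an empty list and indexing it would raise the kind of out-of-bounds error the commit message mentions.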
  5. 01 Jun, 2019 1 commit
  6. 31 May, 2019 1 commit
  7. 24 May, 2019 2 commits
  8. 23 May, 2019 1 commit
  9. 21 May, 2019 5 commits
  10. 19 May, 2019 3 commits
    • ci: Add full testdata tests on aarch64 · a690e548
      Martin Storsjö authored
      The armv7 runner doesn't seem to cope well with the testdata though.
    • Henrik Gramner's avatar
      7d5f0d0c
    • arm: mc: Fix 8tap_v w8 with OBMC 3/4 heights · bf920fba
      Martin Storsjö authored
      Also make sure that the w4 case can exit after processing 12 pixels,
      where it is convenient.
      
      This gives a small slowdown for in-order cores like A7, A8, A53, but
      actually seems to give a small speedup for out-of-order cores like
      A9, A72 and A73.
      
      AArch64:
      Before:                      Cortex A53     A72     A73
      mc_8tap_regular_w8_v_8bpc_neon:   223.8   247.3   228.5
      After:
      mc_8tap_regular_w8_v_8bpc_neon:   232.5   243.9   223.4
      
      AArch32:
      Before:                       Cortex A7      A8      A9     A53     A72     A73
      mc_8tap_regular_w8_v_8bpc_neon:   550.2   470.7   520.5   257.0   256.4   248.2
      After:
      mc_8tap_regular_w8_v_8bpc_neon:   554.3   474.2   511.6   267.5   252.6   246.8
  11. 18 May, 2019 1 commit
    • Optimize obmc blend · f64fdae5
      Henrik Gramner authored
      The last 1/4 of the mask is always zero, so we can skip some
      calculations that don't change the output.
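The idea behind this optimization can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code from the commit: the 6-bit blend formula `(dst * (64 - m) + tmp * m + 32) >> 6` and the mask values below are assumptions chosen for the example, and the function names are hypothetical. With a mask weight of zero the blend returns the destination pixel unchanged, so the final quarter can be left untouched.

```python
def blend_pixel(dst, tmp, m):
    """Blend two pixels with a 6-bit weight m (assumed formula)."""
    return (dst * (64 - m) + tmp * m + 32) >> 6


def blend_full(dst, tmp, mask):
    """Reference version: blend every position."""
    return [blend_pixel(d, t, m) for d, t, m in zip(dst, tmp, mask)]


def blend_skip_tail(dst, tmp, mask):
    """Blend only the first 3/4 of the positions.

    Relies on the last quarter of the mask being zero, in which case
    those destination pixels would not change anyway.
    """
    n = len(mask) * 3 // 4
    head = [blend_pixel(d, t, m) for d, t, m in zip(dst[:n], tmp[:n], mask[:n])]
    return head + list(dst[n:])


# Hypothetical 8-entry mask whose last quarter (2 entries) is zero.
example_mask = [45, 39, 30, 20, 11, 4, 0, 0]
example_dst = [10, 200, 35, 90, 255, 0, 128, 64]
example_tmp = [250, 3, 77, 140, 0, 255, 17, 201]
```

Both functions produce the same result for this input, while the second one does a quarter less arithmetic.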
  12. 17 May, 2019 3 commits
  13. 16 May, 2019 2 commits
  14. 15 May, 2019 2 commits
    • arm64: msac: Add handwritten versions of msac_decode_bool functions · 2e8a3a21
      Martin Storsjö authored
      GCC                     Cortex A53   A72   A73
      msac_decode_bool_c:           29.9  17.9  23.2
      msac_decode_bool_neon:        27.4  15.3  20.4
      msac_decode_bool_adapt_c:     49.2  26.8  31.0
      msac_decode_bool_adapt_neon:  38.2  22.2  25.4
      msac_decode_bool_equi_c:      26.6  16.8  19.4
      msac_decode_bool_equi_neon:   23.9  13.7  15.7
      
      Clang                   Cortex A53   A72   A73
      msac_decode_bool_c:           28.0  16.4  23.1
      msac_decode_bool_neon:        26.9  14.6  21.0
      msac_decode_bool_adapt_c:     46.8  25.1  31.4
      msac_decode_bool_adapt_neon:  36.2  19.0  26.2
      msac_decode_bool_equi_c:      23.7  13.4  18.8
      msac_decode_bool_equi_neon:   23.7  11.3  14.2
      
      This is as fast as, or faster than, what either GCC or Clang
      produces.
    • arm64: msac: Fix a typo in a comment · 84f938ec
      Martin Storsjö authored
  15. 14 May, 2019 5 commits
  16. 12 May, 2019 2 commits
  17. 11 May, 2019 2 commits
  18. 09 May, 2019 5 commits