1. 10 Sep, 2019 5 commits
  2. 06 Sep, 2019 1 commit
  3. 05 Sep, 2019 2 commits
    • Silence some clang-cl warnings · acad1a99
      Henrik Gramner authored
      For some reason the MSVC CRT _wassert() function is not flagged as
      __declspec(noreturn), so when using those headers the compiler will
      expect execution to continue after an assertion has been triggered
      and will therefore complain about the use of uninitialized variables
      when compiled in debug mode in certain code paths.
      
      Reorder some case statements as a workaround.
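As an illustration of the workaround described in this commit, here is a minimal sketch that is not taken from the dav1d source. The enum, the function name, and the values are hypothetical. It assumes the reordering means placing the asserting `default` label ahead of a valid case so that it falls through, which means the variable is written on every path the compiler can see, even when the assertion handler is not known to be noreturn.

```c
#include <assert.h>

/* Hypothetical types and names, for illustration only. */
enum shape { SHAPE_TRIANGLE, SHAPE_SQUARE, SHAPE_PENTAGON };

/* If `default: assert(0);` were the last label and the assertion handler
 * were not marked noreturn, the compiler would see a path on which `n` is
 * returned without being written. Moving `default` above a valid case and
 * letting it fall through removes that path. */
static int side_count(enum shape s)
{
    int n;
    switch (s) {
    case SHAPE_TRIANGLE: n = 3; break;
    case SHAPE_SQUARE:   n = 4; break;
    default: assert(0); /* fall through */
    case SHAPE_PENTAGON: n = 5; break;
    }
    return n;
}
```

Valid inputs behave exactly as before; only the path after a failed assertion changes.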
    • x86: Fix buffer overread in mc put · 69dae683
      Henrik Gramner authored
      For w <= 32 we can't process more than two rows per loop iteration.
      
      Credit to OSS-Fuzz.
  4. 04 Sep, 2019 5 commits
    • x86: Increase precision of the final inverse ADST transform stages · a9315f5f
      Henrik Gramner authored
      16-bit precision is sufficient for the second pass, but the first pass
      requires 32-bit precision to correctly handle some esoteric edge cases.
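To show why intermediate width can matter in a final scaling stage, here is a scalar sketch. It is not the actual SIMD code. It assumes the stage adds two 16-bit values and then scales the sum by 2896/4096 (about 1/sqrt(2)) with rounding; the function names and the sample inputs are hypothetical. When the sum is saturated to 16 bits before scaling, a result that would have fit in 16 bits comes out too small.

```c
#include <stdint.h>

/* Clamp a 32-bit value to the int16_t range. */
static int32_t sat16(int32_t v)
{
    return v > INT16_MAX ? INT16_MAX : v < INT16_MIN ? INT16_MIN : v;
}

/* Sum kept in 16 bits: the addition saturates before the scaling step.
 * (Right-shifting a negative value is implementation-defined in C; an
 * arithmetic shift is assumed here.) */
static int32_t scale_sum_narrow(int16_t a, int16_t b)
{
    int32_t sum = sat16((int32_t)a + b);
    return sat16((sum * 2896 + 2048) >> 12);
}

/* Sum kept in 32 bits: only the final result is clamped to 16 bits. */
static int32_t scale_sum_wide(int16_t a, int16_t b)
{
    int32_t sum = (int32_t)a + b;
    return sat16((sum * 2896 + 2048) >> 12);
}
```

With the hypothetical inputs 30000 and 10000, `scale_sum_wide` returns 28281 while `scale_sum_narrow` returns 23167. For small inputs the two functions agree.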
    • arm64: itx: Do the final calculation of adst4/adst8/adst16 in 32 bit to avoid too narrow clipping · e2702eaf
      Martin Storsjö authored
      See issue #295; this fixes it for arm64.
      
      Before:                                 Cortex A53      A72      A73
      inv_txfm_add_4x4_adst_adst_1_8bpc_neon:      103.0     63.2     65.2
      inv_txfm_add_4x8_adst_adst_1_8bpc_neon:      197.0    145.0    134.2
      inv_txfm_add_8x8_adst_adst_1_8bpc_neon:      332.0    248.0    247.1
      inv_txfm_add_16x16_adst_adst_2_8bpc_neon:   1676.8   1197.0   1186.8
      After:
      inv_txfm_add_4x4_adst_adst_1_8bpc_neon:      103.0     76.4     67.0
      inv_txfm_add_4x8_adst_adst_1_8bpc_neon:      205.0    155.0    143.8
      inv_txfm_add_8x8_adst_adst_1_8bpc_neon:      358.0    269.0    276.2
      inv_txfm_add_16x16_adst_adst_2_8bpc_neon:   1785.2   1347.8   1312.1
      
      This would probably only be needed for adst in the first pass, but
      the additional code complexity from splitting the implementations
      (as we currently don't have transforms differentiated between first
      and second pass) isn't necessarily worth it (the speedup over C code
      is still 8-10x).
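The cost of the wider arithmetic can be read off the before/after figures in this commit with a small helper. This is only a sketch; the function name is hypothetical, and it assumes lower benchmark figures mean faster code.

```c
/* Relative increase, in percent, of a benchmark figure after a change. */
static double overhead_percent(double before, double after)
{
    return (after - before) / before * 100.0;
}
```

For example, the Cortex A53 figures for the 16x16 transform (1676.8 before, 1785.2 after) work out to roughly 6.5% more time.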
    • Prefer __builtin_unreachable() over __assume() on clang-cl · c0e1988b
      Henrik Gramner authored
      __assume() doesn't work correctly in clang-cl versions prior to 7.0.0
      which causes bogus warnings regarding use of uninitialized variables
      to be printed. Avoid that by using __builtin_unreachable() instead.
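One way to express this preference in a header is sketched below. It is an assumption about the general shape, not a copy of dav1d's actual macro; `CHECKED_ASSUME` and `table_lookup` are hypothetical names. The idea is that clang-cl defines `__clang__` as well as `_MSC_VER`, so testing for the GCC/Clang builtin first keeps clang-cl away from `__assume()`.

```c
#include <assert.h>

/* Hypothetical macro name; a sketch under the assumptions stated above. */
#ifdef NDEBUG
# if defined(__GNUC__) || defined(__clang__)
   /* Taken by GCC, clang and clang-cl. */
#  define CHECKED_ASSUME(x) do { if (!(x)) __builtin_unreachable(); } while (0)
# elif defined(_MSC_VER)
   /* Taken by MSVC proper only. */
#  define CHECKED_ASSUME(x) __assume(x)
# else
#  define CHECKED_ASSUME(x) do { } while (0)
# endif
#else
  /* Debug builds keep a real assertion. */
# define CHECKED_ASSUME(x) assert(x)
#endif

static int table_lookup(int idx)
{
    static const int table[4] = { 10, 20, 30, 40 };
    CHECKED_ASSUME(idx >= 0 && idx < 4);
    return table[idx];
}
```

In debug builds the macro is an ordinary assertion; in release builds it is only an optimizer hint, so violating the condition is undefined behavior.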
    • Fix clang-cl assertion warning · 666c71a0
      Henrik Gramner authored
      clang-cl doesn't like function calls in __assume statements, even
      trivial inline ones.
    • arm: Fix assembling with older binutils · e65abadf
      Janne Grunau authored
      This large constant needs a movw instruction, which newer binutils can
      figure out, but older versions need it stated explicitly.
      
      This fixes #296.
  5. 03 Sep, 2019 1 commit
    • TileContext: reorder scratch buffer to avoid conflicts · 863c3731
      Janne Grunau authored
      The chroma part of pal_idx potentially conflicts during intra
      reconstruction with edge_{8,16}bpc. Fixes out of range pixel values
      caused by invalid palette indices in
      clusterfuzz-testcase-minimized-dav1d_fuzzer_mt-5076736684851200.
      Fixes #294. Reported as integer overflows in boxsum5sqr with undefined
      behavior sanitizer. Credits to oss-fuzz.
  6. 01 Sep, 2019 1 commit
  7. 30 Aug, 2019 3 commits
  8. 29 Aug, 2019 3 commits
  9. 28 Aug, 2019 3 commits
  10. 23 Aug, 2019 2 commits
  11. 21 Aug, 2019 1 commit
  12. 18 Aug, 2019 1 commit
  13. 15 Aug, 2019 1 commit
    • arm64: mc: NEON implementation of w_mask_444/422/420 function · 3d94fb9a
      B Krishnan Iyer authored
                                       A73        A53

      w_mask_420_w4_8bpc_c:            818     1082.9
      w_mask_420_w4_8bpc_neon:          79      126.6
      w_mask_420_w8_8bpc_c:           2486     3399.8
      w_mask_420_w8_8bpc_neon:       200.2      343.7
      w_mask_420_w16_8bpc_c:        8022.3    10989.6
      w_mask_420_w16_8bpc_neon:      528.1        889
      w_mask_420_w32_8bpc_c:       31851.8    42808.6
      w_mask_420_w32_8bpc_neon:     2062.5     3380.8
      w_mask_420_w64_8bpc_c:       79268.5   102683.9
      w_mask_420_w64_8bpc_neon:     5252.9     8575.4
      w_mask_420_w128_8bpc_c:     193704.1   255586.5
      w_mask_420_w128_8bpc_neon:   14602.3    22167.7

      w_mask_422_w4_8bpc_c:          777.3     1038.5
      w_mask_422_w4_8bpc_neon:        72.1      112.9
      w_mask_422_w8_8bpc_c:         2405.7       3168
      w_mask_422_w8_8bpc_neon:       191.9      314.1
      w_mask_422_w16_8bpc_c:        7783.7    10543.9
      w_mask_422_w16_8bpc_neon:      559.8      835.5
      w_mask_422_w32_8bpc_c:       30895.7    41141.2
      w_mask_422_w32_8bpc_neon:     2089.7     3187.2
      w_mask_422_w64_8bpc_c:       75500.2    98766.3
      w_mask_422_w64_8bpc_neon:       5379     8208.2
      w_mask_422_w128_8bpc_c:     186967.1   245809.1
      w_mask_422_w128_8bpc_neon:   15159.9    21474.5

      w_mask_444_w4_8bpc_c:          850.1     1136.6
      w_mask_444_w4_8bpc_neon:        66.5      104.7
      w_mask_444_w8_8bpc_c:         2373.5     3262.9
      w_mask_444_w8_8bpc_neon:       180.5      290.2
      w_mask_444_w16_8bpc_c:        7291.6    10590.7
      w_mask_444_w16_8bpc_neon:      550.9      809.7
      w_mask_444_w32_8bpc_c:        8048.3    10140.8
      w_mask_444_w32_8bpc_neon:     2136.2       3095
      w_mask_444_w64_8bpc_c:       18055.3      23060
      w_mask_444_w64_8bpc_neon:     5522.5     8124.8
      w_mask_444_w128_8bpc_c:      42754.3      56072
      w_mask_444_w128_8bpc_neon:   15569.5    21531.5
  14. 14 Aug, 2019 2 commits
    • arm64: mc: NEON implementation of blend, blend_h and blend_v function · 1dc2dc7d
      B Krishnan Iyer authored
                                   A73      A53
      blend_h_w2_8bpc_c:         184.7    301.5
      blend_h_w2_8bpc_neon:       58.8    104.1
      blend_h_w4_8bpc_c:         291.4    507.3
      blend_h_w4_8bpc_neon:       48.7    108.9
      blend_h_w8_8bpc_c:         510.1    992.7
      blend_h_w8_8bpc_neon:       66.5     99.3
      blend_h_w16_8bpc_c:          972   1835.3
      blend_h_w16_8bpc_neon:      82.7    145.2
      blend_h_w32_8bpc_c:        776.7    912.9
      blend_h_w32_8bpc_neon:     155.1    266.9
      blend_h_w64_8bpc_c:       1424.3   1635.4
      blend_h_w64_8bpc_neon:     273.4    480.9
      blend_h_w128_8bpc_c:      3318.1     3774
      blend_h_w128_8bpc_neon:    614.1   1097.9
      blend_v_w2_8bpc_c:         278.8    427.5
      blend_v_w2_8bpc_neon:      113.7    170.4
      blend_v_w4_8bpc_c:         960.2   1597.7
      blend_v_w4_8bpc_neon:      222.9    351.4
      blend_v_w8_8bpc_c:        1694.2   3333.5
      blend_v_w8_8bpc_neon:      200.9    333.6
      blend_v_w16_8bpc_c:       3115.2   5971.6
      blend_v_w16_8bpc_neon:     233.2    494.8
      blend_v_w32_8bpc_c:       3949.7   6070.6
      blend_v_w32_8bpc_neon:     460.4    841.6
      blend_w4_8bpc_c:           244.2    388.3
      blend_w4_8bpc_neon:         25.5     66.7
      blend_w8_8bpc_c:           616.3   1120.8
      blend_w8_8bpc_neon:           46    110.7
      blend_w16_8bpc_c:         2193.1   4056.4
      blend_w16_8bpc_neon:       140.7    299.3
      blend_w32_8bpc_c:         2502.8   2998.5
      blend_w32_8bpc_neon:       381.4    725.3
    • Michael Bradshaw's avatar
      d20d70e8
  15. 13 Aug, 2019 4 commits
  16. 10 Aug, 2019 2 commits
  17. 09 Aug, 2019 3 commits