- 20 Oct, 2018 8 commits
-
-
Janne Grunau authored
checkasm --bench on a Qualcomm Kryo (Snapdragon 820):
nop: 33.0
avg_w4_8bpc_c: 450.5
avg_w4_8bpc_neon: 20.1
avg_w8_8bpc_c: 438.6
avg_w8_8bpc_neon: 45.2
avg_w16_8bpc_c: 1003.7
avg_w16_8bpc_neon: 112.8
avg_w32_8bpc_c: 3249.6
avg_w32_8bpc_neon: 429.9
avg_w64_8bpc_c: 7213.3
avg_w64_8bpc_neon: 1299.4
avg_w128_8bpc_c: 16791.3
avg_w128_8bpc_neon: 2978.4
w_avg_w4_8bpc_c: 605.7
w_avg_w4_8bpc_neon: 30.9
w_avg_w8_8bpc_c: 545.8
w_avg_w8_8bpc_neon: 72.9
w_avg_w16_8bpc_c: 1430.1
w_avg_w16_8bpc_neon: 193.5
w_avg_w32_8bpc_c: 4876.3
w_avg_w32_8bpc_neon: 715.3
w_avg_w64_8bpc_c: 11338.0
w_avg_w64_8bpc_neon: 2147.0
w_avg_w128_8bpc_c: 26822.0
w_avg_w128_8bpc_neon: 4596.3
mask_w4_8bpc_c: 604.6
mask_w4_8bpc_neon: 37.2
mask_w8_8bpc_c: 654.8
mask_w8_8bpc_neon: 96.0
mask_w16_8bpc_c: 1663.0
mask_w16_8bpc_neon: 272.4
mask_w32_8bpc_c: 5707.6
mask_w32_8bpc_neon: 1028.9
mask_w64_8bpc_c: 12735.3
mask_w64_8bpc_neon: 2533.2
mask_w128_8bpc_c: 31027.6
mask_w128_8bpc_neon: 6247.2
-
Janne Grunau authored
-
James Almer authored
-
Henrik Gramner authored
-
Henrik Gramner authored
Ordering the elements this way is more SIMD-friendly.
-
Henrik Gramner authored
-
James Almer authored
Fixes stack buffer overflows.
-
- 19 Oct, 2018 11 commits
-
-
Ronald S. Bultje authored
-
Marvin Scholz authored
Sets the meson b_ndebug option to default to if-release, so that asserts are disabled in release builds.
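As a rough illustration of what this option changes on the C side: when a release build defines NDEBUG, every `assert()` expands to nothing. The helper below is hypothetical (its name and bounds are not taken from dav1d) and only sketches the effect.

```c
#include <assert.h>

/* Hypothetical helper; the name and parameters are illustrative only. */
int clamp_checked(int v, int lo, int hi)
{
    /* This check is compiled out when NDEBUG is defined, which is what a
     * release build configured with b_ndebug=if-release is expected to do. */
    assert(lo <= hi);
    return v < lo ? lo : v > hi ? hi : v;
}
```

In a debug build a call with `lo > hi` aborts at the assert; in a release build the check is gone and only the clamping code remains.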
-
Ronald S. Bultje authored
-
-
David Michael Barr authored
Helped-by: Henrik Gramner <gramner@twoorioles.com>
-
David Michael Barr authored
-
David Michael Barr authored
This will help when writing x86_64 assembly.
-
This fixes warnings like these, if not all bitdepths are enabled:
../src/decode.c: In function ‘dav1d_submit_frame’:
../src/decode.c:2825:5: warning: "CONFIG_10BPC" is not defined [-Wundef]
 #if CONFIG_10BPC
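A minimal sketch of why the warning appears and one way to silence it, assuming the macro is meant to be either 0 or 1. The fallback definition and the `have_10bpc` helper are hypothetical; the actual change may well define the macro from the build system instead.

```c
/* Assumption: the build normally defines CONFIG_10BPC to 0 or 1.
 * Without a definition, "#if CONFIG_10BPC" silently evaluates to 0 and
 * -Wundef warns about it; an explicit 0 keeps the test well-defined. */
#ifndef CONFIG_10BPC
#define CONFIG_10BPC 0
#endif

/* Hypothetical helper reporting whether 10-bit support was compiled in. */
int have_10bpc(void)
{
#if CONFIG_10BPC
    return 1;
#else
    return 0;
#endif
}
```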
-
Martin Storsjö authored
Despite what MSDN says, this intrinsic doesn't exist for ARM, only for ARM64.
-
-
Martin Storsjö authored
-
- 18 Oct, 2018 8 commits
-
-
James Almer authored
All the functions are public, and the only prototype in this header is a duplicate.
-
Fix following ubsan error in #68:
../src/env.h:296:24: runtime error: shift exponent -1 is negative
[Detaching after fork from child process 22253]
#0 0x7ffff76ad6f9 in get_poc_diff /home/janne/src/dav1d/build-usan/../src/env.h:296:24
#1 0x7ffff76ad6f9 in parse_frame_hdr /home/janne/src/dav1d/build-usan/../src/obu.c:757
#2 0x7ffff7696491 in dav1d_parse_obus /home/janne/src/dav1d/build-usan/../src/obu.c:1023:20
#3 0x7ffff7921c7d in dav1d_decode /home/janne/src/dav1d/build-usan/../src/lib.c:193:20
#4 0x424869 in main /home/janne/src/dav1d/build-usan/../tools/dav1d.c:108:20
#5 0x7ffff63dfae6 in __libc_start_main (/lib64/libc.so.6+0x21ae6)
#6 0x403489 in _start (/home/janne/src/dav1d/build-usan/tools/dav1d+0x403489)
I can't reproduce the ubsan error in the issue.
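A shift exponent of -1 typically comes from computing `1 << (n - 1)` with `n == 0`. The sketch below shows one way to guard such a computation, assuming the shift amount is a bit count minus one. The function `poc_diff_guarded` and its parameters are hypothetical and are not dav1d's actual code.

```c
/* Hypothetical helper: signed wrap-around difference of two counters that
 * are n_bits wide. Illustrative only, not taken from dav1d. */
int poc_diff_guarded(int n_bits, int poc0, int poc1)
{
    /* With n_bits == 0, 1 << (n_bits - 1) would shift by -1, which is
     * undefined behaviour. Widths above 31 would overflow a 32-bit int. */
    if (n_bits <= 0 || n_bits > 31)
        return 0;

    const int mask = 1 << (n_bits - 1);
    const int diff = poc0 - poc1;

    /* Sign-extend the low n_bits bits of diff. */
    return (diff & (mask - 1)) - (diff & mask);
}
```

For a 3-bit counter, `poc_diff_guarded(3, 1, 7)` wraps around to a small positive difference rather than -6.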
-
Reject out of range values as errors and avoid undefined shifts. Fixes #67.
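The general pattern can be sketched as follows, assuming the value is a bit count read from the bitstream that later serves as a shift amount. The function name, the `max_bits` limit, and the -1 error code are all hypothetical; the commit does not say which fields were affected.

```c
#include <stdint.h>

/* Hypothetical parser step: validate a bit-count field before shifting.
 * Returns 0 on success and -1 if the value is out of range. */
int range_from_bits(unsigned n_bits, unsigned max_bits, uint32_t *out_range)
{
    /* Shifting a 32-bit value by 32 or more is undefined, so reject such
     * counts along with anything above the caller's limit. */
    if (n_bits > max_bits || n_bits >= 32)
        return -1;

    *out_range = UINT32_C(1) << n_bits;
    return 0;
}
```

The caller treats a negative return as a decoding error rather than continuing with a clamped or garbage value.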
-
Fixes #66. Also fixes a leak of the demuxer context.
-
Fixes #66.
-
Fixes #62.
-
-
- 17 Oct, 2018 1 commit
-
-
Ronald S. Bultje authored
wiener_luma_8bpc_c: 326272.1
wiener_luma_8bpc_avx2: 19841.5
Decoding time of first 1000 frames of Chimera-8bit-1920x1080.ivf goes from 27.471 to 23.558 seconds.
-
- 16 Oct, 2018 1 commit
-
-
Ronald S. Bultje authored
Also copy 4 pixels so SIMD can use a padded write (movd).
-
- 15 Oct, 2018 1 commit
-
-
Luc Trudeau authored
-
- 14 Oct, 2018 5 commits
-
-
David Michael Barr authored
-
David Michael Barr authored
Deriving uvtx for chroma-from-luma is the same for lossless. The availability of CfL is constrained to when they match.
-
David Michael Barr authored
Removes the splat of pixels that are then overwritten. Rename cfl_pred_1 to cfl_pred now that there is only one.
-
David Michael Barr authored
-
Henrik Gramner authored
-
- 13 Oct, 2018 5 commits
-
-
Fixes a memory leak seen with asan when running 'dav1d --tilethreads 2 ...'
-
Makes the tile parsing code simpler. Fixes a heap buffer overflow with clusterfuzz-testcase-minimized-dav1d_fuzzer-5726018392817664. Credit to oss-fuzz.
-
When breaking out of the decoding loop, either through an error or on reaching the limit of decoded frames, the input buffer might not be fully consumed by the previous dav1d_decode() call. Fixes a memory leak discovered while testing with frame and tile threads with --limit.
-
Fixes a heap buffer overflow in clusterfuzz-testcase-minimized-dav1d_fuzzer-5677513716531200. Credits to oss-fuzz.
-
Martin Storsjö authored
On ARM, the readtime implementations are instructions that might not always be allowed at runtime (depending on whether the kernel has allowed user mode code to access the cycle counter registers). In order to allow building checkasm with the option for benchmarking, while still running on devices where benchmarking isn't possible, don't use readtime anywhere unless --bench has been specified. Use GetTickCount for the seed on Windows, and gettimeofday on Unix.
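A minimal sketch of seeding from a wall-clock source rather than a cycle counter, along the lines described above. The function name `seed_from_clock` and the way the seconds and microseconds are combined are assumptions, not the code from this commit.

```c
#include <stddef.h>
#ifdef _WIN32
#include <windows.h>
#else
#include <sys/time.h>
#endif

/* Hypothetical helper: derive a PRNG seed without touching the cycle
 * counter, so it works even where user-mode counter access is denied. */
unsigned seed_from_clock(void)
{
#ifdef _WIN32
    /* Milliseconds since system start. */
    return (unsigned) GetTickCount();
#else
    struct timeval tv;
    gettimeofday(&tv, NULL);
    /* Unsigned arithmetic wraps, which is acceptable for a seed. */
    return (unsigned) tv.tv_usec + (unsigned) tv.tv_sec * 1000000u;
#endif
}
```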
-