1. 07 Aug, 2019 1 commit
  2. 28 Jul, 2019 1 commit
  3. 27 Jul, 2019 5 commits
  4. 25 Jul, 2019 1 commit
  5. 23 Jul, 2019 4 commits
    • arm: mc: neon: Merge load and other related operations in blend/blend_h/blend_v functions · 407c27db
      B Krishnan Iyer authored
                                A73               A53
                                Current  Earlier  Current  Earlier
      blend_h_w2_8bpc_neon:        71.1     74.1    132.7    137.5
      blend_h_w4_8bpc_neon:        60.2     65.8    137.5    147.1
      blend_h_w8_8bpc_neon:        62.2     68.9    123.1    131.7
      blend_h_w16_8bpc_neon:       82.1       86    180.7    190.3
      blend_h_w32_8bpc_neon:      149.9    149.2    358.3      358
      blend_h_w64_8bpc_neon:      265.3    263.1    630.2    629.8
      blend_h_w128_8bpc_neon:     579.5      571   1404.4   1404.5
      blend_v_w2_8bpc_neon:       118.7    118.7    193.2    195.3
      blend_v_w4_8bpc_neon:       248.6    245.8    373.4    357.3
      blend_v_w8_8bpc_neon:       202.7      202    356.4    357.2
      blend_v_w16_8bpc_neon:      238.8    234.8    590.4    591.3
      blend_v_w32_8bpc_neon:      346.7    344.4    993.7    994.7
      blend_w4_8bpc_neon:          33.5     37.5     90.7     96.7
      blend_w8_8bpc_neon:          49.7       53    123.3    123.3
      blend_w16_8bpc_neon:        151.8      151    348.8    332.4
      blend_w32_8bpc_neon:        372.9    370.9    908.3    908.4
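For context on what the benchmarked routines compute, here is a minimal scalar sketch of the plain blend case in C. It assumes the usual AV1 6-bit mask rounding, `(a * (64 - m) + b * m + 32) >> 6`; the name `blend_8bpc_sketch` and its signature are illustrative and are not taken from the commit.

```c
#include <stddef.h>
#include <stdint.h>

/* Blend one pixel: m = 0 keeps a, m = 64 selects b (assumed 6-bit mask). */
static inline uint8_t blend_px(int a, int b, int m) {
    return (uint8_t)((a * (64 - m) + b * m + 32) >> 6);
}

/* Hypothetical scalar reference: dst is blended in place with tmp.
 * tmp and mask are assumed to be packed rows of width w. */
static void blend_8bpc_sketch(uint8_t *dst, ptrdiff_t dst_stride,
                              const uint8_t *tmp, int w, int h,
                              const uint8_t *mask) {
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++)
            dst[x] = blend_px(dst[x], tmp[x], mask[x]);
        dst  += dst_stride;
        tmp  += w;
        mask += w;
    }
}
```

A NEON version performs the same arithmetic on many pixels per instruction, which is why the way loads are scheduled shows up in the timings.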
    • arm: mc: neon: Reduce usage of general purpose registers in blend/blend_v functions · d4df8619
      B Krishnan Iyer authored
                                A73               A53
                                Current  Earlier  Current  Earlier
      blend_h_w2_8bpc_neon:        74.1     74.1    137.5    137.5
      blend_h_w4_8bpc_neon:        65.8     65.8    147.1    147.1
      blend_h_w8_8bpc_neon:        68.9     68.7    131.7    131.7
      blend_h_w16_8bpc_neon:         86     85.6    190.3    190.4
      blend_h_w32_8bpc_neon:      149.2    149.8      358    358.3
      blend_h_w64_8bpc_neon:      263.1    264.1    629.8    630.3
      blend_h_w128_8bpc_neon:       571    575.4   1404.5   1404.2
      blend_v_w2_8bpc_neon:       118.7    120.1    195.3    196.4
      blend_v_w4_8bpc_neon:       245.8    247.2    357.3    358.4
      blend_v_w8_8bpc_neon:         202    204.2    357.2    358.4
      blend_v_w16_8bpc_neon:      234.8    238.5    591.3    591.8
      blend_v_w32_8bpc_neon:      344.4    347.2    994.7    997.2
      blend_w4_8bpc_neon:          37.5     38.3     96.7     98.7
      blend_w8_8bpc_neon:            53     54.8    123.3    125.3
      blend_w16_8bpc_neon:          151    150.8    332.4    334.5
      blend_w32_8bpc_neon:        370.9    361.6    908.4    910.7
    • arm: mc: neon: Use vld with ! post-increment instead of a register in blend/blend_h/blend_v function · b704a993
      B Krishnan Iyer authored
      
                                A73               A53
                                Current  Earlier  Current  Earlier
      blend_h_w2_8bpc_neon:        74.1     74.6    137.5      137
      blend_h_w4_8bpc_neon:        65.8       66    147.1    146.6
      blend_h_w8_8bpc_neon:        68.7     68.6    131.7    131.2
      blend_h_w16_8bpc_neon:       85.6     85.9    190.4      192
      blend_h_w32_8bpc_neon:      149.8    149.8    358.3    357.6
      blend_h_w64_8bpc_neon:      264.1    262.8    630.3    629.5
      blend_h_w128_8bpc_neon:     575.4      577   1404.2     1402
      blend_v_w2_8bpc_neon:       120.1    121.3    196.4    195.5
      blend_v_w4_8bpc_neon:       247.2    247.5    358.4    358.5
      blend_v_w8_8bpc_neon:       204.2    205.2    358.4    358.5
      blend_v_w16_8bpc_neon:      238.5    237.1    591.8    590.5
      blend_v_w32_8bpc_neon:      347.2    345.8    997.2    994.1
      blend_w4_8bpc_neon:          38.3     38.6     98.7     99.2
      blend_w8_8bpc_neon:          54.8     55.1    125.3    125.8
      blend_w16_8bpc_neon:        150.8    150.1    334.5      344
      blend_w32_8bpc_neon:        361.6    360.4    910.7    910.9
    • tools: add a simple player example · 5ab6d231
      Marvin Scholz authored
  6. 17 Jul, 2019 1 commit
  7. 15 Jul, 2019 1 commit
    • Set thread names on Linux · 15a93861
      Emmanuel Gil Peyrot authored
      This uses the Linux-only prctl(PR_SET_NAME, …) call. glibc’s
      pthread_setname_np() makes exactly the same call, so there is no
      reason to use it instead, as it isn’t any more portable.
      
      I don’t have any other OS to test this on, but if you want to add one,
      just add an #elif defined(__YOUR_OS__) before the #else in thread.h.
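The approach described in this commit can be sketched roughly as follows. This is a minimal illustration, not the contents of thread.h: the helper names `set_thread_name` and `get_thread_name` are hypothetical, and the non-Linux branch is simply a no-op where another platform's call would go.

```c
#include <stddef.h>
#ifdef __linux__
#include <sys/prctl.h>
#endif

/* Hypothetical helper: name the calling thread; a no-op elsewhere.
 * Linux keeps at most 15 characters plus the terminating NUL. */
static int set_thread_name(const char *name) {
#ifdef __linux__
    return prctl(PR_SET_NAME, name);
#else
    (void)name;
    return 0;
#endif
}

/* Read the calling thread's name back; buf must hold at least 16 bytes. */
static int get_thread_name(char buf[16]) {
#ifdef __linux__
    return prctl(PR_GET_NAME, buf);
#else
    buf[0] = '\0';
    return 0;
#endif
}
```

A worker thread would call `set_thread_name("dav1d-worker")` (an example name) at the top of its entry function, after which tools such as `top -H` or a debugger show that name for the thread.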
  8. 13 Jul, 2019 1 commit
    • arm: mc: NEON implementation of w_mask_444/422/420 function · b271590a
      B Krishnan Iyer authored
                                         A73       A53
      w_mask_420_w4_8bpc_c:            797.5    1072.7
      w_mask_420_w4_8bpc_neon:          85.6     152.7
      w_mask_420_w8_8bpc_c:           2344.3    3118.7
      w_mask_420_w8_8bpc_neon:         221.9     372.4
      w_mask_420_w16_8bpc_c:          7429.9    9702.1
      w_mask_420_w16_8bpc_neon:        620.4    1024.1
      w_mask_420_w32_8bpc_c:         27498.2   37205.7
      w_mask_420_w32_8bpc_neon:       2394.1      3838
      w_mask_420_w64_8bpc_c:         66495.8   88721.3
      w_mask_420_w64_8bpc_neon:       6081.4      9630
      w_mask_420_w128_8bpc_c:       163369.3    219494
      w_mask_420_w128_8bpc_neon:     16015.7   24969.3
      w_mask_422_w4_8bpc_c:            858.3    1100.2
      w_mask_422_w4_8bpc_neon:          81.5     143.1
      w_mask_422_w8_8bpc_c:           2447.5    3284.6
      w_mask_422_w8_8bpc_neon:         217.5     342.4
      w_mask_422_w16_8bpc_c:          7673.4   10135.9
      w_mask_422_w16_8bpc_neon:        632.5    1062.6
      w_mask_422_w32_8bpc_c:         28344.9     39090
      w_mask_422_w32_8bpc_neon:       2393.4    3963.8
      w_mask_422_w64_8bpc_c:         68159.6     93447
      w_mask_422_w64_8bpc_neon:       6015.7    9928.1
      w_mask_422_w128_8bpc_c:       169501.2  231702.7
      w_mask_422_w128_8bpc_neon:     15847.5   25803.4
      w_mask_444_w4_8bpc_c:            674.6     862.3
      w_mask_444_w4_8bpc_neon:          80.2     135.4
      w_mask_444_w8_8bpc_c:           2031.4      2693
      w_mask_444_w8_8bpc_neon:         209.3     318.7
      w_mask_444_w16_8bpc_c:            6576    8217.4
      w_mask_444_w16_8bpc_neon:        627.3     986.2
      w_mask_444_w32_8bpc_c:         26051.7   31593.9
      w_mask_444_w32_8bpc_neon:         2374    3671.6
      w_mask_444_w64_8bpc_c:           63600   75849.9
      w_mask_444_w64_8bpc_neon:         5957    9335.5
      w_mask_444_w128_8bpc_c:       156964.7  187932.4
      w_mask_444_w128_8bpc_neon:     15759.4   24549.5
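The commit lists raw timings rather than ratios. To read them as speedups, divide each C timing by the NEON timing for the same function and width. A small sketch in C, using four A73 rows copied from the w_mask figures; the `BenchRow` type and `speedup` helper are illustrative names only:

```c
#include <stddef.h>

/* One benchmark pair: C reference timing and NEON timing, same units. */
typedef struct {
    const char *name;
    double c;
    double neon;
} BenchRow;

/* A few A73 rows copied from the w_mask timings. */
static const BenchRow a73_rows[] = {
    { "w_mask_420_w4",      797.5,    85.6 },
    { "w_mask_420_w128", 163369.3, 16015.7 },
    { "w_mask_444_w4",      674.6,    80.2 },
    { "w_mask_444_w128", 156964.7, 15759.4 },
};

/* Speedup of the NEON routine over the C reference. */
static double speedup(const BenchRow *r) {
    return r->c / r->neon;
}
```

For these four rows the ratio comes out at roughly 8x to 10x on the A73.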
  9. 08 Jul, 2019 1 commit
  10. 07 Jul, 2019 1 commit
  11. 06 Jul, 2019 1 commit
  12. 05 Jul, 2019 3 commits
    • Improve robustness of handling malloc failures · e2e56ab9
      Henrik Gramner authored
      Calling dav1d_get_picture() again after it has already returned with
      an error due to a memory allocation failure could result in crashes.
      
      Although doing so is not proper API usage and the outcome is
      unpredictable, we should at least try to avoid crashing.
    • Correctly return an error on malloc failure · c1a28d0e
      Henrik Gramner authored
      dav1d_submit_frame() could erroneously return 0 when tile data memory
      allocation failed.
      
      Fixes an assertion failure in dav1d_parse_obus().
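The class of bug described here, reporting success after a failed allocation, can be illustrated with a small sketch. This is not dav1d's code: `TileBuf`, `tile_buf_reserve` and the injectable allocator are hypothetical and exist only to show the error being propagated to the caller.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical tile buffer; not dav1d's actual data structure. */
typedef struct {
    unsigned char *data;
    size_t size;
} TileBuf;

/* Test stub with the same signature as realloc() that always fails. */
static void *failing_alloc(void *p, size_t n) {
    (void)p;
    (void)n;
    return NULL;
}

/* Grow the buffer to at least `size` bytes.
 * Returns 0 on success or -ENOMEM on failure. It never returns 0 after a
 * failed allocation, so the caller cannot go on to use missing memory. */
static int tile_buf_reserve(TileBuf *buf, size_t size,
                            void *(*alloc)(void *, size_t)) {
    if (size <= buf->size)
        return 0;
    void *p = alloc(buf->data, size);
    if (!p)
        return -ENOMEM; /* the old buffer is still valid and owned by buf */
    buf->data = p;
    buf->size = size;
    return 0;
}
```

In normal use the caller passes `realloc` as the allocator and checks the return value before parsing any tile data.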
    • Fix potential memory leak · 0435ec9c
      Henrik Gramner authored
      In the (very unlikely) scenario of a pthread mutex/cond init failure
      in the tile state reallocation code, some newly allocated mutexes/conds
      could leak.
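A common way to avoid this kind of leak is to unwind everything that was already initialized when a later init fails. The sketch below shows that pattern under stated assumptions: `TileSync`, `tile_sync_alloc` and the `fail_at` test hook are hypothetical names, and this is not the actual tile state reallocation code.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-tile synchronization state. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t cond;
} TileSync;

/* Allocate and initialize n (> 0) TileSync entries.
 * fail_at is a test-only hook: a non-negative value simulates a mutex
 * init failure at that index. Returns NULL on any failure, after
 * destroying everything that had already been initialized. */
static TileSync *tile_sync_alloc(int n, int fail_at) {
    TileSync *ts = malloc(sizeof(*ts) * (size_t)n);
    int i;
    if (!ts)
        return NULL;
    for (i = 0; i < n; i++) {
        if (i == fail_at || pthread_mutex_init(&ts[i].lock, NULL))
            goto fail;
        if (pthread_cond_init(&ts[i].cond, NULL)) {
            pthread_mutex_destroy(&ts[i].lock);
            goto fail;
        }
    }
    return ts;
fail:
    while (i-- > 0) { /* unwind entries 0..i-1, newest first */
        pthread_cond_destroy(&ts[i].cond);
        pthread_mutex_destroy(&ts[i].lock);
    }
    free(ts);
    return NULL;
}

/* Destroy and free a fully initialized array. */
static void tile_sync_free(TileSync *ts, int n) {
    for (int i = 0; i < n; i++) {
        pthread_cond_destroy(&ts[i].cond);
        pthread_mutex_destroy(&ts[i].lock);
    }
    free(ts);
}
```

The key point is that the failure path knows exactly how many entries were completed, so nothing initialized is left behind and nothing uninitialized is destroyed.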
  13. 02 Jul, 2019 4 commits
  14. 30 Jun, 2019 2 commits
  15. 29 Jun, 2019 3 commits
  16. 27 Jun, 2019 2 commits
  17. 26 Jun, 2019 3 commits
    • arm64: itx: Add NEON optimized inverse transforms · ef1ea008
      Martin Storsjö authored
      The speedup for most non-dc-only dct functions is around 9-12x
      over the C code generated by GCC 7.3.
      
      Relative speedups vs C for a few functions:
      
                                                    Cortex A53    A72    A73
      inv_txfm_add_4x4_dct_dct_0_8bpc_neon:               3.90   4.16   5.65
      inv_txfm_add_4x4_dct_dct_1_8bpc_neon:               7.20   8.05  11.19
      inv_txfm_add_8x8_dct_dct_0_8bpc_neon:               5.09   6.73   6.45
      inv_txfm_add_8x8_dct_dct_1_8bpc_neon:              12.18  10.80  13.05
      inv_txfm_add_16x16_dct_dct_0_8bpc_neon:             7.31   9.35  11.17
      inv_txfm_add_16x16_dct_dct_1_8bpc_neon:            14.36  13.06  15.93
      inv_txfm_add_16x16_dct_dct_2_8bpc_neon:            11.00  10.09  12.05
      inv_txfm_add_32x32_dct_dct_0_8bpc_neon:             4.41   5.40   5.77
      inv_txfm_add_32x32_dct_dct_1_8bpc_neon:            13.84  13.81  18.04
      inv_txfm_add_32x32_dct_dct_2_8bpc_neon:            11.75  11.87  15.22
      inv_txfm_add_32x32_dct_dct_3_8bpc_neon:            10.20  10.40  13.13
      inv_txfm_add_32x32_dct_dct_4_8bpc_neon:             9.01   9.21  11.56
      inv_txfm_add_64x64_dct_dct_0_8bpc_neon:             3.84   4.82   5.28
      inv_txfm_add_64x64_dct_dct_1_8bpc_neon:            14.40  12.69  16.71
      inv_txfm_add_64x64_dct_dct_4_8bpc_neon:            10.91   9.63  12.67
      
      Some of the special-cased identity_identity transforms for 32x32
      give insane speedups over the generic C code:
      
      inv_txfm_add_32x32_identity_identity_0_8bpc_neon: 225.26 238.11 247.07
      inv_txfm_add_32x32_identity_identity_1_8bpc_neon: 225.33 238.53 247.69
      inv_txfm_add_32x32_identity_identity_2_8bpc_neon:  59.60  61.94  64.63
      inv_txfm_add_32x32_identity_identity_3_8bpc_neon:  26.98  27.99  29.21
      inv_txfm_add_32x32_identity_identity_4_8bpc_neon:  15.08  15.93  16.56
    • tools: Use DAV1D_ERR for strerror calls · e0346114
      Marvin Scholz authored
    • Marvin Scholz · 04dc8a4d
  18. 24 Jun, 2019 2 commits
  19. 21 Jun, 2019 1 commit
  20. 20 Jun, 2019 1 commit
  21. 19 Jun, 2019 1 commit