arm32: itx: Add a NEON implementation of itx for 10 bpc

Relative speedup vs C for a few functions:

                                      Cortex A7     A8     A9    A53    A72    A73
inv_txfm_add_4x4_dct_dct_0_10bpc_neon:     2.79   5.08   2.99   2.83   3.49   4.44
inv_txfm_add_4x4_dct_dct_1_10bpc_neon:     5.74   9.43   5.72   7.19   6.73   6.92
inv_txfm_add_8x8_dct_dct_0_10bpc_neon:     3.13   3.68   2.79   3.25   3.21   3.33
inv_txfm_add_8x8_dct_dct_1_10bpc_neon:     7.09  10.41   7.00  10.55   8.06   9.02
inv_txfm_add_16x16_dct_dct_0_10bpc_neon:   5.01   6.76   4.56   5.58   5.52   2.97
inv_txfm_add_16x16_dct_dct_1_10bpc_neon:   8.62  12.48  13.71  11.75  15.94  16.86
inv_txfm_add_16x16_dct_dct_2_10bpc_neon:   6.05   8.81   6.13   8.18   7.90  12.27
inv_txfm_add_32x32_dct_dct_0_10bpc_neon:   2.90   3.90   2.16   2.63   3.56   2.74
inv_txfm_add_32x32_dct_dct_1_10bpc_neon:  13.57  17.00  13.30  13.76  14.54  17.08
inv_txfm_add_32x32_dct_dct_2_10bpc_neon:   8.29  10.54   8.05  10.68  12.75  14.36
inv_txfm_add_32x32_dct_dct_3_10bpc_neon:   6.78   8.40   7.60  10.12   8.97  12.96
inv_txfm_add_32x32_dct_dct_4_10bpc_neon:   6.48   6.74   6.00   7.38   7.67   9.70
inv_txfm_add_64x64_dct_dct_0_10bpc_neon:   3.02   4.59   2.21   2.65   3.36   2.47
inv_txfm_add_64x64_dct_dct_1_10bpc_neon:   9.86  11.30   9.14  13.80  12.46  14.83
inv_txfm_add_64x64_dct_dct_2_10bpc_neon:   8.65   9.76   7.60  12.05  10.55  12.62
inv_txfm_add_64x64_dct_dct_3_10bpc_neon:   7.78   8.65   6.98  10.63   9.15  11.73
inv_txfm_add_64x64_dct_dct_4_10bpc_neon:   6.61   7.01   5.52   8.41   8.33   9.69