arm: mc: NEON implementation of w_mask_444/422/420 functions

                                 A73        A53
w_mask_420_w4_8bpc_c:          794.6     1071.2
w_mask_420_w4_8bpc_neon:       140.6      221.8
w_mask_420_w8_8bpc_c:         2342.5     3116.0
w_mask_420_w8_8bpc_neon:       268.8      446.4
w_mask_420_w16_8bpc_c:        7416.3     9697.5
w_mask_420_w16_8bpc_neon:      802.2     1259.1
w_mask_420_w32_8bpc_c:       27456.3    37248.9
w_mask_420_w32_8bpc_neon:     3148.8     4849.4
w_mask_420_w64_8bpc_c:       65982.9    88579.2
w_mask_420_w64_8bpc_neon:     7751.9    11770.1
w_mask_420_w128_8bpc_c:     162092.7   219981.7
w_mask_420_w128_8bpc_neon:   19891.6    30762.7
w_mask_422_w4_8bpc_c:          858.8     1099.5
w_mask_422_w4_8bpc_neon:       127.2      207.8
w_mask_422_w8_8bpc_c:         2446.5     3280.1
w_mask_422_w8_8bpc_neon:       267.0      439.0
w_mask_422_w16_8bpc_c:        7654.0    10136.9
w_mask_422_w16_8bpc_neon:      869.2     1372.1
w_mask_422_w32_8bpc_c:       28120.4    39148.3
w_mask_422_w32_8bpc_neon:     3446.7     5306.4
w_mask_422_w64_8bpc_c:       67270.3    93370.8
w_mask_422_w64_8bpc_neon:     8445.3    12872.9
w_mask_422_w128_8bpc_c:     166402.8   232168.5
w_mask_422_w128_8bpc_neon:   21621.2    33560.9
w_mask_444_w4_8bpc_c:          674.2      861.7
w_mask_444_w4_8bpc_neon:       106.0      184.5
w_mask_444_w8_8bpc_c:         2030.6     2681.1
w_mask_444_w8_8bpc_neon:       254.7      379.7
w_mask_444_w16_8bpc_c:        6543.8     8202.0
w_mask_444_w16_8bpc_neon:      744.4     1171.9
w_mask_444_w32_8bpc_c:       25793.8    31631.0
w_mask_444_w32_8bpc_neon:     2907.3     4510.0
w_mask_444_w64_8bpc_c:       62985.2    75637.7
w_mask_444_w64_8bpc_neon:     7167.7    11046.0
w_mask_444_w128_8bpc_c:     155366.4   219981.7
w_mask_444_w128_8bpc_neon:   18656.1    29500.1
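
The figures above can be turned into speedup ratios by dividing each C timing by the matching NEON timing. The following is a small illustrative sketch, not part of the patch: the `timings` dictionary and the `speedup` helper are hypothetical names, and only a few rows are copied from the table as examples.

```python
# Illustrative only: a handful of rows copied from the benchmark table.
# Each entry maps a core name to a (C timing, NEON timing) pair.
timings = {
    "w_mask_420_w4": {"A73": (794.6, 140.6), "A53": (1071.2, 221.8)},
    "w_mask_422_w4": {"A73": (858.8, 127.2), "A53": (1099.5, 207.8)},
    "w_mask_444_w128": {"A73": (155366.4, 18656.1), "A53": (188690.5, 29500.1)},
}


def speedup(c_time, neon_time):
    """Return how many times faster the NEON version is than the C version."""
    return c_time / neon_time


if __name__ == "__main__":
    for name, cores in timings.items():
        for core, (c_time, neon_time) in cores.items():
            print(f"{name} {core}: {speedup(c_time, neon_time):.2f}x")
```

The same division applies to any other row of the table; lower timings are better, so a ratio above 1 means the NEON version is faster.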
Edited by B Krishnan Iyer