Commits on Source (14)
-
Add an if/else clause to the files to control which code is used. Move the generic function out of the 8-bit depth scope into a common one shared by both modes.
Signed-off-by: Hubert Mazur <hum@semihalf.com>
249924ea -
Previously, some functions from the motion compensation family used uint8_t, while others used the pixel definition. Unify this by changing every uint8_t usage to pixel. This commit is a prerequisite for 10-bit depth support.
Signed-off-by: Hubert Mazur <hum@semihalf.com>
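The uint8_t to pixel unification described above can be illustrated with a minimal C sketch. It assumes a build-time `BIT_DEPTH` macro (defaulted to 10 here for illustration) and a hypothetical helper `copy_row`; neither the default nor the helper is taken from this merge request.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: BIT_DEPTH is normally supplied by the build system.
 * It defaults to 10 here purely for illustration. */
#ifndef BIT_DEPTH
#define BIT_DEPTH 10
#endif

#if BIT_DEPTH > 8
typedef uint16_t pixel; /* high bit depth: one sample occupies two bytes */
#else
typedef uint8_t pixel;  /* 8-bit depth: one sample occupies one byte */
#endif

/* Largest representable sample value for the selected bit depth. */
#define PIXEL_MAX ((1 << BIT_DEPTH) - 1)

/* Hypothetical helper: written once against `pixel`, it compiles
 * unchanged for either bit depth. */
static void copy_row(pixel *dst, const pixel *src, int width)
{
    assert(width > 0);
    for (int x = 0; x < width; x++)
        dst[x] = src[x];
}
```

Code written against `pixel` rather than `uint8_t` needs no source changes when the bit depth switches, which is why the unification has to come before the 10-bit assembly.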
ba45eba3 -
Provide a NEON-optimized implementation of the pixel_avg functions from the motion compensation family for 10-bit depth. Checkasm benchmarks are shown below.
avg_4x2_c: 703
avg_4x2_neon: 222
avg_4x4_c: 1405
avg_4x4_neon: 516
avg_4x8_c: 2759
avg_4x8_neon: 898
avg_4x16_c: 5808
avg_4x16_neon: 1776
avg_8x4_c: 2767
avg_8x4_neon: 412
avg_8x8_c: 5559
avg_8x8_neon: 841
avg_8x16_c: 11176
avg_8x16_neon: 1668
avg_16x8_c: 10493
avg_16x8_neon: 1504
avg_16x16_c: 21116
avg_16x16_neon: 2985
Signed-off-by: Hubert Mazur <hum@semihalf.com>
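For orientation, the operation being vectorized can be sketched as a scalar C reference. This is a minimal sketch of the unweighted rounded average only. The function name `pixel_avg_ref`, its parameter layout, and the use of strides counted in pixels are assumptions for illustration, not the signatures used in this merge request.

```c
#include <assert.h>
#include <stdint.h>

typedef uint16_t pixel; /* assumption: 10-bit samples stored in 16-bit words */

/* Hypothetical scalar reference: rounded average of two blocks.
 * Strides are counted in pixels, not bytes (assumption). */
static void pixel_avg_ref(pixel *dst, intptr_t dst_stride,
                          const pixel *src1, intptr_t src1_stride,
                          const pixel *src2, intptr_t src2_stride,
                          int width, int height)
{
    assert(width > 0 && height > 0);
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++)
            dst[x] = (pixel)((src1[x] + src2[x] + 1) >> 1); /* round half up */
        dst  += dst_stride;
        src1 += src1_stride;
        src2 += src2_stride;
    }
}
```

With 10-bit input the intermediate sum is at most 2047, so a vector implementation can keep it in 16-bit lanes without widening.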
13a24888 -
Provide a NEON-optimized implementation of the pixel_avg2 functions from the motion compensation family for 10-bit depth.
Signed-off-by: Hubert Mazur <hum@semihalf.com>
bb3d83dd -
Provide a NEON-optimized implementation of the mc_copy functions from the motion compensation family for 10-bit depth.
Signed-off-by: Hubert Mazur <hum@semihalf.com>
f0b0489f -
Provide a NEON-optimized implementation of the mc_weight functions from the motion compensation family for 10-bit depth. Benchmark results are shown below.
weight_w4_c: 4734
weight_w4_neon: 4165
weight_w8_c: 8930
weight_w8_neon: 1620
weight_w16_c: 16939
weight_w16_neon: 2729
weight_w20_c: 20721
weight_w20_neon: 3470
Signed-off-by: Hubert Mazur <hum@semihalf.com>
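As a rough illustration of what a weighting kernel computes, here is a generic H.264-style explicit weighted prediction sketch in scalar C. The name `mc_weight_ref`, the parameter list, and the exact rounding are assumptions made for this example; the project's C reference may differ in detail, for example in how the offset is scaled for high bit depth.

```c
#include <assert.h>
#include <stdint.h>

typedef uint16_t pixel;  /* assumption: 10-bit samples in 16-bit words */
#define PIXEL_MAX 1023   /* (1 << 10) - 1 */

/* Clamp an intermediate value to the valid 10-bit sample range. */
static pixel clip_pixel(int v)
{
    return (pixel)(v < 0 ? 0 : v > PIXEL_MAX ? PIXEL_MAX : v);
}

/* Hypothetical scalar reference:
 * dst = clip(((src * scale + round) >> denom) + offset).
 * Strides are counted in pixels (assumption). */
static void mc_weight_ref(pixel *dst, intptr_t dst_stride,
                          const pixel *src, intptr_t src_stride,
                          int scale, int denom, int offset,
                          int width, int height)
{
    assert(scale >= 0 && denom >= 0 && denom < 16);
    assert(width > 0 && height > 0);
    const int round = denom > 0 ? 1 << (denom - 1) : 0;
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++)
            dst[x] = clip_pixel(((src[x] * scale + round) >> denom) + offset);
        dst += dst_stride;
        src += src_stride;
    }
}
```

Unlike plain averaging, the product `src * scale` does not fit in 16 bits for 10-bit input, so a vector version has to widen to 32-bit lanes or restrict the scale range.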
25d5baf4 -
The mc_luma and get_ref wrappers were only defined for 8-bit depth. As all required 10-bit depth helper functions now exist, move them out of the if scope so they are always defined regardless of the bit depth.
Signed-off-by: Hubert Mazur <hum@semihalf.com>
08761208 -
Provide a NEON-optimized implementation of the mc_chroma functions from the motion compensation family for 10-bit depth. Benchmark results are shown below.
mc_chroma_2x2_c: 700
mc_chroma_2x2_neon: 478
mc_chroma_2x4_c: 1300
mc_chroma_2x4_neon: 765
mc_chroma_4x2_c: 1229
mc_chroma_4x2_neon: 483
mc_chroma_4x4_c: 2383
mc_chroma_4x4_neon: 773
mc_chroma_4x8_c: 4662
mc_chroma_4x8_neon: 1319
mc_chroma_8x4_c: 4450
mc_chroma_8x4_neon: 940
mc_chroma_8x8_c: 8797
mc_chroma_8x8_neon: 1638
Signed-off-by: Hubert Mazur <hum@semihalf.com>
7ff0f978 -
Provide a NEON-optimized implementation of the mc_integral functions from the motion compensation family for 10-bit depth. Benchmark results are shown below.
integral_init4h_c: 2651
integral_init4h_neon: 550
integral_init4v_c: 4247
integral_init4v_neon: 612
integral_init8h_c: 2544
integral_init8h_neon: 1027
integral_init8v_c: 1996
integral_init8v_neon: 245
Signed-off-by: Hubert Mazur <hum@semihalf.com>
25ef8832 -
Provide a NEON-optimized implementation of the mc_lowres function from the motion compensation family for 10-bit depth. Benchmark results are shown below.
lowres_init_c: 149446
lowres_init_neon: 13172
Signed-off-by: Hubert Mazur <hum@semihalf.com>
0a810f4f -
Provide a NEON-optimized implementation of the mc_load_deinterleave function from the motion compensation family for 10-bit depth. Benchmark results are shown below.
load_deinterleave_chroma_fdec_c: 2936
load_deinterleave_chroma_fdec_neon: 422
Signed-off-by: Hubert Mazur <hum@semihalf.com>
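Deinterleaving here means splitting packed UVUV... chroma samples into separate U and V rows. The following scalar C sketch shows the idea under assumed names and layout: `deinterleave_chroma_ref`, separate destination pointers for U and V, and strides counted in pixels are all hypothetical and not taken from this merge request.

```c
#include <assert.h>
#include <stdint.h>

typedef uint16_t pixel; /* assumption: 10-bit samples stored in 16-bit words */

/* Hypothetical scalar reference: split interleaved chroma (U0 V0 U1 V1 ...)
 * into two planes. `width` is the number of U (or V) samples per row;
 * strides are counted in pixels (assumption). */
static void deinterleave_chroma_ref(pixel *dst_u, pixel *dst_v, intptr_t dst_stride,
                                    const pixel *src, intptr_t src_stride,
                                    int width, int height)
{
    assert(width > 0 && height > 0);
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            dst_u[x] = src[2 * x];     /* even positions hold U */
            dst_v[x] = src[2 * x + 1]; /* odd positions hold V */
        }
        dst_u += dst_stride;
        dst_v += dst_stride;
        src   += src_stride;
    }
}
```

On AArch64 this pattern maps naturally onto a structure load such as `ld2` with 16-bit elements, which separates the even and odd samples into two registers in a single instruction.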
68d71206 -
Provide a NEON-optimized implementation of the mc_store_interleave function from the motion compensation family for 10-bit depth. Benchmark results are shown below.
load_deinterleave_chroma_fenc_c: 2910
load_deinterleave_chroma_fenc_neon: 430
Signed-off-by: Hubert Mazur <hum@semihalf.com>
df179744 -
Provide a NEON-optimized implementation of the mc_plane_copy functions from the motion compensation family for 10-bit depth. Benchmark results are shown below.
plane_copy_c: 2955
plane_copy_neon: 2910
plane_copy_deinterleave_c: 24056
plane_copy_deinterleave_neon: 3625
plane_copy_deinterleave_rgb_c: 19928
plane_copy_deinterleave_rgb_neon: 3941
plane_copy_interleave_c: 24399
plane_copy_interleave_neon: 4723
plane_copy_swap_c: 32269
plane_copy_swap_neon: 3211
Signed-off-by: Hubert Mazur <hum@semihalf.com>
e47bede8 -
Provide a NEON-optimized implementation of the hpel_filter function from the motion compensation family for 10-bit depth. Benchmark results are shown below.
hpel_filter_c: 111495
hpel_filter_neon: 37849
Signed-off-by: Hubert Mazur <hum@semihalf.com>
cc5c343f