riscv64/mc: Branchless vsetvl in blend_h function
Kendryte K230
blend_h_w2_8bpc_c:        165.9 ( 1.00x)
blend_h_w2_8bpc_rvv:       83.8 ( 1.98x)
blend_h_w4_8bpc_c:        295.2 ( 1.00x)
blend_h_w4_8bpc_rvv:       83.8 ( 3.52x)
blend_h_w8_8bpc_c:        557.9 ( 1.00x)
blend_h_w8_8bpc_rvv:       92.5 ( 6.03x)
blend_h_w16_8bpc_c:      1078.8 ( 1.00x)
blend_h_w16_8bpc_rvv:     117.3 ( 9.19x)
blend_h_w32_8bpc_c:      2117.8 ( 1.00x)
blend_h_w32_8bpc_rvv:     200.5 (10.57x)
blend_h_w64_8bpc_c:      4194.7 ( 1.00x)
blend_h_w64_8bpc_rvv:     363.2 (11.55x)
blend_h_w128_8bpc_c:    10271.4 ( 1.00x)
blend_h_w128_8bpc_rvv:    844.5 (12.16x)

SpacemiT K1
blend_h_w2_8bpc_c:        162.5 ( 1.00x)
blend_h_w2_8bpc_rvv:       83.9 ( 1.94x)
blend_h_w4_8bpc_c:        288.6 ( 1.00x)
blend_h_w4_8bpc_rvv:       83.7 ( 3.45x)
blend_h_w8_8bpc_c:        544.7 ( 1.00x)
blend_h_w8_8bpc_rvv:       84.0 ( 6.48x)
blend_h_w16_8bpc_c:      1052.8 ( 1.00x)
blend_h_w16_8bpc_rvv:     102.9 (10.23x)
blend_h_w32_8bpc_c:      2068.0 ( 1.00x)
blend_h_w32_8bpc_rvv:     131.4 (15.73x)
blend_h_w64_8bpc_c:      4093.7 ( 1.00x)
blend_h_w64_8bpc_rvv:     220.3 (18.58x)
blend_h_w128_8bpc_c:    10023.1 ( 1.00x)
blend_h_w128_8bpc_rvv:    467.3 (21.45x)
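
Note on the technique named in the title: the commit message does not show the code, so the following is only a minimal sketch of one way a vsetvl can be made branchless, written in C rather than the RVV assembly this commit touches. The idea is to compute the vtype value arithmetically from the block width and pass it to `vsetvl` in a register, instead of branching to a per-width `vsetvli`. The helper name `vtype_e8_for_width`, the choice of SEW=8 with tail- and mask-agnostic policy, and the clamping are assumptions for illustration and may not match what the commit does. The bit layout (vlmul in bits 2:0, vsew in bits 5:3, vta in bit 6, vma in bit 7) follows the RVV 1.0 specification. `__builtin_ctz` is a GCC/Clang builtin.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the commit): build a vtype value for
 * 8-bit elements so that one register group holds `w` elements.
 *
 *   w     - block width in pixels, a power of two (2..128 for blend_h)
 *   vlenb - vector register length in bytes, a power of two
 *
 * The returned value is what would be placed in the rs2 operand of
 * `vsetvl rd, rs1, rs2`.
 */
static uint32_t vtype_e8_for_width(uint32_t w, uint32_t vlenb)
{
    /* log2(LMUL) = log2(w) - log2(VLENB) when SEW is 8 bits. */
    int lmul_log2 = (int)__builtin_ctz(w) - (int)__builtin_ctz(vlenb);

    /* Keep the result inside the encodable range mf8..m8. In assembly
     * this would be a max/min pair rather than a jump. */
    lmul_log2 = lmul_log2 < -3 ? -3 : lmul_log2;
    lmul_log2 = lmul_log2 >  3 ?  3 : lmul_log2;

    /* vlmul is the low three bits of the two's-complement log2(LMUL):
     * -3 -> 0b101 (mf8), -2 -> 0b110 (mf4), -1 -> 0b111 (mf2),
     *  0 -> 0b000 (m1),   1 -> 0b001 (m2),   2 -> 0b010 (m4),
     *  3 -> 0b011 (m8). */
    uint32_t vlmul = (uint32_t)lmul_log2 & 7u;

    uint32_t vsew = 0u;       /* 0b000 selects SEW = 8 */
    uint32_t vta  = 1u << 6;  /* tail agnostic */
    uint32_t vma  = 1u << 7;  /* mask agnostic */

    return vma | vta | (vsew << 3) | vlmul;
}
```

With a 128-bit vector length (`vlenb` = 16), a width of 2 yields `0xC5` (e8, mf8, ta, ma) and a width of 128 yields `0xC3` (e8, m8, ta, ma), so every width takes the same instruction path.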