riscv64/cdef: filter and dir intrinsic functions

jerry tsai requested to merge jerrytsai569/dav1d:cdef_filter_and_dir into master

Update: add BPi-F3 benchmarks

Benchmarks:

Kendryte K230:

cdef_dir_8bpc_c:               1369.3 ( 1.00x)
cdef_dir_8bpc_rvv:              503.8 ( 2.72x)
cdef_filter_4x4_01_8bpc_c:     1320.8 ( 1.00x)
cdef_filter_4x4_01_8bpc_rvv:    550.8 ( 2.40x)
cdef_filter_4x4_10_8bpc_c:      872.9 ( 1.00x)
cdef_filter_4x4_10_8bpc_rvv:    379.2 ( 2.30x)
cdef_filter_4x4_11_8bpc_c:     2685.0 ( 1.00x)
cdef_filter_4x4_11_8bpc_rvv:    922.7 ( 2.91x)
cdef_filter_4x8_01_8bpc_c:     2465.6 ( 1.00x)
cdef_filter_4x8_01_8bpc_rvv:   1117.3 ( 2.21x)
cdef_filter_4x8_10_8bpc_c:     1577.2 ( 1.00x)
cdef_filter_4x8_10_8bpc_rvv:    800.5 ( 1.97x)
cdef_filter_4x8_11_8bpc_c:     5391.1 ( 1.00x)
cdef_filter_4x8_11_8bpc_rvv:   1797.0 ( 3.00x)
cdef_filter_8x8_01_8bpc_c:     4692.3 ( 1.00x)
cdef_filter_8x8_01_8bpc_rvv:   1605.0 ( 2.92x)
cdef_filter_8x8_10_8bpc_c:     2959.9 ( 1.00x)
cdef_filter_8x8_10_8bpc_rvv:   1097.1 ( 2.70x)
cdef_filter_8x8_11_8bpc_c:    12051.7 ( 1.00x)
cdef_filter_8x8_11_8bpc_rvv:   2813.6 ( 4.28x)

Banana Pi F3:

cdef_dir_8bpc_c:               1311.7 ( 1.00x)
cdef_dir_8bpc_rvv:              495.5 ( 2.65x)
cdef_filter_4x4_01_8bpc_c:     1284.9 ( 1.00x)
cdef_filter_4x4_01_8bpc_rvv:    518.9 ( 2.48x)
cdef_filter_4x4_10_8bpc_c:      845.9 ( 1.00x)
cdef_filter_4x4_10_8bpc_rvv:    356.5 ( 2.37x)
cdef_filter_4x4_11_8bpc_c:     2737.6 ( 1.00x)
cdef_filter_4x4_11_8bpc_rvv:    885.8 ( 3.09x)
cdef_filter_4x8_01_8bpc_c:     2410.8 ( 1.00x)
cdef_filter_4x8_01_8bpc_rvv:   1096.3 ( 2.20x)
cdef_filter_4x8_10_8bpc_c:     1532.0 ( 1.00x)
cdef_filter_4x8_10_8bpc_rvv:    787.7 ( 1.94x)
cdef_filter_4x8_11_8bpc_c:     5475.3 ( 1.00x)
cdef_filter_4x8_11_8bpc_rvv:   1750.9 ( 3.13x)
cdef_filter_8x8_01_8bpc_c:     4709.5 ( 1.00x)
cdef_filter_8x8_01_8bpc_rvv:   1526.5 ( 3.09x)
cdef_filter_8x8_10_8bpc_c:     2897.6 ( 1.00x)
cdef_filter_8x8_10_8bpc_rvv:   1032.8 ( 2.81x)
cdef_filter_8x8_11_8bpc_c:    12491.9 ( 1.00x)
cdef_filter_8x8_11_8bpc_rvv:   2710.7 ( 4.61x)
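
The ratio in parentheses appears to be the C time divided by the RVV time for the same function. A minimal sketch for re-deriving it from a pasted listing, assuming each line has the shape `name_impl: time ( ratio x)`; the helper name `speedups` and the regular expression are hypothetical and not part of dav1d:

```python
import re

# Assumed line shape: "<function>_<c|rvv>:  <time> ( <ratio>x)"
LINE_RE = re.compile(r"^(\w+)_(c|rvv):\s+([\d.]+)\s+\(\s*([\d.]+)x\)$")


def speedups(text):
    """Return {function: c_time / rvv_time}, rounded to two decimals."""
    times = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            name, impl, measured, _reported = m.groups()
            times.setdefault(name, {})[impl] = float(measured)
    # Only functions that have both a C and an RVV measurement are kept.
    return {
        name: round(t["c"] / t["rvv"], 2)
        for name, t in times.items()
        if "c" in t and "rvv" in t
    }


sample = """\
cdef_dir_8bpc_c:               1369.3 ( 1.00x)
cdef_dir_8bpc_rvv:              503.8 ( 2.72x)
cdef_filter_8x8_11_8bpc_c:    12051.7 ( 1.00x)
cdef_filter_8x8_11_8bpc_rvv:   2813.6 ( 4.28x)
"""

print(speedups(sample))
```

Feeding it one board's block at a time keeps the two sets of measurements separate, since the function names repeat across boards.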
Edited by jerry tsai