x86: cdef_filter: use 8-bit arithmetic for SSE

Port of c204da0f (the AVX2 version, from Kyle Siefring)
to SSSE3 and SSE4.
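
For readers unfamiliar with the idea, here is a rough scalar sketch of why
the CDEF constrain step can fit in 8-bit lanes for 8bpc input. It is not
the assembly from this commit. The names constrain_ref, constrain_u8 and
sub_sat_u8 are hypothetical, and the sketch assumes threshold <= 127 so
the result fits in a signed byte. sub_sat_u8 plays the role of an
unsigned saturating byte subtract such as psubusb.

```c
#include <stdint.h>
#include <stdlib.h>

/* Reference form with plain ints: clamp |diff| to
 * max(0, threshold - (|diff| >> shift)) and restore the sign. */
static int constrain_ref(int diff, int threshold, int shift) {
    int adiff = abs(diff);
    int lim = threshold - (adiff >> shift);
    if (lim < 0) lim = 0;
    int mag = adiff < lim ? adiff : lim;
    return diff < 0 ? -mag : mag;
}

/* Unsigned saturating byte subtract (scalar stand-in for psubusb). */
static uint8_t sub_sat_u8(uint8_t a, uint8_t b) {
    return a > b ? (uint8_t)(a - b) : 0;
}

/* Hypothetical 8-bit form: every intermediate value fits in one byte.
 * Assumes threshold <= 127 so the signed result cannot overflow. */
static int8_t constrain_u8(uint8_t px, uint8_t p, uint8_t threshold, int shift) {
    /* |p - px| without widening: one of the two differences saturates to 0. */
    uint8_t adiff = sub_sat_u8(p, px) | sub_sat_u8(px, p);
    /* The saturating subtract gives max(0, threshold - (adiff >> shift)). */
    uint8_t lim = sub_sat_u8(threshold, (uint8_t)(adiff >> shift));
    uint8_t mag = adiff < lim ? adiff : lim;
    return p < px ? (int8_t)-(int)mag : (int8_t)mag;
}
```

Staying in bytes means each vector register holds twice as many lanes as
with 16-bit words, which is the likely source of the gains listed below.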

---------------------
x86_64:
------------------------------------------
before: cdef_filter_4x4_8bpc_ssse3: 141.7
 after: cdef_filter_4x4_8bpc_ssse3: 131.6
before: cdef_filter_4x4_8bpc_sse4: 128.3
 after: cdef_filter_4x4_8bpc_sse4: 119.0
------------------------------------------
before: cdef_filter_4x8_8bpc_ssse3: 253.4
 after: cdef_filter_4x8_8bpc_ssse3: 236.1
before: cdef_filter_4x8_8bpc_sse4: 228.5
 after: cdef_filter_4x8_8bpc_sse4: 213.2
------------------------------------------
before: cdef_filter_8x8_8bpc_ssse3: 429.6
 after: cdef_filter_8x8_8bpc_ssse3: 386.9
before: cdef_filter_8x8_8bpc_sse4: 379.9
 after: cdef_filter_8x8_8bpc_sse4: 335.9
------------------------------------------

---------------------
x86_32:
------------------------------------------
before: cdef_filter_4x4_8bpc_ssse3: 184.3
 after: cdef_filter_4x4_8bpc_ssse3: 163.3
before: cdef_filter_4x4_8bpc_sse4: 168.9
 after: cdef_filter_4x4_8bpc_sse4: 146.1
------------------------------------------
before: cdef_filter_4x8_8bpc_ssse3: 335.3
 after: cdef_filter_4x8_8bpc_ssse3: 280.7
before: cdef_filter_4x8_8bpc_sse4: 305.1
 after: cdef_filter_4x8_8bpc_sse4: 257.9
------------------------------------------
before: cdef_filter_8x8_8bpc_ssse3: 579.1
 after: cdef_filter_8x8_8bpc_ssse3: 500.5
before: cdef_filter_8x8_8bpc_sse4: 517.0
 after: cdef_filter_8x8_8bpc_sse4: 455.8
------------------------------------------
parent 22c3594d