
x86: Add refmvs.save_tmvs SSSE3 asm

64-bit:

(10 checkasm runs)
Speed ups:
c..ssse3: 2.921x (o=0.0249)
c..avx2: 3.134x (o=0.0289)
Speed diffs:
c..ssse3: 34.24% (o=0.29)
c..avx2: 31.91% (o=0.29)
save_tmvs_c:      25681.4 ( 1.00x)
save_tmvs_ssse3:   8711.6 ( 2.95x)
save_tmvs_avx2:    8075.1 ( 3.18x)
chimera: 453.64 => 455.93 (~+0.5%)

32-bit:

(10 checkasm runs)
Speed ups:
c..ssse3: 2.353x (o=0.0253)
Speed diffs:
c..ssse3: 42.51% (o=0.46)
save_tmvs_c:      23775.4 ( 1.00x)
save_tmvs_ssse3:   9799.6 ( 2.43x)
chimera: 408.40 => 412.16 (~+0.9%)

These chimera decode numbers aren't very stable; it's probably better to just look at the checkasm ones.
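For anyone re-checking the figures, here is a minimal sketch of the arithmetic behind the per-function ratios and the chimera percentages. The timings are copied from the tables above. The helper names `speedup` and `percent_change` are only illustrative and are not part of checkasm or dav1d.

```python
# checkasm timings copied from the tables above (lower is better).
CHECKASM = {
    "64-bit": {"c": 25681.4, "ssse3": 8711.6, "avx2": 8075.1},
    "32-bit": {"c": 23775.4, "ssse3": 9799.6},
}

# chimera decode figures (before, after) copied from the description.
CHIMERA = {
    "64-bit": (453.64, 455.93),
    "32-bit": (408.40, 412.16),
}


def speedup(c_time, asm_time):
    """Ratio of the C timing to the asm timing (hypothetical helper)."""
    return c_time / asm_time


def percent_change(before, after):
    """Relative change from before to after, in percent (hypothetical helper)."""
    return 100.0 * (after - before) / before


for arch, timings in CHECKASM.items():
    c_time = timings["c"]
    for name, asm_time in timings.items():
        if name == "c":
            continue
        print(f"{arch} save_tmvs_{name}: {speedup(c_time, asm_time):.2f}x")

for arch, (before, after) in CHIMERA.items():
    print(f"{arch} chimera: {percent_change(before, after):+.1f}%")
```

The ratios computed this way come from the single listed timings, so they differ slightly from the averaged "Speed ups" values over the 10 runs.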

Edited by Victorien Le Couviour--Tuffet
