    arm64: looprestoration: Add a NEON implementation of SGR · 204bf211
    Martin Storsjö authored
    Relative speedup vs (autovectorized) C code:
                          Cortex A53    A72    A73
    selfguided_3x3_8bpc_neon:   2.91   2.12   2.68
    selfguided_5x5_8bpc_neon:   3.18   2.65   3.39
    selfguided_mix_8bpc_neon:   3.04   2.29   2.98
    
    The relative speedup vs non-vectorized C code is around 2.6-4.6x.
Name
doc
include
snap
src
tests
tools
.gitignore
.gitlab-ci.yml
CONTRIBUTING.md
COPYING
NEWS
README.md
THANKS.md
meson.build
meson_options.txt