AArch64: Optimize Armv8.0 Neon path of HBD HV 6-tap filters
- Sep 06, 2024
The horizontal parts of the 6-tap HV subpel filters can be further improved with some pointer arithmetic and by saving some instructions (EXTs) in their data rearrangement code.

Relative runtime of micro benchmarks after this patch on Cortex CPU cores:

HBD mct hv        X1     A78     A76     A72     A55
regular  w8:  0.952x  0.989x  0.924x  0.973x  0.976x
regular w16:  0.961x  0.993x  0.928x  0.952x  0.971x
regular w32:  0.964x  0.996x  0.930x  0.973x  0.972x
regular w64:  0.963x  0.997x  0.930x  0.969x  0.974x
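The pointer-arithmetic idea can be illustrated with a scalar C sketch. This is not taken from the patch, which is AArch64 Neon assembly; it assumes that a 6-tap kernel is stored in an 8-tap layout with zero outer taps, and the coefficient values and function names are hypothetical. Under that assumption, starting the input window one sample later gives the same result while touching two fewer samples. In a vector implementation, a window that starts at the load address can presumably be used directly, rather than being extracted from a register pair with EXT.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 6-tap kernel stored in an 8-tap layout: outer taps are zero. */
static const int16_t kTaps8[8] = { 0, 2, -14, 76, 76, -14, 2, 0 };

/* Generic 8-tap form: the window starts three samples left of src. */
static int32_t filter_8tap_window(const uint16_t *src, const int16_t *taps)
{
    const uint16_t *win = src - 3;
    int32_t sum = 0;
    for (int i = 0; i < 8; i++)
        sum += (int32_t)win[i] * taps[i];
    return sum;
}

/* 6-tap form: the base pointer is advanced by one sample, so the window
 * starts two samples left of src and the zero taps are never multiplied. */
static int32_t filter_6tap_window(const uint16_t *src, const int16_t *taps)
{
    const uint16_t *win = src - 2;
    int32_t sum = 0;
    for (int i = 0; i < 6; i++)
        sum += (int32_t)win[i] * taps[i + 1];
    return sum;
}

/* Returns 1 if both forms agree for every valid output position in a row. */
static int windows_agree(const uint16_t *row, size_t len)
{
    for (size_t x = 3; x + 4 < len; x++) {
        if (filter_8tap_window(row + x, kTaps8) !=
            filter_6tap_window(row + x, kTaps8))
            return 0;
    }
    return 1;
}
```

Calling `windows_agree` on any row of high-bit-depth samples checks that the two forms are interchangeable whenever the outer taps are zero.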
2d808de1