
AArch64: Optimize Armv8.0 Neon path of HBD HV 6-tap filters

Merged. Arpad Panyik requested to merge arpadpanyik-arm/dav1d:mc_hbd_hv_6tap_neon into master.

The horizontal parts of the Armv8.0 Neon 6-tap HV subpel filters can be improved further with some pointer arithmetic and by saving some EXT instructions in their data rearrangement code.
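
As a rough illustration of the pointer-arithmetic idea, here is a scalar C sketch (not the Neon assembly from this patch). It assumes the 6-tap filter is stored as an 8-tap kernel whose two outer coefficients are zero, so the filter can start reading one sample later and skip the outer taps. The function names and the coefficient values are hypothetical, and the sketch does not show how the patch saves EXT instructions.

```c
#include <stdint.h>

/* Reference form: apply all 8 taps, reading src[-3] .. src[4]. */
static int32_t filter_8tap(const uint16_t *src, const int8_t f[8])
{
    int32_t sum = 0;
    for (int k = 0; k < 8; k++)
        sum += f[k] * src[k - 3];
    return sum;
}

/* 6-tap form (assumes f[0] == 0 and f[7] == 0): advance both the
 * source pointer and the coefficient pointer by one element, then
 * apply only the 6 inner taps, reading src[-2] .. src[3]. */
static int32_t filter_6tap(const uint16_t *src, const int8_t f[8])
{
    const uint16_t *s = src - 2; /* one sample later than src - 3 */
    const int8_t   *c = f + 1;   /* skip the zero outer coefficient */
    int32_t sum = 0;
    for (int k = 0; k < 6; k++)
        sum += c[k] * s[k];
    return sum;
}
```

With zero outer coefficients both functions return the same sum, but the 6-tap form touches two fewer input samples per output. In a vectorized version that can mean fewer shifted copies of the input row to prepare.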

Relative runtime of micro benchmarks after this patch, compared to before it, on some Cortex CPU cores (values below 1.000x mean less runtime):

HBD mct hv        X1     A78     A76     A72     A55
 regular  w8:  0.952x  0.989x  0.924x  0.973x  0.976x
 regular w16:  0.961x  0.993x  0.928x  0.952x  0.971x
 regular w32:  0.964x  0.996x  0.930x  0.973x  0.972x
 regular w64:  0.963x  0.997x  0.930x  0.969x  0.974x

Pipeline #510830 passed for 2d808de1 on arpadpanyik-arm:mc_hbd_hv_6tap_neon

Test coverage 91.45% (0.15%) from 1 job

Merged by Martin Storsjö on Sep 6, 2024, 8:06am UTC

Merge details

  • Changes merged into master with 2d808de1.
  • Deleted the source branch.
  • Auto-merge enabled

Pipeline #510838 canceled for 2d808de1 on master
