
AArch64: Optimize Armv8.0 Neon path of SBD H/HV 6-tap filters

Merged: Arpad Panyik requested to merge arpadpanyik-arm/dav1d:mc_sbd_h_hv_6tap_neon into master

The 6-tap horizontal subpel filters and the horizontal part of the 6-tap HV subpel filters can be improved further by adjusting the pointer arithmetic and eliminating some EXT instructions from their data rearrangement code.
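As a rough scalar illustration of the idea (not the Neon code from this patch), the C sketch below computes a 6-tap horizontal sum in two ways: once from a window loaded at the 8-tap starting offset and then realigned, and once through a pointer that was biased up front so no realignment is needed. The coefficients, the function names, and the exact offsets are assumptions made for the example; they are not taken from dav1d.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical left/right padding and row width, chosen for the demo only. */
enum { PAD = 3, RPAD = 4, WIDTH = 8 };

/* Illustrative 6-tap coefficients (sum 64); NOT dav1d's filter tables. */
static const int8_t taps6[6] = { 2, -10, 40, 40, -10, 2 };

/* Variant A: load an 8-pixel window starting at src[x - 3] (8-tap layout),
 * then skip its first element. The skip stands in for an EXT-style
 * realignment of the loaded data. */
static int filter6_realign(const uint8_t *src, ptrdiff_t x)
{
    uint8_t win[8];
    for (int i = 0; i < 8; i++)
        win[i] = src[x - 3 + i];
    const uint8_t *shifted = win + 1; /* realignment step */
    int sum = 0;
    for (int i = 0; i < 6; i++)
        sum += taps6[i] * shifted[i];
    return sum;
}

/* Variant B: the caller passes a pointer already biased to src - 2, so the
 * first pixel read is the first pixel used and no realignment is needed. */
static int filter6_biased(const uint8_t *src_m2, ptrdiff_t x)
{
    int sum = 0;
    for (int i = 0; i < 6; i++)
        sum += taps6[i] * src_m2[x + i];
    return sum;
}

/* Returns 1 if both variants produce the same sum for every output pixel. */
int variants_agree(void)
{
    assert(PAD >= 3 && RPAD >= 4); /* Variant A reads x - 3 .. x + 4 */
    uint8_t row[PAD + WIDTH + RPAD];
    for (int i = 0; i < PAD + WIDTH + RPAD; i++)
        row[i] = (uint8_t)(i * 37 + 11); /* arbitrary test pattern */

    const uint8_t *pixels = row + PAD;  /* first output pixel */
    const uint8_t *biased = pixels - 2; /* pointer arithmetic done once */
    for (ptrdiff_t x = 0; x < WIDTH; x++)
        if (filter6_realign(pixels, x) != filter6_biased(biased, x))
            return 0;
    return 1;
}
```

Both variants read the same six pixels; the second one only moves the offset into the pointer setup, which is the kind of change that can remove a per-iteration rearrangement in vector code.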

Relative runtime of micro-benchmarks after this patch on some Cortex CPU cores (lower is better):

SBD mct h         X1     A78     A76     A72     A55
 regular  w8:  0.878x  0.894x  0.990x  0.923x  0.944x
 regular w16:  0.962x  0.931x  0.943x  0.949x  0.949x
 regular w32:  0.937x  0.937x  0.972x  0.938x  0.947x
 regular w64:  0.920x  0.965x  0.992x  0.936x  0.944x

SBD mct hv        X1     A78     A76     A72     A55
 regular  w8:  0.931x  0.970x  0.951x  0.950x  0.971x
 regular w16:  0.940x  0.971x  0.941x  0.952x  0.967x
 regular w32:  0.943x  0.972x  0.946x  0.961x  0.974x
 regular w64:  0.943x  0.973x  0.952x  0.944x  0.975x

Merge request reports

Pipeline #510839 passed

Pipeline passed for a992a9be on arpadpanyik-arm:mc_sbd_h_hv_6tap_neon

Test coverage 91.53% from 1 job

Merged by Martin Storsjö on Sep 6, 2024 8:35am UTC

Merge details

  • Changes merged into master with a992a9be.
  • Deleted the source branch.
  • Auto-merge enabled

Pipeline #510847 passed

Pipeline passed for a992a9be on master

Test coverage 91.16% from 1 job
