AArch64: Optimize prep_neon function

Merged Arpad Panyik requested to merge arpadpanyik-arm/dav1d:mc_prep_opt into master

Optimize the prep_neon function. Details are in the commit messages.

Relative performance of micro benchmarks including all commits (lower is better):

               mct_w4   mct_w8   mct_w16  mct_w32  mct_w64  mct_w128
Cortex-A55     0.795x   0.913x   0.912x   0.838x   1.025x   1.002x
Cortex-A510    0.760x   0.636x   0.640x   0.854x   0.864x   0.995x
Cortex-A72     0.616x   0.854x   0.756x   1.052x   1.044x   0.702x
Cortex-A76     0.837x   0.797x   0.841x   0.804x   0.948x   0.904x
Cortex-A78     -        -        0.542x   0.725x   0.741x   0.745x
Cortex-A715    -        -        0.561x   0.720x   0.740x   0.748x
Cortex-X1      -        -        -        0.886x   0.882x   0.917x
Cortex-X3      -        -        -        0.835x   0.803x   0.808x
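
For readers who prefer percentages, the ratios above can be converted with a small sketch like the following. It assumes each figure is the new run time divided by the baseline run time, which the "lower is better" note suggests but the description does not state outright. The helper name `percent_change` is hypothetical and not part of dav1d.

```python
def percent_change(ratio: float) -> float:
    """Convert a relative-time ratio into a percent change in run time.

    Assumption: ratio = new_time / baseline_time, so a negative result
    means the optimized code is faster than the baseline.
    """
    return (ratio - 1.0) * 100.0


# A few ratios copied from the results above.
samples = {
    ("Cortex-A55", "mct_w4"): 0.795,
    ("Cortex-A72", "mct_w32"): 1.052,
    ("Cortex-A78", "mct_w16"): 0.542,
}

for (core, bench), ratio in samples.items():
    print(f"{core} {bench}: {percent_change(ratio):+.1f}% run time")
```

Under that assumption, values below 1.000x are improvements and values slightly above 1.000x (for example mct_w32 on Cortex-A72) are small regressions.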
