arm64: ipred: 16 bpc NEON implementation of the Z1 and Z3 functions

Merged: Martin Storsjö requested to merge mstorsjo/dav1d:arm64-z1-z3-16bpc into master

As usual, there are a handful of minor things to fix in the 8 bpc case that I noticed after looking closer at it again.

Overall relative speedup over C code:

                          Cortex A53    A55    A72    A73    A76   Apple M1
intra_pred_z1_w4_16bpc_neon:    3.49   2.63   2.83   3.85   3.14   9.00
intra_pred_z1_w8_16bpc_neon:    6.19   4.39   3.65   6.58   4.99   6.50
intra_pred_z1_w16_16bpc_neon:   6.65   4.64   3.97   7.78   4.87   7.00
intra_pred_z1_w32_16bpc_neon:   7.76   5.49   5.17   7.83   5.59   8.24
intra_pred_z1_w64_16bpc_neon:   8.02   5.80   5.33   8.41   5.77   8.70
intra_pred_z3_w4_16bpc_neon:    3.06   2.87   2.17   1.97   2.33   7.75
intra_pred_z3_w8_16bpc_neon:    3.90   3.94   2.97   3.16   2.93   4.43
intra_pred_z3_w16_16bpc_neon:   4.08   4.48   3.31   4.68   3.13   5.00
intra_pred_z3_w32_16bpc_neon:   4.43   4.85   3.50   4.02   3.33   5.62
intra_pred_z3_w64_16bpc_neon:   4.68   5.30   3.72   3.96   3.52   5.78

Merge request reports

Pipeline #324723 passed

Pipeline passed for e75caab9 on mstorsjo:arm64-z1-z3-16bpc

Test coverage 91.95% (-0.12%) from 1 job

Merged by Martin Storsjö on Mar 21, 2023 7:06am UTC

Merge details

  • Changes merged into master with e75caab9.
  • Deleted the source branch.
  • Auto-merge enabled

Pipeline #324728 passed

Pipeline passed for e75caab9 on master

Test coverage 92.07% (-0.12%) from 1 job