
Draft: aarch64: add DAV1D_ARM_CPU_FLAG_DOTP

Closed Janne Grunau requested to merge janne/dav1d:arm64_cpuid into master
1 unresolved thread

Runtime detection on Linux using HWCAP_CPUID for user space access to the CPU feature registers. See https://www.kernel.org/doc/html/latest/arm64/cpu-feature-registers.html
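
As a rough illustration of the Linux path described above (a sketch under stated assumptions, not the code from this merge request): when the kernel reports HWCAP_CPUID, user space may read ID_AA64ISAR0_EL1 with `mrs`, and the DP field in bits [47:44] is non-zero when the dot product instructions are implemented. The helper names `isar0_has_dotprod` and `detect_dotprod` are hypothetical, and the fallback definition of HWCAP_CPUID is only there for older libc headers.

```c
#include <stdint.h>

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>
#ifndef HWCAP_CPUID
#define HWCAP_CPUID (1 << 11) /* value from the arm64 uapi hwcap header */
#endif
#endif

/* ID_AA64ISAR0_EL1.DP occupies bits [47:44]; a non-zero value means
 * the UDOT/SDOT instructions are implemented. */
static int isar0_has_dotprod(uint64_t isar0) {
    return ((isar0 >> 44) & 0xf) != 0;
}

/* Hypothetical helper: returns 1 if dot product support is detected,
 * 0 if it is absent or cannot be probed on this platform. */
static int detect_dotprod(void) {
#if defined(__linux__) && defined(__aarch64__)
    if (getauxval(AT_HWCAP) & HWCAP_CPUID) {
        uint64_t isar0;
        /* The kernel traps and emulates this read for user space. */
        __asm__("mrs %0, ID_AA64ISAR0_EL1" : "=r"(isar0));
        return isar0_has_dotprod(isar0);
    }
#endif
    return 0;
}
```

On anything other than AArch64 Linux the sketch compiles to a function that always returns 0, so a caller would still need a separate path for other platforms.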

Runtime detection on macOS using sysctlbyname("hw.cpufamily") and matching known ARMv8.4-A CPUs. macOS unfortunately doesn't expose a flag for this feature.

Marked as draft, as it is pointless without users.

Closed by Ronald S. Bultje 1 year ago (Feb 24, 2024 6:05pm UTC)

Merge details

  • The changes were not merged into master.

Activity

  • Janne Grunau resolved all threads

  • Janne Grunau added 1 commit

    • dcf4ce3c - aarch64: add DAV1D_ARM_CPU_FLAG_DOTP

  • Martin Storsjö @mstorsjo started a thread on commit dcf4ce3c
      size_t size = sizeof(cpu_family);
      /* there is no explicit flag for dot product availability
       * enable it on known ARMv8.4-A CPUs
       */
      int ret = sysctlbyname("hw.cpufamily", &cpu_family, &size, NULL, 0);
      if (!ret) {
          switch (cpu_family)
          {
          case CPUFAMILY_ARM_LIGHTNING_THUNDER: // explicit fall through
          case CPUFAMILY_ARM_FIRESTORM_ICESTORM:
              flags |= DAV1D_ARM_CPU_FLAG_DOTP;
              break;
          default:
              break;
          }
      }

    • Would it be good to somehow fold in a check for __ARM_FEATURE_DOTPROD too, which would imply that it's always available unconditionally? (Clang doesn't seem to set that one, but GCC does if building with -march=armv8.4-a.)

    • Author Maintainer

      I'd say yes, but I'm reluctant to add this to the current #ifdef hell. I'll try to think of something. Maybe it would make sense to split cpu.c into 32/64-bit files.

    • Yep, the code is a bit tricky. Maybe a separate #if defined(__aarch64__) && defined(__ARM_FEATURE_DOTPROD) (and we could have the same for __ARM_NEON and other hardcoded conditions) after the different cases of actually doing runtime probing?
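
      The suggestion above could be sketched roughly as follows: do whatever runtime probing applies first, then OR in any feature the compiler already guarantees through its predefined macros. This is only an illustration of the idea, not the code that was eventually merged; the `SKETCH_CPU_FLAG_*` values and `probe_runtime_flags` are hypothetical placeholders.

```c
/* Hypothetical flag values for illustration; dav1d's real values may differ. */
enum {
    SKETCH_CPU_FLAG_NEON = 1 << 0,
    SKETCH_CPU_FLAG_DOTP = 1 << 1,
};

/* Placeholder for the platform-specific runtime probing
 * (HWCAP, sysctlbyname, ...). Returns no flags in this sketch. */
static unsigned probe_runtime_flags(void) {
    return 0;
}

static unsigned get_cpu_flags(void) {
    unsigned flags = probe_runtime_flags();

    /* Features the compiler was told to assume are available
     * unconditionally, whatever the runtime probing reported. */
#if defined(__ARM_NEON)
    flags |= SKETCH_CPU_FLAG_NEON;
#endif
#if defined(__aarch64__) && defined(__ARM_FEATURE_DOTPROD)
    flags |= SKETCH_CPU_FLAG_DOTP;
#endif

    return flags;
}
```

      Keeping the compile-time block after all of the probing branches means it does not add another level of nesting to the existing platform conditionals.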

  • Janne Grunau added 1 commit

    • a10bbd1f - aarch64: add DAV1D_ARM_CPU_FLAG_DOTP

  • Ronald S. Bultje mentioned in merge request !1609 (merged)

  • Superseded by !1609 (merged).
