arm: Don't use SVE2 functions on Linux on Apple M4

The Apple M4 supports the SME instruction set extension. SME implicitly provides Streaming SVE2, which allows using a subset of SVE2 instructions, but only after "streaming mode" has been activated.

Our SVE2 functions don't enable streaming mode, so they won't work on such a CPU.

We currently don't have any way of detecting support for SVE or SVE2 on macOS, but we do have that for Linux (and Windows). If running (e.g. virtualized) Linux on an M4, we do see the hwcap for SVE2 enabled, but not that for plain SVE - see [1] for an example of what hwcaps can be detected there.

Currently, this would translate into enabling DAV1D_ARM_CPU_FLAG_SVE2 but not DAV1D_ARM_CPU_FLAG_SVE.
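As a rough sketch of how that mapping ends up with SVE2 set but SVE clear, assuming the Linux AArch64 hwcap bit values from the kernel's uapi header. The function name and the SKETCH_* flag values are hypothetical, chosen for illustration only, and are not taken from dav1d:

```c
/* Linux AArch64 hwcap bits (values as in the kernel's asm/hwcap.h). */
#define HWCAP_AARCH64_SVE   (1UL << 22) /* reported in AT_HWCAP  */
#define HWCAP2_AARCH64_SVE2 (1UL << 1)  /* reported in AT_HWCAP2 */

/* Hypothetical flag values, for illustration only. */
enum {
    SKETCH_FLAG_SVE  = 1 << 3,
    SKETCH_FLAG_SVE2 = 1 << 4,
};

/* Hypothetical helper: translate raw hwcap words into CPU flags.
 * Each flag is derived independently from its own hwcap bit, so
 * SVE2 can end up set while SVE is not. */
static unsigned sketch_flags_from_hwcaps(unsigned long hwcap,
                                         unsigned long hwcap2)
{
    unsigned flags = 0;
    if (hwcap & HWCAP_AARCH64_SVE)
        flags |= SKETCH_FLAG_SVE;
    if (hwcap2 & HWCAP2_AARCH64_SVE2)
        flags |= SKETCH_FLAG_SVE2;
    return flags;
}
```

On a real system the two words would come from getauxval(AT_HWCAP) and getauxval(AT_HWCAP2); they are passed in as parameters here so the sketch stays portable.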

We only check whether the DAV1D_ARM_CPU_FLAG_SVE2 flag is set before taking such a function into use. To avoid this issue, we could update all cases to check for both DAV1D_ARM_CPU_FLAG_SVE and DAV1D_ARM_CPU_FLAG_SVE2. However, this is both cumbersome and error-prone.

Instead, clear the DAV1D_ARM_CPU_FLAG_SVE2 flag if DAV1D_ARM_CPU_FLAG_SVE isn't set. This clarifies the role of the DAV1D_ARM_CPU_FLAG_SVE2 flag to indicate whether non-streaming SVE2 can be used.
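A minimal sketch of that masking step, not the actual patch. The bit positions given to the two flags are assumed for illustration, and the helper name is hypothetical:

```c
/* Assumed bit positions, for illustration only; the real
 * definitions live in dav1d's own headers. */
enum {
    DAV1D_ARM_CPU_FLAG_SVE  = 1 << 3,
    DAV1D_ARM_CPU_FLAG_SVE2 = 1 << 4,
};

/* Hypothetical helper: if plain SVE is not available, SVE2 can only
 * be the streaming variant, so drop the SVE2 flag. All other bits
 * are passed through unchanged. */
static unsigned sketch_drop_streaming_only_sve2(unsigned flags)
{
    if (!(flags & DAV1D_ARM_CPU_FLAG_SVE))
        flags &= ~(unsigned)DAV1D_ARM_CPU_FLAG_SVE2;
    return flags;
}
```

With the flag masked in one place, call sites can keep testing only DAV1D_ARM_CPU_FLAG_SVE2.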

We could also do this check in src/arm/cpu.c, within dav1d_get_cpu_flags_arm() (possibly factoring out a wrapper around the existing OS-specific detection functions). However, by doing it in src/cpu.h we also cover the (somewhat hypothetical) case where SVE2 seems enabled in the compiler (via the __ARM_FEATURE_SVE2 define).

[1] https://github.com/docker/roadmap/issues/782
