
aarch64: Use regular hwcaps flags instead of HWCAP_CPUID for CPU feature detection on Linux

Martin Storsjö requested to merge mstorsjo/x264:aarch64-cpu into master

This makes the code much simpler (especially when adding support for other instruction set extensions), avoids the need for inline assembly for this feature, and is generally the more canonical way to do this.

The CPU feature detection was added in 9c3c7168, using HWCAP_CPUID.

The argument for using that was that HWCAP_CPUID was added much earlier in the kernel (in Linux v4.11), while the HWCAP flags for individual features always come later. This allows detecting support for new CPU extensions before the kernel exposes information about them via hwcap flags.

However, in practice there is probably little advantage to this. E.g. HWCAP_SVE was added in Linux v4.15, and HWCAP2_SVE2 was added in v5.10 - later than HWCAP_CPUID, but there are probably very few practical cases where one would run a kernel older than that on a CPU that supports those instructions.

Additionally, we provide our own definitions of the flag values to check (as they are fixed constants anyway), with names not conflicting with the ones from system headers. This reduces the number of ifdefs needed, and allows detecting those features even if building with userland headers that are lacking the definitions of those flags.
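As a rough illustration of the approach described above (not the actual code from this merge request), a minimal sketch could look like the following. The `MY_`-prefixed macro names, the `FEAT_*` enum, and the `decode_hwcaps`/`detect_cpu_features` helpers are hypothetical names chosen for this example; the bit positions are the fixed values from the Linux aarch64 hwcap ABI.

```c
#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>   /* getauxval(), AT_HWCAP, AT_HWCAP2 */
#endif

/* Private copies of the aarch64 Linux hwcap bits. The values are fixed by
 * the kernel ABI; the MY_ prefix is a hypothetical naming choice that avoids
 * clashing with HWCAP_* names from system headers. */
#define MY_HWCAP_AARCH64_ASIMDDP (1UL << 20)
#define MY_HWCAP_AARCH64_SVE     (1UL << 22)
#define MY_HWCAP2_AARCH64_SVE2   (1UL << 1)
#define MY_HWCAP2_AARCH64_I8MM   (1UL << 13)

/* Hypothetical internal feature flags, for illustration only. */
enum {
    FEAT_DOTPROD = 1 << 0,
    FEAT_SVE     = 1 << 1,
    FEAT_SVE2    = 1 << 2,
    FEAT_I8MM    = 1 << 3,
};

/* Translate raw hwcap words into internal feature flags. Kept separate from
 * the getauxval() calls so it can be exercised on any host. */
unsigned decode_hwcaps(unsigned long hwcap, unsigned long hwcap2)
{
    unsigned flags = 0;

    if (hwcap & MY_HWCAP_AARCH64_ASIMDDP)
        flags |= FEAT_DOTPROD;
    if (hwcap & MY_HWCAP_AARCH64_SVE)
        flags |= FEAT_SVE;
    if (hwcap2 & MY_HWCAP2_AARCH64_SVE2)
        flags |= FEAT_SVE2;
    if (hwcap2 & MY_HWCAP2_AARCH64_I8MM)
        flags |= FEAT_I8MM;

    return flags;
}

/* Query the kernel on aarch64 Linux; report no optional features elsewhere. */
unsigned detect_cpu_features(void)
{
#if defined(__linux__) && defined(__aarch64__)
    return decode_hwcaps(getauxval(AT_HWCAP), getauxval(AT_HWCAP2));
#else
    return 0;
#endif
}
```

With this shape, adding another extension only takes one more constant and one more `if`, and no per-flag ifdefs are needed since nothing depends on the system headers defining the individual bits.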

Also, slightly older versions of QEMU, e.g. 6.2 in Ubuntu 22.04, do expose support for these features via HWCAP flags, but the emulated cpuid registers are missing the bits for exposing e.g. SVE2. (This issue is fixed in later versions of QEMU, though.)

Also drop the ifdef check for whether AT_HWCAP is defined; it was added to glibc in 1997. AT_HWCAP2 was added in 2013, in glibc 2.18, which also predates aarch64 being commonly used, so don't guard the use of that with an ifdef either.
