checkasm: Add support for the private macOS kperf API for benchmarking
On AArch64, the performance counter registers are usually restricted and not accessible from user space.
On macOS, we currently use mach_absolute_time() as the timer on AArch64. This measures wall-clock time, but with a very coarse resolution.
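For illustration only, here is a minimal sketch of how such a tick counter can be read and scaled to nanoseconds. It is not checkasm's code: read_ticks() and ticks_to_ns() are hypothetical helper names, and the clock_gettime() branch is a stand-in assumed here so the sketch also builds on non-Apple hosts.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L /* for clock_gettime() in the fallback path */
#endif

#include <stdint.h>

#ifdef __APPLE__
#include <mach/mach_time.h>
#else
#include <time.h>
#endif

/* Hypothetical helper: read a raw monotonic tick count. */
static uint64_t read_ticks(void) {
#ifdef __APPLE__
    /* Ticks are in units of the Mach timebase, not nanoseconds. */
    return mach_absolute_time();
#else
    /* Stand-in for other hosts (assumption): one tick is one nanosecond. */
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t) ts.tv_sec * 1000000000u + (uint64_t) ts.tv_nsec;
#endif
}

/* Hypothetical helper: scale ticks to nanoseconds with a numer/denom
 * timebase, such as the one mach_timebase_info() reports on macOS.
 * The multiplication can overflow for very large tick counts. */
static uint64_t ticks_to_ns(uint64_t ticks, uint32_t numer, uint32_t denom) {
    return ticks * numer / denom;
}
```

Assuming a timebase of 125/3, a value commonly reported on Apple Silicon, ticks_to_ns(1, 125, 3) is roughly 41 ns, so a function that runs for a few hundred nanoseconds spans only a handful of ticks.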
There is a private API, kperf, that can be used to get high-precision timers, though. Unfortunately, it requires running the checkasm binary as root (e.g. with sudo).
Also, as it is a private, undocumented API, it can potentially change at any time.
This is handled by adding a new meson build option for switching to this timer. If the timer source in checkasm could be changed at runtime with an option, this wouldn't need to be a build-time option.
This allows getting benchmarks like this:
mc_8tap_regular_w16_hv_8bpc_c:      1522.1 ( 1.00x)
mc_8tap_regular_w16_hv_8bpc_neon:    331.8 ( 4.59x)
Instead of this:
mc_8tap_regular_w16_hv_8bpc_c:         9.0 ( 1.00x)
mc_8tap_regular_w16_hv_8bpc_neon:      1.9 ( 4.76x)
Co-authored-by: J. Dekker <jdek@itanimul.li>