
x86: Branch before waiting on popcnt in ipred_z AVX2 functions

Henrik Gramner requested to merge gramner/dav1d:x86_ipred_z_popcnt into master

Some specific Haswell CPUs have a hardware bug where the popcnt instruction doesn't set the zero flag correctly, which causes the wrong branch to be taken.

popcnt also has a 3-cycle latency on Intel CPUs, so branching on the input value instead of the output reduces the time wasted going down the wrong code path when a branch is mispredicted. The result of popcnt is zero exactly when its input is zero, so both tests select the same path.
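As a rough illustration of the idea (not the actual assembly in this merge request), the C sketch below contrasts the two orderings. All names are hypothetical, and `popcount32` is a portable stand-in for the popcnt instruction. A C compiler is free to reorder or merge these operations, so the sketch only shows the logical equivalence; the real change is in the hand-written AVX2 code.

```c
#include <stdint.h>

/* Hypothetical stand-in for the popcnt instruction: counts set bits. */
static unsigned popcount32(uint32_t x) {
    unsigned n = 0;
    while (x) {
        x &= x - 1; /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Before: the branch condition is derived from popcnt's result.
 * In assembly this means relying on the zero flag set by popcnt
 * and waiting for its result before the branch can resolve. */
static int is_zero_from_output(uint32_t mask, unsigned *count) {
    *count = popcount32(mask);
    return *count == 0;
}

/* After: the branch condition is derived from the input.
 * popcount32(mask) == 0 holds exactly when mask == 0, so the
 * decision does not depend on popcnt's flags or its latency. */
static int is_zero_from_input(uint32_t mask, unsigned *count) {
    int is_zero = (mask == 0);
    *count = popcount32(mask);
    return is_zero;
}
```

Both functions return the same decision and the same count for every input; only the value the branch depends on differs.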

Edited by Henrik Gramner
