x86: AVX-512 support
AVX-512 consists of a plethora of different extensions, but in order to keep things a bit more manageable we group together the following extensions under a single baseline cpu flag which should cover SKL-X and future CPUs: * AVX-512 Foundation (F) * AVX-512 Conflict Detection Instructions (CD) * AVX-512 Byte and Word Instructions (BW) * AVX-512 Doubleword and Quadword Instructions (DQ) * AVX-512 Vector Length Extensions (VL) On x86-64 AVX-512 provides 16 additional vector registers, prefer using those over existing ones since it allows us to avoid using `vzeroupper` unless more than 16 vector registers are required. They also happen to be volatile on Windows which means that we don't need to save and restore existing xmm register contents unless more than 22 vector registers are required. Also take the opportunity to drop X264_CPU_CMOV and X264_CPU_SLOW_CTZ while we're breaking API by messing with the cpu flags since they weren't really used for anything. Big thanks to Intel for their support.