AVX512 SIMD
See Wikipedia for background on AVX-512. The target is Ice Lake (F, CD, VL, DQ, BW, IFMA, VBMI, VBMI2, VPOPCNTDQ, BITALG, VNNI, VPCLMULQDQ, GFNI, VAES). Ice Lake instances (m6i) are available on Amazon EC2.
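To check whether a given machine (for example an m6i instance) exposes the full Ice Lake feature set listed above, the CPUID leaf 7 feature bits can be tested against a mask. The following is a minimal sketch, not the project's actual CPU detection code: the function name `has_icelake_avx512` and the macro names are hypothetical, and the caller is assumed to have already read EBX/ECX from CPUID leaf 7 (subleaf 0) and XCR0 via XGETBV.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID.(EAX=7,ECX=0):EBX feature bits */
#define EBX_AVX512F         (1u << 16)
#define EBX_AVX512DQ        (1u << 17)
#define EBX_AVX512IFMA      (1u << 21)
#define EBX_AVX512CD        (1u << 28)
#define EBX_AVX512BW        (1u << 30)
#define EBX_AVX512VL        (1u << 31)

/* CPUID.(EAX=7,ECX=0):ECX feature bits */
#define ECX_AVX512VBMI      (1u << 1)
#define ECX_AVX512VBMI2     (1u << 6)
#define ECX_GFNI            (1u << 8)
#define ECX_VAES            (1u << 9)
#define ECX_VPCLMULQDQ      (1u << 10)
#define ECX_AVX512VNNI      (1u << 11)
#define ECX_AVX512BITALG    (1u << 12)
#define ECX_AVX512VPOPCNTDQ (1u << 14)

/* Hypothetical masks covering the Ice Lake target set named in the text. */
#define ICL_EBX_MASK (EBX_AVX512F | EBX_AVX512CD | EBX_AVX512VL | \
                      EBX_AVX512DQ | EBX_AVX512BW | EBX_AVX512IFMA)
#define ICL_ECX_MASK (ECX_AVX512VBMI | ECX_AVX512VBMI2 | ECX_AVX512VPOPCNTDQ | \
                      ECX_AVX512BITALG | ECX_AVX512VNNI | ECX_VPCLMULQDQ | \
                      ECX_GFNI | ECX_VAES)

/* XCR0 state the OS must enable: SSE (bit 1), AVX (bit 2),
 * opmask (bit 5), ZMM_Hi256 (bit 6), Hi16_ZMM (bit 7). */
#define XCR0_AVX512_STATE 0xE6u

/* Returns true only if every listed extension is reported by CPUID
 * and the OS has enabled the AVX-512 register state. */
bool has_icelake_avx512(uint32_t ebx7, uint32_t ecx7, uint64_t xcr0)
{
    return (ebx7 & ICL_EBX_MASK) == ICL_EBX_MASK &&
           (ecx7 & ICL_ECX_MASK) == ICL_ECX_MASK &&
           (xcr0 & XCR0_AVX512_STATE) == XCR0_AVX512_STATE;
}
```

Keeping the check as a pure function of the register values makes it testable on any host. On GCC or Clang the inputs could be obtained with `__get_cpuid_count` from `<cpuid.h>` plus an XGETBV read, after confirming OSXSAVE in CPUID leaf 1.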
8-bit:
- mc
  - avg/mask/w_avg (!921 (merged))
  - w_mask (!921 (merged))
  - blend{,_h/v} (!1301 (merged))
  - warp8x8{,t} (!1301 (merged))
  - emu_edge
  - 8tap put (!1301 (merged))
  - 8tap prep (!1301 (merged))
  - bilinear put (!1301 (merged))
  - bilinear prep
- intra_pred
  - h/v/dc/dc_128 (!1301 (merged))
  - paeth (!1301 (merged))
  - smooth{,_h/v} (!1301 (merged))
  - z1 (!1562 (merged))
  - z2 (!1570 (merged))
  - z3 (!1566 (merged))
  - filter (!1301 (merged))
  - cfl_ac
    - 4:2:0
    - 4:4:4
    - 4:2:2
  - cfl_pred
  - pal_pred (!1301 (merged))
- itx (!1301 (merged))
- deblock
- CDEF
  - dir
  - filter (!905 (merged), !932 (merged))
- loop restoration (!1301 (merged))
- SVC/super_res
  - mc.scaled_put/prep
  - mc.resize (!1355 (merged), @psilokos)
- grain (!1374 (merged))
  - generate_grain_y
  - generate_grain_uv_420/422/444
  - fgy_32x32xn
  - fguv_32x32xn_420/422/444
10/12-bit:
- mc (!1314 (merged))
  - avg/mask/w_avg
  - w_mask
  - blend{,_h/v}
  - warp8x8{,t}
  - emu_edge
  - 8tap put
  - 8tap prep
  - bilinear put
  - bilinear prep
- intra_pred
  - h/v/dc/dc_128
  - paeth (!1363 (merged))
  - smooth{,_h/v} (!1363 (merged))
  - z1 (!1572 (merged))
  - z2 (!1590 (merged))
  - z3 (!1580 (merged))
  - filter (!1363 (merged))
  - cfl_ac
    - 4:2:0
    - 4:4:4
    - 4:2:2
  - cfl_pred
  - pal_pred (!1363 (merged))
- itx
  - 10-bit
    - 8x8, 8x16, 16x8, 16x16 (!1454 (merged), @gramner)
    - 8x32, 32x8 (!1466 (merged), @gramner)
    - 16x32, 32x16, 32x32 (!1475 (merged), @gramner)
    - 16x64 (!1503 (merged), @rbultje), 64x16 (!1509 (merged)), 32x64 (!1504 (merged)), 64x32 (!1510 (merged)), 64x64 (!1512 (merged))
  - 12-bit
- 10-bit
  - deblock (!1427 (merged), @gramner)
- CDEF
  - dir
  - filter (!1421 (merged))
- loop restoration
  - wiener (!1320 (merged))
  - SGR
    - 10-bit (!1327 (merged))
    - 12-bit
- SVC/super_res
  - mc.scaled_put/prep
  - mc.resize (!1355 (merged), @psilokos)
- grain (@gramner, !1396 (merged))
  - generate_grain_y
  - generate_grain_uv_420/422/444
  - fgy_32x32xn
  - fguv_32x32xn_420/422/444