Martin Storsjö authored
Each vext.8 instruction only needs to produce a single d register, which makes more registers available as scratch space. This allows hiding latencies better, and grouping the vmul/vmla instructions in the form that is beneficial for in-order cores (which have a special forwarding path for such patterns).
41f59b02