This adds multiple LSX optimizations and a few LASX optimizations. Decoding performance on the LoongArch platform has also improved greatly. Reviews are welcome. @jbk @jamrial @gramner