Add endian-swapping helper and support big-endian formats
Might as well. This generally tries to do the endian swap on the GPU, using a somewhat optimized compute shader that essentially performs SIMD-style 32-bit swizzles.
Also includes some miscellaneous minor fixes.
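
For reference, the 32-bit swizzle idea can be sketched on the CPU in plain C. This is only an illustration, not the shader from this commit: the names swap16x2, swap32 and swap_words are hypothetical, and it assumes the data is handled as whole 32-bit words holding either two 16-bit samples or one 32-bit sample.

```c
#include <stddef.h>
#include <stdint.h>

/* Swap the two bytes inside each 16-bit half of a 32-bit word,
 * handling both halves at once: 0xAABBCCDD -> 0xBBAADDCC. */
static inline uint32_t swap16x2(uint32_t w)
{
    return ((w & 0x00FF00FFu) << 8) | ((w & 0xFF00FF00u) >> 8);
}

/* Reverse all four bytes of a 32-bit word: 0xAABBCCDD -> 0xDDCCBBAA.
 * Done as a 16-bit pair swap followed by swapping the two halves. */
static inline uint32_t swap32(uint32_t w)
{
    w = swap16x2(w);
    return (w << 16) | (w >> 16);
}

/* Hypothetical helper: swap a buffer in place, one 32-bit word at a time.
 * sample_size is the size in bytes of each sample (assumed to be 2 or 4). */
static void swap_words(uint32_t *buf, size_t num_words, int sample_size)
{
    for (size_t i = 0; i < num_words; i++)
        buf[i] = (sample_size == 2) ? swap16x2(buf[i]) : swap32(buf[i]);
}
```

Working on whole 32-bit words means a 16-bit format gets two samples swapped per operation, which is presumably where the "SIMD" description comes from; a real compute shader would map one or more words to each invocation instead of looping.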