- May 18, 2021
-
-
Matthias Dressel authored
-
- May 16, 2021
-
-
Jean-Baptiste Kempf authored
-
- May 14, 2021
-
-
Martin Storsjö authored
Use the mvni instruction instead of setting the constant in a GPR first.
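As a rough illustration of why this helps: `mvni` (move inverted immediate) writes the bitwise NOT of a shifted 8-bit immediate into every vector lane, so some constants can be produced in a single instruction without involving a general-purpose register. The C model below mimics one 16-bit lane of that instruction. The function name `mvni_h_lane` and the example constant are assumptions chosen for illustration, not taken from the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of one 16-bit lane of
 * `mvni Vd.8h, #imm8, lsl #shift`: the lane receives the bitwise NOT
 * of the shifted 8-bit immediate. The 16-bit form allows shifts of 0 or 8. */
static uint16_t mvni_h_lane(uint8_t imm8, unsigned shift)
{
    assert(shift == 0 || shift == 8);
    return (uint16_t)~((uint32_t)imm8 << shift);
}
```

For example, `mvni_h_lane(0xf0, 8)` yields `0x0fff`, the kind of value that would otherwise need a `mov` into a GPR followed by a `dup` into the vector register.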
-
- May 13, 2021
-
-
Matthias Dressel authored
-
James Almer authored
-
In 16 bpc, pixels are stored as 16-bit integers, but valid pixel values only go up to 12 bits, and the scaling buffer contains only 4096 elements. The src pixels are normally supposed to be valid, but when processing blocks of 32 pixels at a time, the code can operate on uninitialized pixels past the right edge.

Before:                  Cortex A53      A72      A73   Apple M1
fgy_32x32xn_16bpc_neon:     10372.5   8194.4   8612.1       24.2
After:
fgy_32x32xn_16bpc_neon:     10837.9   8469.5   8885.1       24.6
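The hazard described here is an out-of-bounds table read when a 16-bit value exceeds the 4096-entry table. A minimal sketch of one way to guard such a lookup, assuming a clamp to the last valid index, is shown below in scalar C. The helper name `scaling_lookup` and the `uint8_t` element type are assumptions made for illustration; the actual change is in NEON assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SCALING_SIZE 4096 /* table size stated in the commit message */

/* Hypothetical helper: clamp a possibly-garbage 16-bit pixel to the last
 * valid index before using it for the table lookup. */
static uint8_t scaling_lookup(const uint8_t scaling[SCALING_SIZE], uint16_t px)
{
    assert(scaling != NULL);
    const size_t idx = px < SCALING_SIZE ? px : SCALING_SIZE - 1;
    return scaling[idx];
}
```

With this guard, an uninitialized pixel such as `0xffff` reads the last table entry instead of memory past the end of the buffer. The extra clamp is consistent with the small slowdown shown in the numbers above.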
-
Jean-Baptiste Kempf authored
-
- May 12, 2021
-
-
Martin Storsjö authored
Relative speedup over C code:
                         Cortex A53    A72    A73   Apple M1
fgy_32x32xn_16bpc_neon:        3.87   2.28   2.78       3.45
-
Martin Storsjö authored
Don't call them when targeting e.g. UWP. This requires building with an SDK new enough to provide the winapifamily.h header and to include it implicitly from the regular platform headers; the header has been available since the Windows 8.0 SDK (and since mingw-w64 v3.0.0), so this should be safe. Also rewrite the GetProcAddress call so that it is skipped if GetModuleHandleW(L"kernel32.dll") returns NULL for some reason.
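The pattern described above can be sketched roughly as follows: a compile-time check of the API partition, followed by a run-time lookup that never passes a NULL module handle to GetProcAddress. The helper name `lookup_kernel32_proc`, the `generic_fn` type, and the `HAVE_DESKTOP_API` macro are hypothetical names introduced for this sketch, which also assumes that `<windows.h>` pulls in `winapifamily.h`, as the commit message requires.

```c
#include <assert.h>
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>
/* Desktop-only APIs are allowed only in the desktop partition (not UWP). */
#if WINAPI_FAMILY_PARTITION(WINAPI_PARTITION_DESKTOP)
#define HAVE_DESKTOP_API 1
#endif
#endif

typedef void (*generic_fn)(void);

/* Hypothetical helper: resolve an export from kernel32.dll at run time.
 * Returns NULL when not built for the Windows desktop partition, when the
 * module handle is unavailable, or when the export does not exist. */
static generic_fn lookup_kernel32_proc(const char *name)
{
    assert(name != NULL);
#ifdef HAVE_DESKTOP_API
    HMODULE kernel32 = GetModuleHandleW(L"kernel32.dll");
    if (!kernel32)
        return NULL; /* skip GetProcAddress entirely */
    return (generic_fn)GetProcAddress(kernel32, name);
#else
    (void)name;
    return NULL;
#endif
}
```

A caller would cast the returned pointer to the real function type and fall back to a default code path when the result is NULL.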
-
- May 11, 2021
-
-
Ronald S. Bultje authored
-
- May 10, 2021
-
-
- May 04, 2021
-
-
It's only useful for 8-bit, since the default ordering is more efficient for high bit-depth.
-