- 02 Jun, 2018 1 commit
-
Henrik Gramner authored
Clang emits aligned AVX stores for things like zeroing stack-allocated variables when using -mavx, even with -fno-tree-vectorize set, which can result in crashes if this occurs before we've realigned the stack. Previously we only ensured that the stack was realigned before calling assembly functions that access stack-allocated buffers, but this is not sufficient. Fix the issue by changing the stack realignment to instead occur immediately in all CLI, API and thread entry points.
-
- 27 May, 2018 6 commits
-
Anton Mitrofanov authored
-
Anton Mitrofanov authored
32-bit shifts are only defined for values in the range 0-31.
-
Anton Mitrofanov authored
Now behaves the same as bs_align_0 and bs_align_1.
-
Anton Mitrofanov authored
-
Anton Mitrofanov authored
-
Anton Mitrofanov authored
-
- 31 Mar, 2018 2 commits
-
Henrik Gramner authored
This was always required, but accidentally happened to work correctly in a few cases.
-
Martin Storsjö authored
This picks the right assembler automatically for arm and aarch64 llvm-mingw targets. It doesn't pick the right assembler for setups where clang acts like MSVC and uses MSVC headers (where it should perhaps use armasm as before), but that's probably an even more obscure setup.
-
- 18 Jan, 2018 1 commit
-
Anton Mitrofanov authored
-
- 17 Jan, 2018 7 commits
-
Henrik Gramner authored
-
Diego Biurrun authored
-
Diego Biurrun authored
* Drop empty addition of GPLed filters
* Replace backticks with $()
-
Henrik Gramner authored
-
Henrik Gramner authored
Improves cache efficiency.
-
Anton Mitrofanov authored
-
Henrik Gramner authored
Fixes segfaults on Windows where the stack is only 16-byte aligned.
-
- 24 Dec, 2017 23 commits
-
Anton Mitrofanov authored
-
5x speed up vs C code.
-
This version supports converting aarch64 assembly for MS armasm64.exe.
-
swscale can read past the end of the input buffer, which may result in crashes if such a read crosses a page boundary into an invalid page. Work around this by adding some padding space at the end of the buffer when using memory-mapped input frames. This may sometimes require copying the last frame into a new buffer on Windows since the Microsoft memory-mapping implementation has very limited capabilities compared to POSIX systems.
-
Use the AVComponentDescriptor depth field instead of depth_minus1.
-
The assembler (both gas and clang/llvm) automatically fixes this, armasm64 doesn't. We can fix it in gas-preprocessor, but we should also be using the right instruction form.
-
For Windows, when building with armasm, we already filtered these out with gas-preprocessor. By filtering them out in the source itself, we can also build directly with clang for Windows (which still requires wrapping the assembler in gas-preprocessor to convert instructions to thumb form, but gas-preprocessor doesn't and shouldn't filter them out in the clang configuration).
-
This confuses gas-preprocessor, which tries to replace actual st2 instructions with the integer 1 or 2.
-
Only 17 elements are actually used. It was originally padded to 64 bytes to avoid cache line splits in the x86 assembly, but those haven't really been an issue on x86 CPUs made in the past decade or so. Benchmarking shows no performance impact from dropping the padding, so we might as well remove it and save some cache.
-
Some ancient Pentium-M and Core 1 CPUs had slow SSE units, and using MMX was preferable. Nowadays many assembly functions in x264 completely lack MMX implementations, and falling back to C code will likely make things worse. Some misconfigured virtualized systems could sometimes also trigger this code path and cause assertions.
-
* Use the codec parameters API instead of the AVStream codec field.
* Use av_packet_unref() instead of av_free_packet().
* Use the AVFrame pts field instead of pkt_pts.
-
Takes advantage of opmasks to avoid having to use scalar code for the tail. Also make some slight improvements to the checkasm test.
-
Use a different multiplier in order to eliminate some shifts. About 25% faster than before.
-
Anton Mitrofanov authored
It caused a significant quality hit without any meaningful (if any) speedup.
-
Anton Mitrofanov authored
Fixes some thread-safety doubts and makes the code cleaner. Downside: slightly higher memory usage when calling multiple encoders from the same application.
-
Anton Mitrofanov authored
Fix thread safety of x264_threading_init() and the use of X264_PTHREAD_MUTEX_INITIALIZER with win32thread.
-
Anton Mitrofanov authored
Log the result of pkg-config checks to config.log. Fix lavf support detection for the pkg-config fallback case. Fix detection of linking dependency errors for lavf/lsmash/gpac. Cosmetics.
-
Anton Mitrofanov authored
-
Anton Mitrofanov authored
-