- 24 Dec, 2017 36 commits
-
-
The assembler (both gas and clang/llvm) automatically fixes this, but armasm64 doesn't. We can fix it in gas-preprocessor, but we should also be using the right instruction form.
-
-
For Windows, when building with armasm, we already filtered these out with gas-preprocessor. By filtering them out already in the source, we can also build directly with clang for Windows (which also requires wrapping the assembler in gas-preprocessor for converting instructions to thumb form, but gas-preprocessor doesn't and shouldn't filter them out in the clang configuration).
-
This confuses gas-preprocessor, which tries to replace actual st2 instructions by the integer 1 or 2.
-
Only 17 elements are actually used. It was originally padded to 64 bytes to avoid cache line splits in the x86 assembly, but those haven't really been an issue on x86 CPUs made in the past decade or so. Benchmarking shows no performance impact from dropping the padding, so we might as well remove it and save some cache.
-
Some ancient Pentium-M and Core 1 CPUs had slow SSE units, and using MMX was preferable. Nowadays many assembly functions in x264 completely lack MMX implementations, and falling back to C code will likely make things worse. Some misconfigured virtualized systems could sometimes also trigger this code path and cause assertions.
-
-
* Use the codec parameters API instead of the AVStream codec field.
* Use av_packet_unref() instead of av_free_packet().
* Use the AVFrame pts field instead of pkt_pts.
-
-
-
Takes advantage of opmasks to avoid having to use scalar code for the tail. Also make some slight improvements to the checkasm test.
-
Use a different multiplier in order to eliminate some shifts. About 25% faster than before.
-
Anton Mitrofanov authored
It caused a significant quality hit without any meaningful (if any) speedup.
-
Anton Mitrofanov authored
Fixes some thread safety doubts and makes code cleaner. Downside: slightly higher memory usage when calling multiple encoders from the same application.
-
Anton Mitrofanov authored
Fix thread safety of x264_threading_init() and use of X264_PTHREAD_MUTEX_INITIALIZER with win32thread
-
Anton Mitrofanov authored
Log result of pkg-config checks to config.log. Fix lavf support detection for pkg-config fallback case. Fix detection of linking dependencies errors for lavf/lsmash/gpac. Cosmetics.
-
Anton Mitrofanov authored
-
Anton Mitrofanov authored
-
-
-
Add 'i_bitdepth' to x264_param_t with the corresponding '--output-depth' CLI option to set the bit depth at runtime.

Drop the 'x264_bit_depth' global variable. Rather than hardcoding it to an incorrect value, it's preferable to induce a linking failure. If applications rely on this symbol, this will make it more obvious where the problem is.

Add Makefile rules that compile modules with different bit depths. Assembly on x86 is prefixed with the 'private_prefix' define, while all other archs modify their function prefix internally.

Templatize the main C library, x86/x86_64 assembly, ARM assembly, AARCH64 assembly, PowerPC assembly, and MIPS assembly.

The depth and cache CLI filters heavily depend on bit depth size, so they need to be duplicated for each value. This means having to rename these filters and adjust the callers to use the right version.

Unfortunately the threaded input CLI module inherits a common.h dependency (input/frame -> common/threadpool -> common/frame -> common/common) which is extremely complicated to address in a sensible way. Instead, duplicate the module and select the appropriate one at run time.

Each bit depth needs different checkasm compilation rules, so split the main checkasm target into two executables.
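The idea of building the same code once per bit depth and choosing between the copies at run time can be sketched roughly as follows. This is only an illustration: `sum_pixels_8`, `sum_pixels_10`, `select_sum_pixels` and `sum_pixels` are hypothetical names, not functions from the x264 source, and a real build would generate the per-depth variants from one source file via compile-time defines and symbol prefixes rather than writing them by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; this is not the actual x264 source. */

/* Stand-ins for one routine compiled once per bit depth. */
static int sum_pixels_8(const void *src, size_t n)
{
    const uint8_t *p = src;   /* 8-bit samples: one byte each */
    int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += p[i];
    return sum;
}

static int sum_pixels_10(const void *src, size_t n)
{
    const uint16_t *p = src;  /* 10-bit samples stored in 16-bit words */
    int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += p[i];
    return sum;
}

typedef int (*sum_pixels_fn)(const void *src, size_t n);

/* Pick the implementation from a runtime bit depth value (playing the
 * role that i_bitdepth plays as a parameter). NULL means unsupported. */
sum_pixels_fn select_sum_pixels(int bit_depth)
{
    switch (bit_depth) {
    case 8:  return sum_pixels_8;
    case 10: return sum_pixels_10;
    default: return NULL;
    }
}

/* Convenience wrapper that insists on a supported depth. */
int sum_pixels(int bit_depth, const void *src, size_t n)
{
    sum_pixels_fn fn = select_sum_pixels(bit_depth);
    assert(fn != NULL);
    return fn(src, n);
}
```

The selection happens once per call site instead of the depth being baked in at compile time, which is why callers in this sketch pass untyped sample buffers and let the chosen variant interpret them.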
-
qp is modified to require a valid value before use, while qp_max is set to maximum allowable value (and clipped later on). This is needed so that param functions do not depend on bit depth size.
-
-
-
-
Anton Mitrofanov authored
-
Also drop the x264 prefix from all static cabac arrays.
-
-
Use dword instead of qword entries. This cuts the size of the tables in half, which allows each table to fit inside a single cache line. When PIC is disabled, dwords are enough to store absolute addresses. When PIC is enabled, we can store dword offsets relative to the start of the table and simply add the address of the table to the offset in order to calculate the full address. This approach also has the advantage of eliminating a whole bunch of run-time .data relocations.
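The PIC case described above (32-bit offsets relative to the start of the table, resolved by adding the table address) can be illustrated in C. This is a sketch under assumptions, not the assembly from the commit: `struct blob`, `data` and `lookup` are hypothetical, and the table is placed first in the struct so that offsets from the struct start are also offsets from the table start.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: a table of 32-bit offsets followed by the
 * payloads it refers to. Offsets are measured from the table start. */
struct blob {
    int32_t table[3];  /* dword entries: 12 bytes in total */
    char    first[8];
    char    second[8];
    char    third[8];
};

/* offsetof() yields compile-time constants, so the table holds no
 * absolute addresses and needs no run-time relocation. */
static const struct blob data = {
    .table = {
        (int32_t)offsetof(struct blob, first),
        (int32_t)offsetof(struct blob, second),
        (int32_t)offsetof(struct blob, third),
    },
    .first  = "alpha",
    .second = "beta",
    .third  = "gamma",
};

/* Resolve entry `index` by adding the stored offset to the table
 * address. The table is the first member, so its address equals the
 * address of the enclosing struct. */
const char *lookup(size_t index)
{
    assert(index < 3);
    const char *base = (const char *)&data;
    return base + data.table[index];
}
```

A table of three pointers would take 24 bytes on a 64-bit target and require one relocation per entry in position-independent code; the offset form takes 12 bytes and the only address computed at run time is the base.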
-
On ELF platforms such symbols need to be flagged as functions with the correct visibility to please certain linkers in some scenarios.
-
The standard section for read-only data on Windows is .rdata. Nasm will flag non-standard sections as executable by default which isn't ideal.
-
-
There are 32 pseudo-instructions for each floating-point comparison instruction, but only 8 of them are actually valid in legacy-encoded mode. The remaining 24 require the use of VEX-encoded (v-prefixed) instructions and can therefore be disregarded for this purpose.
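For reference, the eight predicates that the legacy (non-VEX) SSE comparison encoding accepts correspond to immediate values 0 through 7. The lookup below is a small sketch of that mapping; `legacy_predicates` and `legacy_cmp_imm` are hypothetical helpers written for illustration and do not come from the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Predicate suffixes encodable in the legacy SSE form, indexed by
 * their immediate value (0..7). */
static const char *const legacy_predicates[8] = {
    "eq", "lt", "le", "unord", "neq", "nlt", "nle", "ord"
};

/* Hypothetical helper: return the immediate for a predicate suffix,
 * or -1 if the suffix is not one of the eight legacy predicates. */
int legacy_cmp_imm(const char *suffix)
{
    assert(suffix != NULL);
    for (int i = 0; i < 8; i++)
        if (strcmp(suffix, legacy_predicates[i]) == 0)
            return i;
    return -1;
}
```

Suffixes such as `eq_uq` belong to the extended set that needs a VEX-encoded instruction, so this helper reports them as not available.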
-
-
Anton Mitrofanov authored
Fix MSVS fprofiled build for win64
-
Anton Mitrofanov authored
-
- 11 Aug, 2017 1 commit
-
-
Henrik Gramner authored
Some cpuflags would previously be displayed incorrectly when running older operating systems without AVX support on modern CPUs.
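As background, whether AVX can actually be used depends on both the CPU and the operating system: the CPU must report AVX and OSXSAVE in CPUID leaf 1 (ECX bits 28 and 27), and the OS must have enabled SSE and AVX state saving in XCR0 (bits 1 and 2). The function below is a minimal sketch of that check on raw register values. It is not the code changed by this commit, and `avx_usable` and the macro names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* CPUID leaf 1, ECX feature bits. */
#define CPUID1_ECX_OSXSAVE (1u << 27)  /* OS has enabled XGETBV/XSETBV */
#define CPUID1_ECX_AVX     (1u << 28)  /* CPU implements AVX */

/* XCR0 state-component bits. */
#define XCR0_SSE_STATE (1u << 1)  /* XMM registers saved by the OS */
#define XCR0_AVX_STATE (1u << 2)  /* upper YMM halves saved by the OS */

/* Hypothetical helper: returns 1 only if the CPU reports AVX and the
 * OS saves the required register state. When OSXSAVE is clear, XCR0
 * cannot be read, so the caller may pass 0 for xcr0. */
int avx_usable(uint32_t cpuid1_ecx, uint64_t xcr0)
{
    const uint32_t need_cpu = CPUID1_ECX_OSXSAVE | CPUID1_ECX_AVX;
    const uint64_t need_os  = XCR0_SSE_STATE | XCR0_AVX_STATE;

    if ((cpuid1_ecx & need_cpu) != need_cpu)
        return 0;
    return (xcr0 & need_os) == need_os;
}
```

Flags for extensions that build on AVX would be gated on the same result, so an older OS on a modern CPU ends up reporting only what it can really run.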
-
- 26 Jun, 2017 3 commits
-
-
Henrik Gramner authored
-
Henrik Gramner authored
-
Henrik Gramner authored
-