- 24 Dec, 2017 3 commits
-
-
5x speed up vs C code.
-
Add 'i_bitdepth' to x264_param_t with the corresponding '--output-depth' CLI option to set the bit depth at runtime. Drop the 'x264_bit_depth' global variable. Rather than hardcoding it to an incorrect value, it's preferable to induce a linking failure. If applications rely on this symbol, this will make it more obvious where the problem is. Add Makefile rules that compile modules with different bit depths. Assembly on x86 is prefixed with the 'private_prefix' define, while all other archs modify their function prefix internally. Templatize the main C library, x86/x86_64 assembly, ARM assembly, AARCH64 assembly, PowerPC assembly, and MIPS assembly. The depth and cache CLI filters depend heavily on the bit depth, so they need to be duplicated for each value. This means renaming these filters and adjusting the callers to use the right version. Unfortunately the threaded input CLI module inherits a common.h dependency (input/frame -> common/threadpool -> common/frame -> common/common) which is extremely complicated to address in a sensible way. Instead, duplicate the module and select the appropriate one at run time. Each bit depth needs different checkasm compilation rules, so split the main checkasm target into two executables.
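As a rough illustration of selecting a per-bit-depth build at run time, here is a minimal sketch. Only the field name `i_bitdepth` comes from the commit message; `sketch_param_t`, the `sketch_8_`/`sketch_10_` prefixes and `sketch_select` are hypothetical stand-ins and not x264's actual API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for x264_param_t; only i_bitdepth is named in the commit. */
typedef struct
{
    int i_bitdepth;
} sketch_param_t;

/* The same routine built twice, as if compiled once per bit depth
 * with a different function prefix (names are assumptions). */
static int sketch_8_pixel_max( void )  { return (1 << 8) - 1; }
static int sketch_10_pixel_max( void ) { return (1 << 10) - 1; }

typedef int (*pixel_max_fn)( void );

/* Pick the implementation matching the requested depth at run time.
 * Returns NULL for a depth that was not built. */
static pixel_max_fn sketch_select( const sketch_param_t *param )
{
    assert( param != NULL );
    switch( param->i_bitdepth )
    {
        case 8:  return sketch_8_pixel_max;
        case 10: return sketch_10_pixel_max;
        default: return NULL;
    }
}
```

A caller would fill in `i_bitdepth`, call `sketch_select`, and treat a NULL result as an unsupported depth.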
-
-
- 19 May, 2017 1 commit
-
-
Vittorio Giovara authored
-
- 23 Jan, 2017 4 commits
-
-
Alexandra Hájková authored
-
Alexandra Hájková authored
-
Alexandra Hájková authored
Also add some missing vector types in ppccommon.h
-
Vittorio Giovara authored
Architecture should always be the last element.
-
- 21 Jan, 2017 1 commit
-
-
Henrik Gramner authored
-
- 18 Jan, 2017 1 commit
-
-
Alexandra Hájková authored
Those functions are currently only used in 8-bit mode and result in warnings in other bit depths.
-
- 01 Dec, 2016 4 commits
-
-
Alexandra Hájková authored
-
Alexandra Hájková authored
-
Alexandra Hájková authored
-
Alexandra Hájková authored
Remove VEC_LOAD*, some of the VEC_STORE* macros, some PREP* macros and the VEC_DIFF_H_OFFSET macro. Make sure the functions do not use deprecated primitives.
-
- 20 Apr, 2016 1 commit
-
-
Anton Mitrofanov authored
-
- 16 Jan, 2016 1 commit
-
-
Henrik Gramner authored
-
- 25 Jul, 2015 1 commit
-
-
Rong Yan authored
-
- 23 Feb, 2015 1 commit
-
-
Anton Mitrofanov authored
-
- 12 Dec, 2014 1 commit
-
-
Anton Mitrofanov authored
Didn't affect output due to the incorrect values either not being used in the code path or producing equal results compared to the correct values. Also deduplicate hpel_ref arrays.
-
- 08 Jan, 2014 1 commit
-
-
Henrik Gramner authored
Also update AUTHORS file and my e-mail address in the headers of various files.
-
- 09 Jan, 2013 1 commit
-
-
Loren Merritt authored
-
- 06 Mar, 2012 1 commit
-
-
Henrik Gramner authored
Some x264 asm assumed that the high 32 bits of registers containing "int" values would be zero. This is almost always the case, and it seems to work with gcc, but it is *not* guaranteed by the ABI. As a result, it breaks with some other compilers, like Clang, that take advantage of this in optimizations. Accordingly, fix all x86 code by using intptr_t instead of int or using movsxd where necessary. Also add a checkasm hack to detect when assembly functions incorrectly assume that 32-bit integers are zero-extended to 64-bit.
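The widening issue can be sketched in plain C. The helper names below are hypothetical and do not appear in x264. The sketch only contrasts sign extension (what movsxd does, and what passing an intptr_t makes the C caller do) with treating the upper 32 bits as zero, using a negative stride as the case where the two disagree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Widen a 32-bit value with sign extension, as movsxd does. */
static int64_t widen_signed( int32_t stride )
{
    return (int64_t)stride;
}

/* What code effectively computes if it assumes the upper 32 bits are zero. */
static int64_t widen_assuming_zero_upper( int32_t stride )
{
    return (int64_t)(uint32_t)stride;
}

/* Hypothetical helper: taking the stride as intptr_t means the value is
 * already pointer-sized when it reaches the callee. */
static const uint8_t *next_row( const uint8_t *row, intptr_t stride )
{
    assert( row != NULL );
    return row + stride;
}
```

For non-negative strides both widenings agree, which is consistent with the commit's remark that the assumption "almost always" held.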
-
- 04 Feb, 2012 1 commit
-
-
Hii authored
-
- 24 Mar, 2011 1 commit
-
-
Manuel Rommel authored
-
- 25 Jan, 2011 1 commit
-
-
Sean McGovern authored
-
- 19 Nov, 2010 1 commit
-
-
Oskar Arvidsson authored
Less verbose.
-
- 31 Oct, 2010 1 commit
-
-
Manuel Rommel authored
-
- 18 Sep, 2010 1 commit
-
-
Fiona Glaser authored
Update dates, improve file descriptions, make things more consistent. Also add information about commercial licensing.
-
- 16 Aug, 2010 1 commit
-
-
Manuel Rommel authored
-
- 15 Jul, 2010 1 commit
-
-
Loren Merritt authored
~1% faster overall on Conroe, mostly due to improved cache locality. Also allows improved SIMD on some chroma functions (e.g. deblock). This change also extends the API to allow direct NV12 input, which should be a bit faster than YV12. This isn't currently used in the x264cli, as swscale does not have fast NV12 conversion routines, but it might be useful for other applications. Note this patch disables the chroma SIMD code for PPC and ARM until new versions are written.
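To make the layout difference concrete: YV12 stores the two chroma planes separately, while NV12 stores a single chroma plane with U and V samples interleaved. A minimal sketch of the interleaving step follows; the function name and signature are assumptions for illustration, not x264's own routine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Interleave separate U and V planes into one UV plane (NV12 chroma order:
 * U first, then V). `n` is the number of samples in each source plane,
 * so dst_uv must have room for 2 * n bytes. */
static void interleave_chroma( uint8_t *dst_uv, const uint8_t *src_u,
                               const uint8_t *src_v, size_t n )
{
    assert( dst_uv != NULL && src_u != NULL && src_v != NULL );
    for( size_t i = 0; i < n; i++ )
    {
        dst_uv[2 * i]     = src_u[i];
        dst_uv[2 * i + 1] = src_v[i];
    }
}
```

With U and V for the same pixel adjacent in memory, one load fetches both components, which fits the cache-locality point made in the message.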
-
- 04 Jul, 2010 1 commit
-
-
Oskar Arvidsson authored
Output bit depth is specified at compile time via --bit-depth. There is currently almost no assembly code available for high-bit-depth modes, so encoding will be very slow. Input is still 8-bit only; this will change in the future. Note that very few H.264 decoders currently support >8 bit depth. Also note that the quantizer scale differs for higher bit depths. For example, for 10-bit, the quantizer (and crf) ranges from 0 to 63 instead of 0 to 51.
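The 10-bit figure above follows from the H.264 rule that the quantizer range grows by 6 for every extra bit of sample depth. A small sketch of that arithmetic, with hypothetical function names:

```c
#include <assert.h>

/* Extra quantizer range for a given bit depth: 6 steps per bit above 8. */
static int qp_bd_offset( int bit_depth )
{
    assert( bit_depth >= 8 );
    return 6 * ( bit_depth - 8 );
}

/* Largest quantizer value for a given bit depth (51 at 8-bit). */
static int qp_max( int bit_depth )
{
    return 51 + qp_bd_offset( bit_depth );
}
```

For 8-bit this gives 51, and for 10-bit it gives 63, matching the ranges quoted in the message.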
-
- 09 Jun, 2010 1 commit
-
-
Henrik Gramner authored
-
- 17 May, 2010 1 commit
-
-
Henrik Gramner authored
-
- 06 May, 2010 1 commit
-
-
Anton Mitrofanov authored
-
- 05 Apr, 2010 1 commit
-
-
Fiona Glaser authored
Convert all applicable loops to use C99 loop index syntax. Clean up most inconsistent syntax in ratecontrol.c, visualize, ppc, etc. Replace log(x)/log(2) constructs with log2, and similar with log10. Fix all -Wshadow violations. Fix visualize support.
-
- 09 Nov, 2009 1 commit
-
-
David Conrad authored
No ARM or PPC assembly yet though.
-
- 23 Aug, 2009 1 commit
-
-
David Conrad authored
Neither GCC nor ARMCC supports 16-byte stack alignment despite the fact that NEON loads require it. These macros only work for arrays, but fortunately that covers almost all instances of stack alignment in x264.
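One common way to get an aligned array on a stack that is not itself sufficiently aligned is to over-allocate and round the start address up. The sketch below assumes that approach; `SKETCH_ALIGNED_ARRAY` and `is_aligned` are hypothetical names and this is not x264's actual macro definition. Because the declared name is a pointer into a larger buffer, the trick only applies to arrays, in line with the limitation the message mentions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical macro: declare `name` as a pointer to `count` elements of
 * `type`, aligned to `align` bytes (a power of two), inside an over-sized
 * local array. The padding covers the worst-case round-up of align - 1 bytes. */
#define SKETCH_ALIGNED_ARRAY( align, type, name, count ) \
    type name##_raw[(count) + ((align) - 1) / sizeof(type) + 1]; \
    type *name = (type *)(((uintptr_t)name##_raw + (align) - 1) \
                          & ~(uintptr_t)((align) - 1))

/* Return 1 if p is aligned to `align` bytes (a power of two). */
static int is_aligned( const void *p, uintptr_t align )
{
    return ( (uintptr_t)p & ( align - 1 ) ) == 0;
}

/* Declare a 16-byte-aligned local array, fill it, and check the result. */
static int demo( void )
{
    SKETCH_ALIGNED_ARRAY( 16, int16_t, coeffs, 64 );
    for( int i = 0; i < 64; i++ )
        coeffs[i] = (int16_t)i;
    return is_aligned( coeffs, 16 ) && coeffs[63] == 63;
}
```

The cost is a few wasted bytes per array and one extra pointer computation, which is usually negligible next to the work done on the array.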
-
- 20 Jun, 2009 1 commit
-
-
David Wolstencroft authored
-
- 08 Feb, 2009 1 commit
-
-
Manuel Rommel authored
Also put width == 2 variant in its own scalar function because it's faster than a vectorized one.
-
- 26 Jan, 2009 1 commit
-
-
Guillaume Poirier authored
-