Commits on Source 28
-
Henrik Gramner authored
Most VEX-encoded instructions require an additional byte to encode when src2 is a high register (e.g. x|ymm8..15). If the instruction is commutative we can swap src1 and src2 when doing so reduces the instruction length, e.g.
    vpaddw xmm0, xmm0, xmm8 -> vpaddw xmm0, xmm8, xmm0
-
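As a rough illustration of the operand swap described in the VEX commit message above (the one with the vpaddw example), the following Python sketch models only the prefix-length decision. It assumes register-only operands, an instruction in the 0F opcode map, and W=0; the function names are hypothetical and do not come from x86inc.

```python
# Illustrative model only, not the x86inc macro logic.
# A VEX prefix is either 2 bytes (C5) or 3 bytes (C4). The 2-byte form has
# no VEX.B bit, so a register 8..15 in ModRM.rm (src2) forces the 3-byte form.
# dst (ModRM.reg via VEX.R) and src1 (VEX.vvvv) can be 8..15 in either form.

def vex_prefix_len(src2_reg):
    """Return the VEX prefix length in bytes for a given src2 register number."""
    return 3 if src2_reg >= 8 else 2


def order_operands(src1, src2, commutative):
    """Swap src1 and src2 when that shortens a commutative instruction."""
    # Putting src1 in the src2 slot only helps if src1 is a low register
    # while src2 is a high one.
    if commutative and vex_prefix_len(src1) < vex_prefix_len(src2):
        return src2, src1
    return src1, src2
```

With this model, `order_operands(0, 8, True)` returns `(8, 0)`, which corresponds to rewriting `vpaddw xmm0, xmm0, xmm8` as `vpaddw xmm0, xmm8, xmm0`. A non-commutative instruction keeps its operand order.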
Henrik Gramner authored
Use register numbers instead of copying the full register names. This makes it possible to change register widths in the middle of a function and keep the mmreg permutations intact which can be useful for code that only needs larger vectors for parts of the function in combination with macros etc.
Also change the LOAD_MM_PERMUTATION macro to use the same default name as the SAVE macro. This simplifies swapping from ymm to xmm registers or vice versa:
    SAVE_MM_PERMUTATION
    INIT_XMM <cpuflags>
    LOAD_MM_PERMUTATION
-
* Coalesce some install recipe lines
* Remove empty addition of GPLed filters
* Install libdir in recipes that directly require it
* Coalesce etags/TAGS rules
* Simplify fprofiled rule
-
Virtually zero increase in compression efficiency compared to 4:2:0 with empty chroma planes. Performance is better though, especially with fast settings.
-
Henrik Gramner authored
Avoid the scalar C wrapper by utilizing opmasks to prevent overreading the input buffer.
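The idea of using an opmask to avoid overreading can be sketched in Python as follows. This is only a model of a zero-masking vector load, not the assembly from this commit; `tail_mask`, `masked_load`, and the lane count used in the usage note are assumptions chosen for illustration.

```python
# Illustrative model of a masked vector load; not the actual x264 assembly.

def tail_mask(remaining, lanes):
    """Bitmask with one bit per lane, set only for lanes that hold valid input."""
    n = max(0, min(remaining, lanes))
    return (1 << n) - 1


def masked_load(buf, offset, mask, lanes):
    """Emulate a zero-masking load: lanes whose mask bit is clear are
    never read from buf and are returned as 0."""
    return [buf[offset + i] if (mask >> i) & 1 else 0 for i in range(lanes)]
```

For a 19-element buffer processed 16 lanes at a time, the second load starts at offset 16 with only 3 elements left. `tail_mask(3, 16)` gives `0b111`, so `masked_load` touches only indices 16 to 18 and no scalar fallback is needed for the tail.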
-
Place it immediately after "static".
-
It was broken in "Drop the x264 prefix" patch.
-
-
Henrik Gramner authored
-
-
-
-
-
Increases overall encoding speed on POWER9 by 8%.
-
1) using xxpermdi + merge instead of 2 merges improves quant_8x8 performance by 5%
2) use vec_splats instead of vec_splat

checkasm timings when compiled with gcc:
                     C:    AltiVec:
                           before:  after:
quant_2x2_dc:        57    163      46
quant_4x4_dc:        141   162      57
dequant_4x4_cmp:     104   101      45
dequant_4x4_flat:    104   106      46
dequant_8x8_cmp:     412   208      147
dequant_8x8_flat:    414   212      149
-
-
Henrik Gramner authored
-
Bug reported by Nicolas Gaullier
-
Bug report by Koby Shina.
-
--trellis 0 was missed for it during 8-bit and 10-bit unification. Bug report by Aleksey Vasenev.
-
Bug report by Dirk Fieldhouse.
-
Henrik Gramner authored
Also fix the string parsing in param_apply_tune() to correctly compare the entire string, not just the first N characters.
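One way a prefix-only comparison can go wrong is sketched below in Python. This is not the code from param_apply_tune(); the helper names and the example strings are hypothetical and only show the difference between comparing the first N characters and comparing the entire string.

```python
# Illustrative only: contrasts a prefix comparison with a full comparison.

def match_prefix(token, name):
    """Buggy style: only the first len(name) characters are compared,
    so trailing garbage after a valid name is silently accepted."""
    return token[:len(name)] == name


def match_full(token, name):
    """Fixed style: the entire token must equal the name."""
    return token == name
```

Here `match_prefix("filmXYZ", "film")` is true while `match_full("filmXYZ", "film")` is false, so only the full comparison rejects the malformed value.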
-
-
-
Bug report by Daniel Deptford.
-
Ensures that access is atomic and that other threads see the actual value of the variable.
-
Also check that CQP mode is not used with 2-pass.
-
Henrik Gramner authored