  1. Aug 02, 2021
    • cpu: micro-optimise feature dumping · d9e79378
      Lyndon Brown authored
      the feature-specific macros each make a call to `vlc_CPU()` which involves
      making an atomic read of a global variable, then ANDing the return with a
      specific feature flag and comparing with zero. rather than do that over and
      over pointlessly, just get the flag set once with a direct call to
      `vlc_CPU()` and then directly compare with the individual feature flags.
      
      since we've removed the build-time conditional definition of the feature
      test macros (which defined them as just returning `1` when matching a
      build target), this produces no difference in results. had we not removed
      that, the dumped results would not have reflected build-time
      force-enabled features.
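
a minimal sketch of the pattern described above: read the flag set once, then test individual bits against the local copy. `demo_cpu()`, `demo_cpu_flags` and the `DEMO_CPU_*` values are hypothetical stand-ins, not VLC's actual definitions.

```c
#include <stdatomic.h>
#include <stdio.h>

/* Hypothetical flag values; the real VLC_CPU_* constants may differ. */
#define DEMO_CPU_SSE2 (1u << 0)
#define DEMO_CPU_AVX  (1u << 1)
#define DEMO_CPU_AVX2 (1u << 2)

/* Stand-in for the global capability mask that vlc_CPU() reads. */
static atomic_uint demo_cpu_flags;

/* Stand-in for vlc_CPU(): a single atomic load of the global mask. */
static unsigned demo_cpu(void)
{
    return atomic_load_explicit(&demo_cpu_flags, memory_order_relaxed);
}

/* Writes the names of the enabled features into buf and returns how
 * many features were found. The mask is fetched exactly once. */
static int demo_dump(char *buf, size_t len)
{
    const unsigned flags = demo_cpu();
    const int sse2 = (flags & DEMO_CPU_SSE2) != 0;
    const int avx  = (flags & DEMO_CPU_AVX) != 0;
    const int avx2 = (flags & DEMO_CPU_AVX2) != 0;

    snprintf(buf, len, "%s%s%s",
             sse2 ? "SSE2 " : "", avx ? "AVX " : "", avx2 ? "AVX2 " : "");
    return sse2 + avx + avx2;
}
```

the per-feature macros would each repeat the atomic load; here the cost is paid once per dump.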
    • cpu: remove build-time influence on feature test macros · 435a8a48
      Lyndon Brown authored
      the `vlc_CPU_SSE2()` type macros are provided for run-time feature
      detection. build-time target settings should have no influence on these,
      except perhaps in rare cases where we happen to know that a feature is
      definitely available and wish to force enable it in the build for that
      platform.
      
      as an example, windows 8+ requires SSE2 (apparently), and so if we only
      want to support windows 8+ for our windows builds, then we could force
      enable SSE2 via use of the `-msse2` compile flag.
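
to illustrate the distinction, here is a hedged sketch of a run-time test next to a build-time override. compilers such as GCC and Clang define `__SSE2__` when SSE2 is enabled for the target (for example via `-msse2`). the `demo_*` and `DEMO_*` names are hypothetical and only approximate the shape of the macros being removed.

```c
/* Hypothetical flag, standing in for VLC's real one. */
#define DEMO_CPU_SSE2 (1u << 0)

/* Hypothetical mask that run-time detection would fill in. */
static unsigned demo_detected_flags;

/* Run-time test: depends only on what was detected on this machine. */
#define demo_runtime_SSE2() ((demo_detected_flags & DEMO_CPU_SSE2) != 0)

/* Build-time override: when the target already guarantees SSE2, the
 * test collapses to a constant and ignores run-time detection. */
#if defined(__SSE2__)
# define demo_forced_SSE2() (1)
#else
# define demo_forced_SSE2() demo_runtime_SSE2()
#endif
```

with the override in place, `demo_forced_SSE2()` can report a feature that the detected flag set does not contain, which is the kind of mismatch this commit avoids.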
      
      in general though, i'm not aware that we ever do such a thing. i'm not
      sure we ever will want to, and we can easily hack in the necessary change
      if that time comes. it also risks mistakes being made, and as things
      were, this was just adding a lot of unnecessary mess.
      
      i find it questionable whether we even need to conditionally define the
      `VLC_SSE` type defines, but at least there's only a couple of those.
      
      with the arm stuff:
       - note that while linux specific feature detection works, we are missing
         detection for other platforms.
       - `HAVE_FPU` is build-time dependent, which possibly needs replacing with
         runtime detection?
    • cpu: expand x86/x86_64 feature coverage · ad426355
      Lyndon Brown authored
       - added detection for: FMA3, BMI1, BMI2, POPCNT, LZCNT, AVX512.
       - added missing detection on non-linux for: AVX, AVX2, FMA4, XOP, SSE4a.
         (with thanks to libx264 in combination with intel docs).
       - fixed broken detection of fma3 on linux - the string checked for needs
         to be "fma" not "fma3" at least on my system.
      
      in the second case, these were already detected on linux via the
      alternative `/proc/cpuinfo` based feature detection, but were not handled
      by the generic solution used on other platforms.
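
the fma3 fix in the list above comes down to matching the right token in the `flags` line of `/proc/cpuinfo`. a minimal sketch of whole-token matching follows; `demo_has_flag()` is a hypothetical helper and not the parser VLC actually uses.

```c
#include <stdbool.h>
#include <string.h>

/* Returns true if `name` appears as a whole whitespace-separated token
 * in `flags`, so that "fma" does not match inside "fma4". */
static bool demo_has_flag(const char *flags, const char *name)
{
    const size_t n = strlen(name);
    if (n == 0)
        return false;

    for (const char *p = flags; (p = strstr(p, name)) != NULL; p += n)
    {
        const bool starts = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
        const bool ends = p[n] == '\0' || p[n] == ' ' || p[n] == '\n';
        if (starts && ends)
            return true;
    }
    return false;
}
```

searching for "fma3" in a line that only lists "fma" finds nothing, which matches the broken behaviour the commit describes.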
      
      an xor operation is performed upon ecx to zero it before calling cpuid since
      when calling cpuid with eax=7 to get extended features, we need to also
      have ecx=0 to get the right data set for our needs, as detailed on
      wikipedia ([1]).
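
a hedged sketch of the same query using the `__get_cpuid_count()` helper from the `<cpuid.h>` header shipped with GCC and Clang, which sets ecx to the requested sub-leaf instead of relying on a hand-written xor. the `demo_*` names are hypothetical, and this is not the inline assembly used in the commit. AVX2 is reported in bit 5 of ebx for leaf 7, sub-leaf 0.

```c
#include <stdbool.h>

#if defined(__i386__) || defined(__x86_64__)
# include <cpuid.h> /* GCC/Clang helper header */
#endif

/* Bit 5 of EBX in leaf 7, sub-leaf 0, reports AVX2. */
#define DEMO_LEAF7_EBX_AVX2 (1u << 5)

/* Reads EBX of CPUID leaf 7, sub-leaf 0. Returns false if the leaf is
 * not available or the target is not x86. */
static bool demo_cpuid_leaf7_ebx(unsigned *ebx_out)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned eax, ebx, ecx, edx;

    /* The second argument is the sub-leaf, loaded into ECX (0 here). */
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        return false;
    *ebx_out = ebx;
    return true;
#else
    (void) ebx_out;
    return false;
#endif
}

/* True if the CPU itself reports AVX2. OS support for saving the wider
 * registers is a separate check that this sketch does not perform. */
static bool demo_cpu_reports_avx2(void)
{
    unsigned ebx;
    return demo_cpuid_leaf7_ebx(&ebx) && (ebx & DEMO_LEAF7_EBX_AVX2) != 0;
}
```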
      
      the added AVX2 detection will enable my AVX2 chroma plugin to work on
      non-linux systems as well as linux. the rest i have no plans for, but we
      already have ones that are unused so why not expand coverage.
      
      i removed the obsolete `i_` prefix from the integer variables to avoid
      having to use it in all of the new code added.
      
      the AVX512 detection looks for a specific subset of available AVX512 flags
      (there are several). i've simply copied that used in libx264.
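
the commit does not list which flags make up the subset. as an illustration only, the sketch below assumes it is F, DQ, CD, BW and VL, using the bit positions Intel documents for ebx of leaf 7, sub-leaf 0. the `DEMO_*` names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID.(EAX=7,ECX=0):EBX bit positions for some AVX512 features. */
#define DEMO_AVX512F  (UINT32_C(1) << 16)
#define DEMO_AVX512DQ (UINT32_C(1) << 17)
#define DEMO_AVX512CD (UINT32_C(1) << 28)
#define DEMO_AVX512BW (UINT32_C(1) << 30)
#define DEMO_AVX512VL (UINT32_C(1) << 31)

/* Assumed subset; the real choice is whatever was copied from libx264. */
#define DEMO_AVX512_SUBSET \
    (DEMO_AVX512F | DEMO_AVX512DQ | DEMO_AVX512CD | \
     DEMO_AVX512BW | DEMO_AVX512VL)

/* True only if every flag in the subset is set in the given EBX value. */
static bool demo_has_avx512_subset(uint32_t leaf7_ebx)
{
    return (leaf7_ebx & DEMO_AVX512_SUBSET) == DEMO_AVX512_SUBSET;
}
```

requiring the whole subset means a CPU with only AVX512F is not reported as having AVX512.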
      
      i don't expect `__AVX512__` will actually work, but it'll do for now.
      
      [1]: https://en.wikipedia.org/wiki/CPUID
    • cpu: remove obsolete SSE OS-support test · 04fad74a
      Lyndon Brown authored
      to use SSE you need both a CPU with the SSE feature and an OS that has
      support for saving the SSE registers during context switches. whilst
      detecting SSE CPU support is easy via the `cpuid` instruction and checking
      flags, it is not easy to check for OS support which involves trying to use
      an SSE instruction in a process fork, and seeing whether or not this
      results in SIGILL.
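
for context, a fork-based probe of this kind could look roughly like the POSIX sketch below. `demo_probe_in_child()` and the two probe functions are hypothetical; a real probe would execute an SSE instruction, whereas `demo_probe_sigill()` only simulates the failure by raising SIGILL.

```c
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs probe() in a child process. Returns true if the child exited
 * normally with status 0, false if it was killed (e.g. by SIGILL). */
static bool demo_probe_in_child(void (*probe)(void))
{
    pid_t pid = fork();
    if (pid == -1)
        return false;

    if (pid == 0)
    {   /* Child: make sure SIGILL terminates us, then run the probe. */
        signal(SIGILL, SIG_DFL);
        probe();
        _exit(0);
    }

    int status;
    pid_t r;
    do
        r = waitpid(pid, &status, 0);
    while (r == -1 && errno == EINTR);

    return r == pid && WIFEXITED(status) && WEXITSTATUS(status) == 0;
}

/* Probe that succeeds: stands in for an instruction that is supported. */
static void demo_probe_ok(void)
{
}

/* Probe that fails: simulates an unsupported instruction. */
static void demo_probe_sigill(void)
{
    raise(SIGILL);
}
```

the cost of a fork and wait at start-up is part of why dropping the check is attractive once OS support can be assumed.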
      
      (thankfully cpu designers made things easier for AVX & AVX512).
      
      since SSE support has been around since 1999, and all operating systems
      from that time are long since unsupported, we should be able to safely
      assume now that SSE OS support is available and thus remove this check.
      
      note, the `vlc_CPU_check()` function is now only used for ppc and only
      with `CAN_COMPILE_ALTIVEC` so i've adjusted the preprocessor conditions
      accordingly.
    • cpu: purge obsolete cpuid instruction support check · 9a6c1bcf
      Lyndon Brown authored
      the `cpuid` instruction was introduced back in 1993 (according to
      wikipedia). we can thus safely assume that this exists on all platforms we
      care about.
      
      this also removes the obsolete ability to bypass the level=0 check for
      certain very old cpus.
    • cpu: purge MMX/MMXEXT · 321afe4f
      Lyndon Brown authored
      notes:
       - the `b_amd` detection was only used in connection to MMXEXT so was also
         removed.
       - the `goto out`s were replaced with `return 0` to avoid a warning that
         can now occur (as i experienced).
    • postproc: update SIMD variant selection · 236432b1
      Lyndon Brown authored
      the cpu SIMD selection code removed here dates from a time when vlc had
      options for disabling use of select SIMD variants, from before postproc
      added cpu auto-detection ([1]), and from before postproc seems to have
      added SSE2 ([2] and [3]).
      
      we are purging MMX/MMXEXT from vlc v4.0-dev, and thus have an interest in
      removing the corresponding MMX/MMXEXT bits here. rather than just removing
      those lines, alongside adding an entry for SSE2 though, i have instead
      chosen to convert the code to use auto-detection, which avoids having to
      keep the block of code explicitly enabling implementations in sync with the
      set of implementations available.
      
      note, the version of postproc in contribs is very old, pre-dating the
      SSE2 and CPU feature auto-detection enhancements. accordingly i have had to
      ensure that we define `PP_CPU_CAPS_AUTO` ourselves when not found, as had
      been done for `PP_CPU_CAPS_ALTIVEC`. effectively, for users like myself on
      linux with a new enough version, the auto-detection will work correctly and
      now make use of SSE2, which we were ignoring before; while where the
      contrib package is used, its use will fall back to the C implementation
      until such time that the contrib gets updated.
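
a sketch of the fallback define described above. the value `0` is an assumption, chosen because the text says the old contrib falls back to the C implementation; `demo_pp_cpu_flags()` is a hypothetical helper.

```c
/* In a real build the postproc header would be included before this
 * point, and newer versions of it would already define the macro. */

#ifndef PP_CPU_CAPS_AUTO
# define PP_CPU_CAPS_AUTO 0 /* assumed: no SIMD caps, plain C code */
#endif

/* CPU capability flags to request when creating a postproc context. */
static unsigned demo_pp_cpu_flags(void)
{
    return PP_CPU_CAPS_AUTO;
}
```

with a new enough postproc the macro comes from its header and requests auto-detection; otherwise the fallback requests no SIMD variants at all.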
      
      [1]: https://github.com/FFmpeg/FFmpeg/commit/59d686f100863d00b8f171dd891e893c2bfd951e
      [2]: https://github.com/FFmpeg/FFmpeg/commit/4e264d1c79cfae8c3e05aacf77e350ed1b6d7e4b
      [3]: https://github.com/FFmpeg/FFmpeg/commit/f48cddfe4cf04e2d6e802d12e973301ff5a1a9a8
    • qt/ColorSchemeModel: fix function call · f7c98371
      Prince Gupta authored and Hugo Beauzée-Luyssen committed
  2. Aug 01, 2021
  3. Jul 31, 2021
  4. Jul 30, 2021