1. 26 Oct, 2020 1 commit
  2. 21 Oct, 2020 2 commits
  3. 12 Sep, 2020 2 commits
    • glslang: explicitly mark parameterless functions as void · 44525d4b
      Niklas Haas authored
      Something something, empty function lists get interpreted as "unknown
      signature" rather than "void signature". Dunno if this is still relevant
      for C++, but may as well get into the habit.
      44525d4b
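The distinction this commit message alludes to can be sketched in plain C. Before C23, an empty parameter list leaves the signature unspecified, while `(void)` declares a function that takes no arguments. The function names below are hypothetical and only serve as illustration.

```c
/* Hypothetical names, for illustration only. */

/* Pre-C23 C: an empty parameter list says nothing about the arguments,
 * so the compiler is not required to reject a call that passes some. */
int get_answer_unspecified();

/* Explicit `void`: the function takes no arguments, and a call such as
 * get_answer_checked(1) is rejected at compile time. */
int get_answer_checked(void);

int get_answer_unspecified() { return 42; }
int get_answer_checked(void) { return 42; }
```

In C++ (and in C23), both spellings mean "no parameters", which matches the commit's remark that the habit may no longer matter there.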
    • opengl: disable software rasterizers by default · 497239f7
      Niklas Haas authored
      Mirroring logic from mpv's vo_gpu. Motivated by the fact that many
      GPU-less build environments have barely functioning or broken software
      rasterizer fallbacks. The status of this bool is controlled explicitly
      for the CI by the new `CI_ALLOW_SW` define, to allow gitlab tests to
      continue using (new enough versions of) llvmpipe.
      
      Closes #109
      497239f7
  4. 23 Jul, 2020 2 commits
    • vulkan/malloc: fix memory type mask check · 5c6ad850
      Niklas Haas authored
      Treating 0x0 as "all types allowed" was a very terrible default, because
      it completely masked the failure case of there actually being no types
      allowed. I have no idea what the justification for this was, but it
      doesn't make sense anyway.
      5c6ad850
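A minimal sketch of the difference, assuming a 32-bit mask with one bit per memory type. Both helper names are hypothetical and do not come from the libplacebo source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if memory type `index` is permitted by
 * `type_mask`. A mask of 0 permits nothing, so the "no types allowed"
 * case surfaces as a failure. */
bool mem_type_allowed(uint32_t type_mask, unsigned index)
{
    if (index >= 32)
        return false;
    return (type_mask & (UINT32_C(1) << index)) != 0;
}

/* The kind of default the commit describes removing: 0 silently means
 * "everything is allowed", which hides the failure case. */
bool mem_type_allowed_legacy(uint32_t type_mask, unsigned index)
{
    return type_mask == 0 || mem_type_allowed(type_mask, index);
}
```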
    • vulkan/malloc: explicitly clear vk_memslice on error · 05abc109
      Niklas Haas authored
      Leaving this struct partially initialized on the failure path caused a
      double-free when the error-handling code in pl_buf_create tried freeing
      this slab a second time (via pl_buf_deref).
      
      As an aside, also make sure we `goto error` on every possible failure
      path, rather than `return false`.
      05abc109
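The pattern described here, a single `error` label that releases resources and zeroes the output struct, might look roughly like this. The `slice` type and both functions are hypothetical stand-ins, not the actual vk_memslice code.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for an allocation handle. */
struct slice {
    void *mem;
    size_t size;
};

/* `fail_late` simulates a failure after the memory was already obtained. */
bool slice_alloc(struct slice *out, size_t size, bool fail_late)
{
    *out = (struct slice) {0};
    out->mem = malloc(size);
    if (!out->mem)
        goto error;
    out->size = size;
    if (fail_late)
        goto error;
    return true;

error:
    /* Release whatever was acquired, then clear the struct so that a
     * later slice_free() by the caller cannot free it a second time. */
    free(out->mem);
    *out = (struct slice) {0};
    return false;
}

void slice_free(struct slice *s)
{
    free(s->mem); /* free(NULL) is a no-op */
    *s = (struct slice) {0};
}
```

Routing every failure through the same label keeps the cleanup in one place, which is the point of preferring `goto error` over scattered `return false` statements.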
  5. 22 Jul, 2020 7 commits
    • shaders: revise sh_lut method logic · ffd4f666
      Niklas Haas authored
      This is required to support GLSL ES 1.0 and GLSL 110, which forbid the
      use of literal arrays in shaders. Since SH_LUT_LITERAL is no longer
      a safe fallback, we instead always fall back to SH_LUT_UNIFORM.
      
      This is technically an API break, since in the past, the naked pl_shader
      API would always generate literal shaders, but now they may have arrays
      attached as uniforms - to prevent this, users can still set small LUT
      sizes (which is what e.g. VLC does anyway)
      ffd4f666
    • shaders: prefer SH_LUT_LITERAL for small linear LUTs · 44a80e80
      Niklas Haas authored
      This pretty much only really affects the polar sampling code, which uses
      a small linear LUT. I found that the performance gain depends on whether
      or not we're using compute shaders, with the non-compute shader path
      being the only one to really benefit from this change.
      44a80e80
    • shaders: remove SH_LUT_LINEAR, make a bool instead · b4d96813
      Niklas Haas authored
      This is done by providing fallback code to linearly interpolate between
      array values on the GPU. The motivation is not just a concern of
      semantics/correctness but, more importantly, that doing so might
      actually be faster than going through a texture sample for small LUTs.
      b4d96813
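The commit generates this interpolation as shader code. As a rough CPU-side sketch of the same arithmetic, assuming a normalized lookup coordinate in [0, 1] and a hypothetical function name:

```c
#include <stddef.h>

/* Sample a 1D LUT of `size` entries (size >= 1) at normalized position
 * x in [0, 1], linearly interpolating between the two nearest entries. */
float lut_sample_linear(const float *lut, size_t size, float x)
{
    if (size == 1)
        return lut[0];
    if (x < 0.0f) x = 0.0f;
    if (x > 1.0f) x = 1.0f;

    float pos = x * (float) (size - 1);
    size_t i0 = (size_t) pos;
    if (i0 > size - 2)
        i0 = size - 2; /* keep i0 + 1 in bounds when x == 1 */
    float frac = pos - (float) i0;

    return lut[i0] * (1.0f - frac) + lut[i0 + 1] * frac;
}
```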
    • gpu: use host-imported pointers for pl_tex_download_pbo · 2e04963f
      Niklas Haas authored
      This allows such tex transfers to avoid an extra memcpy in most cases,
      except where the pointer happens to be horrifically misaligned with
      respect to the texel size - but in these cases, the alignment-fixing
      memcpy will happen inside VRAM (PL_BUF_MEM_DEVICE), which should still
      be faster than doing an extra memcpy in RAM.
      
      Also, I realized it makes no sense to have tex_download_pbo use a buffer
      pool at all, because it's synchronous anyway - there can only ever be
      one buffer. And doing it this way avoids code duplication between the
      import branch and the non-import branch.
      
      Side note: We could do the same for pl_tex_upload_pbo with the same
      justification, but I decided to test the waters with this commit first.
      2e04963f
    • vulkan/malloc: round imported pointers to page boundaries · ae9f4166
      Niklas Haas authored
      This allows us to bypass the page-alignment restriction on host pointer
      imports, by simply sufficiently extending the host pointer base, the
      buffer offset, and the memory size in the respective direction, thus
      ensuring that our memory import is always page-aligned.
      
      This *should* technically be safe, because the MMU can only enforce
      virtual memory access safety on a per-page granularity, and our code
      should never end up reading outside the bounds of a vk_memslice. But on
      the other hand, what we're doing is absolutely insane. Beware nasal
      demons. I only wrote this logic because I enjoy sharing an address space
      with a malevolent agent of chaos.
      
      As an aside, also fix some errors related to imported buffer size
      calculation and alignment validation that I noticed along the way.
      ae9f4166
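The rounding itself is simple arithmetic. The sketch below assumes a power-of-two page size and uses hypothetical names; it only computes the extended range and performs no import.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: the page-aligned range to import, plus the
 * offset of the caller's data inside that range. */
struct import_range {
    uintptr_t base; /* pointer rounded down to a page boundary */
    size_t offset;  /* distance from base to the original pointer */
    size_t size;    /* total size rounded up to whole pages */
};

/* `page_size` must be a power of two. */
struct import_range page_align_import(uintptr_t ptr, size_t size,
                                      size_t page_size)
{
    uintptr_t mask = (uintptr_t) page_size - 1;
    uintptr_t base = ptr & ~mask;
    size_t offset = (size_t) (ptr - base);
    size_t total = (offset + size + page_size - 1) & ~(size_t) mask;
    return (struct import_range) { base, offset, total };
}
```

For example, a 100-byte region at address 0x1234 with 4096-byte pages becomes one page starting at 0x1000, with the data at offset 0x234.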
    • context: add ability to temporarily cap log verbosity · a02084cc
      Niklas Haas authored
      This is intended for stuff like probing functions, to avoid generating
      bogus error messages. We directly make use of this function to clean up
      the format probing code, which is notoriously prone to generating error
      spam.
      a02084cc
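One plausible way to implement such a cap is to demote any message that is more severe than the cap down to the cap level before the usual verbosity check. This is an assumption about the semantics, not a description of the actual libplacebo function, and all names below are hypothetical.

```c
/* Hypothetical log levels: smaller values are more severe. */
enum lvl { LVL_NONE = 0, LVL_ERR, LVL_WARN, LVL_INFO, LVL_DEBUG };

struct logger {
    enum lvl max_level; /* most verbose level that is still printed */
    enum lvl cap;       /* LVL_NONE means "no cap active" */
};

/* Returns the level a message would be emitted at, or LVL_NONE if it is
 * suppressed. With a cap active, anything more severe than the cap is
 * demoted to the cap level first. */
enum lvl log_effective_level(const struct logger *log, enum lvl lev)
{
    if (log->cap != LVL_NONE && lev < log->cap)
        lev = log->cap;
    return lev <= log->max_level ? lev : LVL_NONE;
}
```

A probing function would set the cap before trying a format, then reset it to `LVL_NONE` afterwards, so that expected failures do not show up as errors.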
    • tests/bench: add pl_tex transfer benchmark · c20b0eb4
      Niklas Haas authored
      Mostly so I can test the improvements that leveraging host-mapped
      pointers will give us.
      c20b0eb4
  6. 19 Jul, 2020 1 commit
    • shaders/colorspace: clip before tone-mapping functions · c8bfe345
      Niklas Haas authored
      To prevent logic errors when overflowing e.g. the BT.2390 function, and
      also make functions behave more predictably on overflow in general.
      
      This ensures no function will ever see something larger than sig_peak.
      Requires changes to `clip` and `linear` to make them work properly
      again.
      c8bfe345
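The idea of clipping ahead of the curve can be sketched on the CPU as follows. The real change lives in generated shader code, and the function names and the trivial `curve_linear` below are hypothetical.

```c
/* Hypothetical curve type: maps a signal value to display range, given
 * the signal peak. */
typedef float (*tone_curve)(float x, float sig_peak);

/* Trivial example curve: rescale so that sig_peak maps to 1.0. */
float curve_linear(float x, float sig_peak)
{
    return x / sig_peak;
}

/* Clip first, so the curve never sees a value above sig_peak. */
float tone_map_clipped(float x, float sig_peak, tone_curve curve)
{
    if (x > sig_peak)
        x = sig_peak;
    return curve(x, sig_peak);
}
```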
  7. 16 Jul, 2020 1 commit
  8. 15 Jul, 2020 1 commit
  9. 14 Jul, 2020 1 commit
    • gpu: add preliminary API support for DRM format modifiers · 025c5dcb
      Niklas Haas authored
      This is still a pretty bad hack-patch for now, because no driver
      actually implements the drm format modifier extension. But this way of
      doing it at least allows us to differentiate between linear and
      non-linear, which we assume (blindly) is equal to optimal, and is needed
      to get vaapi hwdec working on AMD.
      
      We also get rid of the plane offset check because this also conflicts
      with the requirements of drm format modifiers, which we again can't
      respect properly. We already suppress validation errors for the image
      bind, and it works in practice.
      025c5dcb
  10. 13 Jul, 2020 2 commits
    • include: add _COUNT members to all public enums · 5e517936
      Niklas Haas authored
      For consistency, and because these technically serve a useful purpose
      (e.g. allowing static array sizing or bounds checks).
      5e517936
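A short sketch of both uses mentioned above, static array sizing and bounds checks, with a hypothetical enum that is not taken from the libplacebo headers:

```c
/* Hypothetical enum; the trailing _COUNT member equals the number of
 * real values because enumerators start at 0 and increase by 1. */
enum sample_mode {
    SAMPLE_NEAREST,
    SAMPLE_LINEAR,
    SAMPLE_CUBIC,
    SAMPLE_MODE_COUNT
};

/* Static array sized by the enum itself. */
static const char *const sample_mode_names[SAMPLE_MODE_COUNT] = {
    "nearest",
    "linear",
    "cubic",
};

/* Bounds check against _COUNT before indexing. */
const char *sample_mode_name(int mode)
{
    if (mode < 0 || mode >= SAMPLE_MODE_COUNT)
        return "unknown";
    return sample_mode_names[mode];
}
```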
    • colorspace: rename pl_color_levels · e4c03d0f
      Niklas Haas authored
      I was growing unhappy with the use of the non-explanatory, confusing and
      misleading 'TV' and 'PC' enum names. Replace them with the more
      descriptive terms 'LIMITED' and 'FULL', respectively.
      
      No API bump because this is not a breaking change, as the old enum names
      are still defined.
      e4c03d0f
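Keeping old names alive as aliases is what makes a rename like this non-breaking. A minimal sketch, using hypothetical identifiers rather than the real libplacebo ones:

```c
/* Hypothetical enum illustrating a source-compatible rename. */
enum color_levels {
    COLOR_LEVELS_UNKNOWN = 0,
    COLOR_LEVELS_LIMITED,
    COLOR_LEVELS_FULL,
    COLOR_LEVELS_COUNT,

    /* Old names kept as aliases so existing code still compiles. */
    COLOR_LEVELS_TV = COLOR_LEVELS_LIMITED,
    COLOR_LEVELS_PC = COLOR_LEVELS_FULL,
};

/* 8-bit luma black point: 16 for limited range, 0 otherwise. */
int luma_black_8bit(enum color_levels levels)
{
    return levels == COLOR_LEVELS_LIMITED ? 16 : 0;
}
```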
  11. 12 Jul, 2020 1 commit
    • vulkan: remove FIXME comments on buffer sharing mode · 996e2b58
      Niklas Haas authored
      1. VkBuffer sharing mode doesn't actually affect anything in real-world
         drivers (e.g. RADV, ANV, AMDVLK).
      2. VkBuffers are not part of the interop API so we don't care about
         having to communicate this to the user.
      3. Having to somehow transition all buffers would be a pain anyway
      996e2b58
  12. 11 Jul, 2020 1 commit
  13. 09 Jul, 2020 2 commits
    • opengl: refactor pl_opengl_wrap · 4a5ce5bc
      Niklas Haas authored
      This combines the function with the previously hidden pl_opengl_wrap_fb,
      allowing users to either provide their own framebuffers (in addition to
      the texture) or just wrap a plain framebuffer directly.
      
      In addition to merging these two functions, we also significantly
      overhaul the `gl_fb_query` function for inferring `pl_fmt` details from
      an opaque framebuffer. In particular, our wrapped framebuffers can now
      support PL_FMT_CAP_HOST_READABLE.
      
      Closes https://github.com/haasn/libplacebo/issues/81
      4a5ce5bc
    • opengl: fix typo in comment · 80e862b1
      Niklas Haas authored
      80e862b1
  14. 06 Jul, 2020 11 commits
    • shaders/colorspace: read detected peak directly from ssbo · b48c81cb
      Niklas Haas authored
      With the recent series of refactors to the vulkan malloc layer,
      host-visible device-local memory types exist and are allocatable, so we
      can directly serve host-readable uniform buffers.
      
      For the scenarios in which it's not possible, working around it should
      probably be done inside the pl_gpu, not the application code. (i.e.
      'host visibility emulation')
      b48c81cb
    • vulkan: slightly revise buffer requirements/placement logic · 055cdc0a
      Niklas Haas authored
      Now that we support the existence of 'optimal' memory type properties,
      we can make device-local memory be the 'optimal' type by default. We can
      also split up `host_mapped` into scenarios where it's required and
      scenarios where it's merely recommended.
      055cdc0a
    • vulkan/malloc: invalidate mapped noncoherent memory · 2f2ba1a6
      Niklas Haas authored
      Imported noncoherent memory is not implicitly invalidated.
      2f2ba1a6
    • vulkan/malloc: misc fixes related to host pointer import · 870cb541
      Niklas Haas authored
      1. Log the proper pointer on unimport
      2. Add missing test case
      870cb541
    • vulkan: implement support for dedicated imported allocations · d5b23f61
      Philip Langdale authored
      Dedicated allocations are ones where memory is allocated with
      a single image or buffer specified at allocation time, and only
      that buffer or image can be bound to the memory.
      
      Our first use-case for supporting it is to handle importing dma_bufs
      on AMD hardware, where the driver says dedicated allocations are
      required.
      
      I've tested this on Intel hardware, which doesn't require dedicated
      allocations, but works fine if you force them.
      Modified-by: Niklas Haas <git@haasn.xyz>
      
      Rebased on top of the vulkan malloc API refactor, and also added support
      for allocating dedicated slabs directly - which allows us to also
      allocate dedicated memory for images which advertise preferring
      dedicated allocations. Finally, add some extra verification.
      
      Closes: !72
      d5b23f61
    • vulkan/malloc: major API refactor · aea6f237
      Niklas Haas authored
      Major refactor, accomplishing the following:
      
      - group args into a params struct
      - unified API for importing, generic and buffers
      - move buffer importing boilerplate to the malloc layer
      - split up the property flags into required and optimal properties
      - better memory type scoring
      - enforce heap size when allocating large slabs
      - fix some buggy checks for optionally visible/coherent memory
      
      And probably more that I'm forgetting.
      aea6f237
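Splitting property flags into required and optimal sets suggests a selection loop along these lines. This is one plausible scoring scheme written under that assumption, not the actual libplacebo logic, and every name in it is hypothetical.

```c
#include <stdint.h>

/* Count the set bits in v. */
static int count_bits(uint32_t v)
{
    int n = 0;
    for (; v; v >>= 1)
        n += (int) (v & 1);
    return n;
}

/* Pick the index of the best memory type, or -1 if none qualifies.
 * A type qualifies if its bit is set in `type_mask` and it has every
 * `required` flag; among those, the one with the most `optimal` flags
 * wins, with ties going to the lowest index. */
int pick_mem_type(const uint32_t *type_flags, int num_types,
                  uint32_t type_mask, uint32_t required, uint32_t optimal)
{
    int best = -1, best_score = -1;
    for (int i = 0; i < num_types && i < 32; i++) {
        if (!(type_mask & (UINT32_C(1) << i)))
            continue;
        if ((type_flags[i] & required) != required)
            continue;
        int score = count_bits(type_flags[i] & optimal);
        if (score > best_score) {
            best = i;
            best_score = score;
        }
    }
    return best;
}
```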
    • vulkan/malloc: only require host-cached memory for large buffers · 20014f11
      Niklas Haas authored
      Uncached reads are extremely slow, but for small buffers it shouldn't
      matter, since they're only used to read back small bits of state
      information and other non-bandwidth-sensitive things.
      20014f11
    • ci: disable parallel testing · 41bb87db
      Niklas Haas authored
      Parallel tests make errors much more confusing and hard to find.
      41bb87db
    • tests: make errors more findable · 7176bf1f
      Niklas Haas authored
      Prefix the require() failure case to let me ctrl+f for them.
      7176bf1f
    • shaders/av1: overhaul and fix grain reusability test · 5cc2e2a4
      Niklas Haas authored
      A lot of these fields were either redundant, too aggressively checked,
      not checked aggressively enough, or simply leftovers.
      
      Clean up this logic and bring it into the (hopefully) intended form.
      5cc2e2a4
    • shaders/av1: avoid memcmp() on padded structs · ab8bd2f1
      Niklas Haas authored
      This can end up comparing undefined memory regions, because the padding
      areas of structs may not be initialized with anything particular.
      ab8bd2f1
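To illustrate the hazard with a hypothetical struct (not the real AV1 grain parameters): on most ABIs the compiler inserts padding after `flag`, and `memcmp()` compares those bytes too, while a field-by-field comparison ignores them.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical struct; most ABIs insert padding between the two fields. */
struct grain_params {
    char flag;
    int seed;
};

/* Risky: also compares padding bytes, whose contents are unspecified. */
bool grain_params_eq_memcmp(const struct grain_params *a,
                            const struct grain_params *b)
{
    return memcmp(a, b, sizeof(*a)) == 0;
}

/* Safe: compares only the named fields. */
bool grain_params_eq(const struct grain_params *a,
                     const struct grain_params *b)
{
    return a->flag == b->flag && a->seed == b->seed;
}
```

Two structs with identical fields can therefore compare unequal under the `memcmp()` variant, depending on what happens to be in the padding.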
  15. 05 Jul, 2020 2 commits
  16. 01 Jul, 2020 3 commits