1. 10 Dec, 2018 1 commit
  2. 06 Dec, 2018 5 commits
    • vulkan: explicitly query supported semaphore handle types · a3e1c43d
      Niklas Haas authored
      Mercifully much easier than buffers/images
    • vulkan: enforce image format compatibility with handle type · 6c29d3e1
      Niklas Haas authored
      Similar to the previous commit, we need to ensure compatibility with the
      handle type we are interested in to avoid UB.
      
      Also rewrite the image format compatibility check logic in general to
      make it more future-proof. Also fixes a few memleaks on error (due to
      `return NULL` when it should have done `goto error`).
    • vulkan: add support for loading instance-level functions · bc7fce86
      Niklas Haas authored
      After postponing this issue previously because we had no permanent
      context associated with user-provided instances, and no way of
      retroactively knowing what instance-level extensions were provided, the
      solution hit me on the head:
      
      We can just load them as part of the device-level extensions that depend
      on them. Duplicates in this list don't matter, since loading the same
      pointer twice is idempotent.
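
A minimal sketch of why duplicate entries are harmless, assuming a table-driven loader that stores each resolved pointer at a fixed offset in a context struct. All names here (`demo_ctx`, `demo_fun`, `demo_resolve`, `demo_load`) are hypothetical and not taken from the commit; a real loader would resolve names through `vkGetInstanceProcAddr` instead of `demo_resolve`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (*demo_fn)(void);

/* Hypothetical stand-ins for two entry points. */
static void demo_get_props(void) {}
static void demo_get_fd(void) {}

struct demo_ctx {
    demo_fn get_props; /* instance-level entry point */
    demo_fn get_fd;    /* device-level entry point */
    int loads;         /* number of pointer stores performed */
};

struct demo_fun {
    const char *name;
    size_t offset;     /* where in demo_ctx the pointer is stored */
};

/* Hypothetical resolver: maps a name to an address, NULL if unknown. */
static demo_fn demo_resolve(const char *name)
{
    if (strcmp(name, "get_props") == 0) return demo_get_props;
    if (strcmp(name, "get_fd") == 0) return demo_get_fd;
    return NULL;
}

/* Store each resolved pointer at its offset. A repeated entry simply
 * rewrites the same slot with the same value. */
static void demo_load(struct demo_ctx *ctx, const struct demo_fun *funs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        demo_fn *slot = (demo_fn *) ((char *) ctx + funs[i].offset);
        *slot = demo_resolve(funs[i].name);
        ctx->loads++;
    }
}
```

Two extensions that both list `get_props` cause two stores to the same slot, and the slot holds the same pointer after each store.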
    • vulkan: reduce handle_type boilerplate · aff6b549
      Niklas Haas authored
      There were a bunch of disjointed switch statements all over the place.
      Reduce these in favor of one helper function, since we'll end up needing
      this more as we go on.
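
As a rough illustration of the "one helper function" idea, the sketch below maps a handle type to a single flag bit in one place. The enum and function names are hypothetical. The three constants are local copies of the `VkExternalMemoryHandleTypeFlagBits` values for opaque fd, opaque win32, and opaque win32 KMT, so the block builds without the Vulkan headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical handle type enum covering the types named in this log. */
enum demo_handle_type {
    DEMO_HANDLE_FD,
    DEMO_HANDLE_WIN32,
    DEMO_HANDLE_WIN32_KMT,
};

/* Local copies of the Vulkan external memory handle type bits. */
#define DEMO_MEM_OPAQUE_FD        0x00000001u
#define DEMO_MEM_OPAQUE_WIN32     0x00000002u
#define DEMO_MEM_OPAQUE_WIN32_KMT 0x00000004u

/* Single place that translates a handle type; callers no longer need
 * their own switch statement. */
static uint32_t demo_mem_handle_bit(enum demo_handle_type type)
{
    switch (type) {
    case DEMO_HANDLE_FD:        return DEMO_MEM_OPAQUE_FD;
    case DEMO_HANDLE_WIN32:     return DEMO_MEM_OPAQUE_WIN32;
    case DEMO_HANDLE_WIN32_KMT: return DEMO_MEM_OPAQUE_WIN32_KMT;
    }
    return 0; /* unknown type: no bit set */
}
```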
    • vulkan: add support for win32 handle types · 8dea6887
      Philip Langdale authored
      There are two win32 specific handle types, which we believe to be
      the ones that would be primarily used by a win32 client.
      
      There is the 'NT Handle', which is available on Windows 8 or newer,
      and the 'Global share Handle' (abbreviated as KMT in the Vulkan spec),
      which is available on Windows 7 and newer, but which apparently
      shouldn't be used when the 'NT Handle' type is available. We do not
      attempt to judge usage, and it is up to the client to decide which
      of the two types it wants to use.
      
      Note that the existing logic for identifying the handle_caps is not
      actually sufficient. We need to call specific functions to identify
      what is possible for buffers, images, and semaphores.
      
      * vkGetPhysicalDeviceImageFormatProperties2KHR
      * vkGetPhysicalDeviceExternalBufferPropertiesKHR
      * vkGetPhysicalDeviceExternalSemaphorePropertiesKHR
      
      and these need to be called with the specific flags and parameters,
      so I think it basically ends up being that you can't tell if your
      image/buffer/semaphore is exportable until you create it. Anyway.
      
      I am not in a position to test any of these changes. Someone who
      can build on win32 must do that.
  3. 30 Nov, 2018 4 commits
  4. 24 Nov, 2018 1 commit
  5. 21 Nov, 2018 5 commits
    • vulkan: add test case for pl_sync · 95d826f5
      Niklas Haas authored
      Just use some dirty hacks to signal the VkSemaphore, for lack of any
      actual external API usage here.
    • vulkan: don't error when creating useless images · 04331ebe
      Niklas Haas authored
      Edge case that can arise in the test suite.
    • gpu: add pl_tex_export, replacing pl_vulkan_hold_external · 5d2be6e5
      Niklas Haas authored
      This is similar to the old vulkan hold/release_external but more
      generalized and more powerful. It also cleans up some of the semantics
      related to external image interop. To avoid headaches, just delete the
      old vulkan interop code in the same commit.
      
      To make sure we don't keep around references to destroyed sync objects,
      we refcount them. (This also means we don't need a lazy destructor,
      since a sync object that's not in use can be destroyed immediately -
      whereas a sync object that's in use will only be deref'd when the cmd
      callback fires)
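
The refcounting described above can be sketched roughly as follows. This is an illustration under assumed names (`demo_sync`, `demo_cmd_done`), not the actual libplacebo code: the user holds one reference, each in-flight command holds another, and the object is freed by whoever drops the last one.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_sync {
    int refcount;
    int *destroyed; /* hypothetical flag so callers can observe destruction */
};

static struct demo_sync *demo_sync_create(int *destroyed)
{
    struct demo_sync *s = malloc(sizeof(*s));
    if (!s)
        return NULL;
    s->refcount = 1; /* the user's own reference */
    s->destroyed = destroyed;
    *destroyed = 0;
    return s;
}

/* Taken by a command that uses the sync object. */
static void demo_sync_ref(struct demo_sync *s) { s->refcount++; }

/* Drops one reference and frees the object when none remain. */
static void demo_sync_deref(struct demo_sync *s)
{
    assert(s->refcount > 0);
    if (--s->refcount == 0) {
        *s->destroyed = 1;
        free(s);
    }
}

/* Hypothetical command-completion callback: releases the command's reference. */
static void demo_cmd_done(void *priv) { demo_sync_deref(priv); }
```

With this shape, destroying an idle object frees it immediately, while destroying a busy one only drops the user's reference and leaves the final free to the completion callback.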
    • vulkan: implement pl_sync · 2f5a19dc
      Philip Langdale authored
      These must eventually be ref-counted once we start using them to
      synchronize access to actual GPU resources.
    • gpu: refactor shared memory handle API · 5198e156
      Philip Langdale authored
      When we introduce exportable semaphores, we'll have handles which are
      not memory-backed. To avoid a sloppy API, let's explicitly separate the
      two ideas. This also presents an opportunity to pull the object offset
      within the memory into the mem_handle definition. We also refactor the
      way the handle caps are presented to the user, to allow distinguishing
      between caps for shared memory and those for semaphores.
      
      We also turn the struct pl_gpu_handle into a union, since in practice,
      it is not going to be useful to export a given resource as multiple
      types of external handle - it will just be one type or another. (This is
      mostly a simple substitution, but note that vk_slabs must now
      carry the handle type explicitly, as we can no longer look at which
      field(s) are set in a pl_gpu_handle.)
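
To illustrate why the handle type has to travel alongside the union, here is a small sketch with hypothetical names (`demo_handle`, `demo_slab`, `demo_slab_fd`). Once the fields share storage, inspecting which one is "set" is no longer meaningful, so an explicit tag decides which member may be read.

```c
#include <assert.h>

/* Hypothetical tag values; the real enum is not reproduced here. */
enum demo_tag { DEMO_TAG_FD = 1, DEMO_TAG_WIN32 = 2 };

/* All members share storage, so only one is valid at a time. */
union demo_handle {
    int fd;
    void *handle;
};

/* The owner records which member of the union is valid. */
struct demo_slab {
    enum demo_tag handle_type;
    union demo_handle handle;
};

/* Returns the fd if the slab holds one, or -1 otherwise. */
static int demo_slab_fd(const struct demo_slab *slab)
{
    return slab->handle_type == DEMO_TAG_FD ? slab->handle.fd : -1;
}
```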
      
      This is obviously an API break for any external usage of buffers or
      textures.
  6. 18 Nov, 2018 3 commits
  7. 16 Nov, 2018 1 commit
    • gpu: drop the dynamic ubo/ssbo/pushc layouts · 51379b3f
      Niklas Haas authored
      Ever since SPIRV-Cross started emulating std140 layout when
      cross-compiling SPIR-V to D3D11, there's no more real justification for
      us to make this dependent on the backend type.
      
      In theory, metal could require something different here - but even then
      we should work around it in SPIRV-Cross rather than trying to make our
      code depend on it at runtime.
  8. 12 Nov, 2018 1 commit
  9. 06 Nov, 2018 3 commits
  10. 05 Nov, 2018 1 commit
    • vulkan: rewrite buffer memory placement logic · 3ef1cfc2
      Niklas Haas authored
      Explicitly defaulting the buffer type was also not the right thing,
      since it e.g. conflicted with the `params->host_mapped` logic further
      below. Instead of trying to default the memory type up there, I
      inverted this logic to do the defaulting at the bottom, after all of the
      type-specific requirements have been forced.
      
      This order should hopefully be more robust against future such bugs.
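
The "force requirements first, default last" ordering might look roughly like the sketch below. The enum, the struct fields, and the specific rules are assumptions made for illustration (only `host_mapped` and the notion of an automatic memory type come from the commit messages); the point is the order of the steps, not the exact policy.

```c
#include <assert.h>
#include <stdbool.h>

enum demo_mem { DEMO_MEM_AUTO, DEMO_MEM_HOST, DEMO_MEM_DEVICE };

struct demo_buf_params {
    enum demo_mem memory_type; /* user request, possibly AUTO */
    bool host_mapped;          /* hard requirement: must live on the host */
    bool host_writable;        /* assumed hint: the host interacts with it */
};

static enum demo_mem demo_place(const struct demo_buf_params *p)
{
    enum demo_mem mem = p->memory_type;

    /* Step 1: apply hard, type-specific requirements. */
    if (p->host_mapped)
        mem = DEMO_MEM_HOST;

    /* Step 2: only now resolve AUTO, so the default can never
     * override a requirement applied in step 1. */
    if (mem == DEMO_MEM_AUTO)
        mem = p->host_writable ? DEMO_MEM_HOST : DEMO_MEM_DEVICE;

    return mem;
}
```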
  11. 04 Nov, 2018 4 commits
    • vulkan: explicitly default PL_BUF_MEM_AUTO · 3a0371f1
      Niklas Haas authored
      The previous behavior assumed that the vulkan driver's memory types
      would be ordered by speed - but for any sort of buffer involving host
      interaction, this is misleading. Sure, it might make the actual
      buf->image copy faster, but it involves an additional memcpy from
      host to device so the performance gain is negated.
      
      To compensate, explicitly force host-interactable buffer memory to be
      resident on the host.
    • vulkan: conform to vkCmdUpdateBuffer limits · 52615be5
      Niklas Haas authored
      This is statically limited to 64 kB by the spec, so enforce these limits
      by requiring host-mapped memory in this case.
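
A sketch of the size check this implies. The Vulkan spec caps `vkCmdUpdateBuffer` at 65536 bytes per call and requires the offset and size to be multiples of 4. The helper names are hypothetical, and `demo_needs_host_map` is only one plausible reading of "requiring host-mapped memory in this case".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Per-call size limit of vkCmdUpdateBuffer, in bytes. */
#define DEMO_UPDATE_BUFFER_MAX 65536u

/* True if a single update call can cover this range. */
static bool demo_fits_update_buffer(size_t offset, size_t size)
{
    return size > 0 && size <= DEMO_UPDATE_BUFFER_MAX
        && size % 4 == 0 && offset % 4 == 0;
}

/* Assumed policy: a host-writable buffer too large for one update call
 * must be host-mapped so writes can go through the mapping instead. */
static bool demo_needs_host_map(size_t buf_size, bool host_writable)
{
    return host_writable && buf_size > DEMO_UPDATE_BUFFER_MAX;
}
```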
    • vulkan: implement PL_HANDLE_FD · bbc5a360
      Niklas Haas authored
      We create the fd when allocating the memory, since we only need one to
      represent the entire allocation. To export memory, we need to
      synchronize using queue family ownership transfers with a special queue
      family VK_QUEUE_FAMILY_EXTERNAL_KHR. Do this by extending `buf_barrier`.
    • gpu: allow influencing buffer memory allocation · f3961441
      Niklas Haas authored
      Since users may want to allocate texture transfer buffers in GPU memory
      for the purposes of uploading data generated by CUDA or other external
      APIs, we introduce a new field which can be used to influence this
      memory placement decision.
  12. 02 Nov, 2018 1 commit
  13. 31 Oct, 2018 2 commits
  14. 30 Oct, 2018 3 commits
    • vulkan: correctly synchronize non-mapped buffer writes · d1b6c26e
      Niklas Haas authored
      We forgot to add the offset of the vk_memslice into the vk_buf_write
      when synchronizing via buf_barrier.
      
      Fix this by moving the offset addition into buf_barrier, which reduces
      some of the code complexity and also makes this more robust in general.
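
The fix can be pictured with the sketch below, in which the barrier helper itself translates a buffer-relative range into a range within the underlying allocation. The struct and function names are hypothetical stand-ins, not the real `vk_memslice` or `buf_barrier` definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sub-allocation: where this buffer sits in its parent memory. */
struct demo_memslice { size_t offset; size_t size; };
struct demo_buf { struct demo_memslice slice; };
struct demo_range { size_t offset; size_t size; };

/* Callers pass buffer-relative offsets. The slice offset is added here,
 * in exactly one place, so no call site can forget it. */
static struct demo_range demo_buf_barrier(const struct demo_buf *buf,
                                          size_t offset, size_t size)
{
    assert(offset + size <= buf->slice.size);
    return (struct demo_range) {
        .offset = buf->slice.offset + offset,
        .size = size,
    };
}
```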
    • vulkan: perform buffer flushes *after* writing to the buffer · 18d0d06b
      Niklas Haas authored
      Right now, for some reason, when something involves visible writes, we
      submit it as part of the buf_barrier. This is nonsensical, and only
      works by coincidence. We need to actually include the write in the
      source scope of the buf barrier. So we need to re-order the dependency
      so that it happens after the buffer operation.
      
      Accomplish this by splitting up the buffer barrier into two halves: one
      synchronizing the current access w.r.t the previous access, and one
      simply synchronizing the current access w.r.t host reads. (This does not
      require changing buf_vk->current_access because no vulkan commands
      correspond to host access, so there's nothing to transition out of)
    • gpu: clarify and revise access rules on buffers · 08a40d8b
      Niklas Haas authored
      Due to the way the comment was formulated, it sounded like multiple
      accesses by libplacebo to the same buffer were also forbidden while a
      buffer was in use. However, this is simply not the case - we can do
      texture up- and downloads all day, since these are synchronized
      internally using semaphores.
      
      So, narrow the scope of what's forbidden to writes explicitly performed
      by the user (including pl_buf_read/write). Furthermore, while the
      comment suggested that coalescing multiple writes was fine, this is
      nonsensical - the result has to be undefined by definition since we
      don't know the scope or ordering of the access. As a result, forbid it.
  15. 27 Oct, 2018 2 commits
    • vulkan: fix render pass layout for newly created FBOs · fc7600b9
      Niklas Haas authored
      This is actually a disturbingly common case, since most of the time the
      pass will be created when the FBO is fresh as well. Fortunately, it only
      matters when we need the layout for anything.
      
      As an aside, maybe we should skip this 'optimization' step and just use
      _UNDEFINED / _PREINITIALIZED depending on whether or not we need to
      blend? Further testing warranted. But assuming it makes no difference,
      the current code is fine. And assuming it makes a difference, the
      current code would be better.
    • vulkan: fix stall if buf_poll timeout is never > 0 · 420df576
      Niklas Haas authored
      The "optimization" in 223aa09a removed a "needless" flush that was very
      much needed if the user simply never called pl_buf_poll with a nonzero
      timeout. In this case, we never reap commands with `vk_poll_commands`,
      and if the user does not `pl_gpu_flush` then we never even flush them to
      begin with!
      
      Fix both issues at the cost of having to flush every now and then.
      Hopefully this will only matter for the first few frames anyway.
  16. 25 Oct, 2018 1 commit
    • gpu: add pl_gpu_finish · 2e742f53
      Niklas Haas authored
      Much like glFinish, this can be useful when you want to force all GPU
      work to be completed (for any number of reasons, e.g. deinit logic or
      for benchmarking purposes).
  17. 14 Oct, 2018 1 commit
    • vulkan: avoid needless flush in buf_poll · 223aa09a
      Niklas Haas authored
      No need to flush if we aren't actually blocked.
      
      While it's unlikely this impacted performance much, since the design of
      the renderer forced users to do all uploading before the start of any
      rendering commands, it might still help in some cases.
  18. 03 Jun, 2018 1 commit
    • vulkan: add external VkImage interop API · e1016179
      Niklas Haas authored
      This is essentially the same interface that's used between pl_gpu and
      the vulkan swapchain implementation, tidied up a bit and exposed to the
      user.
      
      This required tying off some loose ends related to queue families in
      order to make sure the behavior is defined.
      
      Closes #22.