v4.192.0-rc1

This is a minor release that focuses mainly on improvements to color space
handling - including support for Dolby Vision Profile 5/8 content, completely
rewritten HDR tone/gamut mapping, and bringing `pl_color_space` more in line
with HDR10 metadata. Libplacebo now generally respects things like mastering
color space metadata, especially for PL_INTENT_SATURATION. This technology is
not yet perfect, and will be iterated upon in future versions of libplacebo.
This release serves mainly to lay the API groundwork.

Other notable changes include support for H.274 film grain synthesis, as well
as improvements to the AVFrame<->libplacebo interop (in particular, support for
hardware accelerated frames), and various fixes related to alpha transparency.

As usual, this release also includes a number of bug fixes, performance
improvements, and miscellaneous improvements to the API and debug messages -
such as the inclusion of a brand-new Vulkan memory allocator, and a complete
overhaul of the Vulkan synchronization infrastructure to use timeline
semaphores.

Additions:
- add support for H.274 film grain synthesis
- add preprocessor macros for all params structs, allowing users to write e.g.
  `pl_render_params(.foo = bar)` to construct a `pl_render_params` pointer
  which implicitly includes any default fields, without the need for explicitly
  reading from pl_render_default_params
- add support for video/display rotation in the renderer
- add `pl_frame_copy_stream_props`
- add support for blending transparency against a checkerboard pattern
- add `pl_fmt.signature`, for render pass compatibility
- add `pl_pass_params.index_fmt` to allow 32-bit index buffers
- add `pl_dispatch_reset_frame`, to allow explicitly advancing the state
  of the PRNG and/or triggering garbage collection
- add the possibility of adding extra debug tags to GPU resources, which
  the default params helper macros set to the current source location
- add `pl_gpu_limits.max_variable_comps`, correctly specifying the upper
  bound on the number of uniform variable floats
- add `pl_vulkan_get`, `pl_opengl_get` and `pl_d3d11_get`
- add `pl_shader_set_alpha`
- add `pl_map_avframe_ex` and `pl_unmap_avframe`, which allow mapping extra
  AVFrame resources, and also support the use of hardware frames such as vaapi,
  dmabuf or vulkan
- add support for Dolby Vision color reshaping, via PL_COLOR_SYSTEM_DOLBYVISION
  and `pl_dovi_metadata` - including automatic mapping from AVFrames
- add <libplacebo/tone_mapping.h>, defining a collection of tone mapping
  primitives, and mechanisms for constructing LUTs, and including new
  functions such as BT.2446a and `spline`, as well as improvements to old
  curves such as `hable` and `linear` to make them more perceptually linear
- add a variety of new tone mapping modes (see `pl_tone_map_mode`), including
  auto-selection based on heuristics of the source characteristics
- add a variety of new gamut mapping modes (see `pl_gamut_mode`)
- add support for `NAME_gather` macros for use in user shaders
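
The params helper macros mentioned above can be pictured with a small, self-contained sketch. It uses none of the real libplacebo headers: `struct demo_params`, `DEMO_DEFAULTS`, `demo_params(...)` and `demo_effective_strength` are hypothetical names invented for illustration, and the sketch assumes the helpers work by combining a compound literal with designated initializers.

```c
#include <stdbool.h>

// Hypothetical params struct, standing in for something like pl_render_params.
struct demo_params {
    int   lut_size;
    float strength;
    bool  dither;
};

// Default values, written as a designated-initializer prefix
// (note the trailing comma, so user fields can follow directly).
#define DEMO_DEFAULTS \
    .lut_size = 64,   \
    .strength = 1.0f, \
    .dither   = true,

// Expands to a pointer to a compound literal. Fields named by the caller come
// after the defaults, and C lets a later designated initializer override an
// earlier one for the same field.
#define demo_params(...) (&(struct demo_params) { DEMO_DEFAULTS __VA_ARGS__ })

// Hypothetical consumer that takes a params pointer.
static float demo_effective_strength(const struct demo_params *params)
{
    return params->dither ? params->strength : 0.0f;
}
```

With this sketch, `demo_effective_strength(demo_params(.strength = 0.5f))` sees the overridden strength together with the default `lut_size` and `dither`. The compound literal only lives until the end of the enclosing block, so the pointer should not be stored for later use. GCC may report the intentional override under `-Woverride-init` (enabled by `-Wextra`).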

Changes:
- replace <libplacebo/shaders/av1.h> by the more general
  <libplacebo/shaders/film_grain.h>
- remove API members deprecated for libplacebo v3
- PL_ALPHA_UNKNOWN tagging on files is now assumed to be PL_ALPHA_INDEPENDENT,
  rather than PL_ALPHA_PREMULTIPLIED, and `pl_shader_decode_color` now also
  outputs independent alpha by default
- the `box` filter was removed entirely, due to a number of issues preventing
  it from being effectively useful in practice
- replace `pl_tex_transfer.stride_w/h` (specified in texels) by
  `row/depth_pitch` (specified in bytes)
- replace `pl_pass_params.target_dummy` by `target_format`, and
  require that rendered textures are compatible with this format
- allow calling `pl_queue_update` on NULL
- <libplacebo/vulkan.h> now requires support for the timeline semaphores
  feature, included automatically in Vulkan 1.2 and available via
  VK_KHR_timeline_semaphore in previous versions
- change pl_vulkan_hold/release API: removing VkAccessFlags, and replacing the
  VkSemaphore by pl_vulkan_sem (for timeline semaphore support)
- `pl_queue_push` may now be used to push frames out-of-order
- `pl_render_image_mix` may now be used on single frames, in which case
  libplacebo will still go through the mixer cache, potentially speeding up
  single-frame redraws (see: `pl_render_params.skip_caching_single_frames`)
- remove `pl_vulkan_params.disable_events`
- `pl_shared_mem.size` no longer needs to be set for DMABUFs and D3D11 textures
- remove support for 64-bit integer texture formats, since these are very
  poorly supported on most platforms and also extremely rarely needed
- completely refactor all of the tone mapping settings in `pl_color_map_params`,
  replacing the old `desaturation_*` and `tone_mapping_algo` by the new
  `tone_mapping_mode` and `tone_mapping_function`, and the old `gamut_warning`
  and `gamut_clipping` by the new `gamut_mode`
- completely refactor `pl_color_space`: deprecate the old `sig_peak`,
  `sig_scale` etc. fields in favor of merging `pl_hdr_metadata` into this
  struct, and update the API of several functions that previously took
  `pl_color_space` struct values, to instead take pointers
- remove `pl_color_light` entirely, instead treating the OOTF as an inseparable
  part of the color transfer function (e.g. HLG)
- delete `pl_swapchain_colors` in favor of `pl_color_space` (which now
  contains the exact same fields)
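
For code migrating from the texel-based `stride_w/h` fields to the byte-based `row/depth_pitch` fields, the conversion can be sketched as follows. The struct and function names here are hypothetical stand-ins, not libplacebo types, and the sketch assumes that the old `stride_h` counted rows per 2D slice and that the texel size in bytes is known to the caller.

```c
#include <stddef.h>

// Hypothetical stand-ins for the old (texel-based) and new (byte-based) layouts.
struct old_layout { size_t stride_w, stride_h; };     // texels per row, rows per slice
struct new_layout { size_t row_pitch, depth_pitch; }; // bytes per row, bytes per slice

// Convert a texel-based layout to byte pitches for a given texel size.
static struct new_layout layout_to_bytes(struct old_layout old, size_t texel_size)
{
    struct new_layout out;
    out.row_pitch   = old.stride_w * texel_size;    // bytes between consecutive rows
    out.depth_pitch = old.stride_h * out.row_pitch; // bytes between consecutive slices
    return out;
}
```

For a tightly packed 1920x1080 image with 4-byte texels, this yields a row pitch of 7680 bytes and a depth pitch of 8294400 bytes.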

Fixes and performance enhancements:
- improve the quality and performance of GPU random number generation
- fix int/float compilation error on GLES in dither shader
- correctly set minimum integer precision on GLES
- allow pl_recreate_plane to create non-host-readable FBOs
- correct GLSL version requirement for 3D LUT shader
- correctly check for presence of GL_EXT_texture_integer
- correctly check for presence of GL_EXT_texture_norm16 on GLES 3.0
- correctly check for presence of GL_EXT_color_buffer_float
- replace GL_ARB_debug_output by GL_KHR_debug
- fix GLSL shader version for GLES 2.0
- add support for GL_EXT_texture_rg on GLES
- use GL_ARRAY_BUFFER instead of GL_COPY_WRITE_BUFFER (GLES compatibility)
- correctly check for GL_UNPACK_IMAGE_HEIGHT presence
- disable host readback on too-old GLES
- add support for the `bgra8` image format on gl/gles and `bgrx8` on d3d11
- remove deprecated usage of bare pointers for OpenGL index data
- fix vulkan malloc efficiency estimate calculation
- fix a memory leak in the vulkan memory allocator for small (<1K) buffers
- complete rewrite of the vulkan memory allocator to improve throughput,
  reduce code complexity and generally reduce memory waste
- allow using libplacebo as a meson subproject
- fix cyclic header dependency
- fix dithering when the FBO bit depth is higher than the content bit depth
- fix segfault if command buffer allocation fails
- fix various thread safety issues in vulkan command polling
- properly invalidate framebuffers on OpenGL
- properly disable GL_DEPTH_TEST and GL_CULL_FACE when running GL passes
- fix GL_ARB_framebuffer_object check
- fix issue with rendering transparent images on non-transparent swapchains
- try and detect presence of alpha channels on opengl framebuffers
- fix OUTPUT and NATIVE_CROPPED hook expressions for flipped files
- fix EPOXY_HAS_EGL checks
- significantly increase performance of ICC 3DLUT generation
- correctly set the C level to C11 during compilation
- properly align allocated memory to `max_align_t` (instead of `intmax_t`)
- fix `pl_cond_timedwait`
- fix bug where `pl_opengl_wrap`+`pl_tex_destroy` accidentally closed FD 0
- fix pl_get_buffer2 implementation
- fix alpha blending of transparent subtitles onto transparent images
- fix issue where frame blending could sometimes crash if the only image
  in the mix was too far away from the vsync
- fix segfault when dispatching compute shader output hook while frame blending
- fix UB when using the same shader with different types of textures on vulkan
- fix possible race condition when writing to the same vulkan resource twice
- fix issue where user shaders were sometimes executed as compute shaders
  despite no //!COMPUTE pragma
- properly align texel buffers on vulkan
- properly propagate HOOKED textures between passes of a shader
- also support importing vulkan features from meta-structs like
  `VkPhysicalDeviceVulkan12Features`
- drop the use of VkEvents entirely, and instead optimize the usage of pipeline
  barriers to always emit the minimum required dependency
- fix vulkan object type enum parsing on recent vulkan versions
- fix invalid output on the first frame after enabling peak detection
- fix linking order of glslang libraries
- fix generated .pc file on windows
- fix imports of dedicated memory with plane offsets into vulkan resources
- fix major performance issue when combining debanding with bilinear scaling
- fix issue where `pl_vulkan_wrap` did not support `pl_tex_params.user_data`
- fix compatibility with MoltenVK by adding VK_KHR_portability_subset
- various compatibility fixes for OpenGL version 2.1
- fix UB in `pl_test_pixfmt`
- fix issue where rendering to partial crops of vulkan textures unintentionally
  invalidated the image contents outside of the rendering area
- reduce the rate of false negatives in the renderer mixing cache
- fix strided OpenGL texture uploads
- fix crash in the frame mixer when the image color space changes mid-stream
- fix check for EGL DMA buffer modifiers
- fix division by zero in tone mapping shader
- make `pl_mpv_user_shader_destroy` properly reset the passed pointer
- fix undefined behavior passing negative values to `pl_shader_decode_color`
- fix bug where PL_QUEUE_MORE resulted in invalid frame mix outputs
- fix thread race between `pl_queue_push` and reading from a `pl_frame_mix`
- fix `pl_render_high_quality_params`
- fix parsing .cube LUTs with scientific notation floats
- inline LUTs into shader text less aggressively
- disable peak detection in `pl_render_default_params`, instead moving it to
  `pl_render_high_quality_params`
- fix edge case where HDR spaces didn't properly disable linearization
- fix various issues involving the allocation of per-pass identifiers on d3d11
- fix issue where the d3d11 swapchain unnecessarily held on to the framebuffer
- make `pl_swapchain_start/stop_frame` more robust against threading issues or
  API misuse
- fix the application of `pl_color_adjustment.gamma` in `pl_shader_decode_color`
- add support for `pl_pass_params.cached_program` on d3d11
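
As background for the `max_align_t` fix listed above: `intmax_t` only guarantees alignment suitable for the widest integer type, whereas `max_align_t` covers every scalar type (on many x86-64 platforms this is 16 bytes rather than 8, because of `long double`). The following is a minimal sketch of rounding allocation sizes up to that alignment. The helper names `align_up` and `padded_size` are hypothetical and not part of libplacebo.

```c
#include <stddef.h>
#include <stdalign.h>

// Round `size` up to the next multiple of `align`.
// `align` must be a power of two.
static size_t align_up(size_t size, size_t align)
{
    return (size + align - 1) & ~(align - 1);
}

// Hypothetical helper: pad an allocation size so that consecutive objects
// packed back to back stay suitably aligned for any scalar type.
static size_t padded_size(size_t size)
{
    return align_up(size, alignof(max_align_t));
}
```

The bit trick in `align_up` relies on the alignment being a power of two, which holds for every value that `alignof` can return.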