- May 22, 2020
-
-
Niklas Haas authored
As an aside, we also make sh_subpass not explicitly spam/fail the parent shader in this case.
-
Niklas Haas authored
Rather than this merely representing an "in-flight" image, with the img->sh only living for as long as this exists between different pass types, `img` is now conceptually persistent and in one of two modes: `sh` or `tex`. This allows us to, in principle, avoid doing redundant FBO roundtrips for cases where the previous pass thinks the next pass needs a tex but the next pass actually needs a sh, such as is currently the case for the AV1 grain shader. Since `pass_hook` in particular can randomly mutate `img` to either of the two forms, callers must now be somewhat vigilant to make sure they always use `img_tex()` and `img_sh()` to access the "current" shader/texture, rather than relying on local variables staying persistent. The use of locally initialized pl_shader is now exclusive to passes that keep their own pl_dispatch_begin calls (for various reasons).
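A minimal sketch of the two-mode idea described above, assuming simplified stand-in types. The `struct shader`, `struct texture`, the extra `fbo`/`fresh`/`roundtrips` parameters and the function bodies are hypothetical and only illustrate the access pattern; they are not libplacebo's actual implementation of `img_tex()` and `img_sh()`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, not libplacebo's real types. */
struct shader  { int id; };
struct texture { int id; };

/* An image that is always in exactly one of two modes. */
struct img {
    struct shader  *sh;   /* non-NULL while in "sh" mode  */
    struct texture *tex;  /* non-NULL while in "tex" mode */
};

/* Return the current texture. If the image is in "sh" mode, the pending
 * shader would first have to be rendered into `fbo` (one FBO roundtrip). */
static struct texture *img_tex(struct img *img, struct texture *fbo,
                               int *roundtrips)
{
    assert(!img->sh != !img->tex); /* exactly one mode is active */
    if (img->sh) {
        /* Placeholder for dispatching the shader into `fbo`. */
        (*roundtrips)++;
        img->tex = fbo;
        img->sh = NULL;
    }
    return img->tex;
}

/* Return the current shader. If the image is in "tex" mode, switch to the
 * `fresh` shader, which would sample from the texture. */
static struct shader *img_sh(struct img *img, struct shader *fresh)
{
    assert(!img->sh != !img->tex);
    if (img->tex) {
        /* Placeholder for emitting a texture sample into `fresh`. */
        img->sh = fresh;
        img->tex = NULL;
    }
    return img->sh;
}
```

Because either accessor may flip the mode, a caller that cached `img->sh` in a local variable before calling `img_tex()` would be left holding a stale pointer, which is why the accessors should be called again at each use.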
-
Niklas Haas authored
The current approach was to pair each FBO with its use, "semantically". The intent was to minimize the number of "reallocations" that would be required if the number of passes changed dynamically (e.g. as the renderer options changed). However, this is not only an unrealistic design goal to optimize for (users can use separate pl_renderers for wildly different purposes, and for a single conceptual video stream it doesn't really matter), but also, it gets in the way of a planned refactor I have concerning `struct img`. Change this to make all FBOs dynamically allocated. The current implementation simply uses a counter, but a more advanced implementation could use a pool of textures and find ones that have matching sizes before recreating ones that don't. The API shouldn't change as a result of this.
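As a rough illustration of counter-based allocation, the sketch below hands out FBOs in order and only recreates one when its size no longer matches. All names here (`struct fbo`, `struct fbo_pool`, `pool_get`, `MAX_FBOS`) are hypothetical, and the real renderer's bookkeeping may differ.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FBOS 8 /* arbitrary limit for this sketch */

/* Hypothetical FBO record; `recreations` only exists to observe behavior. */
struct fbo { int w, h; int valid; int recreations; };

struct fbo_pool {
    struct fbo fbos[MAX_FBOS];
    size_t next; /* index of the next FBO to hand out this frame */
};

/* Reset the counter at the start of every frame. */
static void pool_begin_frame(struct fbo_pool *p) { p->next = 0; }

/* Hand out the next FBO, recreating it only if the size changed. */
static struct fbo *pool_get(struct fbo_pool *p, int w, int h)
{
    assert(w > 0 && h > 0);
    if (p->next >= MAX_FBOS)
        return NULL;

    struct fbo *f = &p->fbos[p->next++];
    if (!f->valid || f->w != w || f->h != h) {
        /* Placeholder for destroying and recreating the texture. */
        f->w = w;
        f->h = h;
        f->valid = 1;
        f->recreations++;
    }
    return f;
}
```

With this scheme a change in the number or order of passes shifts which slot each pass receives, so some FBOs get recreated. A pool that searches for a texture of matching size, as the message suggests, would avoid part of that cost.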
-
- May 21, 2020
-
-
Niklas Haas authored
In doing so I finally hit the time bomb caused by assuming blacklisting compute is only needed on that specific driver version. Turns out, the same issue is present even on newer driver versions. Since I have no idea how else to work around and/or debug it, just permanently disable compute on the CI. Unfortunately, for some reason, the `shaderc` version in this version of the CI image hits random internal exceptions when trying to compile pretty much anything. But using glslang directly works. Except for msan, because we don't have msan-instrumented libc++. Some other changes needed for whatever reason.
-
Niklas Haas authored
These don't generate any errors, but the compilation status still isn't "success". Treat these as errors as well, in terms of logging.
-
Niklas Haas authored
There's really no reason not to. Also clarify that these functions are not, in fact, "mandatory" instance-level function pointers.
-
Niklas Haas authored
This extension was treated as global in the past, but now that we have a proper extension framework we should just handle it like any other extension. Solves a very concrete issue where dependencies on e.g. VK_EXT_hdr_metadata were not satisfied if the user did not happen to enable this extension. Also check if the extension is loaded when we attempt actually creating a swapchain.
-
Niklas Haas authored
Forgot to un-mark timers as recording. Probably mostly benign, but could hypothetically cause issues.
-
Niklas Haas authored
With ca1ebc, the mismatched shaders of the variety that 092229 was intending to solve should no longer be a possibility. If this is still the case, it's probably a bug. Assert instead.
-
Niklas Haas authored
This function already exists.
-
Niklas Haas authored
The current logic could end up rounding the plane up in both directions, inadvertently introducing an upscale by 1 pixel to the refplane, which not only forced an FBO indirection but also lowered the quality due to the effective resample. Furthermore, the code for adjusting the rect by the `rc` was wrong, since it failed to scale the "affine" part of the transformation down to the coordinate space of the plane. Simplify and fix this by only rounding the offset (making sure to always round towards 0 to have the best chance of "doing the right thing"), and also correctly scaling this offset when calculating the offsets for the individual planes.
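One possible reading of the fix, sketched in one dimension: snap only the origin of the rect toward zero and keep its extent, then divide the offset by the subsampling factor for each plane. The `struct rect1d` type and both helpers are hypothetical, and the power-of-two subsampling is an assumption.

```c
#include <assert.h>

/* Hypothetical 1D rect in reference-plane pixels. */
struct rect1d { float x0, x1; };

/* Snap only the origin toward zero and preserve the extent, so the rect
 * never grows by a pixel the way rounding both edges outward can. */
static struct rect1d snap_offset(struct rect1d r)
{
    float snapped = (float)(int) r.x0; /* float-to-int conversion truncates toward zero */
    float extent = r.x1 - r.x0;
    return (struct rect1d){ snapped, snapped + extent };
}

/* Scale a reference-plane offset into the coordinate space of a plane
 * that is subsampled by a factor of 2^shift. */
static float plane_offset(float ref_off, int shift)
{
    assert(shift >= 0 && shift < 31);
    return ref_off / (float)(1 << shift);
}
```

For a rect from 0.5 to 100.5, rounding both edges outward would give 0 to 101, an extent of 101 instead of 100. Snapping only the offset gives 0 to 100.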
-
Niklas Haas authored
This can happen if e.g. the shader's output image is rounded up relative to the rect actually being sampled, for example when users are sampling fractional parts of the image.
-
Niklas Haas authored
Across the board, "output sizes" and so forth are assumed to be integers, but a lot of the code generating them is based on floating point math. To make it clear what happens, make things consistent by using roundf() instead of implicit conversion (which may end up truncating etc.).
-
Niklas Haas authored
The order these are written in currently means the second check never has a chance to get printed, because SAMPLER_NOOP implies SAMPLER_DIRECT.
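The ordering problem can be shown with a hypothetical pair of flags where one implies the other. The `KIND_*` and `MSG_*` names below are invented for this sketch and do not mirror how the real sampler types are represented.

```c
#include <assert.h>

/* Hypothetical flags: every "noop" sampler is also "direct". */
enum { KIND_DIRECT = 1 << 0, KIND_NOOP = 1 << 1 };
enum msg { MSG_OTHER, MSG_DIRECT, MSG_NOOP };

/* Buggy order: the broader check comes first, so the narrower
 * one can never be reached for valid inputs. */
static enum msg classify_buggy(unsigned flags)
{
    assert(!(flags & KIND_NOOP) || (flags & KIND_DIRECT));
    if (flags & KIND_DIRECT)
        return MSG_DIRECT;
    if (flags & KIND_NOOP)
        return MSG_NOOP; /* unreachable while noop implies direct */
    return MSG_OTHER;
}

/* Fixed order: test the more specific condition first. */
static enum msg classify_fixed(unsigned flags)
{
    assert(!(flags & KIND_NOOP) || (flags & KIND_DIRECT));
    if (flags & KIND_NOOP)
        return MSG_NOOP;
    if (flags & KIND_DIRECT)
        return MSG_DIRECT;
    return MSG_OTHER;
}
```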
-
- May 20, 2020
-
-
Niklas Haas authored
In theory there are various ways we could reconcile this difference, but for now, the easiest thing to do is to simply disable it. I'll try working on a way to bring back support for this, but I wanted to get this fix out of the way first.
-
Niklas Haas authored
09834c52 introduced a regression here that caused an unnecessary FBO indirection to occur for the (relatively common) case of user shaders consulting the dimensions of OUTPUT in their RPN expressions. OUTPUT apparently behaves weirdly. mpv should probably document this better. Even better would be to have had different names for the OUTPUT hook stage and the OUTPUT magic constant for RPN exprs. But it's probably too late for that now. Unless we want to rename the OUTPUT stage, which we totally could do.
-
Niklas Haas authored
Also add a TODO for something I realized while staring at this code again.
-
Niklas Haas authored
This shader is very definitively non-resizable. Flag it as such.
-
- May 19, 2020
-
-
Niklas Haas authored
This is a bit annoying, and arguably such shaders are broken to begin with because there's no way you can possibly rely on this logic without spamming warnings. But it represents a difference in functionality compared to mpv, so we implement this just to be on the safe side.
-
Niklas Haas authored
This is `int` and not `size_t`, which may cause issues on platforms where the two are not the same size.
-
Niklas Haas authored
To make sure these messages see the light of day before the application gets a chance to crash.
-
Niklas Haas authored
Would have made debugging https://github.com/haasn/libplacebo/issues/74 easier.
-
Niklas Haas authored
These were never updated for 7ac80fb8
-
Niklas Haas authored
PL_VERSION can be a bit unhelpful for git commits
-
Niklas Haas authored
A simple oversight. Fixes e.g. Anime4K
-
Niklas Haas authored
Rather than tying struct img to the specifics of pass->cur_img, I wanted to loosen this up a bit and re-use it for the plane state. To this end, I parametrized one key function, `pass_hook`, by the actual `img` to use, and got rid of some redundant fields. This commit doesn't (or shouldn't) change the behavior yet, it's mostly to disentangle the behavior-changing bits of this refactor from the merely refactoring ones.
-
- May 18, 2020
-
-
Niklas Haas authored
Large slab allocations failing will end up segfaulting here
-
Niklas Haas authored
Still not entirely sure why I renamed it if I'm just gonna add a compatibility alias anyways. But whatever. It's starting to get to that point where I have to be at least slightly worried about breaking downstream.
-
Niklas Haas authored
pl_color_space_infer hard-codes this as BT.709, but we could/should be using pl_color_primaries_guess instead.
-
Niklas Haas authored
This refactor accomplishes a number of goals:
  - makes pl_shader_av1_grain sample directly from `tex` instead of relying on a pre-sampled tex, which allows it to correctly keep track of the texture scale even when sampling from the luma tex, and also gets rid of redundant fields
  - makes the grain_state per-plane rather than shared, which avoids thrashing the grain state cache due to the changing subsampling index
  - fixes the channel order when applying grain to non-YCbCr content
  - fixes some bugs related to grain scaling w.r.t. texture sample depth
Tested to produce correct output on a 4:2:2 10-bit grain sample.
-
Niklas Haas authored
There's still one bug but that requires a near-complete refactor to solve, coming up next.
-
Niklas Haas authored
Helps tracking down why this state might be getting thrashed.
-
Niklas Haas authored
Add a new `bool dynamic` to hint at the sh_lut that the LUT is going to be changed frequently. This makes it use host-writable textures rather than immutable textures, which may set different performance hints in the pl_gpu implementation.
-
Niklas Haas authored
Since we normalized before applying grain, the grain_params needs to contain the un-normalized repr.
-
Niklas Haas authored
Separated scaling of e.g. 10-bit chroma content blew out the chroma channel into bright pink due to the scale accidentally being applied for both passes of the pl_shader_sample_ortho.
-
- May 17, 2020
-
-
Niklas Haas authored
-
Niklas Haas authored
In practice the pre-grain texture is going to be a different pl_tex than the post-grain texture, so the "order" of operations does not really make sense in this context.
-
Niklas Haas authored
The pl_3dlut_default_params was undefined in this case, but the renderer still referenced it.
-
Based on ITU-R Report BT.2408, and general recommendations within the industry, the "SDR white level", i.e. the level at which to map SDR into HDR signals, is not 100 cd/m^2 but a value closer to 203 cd/m^2. For PQ signals this results in a relatively straightforward change of the code, but for HLG signals things get more complicated. For HLG, rather than targeting a fixed brightness in cd/m^2, the recommendation is to map SDR white levels to the 75% point in HLG, which calculates to a value of roughly 3.17955 in scene-referred space (where the nominal peak is at 12.0). To fit this into the libplacebo interpretation of these values (where 1.0 maps to the SDR white levels), we scale things down by this factor, giving rise to a new scene-referred signal range of 12.0/3.17955 = 3.774, and adjust the OOTF to compensate. This commit does put into question to what extent the default tone mapping settings need to be altered to account for this change in interpretation when going from HDR back to SDR. Also add tests to ensure this stuff round-trips. Bump the API version because, apart from the fact that we changed a public header, this is quite a drastic change in functionality.
-
Niklas Haas authored
This reverts commit 9c0d88fb. The xta_ref is not parented, it's attached to the ref it's counting. Should probably improve the documentation here or something.
-