vo_gpu: implement frame mixing
This is long overdue, given how trivial it turned out to be to actually implement and test. I'm embarrassed it took us this long to get this feature, which for several years was the only major roadblock to libplacebo replacing vo_gpu in mpv.
Note that I haven't really tested it on real data, just eyeballed the weights to make sure it seems to be doing the right thing.
In the process of testing, I finally sobered up and accepted that all of my idealized assumptions about the nature of temporal resampling and signal theory were quite impractical. I went back to the assumption of a ZOH (zero-order hold) world, in which frames aren't samples "centered" on themselves but instead follow the principle of least surprise by behaving the way they would on a real LCD: start at their timestamp, last until replaced.
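To make the ZOH model concrete, here is a minimal sketch of how per-frame weights could be derived for one output interval. It is not the code from this change: `zoh_weights` and its parameters are hypothetical names chosen for illustration, and it assumes frame timestamps are sorted in ascending order.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical helper, not part of libplacebo. Under the ZOH assumption,
// frame i is visible from pts[i] until pts[i + 1] replaces it. The weight
// of each frame is the fraction of the output interval
// [out_start, out_end) during which that frame is the visible one.
static void zoh_weights(const double *pts, size_t n,
                        double out_start, double out_end,
                        double *weights)
{
    assert(out_end > out_start);
    const double duration = out_end - out_start;

    for (size_t i = 0; i < n; i++) {
        // A frame starts at its own timestamp...
        double start = pts[i];
        // ...and lasts until replaced. The last known frame is simply
        // held until the end of the output interval.
        double end = (i + 1 < n) ? pts[i + 1] : out_end;

        // Clip the frame's visible span to the output interval.
        double lo = start > out_start ? start : out_start;
        double hi = end < out_end ? end : out_end;

        weights[i] = hi > lo ? (hi - lo) / duration : 0.0;
    }
}
```

For example, with frames at timestamps 0, 1 and 2 and an output interval of [0.75, 1.25), the first two frames each cover half of the interval and get a weight of 0.5, while the third gets 0.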
In particular, for the convolution-based frame mixers, we assume we're actually sampling the start of the frame, not the center of the frame. This is, again, to avoid surprising users about their frames being unexpectedly blurry in cases where their code is calibrated around a ZOH world.
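The following sketch illustrates the difference this convention makes for a convolution-based mixer. It is an assumption-laden illustration rather than the implementation from this change: `conv_weights` is a hypothetical name, and a simple triangle kernel stands in for a real filter such as Mitchell.

```c
#include <assert.h>
#include <stddef.h>

// Stand-in kernel for illustration only: triangle (linear), radius 1.
static double triangle_kernel(double x)
{
    if (x < 0.0)
        x = -x;
    return x < 1.0 ? 1.0 - x : 0.0;
}

// Hypothetical helper, not part of libplacebo. Computes normalized mixing
// weights for the output timestamp `target`. Returns the sum of the raw
// (unnormalized) weights.
static double conv_weights(const double *pts, size_t n, double target,
                           double frame_duration, double *weights)
{
    assert(frame_duration > 0.0);
    double sum = 0.0;

    for (size_t i = 0; i < n; i++) {
        // The distance is measured from the START of the frame (pts[i]),
        // not from its center (pts[i] + frame_duration / 2).
        double x = (pts[i] - target) / frame_duration;
        weights[i] = triangle_kernel(x);
        sum += weights[i];
    }

    // Normalize so that the weights sum to 1 whenever any frame contributes.
    if (sum > 0.0) {
        for (size_t i = 0; i < n; i++)
            weights[i] /= sum;
    }
    return sum;
}
```

With frames at 0, 1 and 2 and a frame duration of 1, a target of exactly 1.0 yields the weights 0, 1, 0: an output that lines up with a frame's timestamp shows that frame unblended. Had the kernel been centered on the middle of each frame instead, the same target would sit halfway between two frame centers and produce a 50/50 blend, which is the unexpected blur described above.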
I had originally anticipated having to write "helper" functions to make calculating the correct timestamps etc. easier for the user, but I think the current API design is simple enough that such helper functions are unnecessary.
Note that I also added a new pl_filter_mitchell_clamp to go along with it, since clamped mitchell is what mpv's tscale also defaults to. (Although I still think NULL is the better default)
Closes #13