core: rework vlc_ancillary API
Add a new API: vlc_ancillary_list, a vlc_list + refcount + mutex that hold all ancillaries.
The previous API: vlc_ancillary is now only used internally and not exported.
The main reason for this rework is to allow passing all ancillaries from block/frames to pictures and vice versa in a more convenient way. The user doesn't have to enumerate all ancillaries ID explicitly, and can now pass ancillaries that are unknown to the current module. Indeed, an ancillary created by a demux could be unknown to a packetizer but handled by a decoder module, for example.
Some decoder modules allow passing extra data (reorder_opaque
in ffmpeg, sourceFrameRefCon
in videotoolbox). Passing and holding a vlc_ancillary_list
might be more convenient than passing and holding the whole input data (vlc_frame_t
).
This is needed for the clock-context !5144 (closed) that will use ancillary data. An intermediate MR will be needed to actually pass ancillaries from all packetizers and codecs modules. Once this work is done, any extra data can be passed from the demux to the outpit.
Merge request reports
Activity
changed milestone to %4.0
added Component::Core label
34 35 vlc_ancillary_id id; 35 36 void *data; 36 37 vlc_ancillary_free_cb free_cb; 38 39 struct vlc_list node; As said in the MR message, the
vlc_ancillary
will be hidden and the user won't be able to refcount ancillaries individually.But I kept a refcount (done internally) for the Merge function. But choosing an array or a list won't change anything for that particular case. if we want to remove internal refcount, we need to expose a
dup
callback handled by the client...Well you end up increasing the size of the ancillary to handle a weird case.
I can remove all
Merge
functions and just replace it with aHold
one. Therefore, all modules will share the same list instance and we can remove their weird case.But in that case, all modules will need to get first the input ancillary list before adding new one (I think it's OK).
To be fair I don't mind the holding structure part, but I'm truly not a fan of using linked lists for that.
Also keeping one shared list means that you end up with an increasing list of possibly outdated memory for no reason.
I still think that everything would be solved and kept easier with an array of ancillaries as it is now.
To be fair I don't mind the holding structure part, but I'm truly not a fan of using linked lists for that.
Speaking about
vlc_ancillary_list
orvlc_ancillary
?Also keeping one shared list means that you end up with an increasing list of possibly outdated memory for no reason.
One ancillary list is kept for the demux -> packetizer -> decoder -> filters -> output. So that is only one input data. It won't be outdated ?
I still think that everything would be solved and kept easier with an array of ancillaries as it is now.
What does it solve ? The API can stay the same with a list or an array, no ?
Speaking about
vlc_ancillary_list
orvlc_ancillary
?vlc_ancillary_list
I could get behind the "allow passing all ancillaries from block/frames to pictures and vice versa in a more convenient way" even if I have mixed feelings about "blind" metadata copies since we lack actual examples of this.I just don't agree with the linked list over arrays and removing the possibility to hold only one particular ancillary.
They are meant to be consumed by one part of the chain, it's not useful to keep them afterward but they can be useful after the frame holding the list has been released and holding the whole list means keeping more useless stuff.
Edited by Denis CharmetAlso keeping only one list and growing it throughout the chain means that the original creator of the list could possibly see (or even worse modify) whatever is appended to it at later stages which could be a concern security-wise.
Consider this "attack" scenario:
- Create an
orig
frame/picture with a list - Create a new
dup
frame/picture - Call
vlc_frame_CopyProperties
/picture_CopyProperties
- Send
orig
to the rest of the chain - You can edit
orig
ancillaries by usingdup
I know that it's far fetched but it's "allowed" by the new API.
Edited by Denis Charmet- Create an
They are meant to be consumed by one part of the chain, it's not useful to keep them afterward but they can be useful after the frame holding the list has been released and holding the whole list means keeping more useless stuff.
For now, all modules that read ancillary, just copy the ancillary into an internal struct, and release it, and they do that while the output frame is held. So, with the current code base, there won't be any difference memory wise if we decide that we can hold only one particular ancillary.
Yes modules could improve memory usage (less memcpy) by using directly the pointer from the ancillary list, in that case, the whole list will stay allocated. But the whole list is very likely to stay allocated until the frame/picture is rendered and released anyway.
I decided to hold the list instead of duplicating an array in order to :
- less allocation check and fail path to handle
- Faster way to pass ancillaries from frame/block/pic (but I guess it's very negligible).
Also keeping only one list and growing it throughout the chain means that the original creator of the list could possibly see (or even worse modify) whatever is appended to it at later stages which could be a concern security-wise.
Generally, the original creator doesn't care about the ancillaries it created. All modules adding ancillaries are adding them and forgetting them (they have no use of it)
For now, all modules that read ancillary, just copy the ancillary into an internal struct, and release it
My issue is the "for now" because we are the only users of 4.0, I personally find the current API more resilient to module devs possible shenanigans. I might be paranoid but I'll always take safety over efficiency.
I decided to hold the list instead of duplicating an array in order to :
- less allocation check and fail path to handle
- Faster way to pass ancillaries from frame/block/pic (but I guess it's very negligible).
I understand why you did this, I'm not blaming you but I just cannot ignore how it can be abused.
Edited by Denis CharmetOn the other hand, one can argue that similar attack path could be used by just keeping the
vlc_frame_t
and ancillaries are meant to be point to point data...Maybe @Courmisch has opinions on that? Maybe I'm too paranoid.
On the other hand, one can argue that similar attack path could be used by just keeping the
vlc_frame_t
and ancillaries are meant to be point to point data...If an intermediate module with bad intention is inserted, he could just hack output frames/pictures instead of ancillaries.
Also keeping only one list and growing it throughout the chain means that the original creator of the list could possibly see (or even worse modify) whatever is appended to it at later stages which could be a concern security-wise.
Just to be clear, the thread model here implies the module developer doing incorrect things by modifying which would open the path for a vulnerability?
That's one of my concerns, but as I've admitted it too, there are more obvious attack path...
TBH I prefer the curent refcounted ancillary model in array that can be held outside of the list (for later use) but I won't fight too hard if everybody else thinks that it's better.
Edited by Denis CharmetI do not understand the concept of a list with a reference count and a lock. It is not even a question of safety or idiot-proofing.
Input meta effectively work that way too, and that is really a half-assed broken design, to be honest. If you have more than one reference, the dataset is logically shared read-only. The lock may ensure memory safety, but it won't ensure a sound race-free design.
In fact, I have more of a problem with the lock than with the reference count.
OK, I will:
- Remove the list lock
- Use an array
- Duplicate the whole array each times ancillaries are passed within a component
- Keep single ancillary refcounting
The final goal will stay the same: One liners functions to pass (by alloc+copy) all ancillaries from a frame to a picture and vice versa.
added MRStatus::InReview label
161 169 return NULL; 162 170 } 163 171 172 struct vlc_ancillary_list * 173 vlc_ancillary_list_Merge(struct vlc_ancillary_list *dst, 174 struct vlc_ancillary_list *src) - Comment on lines +173 to +174
while this tries to cover the "one item in several list case" it doesn't cover the case where a module would keep its own list using
vlc_ancillary_list_Get
andvlc_ancillary_Attach
because they don't want the whole list.Edited by Denis Charmet My point is that it might want to keep a local list of ancillaries with just the ones it needs so it creates a
vlc_ancillary_list
withvlc_ancillary_list_Create()
, get one ancillary from the frame usingvlc_ancillary_list_Get()
and attach it to its newly created list withvlc_ancillary_list_Attach()
suddenly the attached ancillary is not in its original list anymore.
- Resolved by Denis Charmet
171 171 * 172 172 * @param list ancillary list 173 173 * @param id id of ancillary to request 174 * @return the ancillary or NULL if the ancillary for that particular id is 175 * not present 174 * @return the ancillary data or NULL if the ancillary for that particular id 175 * is not present 176 176 */ 177 VLC_API struct vlc_ancillary * 178 vlc_ancillary_list_Get(struct vlc_ancillary_list *list, vlc_ancillary_id id); 177 VLC_API void * 178 vlc_ancillary_list_GetData(struct vlc_ancillary_list *list, vlc_ancillary_id id); For the record, if the goal is to pass the clock id as an ancillary (as I suggested and I stand by it) and the goal is to make sure it's passed to the output, it's probably better to attach it after packetization in the core to skip this hassle of handling it inside packetizers since packetizers handling unknown data is a bit tricky.
It could also be consumed before filtering to prevent the hassle of transmission throughout filters.
Edited by Denis Charmetit's probably better to attach it after packetization
How do you do that ?
Only the demux/es_out.c will known if a new clock context should be attached, then the data is sent to
vlc_input_decoder
that handle packetizer + decoder.It could also be consumed before filtering to prevent the hassle of transmission throughout filters.
Yes but only filters know how to match input frames with output frames.
How do you do that ?
To be exact and assuming that it's sent only when it changes because it's not supposed to happen often. I'd extract the clock id and hold it before the call to
pf_packetize
and add it back when the packetizer output the first frame since the extraction if the output frame doesn't have a clock id.If it changes during packetization without any output then you can probably assume that it was dropped and use the latest.
Yes but only filters know how to match input frames with output frames.
Fair enough, I was assuming that since at the filter level everything is in decoded order so it can be matched using time but clock id could intersect...
161 169 return NULL; 162 170 } 163 171 172 struct vlc_ancillary_list * 173 vlc_ancillary_list_Merge(struct vlc_ancillary_list *dst, 174 struct vlc_ancillary_list *src) 175 { 176 /* Use the destination (that can be NULL) since the source is empty */ 177 if (src == NULL) 178 return dst; 179 180 /* Use the source directly since there is nothing to merge */ 181 if (dst == NULL) 182 return vlc_ancillary_list_Hold(src); - Comment on lines +180 to +182
I'm not a fan of this since anything that will be attached to the
dst
afterward will be attached tosrc
too. It should be probably create its own list and copy the content.Edited by Denis Charmet
65 65 66 66 void vlc_frame_CopyProperties(vlc_frame_t *restrict dst, const vlc_frame_t *src) 67 67 { 68 vlc_ancillary_array_Dup(&dst->priv_ancillaries, 69 &src->priv_ancillaries); 68 dst->ancillaries = src->ancillaries == NULL ? NULL 69 : vlc_ancillary_list_Hold(src->ancillaries); It's an issue if the ancillary is modified.
It is possible API wise, but there will be 0 cases for that. It's easy to grep for all ancillaries ID, and use an unique one when you implement it.
For now, you can overwrite a specific ancillary if you use the same ID but I think it's a bad idea, we should keep the original and reject the new one.
It is possible API wise, but there will be 0 cases for that
0 legit cases maybe this is why I called it an "attack" above.
For now, you can overwrite a specific ancillary if you use the same ID but I think it's a bad idea, we should keep the original and reject the new one.
I think that on the contrary modules should be able to edit the ancillary if they detect that it's wrong or incomplete or even remove it if it was only intended for them (for example cipher info, I know that we don't have one for now but it's a possibility)
478 484 479 485 const picture_priv_t *priv = container_of(picture, picture_priv_t, picture); 480 486 picture_priv_t *clone_priv = container_of(clone, picture_priv_t, picture); 481 vlc_ancillary_array_Dup(&clone_priv->ancillaries, &priv->ancillaries); 487 clone_priv->ancillaries = priv->ancillaries == NULL ? NULL 488 : vlc_ancillary_list_Hold(priv->ancillaries); added MRStatus::NotCompliant label and removed MRStatus::InReview label
mentioned in merge request !5630 (merged)
mentioned in merge request !7072