Skip to content
Snippets Groups Projects

core: rework vlc_ancillary API

Closed Thomas Guillem requested to merge tguillem/vlc:ancillary_list into master
7 unresolved threads

Add a new API: vlc_ancillary_list, a vlc_list + refcount + mutex that hold all ancillaries.

The previous API: vlc_ancillary is now only used internally and not exported.

The main reason for this rework is to allow passing all ancillaries from block/frames to pictures and vice versa in a more convenient way. The user doesn't have to enumerate all ancillaries ID explicitly, and can now pass ancillaries that are unknown to the current module. Indeed, an ancillary created by a demux could be unknown to a packetizer but handled by a decoder module, for example.

Some decoder modules allow passing extra data (reorder_opaque in ffmpeg, sourceFrameRefCon in videotoolbox). Passing and holding a vlc_ancillary_list might be more convenient than passing and holding the whole input data (vlc_frame_t).

This is needed for the clock-context !5144 (closed) that will use ancillary data. An intermediate MR will be needed to actually pass ancillaries from all packetizers and codecs modules. Once this work is done, any extra data can be passed from the demux to the outpit.

Merge request reports

Loading
Loading

Activity

Filter activity
  • Approvals
  • Assignees & reviewers
  • Comments (from bots)
  • Comments (from users)
  • Commits & branches
  • Edits
  • Labels
  • Lock status
  • Mentions
  • Merge request status
  • Tracking
34 35 vlc_ancillary_id id;
35 36 void *data;
36 37 vlc_ancillary_free_cb free_cb;
38
39 struct vlc_list node;
  • refcounted objects and linked list are usually not a good mix... If an ancillary that was already in a list gets attached to another one then it's lost from the original list and it's likely to be bigger than an array of pointer so I think that the list should stay an array.

  • Author Maintainer

    As said in the MR message, the vlc_ancillary will be hidden and the user won't be able to refcount ancillaries individually.

    But I kept a refcount (done internally) for the Merge function. But choosing an array or a list won't change anything for that particular case. if we want to remove internal refcount, we need to expose a dup callback handled by the client...

  • Well you end up increasing the size of the ancillary to handle a weird case. Also preventing "Hold()/Release()" refcounting from the module using the data causes a risk of use after free if nothing is holding it.

  • Author Maintainer

    Well you end up increasing the size of the ancillary to handle a weird case.

    I can remove all Merge functions and just replace it with a Hold one. Therefore, all modules will share the same list instance and we can remove their weird case.

    But in that case, all modules will need to get first the input ancillary list before adding new one (I think it's OK).

  • To be fair I don't mind the holding structure part, but I'm truly not a fan of using linked lists for that.

    Also keeping one shared list means that you end up with an increasing list of possibly outdated memory for no reason.

    I still think that everything would be solved and kept easier with an array of ancillaries as it is now.

  • Author Maintainer

    To be fair I don't mind the holding structure part, but I'm truly not a fan of using linked lists for that.

    Speaking about vlc_ancillary_list or vlc_ancillary ?

    Also keeping one shared list means that you end up with an increasing list of possibly outdated memory for no reason.

    One ancillary list is kept for the demux -> packetizer -> decoder -> filters -> output. So that is only one input data. It won't be outdated ?

    I still think that everything would be solved and kept easier with an array of ancillaries as it is now.

    What does it solve ? The API can stay the same with a list or an array, no ?

  • Speaking about vlc_ancillary_list or vlc_ancillary ?

    vlc_ancillary_list I could get behind the "allow passing all ancillaries from block/frames to pictures and vice versa in a more convenient way" even if I have mixed feelings about "blind" metadata copies since we lack actual examples of this.

    I just don't agree with the linked list over arrays and removing the possibility to hold only one particular ancillary.

    They are meant to be consumed by one part of the chain, it's not useful to keep them afterward but they can be useful after the frame holding the list has been released and holding the whole list means keeping more useless stuff.

    Edited by Denis Charmet
  • Also keeping only one list and growing it throughout the chain means that the original creator of the list could possibly see (or even worse modify) whatever is appended to it at later stages which could be a concern security-wise.

    Consider this "attack" scenario:

    • Create an orig frame/picture with a list
    • Create a new dup frame/picture
    • Call vlc_frame_CopyProperties/picture_CopyProperties
    • Send orig to the rest of the chain
    • You can edit orig ancillaries by using dup

    I know that it's far fetched but it's "allowed" by the new API.

    Edited by Denis Charmet
  • Author Maintainer

    They are meant to be consumed by one part of the chain, it's not useful to keep them afterward but they can be useful after the frame holding the list has been released and holding the whole list means keeping more useless stuff.

    For now, all modules that read ancillary, just copy the ancillary into an internal struct, and release it, and they do that while the output frame is held. So, with the current code base, there won't be any difference memory wise if we decide that we can hold only one particular ancillary.

    Yes modules could improve memory usage (less memcpy) by using directly the pointer from the ancillary list, in that case, the whole list will stay allocated. But the whole list is very likely to stay allocated until the frame/picture is rendered and released anyway.

    I decided to hold the list instead of duplicating an array in order to :

    • less allocation check and fail path to handle
    • Faster way to pass ancillaries from frame/block/pic (but I guess it's very negligible).
  • Author Maintainer

    Also keeping only one list and growing it throughout the chain means that the original creator of the list could possibly see (or even worse modify) whatever is appended to it at later stages which could be a concern security-wise.

    Generally, the original creator doesn't care about the ancillaries it created. All modules adding ancillaries are adding them and forgetting them (they have no use of it)

  • For now, all modules that read ancillary, just copy the ancillary into an internal struct, and release it

    My issue is the "for now" because we are the only users of 4.0, I personally find the current API more resilient to module devs possible shenanigans. I might be paranoid but I'll always take safety over efficiency.

    I decided to hold the list instead of duplicating an array in order to :

    • less allocation check and fail path to handle
    • Faster way to pass ancillaries from frame/block/pic (but I guess it's very negligible).

    I understand why you did this, I'm not blaming you but I just cannot ignore how it can be abused.

    Edited by Denis Charmet
  • On the other hand, one can argue that similar attack path could be used by just keeping the vlc_frame_t and ancillaries are meant to be point to point data...

    Maybe @Courmisch has opinions on that? Maybe I'm too paranoid.

  • Author Maintainer

    On the other hand, one can argue that similar attack path could be used by just keeping the vlc_frame_t and ancillaries are meant to be point to point data...

    If an intermediate module with bad intention is inserted, he could just hack output frames/pictures instead of ancillaries.

  • Also keeping only one list and growing it throughout the chain means that the original creator of the list could possibly see (or even worse modify) whatever is appended to it at later stages which could be a concern security-wise.

    Just to be clear, the thread model here implies the module developer doing incorrect things by modifying which would open the path for a vulnerability?

  • That's one of my concerns, but as I've admitted it too, there are more obvious attack path...

    TBH I prefer the curent refcounted ancillary model in array that can be held outside of the list (for later use) but I won't fight too hard if everybody else thinks that it's better.

    Edited by Denis Charmet
  • I do not understand the concept of a list with a reference count and a lock. It is not even a question of safety or idiot-proofing.

    Input meta effectively work that way too, and that is really a half-assed broken design, to be honest. If you have more than one reference, the dataset is logically shared read-only. The lock may ensure memory safety, but it won't ensure a sound race-free design.

    In fact, I have more of a problem with the lock than with the reference count.

  • Author Maintainer

    OK, I will:

    • Remove the list lock
    • Use an array
    • Duplicate the whole array each times ancillaries are passed within a component
    • Keep single ancillary refcounting

    The final goal will stay the same: One liners functions to pass (by alloc+copy) all ancillaries from a frame to a picture and vice versa.

  • Please register or sign in to reply
  • Denis Charmet
    Denis Charmet @typx started a thread on commit 9f488a14
  • 161 169 return NULL;
    162 170 }
    163 171
    172 struct vlc_ancillary_list *
    173 vlc_ancillary_list_Merge(struct vlc_ancillary_list *dst,
    174 struct vlc_ancillary_list *src)
    • Comment on lines +173 to +174

      while this tries to cover the "one item in several list case" it doesn't cover the case where a module would keep its own list using vlc_ancillary_list_Get and vlc_ancillary_Attach because they don't want the whole list.

      Edited by Denis Charmet
    • Author Maintainer

      Why a module would prevent next modules in the output chain to get their ancillaries ?

    • My point is that it might want to keep a local list of ancillaries with just the ones it needs so it creates a vlc_ancillary_list with vlc_ancillary_list_Create(), get one ancillary from the frame using vlc_ancillary_list_Get() and attach it to its newly created list with vlc_ancillary_list_Attach() suddenly the attached ancillary is not in its original list anymore.

    • Sure you remove it later but I don't really think that you should.

    • Please register or sign in to reply
  • Denis Charmet
  • Denis Charmet
    Denis Charmet @typx started a thread on commit 573b9b4e
  • 171 171 *
    172 172 * @param list ancillary list
    173 173 * @param id id of ancillary to request
    174 * @return the ancillary or NULL if the ancillary for that particular id is
    175 * not present
    174 * @return the ancillary data or NULL if the ancillary for that particular id
    175 * is not present
    176 176 */
    177 VLC_API struct vlc_ancillary *
    178 vlc_ancillary_list_Get(struct vlc_ancillary_list *list, vlc_ancillary_id id);
    177 VLC_API void *
    178 vlc_ancillary_list_GetData(struct vlc_ancillary_list *list, vlc_ancillary_id id);
    • Comment on lines -178 to +178

      Getting the ancillary allow you to Hold() it and make sure that the data will stay valid even after the frame or the list has been released. Sure you could hold the whole list but then why keep everything when you only need one part.

    • Please register or sign in to reply
    • For the record, if the goal is to pass the clock id as an ancillary (as I suggested and I stand by it) and the goal is to make sure it's passed to the output, it's probably better to attach it after packetization in the core to skip this hassle of handling it inside packetizers since packetizers handling unknown data is a bit tricky.

      It could also be consumed before filtering to prevent the hassle of transmission throughout filters.

      Edited by Denis Charmet
    • Author Maintainer

      it's probably better to attach it after packetization

      How do you do that ?

      Only the demux/es_out.c will known if a new clock context should be attached, then the data is sent to vlc_input_decoder that handle packetizer + decoder.

      It could also be consumed before filtering to prevent the hassle of transmission throughout filters.

      Yes but only filters know how to match input frames with output frames.

    • How do you do that ?

      To be exact and assuming that it's sent only when it changes because it's not supposed to happen often. I'd extract the clock id and hold it before the call to pf_packetize and add it back when the packetizer output the first frame since the extraction if the output frame doesn't have a clock id.

      If it changes during packetization without any output then you can probably assume that it was dropped and use the latest.

      Yes but only filters know how to match input frames with output frames.

      Fair enough, I was assuming that since at the filter level everything is in decoded order so it can be matched using time but clock id could intersect...

    • Please register or sign in to reply
  • Denis Charmet
    Denis Charmet @typx started a thread on commit 9f488a14
  • 161 169 return NULL;
    162 170 }
    163 171
    172 struct vlc_ancillary_list *
    173 vlc_ancillary_list_Merge(struct vlc_ancillary_list *dst,
    174 struct vlc_ancillary_list *src)
    175 {
    176 /* Use the destination (that can be NULL) since the source is empty */
    177 if (src == NULL)
    178 return dst;
    179
    180 /* Use the source directly since there is nothing to merge */
    181 if (dst == NULL)
    182 return vlc_ancillary_list_Hold(src);
    • Comment on lines +180 to +182

      I'm not a fan of this since anything that will be attached to the dst afterward will be attached to src too. It should be probably create its own list and copy the content.

      Edited by Denis Charmet
    • Please register or sign in to reply
  • Denis Charmet
    Denis Charmet @typx started a thread on commit a6514068
  • 65 65
    66 66 void vlc_frame_CopyProperties(vlc_frame_t *restrict dst, const vlc_frame_t *src)
    67 67 {
    68 vlc_ancillary_array_Dup(&dst->priv_ancillaries,
    69 &src->priv_ancillaries);
    68 dst->ancillaries = src->ancillaries == NULL ? NULL
    69 : vlc_ancillary_list_Hold(src->ancillaries);
    • Comment on lines +68 to +69

      For the same reasons than in Merge this should be a duplicate. Imagine you duplicate a frame and edit ancillary, the original one is also edited.

    • Author Maintainer

      Imagine you duplicate a frame and edit ancillary, the original one is also edited.

      I don't think it is an issue. The original list will have one more ancillaries that won't be used by the original creator.

    • It's an issue if the ancillary is modified.

    • Author Maintainer

      It's an issue if the ancillary is modified.

      It is possible API wise, but there will be 0 cases for that. It's easy to grep for all ancillaries ID, and use an unique one when you implement it.

      For now, you can overwrite a specific ancillary if you use the same ID but I think it's a bad idea, we should keep the original and reject the new one.

    • It is possible API wise, but there will be 0 cases for that

      0 legit cases maybe this is why I called it an "attack" above.

      For now, you can overwrite a specific ancillary if you use the same ID but I think it's a bad idea, we should keep the original and reject the new one.

      I think that on the contrary modules should be able to edit the ancillary if they detect that it's wrong or incomplete or even remove it if it was only intended for them (for example cipher info, I know that we don't have one for now but it's a possibility)

    • Please register or sign in to reply
  • Denis Charmet
    Denis Charmet @typx started a thread on commit 159035a5
  • 478 484
    479 485 const picture_priv_t *priv = container_of(picture, picture_priv_t, picture);
    480 486 picture_priv_t *clone_priv = container_of(clone, picture_priv_t, picture);
    481 vlc_ancillary_array_Dup(&clone_priv->ancillaries, &priv->ancillaries);
    487 clone_priv->ancillaries = priv->ancillaries == NULL ? NULL
    488 : vlc_ancillary_list_Hold(priv->ancillaries);
  • Thomas Guillem mentioned in merge request !5630 (merged)

    mentioned in merge request !5630 (merged)

  • Thomas Guillem mentioned in merge request !7072

    mentioned in merge request !7072

  • Please register or sign in to reply
    Loading