Run preparser tasks in a separate process

(assigned to me but currently work in progress from Tanguy)

There are many highly desirable aspects that require a separate process. For most of them, though, it is a necessary but insufficient prerequisite, and they probably deserve tickets of their own:

Resource constraints (RAM, CPU time) and de-prioritization (CPU and I/O priorities) will require explicit setting, e.g. setrlimit().
Timing out an insanely slow and potentially dead-locked preparser requires the parent process to actually kill() the child, or setting alarm() (the latter is probably not portable).
Security would require proper sand-boxing, which is a whole order of magnitude more complex and very much not portable. The one feature that works out-of-the-box is not crashing VLC when the preparser crashes - or not leaking VLC memory when the preparser leaks, so at least there's something to look forward to.
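To make the first two points concrete, here is a minimal POSIX-only sketch of a parent that caps the child's address space and CPU time with setrlimit(), lowers its CPU priority, and kill()s it when a wall-clock timeout expires. The names `cap_limit` and `run_limited` are hypothetical helpers for illustration, not VLC APIs, and the polling loop is only the simplest way to wait with a deadline; I/O priority and sand-boxing are not covered.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: lower both soft and hard limits to `value`,
 * clamped to the current hard limit so the call never needs privileges. */
static int cap_limit(int resource, rlim_t value)
{
    struct rlimit lim;

    if (getrlimit(resource, &lim) == -1)
        return -1;
    if (value > lim.rlim_max)
        value = lim.rlim_max;
    lim.rlim_cur = lim.rlim_max = value;
    return setrlimit(resource, &lim);
}

/* Hypothetical helper: run argv[0] in a child with resource caps and kill
 * it after timeout_ms. Returns the wait status, or -1 on error.
 * *timed_out is set to 1 if the parent had to kill the child. */
static int run_limited(char *const argv[], rlim_t max_bytes,
                       rlim_t max_cpu_s, int timeout_ms, int *timed_out)
{
    assert(argv != NULL && argv[0] != NULL && timed_out != NULL);
    *timed_out = 0;

    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Child: refuse to run at all if the caps cannot be applied. */
        if (cap_limit(RLIMIT_AS, max_bytes) == -1
         || cap_limit(RLIMIT_CPU, max_cpu_s) == -1)
            _exit(126);
        (void) setpriority(PRIO_PROCESS, 0, 10); /* best effort */
        execvp(argv[0], argv);
        _exit(127); /* exec failed */
    }

    /* Parent: poll every 10 ms until the child exits or the deadline hits. */
    const struct timespec step = { 0, 10 * 1000 * 1000 };
    int status;

    for (int waited = 0;; waited += 10) {
        pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid)
            return status;
        if (r == -1 && errno != EINTR)
            return -1;
        if (waited >= timeout_ms)
            break;
        nanosleep(&step, NULL);
    }

    *timed_out = 1;
    kill(pid, SIGKILL);
    while (waitpid(pid, &status, 0) == -1 && errno == EINTR)
        ; /* reap the child so it does not linger as a zombie */
    return status;
}
```

Calling `run_limited()` with an argv such as `{ "sleep", "5", NULL }` and a 100 ms timeout should report `*timed_out == 1` and a status for which `WIFSIGNALED()` is true. Keeping the timeout in the parent means it does not rely on the child cooperating, unlike alarm().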

This may also effectively require better optimisation of libvlc_new() in general, and the module bank in particular. It probably should not scan the plug-in directories every time a new process is spawned.

Thank you for your feedback on this! I agree on the three points you mentioned for the future, though I suggest creating them in a “sandbox” milestone context.

Tanguy's long term objective is to work on some aspect of a future VLC sandbox, but, to add to your thoughts:

Like you mentioned, it's an order of magnitude more complex. In my opinion, it cannot be done without a meeting with the TC and a clearer plan of action.
Not crashing VLC is the must-have feature here, but in addition it will be an initial step to stimulate discussion on the design and to highlight platform issues.
Hugo rightfully asked whether it would be transparent for libvlc clients. Considering platform requirements, this is mostly about whether we support this feature through a secondary libexec executable, through an intrusive platform integration, or decide not to support it at all, which also means it must account for platforms on which you might not be able to exec in the long term. I'm not sure such platforms technically exist, but I'm pretty sure that exec'ing a different executable is typically difficult to get validated for an app on an app store, so this needs to be checked.
In my multiprocess VLC experiment, I didn't have many issues with libvlc re-scanning, the main cost being the loading of shared objects, which is already done or unneeded with the module bank. I think the optimization is needed in the long run but optional in the context of this ticket.
Updates might need more care, since an update could change the executable whether we fork and exec or directly spawn a process with a different executable, which could break the IPC between the already running process and the newer process.
Debugging and crash tracing might get more difficult; we probably want to bring answers here.
For cancellation, we probably want a dedicated thread on the preparsing side to handle the IPC with the main client anyway, so it might not be that different from the current situation. The process could get stuck for other reasons though and it's a bit harder to handle.

Regarding the effective implementation of this work, I suggest the following merging plan:

we wait until Romain's work on vlc_executor is merged
we move preparser to a module capability so as to implement the thread fallback and the platform modules in different plugins
we implement the POSIX variant, which is a bit more “natural” to write with the current code
we implement the Windows variant, which might lead to a different design than the POSIX variant and to better changes in the underlying system.
we potentially move some platform code into libvlccore for the process handling.

Does this seem reasonable enough?

Replying to [comment:3 Alexandre Janniaux]:

Hugo rightfully asked whether it would be transparent for libvlc clients.

I don't see why or how it would not be transparent at the API level. But obviously if somebody embeds LibVLC in their app and doesn't ship the background executable(s), then it won't work and we'll have to fall back to unsafe threads or failure. Likewise, if the target platform does not allow separate binaries, you are hosed. I mean, I don't think we really have a choice here.

The only choice that we seem to have is whether this is a legitimate stand-alone CLI tool in $bindir that can output in a human-readable format (based on a CLI flag), or only a back-end tool in $libexecdir like the cache generator.

In my multiprocess VLC experiment, I didn't have many issues with libvlc re-scanning, the main cost being the loading of shared objects, which is already done or unneeded with the module bank. I think the optimization is needed in the long run but optional in the context of this ticket.

As I said, it annoys me that optical media access plugins hijack all file inputs (but I don't have any smart idea to avoid them). I don't think that we can avoid loading and probing demuxers.

Debugging and crash tracing might get more difficult, we probably want to bring answers here.

That's another motivation for having a stand-alone executable rather than some obscure thing that can't easily be run by hand.

For cancellation, we probably want a dedicated thread on the preparsing side to handle the IPC with the main client anyway, so it might not be that different from the current situation. The process could get stuck for other reasons though and it's a bit harder to handle.

If you only want to protect against crashes, then you just need a thread to call popen(..., "re") then read, then pclose(). This also works to handle time-outs, if we trust the child process to call alarm(). But for cancellation, you need to get hold of the PID, so popen() is out then.
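As a rough illustration of the crash-protection-only case, the sketch below runs a command through popen(), reads its output, and uses the pclose() status to tell a clean exit from a crash. `run_and_capture` is a hypothetical name, not a VLC function. The mode is plain "r" here for portability; the "e" (close-on-exec) flag in "re" is a libc extension, so treat its availability as an assumption about the target platform.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/wait.h>

/* Hypothetical helper: run `cmd` via popen(), copy up to out_size - 1 bytes
 * of its standard output into `out` (always NUL-terminated), and return 0
 * only if the child exited normally with status 0. A crash or a non-zero
 * exit in the child yields -1 without taking the caller down. */
static int run_and_capture(const char *cmd, char *out, size_t out_size)
{
    assert(cmd != NULL && out != NULL && out_size > 0);
    out[0] = '\0';

    FILE *p = popen(cmd, "r");
    if (p == NULL)
        return -1;

    size_t len = fread(out, 1, out_size - 1, p);
    out[len] = '\0';

    /* Drain whatever did not fit so the child is not blocked on a full pipe. */
    char scratch[256];
    while (fread(scratch, 1, sizeof scratch, p) > 0)
        ;

    /* pclose() waits for the child and returns its wait status. */
    int status = pclose(p);
    if (status == -1 || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return 0;
}
```

Note that popen() never exposes the child's PID, which is exactly why this approach cannot support cancellation: there is nothing to pass to kill(). A cancellable variant would need fork()/exec or posix_spawn() plus an explicit pipe instead.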

mentioned in issue #25301

added Component::Core label

assigned to @chouquette and unassigned @alexandre-janniaux

mentioned in merge request !548

Is it for VLC 4.0? I do think it's needed.

I agree that it's unnecessary - if media library is disabled by default and in official VideoLAN builds.
