VTT subtitle files with unicode characters in their fileNAMEs are not autorecognized on mac APFS
Specs
- VLC: 3.0.12 Vetinari (Intel 64bit)
- macOS: Big Sur 11.2.3
- System: Intel MacBook Pro
I recently bought a very performant used Intel-based MacBook Pro, immediately wiped the internal SSD and formatted it with APFS (NOT case-sensitive) per Apple's recommendations, then put a fresh install of Big Sur on it. I followed with an install of the latest Mac version of VLC.
I maintain an ongoing, huge library of FRENCH videos with matching external vtt subtitle files that now spans multiple drives and systems.
On my old MacBook Pro (aka mbp) running High Sierra, I had an older VLC version installed, and when I played a French video, the matching vtt file with the same exact name (except for the extension) would automatically be recognized and start appearing with the playback. It was fantastic! I want the unicode chars in the filenames! That's how I download them from the source.
However, since the move to my new mbp (new OS, new VLC, new SSD with APFS), it appears that VLC does NOT autorecognize any vtt fileNAMES (the file name itself) that have unicode characters in them!!! My entire library of mp4 / vtt filenames includes unicode chars (like é, è, ç, ï, etc.), so VLC auto-recognizing unicode filenames needs to work.
Oddly enough, vtt filenames that do NOT have unicode chars are still autorecognized by VLC during playback of the corresponding mp4 file with the same name!
When there are unicode chars in the names of the vtt and its matching video file, the subtitles still work if I drag the vtt file onto the playing video in VLC!
So to be clear - this is NOT a problem with the encoding of the content of the vtt files themselves. That works fine. The subtitles, once loaded, play as they should. It's simply a question of the video starting but not automatically finding and loading its associated, same-named vtt file when the filenames contain unicode chars.
CONCRETE EXAMPLES
unicode chars in filenames : vtt file NOT autorecognized though it's the same exact name
La laïcité française.mp4
La laïcité française.fr.vtt
NO unicode chars in filenames : vtt file autorecognized as it should be since they're the same exact name
Les avions du bout du monde.960
Les avions du bout du monde.fr.vtt
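One possible way two names can look identical and still fail a "same exact name" comparison: Unicode lets "ï" be stored either as one precomposed code point (NFC) or as "i" followed by a combining diaeresis (NFD). Whether that is what is happening here is unconfirmed. The Python sketch below only illustrates the idea using the example name above; the function name `same_stem` is made up for illustration and is not how VLC actually matches subtitle files.

```python
import unicodedata


def same_stem(video_stem: str, subtitle_stem: str, normalize: bool) -> bool:
    """Compare two filename stems, optionally after NFC normalization.

    Hypothetical helper for illustration only, not VLC's matching logic.
    """
    if normalize:
        video_stem = unicodedata.normalize("NFC", video_stem)
        subtitle_stem = unicodedata.normalize("NFC", subtitle_stem)
    return video_stem == subtitle_stem


name = "La laïcité française"
composed = unicodedata.normalize("NFC", name)    # accented letters as single code points
decomposed = unicodedata.normalize("NFD", name)  # base letter + combining mark

# Both strings render the same, but they differ in length and compare unequal.
print(len(composed), len(decomposed))
print(same_stem(composed, decomposed, normalize=False))
print(same_stem(composed, decomposed, normalize=True))
```

If something like this is the cause, a name without accented characters (such as the second example) would be unaffected, because plain ASCII is identical in both forms.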
My troubleshooting indicates that this is either
- an overall VLC mac configuration locale setting that relates to the file NAMES ONLY
- or an APFS file system or some other unicode-based macOS system issue
Hopefully it's easy to fix, but I fear it may be the latter (an APFS issue), though I'm surprised nobody has run into this before.
Why do i think that?
- my current VLC autorecognizes vtt files from the video library that lives on an external hard disk formatted as NTFS (NOT APFS!)
  - ie old data works with new system and new vlc
- my old mbp and old version of VLC autorecognized these same vtt files, which were originally downloaded to its internal hard disk, formatted with the Mac OS journaled file system
  - ie old data works with old system and old vlc
- if I copy any of my newly downloaded problem video / vtt file pairs from my NEW mbp with APFS to the older external hard drive that is formatted as NTFS, the current vlc on the new mbp autorecognizes the unicode vtt files!!!
  - ie the new data needs to "live on the old / different system" (the external hd) to work..
  - ie the new data is becoming "old data" by the xfer ;)
- BUT if I THEN copy those now-working unicode vtt files from the external NTFS drive (previous step) back to the new mbp's internal APFS drive, the new VLC now autorecognizes the vtt files on APFS that it did NOT recognize before they were copied off to NTFS and back again!!!
  - ie once new data becomes "old data" and is copied back to new system, it stays "old data" and works ;)
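To compare a "new" file with its "old" round-tripped copy, it may help to look at the code points actually stored in each filename rather than at how they render. The script below is a minimal sketch under the assumption (not verified on this system) that the two copies differ in Unicode normalization form. It prints each name with non-ASCII characters escaped, so a precomposed "ï" shows up as `\xef` while a decomposed one shows up as `i\u0308`. The names `normalization_form` and `report` are made up for this example.

```python
import os
import sys
import unicodedata


def normalization_form(name: str) -> str:
    """Classify a filename as 'ascii', 'nfc', 'nfd', 'nfc+nfd', or 'mixed'."""
    if name.isascii():
        return "ascii"
    is_nfc = unicodedata.normalize("NFC", name) == name
    is_nfd = unicodedata.normalize("NFD", name) == name
    if is_nfc and is_nfd:
        return "nfc+nfd"  # non-ASCII, but nothing in it can be (de)composed
    if is_nfc:
        return "nfc"
    if is_nfd:
        return "nfd"
    return "mixed"  # some characters composed, others decomposed


def report(directory: str) -> None:
    """Print the normalization form and escaped code points of each entry."""
    for name in sorted(os.listdir(directory)):
        # ascii() escapes non-ASCII characters, exposing the stored code points.
        print(f"{normalization_form(name):8} {ascii(name)}")


if __name__ == "__main__":
    # Usage: python3 check_names.py /path/to/folder  (defaults to current folder)
    report(sys.argv[1] if len(sys.argv) > 1 else ".")
```

Running it once on a folder with a freshly downloaded pair and once on a folder with a pair that went to the NTFS drive and back would show whether the video and vtt names in the failing pair are stored differently from each other.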
So there's a pattern of things working when tied to the "old data" here. There seems to be something about the way my new video fileNAMEs are encoded into the internal SSD's APFS filesystem when I download them now as new additions to the library.
In sum, it appears that there is a disconnect between the way the APFS filesystem (or the Mac itself) represents unicode chars and the way VLC expects to interpret those same unicode chars! That's the only thing that makes sense at this point based on my experiments.
Note that my terminal / bash and the Finder render the unicode characters in these filenames correctly regardless of what drive or file system they are on! So they must be set to the same encoding used when creating the files initially. So I gather VLC must not be in sync with whatever the filename unicode encoding is on my Mac or my internal APFS file system.
Please tell me there is a simple local setting for filenames somewhere in VLC or macOS that can fix this! :)
Sorry for the long post, but I want to be clear about what I'm seeing here, as I've invested a lot of time trying to sort out where exactly the problem is and when, how and why it appears.
Thanks!