Application crash when scanning MP3 media file with Shift JIS tags
Regression. Last known working nightly build: VLC-Android-3.3.2-armv7-20201124-0124.apk (released on November 24th 2020).
Operating system: Android 10 (32-bit)
VLC Media Player version: 3.3.4 Beta 2 (January 19th 2021) (ARMv7)
The issue happens from version 3.3.3 Beta 1 (November 25th 2020) onwards.
Hardware details do not seem relevant to this issue.
== Description
When VLC scans the device's storage volumes for media files, it creates an index of all the valid media files it could find. For MP3 files, the player attempts to read MP3 tags stored in the files to identify and categorize them.
However, the text in MP3 tags is not guaranteed to be in any particular character encoding. UTF-8 is the most common one these days, but older media files may still use legacy encodings.
Up to version 3.3.2, MP3 files containing Shift JIS-encoded tags were displayed as mojibake (garbled text) in VLC. This happens on pretty much every machine not set to the Japanese locale, so it is not really unexpected. Note that on Android, the mojibake appears even if the system is set to the Japanese locale.
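To illustrate the mojibake described above, here is a minimal Python sketch. It only shows what happens when Shift JIS bytes are decoded with a Western code page; the sample string and the choice of Windows-1252 are my own assumptions, and this is not how VLC itself reads tags.

```python
# Illustration only: Shift JIS bytes decoded with a Western code page.
# The sample text and the cp1252 code page are assumptions for this example.
text = "日本語"
raw = text.encode("shift_jis")  # the bytes as a legacy tagger would store them


def decode_as(data: bytes, encoding: str) -> str:
    """Decode bytes, substituting a replacement character for undecodable bytes."""
    return data.decode(encoding, errors="replace")


garbled = decode_as(raw, "cp1252")     # wrong guess: yields mojibake
correct = decode_as(raw, "shift_jis")  # right guess: yields the original text

print(repr(garbled))
print(repr(correct))
```

The bytes are the same in both cases; only the encoding assumed by the reader differs.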
Starting with version 3.3.3 Beta 1, the application crashes instead, which is a more serious problem.
== Steps to reproduce
- Place an MP3 file with tags encoded in Shift JIS in a location the VLC application can scan for media files.
- Run VLC and have it scan the device for media files.
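If no suitable file is at hand, a test file could possibly be produced with a sketch like the one below. It appends an ID3v1 tag (the fixed 128-byte block at the end of the file) with Shift JIS-encoded fields to a copy of an existing MP3. The function names, the sample strings and the file paths are hypothetical, and I have not verified that an ID3v1 tag triggers the same crash; the affected files may well use ID3v2 instead.

```python
# Sketch under assumptions: build an ID3v1 tag whose text fields are Shift JIS.
# build_id3v1_tag and write_tagged_copy are hypothetical helper names.

def build_id3v1_tag(title: str, artist: str, album: str,
                    year: str = "2001", encoding: str = "shift_jis") -> bytes:
    """Return a 128-byte ID3v1 tag with text fields in the given encoding."""

    def field(value: str, size: int) -> bytes:
        # Truncate to the fixed field size and pad with NUL bytes.
        # Truncation may cut a double-byte character in half.
        return value.encode(encoding)[:size].ljust(size, b"\x00")

    return (
        b"TAG"
        + field(title, 30)
        + field(artist, 30)
        + field(album, 30)
        + year.encode("ascii")[:4].ljust(4, b"\x00")
        + field("", 30)   # empty comment
        + b"\xff"         # genre byte 255: no genre set
    )


def write_tagged_copy(src_path: str, dst_path: str, tag: bytes) -> None:
    """Copy an MP3 file and append the tag, replacing any existing ID3v1 tag."""
    with open(src_path, "rb") as src:
        data = src.read()
    if len(data) >= 128 and data[-128:-125] == b"TAG":
        data = data[:-128]
    with open(dst_path, "wb") as dst:
        dst.write(data + tag)


tag = build_id3v1_tag("日本語", "テスト", "アルバム")
# Example usage with hypothetical paths:
# write_tagged_copy("input.mp3", "shift_jis_tagged.mp3", tag)
```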
Note: VLC may crash on any tag containing "invalid characters", not necessarily Shift JIS. The bug may well be in the tag parser (if VLC attempts to read the tags with a Western encoding).
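One way to look for other candidate files when testing this hypothesis is to check whether the raw tag bytes are valid UTF-8. The helper below is a hypothetical sketch and says nothing about how VLC's parser actually treats such bytes.

```python
# Hypothetical helper: check whether raw tag bytes decode as strict UTF-8.

def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode as UTF-8 without errors."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


shift_jis_bytes = "日本語".encode("shift_jis")
utf8_bytes = "日本語".encode("utf-8")

print(is_valid_utf8(shift_jis_bytes))
print(is_valid_utf8(utf8_bytes))
```

Tags that fail this check in an encoding other than Shift JIS would help tell whether the crash is specific to Shift JIS or affects any non-UTF-8 tag.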
I cannot retrieve the required debugging logs right now as I do not have enough space on my PC for Android development.
However, I can provide you with files that should trigger the crash; this one is the file I found causing the issue on the system. Most early-2000s MP3 files from this page, which is the source for the crashing file above, use Shift JIS encoding in their tags and thus should have similar results.