Libxml2’s IsEmptyElement function needs to be exported to Lua
The function Read of Libxml2's XmlTextReader class, which backs XmlReader:next_node() in Lua (with many layers in between), reports self-closing tags (such as <selfclosed/> in the example below) with the common NodeType value for the start of an element, the number 1.
It does not report the end of the element on the next call (value 15 in the library, remapped to 2 for Lua); instead, it proceeds with the next node after the self-closing tag.
To find out whether a node is self-closing (called "empty" in the documentation), the function IsEmptyElement must be called and its result evaluated, as described at http://xmlsoft.org/xmlreader.html.
However, as far as I can see, this function is currently not exported to Lua. This means that SAX-style XML parsing, which seems to be the focus of the next_node() call, is currently impossible without building a tree representation (which is not really SAX anymore): a node can only be inferred to have been a self-closing tag after the fact, when a closing tag //of some upper layer// is encountered.
An example of a document with the problem:
#!xml
<root>
<someelem>
</someelem>
<selfclosed/>
<otherelem>
Text
</otherelem>
</root>
It would lead to the following list of tuples being produced by calling XmlReader:next_node() in Lua:
| Type | Value |
|---|---|
| 1 | root |
| 1 | someelem |
| 2 | someelem |
| 1 | selfclosed |
| 1 | otherelem |
| 3 | Text |
| 2 | otherelem |
| 2 | root |
| 0 | |

Here, //selfclosed// could only be known not to contain //otherelem// after encountering the closing tag for //root//.
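
To illustrate the effect on a consumer, here is a minimal C sketch that replays the tuple list above and tracks nesting depth. It does not call Libxml2 or VLC. The `is_empty` field is a hypothetical stand-in for what IsEmptyElement would report, and all names (`node_event`, `example_events`, `depth_of_element`) are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Node types as next_node() reports them to Lua (values from the table). */
enum { NODE_NONE = 0, NODE_START = 1, NODE_END = 2, NODE_TEXT = 3 };

/* One (type, value) tuple. is_empty is an ASSUMPTION: it models the result
 * of IsEmptyElement, which is not available to Lua at the moment. */
struct node_event {
    int type;
    const char *value;
    int is_empty;
};

/* The tuple list from the table for the example document. */
static const struct node_event example_events[] = {
    { NODE_START, "root",       0 },
    { NODE_START, "someelem",   0 },
    { NODE_END,   "someelem",   0 },
    { NODE_START, "selfclosed", 1 },  /* no matching NODE_END follows */
    { NODE_START, "otherelem",  0 },
    { NODE_TEXT,  "Text",       0 },
    { NODE_END,   "otherelem",  0 },
    { NODE_END,   "root",       0 },
    { NODE_NONE,  "",           0 },
};
#define EXAMPLE_COUNT (sizeof example_events / sizeof example_events[0])

/* Returns the nesting depth at which element `name` starts (root is 0),
 * or -1 if it is not found. With use_empty_flag == 0 the flag is ignored,
 * which is the situation a Lua script is in today. */
static int depth_of_element(const struct node_event *ev, size_t n,
                            const char *name, int use_empty_flag)
{
    int depth = 0;
    for (size_t i = 0; i < n && ev[i].type != NODE_NONE; i++) {
        if (ev[i].type == NODE_START) {
            if (strcmp(ev[i].value, name) == 0)
                return depth;
            /* A self-closing element opens and closes in one step. */
            if (!(use_empty_flag && ev[i].is_empty))
                depth++;
        } else if (ev[i].type == NODE_END) {
            depth--;
        }
    }
    return -1;
}
```

With the flag, `depth_of_element(example_events, EXAMPLE_COUNT, "otherelem", 1)` yields 1 (a child of //root//). Ignoring it yields 2, which wrongly places //otherelem// inside //selfclosed//.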
The documents returned by the Jamendo API (v3.0) trigger this problem; see this one for an example: https://api.jamendo.com/v3.0/tracks/?client_id=56d30c95&format=xml&order=popularity_week&limit=3
To fix this problem, the function IsEmptyElement would have to be exported to Lua.
If you’d like to have a test case that reproduces the problem, please ask refp to provide it; he seemed convinced (on IRC) that it was easy to do, whereas I’m not.
Finally, here is the chain of calls that XmlReader:next_node() goes through before it reaches Libxml2, which might save you some time:
- vlclua_xml_reader_next_node in .\modules\lua\libs\xml.c
- xml_ReaderNextNode in .\include\vlc_xml.h
- reader->pf_next_node in the same file*
- ReaderNextNode in .\modules\misc\xml\libxml.c
- xmlTextReaderRead and xmlTextReaderNodeType in the same function
*populated by //vlc_custom_create([…]"xml reader");// in xml_ReaderCreate in .\src\misc\xml.c