How do I define a new entity for the HtmlUnit XML parser?
I'm running into an issue with the HtmlUnit parser. I'm trying to grab some XML from a website (using the website's API), do a quick parse of the resulting XML, and then save the XML to a file (all within the rights of the API). (sample content)
Unfortunately the website returns an entity &iquest; in some of the requested pages, and while this is a valid HTML entity, HtmlUnit is throwing an exception during the parse with the message:
The entity "iquest" was referenced, but not declared.
How do I define iquest as a valid entity?
You can't define &iquest; except by editing the data you received. The data is not valid XML, as any validator will show (e.g. the first one I found on Google).
The site is not serving valid XML, so the best way is to ask it to fix the XML.
If that fails, then either do a search and replace on &iquest; or add a DOCTYPE that defines the entity &iquest;.
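Both workarounds can be sketched roughly as follows. This is only a minimal illustration that uses the JDK's built-in XML parser rather than HtmlUnit itself; the class name IquestFix, the helper names, and the sample payload are all made up for the example. The DOCTYPE variant assumes the response has no XML declaration and no DOCTYPE of its own, and that you know the root element's name.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class; none of these names come from HtmlUnit.
class IquestFix {

    // Option 1: search and replace. Swap the undeclared named entity for its
    // numeric character reference (U+00BF is decimal 191), which needs no declaration.
    static String replaceEntity(String xml) {
        return xml.replace("&iquest;", "&#191;");
    }

    // Option 2: prepend a DOCTYPE whose internal subset declares the entity.
    // Assumption: xml has no XML declaration or DOCTYPE already, and rootName
    // matches the document's root element.
    static String addDoctype(String xml, String rootName) {
        return "<!DOCTYPE " + rootName + " [<!ENTITY iquest \"&#191;\">]>\n" + xml;
    }

    // Parse a string with the JDK's default DOM parser.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // Made-up payload standing in for the API response.
        String raw = "<item>&iquest;Hola</item>";

        Document viaReplace = parse(replaceEntity(raw));
        Document viaDoctype = parse(addDoctype(raw, "item"));

        System.out.println(viaReplace.getDocumentElement().getTextContent());
        System.out.println(viaDoctype.getDocumentElement().getTextContent());
    }
}
```

The search and replace is the simpler of the two since it does not depend on the shape of the document; the DOCTYPE approach keeps the entity references in the saved text but breaks if the response already starts with an XML declaration.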