
How do I define a new entity for the HtmlUnit XML parser?

I'm running into an issue with the HtmlUnit parser. I'm trying to grab some XML from a website (using the website's API), do a quick parse of the resulting XML, and then save the XML to a file (all within the rights of the API). (sample content)


Unfortunately, the website returns the entity &iquest; (¿) in some of the requested pages, and while this is a valid HTML entity, HtmlUnit throws an exception during the parse with the message:

The entity "iquest" was referenced, but not declared.

How do I define iquest as a valid entity?


You can't define &iquest; except by editing the data you received. The data is not XML, as any validator will show (e.g. the first one I found on Google).

The site is not serving valid XML, so the best way is to ask it to fix the XML.

If that fails, then either do a search and replace on &iquest; or add a DOCTYPE that defines the entity &iquest;.
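
Both workarounds can be sketched roughly as follows. This is only an illustration, not HtmlUnit-specific code: it uses the JDK's built-in DOM parser to show that the repaired text parses, and the class name IquestWorkaround, its method names, and the sample <msg> document are all made up for the example. It assumes the response is available as a String before it reaches the parser.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class; none of these names come from HtmlUnit.
public class IquestWorkaround {

    // Option 1: search and replace the undeclared entity with the
    // numeric character reference for U+00BF, which XML always accepts.
    public static String replaceEntity(String xml) {
        return xml.replace("&iquest;", "&#191;");
    }

    // Option 2: add a DOCTYPE whose internal subset declares the entity.
    // rootName must match the document's root element name.
    public static String addDoctype(String xml, String rootName) {
        String doctype = "<!DOCTYPE " + rootName + " [<!ENTITY iquest \"&#191;\">]>";
        // An XML declaration, if present, has to stay at the very start.
        if (xml.startsWith("<?xml")) {
            int end = xml.indexOf("?>") + 2;
            return xml.substring(0, end) + doctype + xml.substring(end);
        }
        return doctype + xml;
    }

    // Parses the text with the JDK's default DOM parser.
    public static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample standing in for the API response.
        String raw = "<?xml version=\"1.0\"?><msg>&iquest;Hola?</msg>";
        System.out.println(parse(replaceEntity(raw)).getDocumentElement().getTextContent());
        System.out.println(parse(addDoctype(raw, "msg")).getDocumentElement().getTextContent());
    }
}
```

The search and replace is the simpler of the two if &iquest; is the only offending entity. The DOCTYPE approach leaves the body of the response untouched, but it assumes the response does not already carry its own DOCTYPE.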
