
SAXParser implementation is skipping entities

I have an implementation of org.xml.sax.helpers.DefaultHandler. It works fine except when it encounters something like this:

<NAME>Ji&#345;&#237; B&#225;rta</NAME>

The characters method is overridden as:


@Override
public void characters(char[] ch, int start, int length) throws SAXException {
    if (currentElement) {
        currentValue = new String(ch, start, length);
        currentElement = false;
    }
}

But the char array that arrives at the method contains only 'Ji'; the rest of the string is missing. I have another method to convert those entities to UTF-8, but I never receive them, so I can't convert anything.


Be aware that the parser may not deliver all character data in one call. To be safe you must build the string from possibly several characters() invocations, bracketed by startElement()/endElement().
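A minimal sketch of that accumulation, assuming the element of interest is NAME and the parser is not namespace-aware (so qName carries the tag name). The class name NameHandler and the helper parseName are made up for illustration and are not part of the original code:

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the full text of a <NAME> element,
// even when the parser delivers it in several characters() calls.
class NameHandler extends DefaultHandler {
    private final StringBuilder buffer = new StringBuilder();
    private boolean inName = false;
    private String currentValue;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        if ("NAME".equals(qName)) {
            inName = true;
            buffer.setLength(0); // start a fresh value for this element
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (inName) {
            buffer.append(ch, start, length); // append each chunk, do not overwrite
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("NAME".equals(qName)) {
            currentValue = buffer.toString(); // the value is complete only here
            inName = false;
        }
    }

    String getCurrentValue() {
        return currentValue;
    }

    // Illustrative helper: parses an XML string and returns the NAME text.
    static String parseName(String xml) throws Exception {
        NameHandler handler = new NameHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.getCurrentValue();
    }
}
```

With this shape, the value is read in endElement() rather than in characters(), so it does not matter how the parser splits the text around the character references.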

As a side note, why do you want to convert the "entities" to UTF-8? They are already converted to UTF-16 characters.


The behavior you describe is correct; it is your understanding of it that is incorrect.

The string "Ji&#345;&#237;" starts with two characters, "Ji", followed by two numeric character references, "&#345;" and "&#237;". Strictly speaking these are not entities: the parser expands them itself and delivers the resulting characters through characters(), possibly in a separate call from the preceding "Ji".

Implementing resolveEntity will not help here, because it is only called to resolve external entities such as a DTD. The same goes for skippedEntity, which is only called for entities the parser chose not to expand.
