Load XMLDocument from byte array (optionally containing BOM characters)

I've seen several posts here on SO about loading XML documents from some data source where the data starts with Microsoft's proprietary UTF-8 preamble, the byte order mark or BOM (for instance, this one).

However, I can't find an elegant (and working!) solution that does not involve stripping out the BOM characters manually.

For instance, there is this example:

byte[] b = System.IO.File.ReadAllBytes("c:\\temp_file_containing_bom.txt");
using (System.IO.MemoryStream oByteStream = new System.IO.MemoryStream(b)) {
    using (System.Xml.XmlTextReader oRD = new System.Xml.XmlTextReader(oByteStream)) {
        System.Xml.XmlDocument oDoc = new System.Xml.XmlDocument();
        oDoc.Load(oRD);
        Console.WriteLine(oDoc.OuterXml);
        Console.ReadLine();
    }
}

...but it still keeps throwing an "invalid data" exception.

My problem is that I have a huge byte array that sometimes contains the BOM and sometimes does not. I need to load it into an XmlDocument, and I don't believe I should be the one who has to take care of the "helper" bytes.


That BOM is no longer 'proprietary'; it's written up in the XML specs. Only old versions of Java (1.4) have a problem with it. It's pretty humorous if you've got MS technology exploding on it.

Use a buffered input stream to filter out the BOM by pushing back the first character if it's not the first character of the BOM sequence.
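
As a rough illustration of that idea in C#: since the data in the question is already in a byte array, this sketch does not use a pushback stream at all. It simply peeks at the first three bytes and starts the MemoryStream after the BOM when one is present. The class name XmlBomLoader and its method names are made up for this example, and it only handles the UTF-8 BOM (EF BB BF), not UTF-16 or UTF-32 marks.

```csharp
using System.IO;
using System.Xml;

// Hypothetical helper, not part of the .NET framework.
public static class XmlBomLoader
{
    // Returns the number of leading bytes to skip:
    // 3 if the array starts with the UTF-8 BOM (EF BB BF), otherwise 0.
    public static int BomLength(byte[] data)
    {
        if (data.Length >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF)
        {
            return 3;
        }
        return 0;
    }

    // Loads an XmlDocument from the array, skipping a leading UTF-8 BOM if present.
    public static XmlDocument Load(byte[] data)
    {
        int offset = BomLength(data);

        // This MemoryStream constructor wraps a slice of the existing array,
        // so a huge buffer is not copied just to drop three bytes.
        using (var stream = new MemoryStream(data, offset, data.Length - offset, false))
        {
            var doc = new XmlDocument();
            doc.Load(stream);
            return doc;
        }
    }
}
```

With this in place, the code from the question would reduce to something like XmlBomLoader.Load(b) followed by Console.WriteLine on the result's OuterXml. If the exception persists even with the BOM skipped, the cause is probably elsewhere in the data (for example, an encoding declaration that does not match the actual bytes).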
