
What's the best way to parse XML in the middle of other text?

How can I parse XML that sits in the middle of other text?

Example: given the text file below, how can I parse the XML parts in C#?

-> Begin of file

2010-01-01 tehgvdhjjsad  
2010-01-02 dsjhnxcucncu  
14:55 iahsdahksdjh  

<Answer>
<headline>
<a1>1</a1>
<a2>2</a2>
</headline>
</Answer>
2010-01-05 tehgvddsda  
2010-01-05 ddsada  
22:55 iahsdahksdjh2  

<Answer>
<headline>
<a1>11</a1>
<a2>22</a2>
</headline>
</Answer>
-> End of file


Several ways:

 1. Use string.IndexOf("<Answer>") to find the first fragment, then Substring to chop off the header information before it.  Then wrap the substring like this:
xmlString = "<Answers>" + substringXml + "</Answers>".  You can then parse the result as valid XML, provided the remaining plain text contains no stray < or & characters.
 2. Use an XmlReader created with ConformanceLevel.Fragment and read through the text.  Stop only on the Answer elements and process them.
 3. Wrap the whole document in a root element, load it into an XmlDocument, and use an XPath expression to read out the Answer elements.
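
A minimal sketch of approaches 2 and 3, assuming the plain text around the fragments contains no `<` or `&` characters (either would make the parser throw). The class name `AnswerReader`, its method names, and the wrapper element `root` are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

static class AnswerReader
{
    // Approach 2: read the mixed text as an XML fragment and stop only on <Answer>.
    public static List<string> ReadWithFragmentReader(string text)
    {
        // Fragment conformance accepts several root elements and top-level text nodes.
        var settings = new XmlReaderSettings { ConformanceLevel = ConformanceLevel.Fragment };
        var answers = new List<string>();
        using (var reader = XmlReader.Create(new StringReader(text), settings))
        {
            reader.Read();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "Answer")
                    answers.Add(reader.ReadOuterXml()); // also moves past </Answer>
                else
                    reader.Read(); // skip the surrounding text nodes
            }
        }
        return answers;
    }

    // Approach 3: wrap everything in a root element and select with XPath.
    public static XmlNodeList ReadWithWrappedRoot(string text)
    {
        var doc = new XmlDocument();
        doc.LoadXml("<root>" + text + "</root>");
        return doc.SelectNodes("/root/Answer");
    }
}
```

The reader version streams through the input and suits large files. The XmlDocument version loads everything into memory but lets you query each Answer with XPath afterwards, for example `node.SelectSingleNode("headline/a1").InnerText`.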


Well, there aren't many things that can help you with something like that. AFAIK there are two possibilities:

Option 1. If all the XML fragments have the same root node, i.e. "<Answer>", then you can simply loop through the occurrences of <Answer>, find the next occurrence of the closing </Answer> for each one, extract the text between the two (tags included), and feed it to a normal XML parser.

Option 2. If it's an anything-goes kind of XML, then you could use this Regex based Html Parser I wrote some time ago. It should handle that input without issue; however, you will have to deal with the open/close elements yourself and determine what to do with them.
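
Option 1 could look roughly like the sketch below. It assumes the opening tag is always exactly `<Answer>` (no attributes) and that Answer elements are never nested. The names `FragmentExtractor` and `ExtractAnswers` are invented for the example, and `XElement` stands in for whatever "normal XML parser" you prefer.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

static class FragmentExtractor
{
    // Finds each <Answer>...</Answer> block in the text and parses it on its own.
    public static List<XElement> ExtractAnswers(string text)
    {
        const string open = "<Answer>";
        const string close = "</Answer>";
        var results = new List<XElement>();
        int pos = 0;
        while (true)
        {
            int start = text.IndexOf(open, pos, StringComparison.Ordinal);
            if (start < 0) break;                 // no more fragments
            int end = text.IndexOf(close, start, StringComparison.Ordinal);
            if (end < 0) break;                   // unterminated fragment: stop
            end += close.Length;                  // include the closing tag
            results.Add(XElement.Parse(text.Substring(start, end - start)));
            pos = end;                            // continue after this fragment
        }
        return results;
    }
}
```

Because only the text between the tags ever reaches the parser, this variant does not care what the surrounding lines contain, including stray `<` or `&` characters.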
