
XML cleanup - unmatched tags

I am trying to clean up XML entries so that I can read them with XmlTextReader without getting errors. I add a default header and footer whenever I notice that the opening or closing tags are missing. I also remove illegal characters and check for Unicode, but an entry always slips through and causes the error "Data at the root level is invalid". When I check, that entry either got past the cleaning process or has an unmatched tag somewhere. Currently I use

    Dim stringSplitter() As String = {"</entry>"}
    ' split the file content based on the closing entry tag
    sampleResults = _html.Split(stringSplitter, StringSplitOptions.RemoveEmptyEntries)

to split my XML into individual entries before I start the cleanup process. Here are my default header and footer:

    ' Atom namespace declaration placed on the default <entry> element
    Private defaultNameSpace As String = "xmlns=""http://www.w3.org/2005/Atom"""
    Private headerl As String = "<?xml version=""1.0"" encoding=""utf-8""?>" & vbNewLine & "<entry " & defaultNameSpace & ">"
    Private footer As String = "</entry>"

Is there any tool in the .NET Framework that can detect and clean up unmatched tags so that I can get this to work?


I think you are looking in the wrong direction for a solution :) I think what you need is to check out IXmlSerializable.

Check out this question: "Proper way to implement IXmlSerializable?"

My approach would be to create an entry object, make it serializable, and read it via the serializer.

Create another serializable object called CleanedEntry, and give it the entry object in its constructor.
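The entry-object idea above could be sketched roughly like this. It is only a minimal sketch: it uses the attribute-based XmlSerializer mapping rather than a hand-written IXmlSerializable implementation, and the `Id` and `Title` members and the `TryReadEntry` helper are hypothetical names, since the question does not show what an entry actually contains.

```vbnet
Imports System.IO
Imports System.Xml.Serialization

' Hypothetical Entry type; the Id and Title members are assumptions about the feed.
<XmlRoot("entry", Namespace:="http://www.w3.org/2005/Atom"),
 XmlType(Namespace:="http://www.w3.org/2005/Atom")>
Public Class Entry
    <XmlElement("id")>
    Public Property Id As String

    <XmlElement("title")>
    Public Property Title As String
End Class

Public Module EntryReader
    ' Returns the deserialized entry, or Nothing if the fragment cannot be read.
    Public Function TryReadEntry(entryXml As String) As Entry
        Dim serializer As New XmlSerializer(GetType(Entry))
        Try
            Using reader As New StringReader(entryXml)
                Return DirectCast(serializer.Deserialize(reader), Entry)
            End Using
        Catch ex As InvalidOperationException
            ' XmlSerializer reports malformed XML as an InvalidOperationException
            ' whose InnerException carries the underlying parser error.
            Return Nothing
        End Try
    End Function
End Module
```

XmlSerializer needs a public parameterless constructor on every type it handles, so a CleanedEntry that takes an Entry in its constructor would need a parameterless one as well.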

If the input never contains any errors, you should be able to make this work quite easily.
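When the input does contain errors, it may help to find out which fragments are broken before deserializing them. The following is a rough sketch, not a repair tool: it only reports whether a fragment is well-formed and where the parser stopped. The module and function names are hypothetical.

```vbnet
Imports System.IO
Imports System.Xml

Public Module EntryValidator
    ' Returns True when the fragment parses as a well-formed XML document.
    ' On failure, problem describes where the parser gave up.
    Public Function IsWellFormed(entryXml As String, ByRef problem As String) As Boolean
        problem = Nothing
        Try
            Using reader As XmlReader = XmlReader.Create(New StringReader(entryXml))
                While reader.Read()
                    ' Reading to the end forces the parser to check every tag.
                End While
            End Using
            Return True
        Catch ex As XmlException
            problem = String.Format("Line {0}, position {1}: {2}",
                                    ex.LineNumber, ex.LinePosition, ex.Message)
            Return False
        End Try
    End Function
End Module
```

Calling this on each fragment after the header and footer have been added would let you log or skip the entries that fail instead of stopping at the first "Data at the root level is invalid" error.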

(Of course, this depends a bit on what the source looks like and what you want to do with it.) Please give an example of the expected input and output if my answer seems hazy, and I will try to elaborate on it (if I have the time ;) ).
