Fix broken elements that are missing an ending tag or /> with C#
Is there a simple way to fix elements in an HTML document that are missing the ending tag or the /> ending? I'm using ASP.NET with C# (the HTML is loaded with the help of Html Agility Pack).
An example:
<img src="www.example.com/image.jpg">
should transform into
<img src="www.example.com/image.jpg" />
or
<img src="www.example.com/image.jpg"></img>
You can use the Save() method to write the HTML document out as XML. When it does this, Html Agility Pack will try to close all the open tags.
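// html holds the markup to repair, e.g. the <img> example above
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(html);
// Save through an XmlTextWriter so the document is written out as XML
System.IO.StringWriter sw = new System.IO.StringWriter();
System.Xml.XmlTextWriter xw = new System.Xml.XmlTextWriter(sw);
doc.Save(xw);
string result = sw.ToString();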