Parse page HTML output
Everything you need is in the Page.Render method. Override it, render the page into a buffer, and do whatever you want with the captured HTML there.
// Requires: using System.IO; using System.Text; using System.Web.UI;
protected override void Render(HtmlTextWriter writer)
{
    // Render the page into an in-memory buffer instead of the response.
    StringBuilder stringBuilder = new StringBuilder();
    using (StringWriter stringWriter = new StringWriter(stringBuilder))
    using (HtmlTextWriter htmlTextWriter = new HtmlTextWriter(stringWriter))
    {
        // htmlTextWriter writes through stringWriter into stringBuilder.
        base.Render(htmlTextWriter);
    }

    string theHtml = stringBuilder.ToString(); // the page's HTML, captured as a string

    //---------------------------------------------
    // Parse or modify theHtml here.
    //---------------------------------------------

    writer.Write(theHtml); // write the (possibly modified) HTML with the original writer
}
It depends on what exactly you mean by "parse", but a library such as the HTML Agility Pack can build an XML-like tree from an HTML document, which gives you a proper HTML DOM data structure to work with. From there you can query it with XPath or LINQ, or convert it straight to XML.
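As a rough illustration of that approach, here is a minimal sketch that assumes the HtmlAgilityPack NuGet package is referenced. The class name HtmlInspector and its two methods are hypothetical helpers made up for this example; only HtmlDocument, LoadHtml, SelectNodes, GetAttributeValue, OptionOutputAsXml and Save come from the library.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using HtmlAgilityPack; // third-party NuGet package, assumed to be installed

public static class HtmlInspector // hypothetical helper, not part of any library
{
    // Returns the href value of every <a> element that has one.
    public static List<string> GetLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var links = new List<string>();
        // SelectNodes returns null, not an empty collection, when nothing matches.
        HtmlNodeCollection anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors != null)
        {
            foreach (HtmlNode anchor in anchors)
            {
                links.Add(anchor.GetAttributeValue("href", string.Empty));
            }
        }
        return links;
    }

    // Parses the HTML and serializes the resulting tree as XML.
    public static string ToXml(string html)
    {
        var doc = new HtmlDocument();
        doc.OptionOutputAsXml = true;
        doc.LoadHtml(html);

        using (var stringWriter = new StringWriter())
        {
            doc.Save(stringWriter);
            return stringWriter.ToString();
        }
    }
}
```

Inside a Render override you could, for example, call HtmlInspector.GetLinks(theHtml) on the captured string before writing it back out. Whether XPath or LINQ over doc.DocumentNode.Descendants() reads better is mostly a matter of taste.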