How to comment out all script tags in an HTML document using HTML Agility Pack
I would like to comment out all script tags in an HtmlDocument, so that when I render the document the scripts are not executed but we can still see what was there. Unfortunately, my current approach is failing:
foreach (var scriptTag in htmlDocument.DocumentNode.SelectNodes("//script"))
{
    var commentedScript = new HtmlNode(HtmlNodeType.Comment, htmlDocument, 0) { InnerHtml = scriptTag.ToString() };
    scriptTag.ParentNode.AppendChild(commentedScript);
    scriptTag.Remove();
}
Note that I can do this with string replacement on the HTML, but I do not think it would be as robust:
domHtml = domHtml.Replace("<script", "<!-- <script");
domHtml = domHtml.Replace("</script>", "</script> -->");
Try this:
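var scriptTags = htmlDocument.DocumentNode.SelectNodes("//script");
if (scriptTags != null) // SelectNodes returns null when nothing matches
{
    foreach (var scriptTag in scriptTags)
    {
        // Parse the commented-out markup into a comment node and swap it in place
        var commentedScript = HtmlNode.CreateNode(string.Format("<!--{0}-->", scriptTag.OuterHtml));
        scriptTag.ParentNode.ReplaceChild(commentedScript, scriptTag);
    }
}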
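One caveat with wrapping OuterHtml in a comment: if a script's source itself contains the sequence "-->", the comment ends early and the rest of the script is rendered as markup. The sketch below is one way to guard against that. The class name ScriptCommenter, the method name CommentOutScripts, and the choice to rewrite "-->" as "-- >" are assumptions made for illustration, not part of the HTML Agility Pack; only LoadHtml, SelectNodes, HtmlNode.CreateNode, ReplaceChild, and OuterHtml are library calls.

```csharp
using System;
using HtmlAgilityPack;

public static class ScriptCommenter
{
    // Hypothetical helper: replaces every <script> element with an HTML comment
    // that holds the element's original markup.
    public static void CommentOutScripts(HtmlDocument htmlDocument)
    {
        // SelectNodes returns null (not an empty collection) when nothing matches.
        var scriptTags = htmlDocument.DocumentNode.SelectNodes("//script");
        if (scriptTags == null)
        {
            return;
        }

        foreach (var scriptTag in scriptTags)
        {
            // Assumption: a readable copy is enough, so any "-->" in the script
            // is broken up to keep it from closing the comment early.
            var safeSource = scriptTag.OuterHtml.Replace("-->", "-- >");
            var commentedScript = HtmlNode.CreateNode("<!--" + safeSource + "-->");
            scriptTag.ParentNode.ReplaceChild(commentedScript, scriptTag);
        }
    }

    public static void Main()
    {
        var htmlDocument = new HtmlDocument();
        htmlDocument.LoadHtml("<html><body><p>Hi</p><script>alert('x');</script></body></html>");
        CommentOutScripts(htmlDocument);
        Console.WriteLine(htmlDocument.DocumentNode.OuterHtml);
    }
}
```

Because the stored copy is altered wherever "-->" occurred, this variant suits inspection of what was there rather than later restoring the scripts byte for byte.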
Refer to this SO post, which has a very clean solution using the LINQ support of the HTML Agility Pack: htmlagilitypack - remove script and style?