How to comment out all script tags in an HTML document using HTML Agility Pack

I would like to comment out all script tags in an HtmlDocument. That way, when I render the document, the scripts are not executed, but we can still see what was there. Unfortunately, my current approach is failing:

    foreach (var scriptTag in htmlDocument.DocumentNode.SelectNodes("//script"))
    {
        var commentedScript = new HtmlNode(HtmlNodeType.Comment, htmlDocument, 0) { InnerHtml = scriptTag.ToString() };
        scriptTag.ParentNode.AppendChild(commentedScript);
        scriptTag.Remove();
    }

Note that I could do this with string replacements on the HTML, but I do not think that would be as robust:

    domHtml = domHtml.Replace("<script", "<!-- <script");
    domHtml = domHtml.Replace("</script>", "</script> -->");
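To make the robustness concern concrete, here is a small sketch of two inputs that the string-replace approach mishandles. The class and method names (`ReplaceFragilityDemo`, `CommentOutWithReplace`) are made up for illustration; only `string.Replace` comes from the original snippet.

```csharp
using System;

public static class ReplaceFragilityDemo
{
    // Same two replacements as in the question, wrapped in a hypothetical helper.
    public static string CommentOutWithReplace(string html) =>
        html.Replace("<script", "<!-- <script")
            .Replace("</script>", "</script> -->");

    public static void Main()
    {
        // string.Replace is case-sensitive, so upper-case tags are left untouched.
        Console.WriteLine(CommentOutWithReplace("<SCRIPT>alert(1)</SCRIPT>"));

        // Whitespace before '>' in the end tag is legal HTML, but it is not matched,
        // so the comment is opened and never closed.
        Console.WriteLine(CommentOutWithReplace("<script>alert(1)</script >"));
    }
}
```

In the first case the script still runs; in the second, everything after the opening tag ends up inside an unterminated comment.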


Try this:

    foreach (var scriptTag in htmlDocument.DocumentNode.SelectNodes("//script"))
    {
        var commentedScript = HtmlTextNode.CreateNode(string.Format("<!--{0}-->", scriptTag.OuterHtml));
        scriptTag.ParentNode.ReplaceChild(commentedScript, scriptTag);
    }
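A slightly more defensive variant of the loop above might look like the sketch below. It assumes the HtmlAgilityPack NuGet package is referenced. The helper name `CommentOutScripts` is hypothetical, and the escaping of `-->` is an added assumption, not something the original answer does.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

public static class ScriptCommenter
{
    // Hypothetical helper, not part of HtmlAgilityPack.
    public static void CommentOutScripts(HtmlDocument doc)
    {
        // SelectNodes returns null (not an empty collection) when nothing matches.
        var scripts = doc.DocumentNode.SelectNodes("//script");
        if (scripts == null) return;

        // Iterate over a copy so replacing nodes cannot disturb the enumeration.
        foreach (var script in scripts.ToList())
        {
            // Assumption: a literal "-->" inside the script would end the comment
            // early, so it is escaped here. This alters the displayed source slightly.
            var safe = script.OuterHtml.Replace("-->", "--&gt;");
            var comment = HtmlNode.CreateNode("<!--" + safe + "-->");
            script.ParentNode.ReplaceChild(comment, script);
        }
    }

    public static void Main()
    {
        var doc = new HtmlDocument();
        doc.LoadHtml("<html><body><p>Hi</p><script>alert(1);</script></body></html>");
        CommentOutScripts(doc);
        Console.WriteLine(doc.DocumentNode.OuterHtml);
    }
}
```

The null check matters for pages that contain no script tags at all, where the original loop would throw a NullReferenceException.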


See also this SO post, which has a very clean solution using the LINQ support in HTML Agility Pack: "htmlagilitypack - remove script and style?". Note that, as its title suggests, it removes the tags rather than commenting them out.
