What is the intended use of the HtmlAgilityPack MixedCodeDocument?
I am using version 1.4 of the HtmlAgilityPack and, as I understand it, the MixedCodeDocument and related classes are there to help you parse ASP.NET markup as found in aspx and ascx files. I've found zero documentation or examples for the MixedCodeDocument class. From what I've tried, it seems that the MixedCodeDocument breaks a file's text into chunks, separating ASP.NET fragments from non-ASP.NET fragments. For example, the following snippet:
<asp:Label ID="lbl_xyz" runat="server" Text='<%=Name%>'></asp:Label>
<a href='#'>blah</a>
would be broken up into:
// Text fragment 1
<asp:Label ID="lbl_xyz" runat="server" Text='
// Code fragment 1
<%=Name%>
// Text fragment 2 (two lines)
'></asp:Label>
<a href='#'>blah</a>
But there is no parsing done any deeper than that, i.e. the a tag is not parsed into its own node with attributes or anything like that.
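To make the observed behaviour concrete, here is a minimal Python sketch of that kind of split. It is not HAP's implementation; the function name `split_mixed` and the default `<%` / `%>` delimiters are assumptions chosen for illustration. It only cuts the text at the code delimiters and does no HTML parsing at all.

```python
def split_mixed(text, code_start="<%", code_end="%>"):
    """Split text into ("text", ...) and ("code", ...) fragments.

    Hypothetical helper: it mimics the chunking described above,
    not the actual MixedCodeDocument algorithm.
    """
    fragments = []
    pos = 0
    while pos < len(text):
        start = text.find(code_start, pos)
        if start == -1:
            # No more code blocks: the rest is plain text.
            fragments.append(("text", text[pos:]))
            break
        if start > pos:
            fragments.append(("text", text[pos:start]))
        end = text.find(code_end, start + len(code_start))
        # An unterminated code block runs to the end of the input.
        stop = len(text) if end == -1 else end + len(code_end)
        fragments.append(("code", text[start:stop]))
        pos = stop
    return fragments


markup = (
    "<asp:Label ID=\"lbl_xyz\" runat=\"server\" Text='<%=Name%>'></asp:Label>\n"
    "<a href='#'>blah</a>"
)

for kind, fragment in split_mixed(markup):
    print(kind, repr(fragment))
```

Joining every fragment back together reproduces the input unchanged, which is why the text fragments can still contain half-open tags.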
So my best guess is that the MixedCodeDocument is expected to be used to strip out the code fragments so that the remaining text fragments can be pieced together and then parsed using the HtmlDocument class.
Does anybody know if that's correct? Or even better, does anybody have any tips for ways to successfully parse and manipulate an aspx or ascx file using the HAP or other?
Your guess is 100% correct.
The MixedCodeDocument class was designed to parse text that contains two languages, such as classic ASP or ASP.NET, hence the name :-)
Originally the Html Agility Pack was used in a tool capable of processing and transforming a whole tree of various files, including HTML and other types of file. If you needed to replace only the HTML parts in those other files, this class helped you split code and markup. The separated code and markup blocks can then be parsed by other means.
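As a rough illustration of the "strip the code, then parse the markup" workflow, here is a small Python sketch. It does not use HAP: a regular expression stands in for the code/markup split, and the standard library's `html.parser` stands in for `HtmlDocument`. The names `CODE_BLOCK`, `TagCollector` and `parse_markup_only` are hypothetical, and the `<%` / `%>` delimiters are assumed.

```python
import re
from html.parser import HTMLParser

# Assumed delimiters for server-side code blocks: <% ... %>
CODE_BLOCK = re.compile(r"<%.*?%>", re.DOTALL)


class TagCollector(HTMLParser):
    """Collects every start tag together with its attributes."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append((tag, dict(attrs)))


def parse_markup_only(text):
    """Drop the code blocks, then parse what is left as HTML."""
    markup = CODE_BLOCK.sub("", text)
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


source = (
    "<asp:Label ID=\"lbl_xyz\" runat=\"server\" Text='<%=Name%>'></asp:Label>\n"
    "<a href='#'>blah</a>"
)

for tag, attrs in parse_markup_only(source):
    print(tag, attrs)
```

Note that stripping the code is lossy: the `Text` attribute ends up empty, so if the file has to be written back after manipulation, the removed fragments need to be remembered and reinserted.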
I don't think anyone's using it today :)