Html Agility Pack isn't matching <link> tags

Code: (Using HTML Agility Pack)

 Dim moHtmlParser As HtmlDocument = New HtmlDocument           
 moHtmlParser.LoadHtml(htmlString)
           
 Dim maStyles As New List(Of String)
 Dim moStyleNodes As HtmlNodeCollection = moHtmlParser.DocumentNode.SelectNodes("//link")

Html:

<head runat="server">
<script src="Scripts/JScript1.js" type="text/javascript" ></script>
        
<link href="Stylesheets/StyleSheet1.css" rel="Stylesheet" type="text/css" />

<link href="Stylesheets/StyleSheet2.css" rel="Stylesheet" type="text/css" />

<link href="Stylesheets/StyleSheet3.css" rel="Stylesheet" type="text/css" />    
     

<title>Untitled Page</title>

No matches: moStyleNodes is always Nothing. The HTML shown is from the head, for what it's worth. I'm able to match other tags in there (script, title) with no problem.

Update:

Even after removing the ElementsFlag for "link" tags, it just wouldn't pick up the tags.

I worked around it with this code:

Dim moStyleNodes As HtmlNodeCollection = moHtmlParser.DocumentNode.SelectNodes("//*[@rel]")

I then made sure that the "rel" was "stylesheet" before working with the node.
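For reference, the workaround can be sketched roughly as below. This is only an illustration: the module and function names (StylesheetLinks, GetStylesheetHrefs) are made up for the example, and it assumes the stylesheet path lives in the href attribute, as in the HTML above.

```vb
Imports System
Imports System.Collections.Generic
Imports HtmlAgilityPack

Module StylesheetLinks
    ' Hypothetical helper: collects the href of every node whose
    ' rel attribute equals "stylesheet" (compared case-insensitively).
    Function GetStylesheetHrefs(ByVal html As String) As List(Of String)
        Dim doc As New HtmlDocument()
        doc.LoadHtml(html)

        Dim hrefs As New List(Of String)

        ' Select any element that carries a rel attribute, whatever its tag name.
        Dim nodes As HtmlNodeCollection = doc.DocumentNode.SelectNodes("//*[@rel]")

        ' SelectNodes returns Nothing (not an empty collection) when nothing matches.
        If nodes Is Nothing Then Return hrefs

        For Each node As HtmlNode In nodes
            Dim rel As String = node.GetAttributeValue("rel", "")
            If String.Equals(rel, "stylesheet", StringComparison.OrdinalIgnoreCase) Then
                hrefs.Add(node.GetAttributeValue("href", ""))
            End If
        Next

        Return hrefs
    End Function
End Module
```

The case-insensitive comparison matters here because the markup uses rel="Stylesheet" with a capital S.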

Works for now, but doesn't explain why it wasn't working in the first place.



This is most probably a default namespace problem -- most likely a default namespace in the complete document, that you have not shown.

Writing XPath expressions that select elements in a default namespace is a FAQ, and there are numerous good answers in the xpath tag; just find and read them.

In summary, XPath considers any non-prefixed name in an XPath expression to be in "no namespace". Because the actual elements in the XML document are in the default namespace (not in no namespace), they are not selected.

The solution is to register a namespace binding using the API of your XPath engine and then prefix all names in the expression with the prefix from the binding.
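As a rough illustration of that idea, here is a small sketch using the System.Xml classes from the .NET Framework rather than Html Agility Pack. The sample document, the XHTML namespace URI, and the prefix "x" are assumptions chosen for the example; the asker's full document was not shown, so it may not match this case.

```vb
Imports System
Imports System.Xml

Module NamespaceDemo
    Sub Main()
        ' Assumed sample document: the root declares a default namespace,
        ' so every element below it is in that namespace.
        Dim doc As New XmlDocument()
        doc.LoadXml("<html xmlns=""http://www.w3.org/1999/xhtml"">" &
                    "<head><link href=""a.css"" rel=""stylesheet"" /></head></html>")

        ' Unprefixed name: XPath looks for "link" in no namespace, so nothing matches.
        Dim plain As XmlNodeList = doc.SelectNodes("//link")
        Console.WriteLine(plain.Count)      ' prints 0

        ' Bind an arbitrary prefix to the namespace URI and use it in the expression.
        Dim ns As New XmlNamespaceManager(doc.NameTable)
        ns.AddNamespace("x", "http://www.w3.org/1999/xhtml")
        Dim prefixed As XmlNodeList = doc.SelectNodes("//x:link", ns)
        Console.WriteLine(prefixed.Count)   ' prints 1
    End Sub
End Module
```

The prefix itself is arbitrary; only the namespace URI it is bound to has to match the one declared in the document.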

See this for more information on exactly how to register a namespace in SimpleXML.
