How do I get an HTML element's attribute and text with HtmlAgilityPack in C#?
I have the following scenario:
<a href="test.com">Some text <b>is bolded</b> some is <b>not</b></a>
Now, how do I get the "test.com" part and the anchor text, without the bolded parts?
Assuming the following markup:
<html>
<head>
<title>Test</title>
</head>
<body>
<a href="test.com">Some text <b>is bolded</b> some is <b>not</b></a>
</body>
</html>
You could perform the following:
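using System;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Load the markup shown above from a file on disk.
        var doc = new HtmlDocument();
        doc.Load("test.html");

        // Select the first <a> element anywhere in the document.
        var anchor = doc.DocumentNode.SelectSingleNode("//a");

        Console.WriteLine(anchor.Attributes["href"].Value);
        Console.WriteLine(anchor.InnerText);
    }
}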
This prints the following (note that InnerText includes the text of the nested <b> elements):
test.com
Some text is bolded some is not
Of course, you will probably want to make the SelectSingleNode XPath selector more specific, or give the anchor you are trying to fetch a unique id and look it up directly:
// assuming <a href="test.com" id="foo">Some text <b>is bolded</b> some is <b>not</b></a>
var anchor = doc.GetElementbyId("foo");
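The question asks for the anchor text without the bolded parts, and InnerText returns the text of every descendant, including the <b> elements. One possible approach, offered as a minimal sketch rather than the only way, is to keep only the text nodes that are direct children of the anchor. The inline markup passed to LoadHtml and the variable name ownText are assumptions made for illustration:

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Assumption: the markup is loaded from a string here instead of test.html.
        var doc = new HtmlDocument();
        doc.LoadHtml("<a href=\"test.com\">Some text <b>is bolded</b> some is <b>not</b></a>");

        var anchor = doc.DocumentNode.SelectSingleNode("//a");

        // Keep only the text nodes that are direct children of <a>;
        // the <b> elements (and the text inside them) are skipped.
        var ownText = string.Concat(
            anchor.ChildNodes
                  .Where(n => n.NodeType == HtmlNodeType.Text)
                  .Select(n => n.InnerText));

        // GetAttributeValue returns the fallback ("") if href is missing.
        Console.WriteLine(anchor.GetAttributeValue("href", ""));
        Console.WriteLine(ownText);
    }
}
```

The remaining text keeps the whitespace that surrounded the removed <b> elements, so you may want to trim it or collapse repeated spaces before using it.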