Html Agility Pack link and img src extraction
I have pages that use images as links, and I am trying to get each link's href as well as the image's src. What I have now collects the hrefs fine, but it only gets the first img src on the page and repeats it for every link.
HtmlWeb hw = new HtmlWeb();
HtmlAgilityPack.HtmlDocument doc = hw.Load(url);
HtmlNodeCollection linkNodes = doc.DocumentNode.SelectNodes("//a[@href]");
foreach (HtmlNode linkNode in linkNodes)
{
    HtmlAttribute link = linkNode.Attributes["href"];
    HtmlNode imageNode = linkNode.SelectSingleNode("//img");
    HtmlAttribute src = imageNode.Attributes["src"];
    string imageLink = link.Value;
    string imageUrl = src.Value;
}
Can someone tell me what's wrong, or suggest another way of doing it? Thanks.
Try changing
HtmlNode imageNode = linkNode.SelectSingleNode("//img");
to
HtmlNode imageNode = linkNode.SelectSingleNode(".//img");
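An XPath expression that starts with // searches from the document root, even when you call it on linkNode, so every iteration returns the first img on the page. The leading . makes the search relative to the current link node. Hope this helps.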
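For reference, here is a minimal sketch of the whole loop using the relative XPath, with null checks added because SelectNodes and SelectSingleNode return null when nothing matches. The LinkImageExtractor class, the Extract method, and the idea of parsing an in-memory string with LoadHtml (instead of loading a URL) are my own assumptions for illustration, not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class LinkImageExtractor
{
    // Hypothetical helper: returns one (href, src) pair for each
    // <a href> element that contains an <img src> somewhere inside it.
    public static List<(string Href, string Src)> Extract(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var pairs = new List<(string Href, string Src)>();

        // By default SelectNodes returns null, not an empty collection,
        // when no node matches.
        HtmlNodeCollection linkNodes = doc.DocumentNode.SelectNodes("//a[@href]");
        if (linkNodes == null)
        {
            return pairs;
        }

        foreach (HtmlNode linkNode in linkNodes)
        {
            // ".//img[@src]" only looks at descendants of this link;
            // "//img" would restart the search at the document root.
            HtmlNode imageNode = linkNode.SelectSingleNode(".//img[@src]");
            if (imageNode == null)
            {
                continue; // a link without an image, skip it
            }

            string imageLink = linkNode.GetAttributeValue("href", "");
            string imageUrl = imageNode.GetAttributeValue("src", "");
            pairs.Add((imageLink, imageUrl));
        }

        return pairs;
    }
}
```

Calling LinkImageExtractor.Extract on a page with two image links and one text-only link should give two pairs, each with its own src, and skip the text-only link instead of throwing a NullReferenceException.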