Html Agility Pack link and img src extraction

I have pages that use images as links, and I am trying to get each link's href as well as the src of the image inside it. The code I have now collects the hrefs fine, but it only gets the first img src on the page and repeats it for every link.

HtmlWeb hw = new HtmlWeb();
HtmlAgilityPack.HtmlDocument doc = hw.Load(url);
HtmlNodeCollection linkNodes = doc.DocumentNode.SelectNodes("//a[@href]");
foreach (HtmlNode linkNode in linkNodes)
{
    HtmlAttribute link = linkNode.Attributes["href"];
    HtmlNode imageNode = linkNode.SelectSingleNode("//img");
    HtmlAttribute src = imageNode.Attributes["src"];

    string imageLink = link.Value;
    string imageUrl = src.Value;
}

Can someone tell me what's wrong, or suggest another way of doing it? Thanks.


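An XPath expression that starts with // is evaluated from the document root, even when you call it on linkNode, so every iteration finds the first img on the whole page. A leading dot makes the search relative to the current node. Try changing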

HtmlNode imageNode = linkNode.SelectSingleNode("//img");

to

HtmlNode imageNode = linkNode.SelectSingleNode(".//img");
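
If some links on the page contain no image, the loop will also need null checks, since SelectSingleNode returns null when nothing matches (and SelectNodes returns null rather than an empty collection). Here is a minimal sketch of the whole loop with that in place. The ImageLinkExtractor class and its Extract method are names I made up for illustration, and it parses an HTML string with LoadHtml instead of fetching a url so that it can be run on its own:

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

// Hypothetical helper, not part of Html Agility Pack.
public static class ImageLinkExtractor
{
    // Returns (href, src) pairs for every <a href> that contains an <img src>.
    public static List<KeyValuePair<string, string>> Extract(string html)
    {
        var results = new List<KeyValuePair<string, string>>();

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null when there are no matches.
        HtmlNodeCollection linkNodes = doc.DocumentNode.SelectNodes("//a[@href]");
        if (linkNodes == null)
        {
            return results;
        }

        foreach (HtmlNode linkNode in linkNodes)
        {
            // The leading dot keeps the search inside this particular link.
            HtmlNode imageNode = linkNode.SelectSingleNode(".//img[@src]");
            if (imageNode == null)
            {
                continue; // plain text link, nothing to pair up
            }

            string imageLink = linkNode.GetAttributeValue("href", "");
            string imageUrl = imageNode.GetAttributeValue("src", "");
            results.Add(new KeyValuePair<string, string>(imageLink, imageUrl));
        }

        return results;
    }
}
```

Calling ImageLinkExtractor.Extract on a page with two image links and one text link should give you two pairs, each href matched with its own image, and the text link skipped.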

Hope this helps.
