C# HtmlAgilityPack - Unable to get image src

I am trying to learn how to get the src of every img element on a page, given its URL. But the imgs variable in my code is always null. What am I doing wrong?

static void Main(string[] args)
{
    HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
    doc.LoadHtml("http://archive.ncsa.illinois.edu/primer.html");
    HtmlAgilityPack.HtmlNodeCollection imgs = doc.DocumentNode.SelectNodes("//img");

    if (imgs != null)
    {
        foreach (HtmlAgilityPack.HtmlNode img in imgs)
        {
            string imgSrc = img.Attributes["src"].Value;
        }
    }

    Console.ReadKey();
}  


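You are using HtmlDocument.LoadHtml, which expects HTML source, not a URL. The URL string is parsed as if it were the markup itself, so the document contains no img elements and SelectNodes returns null.

You could use WebClient to download the HTML first, e.g.: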

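// WebClient is in the System.Net namespace.
WebClient wc = new WebClient();
// Download the page source, then hand the markup (not the URL) to LoadHtml.
string html = wc.DownloadString("http://archive.ncsa.illinois.edu/primer.html");
doc.LoadHtml(html);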

HtmlDocument also has Load overloads that read content from other sources, such as a file path, a Stream, or a TextReader.
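For completeness, here is a minimal sketch of the whole program using HtmlWeb, the Html Agility Pack class that fetches a URL and parses it in one step. The ImageSrcDemo class and the GetImageSources helper are names made up for this example, and the XPath //img[@src] is a small variation on the original query that skips img elements without a src attribute. It assumes the HtmlAgilityPack package is referenced and the URL is still reachable.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

static class ImageSrcDemo
{
    // Hypothetical helper: collects the src value of every <img> that has one.
    public static List<string> GetImageSources(HtmlDocument doc)
    {
        var sources = new List<string>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection imgs = doc.DocumentNode.SelectNodes("//img[@src]");
        if (imgs == null)
        {
            return sources;
        }

        foreach (HtmlNode img in imgs)
        {
            // GetAttributeValue falls back to the default instead of throwing.
            sources.Add(img.GetAttributeValue("src", string.Empty));
        }

        return sources;
    }

    static void Main(string[] args)
    {
        // HtmlWeb.Load downloads the page and parses it into an HtmlDocument.
        var web = new HtmlWeb();
        HtmlDocument doc = web.Load("http://archive.ncsa.illinois.edu/primer.html");

        foreach (string src in GetImageSources(doc))
        {
            Console.WriteLine(src);
        }

        Console.ReadKey();
    }
}
```

Keeping the extraction in its own method means it can be tried against a literal HTML string loaded with LoadHtml, without touching the network.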
