HTML Agility Pack

I'm trying to use HTML Agility Pack to get the description text from the content attribute of a tag like this:

<meta name="description" content="**this is the text i want to extract and store in a string**" />

Someone on Stack Overflow suggested a little while ago that I use HtmlAgilityPack, but I don't know how to use it. All the documentation I've found for it (including the docs contained in the downloads) has invalid links, so I can't view it.

Can somebody please help me solve this?


The usage is very similar to XmlDocument, so you could use the MSDN documentation on XmlDocument for a broad overview. You might also want to learn XPath syntax (MSDN).

Example:

HtmlDocument doc = new HtmlDocument();
doc.Load(path); // or .LoadHtml(html);
HtmlNode node = doc.DocumentNode.SelectSingleNode("//meta[@name='description']");
if (node != null) {
    string desc = node.GetAttributeValue("content", "");
    // TODO: write desc somewhere
}

The second argument to GetAttributeValue is the default returned in case the attribute is not found.
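
To make the fallback behaviour concrete, here is a minimal sketch that parses an HTML string with LoadHtml rather than loading a file. The class name MetaDemo, the method GetMetaContent, and its parameters are hypothetical names made up for this illustration; only HtmlDocument, LoadHtml, SelectSingleNode, and GetAttributeValue come from the example above. It assumes the name you pass in contains no single quote, since the name is pasted straight into the XPath expression.

```csharp
using System;
using HtmlAgilityPack;

public static class MetaDemo
{
    // Hypothetical helper: returns the content attribute of the <meta> tag
    // with the given name, or fallback when the tag or the attribute is missing.
    // Assumes name contains no single quote (it is embedded in the XPath as-is).
    public static string GetMetaContent(string html, string name, string fallback)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectSingleNode returns null when nothing matches the XPath.
        HtmlNode node = doc.DocumentNode.SelectSingleNode("//meta[@name='" + name + "']");
        if (node == null)
        {
            return fallback;
        }

        // The second argument is returned when the attribute does not exist.
        return node.GetAttributeValue("content", fallback);
    }
}
```

Called with a page that has a description tag, this should return its content text. For a meta tag that has no content attribute, or a name that matches no tag at all, it should return whatever fallback string you passed in.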


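// Requires: using HtmlAgilityPack;
public string HtmlAgi(string url, string key)
{
    // Download and parse the page at the given URL.
    var webGet = new HtmlWeb();
    var doc = webGet.Load(url);

    // Find the <meta> tag whose name attribute equals key.
    HtmlNode ourNode = doc.DocumentNode.SelectSingleNode(string.Format("//meta[@name='{0}']", key));

    // Return its content attribute, or a placeholder if the tag is missing.
    if (ourNode != null)
    {
        return ourNode.GetAttributeValue("content", "");
    }
    else
    {
        return "not found";
    }
}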