
Parsing HTML and counting tags with C#

Suppose I have a block of HTML in a string:

<div class="nav mainnavs">
    <ul>
        <li><a id="nav-questions" href="/questions">Questions</a></li>
        <li><a id="nav-tags" href="/tags">Tags</a></li>
        <li><a id="nav-users" href="/users">Users</a></li>
        <li><a id="nav-badges" href="/badges">Badges</a></li>
        <li><a id="nav-unanswered" href="/unanswered">Unanswered</a></li>
    </ul>
</div>

How can I parse the HTML and count the number of instances of a specific type of tag, such as <div> or <li>?


You can use HtmlAgilityPack for this. The latest version supports LINQ, so counting tags is straightforward:

For a local HTML file:

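using System.Linq;      // provides the Count() extension method
using HtmlAgilityPack;

HtmlDocument doc = new HtmlDocument();
doc.Load(@"test.html");
int liCount = doc.DocumentNode.Descendants("li").Count(); // returns 5 if test.html contains the HTML above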

From the web:

HtmlWeb web = new HtmlWeb();
HtmlDocument doc = web.Load("http://stackoverflow.com");
int liCount = doc.DocumentNode.Descendants("li").Count();
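
From a string:

The question starts with the HTML already held in a string, so here is a minimal sketch for that case. It assumes the HtmlAgilityPack NuGet package is referenced; `TagCounter` and `CountTags` are hypothetical names used only for illustration, while `LoadHtml` and `Descendants` are the library's own methods.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

public static class TagCounter
{
    // Hypothetical helper: counts the elements named tagName in an HTML string.
    public static int CountTags(string html, string tagName)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // parse from memory instead of a file or URL
        return doc.DocumentNode.Descendants(tagName).Count();
    }

    public static void Main()
    {
        // The markup from the question, as a verbatim string (quotes doubled).
        string html = @"<div class=""nav mainnavs"">
    <ul>
        <li><a id=""nav-questions"" href=""/questions"">Questions</a></li>
        <li><a id=""nav-tags"" href=""/tags"">Tags</a></li>
        <li><a id=""nav-users"" href=""/users"">Users</a></li>
        <li><a id=""nav-badges"" href=""/badges"">Badges</a></li>
        <li><a id=""nav-unanswered"" href=""/unanswered"">Unanswered</a></li>
    </ul>
</div>";

        Console.WriteLine(CountTags(html, "li"));  // 5
        Console.WriteLine(CountTags(html, "div")); // 1
    }
}
```

Pass tag names in lowercase (`"li"`, `"div"`), since that is how HtmlAgilityPack reports element names by default.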