
Extract specific HTML text using HTMLAgilityPack

<table class="result" summary="Summary Description.">
<tbody>
<tr>
    <th scope="col" class="firstcol">Column 1</th>
    <th scope="col">Column 2</th>
    <th scope="col">Column 3</th>
    <th scope="col" class="lastcol">Column 4</th>
</tr>
<tr class="even">
    <td class="firstcol">Text 1</td>
    <td>Text 2</td>
    <td>4Text 3</td>
    <td class="lastcol">Text 4</td>
</tr>
</tbody></table>

The part of the HTML I'm interested in looks like this. I want Text 1, Text 2, Text 3 and Text 4. Using HTMLAgilityPack, how can I extract that data? I googled and checked this site but didn't find anything that matched my scenario exactly.

        if (htmlDoc.DocumentNode != null)
        {
            foreach (HtmlNode text in htmlDoc.DocumentNode.SelectNodes(???))
            {
                ???
            }
        }


Try this:

        // Requires: using System; using HtmlAgilityPack;
        var html = @"<table class=""result"" summary=""Summary Description.""> <tbody> <tr>     <th scope=""col"" class=""firstcol"">Column 1</th>     <th scope=""col"">Column 2</th>     <th scope=""col"">Column 3</th>     <th scope=""col"" class=""lastcol"">Column 4</th> </tr> <tr class=""even"">     <td class=""firstcol"">Text 1</td>     <td>Text 2</td>     <td>4Text 3</td>     <td class=""lastcol"">Text 4</td> </tr> </tbody></table>";
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        // Select the text nodes of every <td> in rows with class "even".
        // SelectNodes returns null when nothing matches, so check before iterating.
        var textNodes = doc.DocumentNode.SelectNodes("//tr[@class='even']/td/text()");
        if (textNodes != null)
        {
            foreach (var textNode in textNodes)
            {
                Console.WriteLine(textNode.InnerText);
            }
        }
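
The snippet above selects the `text()` nodes directly. If the cells might contain nested markup or HTML entities, it may be more robust to select the `<td>` elements and read their `InnerText` instead. The following is only a sketch under a few assumptions: the class `CellTextDemo` and the helper `ExtractCellText` are hypothetical names made up for illustration, the XPath is narrowed to the `result` table from the question, and the sample HTML in `Main` is a shortened version of the original.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

static class CellTextDemo
{
    // Hypothetical helper (not part of HtmlAgilityPack): returns the trimmed,
    // entity-decoded text of every <td> in rows whose class equals rowClass.
    static List<string> ExtractCellText(string html, string rowClass)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var result = new List<string>();

        // SelectNodes returns null, not an empty collection, when nothing matches.
        var cells = doc.DocumentNode.SelectNodes(
            $"//table[@class='result']//tr[@class='{rowClass}']/td");
        if (cells == null)
        {
            return result;
        }

        foreach (var cell in cells)
        {
            // InnerText concatenates the text of all descendants;
            // DeEntitize turns entities such as &amp; back into characters.
            result.Add(HtmlEntity.DeEntitize(cell.InnerText).Trim());
        }
        return result;
    }

    static void Main()
    {
        // Shortened sample based on the table in the question.
        var html = @"<table class=""result""><tbody><tr class=""even""><td class=""firstcol"">Text 1</td><td>Text 2</td></tr></tbody></table>";

        foreach (var text in ExtractCellText(html, "even"))
        {
            Console.WriteLine(text);
        }
    }
}
```

Returning an empty list when nothing matches lets callers iterate without a separate null check. If the table also has rows with other classes (for example `odd`), the XPath predicate would need to be relaxed accordingly.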