Html Agility Pack problem retrieving data

I am trying to parse data from the web page http://www.bbb.org/kitchener/accredited-business-directory?letter=a

I want to get all the categories, such as

Accountants - Certified Public (2)

Accounting Services (1)

and so on. The problem is that when I go to the node, the a tag is null. I do not know why, but Html Agility Pack does not pick up these tags. Checking in the Watch window shows that the div encloses only the commented-out line-break tags and not the a tags, whereas the page source does contain them:

doc.DocumentNode.SelectNodes("//tr/td/table/tr/td/div/div")[0].OuterHtml    "<div style=\"font-size: 12px;line-height: 16px;\"><!--<br />-->\r\n<!--<br />-->\r\n</div>"

Here is the start of that div. Note that I have included only two of the a tags from the HTML:

<div style="float: left; width: 305px;"> 
  <h5 style="margin: 0px; margin-bottom: 5px; border-bottom: 1px solid #cccccc; padding-bottom: 5px; font-size: 12px;">Categories Starting with letter 'a'</h5> 
   <div style="font-size: 12px;line-height: 16px;">
     <!--<br />-->
     <!--<br />-->       
     <a class="listingName" href="/kitchener/accredited-business-directory/accountants">Accountants (11)</a><br />   
     <a class="listingName" href="/kitchener/accredited-business-directory/accountants-certified-public">Accountants - Certified Public (2)</a><br /> 
   </div> 
</div>

How can I get the data?

Even selecting every link on the page does not reveal these links:

foreach (var test in doc.DocumentNode.SelectNodes("//a[@href]"))
{
    MessageBox.Show(test.InnerText + "\n" + test.InnerHtml);
}


This worked fine for me using the following sample:

HtmlWeb web = new HtmlWeb();
HtmlAgilityPack.HtmlDocument doc = web.Load("http://www.bbb.org/kitchener/accredited-business-directory?letter=a");

foreach (var link in doc.DocumentNode.SelectNodes("//a[@href]"))
{
    Console.WriteLine(link.InnerText);
}

Output (shortened):

BBB
Home
Accredited Business Directory
Accountants (11)
Accountants - Certified Public (2)
Accounting Services (1)
Advertising - Direct Mail (3)
Advertising Agencies & Counselors (3)
Advertising Specialties (3)
...
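
If only the category links are wanted rather than every anchor on the page, the XPath can be narrowed to the `listingName` class that appears in the markup posted in the question. The following is a minimal sketch under that assumption: the inline HTML string is a stand-in for the live page (which may have changed since), and the `" -> "` output format is purely illustrative. It also guards against `SelectNodes` returning null, which is what Html Agility Pack does when nothing matches.

```csharp
using System;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Inline copy of the fragment from the question (an assumption standing in
        // for the live page), so the sketch runs without network access.
        const string html = @"<div style=""font-size: 12px;line-height: 16px;"">
  <!--<br />-->
  <!--<br />-->
  <a class=""listingName"" href=""/kitchener/accredited-business-directory/accountants"">Accountants (11)</a><br />
  <a class=""listingName"" href=""/kitchener/accredited-business-directory/accountants-certified-public"">Accountants - Certified Public (2)</a><br />
</div>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches,
        // so check before iterating.
        var links = doc.DocumentNode.SelectNodes("//a[@class='listingName']");
        if (links == null)
        {
            Console.WriteLine("No category links found.");
            return;
        }

        foreach (var link in links)
        {
            // DeEntitize decodes entities such as &amp; in the link text.
            string text = HtmlEntity.DeEntitize(link.InnerText).Trim();
            string href = link.GetAttributeValue("href", "");
            Console.WriteLine(text + " -> " + href);
        }
    }
}
```

To run the same query against the live page, replace the `LoadHtml` call with `new HtmlWeb().Load(url)` as in the sample above. If the null branch is hit there, the HTML that was actually downloaded differs from what the browser shows, and dumping `doc.DocumentNode.OuterHtml` is a reasonable first check.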