HtmlUnit: collecting all links by class name

I would like to scrape all the links on a page that have a specific class name.

e.g. HTML Agriculture (92)

<a href="http://www.specificurl/page.html" class="generate">Agriculture</a>

I have been toying with the following pieces of code:

   List<?> links = page.getByXPath("//div[@class='generate']/@href");

   OR

   List<?> links = page.getAnchors();
   System.out.println(links);

The getByXPath option returns null and the other option grabs all anchors. Is there a way to grab only the links with that class into a list?


This is a terrible XPath, but I was having issues narrowing it down. (I can look into a better XPath if necessary, but for now this one worked.)

List<?> links = page.getByXPath("/html/body/div[2]/div[2]/table/tbody/tr/td/table/tbody/tr[7]/td/table/tbody/tr/td/div/table/tbody/tr[2]/td/div/table/tbody/tr/td/table/tbody/tr/td/ul/li/a/@href");

I'm not quite sure why it wouldn't allow us to grab it by that class name.

Let me know how it works for you when you get the chance
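
One possible reason the class-based attempt in the question found nothing: its XPath matches `div` elements, whereas in the sample HTML the `class="generate"` attribute sits on the `<a>` element itself. A shorter expression such as `//a[@class='generate']/@href` may therefore be worth trying. The sketch below only checks that expression against a small, well-formed snippet using the JDK's built-in XPath engine rather than HtmlUnit. The class name `LinkXPathSketch`, the helper `hrefsByClass`, and the second link (`other.html`) are made up for illustration, and the real page's markup may differ.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of HtmlUnit.
class LinkXPathSketch {

    // Returns the href value of every <a> whose class attribute equals cssClass.
    // Assumes well-formed markup and a class name without quote characters.
    static List<String> hrefsByClass(String xhtml, String cssClass) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xhtml)));

        // Match the anchor itself, not an enclosing div.
        String expr = "//a[@class='" + cssClass + "']/@href";
        NodeList attrs = (NodeList) XPathFactory.newInstance()
                .newXPath()
                .evaluate(expr, doc, XPathConstants.NODESET);

        List<String> hrefs = new ArrayList<>();
        for (int i = 0; i < attrs.getLength(); i++) {
            hrefs.add(attrs.item(i).getNodeValue());
        }
        return hrefs;
    }

    public static void main(String[] args) throws Exception {
        // First link is taken from the question; the second is invented as a non-match.
        String xhtml = "<html><body><ul>"
                + "<li><a href=\"http://www.specificurl/page.html\" class=\"generate\">Agriculture</a></li>"
                + "<li><a href=\"http://www.specificurl/other.html\" class=\"other\">Other</a></li>"
                + "</ul></body></html>";
        System.out.println(hrefsByClass(xhtml, "generate"));
    }
}
```

With HtmlUnit, the same expression string could be passed to `page.getByXPath(...)`. That has not been run against the real page here, so treat it as something to try rather than a confirmed fix. Note that `@class='generate'` is an exact match, so it would miss an anchor whose class attribute lists several names.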
