Extract HTML Table ( span ) tags using Jsoup in Java
I am trying to extract the td name and the span class. In the sample code, I want to extract the a href within the first td ("accessory") and the span tag in the second td.
I want to print: Mouse, is-present, Yes; KeyBoard, No; Dual-Monitor, is-present, Yes.
When I use the Java code below, I get: Mouse Yes KeyBoard No Dual-Monitor Yes.
How do I get the span class name?
HTML Code
<tr>
<td class="" width="1%" style="padding:0px;">
</td>
<td class="">
<a href="/accessory">Mouse</a>
</td>
<td class="tright ">
<span class='is_present'>Yes</span><br/>
</td>
<td class="tright ">
<br/>
</td>
<tr>
<td class="" width="1%" style="padding:0px;">
</td>
<td class="">
<a href="/accessory"> KeyBoard</a>
</td>
<td colspan="2" class="" style='text-align:center;'>
<small>No</small>
</td>
<td class="" width="1%" style="padding:0px;">
</td>
<td class="">
<a href="/accessory">Dual-Monitor</a>
</td>
<td class="tright ">
<span class='is_present'>Yes</span><br/>
</td>
<td class="tright ">
<br/>
</td>
Java code
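import java.util.Iterator;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

private void printParse(String htmlData) {
    // Parse the raw HTML string into a Document before querying it.
    Document data = Jsoup.parse(htmlData);
    Element table = data.select("table[class=computer_table]").first();
    Iterator<Element> ite = table.select("td").iterator();
    while (ite.hasNext()) {
        // text() returns only the visible text, so the span class is lost here.
        System.out.println(ite.next().text());
    }
}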
If you already have the table element, all you need to do is select the span elements. You don't need to go through the td elements, because you can query for span directly and still get the same result. Below is the code snippet.
// Select every span inside the table and print its class attribute.
Elements spans = table.select("span");
for (Element src : spans) {
    // println keeps each class name on its own line.
    System.out.println(src.attr("class"));
}
But make sure that you actually got the table element: first() returns null when nothing matches the selector.
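To put the two points above together, here is a minimal sketch, assuming Jsoup is on the classpath and that the table carries class="computer_table" as in the question. The class name SpanClassDemo, the method spanClasses and the shortened sample HTML are made up for illustration; they are not from the original post.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the original question or answer.
class SpanClassDemo {

    // Returns "class=text" for every span inside the table,
    // or an empty list when the table cannot be found.
    static List<String> spanClasses(String html) {
        List<String> out = new ArrayList<>();
        Document doc = Jsoup.parse(html);
        Element table = doc.select("table.computer_table").first();
        if (table == null) {
            // first() returned null: the selector matched nothing.
            return out;
        }
        for (Element span : table.select("span")) {
            out.add(span.className() + "=" + span.text());
        }
        return out;
    }

    public static void main(String[] args) {
        // Shortened, well-formed version of the sample markup (assumed).
        String html = "<table class=\"computer_table\"><tr>"
                + "<td><a href=\"/accessory\">Mouse</a></td>"
                + "<td class=\"tright\"><span class='is_present'>Yes</span><br/></td>"
                + "</tr></table>";
        System.out.println(spanClasses(html));                  // [is_present=Yes]
        System.out.println(spanClasses("<p>no table here</p>")); // []
    }
}
```

Note that the rows need to sit inside a table element when the string is parsed; Jsoup's HTML parser drops tr and td tags that appear outside a table.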
// pNames, emails and moneyDollars are lists declared elsewhere in this answer's code.
// This selects the table by id; use table[class=computer_table] if it is identified by class.
Element table = doc.select("table[id=computer_table]").first();
Elements results = table.select("td");
for (Element dl : results) {
    // Visible text of the cell, e.g. the accessory name.
    if (dl.text().length() > 1)
        pNames.add(dl.text());
    // Text of a <small> child, e.g. "No".
    if (dl.select("small").text().length() > 1)
        emails.add(dl.select("small").text());
    // Class attribute of the first <span> child, e.g. "is_present".
    if (dl.select("span").attr("class").length() > 1)
        moneyDollars.add(dl.select("span").attr("class"));
}
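If you want the name, the span class and the text grouped per row, as in the output the question asks for, one option is to loop over the tr elements instead of the td elements. This is only a sketch under a few assumptions: Jsoup 1.11.1 or later (for selectFirst), a well-formed version of the sample table with one accessory per row, and illustrative names (RowSummaryDemo, describeRows, SAMPLE) that are not from the original post.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the original question or answers.
class RowSummaryDemo {

    // Assumed, cleaned-up version of the markup in the question.
    static final String SAMPLE = "<table class=\"computer_table\">"
            + "<tr><td><a href=\"/accessory\">Mouse</a></td>"
            + "<td class=\"tright\"><span class='is_present'>Yes</span><br/></td></tr>"
            + "<tr><td><a href=\"/accessory\">KeyBoard</a></td>"
            + "<td colspan=\"2\"><small>No</small></td></tr>"
            + "<tr><td><a href=\"/accessory\">Dual-Monitor</a></td>"
            + "<td class=\"tright\"><span class='is_present'>Yes</span><br/></td></tr>"
            + "</table>";

    // Builds one line per row: "name, spanClass, spanText" or "name, smallText".
    static List<String> describeRows(String html) {
        Document doc = Jsoup.parse(html);
        List<String> lines = new ArrayList<>();
        for (Element row : doc.select("table.computer_table tr")) {
            Element link = row.selectFirst("td a[href]");
            if (link == null) {
                continue; // row without an accessory link
            }
            Element span = row.selectFirst("td span[class]");
            Element small = row.selectFirst("td small");
            if (span != null) {
                lines.add(link.text() + ", " + span.className() + ", " + span.text());
            } else if (small != null) {
                lines.add(link.text() + ", " + small.text());
            } else {
                lines.add(link.text());
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describeRows(SAMPLE)) {
            System.out.println(line);
        }
    }
}
```

With the sample above this prints "Mouse, is_present, Yes", "KeyBoard, No" and "Dual-Monitor, is_present, Yes", each on its own line. Working row by row keeps each name paired with its own span, which the flat td loop cannot guarantee.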