Parsing a whole table in Perl using HTML::TableExtract
I want to parse a whole table using HTML::TableExtract in Perl. This is what I wrote:
use HTML::TableExtract;
use LWP::Simple;
use Data::Dumper;
my $te = new HTML::TableExtract( depth=>3, count=>3, gridmap=>0);
my $content = get("C:/Users/admin/Desktop/tabela.html");
$te->parse($content);
foreach $ts ($te->table_states)
{
print $ts;
foreach $row ($ts->rows)
{
print Dumper $row;
#print Dumper $row if (scalar(@$row) == 2);
}
}
And this is what the file "tabela.html" looks like:
<table width=100% align=center cellspacing=0 cellpadding=0 class='raspored_1x2'><tr class=svetlija><td align=center width=20% >02.03.2011 20:30</td><td align=center width=5% >261</td><td align=right width=21% >AUSTRIA W.</td><td align=center width=2% >-</td><td align=left width=21% >STURM</td><td align=right width=8%>
<a title="dodaj u tiket" href=?option=com_content&task=view&id=24&Itemid=31&sport=Fudbal&a=add&rb=261-5018-2011&dom=AUSTRIA+W.&gost=STURM&tip=1&kvota=1.80>1.80</a></td><td align=right width=8% >
<a title="dodaj u tiket" href=?option=com_content&task=view&id=24&Itemid=31&sport=Fudbal&a=add&rb=261-5018-2011&dom=AUSTRIA+W.&gost=STURM&tip=X&kvota=3.30>3.30</a></td><td align=right width=8% >
<a title="dodaj u tiket" href=?option=com_content&task=view&id=24&Itemid=31&sport=Fudbal&a=add&rb=261-5018-2011&dom=AUSTRIA+W.&gost=STURM&tip=2&kvota=3.90>3.90</a></td><td width=7%>
<a title='Pogledaj kvote' href='javascript:void(0)' onclick="prikaziKvote('261-5018-2011')">
<img src="http://www.balkanbet.co.rs/site/templates/balkanbet_green/images/arrow_down.gif" class='strelica'>
</a>
</td></tr></table>
When I run the Perl script, nothing happens. Does anyone have an idea what the problem is?
#!/usr/bin/env perl
use warnings;
use strict;
use HTML::TableExtract;
use Data::Dumper;
my $content =<<EOC;
<table width=100% align=center cellspacing=0 cellpadding=0 class='raspored_1x2'>
<tr class=svetlija>
<td align=center width=20% >02.03.2011 20:30</td>
<td align=center width=5% >261</td>
<td align=right width=21% >AUSTRIA W.</td>
<td align=center width=2% >-</td>
<td align=left width=21% >STURM</td>
<td align=right width=8%><a title="dodaj u tiket" href=?option=com_content&task=view&id=24&Itemid=31&sport=Fudbal&a=add&rb=261-5018-2011&dom=AUSTRIA+W.&gost=STURM&tip=1&kvota=1.80>1.80</a></td>
<td align=right width=8% ><a title="dodaj u tiket" href=?option=com_content&task=view&id=24&Itemid=31&sport=Fudbal&a=add&rb=261-5018-2011&dom=AUSTRIA+W.&gost=STURM&tip=X&kvota=3.30>3.30</a></td>
<td align=right width=8% ><a title="dodaj u tiket" href=?option=com_content&task=view&id=24&Itemid=31&sport=Fudbal&a=add&rb=261-5018-2011&dom=AUSTRIA+W.&gost=STURM&tip=2&kvota=3.90>3.90</a></td>
<td width=7%><a title='Pogledaj kvote' href='javascript:void(0)' onclick="prikaziKvote('261-5018-2011')"><img src="http://www.balkanbet.co.rs/site/templates/balkanbet_green/images/arrow_down.gif" class='strelica'></a></td>
</tr>
</table>
EOC
my $te = HTML::TableExtract->new();
$te->parse( $content );
for my $ts ($te->table_states) {
print $ts;
for my $row ($ts->rows) {
print Dumper $row;
# print Dumper $row if (scalar(@$row) == 2);
}
}
# HTML::TableExtract::Table=HASH(0x91e2e0)$VAR1 = [
# '02.03.2011 20:30',
# '261',
# 'AUSTRIA W.',
# '-',
# 'STURM',
# '1.80',
# '3.30',
# '3.90',
# undef
# ];
There may be other problems, but the first thing that springs to mind is that you are using LWP and passing it a file path, not a URL.
You probably want File::Slurp instead of LWP::Simple (note that it has a different API, so you'll need to replace get()).
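A minimal sketch of that suggestion, assuming File::Slurp is installed and that the file lives at the path from the question. The helper name table_rows_from_file is made up for illustration. The extractor is created without depth or count constraints, as in the working example above, so every table in the file is matched.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Slurp qw(read_file);
use HTML::TableExtract;

# Hypothetical helper: read a local HTML file and return every table row
# as an array reference of cell texts.
sub table_rows_from_file {
    my ($path) = @_;

    # In scalar context read_file returns the whole file as one string.
    my $content = read_file($path);

    # No depth/count constraints, so all tables in the document match.
    my $te = HTML::TableExtract->new();
    $te->parse($content);

    my @rows;
    for my $ts ($te->tables) {
        for my $row ($ts->rows) {
            push @rows, $row;
        }
    }
    return @rows;
}

# Path from the question unless another one is given on the command line.
my $path = @ARGV ? $ARGV[0] : 'C:/Users/admin/Desktop/tabela.html';

if (-e $path) {
    for my $row (table_rows_from_file($path)) {
        # Cells without text (such as the image-only cell) come back as undef.
        print join(' | ', map { defined $_ ? $_ : '' } @$row), "\n";
    }
}
else {
    warn "File not found: $path\n";
}
```

Run it as `perl script.pl tabela.html` to point it at a different file.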