PHP parser ASP page [duplicate]
Possible Duplicate:
PHP : Parser asp page
I have this tag in an ASP page:
<a class='Lp' href="javascript:prodotto('Prodotto.asp?C=3')">AMARETTI VICENZI GR. 200</a>
How can I parse this ASP page to get the text AMARETTI VICENZI GR. 200?
This is the code I use, but it doesn't work:
<?php
$page = file_get_contents('http://www.prontospesa.it/Home/prodotti.asp?c=12');
preg_match_all('#<a href="(.*?)" class="Lp">(.*?)</a>#is', $page, $matches);
$count = count($matches[1]);
for($i = 0; $i < $count; $i++){
echo $matches[2][$i];
}
?>
Your regular expression (in preg_match_all) is wrong. It should be #<a class='Lp' href="(.*?)">(.*?)</a>#is
since the class attribute comes first, not last, and is wrapped in single quotes, not double quotes. If you write the pattern inside a single-quoted PHP string, escape those single quotes with a backslash.
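As a quick check, here is a minimal sketch that runs the corrected pattern against the tag from the question. The markup is pasted inline as a stand-in for the downloaded page, so the `$page` contents here are an assumption; on the live site you would still fill `$page` with file_get_contents.

```php
<?php
// Sample markup copied from the question; it stands in for the fetched page.
$page = <<<'HTML'
<a class='Lp' href="javascript:prodotto('Prodotto.asp?C=3')">AMARETTI VICENZI GR. 200</a>
HTML;

// class comes first and uses single quotes; href follows and uses double quotes.
// The single quotes around Lp are escaped because the pattern is a single-quoted string.
preg_match_all('#<a class=\'Lp\' href="(.*?)">(.*?)</a>#is', $page, $matches);

// $matches[1] holds the href values, $matches[2] the link texts.
foreach ($matches[2] as $text) {
    echo trim($text), PHP_EOL;
}
```

The pattern is still tied to the exact attribute order and quoting of this page, so it will stop matching if the site changes its markup.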
You should strongly consider using DOMDocument and DOMXPath to parse your document instead of regular expressions.
DOMDocument/DOMXPath Example:
<?php
// ...
$doc = new DOMDocument;
$doc->loadHTML($html); // $html is the content of the website you're trying to parse.
$xpath = new DOMXPath($doc);
$nodes = $xpath->query('//a[@class="Lp"]');
foreach ( $nodes as $node )
echo $node->textContent . PHP_EOL;
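For reference, a self-contained variant of the same idea might look like the sketch below. The inline `$html` (including the second, non-product link) is made up for illustration, and the `libxml_use_internal_errors` call is an addition on my part: real-world pages are often not valid HTML, and without it loadHTML tends to emit a warning for every problem it finds.

```php
<?php
// Hypothetical sample page; in practice $html would come from file_get_contents().
$html = <<<'HTML'
<html><body>
<a class='Lp' href="javascript:prodotto('Prodotto.asp?C=3')">AMARETTI VICENZI GR. 200</a>
<a class="other" href="#">Not a product</a>
</body></html>
HTML;

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// The XPath test compares the attribute value, so the quote style
// and attribute order used in the source markup do not matter.
$xpath = new DOMXPath($doc);
$names = [];
foreach ($xpath->query('//a[@class="Lp"]') as $node) {
    $names[] = trim($node->textContent);
}

print_r($names);
```

Unlike the regular expression, this keeps working if the site reorders the attributes or switches quote characters.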
You have to adjust the regular expression slightly to match the HTML of the page you are fetching:
'#<a class=\'Lp\' href="(.*?)">(.*?)</a>#is'
Note that the class attribute comes first and is surrounded by single quotes, not double quotes. I tested it and it works for me.