PHP parser ASP page [duplicate]

This question already has an answer here: Closed 11 years ago.

Possible duplicate: PHP : Parser asp page

I have this tag in an ASP page:

<a class='Lp' href="javascript:prodotto('Prodotto.asp?C=3')">AMARETTI VICENZI GR. 200</a>

How can I parse this ASP page to extract the text AMARETTI VICENZI GR. 200?

This is the code I am using, but it does not work:

<?php
$page = file_get_contents('http://www.prontospesa.it/Home/prodotti.asp?c=12'); 
preg_match_all('#<a href="(.*?)" class="Lp">(.*?)</a>#is', $page, $matches); 

$count = count($matches[1]); 
for($i = 0; $i < $count; $i++){ 
    echo $matches[2][$i];  
} 
?> 


Your regular expression (in preg_match_all) is wrong. It should be #<a class='Lp' href="(.*?)">(.*?)</a>#is, because in the page's HTML the class attribute comes first, not last, and is wrapped in single quotes, not double quotes.

You should strongly consider using DOMDocument and DOMXPath to parse the document instead of regular expressions. A DOM parser does not depend on attribute order or quote style.

DOMDocument/DOMXPath Example:

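<?php

// $html is the content of the website you're trying to parse
// (the URL here is the one from the question).
$html = file_get_contents('http://www.prontospesa.it/Home/prodotti.asp?c=12');

// Real-world HTML is rarely valid; collect parser warnings instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument;
$doc->loadHTML($html);

// Select every <a> element whose class attribute is exactly "Lp".
$xpath = new DOMXPath($doc);
$nodes = $xpath->query('//a[@class="Lp"]');

foreach ($nodes as $node) {
    echo $node->textContent . PHP_EOL;
}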
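
To try the DOM approach without fetching the live page, here is a minimal sketch that runs on an inline sample. The sample markup is the tag from the question, plus one made-up link that should be ignored. The $products array and the token-matching XPath expression are my own illustrative additions, not something the site requires.

```php
<?php

// Sample markup: the first link is copied from the question,
// the second is a hypothetical link that should not match.
$html = '<html><body>'
      . '<a class=\'Lp\' href="javascript:prodotto(\'Prodotto.asp?C=3\')">AMARETTI VICENZI GR. 200</a>'
      . '<a class="other" href="#">ignored</a>'
      . '</body></html>';

// Collect parser warnings instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument;
$doc->loadHTML($html);

$xpath = new DOMXPath($doc);

// Token match on the class attribute: matches class="Lp" and also
// class="Lp something", unlike the exact comparison @class="Lp".
$query = '//a[contains(concat(" ", normalize-space(@class), " "), " Lp ")]';

$products = [];
foreach ($xpath->query($query) as $node) {
    $products[] = [
        'name' => trim($node->textContent),
        'href' => $node->getAttribute('href'),
    ];
}

print_r($products);
```

The parser normalizes the single-quoted class attribute, so the query works regardless of how the page quotes or orders its attributes.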


You have to adjust the regular expression so that it matches the actual HTML of the page you are fetching:

'#<a class=\'Lp\' href="(.*?)">(.*?)</a>#is'

Note that the class attribute comes first and is surrounded by single quotes, not double quotes. I tested it and it works for me.
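
As a quick check, the pattern above can be run against the tag quoted in the question. This is only a sketch: $page holds a hard-coded sample here, whereas the original code fills it with file_get_contents.

```php
<?php

// Sample tag copied from the question (class first, single-quoted).
$page = '<a class=\'Lp\' href="javascript:prodotto(\'Prodotto.asp?C=3\')">AMARETTI VICENZI GR. 200</a>';

// Group 1 captures the href value, group 2 captures the link text.
preg_match_all('#<a class=\'Lp\' href="(.*?)">(.*?)</a>#is', $page, $matches);

foreach ($matches[2] as $text) {
    echo $text . PHP_EOL; // prints "AMARETTI VICENZI GR. 200"
}
```

The pattern is still tied to this exact attribute order and quoting, so it will stop matching if the site changes its markup.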
