Having a problem matching an HTML element using preg_match
I am trying to match an HTML element, but I don't think it's matching, since $titles is empty. Can anyone correct me?
My preg_match:
preg_match_all("~<td align=\"left\" width=\"50%\">[^<]*. <b><a href=\"(.*?)\">[^<]*</a>~i", $main, $titles);
Example HTML to match:
//<td align="left" width="50%">1. <b><a title="Wat" href="http://www.exmple.com/q.html">Wat</a></b><br></td>
Am I missing something?
Thanks all for any help
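There's nothing in your pattern to match title="Wat" in the <a> tag. The pattern expects href to come immediately after "<a ", but in your HTML the title attribute comes first.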
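If you do want to stay with a regex, one possible adjustment is to let the pattern skip over any attributes that appear before href. This is only a sketch against the sample row from the question. The variable names follow the question, and the simplified "[^<]*<b>" part is my own assumption about what the numbering prefix looks like.

```php
<?php
// Sample input copied from the question.
$main = '<td align="left" width="50%">1. <b><a title="Wat" href="http://www.exmple.com/q.html">Wat</a></b><br></td>';

// [^>]*? lazily skips any attributes (such as title="...") that precede href.
$pattern = '~<td align="left" width="50%">[^<]*<b><a\s[^>]*?href="(.*?)"[^>]*>[^<]*</a>~i';

preg_match_all($pattern, $main, $titles);

// $titles[1] holds the captured href values.
print_r($titles[1]);
```

This still breaks as soon as the markup changes (different quoting, extra tags, reordered td attributes), which is why a parser is usually the safer choice.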
I'd suggest not using a regex to parse it though. I'm not too familiar with PHP but I'm sure it already has something that will do most of the work for you.
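For what it's worth, PHP's bundled DOM extension (DOMDocument and DOMXPath) looks like one candidate for that. A minimal sketch, assuming the row from the question sits inside a table (the table and tr wrapper here is my addition so the fragment parses predictably):

```php
<?php
// Sample row from the question, wrapped in <table><tr> (an assumption).
$main = '<table><tr><td align="left" width="50%">1. <b><a title="Wat" href="http://www.exmple.com/q.html">Wat</a></b><br></td></tr></table>';

$doc = new DOMDocument();
// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($main);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$links = array();
// //td//a matches anchors at any depth below a <td>, e.g. inside <b>.
foreach ($xpath->query('//td//a[@href]') as $a) {
    $links[] = $a->getAttribute('href');
}

print_r($links);
```

Swapping getAttribute('href') for getAttribute('title') would collect the titles instead.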
As I said in my comment, regex is rarely, if ever, the proper tool to use when trying to parse HTML. I'm going to use an example of Zend_Dom_Query, one of the components in Zend Framework, simply because I haven't seen it recommended on one of these questions yet. So...
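$dom = new Zend_Dom_Query($htmlHaystack);
// queryXpath() takes an XPath expression; query() expects a CSS selector.
// //td//a is used because the <a> is nested inside a <b>, not directly under the <td>.
$anchors = $dom->queryXpath('//td//a[@title]');
if(count($anchors) > 0)
{
    $titles = array();
    foreach($anchors as $element)
    {
        $titles[] = $element->getAttribute('title');
    }
}
else
{
    $titles = null;
}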
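$string='<td align="left" width="50%">1. <b><a title="Wat" href="http://www.exmple.com/q.html">Wat</a></b><br></td>';
// Split on the closing anchor tag so each chunk holds at most one link.
$s = explode("</a>",$string);
foreach($s as $k){
    if (strpos($k,"href")!==FALSE){
        // Strip everything up to href=" and everything from "> onward, leaving the URL.
        echo preg_replace('/.*href="|">.*/ms',"",$k);
    }
}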