Get the content of the href attribute of an a element [duplicate]
Possible Duplicate:
Grabbing the href attribute of an A element
Hello,
I have the following HTML that I want to parse:
<td align="left" nowrap="nowrap"><a href="XXXXXXX">
I want to save XXXXXXX in a variable. I know next to nothing about regular expressions. I know how to do it using strpos, substr, etc., but I believe that is slower than using a regex.
if (preg_match('!<td align="left" NOWRAP><a href=".\s+/.+">!', $result, $matches))
    echo $matches[1];
else
    echo "error!!!";
I know the previous code is an atrocity to a regex expert. But I really have no idea how to do it. I need some tips, not the full solution.
Here's my (not remotely original) tip: don't use regex to parse HTML. Use an HTML parser.
See "How do you parse and process HTML/XML in PHP?".
Part of knowing regex is knowing when not to use it.
When you want to parse HTML, nine times out of ten regex is not the right tool.
You can use a DOM parser.
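As a rough illustration of the DOM approach, here is a minimal sketch using PHP's bundled DOMDocument and DOMXPath classes. The function name firstCellLinkHref and the sample table markup are hypothetical, made up for this example; the XPath expression assumes the cell always carries align="left" and a nowrap attribute, as in the question.

```php
<?php
// Hypothetical helper (not from the original post): returns the href of the
// first <a> found directly inside a matching <td>, or null if there is none.
function firstCellLinkHref(string $html): ?string
{
    if ($html === '') {
        return null; // loadHTML() rejects an empty string
    }

    $doc = new DOMDocument();

    // Collect libxml warnings internally instead of printing them,
    // since real-world HTML is rarely perfectly valid.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Assumption: the target cell has align="left" and a nowrap attribute.
    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query('//td[@align="left"][@nowrap]/a[@href]');
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }

    return $nodes->item(0)->getAttribute('href');
}

// Sample input modelled on the question; the surrounding table is assumed.
$html = '<table><tr><td align="left" nowrap="nowrap"><a href="XXXXXXX">link</a></td></tr></table>';
echo firstCellLinkHref($html) ?? 'error!!!';
```

Unlike a regex, this keeps working if the attributes change order, the quoting style differs, or extra whitespace appears between the tags.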
If your structure is always the same as the one you posted, you can use this regex:
<td\s+align="left"\s+nowrap="nowrap">\s*<a\s+href="(.*?)">
Then take group #1, which is the string matched between the parentheses. You have to create a group, a region between parentheses that contains the data you want to get. This link contains useful information about regex and its PHP implementation.
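To show how the group is read back in PHP, here is a small sketch built around preg_match. It is only an illustration under a few assumptions: the variable names mirror the question, the closing quote is kept outside the parentheses so that group 1 holds only the attribute value, and [^"]* is used in place of .*? so the match cannot run past the end of the attribute.

```php
<?php
// Sample input taken from the question.
$result = '<td align="left" nowrap="nowrap"><a href="XXXXXXX">';

// '!' is the pattern delimiter, so a '/' inside a URL needs no escaping.
// ([^"]*) is capture group 1: everything up to, but not including,
// the closing double quote of the href attribute.
// The trailing 'i' makes the match case-insensitive (e.g. NOWRAP).
$pattern = '!<td\s+align="left"\s+nowrap="nowrap">\s*<a\s+href="([^"]*)">!i';

$href = null;
if (preg_match($pattern, $result, $matches)) {
    $href = $matches[1]; // $matches[0] is the whole match, [1] is the group
    echo $href;
} else {
    echo "error!!!";
}
```

Note that this only matches when the attributes appear in exactly this order and with double quotes, which is why it is fragile compared with a parser.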