How does this RegEx for parsing emails work in PHP?
Okay, I have the following PHP code to extract an email address of the following two forms:
Random Stranger <email@domain.com>
email@domain.com
Here is the PHP code:
// The first example
$sender = "Random Stranger <email@domain.com>";
$pattern = '/([\w_-]*@[\w-\.]*)|.*<([\w_-]*@[\w-\.]*)>/';
preg_match($pattern,$sender,$matches,PREG_OFFSET_CAPTURE);
echo "<pre>";
print_r($matches);
echo "</pre><hr>";
// The second example
$sender = "user@domain.com";
preg_match($pattern,$sender,$matches,PREG_OFFSET_CAPTURE);
echo "<pre>";
print_r($matches);
echo "</pre>";
My question is: what is in $matches? It seems to be a strange collection of arrays. Which index holds the match from the parentheses? How can I be sure I'm getting the email address and only the email address?
Update:
Here is the output:
Array
(
[0] => Array
(
[0] => Random Stranger
[1] => 0
)
[1] => Array
(
[0] =>
[1] => -1
)
[2] => Array
(
[0] => user@domain.com
[1] => 5
)
)
Array
(
[0] => Array
(
[0] => user@domain.com
[1] => 0
)
[1] => Array
(
[0] => user@domain.com
[1] => 0
)
)
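This doesn't help with your preg question, but it will simplify your code. Since those are the only two formats, don't use regular expressions at all. Note that end() takes its argument by reference, so the exploded array goes into a variable first:
$parts = explode( '<', rtrim( $sender, '>' ) );
echo end( $parts );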
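If you also want some assurance that what comes out really is an address, the same split can be wrapped in a small function and checked with PHP's built-in filter_var(). This is only a sketch: the function name senderAddress is made up for illustration, and FILTER_VALIDATE_EMAIL applies PHP's own idea of a valid address, which may be stricter or looser than you need.

```php
<?php
// Hypothetical helper (the name is not from the original answer).
// Returns the address for either "Name <addr>" or "addr", or null if
// the extracted text does not validate as an email address.
function senderAddress(string $sender): ?string
{
    // Drop surrounding whitespace and a trailing ">", then split on "<".
    $parts = explode('<', rtrim(trim($sender), '>'));

    // end() needs a variable because it takes the array by reference.
    $candidate = trim(end($parts));

    // filter_var() returns the value on success and false on failure.
    $valid = filter_var($candidate, FILTER_VALIDATE_EMAIL);

    return $valid === false ? null : $valid;
}
```

Both senderAddress('Random Stranger <email@domain.com>') and senderAddress('email@domain.com') should give back the bare address, while a string with no address in it gives null.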
The following is copied directly from the help doc at http://us.php.net/preg_match
If matches is provided, then it is filled with the results of search. $matches[0] will contain the text that matched the full pattern, $matches[1] will have the text that matched the first captured parenthesized subpattern, and so on.
The preg_match() manual page explains how $matches works. It's an optional parameter that gets filled with the text captured by each parenthesized sub-expression in your regexp, numbered by the position of its opening parenthesis. $matches[0] is always the entire expression match, followed by the sub-expressions.
So, for example, that pattern contains two sub-expressions, ([\w_-]*@[\w-\.]*) and ([\w_-]*@[\w-\.]*). The parts matching those two expressions are put into $matches[1] and $matches[2], respectively. Because the two groups sit on opposite sides of the | alternation, only one of them takes part in any given match, and a group that did not take part is left empty. For Random Stranger <email@domain.com> the second alternative is the one that matches, so without any flags I would expect something like this in $matches:
Array(
0 => "Random Stranger <email@domain.com>",
1 => "",
2 => "email@domain.com"
)
Think of it as passing an array named $matches by reference, which gets filled with all the sub-parts that are matched.
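To see this for yourself, here is a small sketch that runs both inputs from the question through preg_match() without any flags. One assumption to flag: the second character class is written [\w.-] (hyphen last) instead of the question's [\w-\.], because newer PCRE versions may refuse to compile a hyphen placed directly after \w. The $results variable is only there to keep the two arrays around for inspection.

```php
<?php
// Same two-alternative structure as the question's pattern.
// Assumption: [\w.-] replaces [\w-\.] so the class compiles on strict PCRE builds.
$pattern = '/([\w_-]*@[\w.-]*)|.*<([\w_-]*@[\w.-]*)>/';

$inputs = [
    'Random Stranger <email@domain.com>',
    'email@domain.com',
];

$results = [];
foreach ($inputs as $sender) {
    // No flags: each entry of $matches is a plain string.
    preg_match($pattern, $sender, $matches);
    $results[$sender] = $matches;

    // var_export() prints the array in a form that is easy to read on a console.
    var_export($matches);
    echo PHP_EOL;
}
```

For the first input, group 1 stays empty and the address lands in $matches[2]. For the bare address, the first alternative matches, so the address is in $matches[1] and there is no index 2 at all.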
Edit: note that you are using the PREG_OFFSET_CAPTURE flag, which alters how $matches gets filled, so your result won't match my example. The manual explains how this flag alters the capture as well. In this case, instead of a plain string for each sub-expression, you get a multidimensional array: each entry is itself an array holding the matched text at index 0 and the offset at which it was found in the string at index 1. A sub-expression that did not take part in the match is reported with an empty string and an offset of -1, as in your output.
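Given that shape, one way to pull out "the email address and only the email address" is to walk the groups and take the first one whose offset is not -1. The following is a sketch under a couple of assumptions: extractEmail is a made-up name, and the character class is again written [\w.-] so that the pattern compiles on stricter PCRE versions.

```php
<?php
// Hypothetical helper (the name is not from the question or the manual).
// Returns the address captured by whichever group took part in the match,
// or null if the pattern did not match at all.
function extractEmail(string $sender): ?string
{
    // Assumption: hyphen moved to the end of the second character class.
    $pattern = '/([\w_-]*@[\w.-]*)|.*<([\w_-]*@[\w.-]*)>/';

    if (!preg_match($pattern, $sender, $matches, PREG_OFFSET_CAPTURE)) {
        return null;
    }

    // Skip index 0 (the whole match) and inspect each [text, offset] pair.
    foreach (array_slice($matches, 1) as [$text, $offset]) {
        if ($offset !== -1) {
            return $text; // this group participated in the match
        }
    }

    return null;
}
```

Separately, when dumping $matches into an HTML page, wrapping the dump as echo htmlspecialchars(print_r($matches, true)); is worth considering. Otherwise the browser may treat <email@domain.com> as a tag and hide it, which might be why the first dump in the question appears to show only the name.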