开发者

How does this RegEx for parsing emails work in PHP?

Okay, I have the following PHP code to extract an email address of the following two forms:

Random Stranger <email@domain.com>
email@domain.com

Here is the PHP code:

// The first example
$sender = "Random Stranger <email@domain.com>";

$pattern = '/([\w_-]*@[\w-\.]*)|.*<([\w_-]*@[\w-\.]*)>/';

preg_match($pattern,$sender,$matches,PREG_OFFSET_CAPTURE);

echo "<pre>";
print_r($matches);
echo "</pre><hr>";

// The second example
$sender = "user@domain.com";

preg_match($pattern,$sender,$matches,PREG_OFFSET_CAPTURE);

echo "<pre>";
print_r($matches);
echo "</pre>";

My question is... what is in $matches? It seems to be a strange collection of arrays. Which index holds the match from the parenthesis? How can I be sure I'm getting the email address and only the email address?

Update:

Here is the output:

Array
(
    [0] => Array
        (
            [0] => Random Stranger 
            [1] => 0
        )

    [1] => Array
        (
            [0] => 
            [1] => -1
        )

    [2] => Array
        (
            [0] => user@domain.com
            [1] => 5
        )

)
Array
(
    [0] => Array
        (
            [0] => user@domain.com开发者_高级运维
            [1] => 0
        )

    [1] => Array
        (
            [0] => user@domain.com
            [1] => 0
        )

)


This doesn't help you with your preg question but it will simplify your code. Since those are the only 2 options, dont use regular expressions

echo end( explode( '<', rtrim( $sender, '>' ) ) );


The following is copied directly from the help doc at http://us.php.net/preg_match

If matches is provided, then it is filled with the results of search. $matches[0] will contain the text that matched the full pattern, $matches[1] will have the text that matched the first captured parenthesized subpattern, and so on.


The preg_match() manual page explains how $matches works. It's an optional parameter that gets filled with the results of any bracketed sub-expression from your regexp, in the order that they matched. $matches[0] is always the entire expression match, followed by the sub-expressions.

So for example, that pattern contains two sub-expression, ([\w_-]*@[\w-\.]*) and ([\w_-]*@[\w-\.]*). The parts matching those two expressions will be put into $matches[1] and $matches[2], respectively. I would guess after a quick glance that for the email address of Random Stranger <email@domain.com>, you would have something like this in $matches:

Array( 
    0 => "Random Stranger <email@domain.com>",
    1 => "Random Stranger",
    2 => "email@domain.com"
)

Think of it as passing an array named $matches by reference, that gets filled with all the sub-parts that are matched.

Edit - note that you are using the PREG_OFFSET_CAPTURE flag, which alters the behaviour of how $matches gets filled, so your result won't match my example. The manual explains how this flag alters the capture as well. In this case, instead of a set of matched sub-expressions, you get a multidimensional array of each expression with the position it was found at in the string.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜