Parsing text for email IDs
I am trying to parse text for email IDs using PHP / regex. Are there any classes or built-in methods to do this? The text contains multiple email IDs at random places.
The source of the text is .doc files, which I then copy paste into forms, to be processed on submit.
preg_match('/^[^@]+@[a-zA-Z0-9._-]+\.[a-zA-Z]+$/', $email); // from php.net
I submitted a similar question on superuser for software solutions to the problem.
It's hard to accurately detect emails embedded in running text. You will either match stuff that isn't an e-mail address erroneously, or miss some valid but strange e-mail addresses.
A good starting point is
preg_match_all('/\b[A-Z0-9._%+-]+@(?:[A-Z0-9-]+\.)+[A-Z]{2,6}\b/i', $subject, $result, PREG_PATTERN_ORDER);
foreach ($result[0] as $match) {
    // $match holds one matched address
}
(generated by RegexBuddy from its library)
It will match most "normal" addresses fine, but won't find ones like mail@1.2.3.4 or "Tim\ O'Reilly"@microsoft.com. And of course it will match nonsense like my@mail.addr.
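To put this together for text pasted into a form, here is a minimal sketch rather than a definitive solution. The function name extract_emails is hypothetical. The second pass through PHP's built-in filter_var() and the case-insensitive de-duplication are my additions, not part of the RegexBuddy pattern. Lowercasing the whole address to detect duplicates assumes you want to treat the local part as case-insensitive, which is common in practice but not guaranteed by the standard.

```php
<?php
// Hypothetical helper: pull candidate addresses out of running text,
// re-check each one, and drop case-insensitive duplicates.
function extract_emails(string $text): array
{
    // Same pattern as above; it only finds "normal" looking addresses.
    $pattern = '/\b[A-Z0-9._%+-]+@(?:[A-Z0-9-]+\.)+[A-Z]{2,6}\b/i';
    preg_match_all($pattern, $text, $matches);

    $found = [];
    foreach ($matches[0] as $candidate) {
        // filter_var() is a syntax check only; it cannot tell whether
        // the mailbox or even the domain actually exists.
        if (filter_var($candidate, FILTER_VALIDATE_EMAIL) === false) {
            continue;
        }
        // Assumption: addresses differing only in case are duplicates.
        $key = strtolower($candidate);
        if (!isset($found[$key])) {
            $found[$key] = $candidate; // keep the first spelling seen
        }
    }
    return array_values($found);
}

$text = "Contact bob@example.com or Alice.Smith@mail.example.org. Again: BOB@example.com";
print_r(extract_emails($text));
```

Neither step fixes the limits mentioned above: quoted local parts and IP-address domains are still missed, and a syntactically plausible but nonexistent address such as my@mail.addr still gets through.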