Regular Expression for delimited email address
i am looking for a regular express to validate a list o开发者_运维技巧f email addresses like the following
test1@abd.com;test.test@abc.com;test3@test.com
and
test1@abd.com;test.test@abc.com;test3@test.com;
optional ";" at the end of the list.
thanks,
Update
Don't. Split on the delimiter and validate each one separately. I would also recommend setting a match timeout and/or non-backtracking flag to prevent excessive CPU usage when matching.
Original Answer
From: http://regexlib.com/RETester.aspx?regexp_id=1007
^(([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5}){1,25})+([;.](([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5}){1,25})+)*$
“You call that a knife???” —Crocodile Dundee
Actually, all those are very poor patterns for matching a mail address. To strictly validate an RFC 5322 mail address with zero false negatives and zero false positives, you need this precise pattern:
(?x)
(?(DEFINE)
(?<address> (?&mailbox) | (?&group))
(?<mailbox> (?&name_addr) | (?&addr_spec))
(?<name_addr> (?&display_name)? (?&angle_addr))
(?<angle_addr> (?&CFWS)? < (?&addr_spec) > (?&CFWS)?)
(?<group> (?&display_name) : (?:(?&mailbox_list) | (?&CFWS))? ; (?&CFWS)?)
(?<display_name> (?&phrase))
(?<mailbox_list> (?&mailbox) (?: , (?&mailbox))*)
(?<addr_spec> (?&local_part) \@ (?&domain))
(?<local_part> (?&dot_atom) | (?"ed_string))
(?<domain> (?&dot_atom) | (?&domain_literal))
(?<domain_literal> (?&CFWS)? \[ (?: (?&FWS)? (?&dcontent))* (?&FWS)?
\] (?&CFWS)?)
(?<dcontent> (?&dtext) | (?"ed_pair))
(?<dtext> (?&NO_WS_CTL) | [\x21-\x5a\x5e-\x7e])
(?<atext> (?&ALPHA) | (?&DIGIT) | [!#\$%&'*+-/=?^_`{|}~])
(?<atom> (?&CFWS)? (?&atext)+ (?&CFWS)?)
(?<dot_atom> (?&CFWS)? (?&dot_atom_text) (?&CFWS)?)
(?<dot_atom_text> (?&atext)+ (?: \. (?&atext)+)*)
(?<text> [\x01-\x09\x0b\x0c\x0e-\x7f])
(?<quoted_pair> \\ (?&text))
(?<qtext> (?&NO_WS_CTL) | [\x21\x23-\x5b\x5d-\x7e])
(?<qcontent> (?&qtext) | (?"ed_pair))
(?<quoted_string> (?&CFWS)? (?&DQUOTE) (?:(?&FWS)? (?&qcontent))*
(?&FWS)? (?&DQUOTE) (?&CFWS)?)
(?<word> (?&atom) | (?"ed_string))
(?<phrase> (?&word)+)
# Folding white space
(?<FWS> (?: (?&WSP)* (?&CRLF))? (?&WSP)+)
(?<ctext> (?&NO_WS_CTL) | [\x21-\x27\x2a-\x5b\x5d-\x7e])
(?<ccontent> (?&ctext) | (?"ed_pair) | (?&comment))
(?<comment> \( (?: (?&FWS)? (?&ccontent))* (?&FWS)? \) )
(?<CFWS> (?: (?&FWS)? (?&comment))*
(?: (?:(?&FWS)? (?&comment)) | (?&FWS)))
# No whitespace control
(?<NO_WS_CTL> [\x01-\x08\x0b\x0c\x0e-\x1f\x7f])
(?<ALPHA> [A-Za-z])
(?<DIGIT> [0-9])
(?<CRLF> \x0d \x0a)
(?<DQUOTE> ")
(?<WSP> [\x20\x09])
)
(?&address) # finally, match a mail address
I discuss this in more detail in this answer.
That needs a Perl Compatible Regular Expression (PCRE) library, or Perl itself, to work properly. I cannot guarantee that perl incompatible pattern engines will correctly handle the recursion.
@Evan's answer is close, but the expression matches these scenarios which are invalid:
a@test.comb@test.comc@test.com
(no delimiters)a@test.com;b@test.com.c@test.com
(period.
accepted as delimiter)a@test.com;b@test.comc@test.com
(only 1st delimiter matched - extension of problem #1)
To fix these (and simplify a little bit), I made these changes:
- Removed the 3rd
+
sign, and the outer parens()
surrounding the first email address. - Change
[;.]
to a plain;
delimiter - I actually changed that part to;[ ]{0,1}
because I wanted the expression to match spaces after the semi-colon delimiter. - Similar to #1, remove the final
+
sign in the expression, and the outer parens()
between the delimiter (step #2) and the final closing paren with the star after it -)*
Here is the final expression (allowing an optional space after the ;
delimiter:
/^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5}){1,25}(;[ ]{0,1}([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5}){1,25})*$/
I use this:
^(([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5}){1,25}(($)|( *;+ *$)|( *;+ *(?=[a-zA-Z0-9_\-\.]))))*$
https://regexr.com/3gth7
I needed to allow for the presence of white space around the delimiter, so I used this modification of Evan Mulawski's answer:
^(([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5}){1,25})+(\s*[;.]\s*(([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5}){1,25})+)*$
Test it at: http://regexlib.com/RETester.aspx?regexp_id=13126
(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])
精彩评论