Regular Expression Crashes in Ruby and JavaScript
I have an odd problem occurring with our regex for email addresses. Here is the expression:
^(\w)+(([(\.?)\w\-+])*[\w]+)*@((\[([\d]{1,3}\.){3}[\d]{1,3}\])|((\w)+((\.?)[\w\-]+)*\.[a-z]{2,6}))$
Anything we've thrown at it that matches is fine. The problem is with failures: long strings cause the expression to hang, and on our web server it spikes the CPU. When people mistakenly enter a long, invalid email address, it crashes the server. Some examples follow.
This input fails to match and returns quickly:
rubular failure 1 short@failure
This input fails to match and causes the hang; you can see Rubular has issues with it as well:
rubular failure 2 thisisamuchlonger@expressionleadingtofailure
The interesting thing is if you make it proper:
rubular pass thisisamuchlonger@expressionleadingtofailure.com
This passes easily.
Edit: I've also run this in a client-side JavaScript tester, and it fails and succeeds in the same ways. Something about this regex causes parsers to eat memory and fail; I'm just not sure which part it is.
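Your regular expression hits the worst case for backtracking regex engines several times over: it nests repeated and optional groups inside other repeated groups. When the input fails to match, the engine gets stuck backtracking through every way those repetitions could divide up the string. Take out the *s and ?s and your regular expression will perform admirably.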
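To get a feel for why a failed match is so expensive, here is a rough sketch in Ruby. It is a simplified model, not how any real engine is implemented: it only counts the ways a run of n word characters can be cut into consecutive non-empty chunks, which is roughly what a nested group such as (\w+)* lets a backtracking engine try before giving up. The method name count_splits is made up for this illustration.

```ruby
# Simplified model (an assumption, not the engine's real algorithm):
# count the ways to cut a run of n characters into consecutive
# non-empty chunks, one chunk per iteration of a group like (\w+)*.
def count_splits(n, memo = {})
  return 1 if n.zero? # nothing left to split: exactly one way
  # Pick the length of the first chunk, then split whatever remains.
  memo[n] ||= (1..n).sum { |first| count_splits(n - first, memo) }
end

[5, 10, 17].each do |n|
  puts "#{n} characters: #{count_splits(n)} ways"
end
```

The count doubles with every extra character (2^(n-1) ways), so the 17 characters before the @ in thisisamuchlonger already allow 65536 splits in this model, and the domain part of the original expression has the same kind of nesting.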
See http://swtch.com/~rsc/regexp/regexp1.html for a thorough explanation of why you can't do what you are trying to do in a performant manner.
My personal opinion is that you should just check for /@/ and send a confirmation e-mail, but you can probably find a regex elsewhere on the web that will perform adequately while matching most e-mail addresses.
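A minimal sketch of that approach in Ruby might look like the following. The method name plausible_email? and the length cap are assumptions added for illustration (254 is a commonly cited upper bound for an address, not something from the question); sending the confirmation e-mail is left out.

```ruby
# Assumed cap on input length; rejecting oversized input up front also
# keeps very long strings away from any later, more expensive checks.
MAX_EMAIL_LENGTH = 254

# Hypothetical helper: true when the input is short enough and
# contains an "@". Real validation is the confirmation e-mail.
def plausible_email?(input)
  input.length <= MAX_EMAIL_LENGTH && input.include?("@")
end

puts plausible_email?("short@failure")
puts plausible_email?("no-at-sign")
```

Both checks are simple linear scans, so there is nothing here for a regex engine to backtrack over.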
Try this one, for example; Rubular handles it well:
^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}$
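As a quick check, this pattern can be run against the three inputs from the question. The sketch below makes one change that is my own assumption about what is wanted: it swaps ^ and $ for \A and \z, because in Ruby ^ and $ match at line boundaries rather than at the ends of the whole string. The constant name SIMPLE_EMAIL is made up for the example.

```ruby
# The simpler pattern, anchored to the whole string with \A and \z
# (in Ruby, ^ and $ would also match around embedded newlines).
SIMPLE_EMAIL = /\A[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}\z/

samples = [
  "short@failure",
  "thisisamuchlonger@expressionleadingtofailure",
  "thisisamuchlonger@expressionleadingtofailure.com"
]

samples.each do |address|
  puts "#{address}: #{SIMPLE_EMAIL.match?(address)}"
end
```

Only the last address matches. The pattern has no repeated group nested inside another repetition, so the failing inputs are rejected without the blow-up seen with the original expression.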
By the way, the first page of Google results leads to more examples: http://www.regular-expressions.info/email.html