开发者

Email Regex that DOES include unicode domains

I was wondering if anybody has found a solution that validates an email that includes unicode characters as in from a unicode domain? I have searched at length and have yet to find a solution that works.开发者_如何转开发


Fully validating an email address through a regex is hard. Really hard. This is one that is fully compliant with RFC822. Even if you create a perfect regex that correct validates all email addresses, that doesn't stop me from entering hi@hi.com (If you're trying to make sure that I enter a valid email address) or from accidentally misspelling my username (If you're trying to make sure that I enter my email address correctly).

Just send a link in an email saying, "click here to validate your email address."


I had the same issue and came up with an intelligent solution \p{L}. Please check it out:

private static bool IsEmailValid(string email) {
    System.Text.RegularExpressions.Regex re = new Regex(@"^[\p{L}0-9!$'*+\-_]+(\.[\p{L}0-9!$'*+\-_]+)*@[\p{L}0-9]+(\.[\p{L}0-9]+)*(\.[\p{L}]{2,})$", RegexOptions.CultureInvariant | RegexOptions.IgnoreCase);

    return re.IsMatch(email);
}


Ok, so the only email validation I ever found that was truly awesome (instead of just OK) is part of the Zend Framework. Of course that means PHP, hopefully though, you can look at how they do it and emulate some of their better ideas: http://pastebin.com/SvZPBp31 Or just look up Zend_Validate_EmailAddress sourcecode.

sorry that this isn't in C# syntax / language.


Like has been pointed out, validating e-mail addresses through a regular expression is a hard problem. You can get close with a fairly simple one, but there are many, many cases that it will fail to catch. I'm all for sending an email to a supposed email address as @Nick ODell suggests (after doing some basic sanity checking, like, does it contain an @ sign, does the domain name portion exist and have one or more of MX/A/AAAA RRs, and the likes) and including a verification link.

That said, if by Unicode domain you mean a Punycode-encoded host name label, those should be covered by any half-way competent validation regexp, as in encoded form those are just xn-- followed by the regular set [a-z0-9-] (case insensitive comparison).

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜