
What is wrong with this regex?

I am using ^[\w-\.\+]+@([\w-]+\.)+[\w-]{2,4}$ to validate email addresses. When I use it from .aspx.cs it validates IDN email addresses fine, but when I use it directly from the .aspx page it doesn't work.

return Regex.IsMatch(
    email,
    @"^[\w-\.\+]+@([\w-]+\.)+[\w-]{2,4}$",
    RegexOptions.Singleline);

The address that I would like to validate looks like pelai@ÖßÜÄÖ.com

I am not good at regex. Do you know what I am doing wrong?


You may want to take a look at regexlib.com; it has a large selection of user-contributed patterns for these extremely common types of matches.

http://regexlib.com/Search.aspx?k=email


First, correctly validating an e-mail address is somewhat more complex than this regex allows for. But that aside, the regex itself is probably not at fault; more likely the problem is how you use it.

Edit (after seeing your code): have you made sure that the string being tested has no whitespace or similar in it? Put a breakpoint right there and inspect the string; that might give you an idea of what is going wrong.
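
To illustrate the whitespace point, here is a minimal sketch in JavaScript (the engine a client-side validator would use). The input string `raw` is a made-up example, and the pattern is the one from the question with the character class tidied; this only demonstrates the idea and is not a claim about what the poster's input actually contains.

```javascript
// Pattern from the question, with the dash moved to the end of the class.
const pattern = /^[\w.+-]+@([\w-]+\.)+[\w-]{2,4}$/;

// Hypothetical input with stray whitespace around the address.
const raw = "  pelai@example.com \r\n";

// JSON.stringify makes invisible characters such as \r and \n visible.
console.log(JSON.stringify(raw));

// The anchored pattern rejects the untrimmed string...
const rawMatches = pattern.test(raw);

// ...but accepts it once the surrounding whitespace is removed.
const cleaned = raw.trim();
const cleanedMatches = pattern.test(cleaned);

console.log(rawMatches, cleanedMatches);
```

Inspecting the string with something like JSON.stringify (or a debugger) is often quicker than guessing which character is breaking the match.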


You should escape the dash (-) within the first character class, and there is no need to escape the dot and plus:

[\w\-.+]

or

[\w.+-]

There is no need to escape the dash if it is the last character in the class.
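
A quick way to check that the two spellings behave the same is to run both against a few sample strings. The sketch below uses JavaScript; the names `escaped`, `trailing`, and `samples` are invented for illustration, and the samples are arbitrary.

```javascript
// Two ways to write the character class with a literal dash.
const escaped = /^[\w\-.+]+$/;  // dash escaped in the middle
const trailing = /^[\w.+-]+$/;  // dash last, no escape needed

// Hypothetical sample strings, chosen only for illustration.
const samples = ["john.doe", "a+b", "first-last", "bad char", "x@y"];

// Record whether each sample is accepted by each pattern.
const results = samples.map((s) => ({
  sample: s,
  escaped: escaped.test(s),
  trailing: trailing.test(s),
}));

// True when both classes agree on every sample.
const classesAgree = results.every((r) => r.escaped === r.trailing);
console.log(results, classesAgree);
```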


By "directly from aspx page" you probably mean in a RegularExpressionValidator?

Then you need to be aware that the regex is run by a different system: JavaScript, which has its own implementation of regex. This means that regexes that work in .NET directly might fail in JS.

The implementations are not too different, and the basics are identical. But there may be differences in details (such as how an unescaped - is handled), and JS lacks some "advanced features" (although your regex doesn't look too "advanced" ;-) ).

Do you see any error messages in the browser?


The problem is those non-ASCII characters in your test address, ÖßÜÄÖ (which you only ever mentioned in a comment to @HansKesting's answer). In .NET, \w matches all Unicode letters and digits, and even several characters besides _ that are classified as connector punctuation, but in JavaScript it only matches [A-Za-z0-9_].
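
The ASCII-only behaviour of \w in JavaScript is easy to reproduce. The sketch below uses the pattern from the question (with the character class tidied) and the address from the question; `pelai@example.com` is a made-up ASCII address added for contrast.

```javascript
// Pattern from the question, with the dash moved to the end of the class.
const asciiOnly = /^[\w.+-]+@([\w-]+\.)+[\w-]{2,4}$/;

const asciiAddress = "pelai@example.com"; // hypothetical ASCII address
const idnAddress = "pelai@ÖßÜÄÖ.com";     // address from the question

// In JavaScript, \w is [A-Za-z0-9_], so only the ASCII address passes.
const asciiOk = asciiOnly.test(asciiAddress);
const idnOk = asciiOnly.test(idnAddress);

// A single non-ASCII letter is enough to show the difference.
const wordMatchesUmlaut = /\w/.test("Ö");

console.log(asciiOk, idnOk, wordMatchesUmlaut);
```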

JavaScript regexes, as a validator runs them (without the u flag), also lack support for Unicode properties (like \p{L} for letters), and JS has no equivalent of .NET's named blocks (\p{IsLatin}) at all, so you would have to list any non-ASCII characters you want to allow by their Unicode escapes (\uXXXX). If you just want to support Latin1 letters, I suppose you could use [\w\u00C0-\u00FF] (which also lets × and ÷ through), but IDN is supposed to support more than just Latin1, isn't it?
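
As a rough sketch of the two options, the code below widens the domain-label class first with the Latin-1 range and then with Unicode property escapes. The second variant goes beyond the answer: it assumes an engine that implements ES2018 property escapes and a way to set the u flag, which a given validator may not offer. The names `latin1Pattern` and `unicodePattern` and the Cyrillic test address are made up for illustration.

```javascript
// Domain labels widened to Latin-1 via explicit Unicode escapes.
const latin1Pattern = /^[\w.+-]+@([\w\u00C0-\u00FF-]+\.)+[\w-]{2,4}$/;

// Assumption: ES2018+ engine. \p{L} and \p{N} need the u flag.
const unicodePattern = /^[\w.+-]+@([\p{L}\p{N}_-]+\.)+[\w-]{2,4}$/u;

const latinAddress = "pelai@ÖßÜÄÖ.com";     // address from the question
const cyrillicAddress = "user@пример.com"; // hypothetical non-Latin-1 domain

// The Latin-1 range covers the umlauts but not Cyrillic letters.
const latin1AcceptsLatin = latin1Pattern.test(latinAddress);
const latin1AcceptsCyrillic = latin1Pattern.test(cyrillicAddress);

// The property-based class accepts letters from any script.
const unicodeAcceptsLatin = unicodePattern.test(latinAddress);
const unicodeAcceptsCyrillic = unicodePattern.test(cyrillicAddress);

console.log(latin1AcceptsLatin, latin1AcceptsCyrillic,
            unicodeAcceptsLatin, unicodeAcceptsCyrillic);
```

Neither variant is a real IDN validator; they only show how far each character class reaches.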

By the way, you can't rely on Singleline mode on the client either. JavaScript traditionally had no such mode (an s flag only arrived with ES2018), and even where it exists you wouldn't be able to use it here. JS does support Multiline and IgnoreCase modes, but there's no way to set them on both the server and client side. The inline modifiers, (?i) and (?m), don't work in JS, and the RegexOptions argument only works server-side.

Fortunately, you don't really need Singleline mode anyway; it allows the . metacharacter to match linefeeds, but the only dots in your regex are matching literal dots.
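
To see why the mode makes no difference for escaped dots, here is a small sketch. It assumes an ES2018+ JavaScript engine, where the s flag plays the role of .NET's Singleline option; the patterns and the test string are made up for illustration.

```javascript
const text = "a\nb"; // hypothetical input containing a linefeed

// An unescaped dot: the s flag decides whether it matches the linefeed.
const plainDot = /^a.b$/.test(text);
const dotAll = /^a.b$/s.test(text);

// An escaped dot matches only a literal ".", with or without the s flag.
const literalDot = /^a\.b$/.test(text);
const literalDotAll = /^a\.b$/s.test(text);
const literalOnDot = /^a\.b$/s.test("a.b");

console.log(plainDot, dotAll, literalDot, literalDotAll, literalOnDot);
```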
