regex to test for correct use of commas

I'm looping through thousands of strings with various regexes to check for simple errors. I would like to add a regex to check for the correct use of commas.

If a comma exists in one of my strings, then it MUST be followed by either whitespace or exactly three digits:

  • valid: ,\s
  • valid: ,\d\d\d

But if a comma is followed by any other pattern, then it is an error:

  • invalid: ,\D
  • invalid: ,\d
  • invalid: ,\d\d
  • invalid: ,\d\d\d\d

The best regex I've come up with thus far is:

Regex CommaError = new Regex(@",(^(\d\d\d)|\S)"); // fails case #2

To test, I am using:

if (CommaError.IsMatch(", ")) // should NOT match
    Console.WriteLine("failed case #1");
if (CommaError.IsMatch(",234")) // should NOT match
    Console.WriteLine("failed case #2");
if (!CommaError.IsMatch("0,a")) // should match
    Console.WriteLine("failed case #3");
if (!CommaError.IsMatch("0,0")) // should match
    Console.WriteLine("failed case #4");
if (!CommaError.IsMatch("0,0a1")) // should match
    Console.WriteLine("failed case #5");

But the regex I gave above fails case #2 (it matches when it should not).

I've invested several hours investigating this, and searched the Web for similar regexes, but have hit a brick wall. What's wrong with my regex?

Update: Peter posted a comment with a regex that works the way I want:

Regex CommaError = new Regex(@",(?!\d\d\d|\s)");

Edit: Well, almost. It fails in this case:

if (!CommaError.IsMatch("1,2345")) // should match
    Console.WriteLine("failed case #6");


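In most regex syntaxes, ^ means "not" only as the first character inside a character class (e.g. [^a-b]). Anywhere else it is a start-of-string anchor, so the ^(\d\d\d) branch of your pattern can never match right after a comma, and the whole pattern behaves like ,\S.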
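To make that concrete, here is a small sketch that runs the pattern from the question next to the pattern it effectively reduces to. The class name CaretDemo and the field names are made up for illustration, and the sketch assumes the default RegexOptions (no Multiline), as in the question.

```csharp
using System;
using System.Text.RegularExpressions;

static class CaretDemo
{
    // The pattern from the question: here ^ is a start-of-string anchor, not "not".
    public static readonly Regex Original = new Regex(@",(^(\d\d\d)|\S)");

    // What it reduces to, because ^ cannot match immediately after a comma.
    public static readonly Regex Reduced = new Regex(@",\S");

    static void Main()
    {
        // The five test strings from the question; both patterns give the same answers.
        foreach (string s in new[] { ", ", ",234", "0,a", "0,0", "0,0a1" })
        {
            Console.WriteLine($"{s,-8} original={Original.IsMatch(s)} reduced={Reduced.IsMatch(s)}");
        }

        // Inside a character class, ^ does negate: [^\d] matches any non-digit.
        Console.WriteLine(Regex.IsMatch("a", @"[^\d]")); // True
        Console.WriteLine(Regex.IsMatch("7", @"[^\d]")); // False
    }
}
```

For ",234" both patterns report a match, which is exactly the failure in case #2.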

The simplest thing for you to do would be to invert the condition in your if statement.
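One possible reading of "invert the condition" is sketched below: describe a valid comma, then require that every comma in the string is a valid one. The names InvertedCheck, ValidComma and AllCommasValid are hypothetical. Simply negating IsMatch would not be enough when a string holds both a valid and an invalid comma, which is why this sketch compares counts. It also relies on a small (?!\d) lookahead to reject a fourth digit, so it assumes a regex engine that supports lookahead, as .NET does.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class InvertedCheck
{
    // Matches a VALID comma: followed by whitespace, or by three digits and no fourth.
    public static readonly Regex ValidComma = new Regex(@",(?:\s|\d{3}(?!\d))");

    // A string is clean when the number of valid commas equals the number of commas.
    public static bool AllCommasValid(string s) =>
        ValidComma.Matches(s).Count == s.Count(c => c == ',');

    static void Main()
    {
        foreach (string s in new[] { ", ", ",234", "0,a", "0,0", "0,0a1", "1,2345", "1,234,56" })
        {
            Console.WriteLine($"\"{s}\": all commas valid = {AllCommasValid(s)}");
        }
    }
}
```

The last input, "1,234,56", has one valid and one invalid comma, so a plain negated IsMatch on ValidComma would wrongly accept it.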

If you can't do that for whatever reason, you can use a negative lookahead in regex syntaxes that support it (.NET does), e.g.:

,(?!\d\d\d(?!\d)|\s)
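As a quick check, the lookahead pattern above can be dropped into the test harness from the question. This is only a sketch: the class name CommaCheck and the helper HasCommaError are invented here, and the expected values are the six cases listed in the question.

```csharp
using System;
using System.Text.RegularExpressions;

static class CommaCheck
{
    // Matches a comma that is NOT followed by whitespace or by exactly three digits.
    public static readonly Regex CommaError = new Regex(@",(?!\d\d\d(?!\d)|\s)");

    public static bool HasCommaError(string s) => CommaError.IsMatch(s);

    static void Main()
    {
        // (input, expected) pairs taken from the six cases in the question.
        var cases = new (string Input, bool Expected)[]
        {
            (", ", false), (",234", false), ("0,a", true),
            ("0,0", true), ("0,0a1", true), ("1,2345", true),
        };

        for (int i = 0; i < cases.Length; i++)
        {
            if (HasCommaError(cases[i].Input) != cases[i].Expected)
                Console.WriteLine($"failed case #{i + 1}");
        }
    }
}
```

The inner (?!\d) is what handles case #6: after ",234" it refuses a fourth digit, so "1,2345" is reported as an error.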

In regex syntaxes that don't support negative assertions, you can still do what you want, but the bigger the negative match, the more complicated the regex gets, e.g.:

,($|[^\s\d]|\d$|\d[^\d]|\d\d$|\d\d[^\d]|\d\d\d\d)

Essentially you have to enumerate all of the bad cases.
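The following sketch runs the enumerated pattern side by side with the lookahead version on the six cases from the question. The class and field names are hypothetical, and the enumerated pattern is written here with [^\s\d] so that any whitespace character, not only a space, counts as valid. The comparison covers just these six inputs and is not a proof that the two patterns agree everywhere.

```csharp
using System;
using System.Text.RegularExpressions;

static class EnumeratedCheck
{
    // Negative-lookahead form.
    public static readonly Regex Lookahead = new Regex(@",(?!\d\d\d(?!\d)|\s)");

    // Lookahead-free form: every bad continuation after the comma is listed explicitly.
    public static readonly Regex Enumerated =
        new Regex(@",($|[^\s\d]|\d$|\d[^\d]|\d\d$|\d\d[^\d]|\d\d\d\d)");

    static void Main()
    {
        foreach (string s in new[] { ", ", ",234", "0,a", "0,0", "0,0a1", "1,2345" })
        {
            bool a = Lookahead.IsMatch(s), b = Enumerated.IsMatch(s);
            Console.WriteLine($"\"{s}\": lookahead={a}, enumerated={b}, agree={a == b}");
        }
    }
}
```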


In which language are you trying to do this? Here is a Perl-compatible regular expression that matches such a case: ,(?!(\s|\d{3}(?!\d))) (it matches a comma that is not followed by whitespace or by exactly three digits, so if a string matches this regexp, it is not valid).
