RegEx for an invoice format

I'm quite new to regular expressions and I'm trying to create a regex for the validation of an invoice format.

The pattern should be: any of the characters J, j, Y, y (all 4 are legitimate), used 0, 2 or 4 times, e.g. no Y's at all is valid, YY is valid, YYYY is valid, but YYY should fail. This is followed by a series of 0's repeating 3 to 10 times. The whole string should never exceed 10 characters.

Examples:

  • JyjY000000 is valid (albeit quite strange)
  • YY000 is valid
  • 000000 is valid
  • jjj000 is invalid
  • jjjj0 is invalid

I learned some basics from here, but my regex fails when it shouldn't. Can someone assist in improving it?

My regex so far is: [JjYy]{0}|[JjYy]{2}|[JjYy]{4}[0]{3,10}.

The following failed also: [JjYy]{0|2|4}[0]{3,10}


As you need the total length to never exceed 10 characters, I think you have to handle the three kinds of prefixes separately:

0{3,10}|[JjYy]{2}0{3,8}|[JjYy]{4}0{3,6}
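A quick way to check this pattern against the examples from the question is sketched below in Python (an assumption, since the question does not name a language). `re.fullmatch` requires the whole string to match, which plays the role of `^...$` anchors; an unanchored search would happily find `000` inside `jjj000`. The helper name `is_valid_invoice` is hypothetical.

```python
import re

# Three alternatives, one per allowed prefix length (0, 2 or 4 letters).
# The maximum zero count shrinks as the prefix grows, so the total stays <= 10.
INVOICE_RE = re.compile(r"0{3,10}|[JjYy]{2}0{3,8}|[JjYy]{4}0{3,6}")


def is_valid_invoice(text: str) -> bool:
    """Return True only if the entire string matches the invoice pattern."""
    return INVOICE_RE.fullmatch(text) is not None


if __name__ == "__main__":
    # Examples taken from the question.
    for sample in ["JyjY000000", "YY000", "000000", "jjj000", "jjjj0"]:
        print(sample, "is", "valid" if is_valid_invoice(sample) else "invalid")
```

In engines without a full-match function, wrapping the alternation in a group and anchoring it, as in `^(?:...)$`, should give the same effect.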


How about:

^([JjYy]{2}){0,2}0{3,10}$

To check the length is ten characters or less, use a string length function rather than a regular expression - don't hammer nails with a screwdriver, and so forth.

Test:

#!perl
use warnings;
use strict;

my $re = qr/^([JjYy]{2}){0,2}0{3,10}$/;

my %tests = qw/JyjY000000 valid
           YY000 valid
           000000 valid
           jjj000 invalid
           jjjj0 invalid/;

for my $k (keys %tests) {
    print "$k is ";
    if ($k =~ /$re/) {
        print "valid";
    } else {
        print "invalid";
    }
    print " and it should be $tests{$k}.\n";
}

Produces

jjjj0 is invalid and it should be invalid.
YY000 is valid and it should be valid.
JyjY000000 is valid and it should be valid.
jjj000 is invalid and it should be invalid.
000000 is valid and it should be valid.
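The test above does not exercise the length limit, since none of the samples is longer than ten characters. A minimal sketch of combining the same pattern with a string length check, here in Python rather than Perl (an assumption made for illustration; `check_invoice` and `MAX_LENGTH` are hypothetical names):

```python
import re

# Pattern from this answer: zero, one or two letter pairs, then 3 to 10 zeros.
SHAPE_RE = re.compile(r"^([JjYy]{2}){0,2}0{3,10}$")
MAX_LENGTH = 10  # total length limit stated in the question


def check_invoice(text: str) -> bool:
    """Check the length with len() and the shape with the regex."""
    # fullmatch also rejects a trailing newline, which `$` alone tolerates in Python.
    return len(text) <= MAX_LENGTH and SHAPE_RE.fullmatch(text) is not None


if __name__ == "__main__":
    # The second sample has the right shape but is 11 characters long.
    for sample in ["YYYY000000", "YYYY0000000", "jjj000"]:
        print(sample, "is", "valid" if check_invoice(sample) else "invalid")
```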


([jJyY]{2}){0,2}0{3,10}

If the total length limit includes the jJyY part, you can enforce it with a negative lookahead, (?![jJyY0]{11,}), which rejects any run of 11 or more of the allowed characters before the main pattern is tried:

\b(?![jJyY0]{11,})([jJyY]{2}){0,2}0{3,10}\b
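Because this version uses `\b` rather than `^` and `$`, it is suited to picking invoice numbers out of surrounding text. A small Python sketch of that use (the language and the helper name `find_invoices` are assumptions, not part of the answer):

```python
import re

# Word boundaries delimit the candidate; the negative lookahead refuses any
# candidate that starts a run of 11 or more allowed characters.
LOOKAHEAD_RE = re.compile(r"\b(?![jJyY0]{11,})([jJyY]{2}){0,2}0{3,10}\b")


def find_invoices(text: str) -> list[str]:
    """Return every substring of text that matches the invoice pattern."""
    return [m.group(0) for m in LOOKAHEAD_RE.finditer(text)]


if __name__ == "__main__":
    # YY000 is found; jjj000 has an odd prefix; YYYY0000000 is 11 characters.
    print(find_invoices("ref YY000 and jjj000 and YYYY0000000"))
```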


It may depend on what you are using to implement the regular expression. For example, I found out the other day that Notepad++ only supports a few basic operators. Alternation with the pipe, for instance, is not part of POSIX basic regular expressions, so not every tool supports it.

I'd suggest something like this:

([JjYy]{2}([JjYy]{2})?)?[0]{3,10}

If you're using a programming language, you'll need to use a string length function to validate the length.

EDIT: actually, you should be able to validate the length by separating the different situations:

([0]{3,10})|([JjYy]{2}[0]{3,8})|([JjYy]{4}[0]{3,6})


You want to limit the string to 10 characters. So in order to do this you have to consider what valid combinations will make up 10 characters.

The longest and shortest valid strings for each prefix length (where c stands for any of J, j, Y, y) would therefore be:

  • 0000000000
  • 000
  • cc00000000
  • cc000
  • cccc000000
  • cccc000

So, an expression to include all of these would be: /0{3,10}|[JY]{2}0{3,8}|[JY]{4}0{3,6}/i

A case insensitive match would suffice, although you do get additional performance from some regular expression engines by explicitly saying /[JjYy]/ instead of /[JY]/i.
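As a rough check of the case-insensitive form, the sketch below compiles it with Python's `re.IGNORECASE` flag (the Python equivalent of `/i`, chosen here as an assumption) and runs it over concrete versions of the boundary strings listed above, plus strings that are one zero too long or too short. The names `CI_RE`, `matches_whole`, `BOUNDARY_CASES` and `OUT_OF_RANGE` are hypothetical.

```python
import re

# Same three alternatives, with the letter classes reduced to [JY] and the
# case handled by the IGNORECASE flag instead of listing both cases.
CI_RE = re.compile(r"0{3,10}|[JY]{2}0{3,8}|[JY]{4}0{3,6}", re.IGNORECASE)

# Longest and shortest valid strings for each prefix length.
BOUNDARY_CASES = ["0000000000", "000", "jY00000000", "Jy000", "yyJJ000000", "JYjy000"]
# The same strings with one zero added or removed, which should all fail.
OUT_OF_RANGE = ["00000000000", "00", "jY000000000", "Jy00", "yyJJ0000000", "JYjy00"]


def matches_whole(text: str) -> bool:
    """Return True only if the entire string matches the pattern."""
    return CI_RE.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in BOUNDARY_CASES + OUT_OF_RANGE:
        print(sample, "is", "valid" if matches_whole(sample) else "invalid")
```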
