Modify regular expression for a list of numbers and numeric range expressions
I am using ExtJS. One of the text fields built with an ExtJS component should accept comma-separated number/operator strings like the following (3 similar examples):
1, 2-3, 4..5, <6, <=7, >8, >=9
>2, 3..5, >=9,>10
<=9, 1, <=8, 4..5, 8-9
Here I am using equals, range (-), sequence (..), and less than/greater than (optionally with "or equal to") operators for numbers less than or equal to 100. The elements are separated by commas.
What would be a regular expression for this type of string?
For my previously asked question, I got a solution from "dlamblin":
^(?:\d+(?:(?:\.\.|-)\d+)?|[<>]=?\d+)(?:,\s*\d+(?:(?:\.\.|-)\d+)?|[<>]=?\d+)*$
This works perfectly for all patterns, except for the following:
It works only if the relational operators (<, <=, >, >=) appear in the first element of the string. E.g. <=3, 4-5, 6, 7..8 works, but <=3, 4-5, 6, 7..8, >=5 does not, because a relational operator appears in an element other than the first.
Also, the string <3<4, 5, 9-4 does not give any error, i.e. it satisfies the pattern even though a comma is needed between <3 and <4.
Numbers in the string should be less than or equal to 100, e.g. <100, 0-100, 99..100.
It should not allow leading zeros (like 003, 099).
Scrap that and use a tokenizer instead. Split up the string by commas, then look at each token and decide (possibly using a regular expression) which type of relationship it is. If it's none of the existing relationships, it's invalid. If any relationship contains a number that's too big, it's invalid.
For the sake of your sanity and the people who will have to maintain this code after you're done with it, don't use regular expressions to validate such a complicated interrelated set of rules. Break it down into simpler chunks.
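The tokenizer idea above could be sketched roughly as follows in JavaScript. This is only an illustration under assumptions: the names `MAX`, `NUM`, `SINGLE`, `RANGE`, `COMPARE` and `isValidList` are made up for this example, and the upper bound of 100 is taken from the question.

```javascript
// Largest number allowed by the question (assumed limit).
const MAX = 100;

// One number without leading zeros; the upper bound is checked separately.
const NUM = "(0|[1-9]\\d*)";
const SINGLE = new RegExp(`^${NUM}$`);                    // e.g. 7
const RANGE = new RegExp(`^${NUM}(?:-|\\.\\.)${NUM}$`);   // e.g. 2-3 or 4..5
const COMPARE = new RegExp(`^(?:<=|>=|<|>)${NUM}$`);      // e.g. <6 or >=9

// Returns true when every comma-separated token is one of the known term types
// and every number in it is within bounds.
function isValidList(input) {
  return input.split(",").every((raw) => {
    const token = raw.trim();
    const m = SINGLE.exec(token) || RANGE.exec(token) || COMPARE.exec(token);
    if (!m) return false; // not one of the known relationships
    return m.slice(1).every((n) => Number(n) <= MAX);
  });
}

console.log(isValidList("<=3, 4-5, 6, 7..8, >=5"));
console.log(isValidList("<3<4, 5, 9-4"));
```

Because each token is handled on its own, extra rules (for instance rejecting a descending range such as 9-4) can be added as ordinary code. A function like this could then be hooked into the field's validation, for example through a text field's validator config if your ExtJS version provides one.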
Welbog's advice to use a tokenizer is the sane option.
If you have some other constraint that forces a regular expression, you can use
^(<|<=|>|>=)?\s*(100|0|[1-9]\d?)((\.\.|-)(100|0|[1-9]\d?))?(,\s*(<|<=|>|>=)?\s*(100|0|[1-9]\d?)((\.\.|-)(100|0|[1-9]\d?))?)*$
That's the result of manually expanding the following:
num = (100|0|[1-9]\d?)
op = (<|<=|>|>=)
range = op?\s*num((\.\.|-)num)?
expr = ^range(,\s*range)*$
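Instead of expanding the pieces by hand, they can be assembled in code. The following is a minimal JavaScript sketch that builds the same pattern from the four definitions above; the variable names simply mirror those definitions, and the use of template strings is just one way to do it.

```javascript
// Building blocks, written as strings so they can be composed.
const num = "(100|0|[1-9]\\d?)";
const op = "(<|<=|>|>=)";
const range = `${op}?\\s*${num}((\\.\\.|-)${num})?`;

// Full expression: one range term, then any number of ", term" repetitions.
const expr = new RegExp(`^${range}(,\\s*${range})*$`);

console.log(expr.source);
console.log(expr.test("<=3, 4-5, 6, 7..8, >=5"));
console.log(expr.test("<3<4, 5, 9-4"));
```

Keeping the parts separate makes it easier to change one rule (say, the maximum value) without re-editing the long one-line pattern.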
I agree with Welbog that pre/post-processing would be a better choice.
But since I like regex, here is my solution:
^[ \t]*(?:(?:0|[1-9][0-9]?|100)(?:(?:\-|\.\.)(?:0|[1-9][0-9]?|100))?|(?:[<>]=?)(?:0|[1-9][0-9]?|100))(?:[ \t]*,[ \t]*(?:(?:0|[1-9][0-9]?|100)(?:(?:\-|\.\.)(?:0|[1-9][0-9]?|100))?|(?:[<>]=?)(?:0|[1-9][0-9]?|100)))*[ \t]*$
'\s' is not used, as it may include '\n' in some engines.
'\d' is not used: since you need [1-9] anyway, [0-9] is easier to use alongside it.
'(?:0|[1-9][0-9]?|100)' will match a number from 0 to 100 without leading zeros.
'(?:[<>]=?)(?:0|[1-9][0-9]?|100)' will match a condition followed by a number (if you want to match '=' too, just adjust it).
'(?:0|[1-9][0-9]?|100)(?:(?:\-|\.\.)(?:0|[1-9][0-9]?|100))?' will match a number with an optional range or sequence.
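As a quick sanity check of the number sub-pattern described above, it can be anchored on its own and tried against every candidate value. This is just an illustrative JavaScript snippet; the names `number` and `accepted` are made up for the example.

```javascript
// The 0..100 sub-pattern, anchored so it must match the whole string.
const number = /^(?:0|[1-9][0-9]?|100)$/;

// Collect every integer from 0 to 120 that the sub-pattern accepts.
const accepted = [];
for (let n = 0; n <= 120; n++) {
  if (number.test(String(n))) accepted.push(n);
}

console.log(accepted.length, accepted[0], accepted[accepted.length - 1]);
console.log(number.test("003"), number.test("099"));
```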
Full explanation:
^
[ \t]* // Prefix spaces
(?: // A valid term
    // A number
    (?:0|[1-9][0-9]?|100)
    // Optional range or sequence
    (?:
        (?:\-|\.\.)
        (?:0|[1-9][0-9]?|100)
    )?
    |
    // Condition and number
    (?:[<>]=?)(?:0|[1-9][0-9]?|100)
)
(?: // Other terms
    [ \t]*,[ \t]* // Comma with prefix and suffix spaces
    (?: // A valid term
        // A number
        (?:0|[1-9][0-9]?|100)
        // Optional range or sequence
        (?:
            (?:\-|\.\.)
            (?:0|[1-9][0-9]?|100)
        )?
        |
        // Condition and number
        (?:[<>]=?)(?:0|[1-9][0-9]?|100)
    )
)*
[ \t]* // Tail spaces
$
I tested it with the regex search in Eclipse and it works.
Hope this helps.
This should work:
^(?:(?:\s*((?:\<|\>|\<\=|\>\=)?(?:[1-9]|[1-9]\d|100))\s*(?:,|$))|(?:\s*((?:[1-9]|[1-9]\d|100)(?:\.\.|\-)(?:[1-9]|[1-9]\d|100))\s*(?:,|$)))*$
(You'll need to use the "multiline" option, obviously.)
If you have the advantage of a regex engine that supports the "ignore whitespace" option, then you could break it up like this:
^ # beginning of line
(?:
    (?:
        \s* # any whitespace
        ( # capture group
            (?:<|>|<=|>=)? # inequality
            (?:[1-9]|[1-9]\d|100) # single value
        )
        \s* # any whitespace
        (?:,|$) # comma or end of line
    )
    |
    (?:
        \s* # any whitespace
        ( # capture group
            (?:[1-9]|[1-9]\d|100) # single value
            (?:\.\.|\-) # range modifier
            (?:[1-9]|[1-9]\d|100) # single value
        )
        \s* # any whitespace
        (?:,|$) # comma or end of line
    )
)+ # one or more of all this
$ # end of line
It matches your examples when tested in Expresso.
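JavaScript's built-in RegExp, as far as I know, has no "ignore whitespace" option, so in an ExtJS project the commented layout would have to be collapsed before use. Below is a rough sketch of one way to do that; `fromVerbose` is a hypothetical helper written for this example, and it assumes the pattern never needs a literal '#' or a literal space. The pattern shown is a shortened single-value fragment, not the full expression.

```javascript
// Hypothetical helper: emulate an "ignore whitespace" mode by stripping
// "# comment" tails and all literal whitespace from a pattern string.
function fromVerbose(source, flags) {
  const compact = source
    .split("\n")
    .map((line) => line.replace(/#.*$/, "")) // drop the comment on each line
    .join("")
    .replace(/\s+/g, "");                    // drop layout whitespace
  return new RegExp(compact, flags);
}

// Shortened fragment: one optional inequality followed by a value from 1 to 100.
const term = fromVerbose(String.raw`
  ^                   # beginning of string
  \s*                 # any whitespace
  (?:<=|>=|<|>)?      # optional inequality
  (?:[1-9]\d?|100)    # single value
  \s*                 # any whitespace
  $                   # end of string
`);

console.log(term.source);
console.log(term.test(" <=7 "));
```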