开发者

Regular expression where part of string must be number between 0-100

I need to validate serial numbers. For this we use regular expressions in C#, and a certain product, part of the serial number is the "seconds since midnight". There are 86400 seconds in a day, but how can I validate it as a 5-digit number in this string?:

654984051-86400-231324

I can't use this concept:

[0-8][0-6][0-4][0-0][0-0]

Because then 86399 wouldn't be valid. How can I overcome this? I want something like:

[00000-86400]

UPDATE

I want to make it clear that I'm aware of - and agree with - the "don't use regular expressions when there's a simpler way" school-of-thought. Jason's answer is exactly how I'd like to do it, however this serial number validation is for all serial numbers that pass through our system - there's currently no custom validation code for these specific ones. In this case I have a good reason for looking for a regex solution.

Of course, if there isn't one, then t开发者_StackOverflow社区hat makes the case for custom validation for these particular products undeniable, but I wanted to explore this avenue fully before going with a solution that requires code changes.


Don't use regex? If you're struggling to come up with the regex to parse this that says that maybe it's too complex and you should find something simpler. I see absolutely no benefit to using regex here when a simple

int value;
if(!Int32.TryParse(s, out value)) {
    throw new ArgumentException();
}
if(value < 0 || value > 86400) {
    throw new ArgumentOutOfRangeException();
}

will work just fine. It's just so clear and easily maintainable.


You don't want to try to use regular expressions for this, you'll end up with something incomprehensible, unwieldy, and difficult to modify (somebody will probably suggest one :). What you want to do is match the string using a regex to make sure that it contains digits in the format you want, then pull out a matching group and check the range using an arithmetic comparison. For example, in pseudocode:

match regex /(\d+)-(\d+)-(\d+)/
serial = capture group 2
if serial >= 0 and serial <= 86400 then
    // serial is valid
end if


Generate a Regular Expression to Match an Arbitrary Numeric Range http://utilitymill.com/utility/Regex_For_Range

yields the following regex expression:

\b0*([0-9]{1,4}|[1-7][0-9]{4}|8[0-5][0-9]{3}|86[0-3][0-9]{2}|86400)\b

Description of output:

First, break into equal length ranges:
  0 - 9
  10 - 99
  100 - 999
  1000 - 9999
  10000 - 86400

Second, break into ranges that yield simple regexes:
  0 - 9
  10 - 99
  100 - 999
  1000 - 9999
  10000 - 79999
  80000 - 85999
  86000 - 86399
  86400 - 86400

Turn each range into a regex:
  [0-9]
  [1-9][0-9]
  [1-9][0-9]{2}
  [1-9][0-9]{3}
  [1-7][0-9]{4}
  8[0-5][0-9]{3}
  86[0-3][0-9]{2}
  86400

Collapse adjacent powers of 10:
  [0-9]{1,4}
  [1-7][0-9]{4}
  8[0-5][0-9]{3}
  86[0-3][0-9]{2}
  86400

Combining the regexes above yields:
  0*([0-9]{1,4}|[1-7][0-9]{4}|8[0-5][0-9]{3}|86[0-3][0-9]{2}|86400)

Tested here: http://osteele.com/tools/rework/


With the standard 'this-is-not-a-particularly-regexy-problem' caveat,

[0-7]\d{4}|8[0-5]\d{3}|86[0-3]\d{2}|86400 


If you really need a pure regex solution I believe this would work although the other posters make a good point about only validating they are digits and then using a matching group to validate the actual number.

([0-7][0-9]{4}) | (8[0-5][0-9]{3}) | (86[0-3][0-9]{2}) | (86400)


I would use regex combined with some .NET code to accomplish this. A pure regex solution isn't going to be easy or efficient to handle large number ranges.

But this will:

Regex myRegex = new Regex(@"\d{9}-(\d{5})-\d{6}");
String value = myRegex.Replace(@"654984051-86400-231324", "$1");

This will grab the value 86400 in this case. And then you'd just check if the captured number is between 0 and 86400 as per Jason's answer.


I don't believe this is possible in regular expressions since this isn't something that can be checked as part of a regular language. In other words, a finite state automata machine cannot recognize this string so a regular expression cannot either.

Edit: This can be recognized by a regex, but not in an elegant way. It would require a monster or chain (e.g.: 00000|00001|00002 or 0{1,5}|0{1,4}1|0{1,4}2). To me, having to enumerate such a large set of possibilities makes it clear that while it is technically possible, it is not feasible or manageable.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜