Regex: 5 digits in increasing order

2023-01-05 09:30

I need a regex for 5 digits in increasing order, like 12345, 24579, 34680, and so on.

0 comes after 9.

You can try this (as seen on rubular.com):

^(?=\d{5}$)1?2?3?4?5?6?7?8?9?0?$

Explanation

^ and $ are the beginning and end of string anchors respectively
\d{5} is the digit character class \d repeated exactly 5 times
(?=...) is a positive lookahead
? on each digit makes each optional

How it works

First, we use a lookahead to assert that, anchored at the beginning of the string, we can see \d{5} all the way to the end of the string
Now that we know we have 5 digits, we simply match the digits in the order we want, making each digit optional
- The assertion ensures that we have the correct number of digits; the fixed order of the optional digits ensures that they are increasing, with 0 allowed only last
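As a quick sanity check, the pattern above can be exercised with Python's re module. This is only a sketch: the helper name is_increasing_digits is made up for illustration, and other engines may differ in details (for instance, Python's $ also matches just before a trailing newline, so strip the input first if that matters).

```python
import re

# The pattern from the answer, unchanged.
INCREASING = re.compile(r"^(?=\d{5}$)1?2?3?4?5?6?7?8?9?0?$")


def is_increasing_digits(s):
    """Hypothetical helper: True if s is 5 digits in the order 1..9,0."""
    return INCREASING.match(s) is not None


if __name__ == "__main__":
    # Try a few accepted and rejected inputs.
    for s in ["12345", "24579", "34680", "54321", "1234", "11234", "01234"]:
        print(s, is_increasing_digits(s))
```

Inputs such as 54321 and 11234 pass the lookahead (they are five digits) but fail the body, because each digit has only one optional slot in a fixed position.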

regular-expressions.info

Anchors, Character Classes, Finite Repetition, Lookarounds, and Optional

Generalizing the technique

Let's say that we need to match strings that consist of:

between 1 and 3 vowels [aeiou]
with the vowels appearing in order

Then the pattern is (as seen on rubular.com):

^(?=[aeiou]{1,3}$)a?e?i?o?u?$

Again, the way it works is that:

Anchored at the beginning of the string, we first assert (?=[aeiou]{1,3}$)
- This checks both that the string uses only the allowed alphabet and that it has the correct length
Then we test for each letter, in order, making each optional, until the end of the string
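The same recipe can be built mechanically from an ordered alphabet and a length range. The following is a minimal sketch in Python; the function ordered_pattern and its parameters are hypothetical names introduced here, not part of any library.

```python
import re


def ordered_pattern(alphabet, min_len, max_len):
    """Hypothetical builder: lookahead for alphabet and length, then each symbol optional, in order."""
    escaped = [re.escape(ch) for ch in alphabet]
    char_class = "[" + "".join(escaped) + "]"
    # The lookahead checks the alphabet and the length in one assertion.
    lookahead = "(?=%s{%d,%d}$)" % (char_class, min_len, max_len)
    # The body enforces the order: one optional slot per symbol.
    body = "".join(e + "?" for e in escaped)
    return re.compile("^" + lookahead + body + "$")


if __name__ == "__main__":
    vowels = ordered_pattern("aeiou", 1, 3)
    print(vowels.pattern)
    for s in ["a", "aeu", "iou", "ea", "aeio", "aa", ""]:
        print(repr(s), vowels.match(s) is not None)
```

Calling ordered_pattern("1234567890", 5, 5) yields a pattern that accepts the same strings as the five-digit one, with an explicit character class in place of \d.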

Allowing repetition

If each digit can repeat, e.g. 11223 is a match, then:

instead of ? (zero-or-one) on each digit,
we use * (zero-or-more repetition)

That is, the pattern is (as seen on rubular.com):

^(?=\d{5}$)1*2*3*4*5*6*7*8*9*0*$
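A short Python check of the repeating variant, again only as an illustration (the helper name is_non_decreasing is invented for this sketch):

```python
import re

# Same lookahead as before; * lets each digit repeat within its slot.
NON_DECREASING = re.compile(r"^(?=\d{5}$)1*2*3*4*5*6*7*8*9*0*$")


def is_non_decreasing(s):
    """Hypothetical helper: True if s is 5 digits that never go down in the order 1..9,0."""
    return NON_DECREASING.match(s) is not None


if __name__ == "__main__":
    for s in ["11223", "12345", "11111", "99000", "12321", "00999", "1122"]:
        print(s, is_non_decreasing(s))
```

The lookahead still fixes the total length at five, so the unbounded * quantifiers cannot make the string longer.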

Wrong tool for the job. Just iterate through the characters one by one and check them. How you would do that depends on which language you're using.

Here is how to check using C:

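#include <stdio.h>
#include <string.h>

/* Rank of a digit in the required order 1..9,0 ("0 comes after 9");
   returns -1 if c is not a digit. */
static int rank(char c)
{
    if (c < '0' || c > '9') return -1;
    return c == '0' ? 10 : c - '0';
}

int main(void)
{
    const char *str = "12345";
    size_t i;
    /* Must be exactly 5 characters long and start with a digit. */
    int res = strlen(str) == 5 && rank(str[0]) > 0;

    /* Each character must rank strictly above its predecessor;
       a non-digit ranks -1 and therefore fails the comparison. */
    for (i = 1; res && i < 5; ++i) {
        res = rank(str[i - 1]) < rank(str[i]);
    }

    printf("%d\n", res);

    return 0;
}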

It is obviously longer than a regex solution, but a plain loop like this will generally be faster than a regex.

polygenelubricants's suggestion is a great one, but it can be improved by using a simpler lookahead constraint, given that the bulk of the RE already checks that the characters are digits. To see why, look at this log of an interactive Tcl session:

% set RE1 "^(?=\\d{5}$)1?2?3?4?5?6?7?8?9?0?$"
^(?=\d{5}$)1?2?3?4?5?6?7?8?9?0?$
% set RE2 "^(?=.{5}$)1?2?3?4?5?6?7?8?9?0?$"
^(?=.{5}$)1?2?3?4?5?6?7?8?9?0?$
% time {regexp $RE1 24579} 100000
32.80587355 microseconds per iteration
% time {regexp $RE2 24579} 100000
22.598555649999998 microseconds per iteration

As you can see, it's about 30% faster to use the version of the RE with .{5}$ as a lookahead constraint, at least in the Tcl RE engine. (Note that the above log misses some lines where I was stabilizing the compilations of the regular expressions, though I'd anticipate RE2 to be a little faster to compile anyway.) If you're using a different RE engine (e.g., PCRE or Perl) then you should recheck to get your own performance figures.
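To recheck on another engine, a comparable measurement can be sketched with Python's timeit module. The helper time_pattern is a made-up name, and no particular result is implied: the numbers depend on the engine, the machine, and the Python version, so the Tcl figures above should not be expected to carry over.

```python
import re
import timeit

# The two patterns from the Tcl session, precompiled so that
# compilation cost is not part of the measurement.
RE1 = re.compile(r"^(?=\d{5}$)1?2?3?4?5?6?7?8?9?0?$")
RE2 = re.compile(r"^(?=.{5}$)1?2?3?4?5?6?7?8?9?0?$")


def time_pattern(pattern, subject="24579", number=100000):
    """Hypothetical helper: average microseconds per match attempt."""
    seconds = timeit.timeit(lambda: pattern.match(subject), number=number)
    return seconds / number * 1e6


if __name__ == "__main__":
    for name, pattern in (("RE1", RE1), ("RE2", RE2)):
        print("%s: %.3f microseconds per iteration" % (name, time_pattern(pattern)))
```

Both patterns accept the same strings here, since the body after the lookahead can only consume digits; the lookahead is left to check the length.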

This is not something that regular expressions are generally good for. The sort of regex you're going to need to achieve this is likely to be bigger and uglier than simple procedural code to do the same thing.

By all means use a regex to ensure you have five digits in your string but then just use normal coding checks to ensure the order is correct.

You don't bang in nails with a screwdriver (well, not if you're smart), so you shouldn't be trying to use regular expressions for every job either :-)
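The split suggested above (a regex for the shape, ordinary code for the order) might look like the following in Python. It is a sketch under the question's rule that 0 comes after 9; the names rank and is_increasing are illustrative only.

```python
import re

# The regex only checks the shape: exactly five ASCII digits.
FIVE_DIGITS = re.compile(r"[0-9]{5}")


def rank(ch):
    """Position of a digit in the order 1..9,0 (0 comes after 9)."""
    return 10 if ch == "0" else int(ch)


def is_increasing(s):
    """Hypothetical helper: shape check by regex, order check by plain code."""
    if FIVE_DIGITS.fullmatch(s) is None:
        return False
    ranks = [rank(ch) for ch in s]
    # Every digit must rank strictly above the one before it.
    return all(a < b for a, b in zip(ranks, ranks[1:]))


if __name__ == "__main__":
    for s in ["12345", "34680", "12340", "01234", "54321", "11234", "1234"]:
        print(s, is_increasing(s))
```

Changing the comparison from a < b to a <= b would allow repeated digits such as 11223.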
