More string matching features

Is it possible to create a regex that matches all strings with five a's and five b's?

Like aaaaabbbbb or ababababab or aabbaabbab.

I imagine it would require polynomial time for a deterministic engine.

Are there other matching languages which would enable such matching?

Update:

I wanted to use this kind of expression for searching, so I changed the one proposed to (?=b*ab*){5}(?=a*ba*){5}([ab]{10}) and it works nicely! :) I'm still uncertain with respect to the performance of an expression like that. But I guess I can just look up lookahead expressions.

I'm still curious which other kinds of patterns are out there that are simple to explain but hard to express as a regex.


I have all these screws. To hammer them into this piece of wood, should I use a claw or ball-peen hammer?

That's (roughly) what your question is asking. What you should do is just loop through each character of the string. I can do it in C. Watch:

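/* Returns 1 if s contains exactly five 'a's and five 'b's, else 0.
   Any other characters in s are ignored. */
int validate(const char *s)
{
    int a = 0, b = 0;   /* running counts of 'a' and 'b' */
    while(*s)
      {
        switch(*s++)
        {
        case 'a':
            a++;
            break;
        case 'b':
            b++;
            break;
        }
      }
    return a == 5 && b == 5;
}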

It is left as an exercise for you to a) convert this to your language of choice, and b) modify it to match only consecutive sequences of 'a's and 'b's (if you like) or tweak it to your other specific requirements.
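For what it's worth, here is one way part b) might look if the goal is searching inside a longer string, as in the question's update. This is only a sketch under my own assumptions: the name find_window is made up for illustration, and I'm assuming "consecutive" means a run of exactly ten characters, all 'a' or 'b', with five of each. It slides a fixed-size window over the string and keeps the two counts up to date, so the work is linear in the length of the input.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the original answer): returns the index of
   the first 10-character window holding exactly five 'a's and five 'b's,
   or -1 if no such window exists. */
long find_window(const char *s)
{
    enum { WINDOW = 10, NEED = 5 };
    size_t n = strlen(s);
    int a = 0, b = 0;

    for (size_t i = 0; i < n; i++) {
        /* Count the character entering the window on the right. */
        if (s[i] == 'a') a++;
        else if (s[i] == 'b') b++;

        /* Uncount the character leaving the window on the left. */
        if (i >= WINDOW) {
            char out = s[i - WINDOW];
            if (out == 'a') a--;
            else if (out == 'b') b--;
        }

        /* a + b == WINDOW means no other characters are in the window. */
        if (i + 1 >= WINDOW && a == NEED && b == NEED)
            return (long)(i + 1 - WINDOW);
    }
    return -1;
}
```

Calling find_window("xxababababab") should report a match starting at index 2, while a string such as "aaaaaxbbbbb" has no qualifying window because the 'x' always falls inside it.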

The basic point is that there are much better tools for this job than regex, so unless "a" and "b" are stand-ins for more complicated regular expressions, don't use regexes for this. And even if "a" and "b" are really more complicated regexes, you don't have to solve all problems with One Regex To Rule Them All. You can mix a few useful regexes and a loop of code (like the above) to much greater effect than an enormous (and unmaintainable) Regex-zilla.


You can use lookahead assertions:

^(?=(?:[^a]*a){5}[^a]*$)(?=(?:[^b]*b){5}[^b]*$)