More string matching features
Is it possible to create a regex that matches all strings with five a's and five b's?
Like aaaaabbbbb or ababababab or aabbaabbab.
I imagine it would require polynomial time for a deterministic engine.
Are there other matching languages which would enable such matching?
Update:
I wanted to use this kind of expression for searching, so I changed the one proposed to (?=b*ab*){5}(?=a*ba*){5}([ab]{10})
and it works nicely! :)
I'm still uncertain with respect to the performance of an expression like that. But I guess I can just look up lookahead expressions.
I'm still curious about which other kinds of patterns are simple to explain but hard to regex.
I have all these screws. To hammer them into this piece of wood, should I use a claw or ball-peen hammer?
That's (roughly) what your question is asking. What you should do is just loop through each character of the string. I can do it in C. Watch:
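int validate(const char *s)
{
    int a = 0, b = 0;   /* running counts of 'a' and 'b' */

    while (*s)
    {
        switch (*s++)
        {
        case 'a':
            a++;
            break;
        case 'b':
            b++;
            break;
        }
    }
    /* valid only with exactly five of each; other characters are ignored */
    return a == 5 && b == 5;
}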
It is left as an exercise for you to a) convert this to your language of choice, and b) modify it to match only consecutive sequences of 'a's and 'b's (if you like) or tweak it to your other specific requirements.
The basic point is that there are much better tools for this job than regex, so unless "a" and "b" are stand-ins for more complicated regular expressions, don't use regexes for this. And even if "a" and "b" are really more complicated regexes, you don't have to solve all problems with One Regex To Rule Them All. You can mix a few useful regexes and a loop of code (like the above) to much greater effect than an enormous (and unmaintainable) Regex-zilla.
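For the searching case mentioned in the question's update, one possible way to adapt the counting loop is a sliding window. The following is only a sketch under assumptions: the name find_balanced_window is hypothetical, the window length of 10 and the count of 5 come from the question, and "match" is taken to mean a run of ten consecutive characters made up of exactly five 'a's and five 'b's.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the original answer: returns the index of
   the first 10-character window that contains exactly five 'a's and five
   'b's (and therefore nothing else), or -1 if no such window exists. */
long find_balanced_window(const char *s)
{
    enum { WINDOW = 10, NEED = 5 };
    size_t n = strlen(s);
    int a = 0, b = 0;   /* counts of 'a' and 'b' inside the current window */

    for (size_t i = 0; i < n; i++) {
        /* count the character entering the window on the right */
        if (s[i] == 'a') a++;
        else if (s[i] == 'b') b++;

        /* forget the character leaving the window on the left */
        if (i >= WINDOW) {
            char out = s[i - WINDOW];
            if (out == 'a') a--;
            else if (out == 'b') b--;
        }

        /* five of each in a ten-character window leaves no room for others */
        if (i + 1 >= WINDOW && a == NEED && b == NEED)
            return (long)(i + 1 - WINDOW);
    }
    return -1;
}
```

Each character is counted once on entry and once on exit, so the scan is linear in the length of the input. Calling find_balanced_window("xxaaaaabbbbb") would return 2, the offset where the balanced run starts.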
You can use lookahead assertions:
^(?=(?:[^a]*a){5}[^a]*$)(?=(?:[^b]*b){5}[^b]*$)