Regex non-consecutive chars
Currently I have:
[A-Za-z0-9._%+-]
This matches letters, numbers, and certain special chars (._%+-).
How can I change this so that it won't match a string that contains the special chars consecutively?
For example, I want it to match:
foo.bar+test
and foo.+bar
and +foo.
But not:
foo..bar+test
or foo.bar++test
or foo.bar++
If your tool/language supports lookaheads, try:
^(?!.*([._%+-])\1)[A-Za-z0-9._%+-]+$
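As a quick check, the lookahead pattern above can be run against the examples from the question. This is a minimal sketch in Python (an assumption, since the question names no language); PATTERN and is_valid are names made up for illustration.

```python
import re

# Pattern from the answer above: the negative lookahead rejects any string in
# which one of the special characters is immediately followed by itself.
PATTERN = re.compile(r'^(?!.*([._%+-])\1)[A-Za-z0-9._%+-]+$')


def is_valid(text: str) -> bool:
    """True if text uses only allowed chars and never doubles a special char."""
    return PATTERN.match(text) is not None


# Examples taken from the question.
should_match = ["foo.bar+test", "foo.+bar", "+foo."]
should_not_match = ["foo..bar+test", "foo.bar++test", "foo.bar++"]

for s in should_match + should_not_match:
    print(f"{s!r}: {is_valid(s)}")
```

Note that in Python `$` also matches just before a trailing newline, so `re.fullmatch` (or `\Z` in place of `$`) may be preferable when the input can end with one.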
^(?:[0-9A-Za-z]+|([._%+-])(?!\1))+$
Broken down:
(?: … )+ : one or more of either:
  [0-9A-Za-z]+ : one or more alphanumeric characters, or
  ([._%+-]) : any allowed non-alphanumeric character
  (?!\1) : which isn't followed by the exact same character
Allows:
foo
foo.+bar
-700.bar+baz
Disallows:
foo..bar
foo.+bar--baz
It works by capturing the matched non-alphanumeric character into the first capture group (\1) each time the outer, non-capturing group is matched, and using a negative lookahead ((?! … )) to make sure the same character doesn't appear twice in a row. Be aware that not all regex flavors support negative lookahead!
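The alternation approach described above can be sketched in Python as follows. Python is an assumption here, and ALTERNATION and check are illustrative names, not part of the original answer.

```python
import re

# Same pattern as in the answer: runs of alphanumerics, or a single special
# character that is not immediately followed by itself.
ALTERNATION = re.compile(r'^(?:[0-9A-Za-z]+|([._%+-])(?!\1))+$')


def check(text: str) -> bool:
    """True if the whole string matches the alternation pattern."""
    return ALTERNATION.match(text) is not None


# Inputs copied from the answer's "Allows" and "Disallows" lists.
allowed = ["foo", "foo.+bar", "-700.bar+baz"]
disallowed = ["foo..bar", "foo.+bar--baz"]

for s in allowed + disallowed:
    print(f"{s!r}: {check(s)}")
```

One caveat: the alphanumeric run is quantified inside a group that is itself quantified, so a long alphanumeric run followed by a doubled special character can cause heavy backtracking in engines such as Python's `re`. Dropping the inner `+` (one alphanumeric character per iteration) avoids this.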
How about this, which rejects any two consecutive special characters (even different ones):
^(?!.*[._%+-]{2})[\w.%+-]+$
If only the same character must not be repeated (so that foo.+bar still matches, as the question asks), then use:
^(?!.*([._%+-])\1)[\w.%+-]+$
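The difference between the two patterns above shows up on an input like foo.+bar, which the question wants to accept. A small Python sketch of the comparison (Python and the names ANY_TWO and SAME_TWICE are assumptions for illustration):

```python
import re

# Rejects any two special characters in a row, even different ones.
ANY_TWO = re.compile(r'^(?!.*[._%+-]{2})[\w.%+-]+$')
# Rejects only a special character that is immediately repeated.
SAME_TWICE = re.compile(r'^(?!.*([._%+-])\1)[\w.%+-]+$')

for s in ["foo.bar+test", "foo.+bar", "foo..bar"]:
    print(f"{s!r}: any_two={bool(ANY_TWO.match(s))}, "
          f"same_twice={bool(SAME_TWICE.match(s))}")
```

Also worth knowing: for Python str patterns, `\w` admits non-ASCII letters and digits unless `re.ASCII` is passed, so it is slightly broader than [A-Za-z0-9_].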
Using PHP's PCRE, you can do this:
/^([A-Za-z0-9]|([._%+-])(?!\2))*$/
The \2 is the back-reference that's required to detect a duplicate usage of the same symbol. I'm not sure it's possible to do this without a forward assertion and a back-reference, so here is my working regex, tested against:
'foo' => true,
'bar.baz' => true,
'bar.biz.buz' => true,
'bar.+bar' => true,
'bar..bar' => false,
'biz.baz..' => false,
'..++..' => false,
'.faf.' => true,
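For readers not using PHP, the same table can be checked in Python. This is a hedged port, assuming Python's re module treats this pattern the same way PCRE does (it supports both the back-reference and the negative lookahead); PHP_STYLE and CASES are illustrative names.

```python
import re

# The PHP pattern without its / delimiters; \2 refers to the inner group
# that captures the special character.
PHP_STYLE = re.compile(r'^([A-Za-z0-9]|([._%+-])(?!\2))*$')

# Expected results copied from the table above.
CASES = {
    'foo': True,
    'bar.baz': True,
    'bar.biz.buz': True,
    'bar.+bar': True,
    'bar..bar': False,
    'biz.baz..': False,
    '..++..': False,
    '.faf.': True,
}

# Collect every input whose actual result differs from the expected one.
failures = [s for s, expected in CASES.items()
            if (PHP_STYLE.match(s) is not None) != expected]
print("all cases pass" if not failures else f"mismatches: {failures}")
```

Because the outer group is quantified with `*`, the empty string also matches; replacing `*` with `+` would require at least one character.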