Regex non-consecutive chars

Currently I have:

[A-Za-z0-9._%+-]

Repeated, this matches any string made up of letters, digits, and certain special chars (._%+-).

How can I change this so that it won't match a string that contains the special chars consecutively?

For example, I want it to match: foo.bar+test and foo.+bar and +foo.

But not: foo..bar+test or foo.bar++test or foo.bar++


If your tool/language supports lookaheads, try:

^(?!.*([._%+-])\1)[A-Za-z0-9._%+-]+$
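As a quick check, here is a minimal Python sketch that runs this pattern against the examples from the question. The names NO_DOUBLED_SPECIAL and is_valid are made up for illustration, and Python's re module is just one flavor that happens to support lookaheads and backreferences; other engines may behave differently.

```python
import re

# Pattern copied verbatim from the answer above.
NO_DOUBLED_SPECIAL = re.compile(r"^(?!.*([._%+-])\1)[A-Za-z0-9._%+-]+$")


def is_valid(text: str) -> bool:
    """True if text uses only allowed chars and never doubles a special char."""
    # fullmatch is used because in Python `$` alone also matches just
    # before a trailing newline.
    return NO_DOUBLED_SPECIAL.fullmatch(text) is not None


# Examples taken from the question.
should_match = ["foo.bar+test", "foo.+bar", "+foo."]
should_not_match = ["foo..bar+test", "foo.bar++test", "foo.bar++"]

for s in should_match + should_not_match:
    print(f"{s!r}: {is_valid(s)}")
```

The negative lookahead scans the whole string once from the start, so the character class that follows only has to check that every character is allowed.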


^(?:[0-9A-Za-z]+|([._%+-])(?!\1))+$

Broken down:

  • (?:)+ — one or more of either:
    • [0-9A-Za-z]+ — one or more alphanumeric characters or
    • ([._%+-]) — any allowed non-alphanumeric
      • (?!\1) — which isn't followed by the exact same character

Allows:

  • foo
  • foo.+bar
  • -700.bar+baz

Disallows:

  • foo..bar
  • foo.+bar--baz

It works by capturing each matched non-alphanumeric character into the first capture group (referenced as \1) every time the outer, non-capturing group matches, and using a negative look-ahead ((?!...)) to make sure the same character doesn't appear twice in a row. Be aware that not all regex flavors support negative look-ahead!
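The breakdown can be tried out with a small Python sketch. It is the same pattern, only spread out with re.VERBOSE so each piece carries a comment; the names ALTERNATION and accepts are assumptions for this example, not part of the answer.

```python
import re

# Same pattern as above, laid out with re.VERBOSE (whitespace and
# comments outside character classes are ignored by the engine).
ALTERNATION = re.compile(
    r"""
    ^
    (?:
        [0-9A-Za-z]+     # a run of alphanumeric characters, or
      | ([._%+-])        # one special character, captured as group 1,
        (?!\1)           # not immediately followed by that same character
    )+
    $
    """,
    re.VERBOSE,
)


def accepts(text: str) -> bool:
    """True if the whole string matches the alternation pattern."""
    return ALTERNATION.fullmatch(text) is not None


allowed = ["foo", "foo.+bar", "-700.bar+baz"]
disallowed = ["foo..bar", "foo.+bar--baz"]

for s in allowed + disallowed:
    print(f"{s!r}: {accepts(s)}")
```

Group 1 is overwritten on every pass through the outer group, so \1 always refers to the special character that was matched most recently.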


How about this:

^(?!.*[._%+-]{2})[\w.%+-]+$

That rejects any two special characters in a row, including foo.+bar. If only the same special character may not be repeated, use:

^(?!.*([._%+-])\1)[\w.%+-]+$
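To see how the two variants differ, here is a hedged Python sketch. The names ANY_TWO_SPECIALS, SAME_SPECIAL_TWICE and classify are hypothetical. The re.ASCII flag is an added assumption: in Python 3, \w otherwise also matches non-ASCII letters and digits, which the question's character class does not allow.

```python
import re

# Both patterns are copied from the answer; re.ASCII is an addition
# that keeps \w limited to [A-Za-z0-9_].
ANY_TWO_SPECIALS = re.compile(r"^(?!.*[._%+-]{2})[\w.%+-]+$", re.ASCII)
SAME_SPECIAL_TWICE = re.compile(r"^(?!.*([._%+-])\1)[\w.%+-]+$", re.ASCII)


def classify(text: str) -> tuple[bool, bool]:
    """Return (strict, relaxed): whether each variant accepts the text."""
    strict = ANY_TWO_SPECIALS.fullmatch(text) is not None
    relaxed = SAME_SPECIAL_TWICE.fullmatch(text) is not None
    return strict, relaxed


for s in ["foo.bar+test", "foo.+bar", "foo..bar", "foo_bar"]:
    strict, relaxed = classify(s)
    print(f"{s!r}: strict={strict} relaxed={relaxed}")
```

Only the second variant accepts foo.+bar, which the question lists as a string that should match. Since \w already covers the underscore, [\w.%+-] accepts the same ASCII characters as [A-Za-z0-9._%+-].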


Using PHP's PCRE, you can do this:

/^([A-Za-z0-9]|([._%+-])(?!\2))*$/

The \2 is the back-reference that detects a repeat of the same symbol. I'm not sure it's possible to do this without a lookahead assertion and a back-reference, so here is my working regex, tested against:

'foo'         => true,
'bar.baz'     => true,
'bar.biz.buz' => true,
'bar.+bar'    => true,
'bar..bar'    => false,
'biz.baz..'   => false,
'..++..'      => false,
'.faf.'       => true,
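For readers without PHP at hand, the same table can be checked with a rough Python port. This is a sketch, not the answerer's test harness: the pattern is the PHP one with its / delimiters removed, and the names PATTERN, EXPECTED and results are assumptions.

```python
import re

# The PHP pattern without its / delimiters. Group 1 is the whole
# alternative; group 2 is the special character that \2 refers to.
PATTERN = re.compile(r"^([A-Za-z0-9]|([._%+-])(?!\2))*$")

# Expected results copied from the list above.
EXPECTED = {
    "foo": True,
    "bar.baz": True,
    "bar.biz.buz": True,
    "bar.+bar": True,
    "bar..bar": False,
    "biz.baz..": False,
    "..++..": False,
    ".faf.": True,
}

results = {s: PATTERN.fullmatch(s) is not None for s in EXPECTED}

for s, got in results.items():
    status = "ok" if got == EXPECTED[s] else "MISMATCH"
    print(f"{s!r}: {got} ({status})")
```

Because the outer group is quantified with *, this pattern also accepts the empty string; replacing * with + would require at least one character.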