
Match alternating digits and characters

I have read the Google Python tutorial on regular expressions and tried to write one of the patterns I need.

  • The string must be exactly 10 characters long.
  • Characters 1, 3, 5, 7 and 9 must be digits (1-5), and no digit may be repeated.
  • The other characters are letters (a, b or c), and they can be repeated.

For example:

1a2b4a3c5b # is valid

1a5c4b4a3b # is not valid because of two 4s

So far I've tried:

pattern = r'([1-5])[abc]([1-5^\1])[abc]([1-5^\1\2])[abc]([1-5\1\2\3])[abc]([1-5\1\2\3\4])[abc]'

but it failed...


I would suggest something like this instead of a regex:

def matches(s):
    return (len(s) == 10 and 
            set(s[::2]) == set('12345') and 
            set(s[1::2]) <= set('abc'))

>>> matches('1a2b4a3c5b')
True
>>> matches('1a5c4b4a3b')
False
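
The set comparison is what rules out repeats: there are only five digit positions, so the set of those characters can equal {'1', ..., '5'} only if each digit appears exactly once. A few more cases, as a rough sketch (the extra test strings are my own, not from the question):

```python
def matches(s):
    # Same check as above: length, digit positions, letter positions.
    return (len(s) == 10 and
            set(s[::2]) == set('12345') and
            set(s[1::2]) <= set('abc'))

cases = {
    '1a2b4a3c5b': True,    # all five digits, each used once
    '1a5c4b4a3b': False,   # '4' twice, so '2' is missing from the set
    '1a2b4a3c5d': False,   # 'd' is not an allowed letter
    '1a2b4a3c': False,     # too short
    'a1b2c4a3b5': False,   # digits and letters swapped
}

for s, expected in cases.items():
    assert matches(s) == expected, s
```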


You're trying to include a previous match in a negated character class, which is not possible.
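
You can check this quickly in Python. Inside a character class, ^ only negates when it is the first character, and a numeric escape such as \1 is read as a character code rather than as a reference to group 1. The sketch below (the names cls and attempt are just mine) shows that the pattern from the question therefore accepts the string with the repeated 4:

```python
import re

# Inside [...], a '^' that is not first is a literal caret,
# and \1 is the character with octal code 1, not group 1.
cls = re.compile(r'[1-5^\1]')
caret_ok = bool(cls.fullmatch('^'))        # literal caret is accepted
ctrl_ok = bool(cls.fullmatch('\x01'))      # so is the control character

# The pattern from the question, split over two lines for readability.
attempt = re.compile(
    r'([1-5])[abc]([1-5^\1])[abc]([1-5^\1\2])[abc]'
    r'([1-5\1\2\3])[abc]([1-5\1\2\3\4])[abc]')
repeat_ok = bool(attempt.match('1a5c4b4a3b'))  # repeated '4' slips through

print(caret_ok, ctrl_ok, repeat_ok)
```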

To do this with a regex, you need a negative lookahead in front of each digit that rules out the digits already captured:

^([1-5])[abc](?!\1)([1-5])[abc](?!\1|\2)([1-5])[abc](?!\1|\2|\3)([1-5])[abc](?!\1|\2|\3|\4)[1-5][abc]$

Needless to say, this is not something regex should be doing.

A demo:

#!/usr/bin/env python

import re

tests = ['1a2b4a3c5b', '1a5c4b4a3b']

pattern = re.compile(
    r"""(?x)             # enable inline comments and ignore literal spaces
    ^                    # match the start of input
    ([1-5])              # match any of '1'..'5' and store it in group 1
    [abc]                # match 'a', 'b' or 'c'
    (?!\1)([1-5])        # if the digit from group 1 is not ahead, match any of '1'..'5' and store it in group 2
    [abc]                # match 'a', 'b' or 'c'
    (?!\1|\2)([1-5])     # if the digits from group 1 and 2 are not ahead, match any of '1'..'5' and store it in group 3
    [abc]                # match 'a', 'b' or 'c'
    (?!\1|\2|\3)([1-5])  # if the digits from group 1, 2 and 3 are not ahead, match any of '1'..'5' and store it in group 4
    [abc]                # match 'a', 'b' or 'c'
    (?!\1|\2|\3|\4)[1-5] # if the digits from group 1, 2, 3 and 4 are not ahead, match any of '1'..'5'
    [abc]                # match 'a', 'b' or 'c'
    $                    # match the end of input
    """, re.X)

for t in tests:
    if pattern.match(t):
        print(t)

would print:

1a2b4a3c5b


You can use negative lookaheads to check that the next digit is not one of the previous ones. The following regex should work:

pattern = r'^([1-5])[abc](?!\1)([1-5])[abc](?!\1|\2)([1-5])[abc](?!\1|\2|\3)([1-5])[abc](?!\1|\2|\3|\4)([1-5])[abc]$'

Edit: As Bart pointed out, the regex should start with ^ and end with $ so that it matches only the whole string.
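
Since the lookaheads follow a regular shape, the pattern can also be generated instead of typed by hand. This is only a sketch: build_pattern is a hypothetical helper of mine, and it uses fullmatch in place of ^ and $ (in Python, $ also matches just before a trailing newline, which fullmatch avoids). As a sanity check, it tries every digit sequence with the letters fixed to 'a' and counts the accepted ones, which should be the 5! = 120 orderings of the five digits:

```python
import re
from itertools import product

def build_pattern(n=5, digits='1-5', letters='abc'):
    # Hypothetical helper: assembles the lookahead pattern for n digit slots.
    parts = []
    for i in range(1, n + 1):
        # Forbid every digit captured so far, e.g. (?!\1|\2) before slot 3.
        seen = '|'.join('\\' + str(j) for j in range(1, i))
        guard = '(?!' + seen + ')' if seen else ''
        parts.append(guard + '([' + digits + '])[' + letters + ']')
    return re.compile(''.join(parts))

pattern = build_pattern()

# Every 5-digit sequence over '12345', interleaved with the letter 'a'.
accepted = [''.join(d) for d in product('12345', repeat=5)
            if pattern.fullmatch('a'.join(d) + 'a')]

print(len(accepted))                            # number of accepted orderings
print(bool(pattern.fullmatch('1a2b4a3c5b')))    # valid example from the question
print(bool(pattern.fullmatch('1a5c4b4a3b')))    # invalid example (two 4s)
print(bool(pattern.fullmatch('1a2b4a3c5b\n')))  # trailing newline is rejected
```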
