lookahead assertions
I'm trying to match a label within a valid domain name using a regular expression in Python:
DOMAIN_LABEL_RE = """
\A(
(?<![\d\-]) # cannot start with digit or hyphen, looking behind
([a-zA-Z\d\-]*?)
([a-zA-Z]+)# need at least 1 letter
([a-zA-Z\d\-]*?)
(?!\-) # cannot end with a hyphen, looking ahead
)\Z
"""
I'm trying to use a negative lookbehind and a negative lookahead assertion to avoid a hyphen at the beginning or end of the label.
But the string "-asdf" still matches: re.match(DOMAIN_LABEL_RE, "-asdf", re.VERBOSE).group()
I don't understand why it's still matching.
Thanks for any help.
M.
\A matches the start of the string, and the lookbehind that follows it succeeds if there is no digit or hyphen before that position.
You are at the beginning of the string, so of course there is no character before it!
Use a negative lookahead instead: (?![\d\-]).
The same applies to the end of the string: there you have to use a negative lookbehind instead, (?<!\-).
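To see the difference in isolation, here is a minimal sketch that strips the pattern down to a single assertion plus a character class. The variable names are made up for illustration and are not part of the original expression.

```python
import re

# At position 0 nothing precedes the cursor, so a negative lookbehind
# always succeeds there and the leading hyphen slips through.
start_lookbehind = re.compile(r"\A(?<![\d\-])[a-zA-Z\d\-]+\Z")
# A negative lookahead inspects the character *after* the cursor instead.
start_lookahead = re.compile(r"\A(?![\d\-])[a-zA-Z\d\-]+\Z")

# Mirror image at the end: nothing follows the last character, so a
# lookahead placed right before \Z cannot reject a trailing hyphen.
end_lookahead = re.compile(r"\A[a-zA-Z\d\-]+(?!\-)\Z")
end_lookbehind = re.compile(r"\A[a-zA-Z\d\-]+(?<!\-)\Z")

print(bool(start_lookbehind.match("-asdf")))  # True  (unwanted)
print(bool(start_lookahead.match("-asdf")))   # False
print(bool(end_lookahead.match("asdf-")))     # True  (unwanted)
print(bool(end_lookbehind.match("asdf-")))    # False
```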
I think an equivalent expression to your current one, with the assertions fixed, would be:
DOMAIN_LABEL_RE = r"""
(?i: # case insensitive (scoped inline flag, needs Python 3.6+)
\A(
([a-z]) # need at least 1 letter and cannot start with digit or hyphen
([a-z\d-]*?)
(?<!-) # cannot end with a hyphen
)\Z
)
"""
Note: I did not check whether the expression is actually suited for the problem you are trying to solve.
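A quick way to try the idea out is sketched below. It uses a compact, non-verbose form of the same pattern and passes re.IGNORECASE instead of an inline flag. The helper name is_valid_label and the sample labels are assumptions made for this example only, and, as noted above, this is not a check against the full domain name rules.

```python
import re

# Compact form: first character must be a letter, the rest may be letters,
# digits or hyphens, and the negative lookbehind rejects a trailing hyphen.
LABEL_RE = re.compile(r"\A[a-z][a-z\d-]*(?<!-)\Z", re.IGNORECASE)

def is_valid_label(label):
    """Hypothetical helper: True if the label matches the pattern."""
    return LABEL_RE.match(label) is not None

for label in ["asdf", "-asdf", "asdf-", "a-1", "1abc", "A"]:
    print(label, is_valid_label(label))
```

With this pattern, "asdf", "a-1" and "A" are accepted, while "-asdf", "asdf-" and "1abc" are rejected.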