Regex for [a-zA-Z0-9\-] with dashes allowed in between but not at the start or end
Update:
This question was an epic failure, but here's the working solution. It's based on Gumbo's answer (Gumbo's was close to working so I chose it as the accepted answer):
Solution:
r'(?=[a-zA-Z0-9\-]{4,25}$)^[a-zA-Z0-9]+(\-[a-zA-Z0-9]+)*$'
Original Question (albeit, after 3 edits)
I'm using Python and I'm not trying to extract the value, but rather test to make sure it fits the pattern.
Allowed values:
spam123-spam-eggs-eggs1
spam123-eggs123
spam
1234
eggs123
Not allowed values:
eggs1-
-spam123
spam--spam
I just can't have a dash at the start or the end. There is a question on here that works in the opposite direction by extracting the string value after the fact, but I simply need to test the value so that I can disallow it. Also, it must be at least 4 and at most 25 characters long, and no two dashes can touch each other.
Here's what I've come up with after some experimentation with lookbehind, etc:
# Nothing here
Try this regular expression:
^[a-zA-Z0-9]+(-[a-zA-Z0-9]+)*$
This regular expression only allows hyphens to separate sequences of one or more characters from [a-zA-Z0-9].
Edit: Following up on your comment: the expression (…)* allows the part inside the group to be repeated zero or more times. That means
a(bc)*
is the same as
a|abc|abcbc|abcbcbc|abcbcbcbc|…
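To see that equivalence in practice, here is a minimal Python sketch. The test strings are made up for illustration, and fullmatch is used so that the whole string has to fit the pattern:

```python
import re

# a(bc)* : an "a" followed by zero or more repetitions of "bc"
pattern = re.compile(r"a(bc)*")

matching = ["a", "abc", "abcbc", "abcbcbc"]
non_matching = ["", "ab", "abcb", "bc"]

# Map each test string to whether the whole string fits the pattern.
results = {s: bool(pattern.fullmatch(s)) for s in matching + non_matching}
for s, ok in results.items():
    print(repr(s), ok)
```

Only complete repetitions of "bc" are accepted, which is why "ab" and "abcb" are rejected.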
Edit: Now that you have changed the requirements: since you probably don't want to restrict the length of each hyphen-separated part individually, you will need a look-ahead assertion to take the overall length into account:
(?=[a-zA-Z0-9-]{4,25}$)^[a-zA-Z0-9]+(-[a-zA-Z0-9]+)*$
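A rough way to try this expression against the sample values from the question is sketched below. The helper name is_valid and the extra length test cases are assumptions added for illustration. Note that in Python, $ also matches just before a trailing newline, so the sketch uses re.fullmatch to avoid accepting a value such as "spam\n":

```python
import re

# The look-ahead checks the overall length (4 to 25 allowed characters);
# the rest enforces alphanumeric runs separated by single hyphens.
PATTERN = re.compile(r"(?=[a-zA-Z0-9-]{4,25}$)^[a-zA-Z0-9]+(-[a-zA-Z0-9]+)*$")

def is_valid(value):
    # Hypothetical helper: True if the whole string fits the pattern.
    # fullmatch avoids "$" matching just before a trailing newline.
    return PATTERN.fullmatch(value) is not None

allowed = ["spam123-spam-eggs-eggs1", "spam123-eggs123", "spam", "1234", "eggs123"]
# The last two entries are extra cases for the length limits.
rejected = ["eggs1-", "-spam123", "spam--spam", "abc", "a" * 26]

for value in allowed + rejected:
    print(value, is_valid(value))
```

Since the question only needs a yes/no test, returning a boolean rather than the match object keeps the calling code simple.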
The current regex is simple and fairly readable. Rather than making it long and complicated, have you considered applying the other constraints with normal Python string processing tools?
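import re

def fits_pattern(string):
    # Check length and dash placement with plain string methods first.
    if (4 <= len(string) <= 25 and
            "--" not in string and
            not string.startswith("-") and
            not string.endswith("-")):
        # Every character must be a letter, a digit, or a dash.
        # (re.fullmatch requires Python 3.4 or later.)
        return re.fullmatch(r"[a-zA-Z0-9-]+", string)
    else:
        return None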
It should be something like this:
^[a-zA-Z0-9]+(-[a-zA-Z0-9]+)*$
You are telling it to look for only one character, either a-z, A-Z, 0-9 or -; that is what [] does.
So if you use [abc], you will match only "a", "b", or "c", not "abc".
Have fun.
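The single-character behaviour of [] described in this answer can be checked with a short Python sketch. The variable names and test strings are only illustrative:

```python
import re

# A character class matches exactly one character from the set.
single = re.compile(r"[abc]")
# A quantifier is needed to match a run of such characters.
run = re.compile(r"[abc]+")

one_char = bool(single.fullmatch("a"))          # one character from the set
whole_word = bool(single.fullmatch("abc"))      # three characters, one class
with_quantifier = bool(run.fullmatch("abc"))    # "+" allows the whole run

# re.match only anchors at the start, so an unquantified class accepts
# any string whose first character is in the set.
prefix_only = bool(single.match("abc-!!"))

print(one_char, whole_word, with_quantifier, prefix_only)
```

The last check is why a bare character class passed to re.match appears to accept almost anything: only the first character is actually tested.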
If you simply don't want a dash at the end and beginning, try ^[^-].*?[^-]$
Edit: Bah, you keep changing it.
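For comparison, here is a small sketch of what ^[^-].*?[^-]$ accepts and rejects. The test strings are partly taken from the question and partly made up. As far as I can tell, it only guards the first and last character, so it says nothing about double dashes, other characters, or the length limits:

```python
import re

# Only the first and last characters are constrained (neither may be "-").
loose = re.compile(r"^[^-].*?[^-]$")

samples = ["spam-eggs", "-spam123", "eggs1-", "spam--spam", "sp am!", "a"]
checks = {s: bool(loose.match(s)) for s in samples}

for s, ok in checks.items():
    print(repr(s), ok)
```

Note that a single-character string is rejected as well, because the pattern needs at least two characters.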