Python regex: string does not contain "jpg" and must have "-" and lowercase
I'm having trouble figuring out a Python regex for Django URLs. I have certain criteria, but can't seem to come up with the magic formula. In the end, it's so I can identify which page is a CMS page and pass the alias URL it should load to the Django function.
Here are some examples of valid strings which would match:
- about-us
- contact-us
- terms-and-conditions
- info/learn-more-pg2
- info/my-example-url
Criteria:
- Must be all lowercase
- Must contain a dash "-"
- Can contain numbers, letters and a slash "/"
- Must be at least 4 characters long and a max of 30 characters
- Cannot contain special characters
- Cannot contain the words:
  - .jpg
  - .gif
  - .png
  - .css
  - .js
Examples which should not match:
- About-Us (has upper case)
- contactus (doesn't have a dash)
- pg (less than 4 characters)
- img/bg.gif (contains ".gif")
- files/my-styles.css (contains ".css")
- my-page@ (has a character other than letters, numbers, dash or slash)
I know this isn't even close yet, but this is as far as I've gotten:
(?P<alias>([a-z/-]{4,30}))
I apologize for the long list of requirements, but I just can't get my head wrapped around this regex stuff.
Thanks!
I'm puzzled as to why several of the commenters find this hard to do with a regex. This is exactly what regular expressions are good at.
import re

subject = "info/learn-more-pg2"  # example input taken from the question

if re.match(
    r"""^                          # match the start of the string
    (?=.*-)                        # assert that there is a dash
    (?!.*\.(?:jpg|gif|png|css|js)) # assert that none of these extensions occur
    [a-z0-9/-]{4,30}               # match 4-30 of the allowed characters
    $                              # match the end of the string""",
    subject, re.VERBOSE):
    print("match")     # successful match at the start of the string
else:
    print("no match")  # match attempt failed
It is true, however, that since the "." isn't among the allowed characters, the check for the forbidden file extensions is not really necessary.
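As a rough check, the pattern can be run against the examples from the question. This is only a sketch: it drops the extension lookahead (redundant, for the reason just given), and the helper name is_cms_alias is made up for illustration, not anything Django provides.

```python
import re

# Same idea as the pattern above, minus the redundant extension check.
ALIAS_RE = re.compile(r"^(?=.*-)[a-z0-9/-]{4,30}$")


def is_cms_alias(path):
    """Hypothetical helper: True if path satisfies the question's criteria."""
    return ALIAS_RE.match(path) is not None


# Examples copied from the question.
valid = [
    "about-us",
    "contact-us",
    "terms-and-conditions",
    "info/learn-more-pg2",
    "info/my-example-url",
]
invalid = [
    "About-Us",             # has upper case
    "contactus",            # no dash
    "pg",                   # too short
    "img/bg.gif",           # contains "."
    "files/my-styles.css",  # contains "."
    "my-page@",             # contains "@"
]

for s in valid + invalid:
    print(s, is_cms_alias(s))
```

In a Django URLconf the same expression would go inside the named group from the question, (?P<alias>...), but that part is not shown here.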
This is my first post on SO. Please correct my English wherever needed.
I think any of the following REs would work:
'(?=.{4,30}\Z)(?=.*-)[-a-z0-9/]+\Z'
'(?=.{4,30}\Z)[a-z0-9/]*-[-a-z0-9/]*\Z'
'(?=.{4,30}\Z)(?:[a-z0-9/]+|)-[-a-z0-9/]*\Z'
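To see whether the three REs really behave the same, here is a small sketch that runs each of them with re.match over the examples from the question. It assumes the patterns are used with re.match (so they are anchored at the start of the string); the variable names are just for illustration.

```python
import re

# The three alternatives listed above.
patterns = [
    r'(?=.{4,30}\Z)(?=.*-)[-a-z0-9/]+\Z',
    r'(?=.{4,30}\Z)[a-z0-9/]*-[-a-z0-9/]*\Z',
    r'(?=.{4,30}\Z)(?:[a-z0-9/]+|)-[-a-z0-9/]*\Z',
]

# Examples copied from the question.
valid = ["about-us", "contact-us", "terms-and-conditions",
         "info/learn-more-pg2", "info/my-example-url"]
invalid = ["About-Us", "contactus", "pg", "img/bg.gif",
           "files/my-styles.css", "my-page@"]

for s in valid + invalid:
    # One boolean per pattern, in the order listed above.
    print(s, [bool(re.match(p, s)) for p in patterns])
```

On these examples each valid string should give three True values and each invalid one three False values.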