Python regex: string does not contain "jpg" and must have "-" and lowercase

I'm having trouble figuring out a Python regex for Django URLs. I have certain criteria, but can't seem to come up with the magic formula. In the end, it's so I can identify which page is a CMS page and pass the alias URL it should load to the Django function.

Here are some examples of valid strings which would match:

  • about-us
  • contact-us
  • terms-and-conditions
  • info/learn-more-pg2
  • info/my-example-url

Criteria:

  • Must be all lowercase
  • Must contain a dash "-"
  • Can contain numbers, letters and a slash "/"
  • Must be at least 4 and at most 30 characters long
  • Cannot contain special characters
  • Cannot contain the words:
    • .jpg
    • .gif
    • .png
    • .css
    • .js

Examples which should not match:

  • About-Us (has upper case)
  • contactus (doesn't have a dash)
  • pg (less than 4 characters)
  • img/bg.gif (contains ".gif")
  • files/my-styles.css (contains ".css")
  • my-page@ (has a character other than letters, numbers, dash or slash)

I know this isn't even close yet, but this is as far as I've gotten:

(?P<alias>([a-z/-]{4,30}))

I apologize for the long list of requirements, but I just can't wrap my head around this regex stuff.

Thanks!


I'm puzzled as to why several of the commenters find this hard to do with a regex. This is exactly what regular expressions are good at.

import re

subject = "about-us"  # example input; replace with the URL alias to test

if re.match(
    r"""^             # match start of the string
    (?=.*-)           # assert that there is a dash
    (?!.*\.(?:jpg|gif|png|css|js))  # assert that none of these extensions occur
    [a-z0-9/-]{4,30}  # match 4-30 of the allowed characters
    $                 # match the end of the string""",
    subject, re.VERBOSE):
    print("match")      # successful match at the start of the string
else:
    print("no match")   # match attempt failed

It is true, however, that since the "." isn't among the allowed characters, the check for the forbidden file extensions is not really necessary.
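To illustrate that last point, here is a minimal sketch that runs the pattern with and without the extension lookahead over the examples from the question. The names FULL, SIMPLE, SHOULD_MATCH, SHOULD_FAIL and agrees are made up for this illustration and are not part of the original answer.

```python
import re

# Pattern from the answer, compiled once in verbose mode.
FULL = re.compile(
    r"""^
    (?=.*-)                          # there is a dash somewhere
    (?!.*\.(?:jpg|gif|png|css|js))   # none of the forbidden extensions
    [a-z0-9/-]{4,30}                 # 4-30 of the allowed characters
    $""",
    re.VERBOSE,
)

# Same pattern with the extension lookahead dropped (hypothetical name).
SIMPLE = re.compile(r"^(?=.*-)[a-z0-9/-]{4,30}$")

# Examples taken from the question.
SHOULD_MATCH = ["about-us", "contact-us", "terms-and-conditions",
                "info/learn-more-pg2", "info/my-example-url"]
SHOULD_FAIL = ["About-Us", "contactus", "pg", "img/bg.gif",
               "files/my-styles.css", "my-page@"]


def agrees(candidates, expected):
    """True if both patterns give the expected verdict for every candidate."""
    return all(
        bool(FULL.match(s)) == bool(SIMPLE.match(s)) == expected
        for s in candidates
    )


print(agrees(SHOULD_MATCH, True), agrees(SHOULD_FAIL, False))
```

One thing to be aware of: in Python, "$" also matches just before a trailing newline, so "about-us\n" passes both patterns. If that matters, "\Z" can be used in place of "$".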


This is my first post on SO. Please correct my English wherever needed.

I think any of the following REs would work:

'(?=.{4,30}\Z)(?=.*-)[-a-z0-9/]+\Z'

'(?=.{4,30}\Z)[a-z0-9/]*-[-a-z0-9/]*\Z'

'(?=.{4,30}\Z)(?:[a-z0-9/]+|)-[-a-z0-9/]*\Z'
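As a quick check, the sketch below tries the three REs on the examples from the question. It assumes they are applied with re.match, since none of them has a leading "^"; anything that searches instead of matching from the start would probably need one added. The patterns are written as raw strings with plain "*" quantifiers, and the names PATTERNS, SHOULD_MATCH, SHOULD_FAIL and verdicts are made up for this illustration.

```python
import re

# The three REs, as raw strings so that \Z reaches the regex engine untouched.
PATTERNS = [
    r"(?=.{4,30}\Z)(?=.*-)[-a-z0-9/]+\Z",
    r"(?=.{4,30}\Z)[a-z0-9/]*-[-a-z0-9/]*\Z",
    r"(?=.{4,30}\Z)(?:[a-z0-9/]+|)-[-a-z0-9/]*\Z",
]

# Examples taken from the question.
SHOULD_MATCH = ["about-us", "contact-us", "terms-and-conditions",
                "info/learn-more-pg2", "info/my-example-url"]
SHOULD_FAIL = ["About-Us", "contactus", "pg", "img/bg.gif",
               "files/my-styles.css", "my-page@"]


def verdicts(s):
    """Return one boolean per pattern: does it match s from the start?"""
    return [re.match(p, s) is not None for p in PATTERNS]


for s in SHOULD_MATCH + SHOULD_FAIL:
    print(repr(s), verdicts(s))
```

Because these use "\Z" rather than "$", a string with a trailing newline such as "about-us\n" is rejected by all three.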