Regular expression: matching only if not ending in a particular sequence
I would like to match a URL that does NOT end in .html
This is the pattern I came up with:
[/\w\.-]+[^\.html$]
The following matches because it does not end in .html:
/blog/category/subcategory/
This doesn't match because it ends in .html:
/blog/category/subcategory/index.html
However, the following does not match, although I want it to, because it ends in .ht and not .html:
/blog/category/subcategory/index.ht
How should I change my pattern?
You can use a negative lookbehind assertion if your regular expression engine supports it:
^[/\w\.-]+(?<!\.html)$
If you don't have lookbehind assertions but you do have lookaheads, then you can use a negative lookahead instead:
^(?!.*\.html$)[/\w\.-]+$
See it working online: rubular
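For what it's worth, the original attempt fails because [^\.html$] is a negated character class: it matches any single character that is not one of ., h, t, m, l or $, rather than "not the sequence .html". Here is a minimal sketch in Python (my choice of engine, the question doesn't name one) that runs the question's pattern and the two patterns above against the three URLs from the question. The helper name full_match is just something I made up for the example.

```python
import re

# Patterns copied from the question and from the answer above.
ORIGINAL = r"[/\w\.-]+[^\.html$]"        # [^...] is a negated character class
LOOKBEHIND = r"^[/\w\.-]+(?<!\.html)$"   # negative lookbehind at the end
LOOKAHEAD = r"^(?!.*\.html$)[/\w\.-]+$"  # negative lookahead at the start

# Test URLs taken from the question.
URLS = [
    "/blog/category/subcategory/",
    "/blog/category/subcategory/index.html",
    "/blog/category/subcategory/index.ht",
]


def full_match(pattern, url):
    """Hypothetical helper: True if the pattern matches the whole URL."""
    return re.fullmatch(pattern, url) is not None


for url in URLS:
    verdicts = [full_match(p, url) for p in (ORIGINAL, LOOKBEHIND, LOOKAHEAD)]
    print(url, verdicts)
```

The last URL is the interesting one: the original pattern rejects it only because it ends in t, which happens to be in the character class, while both assertion-based patterns accept it.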
What engine are you using? If it's one that supports lookahead assertions, you can do the following:
/((?!\.html$)[/\w.-])+/
If we break it out into the components, it looks like this:
( # start a group for the purposes of repeating
(?!\.html$) # negative lookahead assertion for the pattern /\.html$/
[/\w.-] # your own pattern for matching a URL character
)+ # repeat the group
This means that, for every character, it tests that the pattern /\.html$/ can't match at the current position before it consumes the character.
You may also want to anchor the entire pattern with ^ at the start and $ at the end to force it to match the entire URL; otherwise it's free to match only a portion of the URL. With this change, it becomes:
/^((?!\.html$)[/\w.-])+$/
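To see why the anchors matter, here is a small sketch in Python (an assumption on my part, since the engine wasn't stated) comparing the unanchored and anchored versions on the .html URL from the question. Python has no /.../ regex literals, so the delimiters are dropped.

```python
import re

# The two versions of the pattern from this answer, without the / delimiters.
UNANCHORED = re.compile(r"((?!\.html$)[/\w.-])+")
ANCHORED = re.compile(r"^((?!\.html$)[/\w.-])+$")

url = "/blog/category/subcategory/index.html"

# Without anchors the group repeats until the lookahead fails at ".html",
# so the regex still matches, just not the whole URL.
partial = UNANCHORED.search(url)
print(partial.group(0))

# With anchors the match has to reach the end of the string, which the
# lookahead prevents, so there is no match at all.
print(ANCHORED.search(url))

# A URL ending in .ht is accepted by the anchored pattern.
print(ANCHORED.search("/blog/category/subcategory/index.ht") is not None)
```

The unanchored pattern stops right before .html and reports the prefix as a match, which is probably not what you want when validating a whole URL.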