
Regular expression: matching only if not ending in particular sequence

I would like to test whether a URL does NOT end in .html

This is the pattern I came up with:

[/\w\.-]+[^\.html$]

The following matches because it does not end in .html:

/blog/category/subcategory/

This doesn't match because it ends in .html:

/blog/category/subcategory/index.html

However, the following does not match, although I want it to, because it ends in .ht rather than .html:

/blog/category/subcategory/index.ht

How should I change my pattern?


You can use a negative lookbehind assertion if your regular expression engine supports it:

^[/\w\.-]+(?<!\.html)$

If you don't have lookbehind assertions but you do have lookaheads, you can use a negative lookahead instead:

^(?!.*\.html$)[/\w\.-]+$

See it working online: rubular
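
For anyone who wants to check this locally, here is a minimal sketch in Python (an assumption on my part, since the question doesn't say which engine is in use; Python's re module supports both fixed-width lookbehind and lookahead). It runs the two patterns above against the three URLs from the question. The question's original pattern is included for comparison: [^\.html$] is a character class, so it rejects any final character that is one of . h t m l $ rather than the sequence .html, which is why index.ht fails. The names ORIGINAL, LOOKBEHIND, LOOKAHEAD and check are just illustrative.

```python
import re

# The question's original pattern. The trailing [^.html$] is a negated
# character class: it matches one character that is NOT ., h, t, m, l or $.
ORIGINAL = re.compile(r"[/\w.-]+[^.html$]")

# The two patterns from this answer.
LOOKBEHIND = re.compile(r"^[/\w.-]+(?<!\.html)$")
LOOKAHEAD = re.compile(r"^(?!.*\.html$)[/\w.-]+$")

URLS = [
    "/blog/category/subcategory/",
    "/blog/category/subcategory/index.html",
    "/blog/category/subcategory/index.ht",
]


def check(pattern, url):
    """Return True if the pattern matches the whole URL."""
    return pattern.fullmatch(url) is not None


for url in URLS:
    print(url, check(ORIGINAL, url), check(LOOKBEHIND, url), check(LOOKAHEAD, url))
```

The original pattern accepts only the first URL, while both lookaround patterns accept the first and third and reject the one ending in .html.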


What engine are you using? If it's one that supports lookahead assertions, you can do the following:

/((?!\.html$)[/\w.-])+/

If we break it out into the components, it looks like this:

(            # start a group for the purposes of repeating
 (?!\.html$) # negative lookahead assertion for the pattern /\.html$/
 [/\w.-]     # your own pattern for matching a URL character
)+           # repeat the group

This means that, at every position, it checks that the pattern /\.html$/ cannot match starting there before it consumes the next character.

You may also want to anchor the entire pattern with ^ at the start and $ at the end to force it to match the entire URL - otherwise it's free to only match a portion of the URL. With this change, it becomes

/^((?!\.html$)[/\w.-])+$/
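
To see the difference the anchors make, here is a small sketch, assuming Python's re module as the engine (the question doesn't name one). The names TEMPERED and ANCHORED are only for illustration. The unanchored pattern still finds a match inside a URL ending in .html, it just stops short of the extension, whereas the anchored version rejects that URL outright.

```python
import re

# Unanchored and anchored versions of the per-character lookahead pattern.
TEMPERED = re.compile(r"((?!\.html$)[/\w.-])+")
ANCHORED = re.compile(r"^((?!\.html$)[/\w.-])+$")

url = "/blog/category/subcategory/index.html"

# Without anchors, the engine is free to match only part of the URL:
# it consumes characters until the lookahead sees ".html" at the end.
partial = TEMPERED.search(url)
print(partial.group(0))  # /blog/category/subcategory/index

# With anchors, the whole URL must match, so this one is rejected.
print(ANCHORED.search(url))  # None

# URLs that do not end in .html still match in full.
print(ANCHORED.search("/blog/category/subcategory/index.ht") is not None)  # True
print(ANCHORED.search("/blog/category/subcategory/") is not None)  # True
```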