Perl regular expression
I have some URLs in this format. Some URLs contain &abc=4 and some do not.
xxxxxxxxxxxxxxxxxxxxxxxxxxx&abc=4
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx&abc=4
xxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Here xxxxxxxxxxxxxxxxxxxxx stands for an arbitrary string.
I want to match URLs which have xxxxxxxxxxxxxxxxx only and not &abc=4 (meaning I want to get only these kinds of URLs: xxxxxxxxxxxxxx, xxxxxxxxxxxxxx, xxx).
I know how to write a regular expression which matches the entire url. For example: /x.*abc=4/
But how do I write a regular expression that matches only xxxxxxxxxx and not &abc=4?
I would use a negative lookahead assertion (it looks ahead for what is not allowed to follow at the current position):
^(?!.*&abc=4$).*$
This pattern will match any string that does not end with &abc=4.
You can verify it online here: http://www.rubular.com/
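As a rough illustration, the negative lookahead can be used in Perl to filter a list. The URLs below are hypothetical stand-ins (the real ones are elided in the question), and @urls and @kept are names made up for this sketch.

```perl
use strict;
use warnings;

# Hypothetical sample URLs; the real ones are not shown in the question.
my @urls = (
    'http://example.com/page?id=1&abc=4',
    'http://example.com/page?id=2',
    'http://example.com/other?x=9&abc=4',
    'http://example.com/plain',
);

# Keep only the URLs that do not end with &abc=4.
# The lookahead at the start rejects the string if "&abc=4" sits at its end.
my @kept = grep { /^(?!.*&abc=4$).*$/ } @urls;

print "$_\n" for @kept;
```

With this input, only the second and fourth URLs are printed.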
Use a negative lookbehind assertion. The form is:
(?<![&?]abc=4)
(this will also exclude ?abc=4). Follow it with an end-of-string anchor such as $ so that it tests the end of the URL.
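A minimal sketch of how the lookbehind might be applied, assuming the goal is to test whether a URL ends with the parameter. The helper name lacks_abc4 and the sample URLs are hypothetical, and the \z end-of-string anchor is an addition made here so that the assertion checks the end of the string.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 when the URL does not end in
# &abc=4 or ?abc=4, and 0 otherwise.
sub lacks_abc4 {
    my ($url) = @_;
    # \z matches at the very end of the string; the negative lookbehind
    # rejects that position when the six characters before it are
    # "&abc=4" or "?abc=4".
    return $url =~ /(?<![&?]abc=4)\z/ ? 1 : 0;
}

for my $url (
    'http://example.com/a?id=1&abc=4',
    'http://example.com/a?abc=4',
    'http://example.com/a?id=1',
) {
    printf "%-35s %s\n", $url, lacks_abc4($url) ? 'keep' : 'skip';
}
```

Like the lookahead approach, this only decides whether a URL qualifies; it does not extract the part before the parameter.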
Assuming there is one URL per line, you can use:
^([^&]+)
This captures everything up to the first instance of &.
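Capturing everything before the first & could look like the following in Perl. This is only a sketch: before_first_amp is a hypothetical helper, the URLs are made up, and the pattern used here is anchored at the start with a greedy quantifier so that the capture runs all the way to the first &.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the part of a URL before the first &,
# or the whole URL when it contains no & at all.
sub before_first_amp {
    my ($url) = @_;
    # In list context the match returns the captured group.
    my ($prefix) = $url =~ /^([^&]+)/;
    return $prefix;
}

print before_first_amp('http://example.com/page?id=1&abc=4'), "\n";
print before_first_amp('http://example.com/plain'), "\n";
```

Note that this cuts at the first & regardless of what follows it, so a URL such as a?x=1&y=2&abc=4 would be reduced to a?x=1.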
As @Benoit said, you can do this with a zero-width assertion that keeps the query string out of the capture, but you would want a positive lookahead rather than a negative lookbehind. Syntax example:
(?=(&[^=]+=\d+)+)
As you can see though, this would complicate the expression a touch.
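For comparison, here is a sketch of the positive-lookahead idea narrowed to the literal &abc=4 from the question rather than a generic key=digits pair. The helper name without_abc4 is hypothetical, and the code assumes each URL is a single line with no embedded newline.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the URL with a trailing &abc=4 left out.
# URLs without that suffix are returned unchanged.
sub without_abc4 {
    my ($url) = @_;
    # The lazy group grows one character at a time until the lookahead
    # sees either "&abc=4" followed by the end of the string, or the
    # end of the string itself. The lookahead consumes nothing, so the
    # suffix never becomes part of the capture.
    my ($base) = $url =~ /^(.*?)(?=(?:&abc=4)?\z)/;
    return $base;
}

print without_abc4('http://example.com/page?id=1&abc=4'), "\n";
print without_abc4('http://example.com/page?id=2'), "\n";
```

Unlike cutting at the first &, this keeps any other parameters that come before &abc=4.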
Hope this helps.