
Regex for matching different URLs

I am trying to write a regex for a URL match and it's not working.

The subject URL is...

var subject = "http://encore.lsbu.ac.uk/iii/encore/search/C%257CSab%257COrightresult%257CU1?lang=eng&suite=pearl";

The regex is...

var regex = /http:\/\/encore.lsbu.ac.uk\/iii\/encore\/search\/C%257CS[a-z][A-Z][0-9]%257COrightresult%257CU1?lang=eng&suite=pearl/i;

And I am using JavaScript to test it.

var answer = regex.test(subject);
// answer is false

The goal is to match any URL that is identical to this one except for the keyword in the middle of the string. In other words, the two strings should match in full, but the keyword part shouldn't be checked. Am I doing anything wrong?


Here is the fixed regex:

/http:\/\/encore\.lsbu\.ac\.uk\/iii\/encore\/search\/C%257CS[a-z0-9]*%257COrightresult%257CU1\?lang=eng&suite=pearl/i

I made the following changes to it:

  • Escaped the dots in the URL with a backslash.
  • Changed [a-z][A-Z][0-9] to [a-z0-9]* to match any of those characters repeated zero or more times.
    • You can also use + instead of * to match the characters once or more, or {2} to match them exactly twice (this is the case in your example string, but may not be in all cases).
  • Escaped the question mark in the URL with a backslash.

Edit: Removed the A-Z since it isn't needed (the regex is case insensitive).
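As a quick check of the changes listed above, here is a small sketch that runs both the original and the fixed pattern against the subject URL from the question. The variable names original, fixed, originalMatches and fixedMatches are only for illustration.

```javascript
var subject = "http://encore.lsbu.ac.uk/iii/encore/search/C%257CSab%257COrightresult%257CU1?lang=eng&suite=pearl";

// Pattern from the question: three fixed character slots (letter, letter, digit)
// and an unescaped "?", which makes the preceding "1" optional instead of
// matching a literal question mark.
var original = /http:\/\/encore.lsbu.ac.uk\/iii\/encore\/search\/C%257CS[a-z][A-Z][0-9]%257COrightresult%257CU1?lang=eng&suite=pearl/i;

// Fixed pattern: dots and question mark escaped, keyword matched by [a-z0-9]*.
var fixed = /http:\/\/encore\.lsbu\.ac\.uk\/iii\/encore\/search\/C%257CS[a-z0-9]*%257COrightresult%257CU1\?lang=eng&suite=pearl/i;

var originalMatches = original.test(subject); // false
var fixedMatches = fixed.test(subject);       // true
```

Note that neither pattern is anchored, so test() also succeeds when the URL appears inside a longer string.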


If I understand correctly, you are trying to match the URL except you want to accept any keyword chars instead of the 'ab' in your original string.

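For reference, [a-z][A-Z][0-9] says that you want exactly three characters in sequence: a lowercase character, then an uppercase character, then a digit (with the i flag the case no longer matters, but it is still letter, letter, digit). Try using \w instead, which matches any letter, digit, or underscore.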

If you want to accept a keyword of any length (at least one character), then use \w+. If you want to limit it to exactly two characters, try \w{2}.

So the relevant part of your regex would be:

/C%257CS\w+%257
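If the whole string has to match, one way to put this together is to build the pattern from the fixed parts of the URL and anchor it with ^ and $. This is only a sketch: escapeRegExp, prefix, suffix and urlRegex are names made up for the example, and it assumes the keyword consists only of word characters (letters, digits, underscore).

```javascript
// Hypothetical helper: backslash-escape characters that are special in a regex.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Fixed parts of the URL on either side of the keyword.
var prefix = "http://encore.lsbu.ac.uk/iii/encore/search/C%257CS";
var suffix = "%257COrightresult%257CU1?lang=eng&suite=pearl";

// ^ and $ force the whole string to match; \w+ stands in for the keyword.
var urlRegex = new RegExp(
  "^" + escapeRegExp(prefix) + "\\w+" + escapeRegExp(suffix) + "$",
  "i"
);

var matchesAb = urlRegex.test(prefix + "ab" + suffix);          // true
var matchesOther = urlRegex.test(prefix + "history2" + suffix); // true
var matchesEmpty = urlRegex.test(prefix + suffix);              // false, \w+ needs at least one character
```

Escaping the literal parts programmatically avoids having to remember which characters (here the dots and the question mark) need a backslash.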