Non-greedy regex match, JavaScript and ASP
I need to do a non greedy match and hope someone can help me. I have the following, and I am using JavaScript and ASP
match(/\href=".*?\/pdf\/.*?\.pdf/)
The match above starts at the first href attribute in the string. I need it to match only the last href, the one that points into the /pdf/ folder.
Any ideas?
You need to use capturing parentheses for sub-expression matches:
match(/\href=".*?(\/pdf\/.*?\.pdf)/)[1];
match will return an array with the entire match at index 0; each sub-expression capture is added to the array after that, in the order of its opening parenthesis. In this case, index 1 contains the section matching \/pdf\/.*?\.pdf.
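To make the array layout concrete, here is a minimal sketch. The sample string and the variable names are assumptions for illustration, and the pattern is written without the backslash before the h, which I am assuming was unintentional (outside of Unicode mode, \h simply matches a literal h).

```javascript
// Hypothetical sample markup, used only to illustrate the array returned by match().
const sample =
  '<a href="someurl/somepage.html">Test</a><a href="dir/pdf/file.pdf">Some PDF</a>';

// Same pattern as above, minus the redundant backslash before "h".
const result = sample.match(/href=".*?(\/pdf\/.*?\.pdf)/);

// match() returns null when nothing matches, so guard before indexing.
const whole = result ? result[0] : null;   // index 0: the entire match
const pdfPath = result ? result[1] : null; // index 1: the first capture group

console.log(whole);
console.log(pdfPath);
```

Here index 0 still begins at the first href in the string, while index 1 holds only the captured /pdf/... portion.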
Try to make your regex more specific than just .*? if it's matching too broadly. For instance:
match(/\href="([^"]+?\/pdf\/[^\.]+?\.pdf)"/)[1];
[^"]+? will lazily match a string of characters that doesn't contain the double quote character. This keeps the match inside the quotes, so it won't be too broad in a string like the one below. Likewise, [^\.]+? only matches characters other than a period, so a file name with an extra dot in it will not match. An example string:
<a href="someurl/somepage.html">Test</a><a href="dir/pdf/file.pdf">Some PDF</a>
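As a rough sketch of how the tighter pattern behaves on that string: the helper name firstPdfHref and the variable names are hypothetical, and the pattern again omits the redundant backslashes before h and inside the character class.

```javascript
// Hypothetical helper: returns the first href value that points into a /pdf/
// folder and ends in .pdf, or null when there is none.
function firstPdfHref(markup) {
  // [^"]+? cannot cross a double quote, so the match stays inside one attribute.
  // [^.]+? cannot cross a period, so the file name must not contain extra dots.
  const m = markup.match(/href="([^"]+?\/pdf\/[^.]+?\.pdf)"/);
  return m ? m[1] : null; // match() returns null when nothing matches
}

const markup =
  '<a href="someurl/somepage.html">Test</a><a href="dir/pdf/file.pdf">Some PDF</a>';

const pdfHref = firstPdfHref(markup);
const noPdf = firstPdfHref('<a href="someurl/somepage.html">Test</a>');

console.log(pdfHref);
console.log(noPdf);
```

Because the first href contains no /pdf/ segment before its closing quote, the engine gives up on it and moves on to the second link, so the capture is the full path of the PDF link rather than a span that starts at the first href.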