
Regex to remove all but file name from links

I am trying to write a regexp that removes file paths from links and images.

href="path/path/file" to href="file"
href="/file" to href="file"
src="/path/file" to src="file"

and so on...

I thought that I had it working, but it messes up if there are two paths in the string it is working on. I think my expression is too greedy. It finds the very last file in the entire string.

This is my code that shows the expression messing up on the test input:

<script type="text/javascript" src="/javascripts/jquery.js"></script>
<script type="text/javascript">
    $(document).ready(function(){
        var s = '<a href="one/keepthis"><img src="/one/two/keep.this"></a>';
        var t = s.replace(/(src|href)=("|').*\/(.*)\2/gi,"$1=$2$3$2");
        alert(t);
    });
</script>

It gives the output:

<a href="keep.this"></a>

The correct output should be:

<a href="keepthis"><img src="keep.this"></a>

Thanks for any tips!


It doesn't have to be a regular expression (assuming / delimiters):

var fileName = url.split('/').pop(); //pop takes the last element
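
To apply that to a whole HTML string rather than a single URL, one option is to let a simple regex find each attribute value and hand it to a callback that does the split. This is only a sketch: baseName and stripPaths are made-up names, and it assumes attribute values never contain a quote character.

```javascript
// Hypothetical helper: keep only the last '/'-separated segment.
function baseName(url) {
    return url.split('/').pop();
}

// Hypothetical helper: rewrite every src/href value in an HTML string.
// Assumption: the quoted value contains neither ' nor ".
function stripPaths(html) {
    return html.replace(/(src|href)=("|')([^"']*)\2/gi, function (match, attr, quote, url) {
        return attr + '=' + quote + baseName(url) + quote;
    });
}

var input = '<a href="one/keepthis"><img src="/one/two/keep.this"></a>';
console.log(stripPaths(input));
// <a href="keepthis"><img src="keep.this"></a>
```

A value that ends in a slash (for example "path/") comes out empty with this approach, since pop() returns the empty string after the final delimiter.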


I would suggest running separate regex replacements, one for a links and another for img tags. That is easier and clearer, and therefore more maintainable.


This seems to work in case anyone else has the problem:

var t = s.replace(/(src|href)=('|")([^ \2]*\/)*\/?([^ \2]*)\2/gi,"$1=$2$4$2");
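
One thing to be aware of: inside a character class, \2 is not a backreference to the quote, so [^ \2] does not actually exclude the closing quote. A variation that names both quote characters explicitly might look like the sketch below. The names pathPattern and stripPathsExplicit are made up for illustration, and it assumes the attribute value contains neither kind of quote.

```javascript
// Assumption: the attribute value contains neither ' nor ".
// [^"']*\/     consumes the path up to and including the last slash
// ([^"'\/]*)   captures the file name (no slashes, no quotes)
var pathPattern = /(src|href)=("|')[^"']*\/([^"'\/]*)\2/gi;

// Hypothetical helper wrapping the replacement.
function stripPathsExplicit(html) {
    return html.replace(pathPattern, '$1=$2$3$2');
}

console.log(stripPathsExplicit('<a href="one/keepthis"><img src="/one/two/keep.this"></a>'));
// <a href="keepthis"><img src="keep.this"></a>

// A value without any slash is left alone, and the match cannot
// run past its closing quote into the next attribute:
console.log(stripPathsExplicit('<a href="x"><img src="/a/b"></a>'));
// <a href="x"><img src="b"></a>
```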


Try adding ? to make the * quantifiers non-greedy. You want them to stop matching when they encounter the ending quote character. The greedy versions will barrel right on past the ending quote if there's another quote later in the string, finding the longest possible match; the non-greedy ones will find the shortest possible match.

/(src|href)=("|').*?\/([^/]*?)\2/gi

Also I changed the second .* to [^/]* to allow the first .* to still match the full path now that it's non-greedy.
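
Here is a small side-by-side on the test string from the question, assuming the same "$1=$2$3$2" replacement is used with both patterns:

```javascript
var s = '<a href="one/keepthis"><img src="/one/two/keep.this"></a>';

// Greedy: the first .* runs to the end of the string and backtracks
// only as far as it must, so the match swallows both attributes.
var greedy = s.replace(/(src|href)=("|').*\/(.*)\2/gi, '$1=$2$3$2');

// Non-greedy: each quantifier grows only until the rest of the
// pattern can match, so every attribute is handled on its own.
var lazy = s.replace(/(src|href)=("|').*?\/([^/]*?)\2/gi, '$1=$2$3$2');

console.log(greedy); // <a href="keep.this"></a>
console.log(lazy);   // <a href="keepthis"><img src="keep.this"></a>
```

Note that .*? can still cross a closing quote when an attribute value contains no slash at all, because it keeps expanding until it finds one. If that case can occur in your input, replacing .*? with a class that excludes the quote characters is safer.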
