Selecting image url from html using javascript regexp
I'd like to select the image source URL from HTML code using a JavaScript regexp. I'm using it to simplify embedding Picasa Web images in other websites. I'm rather new at this, and I constructed a regex using http://www.regular-expressions.info/javascriptexample.html, where it works like a charm, but it does not work in my own script. Can somebody point out the error in my script?
function addImage() {
var picasaDump=prompt("Copy the picasa link");
if (picasaDump!=null && picasaDump!="")
{
var regMatch = new RegExp("http:\/\/\S\.[jJ][pP][eE]?[gG]");
var imageLink = regMatch.exec(picasaDump);
if(imageLink == null) {
alert("Error, no images found");
} else if(imageLink.length > 1) {
alert("Error, multiple images found");
} else {
// further parsing...
}
}
}
EDIT: Some sample input
<a href="http://picasaweb.google.com/lh/photo/NHH78Y0WLPAAzIu0lzKlUA?feat=embedwebsite"><img src="http://lh3.ggpht.com/_ADW_3zOQhj8/TGgN4bXtfMI/AAAAAAAABCA/w6M-JKzNtBk/s144/DSC_2132.jpg" /></a>
Here is another SO thread that talks about the appropriate regular expression for this: Regex to check if valid URL that ends in .jpg, .png, or .gif
Regardless of the regular expression you use, a simple one-liner to test a string is:
(/{Regular_Expression}/gi).test({String_To_Test})
For example:
(/http:\/\/.+?\.(jpg|jpeg)/gi).test("http://www.google.com/image.jpg")
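One thing to watch in patterns like this is that | has the lowest precedence in a regular expression, so an ungrouped jpg|jpeg splits the whole pattern in two rather than just the extension. A minimal sketch of the difference (the variable names and the test strings are made up for illustration):

```javascript
// "|" splits the whole pattern, so this reads as
// (http://....jpg) OR (the text "jpeg" anywhere in the string).
var loose = /http:\/\/.+?\.jpg|jpeg/i;

// Grouping limits the alternation to the file extension.
var grouped = /http:\/\/.+?\.(?:jpg|jpeg)/i;

var notAUrl = "my jpeg collection";
var realUrl = "http://www.google.com/image.jpeg";

var looseOnText = loose.test(notAUrl);      // true: the bare "jpeg" branch matches
var groupedOnText = grouped.test(notAUrl);  // false: there is no "http://" prefix
var groupedOnUrl = grouped.test(realUrl);   // true

console.log(looseOnText, groupedOnText, groupedOnUrl);
```

The non-capturing group (?:...) is used here only to keep the match result free of extra groups; a plain (...) group works just as well for test().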
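In this line:
var regMatch = new RegExp("http:\/\/\S\.[jJ][pP][eE]?[gG]");
you're escaping characters for the string literal rather than for the regular expression: the string parser consumes the single backslashes before the pattern ever reaches the regex engine. Also, \S
matches only a single character, so it needs a + quantifier. It should be: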
var regMatch = new RegExp("http:\\/\\/\\S+\\.[jJ][pP][eE]?[gG]");
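To see the effect of the escaping, here is a small sketch that runs the original pattern, the double-escaped string version, and an equivalent regex literal against a shortened form of the sample input. The variable names are illustrative, and the literal with the i flag is just one possible way to write it, not the only correct form.

```javascript
var sample = '<img src="http://lh3.ggpht.com/_ADW_3zOQhj8/TGgN4bXtfMI/AAAAAAAABCA/w6M-JKzNtBk/s144/DSC_2132.jpg" />';

// Single backslashes are consumed by the string literal: "\S" becomes a plain "S"
// and "\." becomes ".", so this looks for "http://S" + any character + "jpg".
var broken = new RegExp("http:\/\/\S\.[jJ][pP][eE]?[gG]");

// Doubled backslashes survive the string literal and reach the regex engine.
var fromString = new RegExp("http:\\/\\/\\S+\\.[jJ][pP][eE]?[gG]");

// A regex literal needs no double escaping; the "i" flag replaces [jJ][pP]...
var fromLiteral = /http:\/\/\S+\.jpe?g/i;

var brokenMatch = broken.exec(sample);      // null
var stringMatch = fromString.exec(sample);  // array whose first element is the URL
var literalMatch = fromLiteral.exec(sample);

console.log(brokenMatch, stringMatch[0], literalMatch[0]);
```

Note that exec() returns an array holding the full match followed by any capture groups, so its length reflects the number of groups in the pattern, not the number of images found.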
You can try:
var regMatch = new RegExp("http:\\/\\/.+?\\.(jpg|jpeg)", "gi");
This would be best implemented with look-behind. However, JavaScript engines that predate ES2018 don't support look-behind, so there we have to mimic it by reversing the string and using a look-ahead.
String.prototype.reverse = function () {
return this.split('').reverse().join('');
};
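var input = '<a href="http://picasaweb.google.com/lh/photo/NHH78Y0WLPAAzIu0lzKlUA?feat=embedwebsite"><img src="http://lh3.ggpht.com/_ADW_3zOQhj8/TGgN4bXtfMI/AAAAAAAABCA/w6M-JKzNtBk/s144/DSC_2132.jpg" /></a>';
// The input is reversed, so every extension must be spelled backwards too ("fig", "gnp").
// match() returns null when nothing is found, hence the fallback to an empty array.
var matches = input.reverse().match(/(gepj|gpj|fig|gnp)\..+?\/\/:ptth(?=\"\=crs)/g) || [];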
This will return an array of reversed image URLs, so you'll have to re-reverse them.
for (var i = 0; i < matches.length; i++)
{
matches[i] = matches[i].reverse();
}
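Putting the pieces together, the reverse-and-look-ahead approach can be sketched as one self-contained function. This version uses a standalone helper instead of extending String.prototype; the names reverseString and findImageUrls are assumptions made for this example, and the extensions are written reversed ("fig", "gnp") because the pattern runs against the reversed input.

```javascript
// Hypothetical helper: reverses a string without touching String.prototype.
function reverseString(s) {
  return s.split('').reverse().join('');
}

// Hypothetical wrapper around the reverse / look-ahead technique.
function findImageUrls(html) {
  // Matches a reversed extension, a dot, the reversed URL body, and "//:ptth",
  // but only when followed by the reversed text of src=" (the look-ahead).
  var reversedPattern = /(gepj|gpj|fig|gnp)\..+?\/\/:ptth(?="=crs)/g;
  var reversedMatches = reverseString(html).match(reversedPattern) || [];
  // Each match is still backwards, so flip it again before returning.
  return reversedMatches.map(reverseString);
}

var html = '<a href="http://picasaweb.google.com/lh/photo/NHH78Y0WLPAAzIu0lzKlUA?feat=embedwebsite"><img src="http://lh3.ggpht.com/_ADW_3zOQhj8/TGgN4bXtfMI/AAAAAAAABCA/w6M-JKzNtBk/s144/DSC_2132.jpg" /></a>';
var urls = findImageUrls(html);
console.log(urls);
```

The href URL is skipped because, in the reversed string, it is followed by the reversed text of href=" rather than src=".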
If you know the format of your image links, you may be able to specify more of a look-ahead, like so:
var matches = input.reverse().match(/(gepj|gpj|fig|gnp)\..+?\/\/:ptth(?=\"\=crs gmi)/g) || [];
This will match only if img is immediately followed by src, separated by a single space.
Look-behind mimicking taken from Steven Levithan
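For comparison, here is a sketch of two alternatives that avoid reversing the string: a native look-behind, which is part of the language from ES2018 onward, and a capture group, which also works in older engines. This is not taken from the answers above; the variable names are illustrative. In an engine without look-behind support the first pattern is a syntax error, so only the capture-group version applies there.

```javascript
var input = '<a href="http://picasaweb.google.com/lh/photo/NHH78Y0WLPAAzIu0lzKlUA?feat=embedwebsite"><img src="http://lh3.ggpht.com/_ADW_3zOQhj8/TGgN4bXtfMI/AAAAAAAABCA/w6M-JKzNtBk/s144/DSC_2132.jpg" /></a>';

// Native look-behind (ES2018+): the URL must be directly preceded by src=".
var withLookBehind = input.match(/(?<=src=")http:\/\/[^"]+?\.(?:jpe?g|gif|png)/gi) || [];

// Capture-group alternative: match src=" as well, but keep only group 1.
var capturePattern = /src="(http:\/\/[^"]+?\.(?:jpe?g|gif|png))/gi;
var withCapture = [];
var m;
while ((m = capturePattern.exec(input)) !== null) {
  withCapture.push(m[1]);
}

console.log(withLookBehind, withCapture);
```

Both arrays hold every image URL found, so checking their length is a direct way to detect the "no images" and "multiple images" cases from the question.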