Java regular expression to get the img src
I am trying to fetch a piece of data from an HTML page: an image link. The page content is different every time, so I think the only way is to use a regular expression. There is only one match on the page, and it always has the following form:
<img src="imglink" alt="texttext textex" style="border:1px solid #FFFFFF"/>
What I am using to get the imglink:
"<img src=\"(.*)\""
Is there something I don't know about using regular expressions? It must be easy as pie, but it gets me all the text after < and before />.
Try to use the non-greedy version
"<img src=\"(.*?)\""
in order to match as few characters as possible.
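To see the difference in practice, here is a minimal sketch that runs both patterns against the tag from the question. The class name LazyImgSrc and the helper firstGroup are made up for this illustration; only java.util.regex from the standard library is used.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class LazyImgSrc {
    // Greedy: .* grabs as much as it can, then backtracks to the LAST quote on the line.
    static final Pattern GREEDY = Pattern.compile("<img src=\"(.*)\"");
    // Reluctant: .*? grows one character at a time and stops at the FIRST closing quote.
    static final Pattern LAZY = Pattern.compile("<img src=\"(.*?)\"");

    /** Returns capture group 1 of the first match, or null if nothing matches. */
    static String firstGroup(Pattern pattern, String html) {
        Matcher m = pattern.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String html = "<img src=\"imglink\" alt=\"texttext textex\" style=\"border:1px solid #FFFFFF\"/>";
        // prints: imglink" alt="texttext textex" style="border:1px solid #FFFFFF
        System.out.println(firstGroup(GREEDY, html));
        // prints: imglink
        System.out.println(firstGroup(LAZY, html));
    }
}
```

The greedy version runs to the last quote in the tag, which is why the original attempt returned far more than the link.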
Please note: only use regular expressions to handle HTML or XML if the text has a simple, known structure. For arbitrary HTML/XML, do not use regex.
As a rule of thumb, when trying to select the characters between two separators, I make it a point to use "anything except the next expected separator char" in the selection clause instead of ".".
So in this case:
"<img src=\"([^\"]*)\""
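A short sketch of how that pattern could be used, assuming the page has already been read into a String. The class name NegatedClassImgSrc and the method extractSrc are hypothetical names chosen for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class NegatedClassImgSrc {
    // [^"]* can never cross a double quote, so the capture ends at the
    // first closing quote without relying on a reluctant quantifier.
    private static final Pattern IMG_SRC = Pattern.compile("<img src=\"([^\"]*)\"");

    /** Returns the src value of the first matching img tag, or null if there is none. */
    static String extractSrc(String html) {
        Matcher m = IMG_SRC.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String html = "<p>intro</p><img src=\"imglink\" alt=\"texttext textex\" style=\"border:1px solid #FFFFFF\"/>";
        System.out.println(extractSrc(html)); // prints: imglink
    }
}
```

Like the other patterns in this thread, this only matches a tag that is written exactly as <img src="..."> with src as the first attribute and double quotes around the value.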