Question about parsing an HTML string:
<a href="http://www.google.com/map" class="more-link">look at the Google map</a>
Is there a parser that can get the link (www.google.com/map) from the <a> tag,
or is the best way just to write a custom one?
With jQuery, for instance:
var href = $('a.more-link').attr('href');
There are many third-party solutions, but I am not sure which of them exist for Java; maybe the HTML Agility Pack exists in a version for Java.
Another solution would be to use a regex:
/<a\s+[^<]*?href\s*=\s*(?:(['"])(.+?)\1.*?|(.+?))>/
Edit: fixed the regex to handle the problems pointed out in the comments.
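Since the question is about Java, here is a minimal sketch of how the regex above might be used with java.util.regex. The class name HrefExtractor and the method firstHref are made up for illustration, and the case-insensitive flag is my own assumption, not part of the original pattern. Group 2 captures a quoted attribute value and group 3 an unquoted one.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is illustrative only.
class HrefExtractor {
    // The regex from the answer, escaped as a Java string literal.
    // CASE_INSENSITIVE is an added assumption so that <A HREF=...> also matches.
    private static final Pattern ANCHOR_HREF = Pattern.compile(
            "<a\\s+[^<]*?href\\s*=\\s*(?:(['\"])(.+?)\\1.*?|(.+?))>",
            Pattern.CASE_INSENSITIVE);

    // Returns the href of the first <a> tag found in html, or null if there is none.
    static String firstHref(String html) {
        Matcher m = ANCHOR_HREF.matcher(html);
        if (!m.find()) {
            return null;
        }
        // Group 2 is set for a quoted value, group 3 for an unquoted one.
        return m.group(2) != null ? m.group(2) : m.group(3);
    }

    public static void main(String[] args) {
        String html = "<a href=\"http://www.google.com/map\" class=\"more-link\">"
                + "look at the Google map</a>";
        System.out.println(firstHref(html)); // prints http://www.google.com/map
    }
}
```

Note that for an unquoted href followed by other attributes (for example <a href=foo class=x>), group 3 would capture everything up to the closing >, so this is probably only good enough for simple, well-formed input.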
I also looked up some real HTML parsers for Java, in case you find you need more than the regex approach:
http://htmlparser.sourceforge.net/
http://jericho.htmlparser.net/docs/index.html
http://jsoup.org/