Problem finding a URL within HTML with regex?
for (String line; (line = reader.readLine()) != null;) { // reads the HTML page line by line
    Pattern p = Pattern.compile("https://secure\\.runescape\\.com/m=displaynames/s=[a-zA-Z1-9*]+/check_name\\.ws\\?displayname=");
    Matcher m = p.matcher(line);
    if (m.find()) {
        System.out.println(m.group(0));
    }
}
The string in the page looks like: callback_request("https://secure.runescape.com/m=displaynames/s=p2FAuYaMFDgzntbDei*324JUo*3ozJ7hR*h1KNlxc6kPaBeKCBrdKH5kzljYSfUa/check_name.ws?displayname=" + escape(text), handleResult);
However, it's not returning any results. Am I doing something wrong? Apologies for the noobish question; I'm still learning Java.
Going by your regular expression, you are missing a ? in the test expression.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Regex {
    public static void main(String[] args) {
        // A trailing "?" after "displayname=" is made optional by the final (\\?)? group.
        Pattern p = Pattern.compile("https://secure\\.runescape\\.com/m=displaynames/.*/check_name\\.ws\\?displayname=(\\?)?");
        Matcher m = p.matcher("callback_request(\"https://secure.runescape.com/m=displaynames/s=p2FAuYaMFDgzntbDei*324JUo*3ozJ7hR*h1KNlxc6kPaBeKCBrdKH5kzljYSfUa/check_name.ws?displayname=\" + escape(text), handleResult);");
        if (m.find()) {
            System.out.println(m.group(0));
        }
    }
}
I suppose that in displayname=? the trailing ? comes from escape(text), so if you make that ? optional, the match should work. Note that the pattern above also replaces the s=[a-zA-Z1-9*]+ segment with .*, which accepts any token between the two slashes. Check the above code for more detail.
>>Output: https://secure.runescape.com/m=displaynames/s=p2FAuYaMFDgzntbDei*324JUo*3ozJ7hR*h1KNlxc6kPaBeKCBrdKH5kzljYSfUa/check_name.ws?displayname=
It looks like your regex is being matched on one line at a time. Are you sure that the URL you are searching for will always be on one line?
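If the URL can be split across lines, one option is to join the page into a single string before matching. The following is a minimal sketch under that assumption, not a definitive fix: the class name WholePageMatch, the helper findUrl, and the sample page text are hypothetical, and joining lines without a separator assumes the line break carries no meaningful character.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class WholePageMatch {
    // Same URL pattern as in the question, compiled once instead of once per line.
    private static final Pattern URL = Pattern.compile(
        "https://secure\\.runescape\\.com/m=displaynames/s=[a-zA-Z1-9*]+/check_name\\.ws\\?displayname=");

    // Hypothetical helper: joins every line of the reader without separators,
    // then searches the joined text for the first occurrence of the URL.
    static Optional<String> findUrl(BufferedReader reader) {
        String page = reader.lines().collect(Collectors.joining());
        Matcher m = URL.matcher(page);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        // Made-up page text in which the URL is broken across two lines.
        String html = "callback_request(\"https://secure.runescape.com/m=displaynames/s=abc*12\n"
            + "XYZ/check_name.ws?displayname=\" + escape(text), handleResult);";
        BufferedReader reader = new BufferedReader(new StringReader(html));
        System.out.println(findUrl(reader).orElse("no match"));
    }
}
```

Matching line by line would miss the URL in this sample, because neither line contains the whole pattern on its own.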
You could use an online regex tester for debugging. A better expression is probably https://secure\.runescape\.com/m=displaynames/s=[a-zA-Z1-9*]+/check_name\.ws\?displayname=
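One detail worth checking in that expression is the character class [a-zA-Z1-9*], which excludes the digit 0. The sample token in the question happens to contain no 0, so this is only a guess at the cause: the sketch below assumes that other session tokens may contain a 0. The class name TokenClass and the sample token are hypothetical.

```java
import java.util.regex.Pattern;

public class TokenClass {
    // Pattern from the question: the class 1-9 leaves out the digit 0.
    static final Pattern ORIGINAL = Pattern.compile(
        "https://secure\\.runescape\\.com/m=displaynames/s=[a-zA-Z1-9*]+/check_name\\.ws\\?displayname=");

    // Widened pattern: 0-9 accepts every digit.
    static final Pattern WIDENED = Pattern.compile(
        "https://secure\\.runescape\\.com/m=displaynames/s=[a-zA-Z0-9*]+/check_name\\.ws\\?displayname=");

    // Returns true if the pattern occurs anywhere in the text.
    static boolean found(Pattern p, String text) {
        return p.matcher(text).find();
    }

    public static void main(String[] args) {
        // Made-up URL whose token contains a 0.
        String url = "https://secure.runescape.com/m=displaynames/s=ab0cd*9/check_name.ws?displayname=";
        System.out.println("original: " + found(ORIGINAL, url));
        System.out.println("widened:  " + found(WIDENED, url));
    }
}
```

With a token like this, the original class stops at the 0 and the rest of the pattern can no longer match, while the widened class matches the whole URL.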