
Problem finding a URL within HTML with regex?

            // Compile the pattern once, outside the loop, instead of on every line.
            Pattern p = Pattern.compile("https://secure\\.runescape\\.com/m=displaynames/s=[a-zA-Z1-9*]+/check_name\\.ws\\?displayname=");
            for (String line; (line = reader.readLine()) != null;) { // reads the HTML page line by line
                Matcher m = p.matcher(line);
                if (m.find()) {
                    System.out.println(m.group(0));
                }
            }

The string in the page looks like: callback_request("https://secure.runescape.com/m=displaynames/s=p2FAuYaMFDgzntbDei*324JUo*3ozJ7hR*h1KNlxc6kPaBeKCBrdKH5kzljYSfUa/check_name.ws?displayname=" + escape(text), handleResult);

However, it's not returning any results. Am I doing something wrong? Apologies for the noobish question, I'm still learning Java.


Judging by your regular expression, the test string is missing a ? that the expression expects.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Regex {
    public static void main(String[] args) 
    {
        Pattern p = Pattern.compile("https://secure\\.runescape\\.com/m=displaynames/.*/check_name\\.ws\\?displayname=(\\?)?");
        Matcher m = p.matcher("callback_request(\"https://secure.runescape.com/m=displaynames/s=p2FAuYaMFDgzntbDei*324JUo*3ozJ7hR*h1KNlxc6kPaBeKCBrdKH5kzljYSfUa/check_name.ws?displayname=\" + escape(text), handleResult);");
        if(m.find())
        {
            System.out.println(m.group(0));
        }
    }
}

I suppose the trailing ? in displayname=? comes from escape(text). If you make that ? optional, the expression matches. See the code above for details.

>>Output: https://secure.runescape.com/m=displaynames/s=p2FAuYaMFDgzntbDei*324JUo*3ozJ7hR*h1KNlxc6kPaBeKCBrdKH5kzljYSfUa/check_name.ws?displayname=


It looks like your regex is being matched on one line at a time. Are you sure that the URL you are searching for will always be on one line?
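One way to rule this out is to match against the whole page instead of each line. The following is only a sketch: the class name WholePageSearch and the method findUrls are made up for illustration, and the short token in the demo is hypothetical. It joins the lines without a separator, which assumes that any line break inside the URL is not preceded or followed by extra whitespace.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class WholePageSearch {
    // Same pattern as in the question, compiled once.
    private static final Pattern URL = Pattern.compile(
        "https://secure\\.runescape\\.com/m=displaynames/s=[a-zA-Z1-9*]+/check_name\\.ws\\?displayname=");

    // Reads all remaining lines, joins them without a separator,
    // and returns every match found in the joined text.
    static List<String> findUrls(BufferedReader reader) {
        String page = reader.lines().collect(Collectors.joining());
        List<String> urls = new ArrayList<>();
        Matcher m = URL.matcher(page);
        while (m.find()) {
            urls.add(m.group());
        }
        return urls;
    }

    public static void main(String[] args) {
        // Hypothetical page fragment in which the URL is broken across two lines.
        String html = "callback_request(\"https://secure.runescape.com/m=displaynames/s=abc\n"
                    + "*123/check_name.ws?displayname=\" + escape(text), handleResult);";
        for (String url : findUrls(new BufferedReader(new StringReader(html)))) {
            System.out.println(url);
        }
    }
}
```

If this finds the URL while the line-by-line loop does not, the URL spans more than one line in the page source.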


You could use an online regex tester for debugging. A better expression is probably https://secure\.runescape\.com/m=displaynames/s=[a-zA-Z1-9*]+/check_name\.ws\?displayname=
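The expression can also be checked locally with a few lines of Java. One detail worth testing is that the character class [a-zA-Z1-9*] does not include the digit 0, so a session token containing 0 would not match. The sketch below compares the original class with a variant using 0-9. The names PatternCheck, ORIGINAL, WITH_ZERO and firstMatch are made up for illustration, and the token in withZero is hypothetical; whether real tokens ever contain 0 is an assumption.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PatternCheck {
    // Pattern from the question: digits 1-9 only.
    static final Pattern ORIGINAL = Pattern.compile(
        "https://secure\\.runescape\\.com/m=displaynames/s=[a-zA-Z1-9*]+/check_name\\.ws\\?displayname=");

    // Variant that also accepts the digit 0 in the token.
    static final Pattern WITH_ZERO = Pattern.compile(
        "https://secure\\.runescape\\.com/m=displaynames/s=[a-zA-Z0-9*]+/check_name\\.ws\\?displayname=");

    // Returns the first match of p in input, or null if there is none.
    static String firstMatch(Pattern p, String input) {
        Matcher m = p.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Line quoted in the question.
        String fromQuestion = "callback_request(\"https://secure.runescape.com/m=displaynames/"
            + "s=p2FAuYaMFDgzntbDei*324JUo*3ozJ7hR*h1KNlxc6kPaBeKCBrdKH5kzljYSfUa"
            + "/check_name.ws?displayname=\" + escape(text), handleResult);";
        // Hypothetical line whose token contains the digit 0.
        String withZero = "callback_request(\"https://secure.runescape.com/m=displaynames/"
            + "s=a0b*1/check_name.ws?displayname=\" + escape(text), handleResult);";

        System.out.println("original,  question line: " + firstMatch(ORIGINAL, fromQuestion));
        System.out.println("original,  token with 0:  " + firstMatch(ORIGINAL, withZero));
        System.out.println("with zero, token with 0:  " + firstMatch(WITH_ZERO, withZero));
    }
}
```

If the original pattern matches the line quoted in the question but fails on the live page, comparing the live token against the character class is a reasonable next step.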
