Java regex, need help with escape characters

My HTML looks like:

<td class="price" valign="top"><font color= "blue">&nbsp;&nbsp;$&nbsp;      5.93&nbsp;</font></td>

I tried:

String result = "";
Pattern p = Pattern.compile("\"blue\">&nbsp;&nbsp;$&nbsp;(.*)&nbsp;</font></td>");

Matcher m = p.matcher(text);

if (m.find())
    result = m.group(1).trim();

Doesn't seem to be matching.

Am I missing an escape character?


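Unless it is escaped at the regex level, $ is an anchor: it matches at the end of the input (or, in MULTILINE mode, at the end of a line), not a literal dollar sign. To match a literal $ the regex needs \$, and because \ is also the escape character in a Java String literal, that single \ has to be written as two \ characters in the source. So ...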

... Pattern.compile("\"blue\">&nbsp;&nbsp;\\$&nbsp;(.*)&nbsp;</font></td>");

But the folks who commented that you shouldn't use regexes to parse HTML are absolutely right!! Unless you want chronically fragile code, your code should use a strict or non-strict HTML parser.
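
To see the difference the escaping makes, here is a minimal, self-contained sketch that runs both patterns against the sample row from the question. The class name PriceRegexDemo and the helper extract are made up for illustration; only Pattern and Matcher come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative, not from the question.
class PriceRegexDemo {
    // Sample row copied from the question.
    static final String HTML =
        "<td class=\"price\" valign=\"top\"><font color= \"blue\">"
        + "&nbsp;&nbsp;$&nbsp;      5.93&nbsp;</font></td>";

    // Unescaped $: treated as an end-of-input anchor, so the text after it can never match.
    static final String UNESCAPED = "\"blue\">&nbsp;&nbsp;$&nbsp;(.*)&nbsp;</font></td>";

    // Escaped $: \\$ in the Java source is \$ in the regex, i.e. a literal dollar sign.
    static final String ESCAPED = "\"blue\">&nbsp;&nbsp;\\$&nbsp;(.*)&nbsp;</font></td>";

    // Returns the trimmed contents of group 1, or "" when the regex does not match.
    static String extract(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group(1).trim() : "";
    }

    public static void main(String[] args) {
        System.out.println("unescaped: [" + extract(UNESCAPED, HTML) + "]");
        System.out.println("escaped:   [" + extract(ESCAPED, HTML) + "]");
    }
}
```

With the unescaped pattern the result stays empty; with the escaped one the captured group is the price with the surrounding spaces trimmed off. This only confirms the escaping fix; it does not make the regex approach any less fragile against changes in the markup.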


Maybe you need to escape the $ (with two backslashes, I think)?
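
If counting backslashes by hand feels error-prone, one alternative (not something the question used, just a possible sketch) is to let Pattern.quote escape the literal parts of the pattern. The class name QuotedPrefixDemo and the PREFIX/SUFFIX constants below are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative, not from the question.
class QuotedPrefixDemo {
    // Literal text around the price in the question's HTML; the $ is left unescaped here.
    static final String PREFIX = "\"blue\">&nbsp;&nbsp;$&nbsp;";
    static final String SUFFIX = "&nbsp;</font></td>";

    // Pattern.quote wraps each literal so that $ and other metacharacters match themselves.
    static final Pattern PRICE =
        Pattern.compile(Pattern.quote(PREFIX) + "(.*)" + Pattern.quote(SUFFIX));

    // Returns the trimmed price text, or "" when the row does not match.
    static String extract(String text) {
        Matcher m = PRICE.matcher(text);
        return m.find() ? m.group(1).trim() : "";
    }

    public static void main(String[] args) {
        String html = "<td class=\"price\" valign=\"top\"><font color= \"blue\">"
            + "&nbsp;&nbsp;$&nbsp;      5.93&nbsp;</font></td>";
        System.out.println(extract(html));
    }
}
```

The double quotes still need \" because that is String-literal escaping, but nothing has to be escaped at the regex level.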
