开发者

Escaping double-slashes with regular expressions in Java

I have this unit test:

public void testDeEscapeResponse() {
    final String[] inputs = new String[] {"peque\\\\u0f1o", "peque\\u0f1o"};
    final String[] expected = new String[] {"peque\\u0f1o", "peque\\u0f1o"};
    for (int i = 0;开发者_JAVA技巧 i < inputs.length; i++) {
        final String input = inputs[i];
        final String actual = QTIResultParser.deEscapeResponse(input);
        Assert.assertEquals(
            "deEscapeResponse did not work correctly", expected[i], actual);
    }
}

I have this method:

static String deEscapeResponse(String str) {
    return str.replaceAll("\\\\", "\\");
}

The unit test is failing with this error:

java.lang.StringIndexOutOfBoundsException: String index out of range: 1
    at java.lang.String.charAt(String.java:686)
    at java.util.regex.Matcher.appendReplacement(Matcher.java:703)
    at java.util.regex.Matcher.replaceAll(Matcher.java:813)
    at java.lang.String.replaceAll(String.java:2189)
    at com.acme.MyClass.deEscapeResponse
    at com.acme.MyClassTest.testDeEscapeResponse

Why?


Use String.replace which does a literal replacement instead of String.replaceAll which uses regular expressions.

Example:

"peque\\\\u0f1o".replace("\\\\", "\\")    //  gives  peque\u0f1o

String.replaceAll takes a regular expression thus \\\\ is interpreted as the expression \\ which in turn matches a single \. (The replacement string also has special treatment for \ so there's an error there too.)

To make String.replaceAll work as you expect here, you would need to do

"peque\\\\u0f1o".replaceAll("\\\\\\\\", "\\\\")


I think the problem is that you're using replaceAll() instead of replace(). replaceAll expects a regular expression in the first field and you're just trying to string match.


See javadoc for Matcher:

Note that backslashes (\) and dollar signs ($) in the replacement string may cause the results to be different than if it were being treated as a literal replacement string. Dollar signs may be treated as references to captured subsequences as described above, and backslashes are used to escape literal characters in the replacement string.

Thus with replaceAll you cannot replace anything with a backslash. Thus a really crazy workaround for your case would be str.replaceAll("\\\\(\\\\)", "$1")

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜