
Regex to find instances of a single \

I am looking to replace \n with \\n, but so far my regex attempts are not working. (Really it is any \ by itself; \n just happens to be the use case I have in the data.)

What I need is something along the lines of:

any-non-\ followed by \ followed by any-non-\

Ultimately I'll be passing the regex to java.lang.String.replaceAll so a regex formatted for that would be great, but I can probably translate another style regex into what I need.

For example, I am after this program printing "true"...

public class Main
{
    public static void main(String[] args)
    {
        final String original;
        final String altered;
        final String expected;

        original = "hello\nworld";
        expected = "hello\\nworld";
        altered  = original.replaceAll("([^\\\\])\\\\([^\\\\])", "$1\\\\$2");
        System.out.println(altered.equals(expected));
    }
}

Using this does work:

    altered  = original.replaceAll("\\n", "\\\\n");


The string should be

"[^\\\\]\\\\[^\\\\]"

You have to quadruple backslashes in a String constant that's meant for a regex; if you only doubled them, they would be escaped for the String but not for the regex.
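
The two layers of escaping can be checked directly. This is a minimal sketch; the class name BackslashLevels and its members are hypothetical, chosen only for illustration.

```java
import java.util.regex.Pattern;

public class BackslashLevels {
    // Java source "\\\\" -> a two-character string \\ -> a regex matching ONE literal backslash.
    static final String ONE_BACKSLASH_REGEX = "\\\\";

    // True only if the whole input is exactly one backslash.
    static boolean isSingleBackslash(String s) {
        return Pattern.matches(ONE_BACKSLASH_REGEX, s);
    }

    public static void main(String[] args) {
        System.out.println(ONE_BACKSLASH_REGEX.length());   // 2
        System.out.println(isSingleBackslash("\\"));        // true
        System.out.println(isSingleBackslash("\\\\"));      // false
    }
}
```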

So the actual code would be

myString = myString.replaceAll("([^\\\\])\\\\([^\\\\])", "$1\\\\\\\\$2");

Note that the replacement string is parsed by the regex engine as well, since it has to check for backreferences such as $1, and a backslash there escapes the character after it. A quadruple backslash in the replacement therefore comes out as a single backslash, so eight are needed to insert two.
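
How java.lang.String.replaceAll treats backslashes in the replacement can be seen by running a few variants side by side. This is a sketch that assumes the input contains a literal backslash rather than a newline; the class and method names are hypothetical.

```java
import java.util.regex.Matcher;

public class ReplacementEscaping {
    // Regex for a lone backslash between two non-backslash characters.
    static final String LONE = "([^\\\\])\\\\([^\\\\])";

    // Four backslashes in the literal -> ONE in the output, so nothing changes.
    static String withFour(String s) {
        return s.replaceAll(LONE, "$1\\\\$2");
    }

    // Eight backslashes in the literal -> TWO in the output.
    static String withEight(String s) {
        return s.replaceAll(LONE, "$1\\\\\\\\$2");
    }

    // Matcher.quoteReplacement does the replacement-level escaping for us.
    static String withQuote(String s) {
        return s.replaceAll(LONE, "$1" + Matcher.quoteReplacement("\\\\") + "$2");
    }

    public static void main(String[] args) {
        String input = "hello\\nworld";          // hello\nworld, with a literal backslash
        System.out.println(withFour(input));     // hello\nworld
        System.out.println(withEight(input));    // hello\\nworld
        System.out.println(withQuote(input));    // hello\\nworld
    }
}
```

Matcher.quoteReplacement is handy when the replacement text comes from elsewhere and should be inserted verbatim.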

Edit: The above was assuming that there was a literal \n in the input string, which is represented in a string literal as "\\n". Since it apparently has a newline instead (represented as "\n"), the correct substitution would be

myString = myString.replaceAll("\\n", "\\\\n");

This must be repeated for any other special characters (\t, \r, \0, \\, etc.). As above, the replacement string resembles the regex string but is not itself a regex.
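
Repeating the substitution for several characters might look like the sketch below. The helper name escape and the choice of characters (backslash, newline, tab, carriage return) are assumptions for illustration, not a complete list.

```java
public class EscapeControlChars {
    // Turns real control characters into their two-character escaped spelling.
    static String escape(String s) {
        return s.replaceAll("\\\\", "\\\\\\\\")   // backslash first, so the ones added below are not doubled
                .replaceAll("\\n", "\\\\n")       // newline -> backslash + n
                .replaceAll("\\t", "\\\\t")       // tab -> backslash + t
                .replaceAll("\\r", "\\\\r");      // carriage return -> backslash + r
    }

    public static void main(String[] args) {
        // Input: a, newline, b, tab, c, backslash, d
        System.out.println(escape("a\nb\tc\\d"));  // a\nb\tc\\d
    }
}
```

The order matters: if the backslash step ran last, it would double the backslashes inserted by the earlier steps.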


So whenever there is 1 backslash, you want 2, but if there are 2, 3, or 4... in a row, you want to leave them alone?

You want to replace

(?<=[^\\])\\(?!\\+)([^\\])

with

\\$1

That changes the string

hello\nworld and hello\\nworld and hello\\\nworld

into

hello\\nworld and hello\\nworld and hello\\\nworld
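
Translated for java.lang.String.replaceAll, the pattern and replacement above could be written as in this sketch. The class and method names are hypothetical, and the replacement uses eight backslashes in the literal because Java also treats a backslash as an escape character in the replacement string.

```java
public class LoneBackslash {
    // (?<=[^\\])\\(?!\\+)([^\\]) written as a Java string literal.
    static final String REGEX = "(?<=[^\\\\])\\\\(?!\\\\+)([^\\\\])";

    // Two literal backslashes followed by the captured character.
    static final String REPLACEMENT = "\\\\\\\\$1";

    static String fix(String s) {
        return s.replaceAll(REGEX, REPLACEMENT);
    }

    public static void main(String[] args) {
        System.out.println(fix("hello\\nworld"));       // hello\\nworld  (one backslash doubled)
        System.out.println(fix("hello\\\\nworld"));     // hello\\nworld  (two left alone)
        System.out.println(fix("hello\\\\\\nworld"));   // hello\\\nworld (three left alone)
    }
}
```

Because of the lookbehind, a backslash at the very start of the string is not matched by this pattern.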


I don't know exactly what you need it for, but you could have a look at StringEscapeUtils from Commons Lang. They have plenty of methods doing things like that, and if you don't find exactly what you're searching for, you could have a look at the source to find inspiration :)


What's wrong with using altered = original.replaceAll("\\n", "\\\\n"); ? That's exactly what I would have done.
