
Regular expression to match unescaped special characters only

I'm trying to come up with a regular expression that can match only characters not preceded by a special escape sequence in a string.

For instance, in the string Is ? stranded//? , I want to replace the ? that hasn't been escaped with another string, to get this result: **Is Dave stranded?**

But for the life of me I have not been able to figure out a way. I have only come up with regular expressions that eat all the replaceable characters.

How do you construct a regular expression that matches only characters not preceded by an escape sequence?


Use a negative lookbehind; that's what they were designed to do!

(?<!//)[?]

To break it down:

(
    ?<!    # Negative lookbehind: asserts that the text just before this position is NOT what follows.
    //     # The escape sequence (two slashes) you are trying to avoid.
)
[?]        # Your special character list.

The rest of the pattern is only attempted at positions that are not immediately preceded by //.

I think in Java the backslash needs to be escaped again inside the string literal, something like:

Pattern p = Pattern.compile("(?<!//)[\\?]");
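
A minimal sketch of that lookbehind in use, assuming the two-slash escape from the question. The class and method names (LookbehindDemo, replaceUnescaped) are made up for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookbehindDemo {
    // A '?' that is not immediately preceded by "//".
    private static final Pattern UNESCAPED = Pattern.compile("(?<!//)[?]");

    // Hypothetical helper: replaces each unescaped '?' and leaves "//?" untouched.
    static String replaceUnescaped(String input, String replacement) {
        // quoteReplacement keeps '$' and '\' in the replacement text literal.
        return UNESCAPED.matcher(input)
                .replaceAll(Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        System.out.println(replaceUnescaped("Is ? stranded//?", "Dave"));
        // prints: Is Dave stranded//?
    }
}
```

The escape sequence itself is still in the result; stripping "//" afterwards would be a separate step.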


Try this Java code:

// Requires: import java.util.regex.*;
String str = "Is ? stranded//?";
Pattern p = Pattern.compile("(?<!//)([?])");
Matcher m = p.matcher(str);
StringBuffer sb = new StringBuffer();
while (m.find()) {
    // Replace each unescaped '?' with "Dave".
    m.appendReplacement(sb, m.group(1).replace("?", "Dave"));
}
m.appendTail(sb);
// Strip the remaining "//" escape sequences.
String s = sb.toString().replace("//", "");
System.out.println("Output: " + s);

OUTPUT

Output: Is Dave stranded?


I was thinking about this and have a second, simpler solution that avoids regexes. The other answers are probably better, but I thought I might post it anyway.

String input = "Is ? stranded//?"; 
String output = input
    .replace("//?", "a717efbc-84a9-46bf-b1be-8a9fb714fce8")
    .replace("?", "Dave")
    .replace("a717efbc-84a9-46bf-b1be-8a9fb714fce8", "?");

Just protect the "//?" by replacing it with something unique (like a GUID). Then you know any remaining question marks are fair game.
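
The same idea can be sketched with a placeholder generated at run time instead of a hard-coded one. This assumes the generated UUID never occurs in the input or in the replacement text; the class and method names are hypothetical:

```java
import java.util.UUID;

public class PlaceholderDemo {
    // Hypothetical helper: replaces unescaped '?' without using a regex.
    static String replaceUnescaped(String input, String replacement) {
        // Fresh placeholder per call; assumed not to appear in the input.
        String placeholder = UUID.randomUUID().toString();
        return input
                .replace("//?", placeholder)   // protect the escaped question marks
                .replace("?", replacement)     // replace the ones that are left
                .replace(placeholder, "?");    // restore them, minus the escape
    }

    public static void main(String[] args) {
        System.out.println(replaceUnescaped("Is ? stranded//?", "Dave"));
        // prints: Is Dave stranded?
    }
}
```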


Use grouping. Here's one example:

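import java.util.regex.*;

class Test {
    public static void main(String[] args) {
        // Two non-slash characters followed by a literal '?'.
        Pattern p = Pattern.compile("([^/][^/])(\\?)");
        String s = "Is ? stranded//?";
        Matcher m = p.matcher(s);
        String result = s;
        // find() looks for a match anywhere; matches() would need the whole string to match.
        if (m.find())
            result = m.replaceAll("$1XXX").replace("//", "");
        System.out.println(s + " -> " + result);
    }
}

Output:

$ java Test
Is ? stranded//? -> Is XXX stranded?

In this example, I'm:

  • first replacing any non-escaped ? with "XXX",
  • then, removing the "//" escape sequences.

EDIT Use if (m.find()) to ensure that you handle non-matching strings properly. (m.matches() only succeeds when the entire string matches the pattern, so it would skip the replacement here.)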

This is just a quick-and-dirty example. You need to flesh it out, obviously, to make it more robust. But it gets the general idea across.


Match on a set of characters OTHER than an escape sequence, then a regex special character. You could use an inverted character class ([^/]) for the first bit. Special case an unescaped regex character at the front of the string.


String aString = "Is ? stranded//?";

String regex = "(?<!//)[^a-zA-Z\\s/]";
System.out.println(aString.replaceAll(regex, "Dave"));

The [^a-zA-Z\\s/] part of the regular expression matches any character that is not a letter, whitespace, or a forward slash. (The ^ negates the class only when it is the first character inside the brackets; anywhere else it is a literal caret.)

The (?<!//) part is a negative lookbehind; see the regex documentation for more info.

This gives the output Is Dave stranded//?


Try matching:

(^|(^.)|(.[^/])|([^/].))[special characters list]
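
A rough Java sketch of this lookbehind-free pattern, assuming ? is the only special character and flattening the nested groups into one capture group. The class and method names are hypothetical:

```java
import java.util.regex.Pattern;

public class PrefixMatchDemo {
    // Group 1 is whatever precedes the '?': nothing at the start of the string,
    // one character at the start, or two characters that are not both slashes.
    private static final Pattern P = Pattern.compile("(^|^.|.[^/]|[^/].)[?]");

    // Hypothetical helper; the name argument must not contain '$' or '\'.
    static String replaceUnescaped(String input, String name) {
        // Put the consumed prefix back with $1, then append the replacement.
        return P.matcher(input).replaceAll("$1" + name);
    }

    public static void main(String[] args) {
        System.out.println(replaceUnescaped("Is ? stranded//?", "Dave"));
        // prints: Is Dave stranded//?
    }
}
```

Because the prefix is consumed as part of the match, two adjacent special characters such as "??" only get the first one replaced; a lookbehind does not have that limitation.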


I used this one:

((?:^|[^\\])(?:\\\\)*[ESCAPABLE CHARACTERS HERE])

Demo: https://regex101.com/r/zH1zO3/4
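
Note that this pattern targets a backslash escape rather than the // from the question: a character counts as unescaped when it follows an even number of backslashes. A small sketch of how it might look in Java, assuming ? is the only escapable character (class and method names are hypothetical):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BackslashEscapeDemo {
    // Regex: ((?:^|[^\\])(?:\\\\)*[?])
    // Each regex backslash is doubled again for the Java string literal.
    private static final Pattern P =
            Pattern.compile("((?:^|[^\\\\])(?:\\\\\\\\)*[?])");

    // Hypothetical helper: returns every match, including the preceding
    // character and any pairs of backslashes that the pattern consumes.
    static List<String> findUnescaped(String input) {
        List<String> hits = new ArrayList<>();
        Matcher m = P.matcher(input);
        while (m.find()) {
            hits.add(m.group(1));
        }
        return hits;
    }

    public static void main(String[] args) {
        // The literal below is the text: a?b\?c\\?
        for (String hit : findUnescaped("a?b\\?c\\\\?")) {
            System.out.println(hit);
        }
    }
}
```

Since each match includes the character before the ? and any backslash pairs, a replacement would have to put those back.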
