Extract data and format them using RegEx
I have three strings that I need to combine.
I have an input string (string 1) and a regex with capture groups (string 2). I need to run the regex on the input, extract the groups, and insert them into a template (string 3) using backreferences.
A short example:
input: "foo1234bar5678"
regex: ".*?(\\d*).*?(\\d*).*"
template: "answer: $1 $2"
which should expand to "answer: 1234 5678".
I have been using java.util.regex.Pattern, but I can't figure out a way to do this with a Matcher. replaceAll does not give the expected behaviour, and neither do the append* methods (appendReplacement, appendTail).
Is there a way to do this nicely using the Android API?
EDIT: Here is a basic implementation:
public static String genOutput(String regex, String input, String template) {
    Pattern p = Pattern.compile(regex);
    Matcher m = p.matcher(input);
    if (m.find()) {
        for (int i = 1; i <= m.groupCount(); i++) {
            template = template.replaceAll("\\$" + i, m.group(i));
        }
    }
    return template;
}
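The basic implementation above has two rough edges: replaceAll treats `$` and `\` in the captured text as special characters, and `$1` is also a prefix of `$10`. Below is a minimal sketch of one way around both, assuming that a literal String.replace and a descending loop are acceptable. The class name TemplateFill is hypothetical, and the regex in main is just one pattern that captures both numbers from the example input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the JDK or the Android API.
class TemplateFill {
    static String genOutput(String regex, String input, String template) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return template; // no match: leave the template untouched
        }
        // Highest index first, so "$1" is not consumed as a prefix of "$10".
        for (int i = m.groupCount(); i >= 1; i--) {
            String value = m.group(i); // null when the group did not participate
            // String.replace is literal: no regex or replacement-string escaping.
            template = template.replace("$" + i, value == null ? "" : value);
        }
        return template;
    }

    public static void main(String[] args) {
        // prints "answer: 1234 5678"
        System.out.println(genOutput("\\D*(\\d+)\\D*(\\d+).*", "foo1234bar5678", "answer: $1 $2"));
    }
}
```

One limitation remains: if a captured value itself contains something like `$1`, a later iteration will expand it again. A single pass over the template avoids that.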
Here is how I would do it:
Pattern p = Pattern.compile("\\D*(\\d+)\\D*(\\d+)\\D*");
Matcher m = p.matcher(input);
if (m.find()) {
    String result = "answer: ";
    // Groups are numbered from 1 to groupCount() inclusive.
    for (int i = 1; i <= m.groupCount(); i++) {
        result += m.group(i) + " ";
    }
    System.out.println(result);
} else {
    System.out.println("Input did not match");
}
This matches your string and then appends the two captured groups to the result string.
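If the number of digit runs is not known in advance, a fixed set of groups does not scale, because a repeated capture group only keeps its last iteration. Here is a rough sketch of an alternative that loops over find() with the pattern \d+. The class and method names are made up for illustration, and String.join assumes Java 8 or a recent Android API level.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only.
class DigitRuns {
    // Collects every maximal run of digits in the input, in order of appearance.
    static List<String> digitRuns(String input) {
        List<String> runs = new ArrayList<>();
        Matcher m = Pattern.compile("\\d+").matcher(input);
        while (m.find()) {
            runs.add(m.group()); // group() with no argument is the whole match
        }
        return runs;
    }

    public static void main(String[] args) {
        // prints "answer: 1234 5678"
        System.out.println("answer: " + String.join(" ", digitRuns("foo1234bar5678")));
    }
}
```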
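\D*(\d+)\D*(\d+).*
Your problem is that every quantifier in your regex is allowed to match nothing, so each group is free to capture an empty string.
The above regex skips the non-digits explicitly and requires at least one digit per group, so it will do what you want.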
Digging through Android's libcore, I found a private method, appendEvaluated, in java.util.regex.Matcher that does the job, so I copied it into my code.
Here it is:
private void appendEvaluated(StringBuffer buffer, String s) {
    boolean escape = false;
    boolean dollar = false;
    for (int i = 0; i < s.length(); i++) {
        char c = s.charAt(i);
        if (c == '\\' && !escape) {
            escape = true;
        } else if (c == '$' && !escape) {
            dollar = true;
        } else if (c >= '0' && c <= '9' && dollar) {
            buffer.append(group(c - '0'));
            dollar = false;
        } else {
            buffer.append(c);
            dollar = false;
            escape = false;
        }
    }
    // This seemingly stupid piece of code reproduces a JDK bug.
    if (escape) {
        throw new ArrayIndexOutOfBoundsException(s.length());
    }
}
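The method above calls group() directly, so it only compiles inside Matcher itself. Here is a sketch of how it might be adapted to stand alone, assuming the Matcher is passed in after a successful find() or matches(). The class name TemplateExpander and the method expand are hypothetical. Unlike the original, this version appends an empty string for a group that did not participate, and throws IllegalArgumentException on a trailing backslash.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical standalone adaptation; not an Android or JDK API.
class TemplateExpander {
    // Expands $0..$9 in the template using the groups of an already-matched Matcher.
    // A backslash makes the next character literal, e.g. "\$" yields "$".
    static String expand(Matcher m, String template) {
        StringBuilder out = new StringBuilder();
        boolean escape = false;
        boolean dollar = false;
        for (int i = 0; i < template.length(); i++) {
            char c = template.charAt(i);
            if (c == '\\' && !escape) {
                escape = true;
            } else if (c == '$' && !escape) {
                dollar = true;
            } else if (c >= '0' && c <= '9' && dollar) {
                String g = m.group(c - '0'); // throws if the group does not exist
                out.append(g == null ? "" : g);
                dollar = false;
            } else {
                out.append(c);
                dollar = false;
                escape = false;
            }
        }
        if (escape) {
            throw new IllegalArgumentException("template ends with a lone backslash");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Matcher m = Pattern.compile("\\D*(\\d+)\\D*(\\d+).*").matcher("foo1234bar5678");
        if (m.find()) {
            System.out.println(expand(m, "answer: $1 $2")); // prints "answer: 1234 5678"
        }
    }
}
```

Because the template is scanned once, text coming from a captured group is never re-interpreted as a backreference. Only single-digit group references are recognised, as in the original.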
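The following modification will help you too:
.*?(\d+).*?(\d+).*
The question mark makes a quantifier lazy: it matches as few characters as possible. Without it the quantifier is greedy and matches as many as possible, so the first .* consumes as much of the string as it can. Note also \d+ instead of \d*: since \d* may match zero digits, a lazy .*? in front of it stops immediately and the group captures an empty string.
And do not forget that backslashes must be doubled in Java source: one backslash for the Java string literal, the next one for the regex, i.e. Pattern.compile(".*?(\\d+).*?(\\d+).*");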
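To see how greedy and lazy quantifiers behave on the example input, here is a small sketch that prints what the two groups capture for a few variants of the pattern. The class LazyQuantifierDemo and its groups helper are made up for this illustration. It also shows that with \d* the groups may capture nothing at all, which is why the \d+ variants are included.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class for illustration only.
class LazyQuantifierDemo {
    // Returns "group1|group2" for the first match, or null when nothing matches.
    static String groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) + "|" + m.group(2) : null;
    }

    public static void main(String[] args) {
        String input = "foo1234bar5678";
        // Lazy .*? followed by \d*: everything matches empty at position 0.
        System.out.println(groups(".*?(\\d*).*?(\\d*).*", input)); // prints "|"
        // Lazy .*? followed by \d+: the groups must find real digits.
        System.out.println(groups(".*?(\\d+).*?(\\d+).*", input)); // prints "1234|5678"
        // Greedy .* with \d+: the first .* leaves only one digit per group.
        System.out.println(groups(".*(\\d+).*(\\d+).*", input));   // prints "7|8"
    }
}
```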