Regex for circular replacement
How would you use regex to write functions to do the following:
- Replace lowercase 'a' with uppercase and vice versa
- If easily extensible, do this for all letters
- Where words are separated by whitespace and > and < are special markers on some words, replace >word with word< and vice versa.
  - If it helps, you can restrict input such that all words must be marked one way or the other.
- Replace postincrement (i++;) with preincrement (++i;) and vice versa. Variable names are [a-z]+. Input can now be assumed to be restricted to a bunch of these statements. Bonus: also do decrement.
Also interested in solutions in other flavors.
Note: this is NOT a homework question. See also my previous explorations of regex:
- Regex split into overlapping strings (Alan Moore's answer is especially instructive)
- Can you use zero-width matching regex in String split? (my solution exploits a known Java regex bug with regards to non-obvious length lookbehind!)
As you no doubt have gathered, the only sensible way to do this kind of thing is to make all the replacements in one pass, generating the replacement strings dynamically based on what was matched.
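To see why a single pass matters, consider swapping 'a' and 'A' with two sequential replacements: the second call undoes the first. Below is a minimal Python sketch of the one-pass alternative. The helper name `circular_replace` and the sample mapping are illustrative assumptions, not part of the question.

```python
import re

def circular_replace(text, mapping):
    """Replace every key of `mapping` with its value in a single pass."""
    # Longest keys first so that a longer key wins over one of its prefixes.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    # The callback looks up the replacement for whatever was matched.
    return pattern.sub(lambda m: mapping[m.group()], text)

swap = {"a": "A", "A": "a"}

# Two sequential passes: the second pass undoes the first.
naive = "banana A".replace("a", "A").replace("A", "a")

# One pass: each character is rewritten at most once.
swapped = circular_replace("banana A", swap)

print(naive)    # banana a
print(swapped)  # bAnAnA a
```

Because the replacement is computed from the matched text, already-rewritten characters are never matched again.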
Java seems to be unique among today's major languages in not providing a convenient way to do that, but it can be done. You just have to use the lower-level API provided by the Matcher class. Here's a demonstration, based on Elliott Hughes's definitive Rewriter class:
import java.util.regex.*;

/**
 * A Rewriter does a global substitution in the strings passed to its
 * 'rewrite' method. It uses the pattern supplied to its constructor, and is
 * like 'String.replaceAll' except for the fact that its replacement strings
 * are generated by invoking a method you write, rather than from another
 * string. This class is supposed to be equivalent to Ruby's 'gsub' when
 * given a block. This is the nicest syntax I've managed to come up with in
 * Java so far. It's not too bad, and might actually be preferable if you
 * want to do the same rewriting to a number of strings in the same method
 * or class. See the example 'main' for a sample of how to use this class.
 *
 * @author Elliott Hughes
 */
public abstract class Rewriter
{
    private Pattern pattern;
    private Matcher matcher;

    /**
     * Constructs a rewriter using the given regular expression; the syntax is
     * the same as for 'Pattern.compile'.
     */
    public Rewriter(String regex)
    {
        this.pattern = Pattern.compile(regex);
    }

    /**
     * Returns the input subsequence captured by the given group during the
     * previous match operation.
     */
    public String group(int i)
    {
        return matcher.group(i);
    }

    /**
     * Overridden to compute a replacement for each match. Use the method
     * 'group' to access the captured groups.
     */
    public abstract String replacement();

    /**
     * Returns the result of rewriting 'original' by invoking the method
     * 'replacement' for each match of the regular expression supplied to the
     * constructor.
     */
    public String rewrite(CharSequence original)
    {
        this.matcher = pattern.matcher(original);
        StringBuffer result = new StringBuffer(original.length());
        while (matcher.find())
        {
            matcher.appendReplacement(result, "");
            result.append(replacement());
        }
        matcher.appendTail(result);
        return result.toString();
    }

    public static void main(String... args) throws Exception
    {
        String str = ">Foo baR<";

        // anonymous subclass example:
        Rewriter caseSwapper = new Rewriter("[A-Za-z]")
        {
            public String replacement()
            {
                char ch0 = group(0).charAt(0);
                char ch1 = Character.isUpperCase(ch0) ?
                           Character.toLowerCase(ch0) :
                           Character.toUpperCase(ch0);
                return String.valueOf(ch1);
            }
        };
        System.out.println(caseSwapper.rewrite(str));

        // inline subclass example:
        System.out.println(new Rewriter(">(\\w+)|(\\w+)<")
        {
            public String replacement()
            {
                return group(1) != null ? group(1) + "<"
                                        : ">" + group(2);
            }
        }.rewrite(str));
    }
}
The best way to do this is to use a regex for matching and a callback for replacing, e.g. in Python:
import re

# First example
s = 'abcDFE'
print(re.sub(r'\w', lambda x: x.group().lower()
                    if x.group().isupper()
                    else x.group().upper(), s))
# OUTPUT: ABCdfe

# Second example
s = '<abc dfe> <ghe <auo pio>'
def switch(match):
    match = match.group()
    if match[0] == '<':
        return match[1:] + '>'
    else:
        return '<' + match[:-1]
print(re.sub(r'<\w+|\w+>', switch, s))
# OUTPUT: abc> <dfe ghe> auo> <pio
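The same callback approach appears to extend to the third task from the question (swapping postincrement and preincrement, plus decrement). The following is a sketch only: the names `STATEMENT`, `swap_increment`, and `swap_all` are made up for illustration, and it assumes variable names are [a-z]+ as the question states.

```python
import re

# Either an operator followed by a name (pre-form) or a name followed by an
# operator (post-form). Variable names are [a-z]+, as stated in the question.
STATEMENT = re.compile(r"(\+\+|--)([a-z]+)|([a-z]+)(\+\+|--)")

def swap_increment(match):
    pre_op, pre_name, post_name, post_op = match.groups()
    if pre_op is not None:
        # ++i becomes i++ (and --i becomes i--)
        return pre_name + pre_op
    # i++ becomes ++i (and i-- becomes --i)
    return post_op + post_name

def swap_all(source):
    return STATEMENT.sub(swap_increment, source)

print(swap_all("i++; ++j; k--; --m;"))
# ++i; j++; --k; m--;
```

Statements without an operator, such as a bare `x;`, do not match either alternative and are left untouched.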
Perl, also using code in the replacement:
$\ = $/;
### 1.
$_ = 'fooBAR';
s/\w/lc $& eq $&? uc $&: lc $&/eg;
# this isn't a regex, but it is better in most cases
# (note that tr/// takes no /g modifier):
# tr/A-Za-z/a-zA-Z/;
print;
# FOObar
### 2.
$_ = 'foo >bar baz<';
s/>(\w+)|(\w+)</$1?"$1<":">$2"/eg;
print;
# foo bar< >baz
### 3.
$_ = 'x; ++i; i--;';
s/(--|\+\+)?\b([a-z]\w*)\b(?(1)|(--|\+\+))/$1?"$2$1":"$3$2"/eig;
print;
# x; i++; --i;
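The commented-out tr line in the first Perl example makes the point that a plain character swap does not need a regex at all. A rough Python counterpart, assuming ASCII-only letters, is `str.translate` with a table built by `str.maketrans`; the built-in `str.swapcase` covers the general case swap. The table name `SWAP_CASE` is illustrative.

```python
import string

# Map every ASCII lowercase letter to its uppercase form and vice versa.
SWAP_CASE = str.maketrans(
    string.ascii_lowercase + string.ascii_uppercase,
    string.ascii_uppercase + string.ascii_lowercase,
)

text = "fooBAR"
print(text.translate(SWAP_CASE))  # FOObar
print(text.swapcase())            # FOObar
```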