开发者

How to replace string and preserve its uppercase/lowercase

I want to replace one string with the other in Perl; both are of the same length. I want t开发者_StackOverflow中文版o replace all occurrences of the string (case insensitive), but I want that the case of the letter will be preserved. So if the first letter was upper case, the first letter after the replacement will be upper case also.

For example, if I want to replace "foo" with "bar", so I want that

foo ==> bar
Foo ==> Bar
FOO ==> BAR

Is there a simple way to do this in Perl?


This might be what you are after:

How do I substitute case insensitively on the LHS while preserving case on the RHS?

This is copied almost directly from the above link:

sub preserve_case($$) {
    my ($old, $new) = @_;
    my $mask = uc $old ^ $old;
    uc $new | $mask .
    substr($mask, -1) x (length($new) - length($old))
}

my $string;

$string = "this is a Foo case";
$string =~ s/(Foo)/preserve_case($1, "bar")/egi;
print "$string\n";

# this is a Bar case

$string = "this is a foo case";
$string =~ s/(Foo)/preserve_case($1, "bar")/egi;
print "$string\n";

# this is a bar case

$string = "this is a FOO case";
$string =~ s/(Foo)/preserve_case($1, "bar")/egi;
print "$string\n";

# this is a BAR case


perldoc perlfaq6 provides some insights:

How do I substitute case-insensitively on the LHS while preserving case on the RHS?

Here's a lovely Perlish solution by Larry Rosler. It exploits properties of bitwise xor on ASCII strings.

$_= "this is a TEsT case";
$old = 'test';
$new = 'success';
s{(\Q$old\E)}
    { uc $new | (uc $1 ^ $1) .
            (uc(substr $1, -1) ^ substr $1, -1) x
            (length($new) - length $1)
    }egi;
print;    # 'this is a SUcCESS case'

And here it is as a subroutine, modeled after the above:

sub preserve_case {
        my ($old, $new) = @_;
        my $mask = uc $old ^ $old;
        uc $new | $mask .
            substr($mask, -1) x (length($new) - length($old))
    }

$string = "this is a TEsT case";
$string =~ s/(test)/preserve_case($1, "success")/egi;
print "$string\n";

This prints:

this is a SUcCESS case

So you could use the preserve_case() subroutine like so. Just don't expect Unicode miracles :)

s[\b(abc)\b][preserve_case($1,'xyz')]ei ;


$text =~ s/\b(?:(Abc)|abc)\b/ $1 ? 'Xyz' : 'xyz' /eg;

If the actual list is longer, you can use a lookup table.

my %translations = (
   'Abc' => 'Xyz',  'abc' => 'xyz',
   'Def' => 'Ghi',  'def' => 'ghi',
   'Jkl' => 'Mno',  'jkl' => 'mno',
);

my $alt_pat = join '|', map quotemeta, keys(%translations);

$text =~ s/\b($alt_pat)\b/$translations{$1}/g;

But that still leaves some duplication that could be removed by deriving the lowercase versions.

my %translations = (
   'Abc' => 'Xyz',
   'Def' => 'Ghi',
   'Jkl' => 'Mno',
);

%translations = ( ( map lc, %translations ), %translations );

my $alt_pat = join '|', map quotemeta, keys(%translations);

$text =~ s/\b($alt_pat)\b/$translations{$1}/g;


Here's a solution that factors out the idea of "alter one string to match the capitalization of another string" into a function, and calls that function to build the replacement.

sub matchcap
{
  my ($s,$r) = @_;
  return $s eq ucfirst($s) ? ucfirst($r) : lcfirst($r);
}

s/\b(Abc|abc)\b/matchcap($1,'xyz')/ge;


A bit of a hack, using the experimental code extended regular expression:

$text =~ s/\b([Aa])(?{ $n=chr(ord($^N)+23) })bc/${n}yz/

First, match the letter A with ([Aa]). The following (?{...}) contains arbitrary code, with $^N containing the text of the most recently captured subgroup. The 23 is the difference in ASCII codes between A and X (for upper- and lowercase), so $n contains the letter X with the same case as the corresponding A.

(This should not be taken as an endorsement to write code like this, but as an interesting example of this experimental regular expression.)


Here's a "semi-perlish" solution that should work for arbitrary regexps and Unicode data:

sub adjust_case {
    my ($text, $case) = @_;
    $case .= substr($case, -1) x (length($text) - length($case));
    $_ = [ split // ] for $text, $case;
    return join "", map {
        $case->[$_] =~ /\p{Upper}/ ? uc $text->[$_] :
        $case->[$_] =~ /\p{Lower}/ ? lc $text->[$_] : $text->[$_]
    } 0 .. $#$text;
}

my $regexp  = qr/\b(abc\w*)\b/i;
my $replace = "Xyzzy";

s/$regexp/adjust_case $replace, ${^MATCH}/egp;


You could do this:

my %trans = (
    'Abc' => Xyz, 
    'abc' => xyz,
);
$text =~s/\b(Abc|abc)\b/$trans{$1}/ge;


You know each string is the same length, so basically, you can:

index = Pos(string, oldString)
for i = index to index + strlen(oldString)
  if (oldString[i] >= 'a') && (oldString[i] <= 'z'')
    string[i] = ToLower(newString[i])
  else
    string[i] = ToUpper(newString[i])0x20


Here's a neat trick that uses non-destructive transliteration (available in Perl 5.14) within the result of the substitution.

use 5.014;
$string =~ s/\b(f)(o)(o)\b/ ($1 =~ tr{fF}{bB}r) . ($2 =~ tr{oO}{aA}r) . ($3 =~ tr{oO}{rR}r) /egi;

You can even shorten it if consecutive groups of letters have same replacements, e.g.

# foo ==> see, FoO ==> SeE, etc.
$string =~ s/\b(foo)\b/ $1 =~ tr{fFoO}{sSeE}r /egi;


Check character by character. If a character's ASCII value falls in uppercase ASCII values, replace with uppercase else lowercase.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜