Replacing multiple substrings in Java when replacement text overlaps search text

Say you have the following string:

cat dog fish dog fish cat

You want to replace all cats with dogs, all dogs with fish, and all fish with cats. Intuitively, the expected result is:

dog fish cat fish cat dog

If you try the obvious solution, looping through with replaceAll(), you get:

  1. (original) cat dog fish dog fish cat
  2. (cat -> dog) dog dog fish dog fish dog
  3. (dog -> fish) fish fish fish fish fish fish
  4. (fish -> cat) cat cat cat cat cat cat
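
For reference, the cascade above can be reproduced with a minimal sketch like the following. The class and method names are made up for illustration, and it uses String.replace() rather than replaceAll() so that the search text is treated literally:

```java
import java.util.Map;

class NaiveSequentialReplace {
    // Applies each replacement to the output of the previous one,
    // which is what lets an earlier replacement be rewritten by a later one.
    static String replaceSequentially(String input, Map<String, String> replacements) {
        String result = input;
        for (Map.Entry<String, String> e : replacements.entrySet()) {
            result = result.replace(e.getKey(), e.getValue());
        }
        return result;
    }
}
```

With an insertion-ordered map (for example a LinkedHashMap) holding cat -> dog, dog -> fish, fish -> cat in that order, the input string collapses to six copies of "cat".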

Clearly, this is not the intended result. So what's the simplest way to do this? I can cobble something together with Pattern and Matcher (and a lot of Pattern.quote() and Matcher.quoteReplacement()), but I refuse to believe I'm the first person to have this problem and there's no library function to solve it.

(FWIW, the actual case is a bit more complicated and doesn't involve straight swaps.)


It seems StringUtils.replaceEach in Apache Commons Lang does what you want:

StringUtils.replaceEach("abcdeab", new String[]{"ab", "cd"}, new String[]{"cd", "ab"});
// returns "cdabecd"

Note that the documentation at the above links seems to be in error. See comments below for details.


String rep = str.replace("cat","§1§").replace("dog","§2§")
                .replace("fish","§3§").replace("§1§","dog")
                .replace("§2§","fish").replace("§3§","cat");

Ugly and inefficient as hell, but works.


OK, here's a more elaborate and generic version. I prefer using a regular expression rather than a scanner. That way I can replace arbitrary Strings, not just words (which can be better or worse). Anyway, here goes:

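import java.util.Iterator;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static String replace(
    final String input, final Map<String, String> replacements) {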

    if (input == null || "".equals(input) || replacements == null 
        || replacements.isEmpty()) {
        return input;
    }
    StringBuilder regexBuilder = new StringBuilder();
    Iterator<String> it = replacements.keySet().iterator();
    regexBuilder.append(Pattern.quote(it.next()));
    while (it.hasNext()) {
        regexBuilder.append('|').append(Pattern.quote(it.next()));
    }
    Matcher matcher = Pattern.compile(regexBuilder.toString()).matcher(input);
    StringBuffer out = new StringBuffer(input.length() + (input.length() / 10));
    while (matcher.find()) {
        // quoteReplacement keeps '$' and '\' in the replacement text literal
        matcher.appendReplacement(out,
            Matcher.quoteReplacement(replacements.get(matcher.group())));
    }
    matcher.appendTail(out);
    return out.toString();
}

Test Code:

System.out.println(replace("cat dog fish dog fish cat",
    ImmutableMap.of("cat", "dog", "dog", "fish", "fish", "cat")));

Output:

dog fish cat fish cat dog

Obviously this solution only makes sense when there are many replacements; otherwise it's overkill.
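
On Java 9 or later, the same idea can be written more compactly with Matcher.replaceAll(Function). The following is only a sketch: the class name SinglePassReplace is made up, and sorting the keys longest-first is my own addition, so that a key such as "catalog" is tried before "cat" (regex alternation takes the first alternative that matches).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class SinglePassReplace {
    static String replaceAll(String input, Map<String, String> replacements) {
        if (input == null || input.isEmpty()
                || replacements == null || replacements.isEmpty()) {
            return input;
        }
        // Longest keys first, so a longer key wins over a key that is its prefix.
        List<String> keys = new ArrayList<>(replacements.keySet());
        keys.sort((a, b) -> Integer.compare(b.length(), a.length()));

        // Quote every key so that regex metacharacters are matched literally.
        String regex = keys.stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));

        // Matcher.replaceAll(Function) needs Java 9+. The returned string is
        // treated as a replacement pattern, hence quoteReplacement.
        return Pattern.compile(regex)
                .matcher(input)
                .replaceAll(m -> Matcher.quoteReplacement(replacements.get(m.group())));
    }
}
```

Because the input is scanned only once, text that has already been substituted is never looked at again, which is what avoids the cascade.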


I would create a StringBuilder and then parse the text once, one word at a time, transferring over unchanged words or changed words as I go. I wouldn't parse it for each swap as you're suggesting.

So rather than doing something like:

// pseudocode
text is new text swapping cat with dog
text is new text swapping dog with fish
text is new text swapping fish with cat

I'd do

for each word in text
   if word is cat, swap with dog
   if word is dog, swap with fish
   if word is fish, swap with cat
   transfer new word (or unchanged word) into StringBuilder.

I'd probably make a swap(...) method for this and use a HashMap for the swap.

For example:

import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;

public class SwapWords {
   private static Map<String, String> myMap = new HashMap<String, String>();

   public static void main(String[] args) {
      // this would really be loaded using a file such as a text file or xml
      // or even a database:
      myMap.put("cat", "dog");
      myMap.put("dog", "fish");
      myMap.put("fish", "cat");

      String testString = "cat dog fish dog fish cat";

      StringBuilder sb = new StringBuilder();
      Scanner testScanner = new Scanner(testString);
      while (testScanner.hasNext()) {
         String text = testScanner.next();
         // use the mapped word if there is one, otherwise keep the original
         text = myMap.getOrDefault(text, text);
         sb.append(text).append(' ');
      }

      System.out.println(sb.toString().trim());
   }
}
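
One caveat with the Scanner version is that it rebuilds the text with single spaces, so tabs, newlines and runs of spaces are lost. A possible variation that copies the whitespace through unchanged is sketched below. The names WordSwap and swapWords are hypothetical, and a "word" here means any run of non-whitespace characters (as with Scanner's default delimiter), so "cat," with a trailing comma is not the same word as "cat".

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordSwap {
    // A token is any run of non-whitespace characters.
    private static final Pattern TOKEN = Pattern.compile("\\S+");

    static String swapWords(String text, Map<String, String> swaps) {
        StringBuilder sb = new StringBuilder(text.length());
        Matcher m = TOKEN.matcher(text);
        int last = 0;
        while (m.find()) {
            // Copy the whitespace between the previous token and this one as-is.
            sb.append(text, last, m.start());
            String word = m.group();
            // Swap the word if it is in the map, otherwise keep it.
            sb.append(swaps.getOrDefault(word, word));
            last = m.end();
        }
        // Copy whatever trails the last token.
        sb.append(text, last, text.length());
        return sb.toString();
    }
}
```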


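import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Note: the loop below inspects one character at a time,
// so this only works for single-character keys.
public class myreplase {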
    public Map<String, String> replase;

    public myreplase() {
        replase = new HashMap<String, String>();

        replase.put("a", "Apple");
        replase.put("b", "Banana");
        replase.put("c", "Cantalope");
        replase.put("d", "Date");
        String word = "a b c d a b c d";

        String ss = "";
        Iterator<String> i = replase.keySet().iterator();
        while (i.hasNext()) {
            // quote the key so regex metacharacters are matched literally
            ss += Pattern.quote(i.next());
            if (i.hasNext()) {
                ss += "|";
            }
        }

        Pattern pattern = Pattern.compile(ss);
        StringBuilder buffer = new StringBuilder();
        for (int j = 0, k = 1; j < word.length(); j++,k++) {
            String s = word.substring(j, k);
            Matcher matcher = pattern.matcher(s);
            if (matcher.find()) {
                buffer.append(replase.get(s));
            } else {
                buffer.append(s);
            }
        }
        System.out.println(buffer.toString());
    }

    public static void main(String[] args) {
        new myreplase();
    }
}

Output:

Apple Banana Cantalope Date Apple Banana Cantalope Date


Here's a method to do it without regex.

I noticed that whenever a substring a is replaced with b, that b is always part of the final string, so you can exclude it from any further replacements.

Moreover, the inserted b acts as a barrier: no later replacement can match across the position where b sits.

Together this looks a lot like split: split the string on a (leaving a gap between the pieces), apply the remaining replacements to each piece in the array, then join the pieces back together with b.

For example:

// Original
"cat dog fish dog fish cat"

// Replace cat with dog
{"", " dog fish dog fish ", ""}.join("dog")

// Replace dog with fish
{
    "",
    {" ", " fish ", " fish "}.join("fish"),
    ""
}.join("dog")

// Replace fish with cat
{
    "",
    {
        " ",
        {" ", " "}.join("cat"),
        {" ", " "}.join("cat")
    }.join("fish"),
    ""
}.join("dog")

So far, the most intuitive way (to me) to do this is recursively:

import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;

public static String replaceWithJointMap(String s, Map<String, String> map) {
    // Base case
    if (map.size() == 0) {
        return s;
    }

    // Get some value in the map to replace
    Map.Entry<String, String> pair = map.entrySet().iterator().next();
    String replaceFrom = pair.getKey();
    String replaceTo = pair.getValue();

    // Split the current string with the replaceFrom string
    // Use split with -1 so that trailing empty strings are included
    String[] splitString = s.split(Pattern.quote(replaceFrom), -1);

    // Apply replacements for each of the strings in the splitString
    HashMap<String, String> replacementsLeft = new HashMap<>(map);
    replacementsLeft.remove(replaceFrom);

    for (int i=0; i<splitString.length; i++) {
        splitString[i] = replaceWithJointMap(splitString[i], replacementsLeft);
    }

    // Join back with the current replacements
    return String.join(replaceTo, splitString);
}

I don't think this is very efficient though.
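
If efficiency matters, a single left-to-right scan is another way to stay regex-free. This is a different technique from the recursive split above, shown only as a rough sketch; ScanReplace and replaceByScanning are made-up names. At each position it picks the longest key that matches there, whereas the recursive version gives priority to whichever key the map happens to yield first, so the two can differ when keys overlap.

```java
import java.util.Map;

class ScanReplace {
    static String replaceByScanning(String s, Map<String, String> map) {
        StringBuilder out = new StringBuilder(s.length());
        int i = 0;
        while (i < s.length()) {
            // Find the longest non-empty key that matches at position i.
            String bestKey = null;
            for (String key : map.keySet()) {
                if (!key.isEmpty() && s.startsWith(key, i)
                        && (bestKey == null || key.length() > bestKey.length())) {
                    bestKey = key;
                }
            }
            if (bestKey != null) {
                // Emit the replacement and skip past the matched text,
                // so the replacement itself is never rescanned.
                out.append(map.get(bestKey));
                i += bestKey.length();
            } else {
                out.append(s.charAt(i));
                i++;
            }
        }
        return out.toString();
    }
}
```

This checks every key at every position, so it is only reasonable for a small number of keys.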
