How do I find the characters common to two strings in Java using a single replaceAll?

So suppose I have:

String s = "1479K";
String t = "459LP";

and I want to return

String commonChars = "49";

the common characters between the two strings.

Obviously it is possible to do this with a standard loop like:

String commonChars = "";
for (int i = 0; i < s.length(); i++)
{
    char ch = s.charAt(i);
    if (t.indexOf(ch) != -1)
    {
        commonChars = commonChars + ch;
    }
}

However I would like to be able to do this in one line using replaceAll. This can be done as follows:

String commonChars = s.replaceAll("["+s.replaceAll("["+t+"]","")+"]","");

My question is: is it possible to do this using a single invocation of replaceAll? And what would be the regular expression? I presume I have to use some sort of lookahead, but my brain turns to mush when I even think about it.


String commonChars = s.replaceAll("[^"+t+"]","");

Note that you may need to escape special characters in t, e.g. using Pattern.quote(t) instead of t above.


The accepted answer:

String commonChars = s.replaceAll("[^"+t+"]","");

has a bug!!!

What if the string t has a regex meta-character? In that case the replaceAll fails.

For example, suppose the string t contains ], a regex metacharacter that marks the end of a character class. The program then does not produce the expected output.

Why?

Consider:

String s = "1479K";
String t = "459LP]";

Now the regex becomes (just substitute t):

String commonChars = s.replaceAll("[^459LP]]","");

This says: replace any character other than 4, 5, 9, L, or P that is followed by a ] with nothing, which is clearly not what you want.

To fix this you need to escape the ] in t. You can do it manually as:

String t = "459LP\\]";

and the regex works fine.

This is a common problem when using regex, so the java.util.regex.Pattern class provides a static method named quote which can be used to do exactly this: quote the regex-metacharacters so that they are treated literally.

So before using t in replaceAll you quote it as:

t = Pattern.quote(t);

A program that uses the quote method works as expected.
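The difference can be checked with a small sketch like the one below. It is not the program the answer originally linked to; the class name QuoteDemo and the helpers unquoted and quoted are hypothetical names chosen for illustration. It runs the same replaceAll call on the strings from this answer, once with t concatenated as-is and once with t passed through Pattern.quote.

```java
import java.util.regex.Pattern;

public class QuoteDemo {

    // Builds the pattern by plain concatenation, as in the accepted answer.
    static String unquoted(String s, String t) {
        return s.replaceAll("[^" + t + "]", "");
    }

    // Same call, but t goes through Pattern.quote first, so ] is taken literally.
    static String quoted(String s, String t) {
        return s.replaceAll("[^" + Pattern.quote(t) + "]", "");
    }

    public static void main(String[] args) {
        String s = "1479K";
        String t = "459LP]";
        // The pattern "[^459LP]]" needs a literal ] after the class; s has none,
        // so nothing is replaced and s comes back unchanged.
        System.out.println(unquoted(s, t));
        // With quoting, ] is just another member of the class.
        System.out.println(quoted(s, t));
    }
}
```

With these inputs the unquoted call should print 1479K and the quoted call 49. An empty t is not handled here and would need a separate check.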


The accepted answer is incorrect. Because the first argument of replaceAll is a regex pattern, we must consider regex syntax. What happens if s1 = "\\t"? And what happens if s1 = "]{"?

If all chars are in the range [0, 255], we can work like this:

  1. byte[] tmp = new byte[256];
  2. loop over each char in the first string

    for (char c : str1.toCharArray()) // or use charAt(i) here
        if (tmp[c] == 0) tmp[c] = 1;

  3. loop over each char in the second string

    for (char c : str2.toCharArray())
        if (tmp[c] == 1) tmp[c] = 2;

  4. loop over the tmp array and find the entries with value 2; each such index is a char we are looking for.
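Put together, the four steps might look like the sketch below. The class name CommonCharsTable and the method commonChars are hypothetical, and the code assumes, as the answer does, that every char in both strings is in the range 0 to 255 (anything larger would throw ArrayIndexOutOfBoundsException).

```java
public class CommonCharsTable {

    // Hypothetical helper; assumes every char in both strings is in 0..255.
    static String commonChars(String str1, String str2) {
        byte[] tmp = new byte[256]; // one slot per char value 0..255

        for (char c : str1.toCharArray()) {
            if (tmp[c] == 0) tmp[c] = 1; // seen in the first string
        }
        for (char c : str2.toCharArray()) {
            if (tmp[c] == 1) tmp[c] = 2; // seen in both strings
        }

        StringBuilder common = new StringBuilder();
        for (int i = 0; i < tmp.length; i++) {
            if (tmp[i] == 2) common.append((char) i); // the index is the common char
        }
        return common.toString();
    }

    public static void main(String[] args) {
        System.out.println(commonChars("1479K", "459LP")); // prints 49
        System.out.println(commonChars("]{", "{x]"));      // prints ]{
    }
}
```

Unlike the replaceAll version, the result comes out in char-code order with duplicates removed, not in the order the characters appear in the first string.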

Another solution is using HashSet.retainAll(Collection<?> c);
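A minimal sketch of the set-based approach follows. The names CommonCharsSet, toSet and commonChars are hypothetical. It uses LinkedHashSet, a HashSet subclass, purely so the result keeps the order of the first string; that is a choice made for this example, and a plain HashSet would work if order does not matter.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class CommonCharsSet {

    // Collects the distinct chars of str, in first-seen order.
    static Set<Character> toSet(String str) {
        Set<Character> set = new LinkedHashSet<>();
        for (char c : str.toCharArray()) {
            set.add(c);
        }
        return set;
    }

    // Hypothetical helper: chars of s that also occur in t, without duplicates.
    static String commonChars(String s, String t) {
        Set<Character> common = toSet(s);
        common.retainAll(toSet(t)); // drop every char that is not also in t
        StringBuilder sb = new StringBuilder();
        for (char c : common) {
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(commonChars("1479K", "459LP"));  // prints 49
        System.out.println(commonChars("1479K", "459LP]")); // prints 49, no escaping needed
    }
}
```

No regex is involved, so metacharacters in either string need no special treatment. Repeated characters in s appear only once in the result, which differs from the replaceAll version.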


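public class Common {

   public static void main(String[] args) {
      String s = "FIRST";
      String s1 = "SECOND";
      // Keep only the characters of s that also occur in s1.
      // Safe here only because s1 contains no regex metacharacters.
      String common = s.replaceAll("[^" + s1 + "]", "");
      System.out.println(common); // prints S
   }
}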