
How to remove non digits?

private String removeNonDigits(final String value) {
    if (value == null || value.isEmpty()) {
        return "";
    }
    return value.replaceAll("[^0-9]+", "");
}

Is there a better way to do this? Does Apache's StringUtils have a similar method?


Just for fun I ran a benchmark:

import java.util.List;
import java.util.regex.Pattern;

import com.google.common.base.Joiner;
import com.google.common.base.Predicate;
import com.google.common.collect.Iterables;
import com.google.common.primitives.Chars;

public final class Main {
    private static final String INPUT = "0a1b2c3d4e";
    private static final int REPS = 10000000;

    public static volatile String out;

    public static void main(String[] args) {
        System.err.println(removeNonDigits1(INPUT));
        System.err.println(removeNonDigits2(INPUT));
        System.err.println(removeNonDigits3(INPUT));
        System.err.println(removeNonDigits4(INPUT));
        System.err.println(removeNonDigits5(INPUT));

        long t0 = System.currentTimeMillis();
        for (int i = 0; i < REPS; ++ i) {
            out = removeNonDigits1(INPUT);
        }
        long t1 = System.currentTimeMillis();
        for (int i = 0; i < REPS; ++ i) {
            out = removeNonDigits2(INPUT);
        }
        long t2 = System.currentTimeMillis();
        for (int i = 0; i < REPS; ++ i) {
            out = removeNonDigits3(INPUT);
        }
        long t3 = System.currentTimeMillis();
        for (int i = 0; i < REPS; ++ i) {
            out = removeNonDigits4(INPUT);
        }
        long t4 = System.currentTimeMillis();
        for (int i = 0; i < REPS; ++ i) {
            out = removeNonDigits5(INPUT);
        }
        long t5 = System.currentTimeMillis();
        System.err.printf("removeNonDigits1: %d\n", t1-t0);
        System.err.printf("removeNonDigits2: %d\n", t2-t1);
        System.err.printf("removeNonDigits3: %d\n", t3-t2);
        System.err.printf("removeNonDigits4: %d\n", t4-t3);
        System.err.printf("removeNonDigits5: %d\n", t5-t4);
    }

    private static final String PATTERN_SOURCE = "[^0-9]+";
    private static final Pattern PATTERN = Pattern.compile(PATTERN_SOURCE);

    public static String removeNonDigits1(String input) {
        return input.replaceAll(PATTERN_SOURCE, "");
    }

    public static String removeNonDigits2(String input) {
        return PATTERN.matcher(input).replaceAll("");
    }

    public static String removeNonDigits3(String input) {
        char[] arr = input.toCharArray();
        int j = 0;
        for (int i = 0; i < arr.length; ++ i) {
            if (Character.isDigit(arr[i])) {
                arr[j++] = arr[i];
            }
        }
        return new String(arr, 0, j);
    }

    public static String removeNonDigits4(String input) {
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < input.length(); ++ i) {
            char c = input.charAt(i);
            if (Character.isDigit(c)) {
                result.append(c);
            }
        }
        return result.toString();
    }

    public static String removeNonDigits5(String input) {
        List<Character> charList = Chars.asList(input.toCharArray());
        Predicate<Character> isDigit =
            new Predicate<Character>() {
                public boolean apply(Character input) {
                    return Character.isDigit(input);
                }
            };
        Iterable<Character> filteredList =
            Iterables.filter(charList, isDigit);
        return Joiner.on("").join(filteredList);
    }
}

And got these results (elapsed milliseconds for 10,000,000 calls each):

removeNonDigits1: 74656
removeNonDigits2: 52235
removeNonDigits3: 4468
removeNonDigits4: 5250
removeNonDigits5: 29610

The amusing part is that removeNonDigits5 (the Google Collections version) was supposed to be an example of a silly, overcomplicated and inefficient solution, yet it's roughly 2.5 times as fast as the String.replaceAll version (removeNonDigits1).

Update: Pre-compiling the regex (removeNonDigits2) increases the speed, but not as much as one might expect: it cuts the time by about 30%.

Re-using the Matcher gives another slight speedup, but probably not worth sacrificing thread-safety for.
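
For what it's worth, here is a minimal sketch of how the Matcher could be re-used without giving up thread-safety, by keeping one Matcher per thread. It assumes Java 8+ (for ThreadLocal.withInitial); the class name MatcherReuse and the field names are made up for illustration, and I have not benchmarked this variant.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MatcherReuse {
    // Pattern is immutable and can be shared between threads.
    private static final Pattern NON_DIGITS = Pattern.compile("[^0-9]+");

    // Matcher is NOT thread-safe, so each thread gets its own instance.
    private static final ThreadLocal<Matcher> MATCHER =
        ThreadLocal.withInitial(() -> NON_DIGITS.matcher(""));

    static String removeNonDigits(String input) {
        if (input == null || input.isEmpty()) {
            return "";
        }
        // reset(CharSequence) points the existing Matcher at the new input.
        return MATCHER.get().reset(input).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(removeNonDigits("0a1b2c3d4e")); // prints 01234
    }
}
```

Whether the extra ThreadLocal lookup eats the gain from skipping the Matcher allocation is something you would have to measure.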


Your method seems fine to me - what exactly is it you're looking for when you say "better"? Your method is clear and understandable in its implementation, and will have reasonably good performance.

In particular, unless your application consists of calling this method constantly in a tight loop, I don't think you'd gain anything noticeable from trying to make it more performant. Don't optimise prematurely; profile first and optimise the hotspots.


If this is a method which is being called frequently, you might get a speedup from compiling the regex to a Pattern and reusing it each time:

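// Requires: import java.util.regex.Pattern;
// Compiled once; the pattern matches every character that is NOT a digit.
private static final Pattern NON_DIGITS = Pattern.compile("[^0-9]");

private String removeNonDigits(final String value) {
    if (value == null || value.isEmpty()) {
        return "";
    }

    return NON_DIGITS.matcher(value).replaceAll("");
}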


Another version might be:

public static String removeNonDigits(final String value) {
    if (value == null || value.isEmpty()) {
        return "";
    }

    StringBuilder sb = new StringBuilder(value.length());
    for (int i = 0; i < value.length(); i++) {
        char c = value.charAt(i);
        if (Character.isDigit(c))
            sb.append(c);
    }
    return sb.toString();
}
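
One caveat: Character.isDigit accepts every Unicode decimal digit, not only 0-9, so this version is not strictly equivalent to the regex [^0-9]. If only ASCII digits should survive, a plain range check would do. A small sketch of the difference follows; the class name AsciiDigits and both method names are illustrative only.

```java
final class AsciiDigits {
    // Keeps only the ASCII digits '0'..'9', like the regex [^0-9] does.
    static String keepAsciiDigits(String value) {
        if (value == null || value.isEmpty()) {
            return "";
        }
        StringBuilder sb = new StringBuilder(value.length());
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c >= '0' && c <= '9') {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Keeps every char that Character.isDigit accepts (any Unicode decimal digit).
    static String keepUnicodeDigits(String value) {
        if (value == null || value.isEmpty()) {
            return "";
        }
        StringBuilder sb = new StringBuilder(value.length());
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (Character.isDigit(c)) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "\u0663" is ARABIC-INDIC DIGIT THREE.
        String input = "1a2\u0663";
        System.out.println(keepAsciiDigits(input));            // prints 12
        System.out.println(keepUnicodeDigits(input).length()); // prints 3
    }
}
```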


Just a suggestion: check value.trim().isEmpty() (or 0 == value.trim().length()) instead of value.isEmpty().

If you have

   String value = "     ";

  • without trim():

    value == null || value.isEmpty()          // evaluates to false

  • with trim():

    value == null || value.trim().isEmpty()   // evaluates to true

The second is functionally more correct, IMHO.
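
To make the difference concrete, here is a small sketch; the class name BlankCheck is just for illustration, and the method is the one from the question made static. Note that for this particular method the returned string is the same either way, because the regex strips the whitespace anyway; checking the trimmed value only skips the regex work for blank input.

```java
final class BlankCheck {
    // The method from the question, unchanged apart from being static.
    static String removeNonDigits(String value) {
        if (value == null || value.isEmpty()) {
            return "";
        }
        return value.replaceAll("[^0-9]+", "");
    }

    public static void main(String[] args) {
        String value = "     ";
        System.out.println(value.isEmpty());                  // prints false
        System.out.println(value.trim().isEmpty());           // prints true
        System.out.println(removeNonDigits(value).isEmpty()); // prints true
    }
}
```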


public static String getOnlyNumerics(String str)
{
    if (str == null)
    {
        return null;
    }

    // StringBuilder rather than StringBuffer: the buffer never leaves
    // this method, so no synchronization is needed.
    StringBuilder strBuff = new StringBuilder();
    char c;
    for (int i = 0; i < str.length(); i++)
    {
        c = str.charAt(i);
        if (Character.isDigit(c))
        {
            strBuff.append(c);
        }
    }

    return strBuff.toString();
}


Adding my version of variant 4 to finnw's fun above:

    public static String removeNonDigits4a(String input) {
        char[] chars = input.toCharArray();
        int l = chars.length;
        int m = 0;
        char c;
        for (int n = 0; n < l; ) {

            if (Character.isDigit(c = chars[n++])) {
                chars[m++] = c;
            }
        }
        return new String(chars, 0, m);
    }