charAt() or substring? Which is faster?

I want to go through each character in a String and pass each character of the String as a String to another function.

String s = "abcdefg";
for (int i = 0; i < s.length(); i++) {
    newFunction(s.substring(i, i + 1));
}

or

String s = "abcdefg";
for (int i = 0; i < s.length(); i++) {
    newFunction(Character.toString(s.charAt(i)));
}

The final result needs to be a String. So any idea which will be faster or more efficient?


As usual: it doesn't matter. But if you insist on spending time on micro-optimization, or you really want to optimize for your very special use case, try this:

import org.junit.Assert;
import org.junit.Test;

public class StringCharTest {

    // Times (ms) are listed for three variants of each test:
    // 1. "s" initialized outside the loop
    // 2. "s" initialized inside the loop
    // 3. newFunction() actually checks the string length,
    //    so the call is not optimized away by the HotSpot compiler

    @Test
    // Fastest: 237ms / 562ms / 2434ms
    public void testCacheStrings() throws Exception {
        // Cache a one-character String for every possible char value,
        // including Character.MAX_VALUE itself
        String[] char2string = new String[Character.MAX_VALUE + 1];
        for (int c = Character.MIN_VALUE; c <= Character.MAX_VALUE; c++) {
            char2string[c] = Character.toString((char) c);
        }

        for (int x = 0; x < 10000000; x++) {
            char[] s = "abcdefg".toCharArray();
            for (int i = 0; i < s.length; i++) {
                newFunction(char2string[s[i]]);
            }
        }
    }

    @Test
    // Fast: 1687ms / 1725ms / 3382ms
    public void testCharToString() throws Exception {
        for (int x = 0; x < 10000000; x++) {
            String s = "abcdefg";
            for (int i = 0; i < s.length(); i++) {
                // Fast: Creates new String objects, but does not copy an array
                newFunction(Character.toString(s.charAt(i)));
            }
        }
    }

    @Test
    // Very fast: 1331ms / 1414ms / 3190ms
    public void testSubstring() throws Exception {
        for (int x = 0; x < 10000000; x++) {
            String s = "abcdefg";
            for (int i = 0; i < s.length(); i++) {
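                // Fastest of the variants that create a String per call.
                // Before Java 7u6, substring() reuses the internal char array;
                // newer versions copy the characters instead.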
                newFunction(s.substring(i, i + 1));
            }
        }
    }

    @Test
    // Slowest: 2525ms / 2961ms / 4703ms
    public void testNewString() throws Exception {
        char[] value = new char[1];
        for (int x = 0; x < 10000000; x++) {
            char[] s = "abcdefg".toCharArray();
            for (int i = 0; i < s.length; i++) {
                value[0] = s[i];
                // Slow! Copies the array
                newFunction(new String(value));
            }
        }
    }

    private void newFunction(String string) {
        // Do something with the one-character string
        Assert.assertEquals(1, string.length());
    }

}


The answer is: it doesn't matter.

Profile your code. Is this your bottleneck?
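If no profiler is at hand, a rough timing loop can at least show whether the difference is visible at all. The following is only a sketch: the class name RoughTiming, the sink field, and the body of newFunction are assumptions made for illustration, and a hand-rolled loop like this is far less reliable than a real profiler.

```java
class RoughTiming {
    // Accumulates results so the JIT cannot discard the calls as dead code.
    static long sink;

    // Hypothetical stand-in for the real newFunction.
    static void newFunction(String s) {
        sink += s.length();
    }

    // Runs body `rounds` times as warm-up, then times another `rounds` runs.
    static long timeNanos(Runnable body, int rounds) {
        for (int i = 0; i < rounds; i++) {
            body.run();
        }
        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            body.run();
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        final String s = "abcdefg";
        final int rounds = 1_000_000;

        long viaSubstring = timeNanos(() -> {
            for (int i = 0; i < s.length(); i++) {
                newFunction(s.substring(i, i + 1));
            }
        }, rounds);

        long viaCharAt = timeNanos(() -> {
            for (int i = 0; i < s.length(); i++) {
                newFunction(Character.toString(s.charAt(i)));
            }
        }, rounds);

        System.out.println("substring: " + viaSubstring / 1_000_000 + " ms");
        System.out.println("charAt:    " + viaCharAt / 1_000_000 + " ms");
    }
}
```

Treat the numbers as a hint only; whether this loop is a bottleneck in your application is something only profiling the application itself can tell you.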


Does newFunction really need to take a String? It would be better if you could make newFunction take a char and call it like this:

newFunction(s.charAt(i));

That way, you avoid creating a temporary String object.

To answer your question: It's hard to say which one is more efficient. In both examples, a String object containing only one character has to be created. Which is more efficient depends on exactly how String.substring(...) and Character.toString(...) are implemented in your particular Java implementation. The only way to find out is to run your program through a profiler and see which version uses more CPU and/or more memory. Normally, you shouldn't worry about micro-optimizations like this; only spend time on it once you've discovered that it is the cause of a performance and/or memory problem.
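A minimal sketch of the char-based variant, assuming newFunction can be changed to accept a char. The class name CharVisitor and the appending body are hypothetical placeholders for whatever the real function does:

```java
class CharVisitor {
    private final StringBuilder seen = new StringBuilder();

    // Hypothetical replacement for newFunction(String): it takes a primitive
    // char, so no one-character String is created per call.
    void newFunction(char c) {
        seen.append(c);
    }

    // Passes each character of s to newFunction and returns what was seen.
    String visit(String s) {
        for (int i = 0; i < s.length(); i++) {
            newFunction(s.charAt(i));
        }
        return seen.toString();
    }

    public static void main(String[] args) {
        System.out.println(new CharVisitor().visit("abcdefg"));
    }
}
```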


Of the two snippets you've posted, I wouldn't want to say. I'd agree with Will that it almost certainly is irrelevant in the overall performance of your code - and if it's not, you can just make the change and determine for yourself which is fastest for your data with your JVM on your hardware.

That said, it's likely that the second snippet would be better if you converted the String into a char array first, and then iterated over the array. Doing it this way incurs the String overhead only once (when converting to the array) instead of on every call. Additionally, you could then pass the array directly to the String constructor with some indices, which is more efficient than taking a char out of an array to pass it individually (which then gets turned into a one-character array):

String s = "abcdefg";
char[] chars = s.toCharArray();
for(int i = 0; i < chars.length; i++) {
    newFunction(String.valueOf(chars, i, 1));
}

But to reinforce my first point, look at what you're actually avoiding on each call of String.charAt(): two bounds checks, a short-circuit boolean OR, and an addition. This is not going to make any noticeable difference. Neither is the difference in the String constructors.
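To make that cost concrete, here is a simplified model of a string that stores an offset and a count into a shared char array. It is an illustrative assumption in the style of older String implementations, not the actual JDK source, and the class name OldStyleString is hypothetical:

```java
class OldStyleString {
    private final char[] value; // backing array, possibly shared
    private final int offset;   // where this string starts inside value
    private final int count;    // number of chars in this string

    OldStyleString(char[] value, int offset, int count) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

    char charAt(int index) {
        // Two bounds checks joined by a short-circuit OR ...
        if (index < 0 || index >= count) {
            throw new StringIndexOutOfBoundsException(index);
        }
        // ... and one addition to apply the offset.
        return value[index + offset];
    }
}
```

Under this model, everything charAt() does beyond a plain array read is a couple of comparisons and an add.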

Essentially, both idioms are fine in terms of performance (neither is immediately obviously inefficient) so you should not spend any more time working on them unless a profiler shows that this takes up a large amount of your application's runtime. And even then you could almost certainly get more performance gains by restructuring your supporting code in this area (e.g. have newFunction take the whole string itself); java.lang.String is pretty well optimised by this point.
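As a sketch of that kind of restructuring, assuming the per-character work can move inside the function: here a hypothetical newFunction receives the whole String and walks it itself, so the caller creates no per-character objects. The letter-counting body is only a placeholder for the real work.

```java
class WholeStringExample {
    // Hypothetical variant of newFunction that takes the entire String.
    // The per-character work here (counting letters) is a placeholder.
    static int newFunction(String s) {
        int letters = 0;
        for (int i = 0; i < s.length(); i++) {
            if (Character.isLetter(s.charAt(i))) {
                letters++;
            }
        }
        return letters;
    }

    public static void main(String[] args) {
        System.out.println(newFunction("abcdefg"));
    }
}
```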


I would first obtain a char[] from the source String using String.toCharArray() (note that this returns a copy of the characters, not the String's internal array) and then proceed to call newFunction.

But I do agree with Jesper that it would be best if you could just deal with characters and avoid all the String functions...


LeetCode seems to prefer the substring option, at least for its strStr problem (find the first index of needle in haystack).

This is how I solved that problem:

class Solution {
    public int strStr(String haystack, String needle) {
        if (needle.length() == 0) {
            return 0;
        }

        if (haystack.length() == 0) {
            return -1;
        }

        for (int i = 0; i <= haystack.length() - needle.length(); i++) {
            int count = 0;
            for (int j = 0; j < needle.length(); j++) {
                if (haystack.charAt(i + j) == needle.charAt(j)) {
                    count++;
                }
            }
            if (count == needle.length()) {
                return i;
            }
        }
        return -1;
    }
}

And this is the optimal solution they give:

class Solution {
    public int strStr(String haystack, String needle) {
        int length;
        int n = needle.length();
        int h = haystack.length();
        if (n == 0)
            return 0;
        // if(n==h)
        //     length = h;
        // else
        length = h - n;
        if (h == n && haystack.charAt(0) != needle.charAt(0))
            return -1;
        for (int i = 0; i <= length; i++) {
            if (haystack.substring(i, i + needle.length()).equals(needle))
                return i;
        }
        return -1;
    }
}

Honestly, I can't figure out why it would matter.
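One possible factor, offered as a guess rather than something measured here: my charAt version keeps comparing the rest of the needle even after a mismatch, because the inner loop only counts matches and never breaks. A sketch of a charAt variant that stops at the first mismatch (the class name StrStrEarlyExit is hypothetical):

```java
class StrStrEarlyExit {
    static int strStr(String haystack, String needle) {
        int n = needle.length();
        int last = haystack.length() - n; // last index where needle can start
        for (int i = 0; i <= last; i++) {
            int j = 0;
            // Advance only while characters match; stop at the first mismatch.
            while (j < n && haystack.charAt(i + j) == needle.charAt(j)) {
                j++;
            }
            if (j == n) {
                return i; // every character of needle matched at position i
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(strStr("hello", "ll"));
    }
}
```

Whether this closes the gap to the substring version would still need to be measured.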
