
Performance and simplicity tradeoffs between String, StringBuffer, and StringBuilder

Have you ever thought about the implications of this change in the Java Programming Language?

The String class was designed as an immutable class, and that was a deliberate decision. But String concatenation is really slow; I've benchmarked it myself. So StringBuffer was born: a really great class, synchronized and really fast. But some people were not happy with the performance cost of the synchronized blocks, and StringBuilder was introduced.

But, when using String to concatenate not too many objects, the immutability of the class makes it a really natural way to achieve thread-safety. I can understand the use of StringBuffer when we want to manage several Strings. But, here is my first question:

  1. If you have, say, 10 or fewer strings that you want to append, would you trade simplicity for just a few milliseconds of execution time?

    I've benchmarked StringBuilder too. It is more efficient than StringBuffer (just a 10% improvement). But if you're using StringBuilder in a single-threaded program, what happens if you later change the design to use several threads? You have to change every instance of StringBuilder, and if you forget one, you may get some weird effect from the resulting race condition.

  2. In this situation, would you trade performance for hours of debugging?

OK, that's all. Beyond the simple answer (StringBuffer is more efficient than "+" and thread-safe, and StringBuilder is faster than StringBuffer but not thread-safe), I would like to know when to use each of them.

(Important: I know the differences between them; this is a question related to the architecture of the platform and some design decisions.)


Just a comment about your "StringBuilders and threads" remark: even in multi-threaded programs, it's very rare to want to build up a string across multiple threads. Typically, each thread will have some set of data and create a string from that, often by concatenating multiple strings together. They'll then convert that StringBuilder to a string, and that string can be safely shared among threads.

I don't think I've ever seen a bug due to a StringBuilder being shared between threads.
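
The pattern described above can be sketched roughly as follows. This is only an illustration: the class name PerThreadBuilders and its methods are made up for the example, and a fixed thread pool is just one assumed way of running the work. Each task owns its StringBuilder, and only the finished, immutable String leaves the task.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class, not from the original post.
final class PerThreadBuilders {

    // The StringBuilder is a local variable, so it is never visible to another thread.
    static String describe(int workerId, int itemCount) {
        StringBuilder sb = new StringBuilder();
        sb.append("worker-").append(workerId).append(':');
        for (int i = 0; i < itemCount; i++) {
            sb.append(' ').append(i);
        }
        return sb.toString(); // only this immutable String is shared
    }

    // Runs one describe() task per worker and collects the resulting Strings in order.
    static List<String> buildAll(int workers, int itemCount) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int w = 0; w < workers; w++) {
                final int id = w;
                futures.add(pool.submit(() -> describe(id, itemCount)));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            if (e instanceof InterruptedException) {
                Thread.currentThread().interrupt();
            }
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        for (String s : buildAll(3, 4)) {
            System.out.println(s);
        }
    }
}
```

Nothing here needs StringBuffer: no builder is ever reachable from two threads at once.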

Personally I wish StringBuffer didn't exist. It dates from the "let's synchronize everything" phase of Java, which also gave us Vector and Hashtable, both almost obsoleted by the unsynchronized ArrayList and HashMap classes from Java 2. It just took a little while longer for the unsynchronized equivalent of StringBuffer to arrive.

So basically:

  • Use String when you don't want to perform manipulations, and want to be sure nothing else will
  • Use StringBuilder to perform manipulation, usually over a short period
  • Avoid StringBuffer unless you really, really need it - and as I say, I can't remember ever seeing a situation where I'd use StringBuffer instead of StringBuilder, when both are available.


StringBuffer was in Java 1.0; it was not any kind of a reaction to slowness or immutability. It's also not in any way faster or better than string concatenation; in fact, the Java compiler compiles

String s1 = s2 + s3;

into something like

String s1 = new StringBuilder(s2).append(s3).toString();

If you don't believe me, try it yourself with a disassembler (javap -c, for example.)

The claim that "StringBuffer is faster than concatenation" refers to repeated concatenation. In that case, explicitly creating your own StringBuffer and reusing it performs better than letting the compiler create many of them.
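
As a rough sketch of the repeated-concatenation case (the class and method names are invented for illustration), the two methods below produce the same result. The first creates a new String on every pass through the loop, copying everything accumulated so far. The second reuses one explicitly created StringBuilder.

```java
// Hypothetical example class, not from the original post.
final class RepeatedConcat {

    // Every += produces a fresh String containing a copy of the previous contents.
    static String joinWithPlus(String[] parts) {
        String result = "";
        for (String p : parts) {
            result += p;
        }
        return result;
    }

    // One builder is created up front and reused for every append.
    static String joinWithBuilder(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (String p : parts) {
            sb.append(p);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] parts = {"a", "b", "c"};
        System.out.println(joinWithPlus(parts));
        System.out.println(joinWithBuilder(parts));
    }
}
```

For a single expression such as s2 + s3, there is nothing to gain from writing the builder by hand.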

StringBuilder was introduced in Java 5 for performance reasons, as you say. The reason it makes sense is that StringBuffer/Builder are virtually never shared outside of the method that creates them: 99% of their usage is something like the above, where they're created, used to append a few strings together, then discarded.


Nowadays both StringBuffer and StringBuilder are sort of useless (from a performance point of view). Here is why:

StringBuilder was supposed to be faster than StringBuffer but any sane JVM can optimize away the synchronization. So it was quite a huge miss (and small hit) when it was introduced.

StringBuffer used NOT to copy the char[] when creating the String (in the non-shared variant); however, that was a major source of issues, including leaking a huge char[] for small Strings. In 1.5 they decided that a copy of the char[] must occur every time, and that practically made StringBuffer useless (the sync was there to ensure no thread games could trick the String). The copy does conserve memory, though, and ultimately helps the GC (besides the obviously reduced footprint); char[] is usually among the top 3 object types consuming memory.

String.concat was and still is the fastest way to concatenate 2 strings (and 2 only... or possibly 3). Keep that in mind, it does not perform an extra copy of the char[].

Back to the useless part: now any 3rd-party code can achieve the same performance as StringBuilder. Even in Java 1.1 I used to have a class named AsycnStringBuffer which did exactly what StringBuilder does now, although it allocated a larger char[] than StringBuilder. Both StringBuffer and StringBuilder are optimized for small Strings by default, as you can see from the constructor:

  StringBuilder(String str) {
    super(str.length() + 16);
    append(str);
  }

Thus if the 2nd string is longer than 16 chars, the underlying char[] gets copied once more. Pretty uncool.
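
The effect can be observed through the public capacity() method. The sketch below uses a made-up class name (BuilderCapacity). It relies only on the documented initial capacity of the String's length plus 16. How far the capacity grows after that is an implementation detail, so the code reports the new value rather than assuming a particular one.

```java
// Hypothetical example class, not from the original post.
final class BuilderCapacity {

    // Documented behavior: the initial capacity is 16 plus the length of the argument.
    static int initialCapacity(String first) {
        return new StringBuilder(first).capacity();
    }

    // Returns the capacity after appending; a larger value means the
    // backing array had to be reallocated and its contents copied.
    static int capacityAfterAppend(String first, String second) {
        StringBuilder sb = new StringBuilder(first);
        sb.append(second);
        return sb.capacity();
    }

    public static void main(String[] args) {
        String first = "abc";
        System.out.println("initial: " + initialCapacity(first));
        // 16 chars still fit into the spare room; 17 chars do not.
        System.out.println("after 16 chars: " + capacityAfterAppend(first, "0123456789abcdef"));
        System.out.println("after 17 chars: " + capacityAfterAppend(first, "0123456789abcdefg"));
    }
}
```

If the final length is known in advance, passing it to the StringBuilder(int capacity) constructor avoids the extra copy.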

That may be a side effect of an attempt at fitting both the StringBuilder/StringBuffer and the char[] into the same cache line (on x86) on a 32-bit OS... but I don't know for sure.

As for the remark about hours of debugging: use your judgment. I personally do not recall ever having any issues with string operations, aside from implementing a rope-like structure for the SQL generator of a JDO implementation.


Edit: Below I illustrate what the Java designers didn't do to make String operations faster. Please note that the class is intended for the java.lang package, and it can be put there only by adding it to the bootstrap classpath. However, even if it is not put there (the difference is a single line of code!), it would still be faster than StringBuilder. Shocking? The class would have made string1+string2+... a lot better than using StringBuilder, but well...

package java.lang;

public class FastConcat {

    public static String concat(String s1, String s2){
        s1=String.valueOf(s1);//null checks
        s2=String.valueOf(s2);

        return s1.concat(s2);
    }

    public static String concat(String s1, String s2, String s3){
        s1=String.valueOf(s1);//null checks
        s2=String.valueOf(s2);
        s3=String.valueOf(s3);
        int len = s1.length()+s2.length()+s3.length();
        char[] c = new char[len];
        int idx=0;
        idx = copy(s1, c, idx);
        idx = copy(s2, c, idx);
        idx = copy(s3, c, idx);
        return newString(c);
    }
    public static String concat(String s1, String s2, String s3, String s4){
        s1=String.valueOf(s1);//null checks
        s2=String.valueOf(s2);
        s3=String.valueOf(s3);
        s4=String.valueOf(s4);

        int len = s1.length()+s2.length()+s3.length()+s4.length();
        char[] c = new char[len];
        int idx=0;
        idx = copy(s1, c, idx);
        idx = copy(s2, c, idx);
        idx = copy(s3, c, idx);
        idx = copy(s4, c, idx);
        return newString(c);

    }
    private static int copy(String s, char[] c, int idx){
        s.getChars(c, idx);
        return idx+s.length();

    }
    private static String newString(char[] c){
        return new String(0, c.length, c);
        //return String.copyValueOf(c);//if not in java.lang
    }
}


I tried the same thing on an XP machine. StringBuilder IS somewhat faster, but if you reverse the order of the runs, or make several runs, you'll notice that the "almost factor two" in the results turns into something like a 10% advantage:

StringBuffer build & output duration= 4282,000000 µs
StringBuilder build & output duration= 4226,000000 µs
StringBuffer build & output duration= 4439,000000 µs
StringBuilder build & output duration= 3961,000000 µs
StringBuffer build & output duration= 4801,000000 µs
StringBuilder build & output duration= 4210,000000 µs

For your kind of test the JVM will NOT help out. I had to limit the number of runs and elements just to get ANY result from a "String only" test.
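
A minimal sketch of how the order effect could be checked, assuming a hand-rolled harness is good enough for a rough comparison. The class AlternatingTiming, its methods, and the run counts are all invented for illustration; a dedicated benchmarking tool would give more trustworthy numbers. The idea is to warm up both variants first and then alternate which one runs first in each round.

```java
// Hypothetical example class, not from the original post.
final class AlternatingTiming {

    // Accumulates result lengths so the JIT cannot discard the built Strings.
    static long sink;

    static String buildWithBuffer(int elements) {
        StringBuffer sb = new StringBuffer("<root>");
        for (int i = 0; i < elements; i++) {
            sb.append("<data/>");
        }
        return sb.append("</root>").toString();
    }

    static String buildWithBuilder(int elements) {
        StringBuilder sb = new StringBuilder("<root>");
        for (int i = 0; i < elements; i++) {
            sb.append("<data/>");
        }
        return sb.append("</root>").toString();
    }

    // Average wall-clock time of one run, in nanoseconds.
    static long averageNanos(boolean useBuffer, int elements, int runs) {
        long total = 0;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            String s = useBuffer ? buildWithBuffer(elements) : buildWithBuilder(elements);
            total += System.nanoTime() - start;
            sink += s.length();
        }
        return total / runs;
    }

    public static void main(String[] args) {
        int elements = 50000;
        int runs = 200;
        // Warm-up: let the JIT compile both variants before measuring anything.
        averageNanos(true, elements, runs);
        averageNanos(false, elements, runs);
        for (int round = 0; round < 6; round++) {
            boolean bufferFirst = (round % 2 == 0); // alternate the order every round
            long first = averageNanos(bufferFirst, elements, runs);
            long second = averageNanos(!bufferFirst, elements, runs);
            long bufferNs = bufferFirst ? first : second;
            long builderNs = bufferFirst ? second : first;
            System.out.printf("round %d: StringBuffer %d ns, StringBuilder %d ns%n",
                    round, bufferNs, builderNs);
        }
    }
}
```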


I decided to put the options to the test with a simple XML composition exercise. Testing was done on a 2.7 GHz i5 with 16 GB of DDR3 RAM, for those wishing to replicate the results.

Code:

private int testcount = 1000;
private int elementCount = 50000;
// Assumed field: the methods below assign to "output", but the original snippet never declares it.
private String output;

public void testStringBuilder() {
    long total = 0;
    int counter = 0;
    while (counter++ < testcount) {
        total += doStringBuilder();
    }
    // Integer division: the average is truncated to whole microseconds.
    float f = (total / testcount) / 1000;
    System.out.printf("StringBuilder build & output duration= %f µs%n%n", f);
}

private long doStringBuilder() {
    long start = System.nanoTime();
    StringBuilder buffer = new StringBuilder("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
    buffer.append("<root>");
    for (int i = 0; i < elementCount; i++) {
        buffer.append("<data/>");
    }
    buffer.append("</root>");
    //System.out.println(buffer.toString());
    output = buffer.toString();
    long end = System.nanoTime();
    return end - start;
}


public void testStringBuffer() {
    long total = 0;
    int counter = 0;
    while (counter++ < testcount) {
        total += doStringBuffer();
    }
    float f = (total / testcount) / 1000;
    System.out.printf("StringBuffer build & output duration= %f µs%n%n", f);
}

private long doStringBuffer() {
    long start = System.nanoTime();
    StringBuffer buffer = new StringBuffer("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
    buffer.append("<root>");
    for (int i = 0; i < elementCount; i++) {
        buffer.append("<data/>");
    }
    buffer.append("</root>");
    //System.out.println(buffer.toString());
    output = buffer.toString();
    long end = System.nanoTime();
    return end - start;
}

Results:

On OSX machine:

StringBuilder build & output duration= 1047.000000 µs 

StringBuffer build & output duration= 1844.000000 µs 


On Win7 machine:
StringBuilder build & output duration= 1869.000000 µs 

StringBuffer build & output duration= 2122.000000 µs

So it looks like the performance enhancement might be platform-specific, dependent on how the JVM implements synchronisation.

References:

Use of System.nanoTime() has been covered here -> Is System.nanoTime() completely useless? and here -> How do I time a method's execution in Java?.

Source for StringBuilder & StringBuffer here -> http://www.java2s.com/Open-Source/Java-Document/6.0-JDK-Core/lang/java.lang.htm

Good overview of synchronising here -> http://www.javaworld.com/javaworld/jw-07-1997/jw-07-hood.html?page=1
