Unexpected long process time when reading a byte array
I have a program that need to handle byte array source. Originally the program work fine when the byte array size is 3000 byte. Now data size increase and the array size need to be changed from 3000 to 30000 (10 times).
I make a sample benchmark program to test looping time. I suppose required CPU time should be increase linearly according to the array size, but the benchmark program show that process 30000 bytes require much more than 35 times compared to process 3000 bytes.
Here is my bench mark program. Could the program improve so that it use around 10 times CPU time only?
public static void main(String args[])
int TestArraySize=30000;
String strFinalMessage="";
// create a dummy byte array
byte[] bytearrayMessageContent = new byte[TestArraySize];
fo开发者_Go百科r (int i=0; i<TestArraySize; i++) {
// fill character A-J into the dummy array
bytearrayMessageContent[i] = (byte) (i%10+65);
}
System.out.println(bytearrayMessageContent.length);
// time start time
long lngCurrentTime = System.currentTimeMillis();
// process the byte array
int intTHMessageLenAdj = TestArraySize;
try {
InputStream input = new ByteArrayInputStream(bytearrayMessageContent);
while (intTHMessageLenAdj > 0) {
// get random length of bytes to process
int RandomLength = getNextRandom();
if (RandomLength > intTHMessageLenAdj) {
RandomLength = intTHMessageLenAdj;
}
// get the bytes to be process in a byte array and process it
byte[] bytearrayMsgTrunk = new byte[RandomLength];
input.read(bytearrayMsgTrunk);
// do some logic here
strFinalMessage += new String(bytearrayMsgTrunk) + "||";
// repeat looping until all bytes are read
intTHMessageLenAdj -= RandomLength;
}
input.close();
} catch (Exception ex) {
ex.printStackTrace();
}
// time end time
lngCurrentTime = System.currentTimeMillis() - lngCurrentTime;
//System.out.println(strFinalMessage);
System.out.println(lngCurrentTime);
}
public static int getNextRandom() {
// info is arround 4 bytes size
Random random = new Random();
return random.nextInt(8);
}
Well, there are several problems here:
- You're using the default platform encoding for strings. Don't do that. Specify a particular string encoding to convert between bytes and text.
- Don't concatenate strings in a loop like this - use
StringBuilder
. - You're ignoring the return value of
InputStream.Read
. That may be okay when reading fromByteArrayInputStream
, but you generally shouldn't rely on it. - You're creating a new instance of
Random
each time. I believe this is okay as of Java 6, but will give you repeated values in earlier versions. It's generally a bad idea, basically. Use one instance ofRandom
repeatedly.
I suppose required CPU time should be increase linearly according to the array size, but the benchmark program show that process 30000 bytes require much more than 35 times compared to process 3000 bytes.
Actually, I'd expect it to grow quadratically with the array size. If you were to profile the program, you'd probably find that a significant proportion of the time is going in calls to String.concat
. And as the array gets bigger, the proportion will increase.
Basically, each time you do a String concatenation, you copy all of the characters you've accumulated so far into a new String ... and throw away the previous one. It is not hard to see that that part of the code is O(N**2)
, where N
is the array size.
Replace the String declaration and concatenations with this:
// allocate the builder with extra space to hold the '||' characters
StringBuilder sb= new StringBuilder(TestArraySize * 3 / 2);
...
// this replaces the concatenation.
sb.append(new String(bytearrayMsgTrunk);
sb.append("||");
...
// this does a final copy of the characters to create a new String
String strFinalMessage = sb.toString();
Why do you think you need to increase your buffer size because the data size increases? That just ain't so.
精彩评论