Is it faster to compare Strings or byte arrays?

This might sound like an odd question, but is it faster to compare two Strings or two byte[]s (using Arrays.equals())? I'm working with Hadoop/HBase; I get a byte[] as the value from HBase, and I have a value that is passed in. Will it be faster to convert the value I get to a String and compare, or to compare the two as byte arrays?


Without actually testing this, it would seem that Arrays.equals() is your friend. To make a String, the constructor has to decode the bytes into characters, which means looking up a decoder for the charset (the platform default if you don't name one) and allocating new storage for the decoded result. Only then can you call equals(), which iterates over the characters of both strings.

In big-O terms both approaches are linear in the length of the data, but the String route has to read every byte in the array just to do the conversion, before the comparison even starts. So I'd say converting to String for equals() is strictly more work.

Update: Given the comments added to the question, it sounds like you are given a String and compare it against multiple results in the MapReduce job. In that case you need one conversion of the input String to bytes and then multiple byte array comparisons. That seems faster than keeping the input as a String and converting every byte array returned in the job.
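The "convert once, compare many" idea could look roughly like the sketch below. It uses only the JDK, and it assumes the stored values were written as UTF-8; the class name, method name, and sample values are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

class EncodeOnceSketch {
    // Encode the input String a single time, then compare raw bytes
    // against each stored value without creating any new Strings.
    static int countMatches(String input, List<byte[]> storedValues) {
        // Assumption: the stored values were encoded as UTF-8.
        byte[] needle = input.getBytes(StandardCharsets.UTF_8);
        int matches = 0;
        for (byte[] value : storedValues) {
            if (Arrays.equals(needle, value)) {
                matches++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Hypothetical values standing in for what the job would return.
        List<byte[]> stored = Arrays.asList(
                "alpha".getBytes(StandardCharsets.UTF_8),
                "beta".getBytes(StandardCharsets.UTF_8),
                "alpha".getBytes(StandardCharsets.UTF_8));
        System.out.println(countMatches("alpha", stored));
    }
}
```

The only conversion cost is the single getBytes() call, however many values come back.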


First, consider whether both values use the same encoding. If they do and you only need an equality check, go ahead with the byte comparison. But if you want the compareTo() behavior of String, you have to work out which value is greater or lesser, and in that case I would convert to String first and then compare.

If they do not use the same encoding, it's better to create Strings and compare those, since the String class does the decoding for you.
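To see why the encoding matters, here is a small sketch in which the same text is encoded two different ways. The class and method names are hypothetical, and the choice of UTF-8 and ISO-8859-1 is only an example pair; the point is that the byte comparison fails while the decoded Strings are equal.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class EncodingMismatchSketch {
    // Raw comparison: only meaningful when both sides share an encoding.
    static boolean sameBytes(byte[] a, byte[] b) {
        return Arrays.equals(a, b);
    }

    // Decode each side with its own charset, then compare the text.
    static boolean sameText(byte[] a, Charset charsetA, byte[] b, Charset charsetB) {
        return new String(a, charsetA).equals(new String(b, charsetB));
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // "café", written with an escape to avoid source-encoding issues
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);         // é takes two bytes
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);  // é takes one byte

        System.out.println(sameBytes(utf8, latin1));
        System.out.println(sameText(utf8, StandardCharsets.UTF_8,
                                    latin1, StandardCharsets.ISO_8859_1));
    }
}
```

Note that decoding only helps if you know which charset each byte array was written with.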


First, you should ask yourself if it really matters. Given that you are dealing with HBase, and thus network communication, whatever you do may be completely swamped, time-wise. Like @Clint and @Suraj, I think you're probably better off with fewer method calls (i.e. using Arrays.equals()). Just think of what has to happen when you do a String equals(), and then add in the overhead of converting the byte arrays to Strings.
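If you want a rough feel for the difference on your own machine, something like the following could be a starting point. It is a naive timing sketch, not a proper benchmark (no warm-up, no JMH), the data is synthetic, and all names are made up, so treat the numbers as a hint at best.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class CompareTimingSketch {
    // Compare raw bytes directly.
    static int countByteMatches(byte[] needle, byte[][] values) {
        int matches = 0;
        for (byte[] value : values) {
            if (Arrays.equals(needle, value)) {
                matches++;
            }
        }
        return matches;
    }

    // Decode every value to a String first, then compare.
    static int countStringMatches(String needle, byte[][] values) {
        int matches = 0;
        for (byte[] value : values) {
            if (needle.equals(new String(value, StandardCharsets.UTF_8))) {
                matches++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Synthetic stand-in for the values a job would return.
        byte[][] values = new byte[100_000][];
        for (int i = 0; i < values.length; i++) {
            values[i] = ("value-" + (i % 100)).getBytes(StandardCharsets.UTF_8);
        }
        String needle = "value-42";
        byte[] needleBytes = needle.getBytes(StandardCharsets.UTF_8);

        long t0 = System.nanoTime();
        int byteMatches = countByteMatches(needleBytes, values);
        long t1 = System.nanoTime();
        int stringMatches = countStringMatches(needle, values);
        long t2 = System.nanoTime();

        System.out.printf("bytes:   %d matches in %d us%n", byteMatches, (t1 - t0) / 1_000);
        System.out.printf("strings: %d matches in %d us%n", stringMatches, (t2 - t1) / 1_000);
    }
}
```

Whatever the two timings come out as, compare them against the cost of a single round trip to HBase before deciding whether the difference is worth caring about.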
