String transformations and locales on Android
I have an Android app that is the "server" in a client/server design. In the app, I need to compute an MD5 hash against a set of strings and return the result to the client in order to let the conversation between them to continue. My code to do this has been pieced together from numerous examples out there. The algorithm of computing the hash (not designed by me) goes like this:
- Convert the string into an array of bytes
- Use the MessageDigest class to generate a hash
- Convert resulting hash back to a string
The hash seems to be correct for 99% of my customers. One of the customers seeing the wrong hash is running with a German locale, and it started to make me wonder if language could be factoring into the result I get. This is the code to make the byte array out of the string:
public static byte[] hexStringToByteArray(String s)
{
byte[] data = null;
if(s.length() % 2 != 0)
{
s = "0" + s;
}
int len = s.length();
data = new byte[len / 2];
for (int i = 0; i < len; i += 2)
{
data[i / 2] = (byte) ((Character.digit(s.charAt(i), 16) << 4)
+ Character.digit(s.charAt(i+1), 16));
}
return data;
}
And here's the current version of the hashing function:
public static String hashDataAsString(String dataToHash)
{
MessageDigest messageDigest;
try
{
messageDigest = MessageDigest.getInstance("MD5");
messageDigest.reset();
byte[] data = hexStringToByteArray(dataToHash);
messageDigest.update(data);
final byte[] resultByte = messageDigest.digest();
return new String(Hex.encodeHex(resultByte));
}
catch(NoSuchAlgorithmException e)
{
throw new RuntimeException("Failed to hash data values", e);
}
}
I'm using the Hex.encodeHex function from Apache Commons.
I've tried switching my phone to a German locale, but my unit tests still produce the corr开发者_运维知识库ect hash result. This customer is using stock Froyo, so that eliminates the risk that a custom ROM is at fault here. I've also found this alternative for converting from bytes to a string:
public static String MD5_Hash(String s) {
MessageDigest m = null;
try {
m = MessageDigest.getInstance("MD5");
} catch (NoSuchAlgorithmException e) {
e.printStackTrace();
}
//m.update(s.getBytes(),0,s.length());
byte [] data = hexStringToByteArray(s);
m.update(data, 0, data.length);
String hash = new BigInteger(1, m.digest()).toString(16);
return hash;
}
In my unit tests, it results in the same answer. Could BigInteger be a safer alternative to use here?
In your hashDataAsString
method, do you need to do hexStringToByteArray
? Is the incoming data a hex string or just an arbitrary string? Could you not use String.getBytes()?
If you are doing string/byte conversions, do you know the encoding of the incoming data and the encoding assumptions of your data consumers? Do you need to use a consistent encoding at both ends (e.g. ASCII or UTF-8)?
Do you include non-ASCII data in your unit tests?
精彩评论