MD5 Hash of ISO-8859-1 string in Java

2022-12-12 23:46 问答作者：

I'm implementing an interface for digital payment service called Suomen Verkkomaksut. The information about the payment is sent to them via HTML form. To ensure that no one messes with the information during the transfer a MD5 hash is calculated at both ends with a special key that is not sent to them.

My problem is that for some reason they seem to decide that the incoming data is encoded with ISO-8859-1 and not UTF-8. The hash that I sent to them is calculated with UTF-8 strings so it differs from the hash that they calculate.

I tried this with following code:

String prehash = "6pKF4jkv97zmqBJ3ZL8gUw5DfT2NMQ|13466|123456||Testitilaus|EUR|http://www.esimerkki.fi/success|http://www.esimerkki.fi/cancel|http://www.esimerkki.fi/notify|5.1|fi_FI|0412345678|0412345678|esimerkki@esimerkki.fi|Matti|Meikäläinen||Testikatu 1|40500|Jyväskylä|FI|1|2|Tuote #101|101|1|10.00|22.00|0|1|Tuote #202|202|2|8.50|22.00|0|1";
String prehashIso = new String(prehash.getBytes("ISO-8859-1"), "ISO-8859-1");

String hash = Crypt.md5sum(prehash).toUpperCase(); 
String hashIso = Crypt.md5sum(prehashIso).toUpperCase();

Unfortunately both hashes are identical with value C83CF67455AF10913D54252737F30E21. The correct value for this example case is 975816A41B9EB79B18B3B4526569640E according to Suomen Verkkomaksut's documentation.

Is there a way to calculate MD5 hash in Java with ISO-8859-1 strings?

UPDATE: While waiting answer from Suomen Verkkomaksut, I found an alternative way to make the hash. Michael Borgwardt corrected my understanding of String and encodings and I looked for a way to make the hash from byte[].

Apache Commons is an excellent source of libraries and I found their DigestUtils class which has a md5hex function which takes byte[] input and returns a 32 character hex string.

For some reason this still doesn't work. Both of these return the same value:

DigestUtils.md5Hex(prehash.getBytes());
DigestUtils.md5Hex(prehash.g开发者_StackOverflow中文版etBytes("ISO-8859-1"));

You seem to misunderstand how string encoding works, and your Crypt class's API is suspect.

Strings don't really "have an encoding" - an encoding is what you use to convert between Strings and bytes.

Java Strings are internally stored as UTF-16, but that does not really matter, as MD5 works on bytes, not Strings. Your Crypt.md5sum() method has to convert the Strings it's passed to bytes first - what encoding does it use to do that? That's probably the source of your problem.

Your example code is pretty nonsensical as the only effect this line has:

String prehashIso = new String(prehash.getBytes("ISO-8859-1"), "ISO-8859-1");

is to replace characters that cannot be represented in ISO-8859-1 with question marks.

Java has a standard java.security.MessageDigest class, for calculating different hashes.

Here is the sample code

include java.security.MessageDigest;

// Exception handling not shown

String prehash = ...

final byte[] prehashBytes= prehash.getBytes( "iso-8859-1" );

System.out.println( prehash.length( ) );
System.out.println( prehashBytes.length );

final MessageDigest digester = MessageDigest.getInstance( "MD5" );

digester.update( prehashBytes );

final byte[] digest = digester.digest( );

final StringBuffer hexString = new StringBuffer();

for ( final byte b : digest ) {
    final int intByte = 0xFF & b;

    if ( intByte < 10 )
    {
        hexString.append( "0" );
    }

    hexString.append(
        Integer.toHexString( intByte )
    );
}

System.out.println( hexString.toString( ).toUpperCase( ) );

Unfortunately for you it produces the same "C83CF67455AF10913D54252737F30E21" hash. So, I guess your Crypto class is exonerated. I specifically added the prehash and prehashBytes length printouts to verify that indeed 'ISO-8859-1' is used. In this case both are 328.

When I did presash.getBytes( "utf-8" ) it produced "9CC2E0D1D41E67BE9C2AB4AABDB6FD3" (and the length of the byte array became 332). Again, not the result you are looking for.

So, I guess Suomen Verkkomaksut does some massaging of the prehash string that they did not document, or you have overlooked.

Not sure if you solved your problem, but I had a similar problem with ISO-8859-1 encoded strings with nordic ä & ö characters and calculating a SHA-256 hash to compare with stuff in documentation. The following snippet worked for me:

import java.security.MessageDigest;
//imports omitted

@Test
public void test() throws ProcessingException{
String test = "iamastringwithäöchars";           
System.out.println(this.digest(test));      
}

public String digest(String data) throws ProcessingException {
    MessageDigest hash = null;

    try{
        hash = MessageDigest.getInstance("SHA-256");
    }
    catch(Throwable throwable){
        throw new ProcessingException(throwable);
    }
    byte[] digested = null;
    try {
        digested = hash.digest(data.getBytes("ISO-8859-1"));
    } catch (UnsupportedEncodingException e) {
        e.printStackTrace();
    }

    String ret = BinaryUtils.BinToHexString(digested);
    return ret;
}

To transform bytes to hex string there are many options, including the apache commons codec Hex class mentioned in this thread.

If you send UTF-8 encoded data that they treat as ISO-8859-1 then that could be the source of your problem. I suggest you either send the data in ISO-8859-1 or try to communicate to Suomen Verkkomaksut that you're sending UTF-8. In a http-based protocol you do this by adding charset=utf-8 to Content-Type in the HTTP header.

A way to rule out some issues would be to try a prehash String that only contains characters that are encoded the same in UTF-8 and ISO-8859-1. From what I can see you can achieve this by removing all "ä" characters in the string you'e used.

继续阅读：iso-8859-1 utf-8

MD5 Hash of ISO-8859-1 string in Java

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？