MD5 with ASCII Char

2023-01-26 18:09

I have a string

    wDevCopyright = [NSString stringWithFormat:@"Copyright: %c 1995 by WIRELESS.dev, Corp Communications Inc., All rights reserved.",0xa9];

and to munge it I call

-(NSString *)getMD5:(NSString *)source
{
    const char *src = [source UTF8String];
    unsigned char result[CC_MD5_DIGEST_LENGTH];
    CC_MD5(src, (CC_LONG)strlen(src), result);

    return [NSString stringWithFormat:
        @"%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x",
        result[0], result[1], result[2], result[3],
        result[4], result[5], result[6], result[7],
        result[8], result[9], result[10], result[11],
        result[12], result[13], result[14], result[15]
    ];
}

Because of the 0xa9 character, const char *src = [source UTF8String] does not produce the bytes I expect for the string, so the resulting hash does not match the one computed on other platforms.

I tried to encode the char with NSASCIIStringEncoding but it broke the code.

How do I call CC_MD5 with a string that contains a character such as 0xa9 and get the same hash as in Java?

Update to code request:

Java

private static char[] kTestASCII = {
        169
        };

System.out.println("\n\n>>>>> msg## " + (char)0xa9 + " " + (char)169 + "\n  md5 " + md5(new String(kTestASCII), false)); // unicode = false

Result >>>>> msg## \251 \251 md5 a252c2c85a9e7756d5ba5da9949d57ed

ObjC

     char kTestASCII [] = {
            169
        };


NSString *testString = [NSString stringWithCString:kTestASCII encoding:NSUTF8StringEncoding];

NSLog(@">>>> objC msg## int %d char %c md5: %@", 0xa9, 169, [self getMD5:testString]);

Result >>>> objC msg## int 169 char © md5: 9b759040321a408a5c7768b4511287a6

Note: as stated earlier, without the 0xa9 the hashes in Java and ObjC are the same. I am trying to get the same hash for 0xa9 in Java and ObjC.

Java MD5 code

private static char[] kTestASCII = {
    169
    };

md5(new String(kTestASCII), false);

    /**
     * Compute the MD5 hash for the given String.
     * @param value the string to add to the digest
     * @param unicode true if the string is Unicode, false for ASCII strings
     */
    public synchronized final String md5(String value, boolean unicode)
    {
        MD5();
        MD5.update(value, unicode);
        return WUtilities.toHex(MD5.finish());

    }
    public synchronized void update(String s, boolean unicode)
{


    if (unicode)
    {
        char[] c = new char[s.length()];
        s.getChars(0, c.length, c, 0);
        update(c);
    }
    else
    {
        byte[] b = new byte[s.length()];
        s.getBytes(0, b.length, b, 0);
        update(b);
    }
}

public synchronized void update(byte[] b)
{
    update(b, 0, b.length);
}

//--------------------------------------------------------------------------------

/**
 * Add a byte sub-array to the digest.
 */
public synchronized void update(byte[] b, int offset, int length)
{
    for (int n = offset; n < offset + length; n++)
        update(b[n]);
}

/**
 * Add a byte to the digest.
 */
public synchronized void update(byte b)
{
    int index = (int)((count >>> 3) & 0x03f);
    count += 8;
    buffer[index] = b;
    if (index >= 63)
        transform();
}

I believe that my issue is with using NSData withEncoding as opposed to a C char[] or the Java byte[]. So what is the best way to roll my own bytes into a byte[] in objC?

The character you are having problems with, ©, is the Unicode COPYRIGHT SIGN (00A9). The correct UTF-8 encoding of this character is the byte sequence 0xc2 0xa9.

You are attempting, however, to convert from the single-byte sequence 0xa9, which is not a valid UTF-8 encoding of any character. See table 3-7 of http://www.unicode.org/versions/Unicode5.2.0/ch03.pdf#G7404 . Since this is not a valid UTF-8 byte sequence, stringWithCString is converting your input to the Unicode REPLACEMENT CHARACTER (FFFD). When this character is then encoded back into UTF-8, it yields the byte sequence 0xef 0xbf 0xbd. The MD5 of this sequence is 9b759040321a408a5c7768b4511287a6, as reported by your Objective-C example.
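To see the replacement happen, here is a small Java sketch. It assumes Java's String(byte[], Charset) constructor is a fair stand-in for what stringWithCString does here, since it also substitutes U+FFFD for malformed input. The class and method names (InvalidUtf8Demo, hex, roundTripUtf8) are made up for the illustration.

```java
import java.nio.charset.StandardCharsets;

class InvalidUtf8Demo {
    // Formats bytes as lowercase hex, e.g. "efbfbd".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Decodes raw bytes as UTF-8, then encodes the resulting string back to UTF-8.
    // Malformed input is replaced with U+FFFD during decoding.
    static byte[] roundTripUtf8(byte[] raw) {
        String decoded = new String(raw, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] lone = { (byte) 0xa9 };   // not valid UTF-8 on its own
        System.out.println(hex(roundTripUtf8(lone)));                        // efbfbd
        System.out.println(hex("\u00a9".getBytes(StandardCharsets.UTF_8)));  // c2a9
    }
}
```

The first line shows that the lone byte does not survive the round trip; the second shows the bytes a real COPYRIGHT SIGN produces in UTF-8.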

Your Java example yields an MD5 of a252c2c85a9e7756d5ba5da9949d57ed, which simple experimentation shows is the MD5 of the byte sequence 0xa9, which I have already noted is not a valid UTF-8 representation of the desired character.
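If you want to repeat that experiment yourself, something along these lines should do. It is only a sketch using the standard java.security.MessageDigest class, not your custom MD5 class, and the names Md5OfBytes and md5Hex are invented for the example. It hashes the three byte sequences discussed so far so you can compare them with the two hashes in your question.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class Md5OfBytes {
    // Returns the MD5 digest of the given bytes as 32 lowercase hex digits.
    static String md5Hex(byte[] input) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(input);
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on standard Java platforms.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        // The single byte the Java md5() appears to hash.
        System.out.println("a9       -> " + md5Hex(new byte[] { (byte) 0xa9 }));
        // The UTF-8 encoding of U+00A9.
        System.out.println("c2 a9    -> " + md5Hex(new byte[] { (byte) 0xc2, (byte) 0xa9 }));
        // The UTF-8 encoding of U+FFFD, which the Objective-C code appears to hash.
        System.out.println("ef bf bd -> " + md5Hex(new byte[] { (byte) 0xef, (byte) 0xbf, (byte) 0xbd }));
    }
}
```

If the reasoning above holds, the first line should match your Java result and the third your Objective-C result. The point is that the hash depends only on the bytes, so the two sides must agree on the encoding first.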

I think we need to see the implementation of the Java md5() method you are using. I suspect it is simply dropping the high bytes of every Unicode character to convert to a byte sequence for passing to the MessageDigest class. This does not match your Objective-C implementation where you are using a UTF-8 encoding.

Note: even if you fix your Objective-C implementation to match the encoding of your Java md5() method, your test will need some adjustment because you cannot use stringWithCString with the NSUTF8StringEncoding encoding to convert the byte sequence 0xa9 to an NSString.

UPDATE

Having now seen the Java implementation using the deprecated getBytes method, my recommendation is to change the Java implementation, if at all possible, to use a proper UTF-8 encoding.
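A UTF-8 based version on the Java side could look roughly like this. It is a sketch that swaps your custom MD5 class for the standard MessageDigest, so treat Utf8Md5 and md5Utf8 as placeholder names rather than a drop-in replacement.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class Utf8Md5 {
    // Hashes the UTF-8 encoding of the string and returns 32 lowercase hex digits.
    static String md5Utf8(String value) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(value.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        // Hashes the bytes c2 a9, the same bytes [source UTF8String] yields for a real ©.
        System.out.println(md5Utf8("\u00a9"));
    }
}
```

With both sides hashing UTF-8 bytes, the getMD5: method from the question should not need to change, provided the NSString really contains U+00A9.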

I suspect, however, that your requirements are to match the current Java implementation, even if it is wrong. Therefore, I suggest you duplicate the bad behavior of Java's deprecated getBytes() method by using NSString getCharacters:range: to retrieve an array of unichars, then manually create an array of bytes by taking the low byte of each unichar.
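For reference, the truncation that the deprecated getBytes(int, int, byte[], int) performs can be written out explicitly. The sketch below is in Java only to show the behavior you would be copying; LowByteEncoding and lowBytes are made-up names, and the Objective-C version would do the same loop over unichars.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class LowByteEncoding {
    // Keeps only the low 8 bits of each UTF-16 code unit, discarding the high byte.
    static byte[] lowBytes(String s) {
        byte[] out = new byte[s.length()];
        for (int i = 0; i < s.length(); i++) {
            out[i] = (byte) (s.charAt(i) & 0xff);
        }
        return out;
    }

    public static void main(String[] args) {
        // U+00A9 fits in one byte, so truncation and ISO-8859-1 agree.
        System.out.println(Arrays.equals(
                lowBytes("\u00a9"),
                "\u00a9".getBytes(StandardCharsets.ISO_8859_1)));  // true
        // U+20AC (euro sign) does not fit: truncation keeps 0xac, shown as a signed byte.
        System.out.println(Arrays.toString(lowBytes("\u20ac")));   // [-84]
    }
}
```

Note that this is lossy for anything above U+00FF, which is why it only looks correct for Latin-1 text like your copyright string.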

stringWithCString requires a null terminated C-String. I don't think that kTestASCII[] is necessarily null terminated in your Objective-C code. Perhaps that is the cause of the difference.

Try:

char kTestASCII [] = {
            169,
            0
        };

Thanks to GBegan's explanation - here is my solution

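NSMutableData *bytes = [NSMutableData dataWithCapacity:[s length]];
for (NSUInteger i = 0; i < [s length]; i++) {
    // Keep only the low byte of each unichar, as Java's deprecated getBytes does.
    unsigned char b = (unsigned char)([s characterAtIndex:i] & 0xff);
    [bytes appendBytes:&b length:1];
}
// Hash the collected bytes instead of a UTF-8 C string.
unsigned char result[CC_MD5_DIGEST_LENGTH];
CC_MD5([bytes bytes], (CC_LONG)[bytes length], result);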
