Please help identify multi-byte character encoding scheme on ASP Classic page
I'm working with a 3rd party (Commidea.com) payment processing system and one of the parameters being sent along with the processing result is a "signature" field. This is used to provide a SHA1 hash of the result message wrapped in an RSA encrypted envelope to provide both integrity and authenticity control. I have the API from Commidea but it doesn't give details of encoding and uses artificially created signatures derived from Base64 strings to illustrate the examples.
I'm struggling to work out what encoding is being used on this parameter and hoped someone might recognise the quite distinctive pattern. I initially thought it was UTF8 but having looked at the individual characters I am less sure.
Here is a short sample of the content which was created by the following code where I am looping through each "byte" in the string:
sig = Request.Form("signature")
For x = 1 To LenB(sig)
s = s & AscB(MidB(sig,x,1)) & ","
Next
' Print s to a debug log file
When I look in the log I get something like this:
129,0,144,0,187,0,67,0,234,0,71,0,197,0,208,0,191,0,9,0,43,0,230,0,19,32,195,0,248,0,102,0,183,0,73,0,192,0,73,0,175,0,34,0,163,0,174,0,218,0,230,0,157,0,229,0,234,0,182,0,26,32,42,0,123,0,217,0,143,0,65,0,42,0,239,0,90,0,92,0,57,0,111,0,218,0,31,0,216,0,57,32,117,0,160,0,244,0,29,0,58,32,56,0,36,0,48,0,160,0,233,0,173,0,2,0,34,32,204,0,221,0,246,0,68,0,238,0,28,0,4,0,92,0,29,32,5,0,102,0,98,0,33,0,5,0,53,0,192,0,64,0,212,0,111,0,31,0,219,0,48,32,29,32,89,0,187,0,48,0,28,0,57,32,213,0,206,0,45,0,46,0,88,0,96,0,34,0,235,0,184,0,16,0,187,0,122,0,33,32,50,0,69,0,160,0,11,0,39,0,172,0,176,0,113,0,39,0,218,0,13,0,239,0,30,32,96,0,41,0,233,0,214,0,34,0,191,0,173,0,235,0,126,0,62,0,249,0,87,0,24,0,119,0,82,0
Note that every other value is a zero except occasionally where it is 32 (0x20). I'm familiar with UTF8 where it represents characters above 127 by using two bytes but if this was UTF8 encoding then I would expect the "32" value to be more like 194 (0xC2) or (0xC3) and the other value would be greater than 0x80.
Ultimately what I'm trying to do is convert this signature parameter into a hex encoded string (eg. "12ab0528...") which is then used by the RSA/SHA1 function to verify the message is intact. This part is already working but I can't for the life of me figure out how to get the signature parameter decoded.
For historical reasons we are having to use classic ASP and the SHA1/RSA functions are javascript based.
Any help would be much appreciated. Regards, Craig.
Update: Tried looking into UTF-16 encoding on Wikipedia and other sites. Can't find anything to explain why I am seeing only 0x20 or 0x00 in the (assumed) high order byte positions. I don't think this is relevant any more as the example below shows other values in this high order position.
Tried adding some code to log the values using Asc instead of AscB (Len,Mid instead of LenB,MidB too). Got some surprising results. Here is a new stream of byte-wise characters followed by the equivalent stream of word-wise (if you know what I mean) characters.
21,0,83,1,214,0,201,0,88,0,172,0,98,0,182,0,43,0,103,0,88,0,103,0,34,33,88,0,254,0,173,0,188,0,44,0,66,0,120,1,246,0,64,0,47,0,110,0,160,0,84,0,4,0,201,0,176,0,251,0,166,0,211,0,67,0,115,0,209,0,53,0,12,0,243,0,6,0,78,0,106,0,250,0,19,0,204,0,235,0,28,0,243,0,165,0,94,0,60,0,82,0,82,0,172,32,248,0,220,2,176,0,141,0,239,0,34,33,47,0,61,0,72,0,248,0,230,0,191,0,219,0,61,0,105,0,246,0,3,0,57,32,54,0,34,33,127,0,224,0,17,0,224,0,76,0,51,0,91,0,210,0,35,0,89,0,178,0,235,0,161,0,114,0,195,0,119,0,69,0,32,32,188,0,82,0,237,0,183,0,220,0,83,1,10,0,94,0,239,0,187,0,178,0,19,0,168,0,211,0,110,0,101,0,233,0,83,0,75,0,218,0,4,0,241,0,58,0,170,0,168,0,82,0,61,0,35,0,184,0,240,0,117,0,76,0,32,0,247,0,74,0,64,0,163,0
And now the word-wise data stream:
21,156,214,201,88,172,98,182,43,103,88,103,153,88,254,173,188,44,66,159,246,64,47,110,160,84,4,201,176,251,166,211,67,115,209,53,12,243,6,78,106,250,19,204,235,28,243,165,94,60,82,82,128,248,152,176,141,239,153,47,61,72,248,230,191,219,61,105,246,3,139,54,153,127,224,17,224,76,51,91,210,35,89,178,235,161,114,195,119,69,134,188,82,237,183,220,156,10,94,239,187,178,19,168,211,110,101,233,83,75,218,4,241,58,170,168,82,61,35,184,240,117,76,32,247,74,64,163
Note the second pair of byte-wise characters (83,1) seem to be interpreted as 156 in the word-wise stream. We also see (34,33) as 153 and (120,1) as 159 and (220,2) as 152. Does this give any clues as the encoding? Why are these 15[2369] values apparently being treated differently from other valu开发者_运维百科es?
What I'm trying to figure out is whether I should use the byte-wise data and carry out some post-processing to get back to the intended values or if I should trust the word-wise data with whatever implicit decoding it is apparently performing. At the moment, neither seem to give me a match between data content and signature so I need to change something.
Thanks.
Quick observation tells me that you are likely dealing with UTF-16. Start from there.
精彩评论