Decoding a compressed short string; uncertain on compression used - Updated
I have a program that is compressing a string in an unknown way. I know a few inputs and the output produced, but I am not sure what is being used to compress the string.
Here are my examples.
(just 38 x a, no spaces or anything else)
In: "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
Out: "21 1A A6 30 00"
(just 32 x a)
In: "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
Out: "1c 1a a7 a0 00"
(31 x a, then 1 b)
In: "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab"
Out: "01 77 c5 53 c0 00"
(31 x b, then 1 a)
In: "bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbba"
Out: "1e 77 54 f3 80 00"
In: "Hey wot u doing 2day u wanna do something"
Out: "11 C7 C6 2E 78 CE 6B 8E 3A CD 83 E8 1B 37 C5 C5 A6 B9 D1 E1 B0 69 63 DB 5E 71 15 5C 10 00"
(same as previous string, but with a space at the end)
In: "Hey wot u doing 2day u wanna do something "
Out: "12 C7 71 8B 9E 33 9A E2 EB 36 0F A0 2C DF 17 17 7A 67 47 86 DF 4B 1E DA F3 88 AA E0 80 00"
Any help or advice would be great, thanks! Also, it may help to know that these come from a BlackBerry 8120.
It's unlikely that anyone can work out which compression algorithm is being used just by looking at the supplied strings.
Assuming the strings are not also encrypted (that is, they are merely transformed by an algorithm that takes no key or other secret as input), the only approach I can think of is brute force: write some code that runs the known inputs through different compression algorithms and compare the results with the known outputs. It does not seem to be the DEFLATE algorithm (LZ77 plus Huffman coding) used by the .NET DeflateStream and GZipStream classes, so you can skip at least one ;)
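The brute-force idea can be sketched roughly like this in Python. This is only a starting point, not a known solution: the candidate list is my own pick of codecs from the Python standard library, and the names `SAMPLES`, `CANDIDATES`, `raw_deflate` and `matching_codecs` are made up for this example. The four samples are the repeated-character pairs from the question.

```python
import bz2
import lzma
import zlib

# Known (input, output) pairs copied from the question.
SAMPLES = [
    ("a" * 38, "21 1A A6 30 00"),
    ("a" * 32, "1c 1a a7 a0 00"),
    ("a" * 31 + "b", "01 77 c5 53 c0 00"),
    ("b" * 31 + "a", "1e 77 54 f3 80 00"),
]


def raw_deflate(data: bytes) -> bytes:
    """DEFLATE stream without the zlib header and checksum."""
    # A negative wbits value asks zlib for a raw stream.
    compressor = zlib.compressobj(9, zlib.DEFLATED, -15)
    return compressor.compress(data) + compressor.flush()


# Candidate codecs (an assumption; extend with anything else you suspect).
CANDIDATES = {
    "zlib": zlib.compress,
    "raw deflate": raw_deflate,
    "bz2": bz2.compress,
    "lzma": lzma.compress,
}


def matching_codecs(samples=SAMPLES, candidates=CANDIDATES):
    """Return the names of codecs that reproduce every known output."""
    matches = []
    for name, compress in candidates.items():
        if all(
            compress(text.encode("ascii")) == bytes.fromhex(expected)
            for text, expected in samples
        ):
            matches.append(name)
    return matches


if __name__ == "__main__":
    print(matching_codecs())
```

None of these four general-purpose codecs reproduces the samples (their output is longer than the 5 or 6 bytes shown in the question, or starts with a different byte), so the real work is adding more candidates to the dictionary, for example whatever the device's SDK documents.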
My recommendation would be to look at the BlackBerry SDK and find out what algorithms it supports, as it's likely to be one of those.
You may also find this tutorial to be of interest: Hacking Data Compression