Best compression technique for binary data? [closed]
Closed 5 years ago.
I have a large binary file that represents the alpha channel for each pixel in an image: 0 for transparent, 1 for anything else. This binary data needs to be dynamically loaded from a text file, and it would be useful to get the maximum possible compression for it. Decompression times aren't a major concern (unless we're talking about a jump from, say, a minute to an hour), but the files need to be as small as possible.
Methods we've tried so far are run-length encoding, then Huffman coding, then converting the binary data to base64, and finally run-length encoding that differentiates between zero and one by using numeric values for one and their alphabetical equivalents for zero (this seems to give the best results). However, we're wondering if there's a better solution than any of these, as we're approaching it from a logical standpoint rather than looking at all possible methods.
As external libraries were out of the question, I created a custom solution for this. The system uses run-length encoding to compress the data, and the RLE-encoded data is then represented in base32 (32 characters for the zeroes, and a matching set of 32 for the ones). This allowed us to represent files of approximately 5 MB in only around 30 KB, without any loss.
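The scheme described above could look roughly like the following Python sketch. The two alphabets and the rule for splitting runs longer than 32 are assumptions made for illustration; the original post does not say which characters were used or how long runs were handled.

```python
import string
from itertools import groupby

# Hypothetical alphabets (not from the original post): 32 symbols per bit
# value, 64 distinct characters in total.
ZERO_ALPHABET = string.ascii_lowercase + "012345"
ONE_ALPHABET = string.ascii_uppercase + "6789+/"
MAX_RUN = 32  # the character at index i stands for a run of length i + 1


def encode(bits: str) -> str:
    """Run-length encode a string of '0'/'1', one character per run chunk."""
    out = []
    for bit, group in groupby(bits):
        run = sum(1 for _ in group)
        alphabet = ONE_ALPHABET if bit == "1" else ZERO_ALPHABET
        # Assumption: runs longer than 32 are split into several characters.
        while run > 0:
            chunk = min(run, MAX_RUN)
            out.append(alphabet[chunk - 1])
            run -= chunk
    return "".join(out)


def decode(text: str) -> str:
    """Expand each character back into its run of zeros or ones."""
    out = []
    for ch in text:
        if ch in ZERO_ALPHABET:
            out.append("0" * (ZERO_ALPHABET.index(ch) + 1))
        else:
            # Raises ValueError for characters outside both alphabets.
            out.append("1" * (ONE_ALPHABET.index(ch) + 1))
    return "".join(out)
```

With these particular alphabets, the 19-bit string 1111111111011100000 encodes to the four characters JaCe, and decode("JaCe") gives the original string back. Because the character itself says whether the run is zeros or ones, consecutive chunks of the same long run stay unambiguous.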
I agree, you would be best off using an existing, proven image format. If you must do it yourself, you will probably still end up with something very close to some existing technology.
I would store how many times the following value is repeated, as count/value pairs: |10|1|1|0|3|1|5|0
would produce
1111111111011100000
But if you look at this and optimize it at the byte level, you will soon see that this is almost exactly what RLE compression does. So, long answer made short: take a look at RLE ;)
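The count/value pairs above can be sketched in a few lines of Python. The function names are made up for illustration, and the pairs are kept as a plain list of tuples rather than packed into bytes.

```python
from itertools import groupby


def rle_encode(bits: str) -> list[tuple[int, int]]:
    """Turn a string of '0'/'1' into (count, value) pairs."""
    return [(sum(1 for _ in group), int(value)) for value, group in groupby(bits)]


def rle_decode(pairs: list[tuple[int, int]]) -> str:
    """Expand (count, value) pairs back into the bit string."""
    return "".join(str(value) * count for count, value in pairs)


# The example from this answer: |10|1|1|0|3|1|5|0
pairs = [(10, 1), (1, 0), (3, 1), (5, 0)]
bits = rle_decode(pairs)
```

Here bits is 1111111111011100000, and rle_encode(bits) returns the same four pairs again.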
Good luck!
Check out 7-Zip. It has very good compression ratios, in some cases producing files a tenth the size of zip, and has bindings for many programming languages.
http://www.7-zip.org/sdk.html
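As a rough way to try this idea without the SDK: 7-Zip's default method is LZMA, and Python's standard-library lzma module offers compression from the same family. The sketch below does not use the 7-Zip SDK itself; the helper names and the choice to pack eight mask bits per byte before compressing are assumptions for illustration.

```python
import lzma


def pack_bits(bits: str) -> bytes:
    """Pack a string of '0'/'1' into bytes, zero-padding the final byte."""
    padded = bits + "0" * (-len(bits) % 8)
    return bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))


def compress_mask(bits: str) -> bytes:
    """Pack the mask and compress it with the strongest LZMA preset."""
    return lzma.compress(pack_bits(bits), preset=9 | lzma.PRESET_EXTREME)


def decompress_mask(blob: bytes, bit_count: int) -> str:
    """Invert compress_mask; bit_count removes the padding bits."""
    raw = lzma.decompress(blob)
    return "".join(f"{byte:08b}" for byte in raw)[:bit_count]
```

The bit count has to be stored alongside the compressed data so the padding can be dropped again. If the result must live in a text file, it would also need a text encoding such as base64, which adds roughly a third to the size.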
There are some comparative tests of lossless compressors for photographic images. You can look at one of them at: http://qlic.altervista.org/LPCB.html
You will see that there are dozens of such archivers. For everyday use I'd recommend 7-Zip.