How can I find how many zlib files are in a single zlib file?

I would like to know how to determine how many zlib files are contained in a single file.

An example: say I have 5 different files and compress each of them separately using zlib. Then I concatenate them, so I have one file that contains 5 different zlib streams. Now, how can I find how many zlib streams are in that single file? I just need the count. I guess I need to dump its hex code and grep for some magic number, but I could not figure out how to do that.

Could you help me out?


The length of a block is not stored in the zlib encoded data (with the exception of non-compressed blocks). Instead, the end of a block is signified by an end-of-block symbol (value 256) in the stream. But this symbol is Huffman encoded, and the Huffman code is usually generated dynamically, so it can be different for each block. Furthermore, the encoded symbol might start on any bit of a byte, so there is no way to "grep" for it. The only way to find the end-of-block symbol is to decode the entire block and check when you hit it.

I think instead you should see if your container includes any length information and use that to find out how long the compressed data is.

For details of the zlib format see RFC 1950, and the related DEFLATE specification which is RFC 1951.


If your single file is a concatenation of multiple gzip files, then you can find an upper bound on the number of files. The gzip format starts with the magic bytes 0x1f 0x8b.

Count the occurrences of the magic in the single file. The count tells you that you have at most that many files. Unfortunately, it is only an upper bound and not an exact number, because 0x1f8b may also occur by chance in the compressed data, roughly once per 64K (2^16) byte positions. To reduce false matches to about 1 in 16.7 million (2^24) byte positions, you can scan for 0x1f8b08 instead. The trailing 0x08 is the "compression method" field, which is always 8 (deflate).

Further refinements of this "filter" are possible. See the FLG field in RFC 1952.
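As a rough illustration of the scan described above, here is a minimal Python sketch. The function name `count_gzip_magic` is made up for this example. It looks for 0x1f8b08 and, as one possible refinement, also requires the reserved bits (5 to 7) of the following FLG byte to be zero, as RFC 1952 specifies. The result is still only an upper bound, not an exact count.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b\x08"  # ID1, ID2, CM=8 (deflate)

def count_gzip_magic(data: bytes) -> int:
    """Count candidate gzip member headers in data (an upper bound only)."""
    count = 0
    pos = data.find(GZIP_MAGIC)
    while pos != -1:
        flg_index = pos + 3
        # The FLG byte follows the magic; its reserved bits 5-7 must be zero.
        if flg_index < len(data) and data[flg_index] & 0xE0 == 0:
            count += 1
        pos = data.find(GZIP_MAGIC, pos + 1)
    return count

if __name__ == "__main__":
    # Three gzip members glued together; mtime=0 keeps the headers deterministic.
    blob = b"".join(gzip.compress(part, mtime=0) for part in (b"a", b"b", b"c"))
    print(count_gzip_magic(blob))
```

Because a match inside the compressed payload is indistinguishable from a real header, treat the number as "at most this many members".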

If the members of the single file are not gzip formatted, but the Zlib or raw formats, then you are out of luck; you must decompress to count the number of files - which I would do regardless.
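For the decompress-and-count approach, a minimal Python sketch could look like the following. It assumes the file is nothing but back-to-back zlib streams with no padding or container framing in between, and that it fits in memory. The function name `count_zlib_streams` is hypothetical. It relies on the standard `zlib.decompressobj()`, whose `eof` attribute reports that a stream has ended and whose `unused_data` attribute holds whatever bytes follow it.

```python
import zlib

def count_zlib_streams(data: bytes, wbits: int = zlib.MAX_WBITS) -> int:
    """Count concatenated compressed streams by actually decoding each one.

    wbits=zlib.MAX_WBITS expects zlib (RFC 1950) streams; passing 31 would
    expect gzip members instead.
    """
    count = 0
    while data:
        decoder = zlib.decompressobj(wbits)
        decoder.decompress(data)  # raises zlib.error on invalid input
        if not decoder.eof:
            raise ValueError("last stream is truncated")
        count += 1
        # Bytes after the end of this stream belong to the next one.
        data = decoder.unused_data
    return count

if __name__ == "__main__":
    parts = [b"one", b"two" * 100, b"three", b"four", b"five"]
    blob = b"".join(zlib.compress(p) for p in parts)
    print(count_zlib_streams(blob))
```

The decompressed output is thrown away here; for very large files you would feed the decoder in chunks rather than loading everything at once.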
