GIF format - separate variable-length codes

2023-03-02 20:24 问答作者：

I trying to parse GIF format and have one problem with reading image data. This data represented like bit array, containing variable-length values.

ex:

0010-1010-0010-0000-00111-10000-11111...

Sometimes length of the code increases, but I can't understand ho开发者_开发百科w I can detect this increasing. I have just initial code size (length of the first code ex. 4).

Standart says only:

Appendix F. Variable-Length-Code LZW Compression.

...

The Variable-Length-Code aspect of the algorithm is based on an initial code size (LZW-initial code size), which specifies the initial number of bits used for the compression codes. When the number of patterns detected by the compressor in the input stream exceeds the number of patterns encodable with the current number of bits, the number of bits per LZW code is increased by one.

...

When parsing a GIF file, the Image Descriptor includes the bit width of the unencoded symbols (example: 8 bits). As you probably already know, the initial code size of the compressed data is one bit wider than the bit width of the unencoded symbols (example: 9 bits).

Also, as you probably already know, the possible compressed code values in a GIF file gradually increase in size, up to a maximum of 0xFFF == 4095 which requires 12 bits to store.

For every code that the decompressor pulls from the compressed data, the decompressor adds a new item to its dictionary. For example, if the first two 9-bit codes the decompressor reads are 0x028 0x0FF, the decompressor adds a two-byte sequence to its dictionary. Later if the decompressor ever reads the code 0x102, the decompressor decodes that 0x102 code to the two 8-bit bytes 0x28 0xFF.

The next item the decompressor adds to the dictionary is assigned the code 0x103.

The next item the decompressor adds to the dictionary is assigned the code 0x104. ...

Eventually the decompressor adds an item to the dictionary that is assigned the code 0x1FF. That is the largest number that fits into 9 bits. After storing that item into the dictionary, the decompressor starts reading 10-bit codes.

The next item the decompressor adds to the dictionary is assigned the code 0x200.

There isn't any special "command" in the data sequence that tells the decompressor to increment the code width. The decompressor must keep track of how many items the dictionary contains so far (which often can be conveniently re-used as the index of where to write the next item into the dictionary). When the decompressor adds item 0x1ff to the dictionary, it's time for the decompressor to start reading 10-bit codes. When the decompressor adds item 0x3ff to the dictionary, it's time for the decompressor to start reading 11 bit codes.

Wikipedia: Graphics Interchange Format
Wikipedia: Lempel–Ziv–Welch with variable-length codes

Look at this example first, it may be clearer to understand LZW than looking at the standard. And this may also be useful.

继续阅读：compression gif variable-length

GIF format - separate variable-length codes

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？