Huffman Coding for Lossless Compression
I really need help with Huffman coding for lossless compression. I have an exam coming up and need to understand this. Does anyone know of any easy tutorials, or could someone explain it?
The questions in the exam are likely to be:
Suppose the alphabet is [A, B, C], and the known probability distribution is P(A)=0.6, P(B)=0.2 and P(C)=0.2. For simplicity, let’s also assume that both encoder and decoder know that the length of the messages is always 3, so there is no need for a terminator.
How many bits are needed to encode the message ACB by Huffman Coding? You need to provide the Huffman tree and Huffman code for each symbol. (3 marks)
How many bits are needed to encode the message ACB by Arithmetic Coding? You need to provide details of the encoding process. (3 marks)
Using the above results, discuss the advantage of Arithmetic Coding over Huffman coding. (1 mark)
Answers:
Huffman tree: merge the two least probable symbols, B (0.2) and C (0.2), into a node with weight 0.4, then merge that node with A (0.6) to form the root (weight 1.0). Labelling A's branch 1 and the merged node's branch 0 gives the Huffman code: A - 1, B - 01, C - 00. The message ACB encodes as 1 00 01 = 10001, so 5 bits are needed. (3 marks)
The encoding process of Arithmetic Coding:

Symbol    Low      High     Range
(start)   0.0      1.0      1.0
A         0.0      0.6      0.6
C         0.48     0.6      0.12
B         0.552    0.576    0.024

The final binary codeword is 0.1001, which is 0.5625 and lies inside the final interval [0.552, 0.576). Therefore 4 bits are needed. (3 marks)
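For intuition, here is a minimal Python sketch of the same interval narrowing; the probabilities and the symbol order A, B, C come from the question, while the greedy codeword search at the end is just my own illustrative check:

```python
# Interval narrowing for the message "ACB", using the probabilities from the
# question and the symbol order A, B, C, so the initial sub-intervals are
# A:[0, 0.6), B:[0.6, 0.8), C:[0.8, 1.0).
probs = {"A": 0.6, "B": 0.2, "C": 0.2}

cum, running = {}, 0.0
for sym in ["A", "B", "C"]:
    cum[sym] = running          # lower edge of this symbol's sub-interval
    running += probs[sym]

low, high = 0.0, 1.0
for sym in "ACB":
    rng = high - low
    high = low + rng * (cum[sym] + probs[sym])
    low = low + rng * cum[sym]
    print(f"{sym}: low={low:.3f}  high={high:.3f}  range={high - low:.3f}")

# Greedily pick the shortest binary fraction 0.b1b2... that lands in [low, high).
bits, value, step = "", 0.0, 0.5
while not (low <= value < high):
    if value + step < high:
        value += step
        bits += "1"
    else:
        bits += "0"
    step /= 2
print(f"codeword 0.{bits} = {value} -> {len(bits)} bits")
```

Running this reproduces the table above and confirms that 0.1001 (0.5625) is the shortest binary fraction inside the final interval [0.552, 0.576), i.e. 4 bits.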
In Huffman Coding, the length of the codeword for each symbol has to be an integer number of bits, but in Arithmetic Coding the effective number of bits per symbol can be fractional. Therefore Arithmetic Coding is often more efficient than Huffman Coding, as the results above show. (1 mark)
http://en.wikipedia.org/wiki/Huffman_coding
If you look at the tree (top right) you'll see that each parent node is the sum of the two below it. The values at the nodes are the frequencies of the letters. Each bit in the binary sequence is a right/left branch in the tree.
Does that help?
I don't really have a clue about Arithmetic coding, but it looks quite clever.
A Huffman tree is a binary tree in which the values that occur most often in the stream being compressed sit near the root, and values with lower frequency sit further and further from it. This lets more common values be encoded as shorter bit strings, while less common values are encoded as longer ones.
A Huffman tree is constructed as follows (a short code sketch follows the steps):
1. Build a table of the entities in the source stream, with their distributions.
2. Pick the two entries in the table that have the lowest distribution.
3. Make a tree node out of these two entries.
4. Remove the entries just used from the table.
5. Add a new entry to the table for the node just created, with the combined distribution of the two entries removed.
6. If there is more than one entry left in the table, go to step 2.
7. The single entry left in the table is your root.
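A minimal Python sketch of these steps, using heapq for the "pick the two lowest entries" part and the probabilities from the exam question as a demo (the function and variable names are my own):

```python
import heapq
import itertools

def build_huffman_codes(dist):
    """Build a Huffman code table from a {symbol: probability} dict."""
    counter = itertools.count()  # tie-breaker so heapq never compares trees
    # Each heap entry is (distribution, tie_breaker, tree), where a tree is
    # either a symbol (leaf) or a (left, right) pair (internal node).
    heap = [(p, next(counter), sym) for sym, p in dist.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Pick the two entries with the lowest distribution ...
        p1, _, t1 = heapq.heappop(heap)
        p2, _, t2 = heapq.heappop(heap)
        # ... make a node out of them and re-insert it with the combined weight.
        heapq.heappush(heap, (p1 + p2, next(counter), (t1, t2)))
    _, _, root = heap[0]

    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):      # internal node: recurse into children
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:                            # leaf: record the accumulated bits
            codes[tree] = prefix or "0"  # single-symbol alphabet edge case
        return codes
    return walk(root, "")

codes = build_huffman_codes({"A": 0.6, "B": 0.2, "C": 0.2})
print(codes)                             # e.g. {'B': '00', 'C': '01', 'A': '1'}
print("".join(codes[s] for s in "ACB"))  # 5 bits in total
```

Which of B and C ends up with 00 versus 01 depends on arbitrary tie-breaking and branch labelling; only the code lengths (1, 2, 2) are determined, so ACB costs 5 bits either way.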
A basic Huffman implementation can be quite okay. But if you are building from scratch, you may want a couple of extra data structures in your toolbox to make things easier, such as a min-heap and a bit vector. The basic algorithms for encoding and decoding are pretty simple. I have no info on the comparison with arithmetic coding.
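To illustrate how simple the encode/decode loops are, here is a small self-contained sketch that works straight from a code table; the table literal uses the codes from the answer above, and the function names are mine:

```python
# Encoding/decoding with a fixed prefix-code table
# (the codes from the answer above: A - 1, B - 01, C - 00).
CODES = {"A": "1", "B": "01", "C": "00"}

def encode(message, codes=CODES):
    return "".join(codes[sym] for sym in message)

def decode(bits, codes=CODES):
    inverse = {code: sym for sym, code in codes.items()}
    out, buf = [], ""
    for bit in bits:
        buf += bit
        if buf in inverse:   # prefix property: the first match is the symbol
            out.append(inverse[buf])
            buf = ""
    return "".join(out)

bits = encode("ACB")
print(bits, len(bits))   # 10001 5
print(decode(bits))      # ACB
```

In a real implementation the bit string would be packed into a bit vector rather than kept as text, but the control flow is the same.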
An implementation example