Compressibility Example

2023-01-02 19:06

From my algorithms textbook:

The annual county horse race is bringing in three thoroughbreds who have never competed against one another. Excited, you study their past 200 races and summarize these as probability distributions over four outcomes: first (“first place”), second, third, and other.

Outcome    Aurora    Whirlwind    Phantasm
first      0.15      0.30         0.20
second     0.10      0.05         0.30
third      0.70      0.25         0.30
other      0.05      0.40         0.20

Which horse is the most predictable? One quantitative approach to this question is to look at compressibility. Write down the history of each horse as a string of 200 values (first, second, third, other). The total number of bits needed to encode these track-record strings can then be computed using Huffman’s algorithm. This works out to 290 bits for Aurora, 380 for Whirlwind, and 420 for Phantasm (check it!). Aurora has the shortest encoding and is therefore in a strong sense the most predictable.

How did they get 420 for Phantasm? I keep getting 400 bytes, like so:

Combine first and other (0.2 + 0.2 = 0.4), then combine second and third (0.3 + 0.3 = 0.6). Each outcome ends up with a 2-bit code, so 200 outcomes × 2 = 400.

Is there something I've misunderstood about the Huffman encoding algorithm?

Textbook available here: http://www.cs.berkeley.edu/~vazirani/algorithms.html (page 156).

I think you're right: Phantasm's 200 outcomes can be represented using 400 bits (not bytes). 290 for Aurora and 380 for Whirlwind are correct.

The correct Huffman code is generated in the following manner:

Combine the two least probable outcomes: 0.2 and 0.2. Get 0.4.
Combine the next two least probable outcomes: 0.3 and 0.3. Get 0.6.
Combine 0.4 and 0.6. Get 1.0.
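
If you want to check all three totals yourself, here is a minimal Python sketch of the merge procedure above. It is only an illustration, not the textbook's code: the names `huffman_code_lengths`, `total_bits` and `HORSES` are made up for this example, and I'm assuming each probability can be turned into a race count by multiplying by 200 (so 0.15 becomes 30 races), which keeps the arithmetic in integers.

```python
import heapq
from itertools import count

def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a dict of symbol weights."""
    tiebreak = count()  # unique counter so tied weights never compare the dicts
    # Each heap entry is (weight, tiebreak, {symbol: depth so far}).
    heap = [(w, next(tiebreak), {sym: 0}) for sym, w in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Always merge the two smallest weights currently in the heap.
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Every symbol under the new node moves one level deeper.
        merged = {sym: depth + 1 for sym, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, next(tiebreak), merged))
    return heap[0][2]

def total_bits(freqs):
    """Total bits needed to encode the whole track record."""
    lengths = huffman_code_lengths(freqs)
    return sum(freqs[sym] * lengths[sym] for sym in freqs)

# Assumed race counts: table probability times 200 races.
HORSES = {
    "Aurora":    {"first": 30, "second": 20, "third": 140, "other": 10},
    "Whirlwind": {"first": 60, "second": 10, "third": 50,  "other": 80},
    "Phantasm":  {"first": 40, "second": 60, "third": 60,  "other": 40},
}

for name, counts in HORSES.items():
    print(name, total_bits(counts))
# Aurora 290
# Whirlwind 380
# Phantasm 400
```

The heap is what enforces the "two least probable" rule at every step, including after a merge, which is exactly where the 420 figure goes wrong.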

You would get 420 bits if you did this instead:

Combine the two least probable outcomes: 0.2 and 0.2. Get 0.4.
Combine 0.4 and 0.3. (Wrong! The two smallest weights at this point are 0.3 and 0.3.) Get 0.7.
Combine 0.7 and 0.3. Get 1.0.
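
To see where the two totals come from, here is a small Python sketch that just multiplies race counts by code lengths for each tree. The counts are my assumption (table probability times 200 races), `encoded_bits` is a made-up helper name, and which of the two 0.3 outcomes ends up at depth 1 versus depth 2 in the lopsided tree is an arbitrary choice that does not change the total.

```python
# Assumed race counts for Phantasm: table probability times 200 races.
PHANTASM = {"first": 40, "second": 60, "third": 60, "other": 40}

def encoded_bits(counts, code_lengths):
    """Total bits when each outcome uses a code of the given length."""
    return sum(counts[sym] * code_lengths[sym] for sym in counts)

# Balanced tree from merging 0.2 + 0.2, then 0.3 + 0.3: every code is 2 bits.
balanced = {"first": 2, "second": 2, "third": 2, "other": 2}

# Lopsided tree from merging 0.4 with 0.3: the two 0.2 outcomes sit at
# depth 3, one 0.3 outcome at depth 2, and the last 0.3 outcome at depth 1.
lopsided = {"first": 3, "other": 3, "second": 2, "third": 1}

print(encoded_bits(PHANTASM, balanced))  # 400
print(encoded_bits(PHANTASM, lopsided))  # 420
```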

