
How can different compression formats be compared?

I would like to know whether a standardised method for comparing file compression formats exists.

Is there a standard set of files that can be used to compare compression efficiency?

If you choose a large number of files, does it even matter which file types you choose when comparing the compression rates of different algorithms? (To be clear: I know that the compression rate of a single algorithm varies from file to file. What I would like to know is whether algorithm A might achieve a compression rate of 5% on one set of 100,000 files while algorithm B achieves 2%, yet on another set of 100,000 files algorithm A achieves 1% and algorithm B 2%. So for one set A is better, and for the other B is. Is this possible for a large set of files?)

Although I wouldn't say it is standardised, there are some corpora that are often used to compare compression algorithms, for example the Calgary Corpus and the Canterbury Corpus.

Even if you choose a large number of files, it does matter which file types you choose, because the compression ratio depends on how well the actual data fits the underlying model assumed by the compression algorithm.
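To illustrate that the ranking can flip between file sets, here is a minimal Python sketch using only the standard-library zlib and lzma modules. The two file sets are made up for the demonstration (they are not taken from any standard corpus), the set sizes are kept small so it runs quickly, and "ratio" here means compressed size divided by original size, so lower is better.

```python
import lzma
import os
import zlib


def total_ratio(files, compress):
    """Total compressed bytes divided by total original bytes for a set of files."""
    original = sum(len(f) for f in files)
    compressed = sum(len(compress(f)) for f in files)
    return compressed / original


def repeated_block(size):
    """A random block followed by an exact copy of itself."""
    block = os.urandom(size)
    return block + block


# Hypothetical set A: many tiny records. The fixed per-file overhead of the
# container format dominates, which favours zlib's small header.
set_a = [b"id=%d;status=ok\n" % i for i in range(200)]

# Hypothetical set B: each file repeats a 64 KiB random block. The repeat lies
# outside zlib's 32 KiB window but well inside lzma's default dictionary.
set_b = [repeated_block(64 * 1024) for _ in range(5)]

results = {
    name: {
        "zlib": total_ratio(files, zlib.compress),
        "lzma": total_ratio(files, lzma.compress),
    }
    for name, files in (("A", set_a), ("B", set_b))
}

for name, ratios in results.items():
    best = min(ratios, key=ratios.get)
    print(f"set {name}: zlib={ratios['zlib']:.3f} lzma={ratios['lzma']:.3f} best={best}")
```

With these made-up sets, zlib should come out ahead on set A and lzma on set B. Making the sets larger does not change that, because the difference comes from the kind of data and not from the number of files.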

Check this site and this site to view the comparison of compression results on different kinds of data.






