When do I use ByteString and when do I not?

2023-03-05 19:55 Asked by:

I've been making rather poor attempts at the PRIME1 problem on SPOJ. I discovered that using ByteString really helped performance for reading in the problem text. However, using ByteString to write out the results is actually slightly slower than using Prelude functions. I'm trying to figure out if I'm doing it wrong, or if this is expected.

I've conducted profiling and timing using (putStrLn . show) and the ByteString equivalents three different ways:

1. I test each candidate to see if it is prime. If so, I write it out immediately with (putStrLn . show)
2. I make a list of all primes and write out the list using (putStrLn . unlines . show)
3. I make a list of all primes and write out the list using map (putStrLn . show)

I expected numbers 2 and 3 to perform slower, as you are building a list in one function and consuming it in another. By printing the numbers as I generate them, I avoid allocating any memory for the list. On the other hand, you are making a system call with each call to putStrLn. Right? So I tested, and #1 was in fact the fastest.
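To make the three variants concrete, here is a rough sketch using only Prelude-style String output. The `isPrime` function and the candidate range are hypothetical stand-ins, not the actual PRIME1 solution, and the list-based variants are written in a form that type-checks (`map show` before `unlines`, `mapM_` instead of `map`).

```haskell
import Control.Monad (when)

-- Hypothetical trial-division test standing in for the real primality check.
isPrime :: Int -> Bool
isPrime n = n >= 2 && all (\d -> n `mod` d /= 0) (takeWhile (\d -> d * d <= n) [2 ..])

-- Variant 1: test each candidate and print it as soon as it passes.
printAsFound :: [Int] -> IO ()
printAsFound = mapM_ (\n -> when (isPrime n) (putStrLn (show n)))

-- Variant 2: collect the primes, join them with newlines, write one String.
printUnlines :: [Int] -> IO ()
printUnlines = putStr . unlines . map show . filter isPrime

-- Variant 3: collect the primes, then print each element of the list.
printMapped :: [Int] -> IO ()
printMapped = mapM_ (putStrLn . show) . filter isPrime

main :: IO ()
main = do
  printAsFound [1 .. 30]
  printUnlines [1 .. 30]
  printMapped [1 .. 30]
```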

The best performance was achieved with option #1 and the Prelude ([Char]) functions. I expected that my best performance would be option #1 with ByteString, but this was not the case. I only used lazy ByteStrings, but I didn't think this would matter. Would it?

Some questions:

Would you expect ByteStrings to perform better for writing a bunch of Integers to stdout?
Am I missing a pattern to generate and write out the answers that would lead to better performance?
If I am only writing out numbers as text, when, if ever, is there a benefit to using ByteString?

My working hypothesis is that writing out Integers with ByteString is slower iff you aren't combining them with other text. If you are combining Integers with [Char], then you'd get better performance working with ByteStrings. I.e., the ByteString rewrite of:

putStrLn $ "the answer is: " ++ (show value)

will be much faster than the version written above. Is this true?
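For concreteness, one possible shape of that ByteString rewrite is sketched below. It assumes lazy `Char8` ByteStrings; the function name `answerLine` and the sample value are made up for illustration. Note that it still goes through `show`, so the number itself is converted via String first.

```haskell
import qualified Data.ByteString.Lazy.Char8 as BL

-- Hypothetical helper: the fixed prefix followed by the decimal digits.
-- The Integer is still rendered with `show` and then packed.
answerLine :: Integer -> BL.ByteString
answerLine value = BL.append (BL.pack "the answer is: ") (BL.pack (show value))

main :: IO ()
main = BL.putStrLn (answerLine 42)
```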

Thanks for reading!

Doing bulk input is usually faster with bytestrings: since the data is dense, there's simply less data to shuffle from the disk into memory.

Writing data as output however, is a little different. Typically, you're serializing a structure, generating many small writes. So the dense, bulk writes of bytestrings don't help you much in that case. Even regular Strings will do reasonably at incremental output.

However, all is not lost. We can recover fast bulk writes by efficiently building up bytestrings in memory. This approach is taken by the various *-builder packages:

binary
blaze-builder

Instead of converting values to lots of tiny bytestrings and writing them out one at a time, we stream the conversion into an ever-growing buffer and, in turn, write that buffer in one big piece. This results in a lot less IO overhead, and often significant performance improvements over string IO.
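As a minimal sketch of this buffer-then-write pattern, the example below uses `Data.ByteString.Builder` from the bytestring package rather than binary or blaze-builder, on the assumption that a reasonably recent bytestring is installed. The `primesUpTo` function is a hypothetical stand-in for the real prime generator.

```haskell
import qualified Data.ByteString.Builder as B
import qualified Data.ByteString.Lazy as BL
import qualified Data.ByteString.Lazy.Char8 as BLC
import System.IO (BufferMode (BlockBuffering), hSetBinaryMode, hSetBuffering, stdout)

-- One decimal number per line, all appended into a single Builder.
renderLines :: [Int] -> B.Builder
renderLines = foldMap (\n -> B.intDec n <> B.char7 '\n')

-- The same output as a lazy ByteString, handy for inspection.
renderBytes :: [Int] -> BL.ByteString
renderBytes = B.toLazyByteString . renderLines

-- Hypothetical stand-in for the real prime generator (trial division).
primesUpTo :: Int -> [Int]
primesUpTo n =
  [p | p <- [2 .. n], all (\d -> p `mod` d /= 0) (takeWhile (\d -> d * d <= p) [2 ..])]

main :: IO ()
main = do
  -- Block buffering lets the builder fill large chunks before each write.
  hSetBinaryMode stdout True
  hSetBuffering stdout (BlockBuffering Nothing)
  B.hPutBuilder stdout (renderLines (primesUpTo 100))
```

Here the numbers are never turned into individual Strings or small bytestrings; the digits are written directly into the builder's buffer, and the handle is flushed in large blocks.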

This kind of approach is taken by e.g. webservers in Haskell, or the efficient HTML system, blaze.

Also, the performance, even with bulk writes, will depend on the efficiency of whatever conversion function you have between your types and bytestrings. For Integer, you could be simply copying the bit pattern in memory to output, or instead going through some inefficient decoder. As a result, you sometimes have to think a bit about the quality of the encoding function you're using, and not just whether to use Char/String or bytestring IO.
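To illustrate the point about conversion functions, here are two ways of turning an Integer into the same decimal bytes. This is only a sketch with made-up function names; which one is faster in practice should be measured rather than assumed.

```haskell
import qualified Data.ByteString.Builder as B
import qualified Data.ByteString.Lazy as BL
import qualified Data.ByteString.Lazy.Char8 as BLC

-- Goes through an intermediate String (a linked list of Char) before packing.
encodeViaShow :: Integer -> BL.ByteString
encodeViaShow = BLC.pack . show

-- Uses the builder's own decimal encoder, with no intermediate String.
encodeViaBuilder :: Integer -> BL.ByteString
encodeViaBuilder = B.toLazyByteString . B.integerDec

main :: IO ()
main = do
  let n = 2 ^ (64 :: Int) + 1 :: Integer
  BLC.putStrLn (encodeViaShow n)
  BLC.putStrLn (encodeViaBuilder n)
  -- Both encoders should agree byte for byte.
  print (encodeViaShow n == encodeViaBuilder n)
```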

Note that performance isn't the main difference between ByteString and String. The former is for binary data while the latter is for Unicode text. If you have binary data, use ByteString; if you have Unicode text, use the Text type from the text package.
