What's the best technique to generate a random-access data structure lazily?
In Haskell, I'd like to generate a list of random integers of undetermined length. (However, less than say 1 million.)
I'm not likely to need all the elements of the list immediately, so I'd like to generate it lazily. However, once generated, I'll need random access to its elements. So I assume the best technique would be to copy an infinite list into an array. However, I don't know whether arrays can be "interchanged" with lists very easily -- for instance, when I want to generate more items of the list, I'd like to start where I left off and expand the array.
Anyways, perhaps I should settle for some log(n) tree structure instead of an array? What do you think?
Does Haskell have a tree structure for sequential elements that can be accessed using a list-like API?
If you have a good PRNG, then values generated from seeds that are close together should be independent, so you can just choose an initial seed; cell i of the array is then prng(seed+i).
This is essentially just using it as a hash function :P
If you do it this way, you don't even really need the array: you can just have a function getRandomValue seed i = prng (seed+i). Whether or not this is better than an array of lazy values will depend on the complexity of your PRNG, but either way you'll get O(1) access.
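A minimal sketch of that idea, using System.Random's StdGen re-seeded per index purely as a stand-in (the answer assumes a PRNG whose nearby seeds really are independent, which StdGen doesn't guarantee):

    import System.Random (mkStdGen, random)

    -- Re-seed a PRNG with seed+i and take its first output: O(1) access
    -- to the i-th value, with no array or list needed at all.
    getRandomValue :: Int -> Int -> Int
    getRandomValue seed i = fst (random (mkStdGen (seed + i)))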
This seems like it might be useful, but you would need to generate increasingly large trees as the list gets longer: http://www.codeproject.com/KB/recipes/persistentdatastructures.aspx#RandomAccessLists (it is a log(n) structure, though). Something like http://research.janelia.org/myers/Papers/applicative.stack.pdf might be useful as well, but I don't see offhand how to make it work efficiently (with shared structure) when the list is generated dynamically.
One possible approach would be to use a pseudorandom number generator that allows "leapfrogging," i.e. skipping forward n steps in sub-linear time. One method (which is slow, but works in constant time) is to use a block cipher in counter mode; see http://en.wikipedia.org/wiki/Cryptographically_secure_pseudorandom_number_generator#Designs_based_on_cryptographic_primitives. See http://cs.berkeley.edu/~mhoemmen/cs194/Tutorials/prng.pdf for more information.
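A rough sketch of the counter-mode idea; encryptBlock here is a hypothetical stand-in for a real block cipher (e.g. AES from a cryptography package), not an actual API:

    import Data.Word (Word64)

    -- Hypothetical block cipher: key -> plaintext block -> ciphertext block.
    encryptBlock :: Word64 -> Word64 -> Word64
    encryptBlock = undefined  -- substitute a real cipher here

    -- The i-th pseudorandom value is just the encryption of the counter i,
    -- so any index is reachable in O(1) without generating its predecessors.
    nthRandom :: Word64 -> Word64 -> Word64
    nthRandom key i = encryptBlock key i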
Not an answer, but a consideration: note that any standard technique for generating a list of random values involves threading a seed through the computation. So if I generate a lazy infinite list of random values and force the 100th element, that forces not only the spine of cons cells 0-99 but their values as well.
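For example (a small illustration with System.Random; the exact generator doesn't matter):

    import System.Random (mkStdGen, randoms)

    -- Reaching element 100 steps the generator through elements 0..99 first,
    -- because each value's seed comes from the previous one.
    main :: IO ()
    main = print (xs !! 100)
      where xs = randoms (mkStdGen 42) :: [Int]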
In any case, I'd recommend lazy storable vectors: http://hackage.haskell.org/package/storablevector
You still have O(n) access, but you've reduced it by the factor of your chunk size, so in practice this is much faster than a plain list. And you still get, modulo chunk size, the laziness properties you're interested in.
What about Data.Sequence.Seq? It's not lazy, but it does support O(1) appends at either end, and lookups are O(log(min(i,n-i))), which should be better than most other tree structures. You could keep a Seq and generate/append more outputs as necessary.
There's also an instance for ListLike if you prefer that API over the one provided by Seq.
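A sketch of that approach, with System.Random as the source of values (the names grow and lookupRandom are just for illustration):

    import qualified Data.Sequence as Seq
    import Data.Sequence (Seq, (|>))
    import System.Random (StdGen, mkStdGen, random)

    -- Extend the sequence until index n exists, threading the generator state.
    grow :: Int -> (Seq Int, StdGen) -> (Seq Int, StdGen)
    grow n (s, g)
      | Seq.length s > n = (s, g)
      | otherwise        = grow n (s |> x, g')
      where (x, g') = random g

    -- Look up index n, growing the sequence first if necessary, and return
    -- the updated state so later lookups can reuse the generated prefix.
    lookupRandom :: Int -> (Seq Int, StdGen) -> (Int, (Seq Int, StdGen))
    lookupRandom n st = (Seq.index s' n, (s', g'))
      where (s', g') = grow n st

Start from (Seq.empty, mkStdGen 0) and thread the returned state through later lookups.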
One thing you could do is create a list of arrays. If each array stores M elements of your sequence, then accessing element n is O(n/M).
This might be a good middle ground.
For example:
import Data.Array
import Data.List

-- The infinite (lazy) list we want random access to; Fibonacci numbers
-- stand in here for the randomly generated values.
fibsList :: (Num a) => [a]
fibsList = 1 : 1 : zipWith (+) fibsList (tail fibsList)

-- Number of elements stored in each array chunk.
chunkSize :: (Num a, Ix a) => a
chunkSize = 100000

-- An infinite list of fixed-size arrays, each filled lazily from the list.
fibsChunks :: (Num i, Ix i, Num e) => [Array i e]
fibsChunks = mkChunks fibsList
  where mkChunks fs = listArray (0, chunkSize-1) xs : mkChunks ys
          where (xs,ys) = splitAt chunkSize fs

-- O(n/chunkSize) to reach the right chunk, then O(1) within the array.
lookupFib :: (Integral i, Ix i, Num e) => i -> e
lookupFib n = fibsChunks `genericIndex` q ! r
  where (q,r) = n `divMod` chunkSize
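Adapting this to the original question, you can feed the same chunking scheme a lazy list of random numbers instead of Fibonacci numbers. A sketch, reusing chunkSize and the imports from above, with System.Random as the source (the seed 42 is arbitrary):

    import System.Random (mkStdGen, randoms)

    -- An infinite list of fixed-size chunks of random Ints.
    randomChunks :: [Array Int Int]
    randomChunks = mkChunks (randoms (mkStdGen 42))
      where mkChunks xs = listArray (0, chunkSize - 1) ys : mkChunks zs
              where (ys, zs) = splitAt chunkSize xs

    -- Same lookup as lookupFib, but over the random chunks.
    lookupRandom :: Int -> Int
    lookupRandom n = randomChunks `genericIndex` q ! r
      where (q, r) = n `divMod` chunkSize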