开发者

Efficient MapReduce when dealing with streams to queries to the same dataset

I have a massive, static dataset and I've a function to apply to it.

f is in the form reduce(map(f, dataset)), so I would use the MapReduce s开发者_开发知识库keleton. However, I don't want to scatter the data at each request (and ideally I want to take advantage of indexing in order to speedup f). There is a MapReduce implementation that address this general case?

I've taken a look at IterativeMapReduce and maybe it does the job, but seems to address a slightly different case, and the code isn't available yet.


Hadoop's MapReduce (and all the others map-reduce skeleton inspired by Google) doesn't scatter the data all the time.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜