Is there an implementation of rapid concurrent syntactical sugar in scala? eg. map-reduce

2022-12-28 23:24 问答作者：

Passing messages around with actors is great. But I would like to have even easier code.

Examples (Pseudo-code)

val splicedList:List[List[Int]]=biglist.partition(100)
val sum:Int=ActorPool.numberOfActors(5).getAllResults(splicedList,foldLeft(_+_))

where spliceIntoParts turns one big list into开发者_如何学Go 100 small lists the numberofactors part, creates a pool which uses 5 actors and receives new jobs after a job is finished and getallresults uses a method on a list. all this done with messages passing in the background. where maybe getFirstResult, calculates the first result, and stops all other threads (like cracking a password)

With Scala Parallel collections that will be included in 2.8.1 you will be able to do things like this:

val spliced = myList.par // obtain a parallel version of your collection (all operations are parallel)
spliced.map(process _)   // maps each entry into a corresponding entry using `process`
spliced.find(check _)    // searches the collection until it finds an element for which
                         // `check` returns true, at which point the search stops, and the element is returned

and the code will automatically be done in parallel. Other methods found in the regular collections library are being parallelized as well.

Currently, 2.8.RC2 is very close (this or next week), and 2.8 final will come in a few weeks after, I guess. You will be able to try parallel collections if you use 2.8.1 nightlies.

You can use Scalaz's concurrency features to achieve what you want.

import scalaz._
import Scalaz._
import concurrent.strategy.Executor
import java.util.concurrent.Executors

implicit val s = Executor.strategy[Unit](Executors.newFixedThreadPool(5))

val splicedList = biglist.grouped(100).toList
val sum = splicedList.parMap(_.sum).map(_.sum).get

It would be pretty easy to make this prettier (i.e. write a function mapReduce that does the splitting and folding all in one). Also, parMap over a List is unnecessarily strict. You will want to start folding before the whole list is ready. More like:

val splicedList = biglist.grouped(100).toList
val sum = splicedList.map(promise(_.sum)).toStream.traverse(_.sum).get

You can do this with less overhead than creating actors by using futures:

import scala.actors.Futures._
val nums = (1 to 1000).grouped(100).toList
val parts = nums.map(n => future { n.reduceLeft(_ + _) })
val whole = (0 /: parts)(_ + _())

You have to handle decomposing the problem and writing the "future" block and recomposing it in to a final answer, but it does make executing a bunch of small code blocks in parallel easy to do.

(Note that the _() in the fold left is the apply function of the future, which means, "Give me the answer you were computing in parallel!", and it blocks until the answer is available.)

A parallel collections library would automatically decompose the problem and recompose the answer for you (as with pmap in Clojure); that's not part of the main API yet.

I'm not waiting for Scala 2.8.1 or 2.9, it would rather be better to write my own library or use another, so I did more googling and found this: akka http://doc.akkasource.org/actors

which has an object futures with methods

awaitAll(futures: List[Future]): Unit
awaitOne(futures: List[Future]): Future

but http://scalablesolutions.se/akka/api/akka-core-0.8.1/ has no documentation at all. That's bad.

But the good part is that akka's actors are leaner than scala's native ones
With all of these libraries (including scalaz) around, it would be really great if scala itself could eventually merge them officially

At Scala Days 2010, there was a very interesting talk by Aleksandar Prokopec (who is working on Scala at EPFL) about Parallel Collections. This will probably be in 2.8.1, but you may have to wait a little longer. I'll lsee if I can get the presentation itself. to link here.

The idea is to have a collections framework which parallelizes the processing of the collections by doing exactly as you suggest, but transparently to the user. All you theoretically have to do is change the import from scala.collections to scala.parallel.collections. You obviously still have to do the work to see if what you're doing can actually be parallelized.

继续阅读：concurrency mapreduce scala syntactic-sugar

Is there an implementation of rapid concurrent syntactical sugar in scala? eg. map-reduce

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？