
Adding immutable Vectors

I am trying to work more with Scala's immutable collections, since they are easy to parallelize, but I am struggling with some beginner problems. I am looking for an efficient way to create a new Vector from an operation. To be precise, I want something like:

val v : Vector[Double] = RandomVector(10000)
val w : Vector[Double] = RandomVector(10000)
val r = v + w

I tested the following:

// 1)
val r : Vector[Double] = (v.zip(w)).map{ t:(Double,Double) => t._1 + t._2 }

// 2)
import scala.collection.immutable.VectorBuilder

val vb = new VectorBuilder[Double]()
var i = 0
while (i < v.length) {
  vb += v(i) + w(i)  // append the element-wise sum
  i = i + 1
}
val r = vb.result()

Both are much slower than the same operation on an Array:

[Vector Zip/Map   ] Elapsed time 0.409 msecs
[Vector While Loop] Elapsed time 0.374 msecs
[Array While Loop ] Elapsed time 0.056 msecs
// with warm-up (10000) and avg. over 10000 runs

Is there a better way to do it? I think working with zip/map/reduce has the advantage that it can run in parallel as soon as the collections support it.

Thanks


Vector is not specialized for Double, so you're going to pay a sizable performance penalty for using it. If you are doing a simple operation, you're probably better off using an array on a single core than a Vector or other generic collection on the entire machine (unless you have 12+ cores). If you still need parallelization, there are other mechanisms you can use, such as using scala.actors.Futures.future to create instances that each do the work on part of the range:

val a = Array(1,2,3,4,5,6,7,8)
(0 to 4).map(_ * (a.length/4)).sliding(2).map(i => scala.actors.Futures.future {
  var s = 0
  var j = i(0)
  while (j < i(1)) {
    s += a(j)
    j += 1
  }
  s
}).map(_()).sum  // _() applies the future--blocks until it's done

Of course, you'd need to use this on a much longer array (and on a machine with four cores) for the parallelization to improve things.
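For readers on a newer Scala version, the same chunking idea can be sketched with scala.concurrent.Future in place of scala.actors.Futures. This is only an illustration under assumptions: the names ChunkedSum and parallelSum are hypothetical, the global execution context is assumed to be acceptable, and the chunk boundaries are computed so that an array whose length is not a multiple of the chunk count is still summed completely.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration

object ChunkedSum {
  // Sums `a` by splitting its index range into `chunks` pieces, one Future per piece.
  def parallelSum(a: Array[Int], chunks: Int): Int = {
    require(chunks >= 1, "need at least one chunk")
    // Chunk boundaries; the last boundary is a.length, so no tail element is dropped.
    val bounds = (0 to chunks).map(k => (k.toLong * a.length / chunks).toInt)
    // sliding returns a lazy Iterator; toList forces all futures to be
    // created (and started) before any result is awaited.
    val parts = bounds.sliding(2).toList.map { case Seq(from, until) =>
      Future {
        var s = 0
        var j = from
        while (j < until) {
          s += a(j)
          j += 1
        }
        s
      }
    }
    // Block until every partial sum is available, then combine them.
    Await.result(Future.sequence(parts), Duration.Inf).sum
  }
}
```

For example, ChunkedSum.parallelSum(Array(1,2,3,4,5,6,7,8), 4) adds four partial sums of two elements each. As with the version above, the array has to be far longer than this before the parallel version has any chance of paying off.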


You should use lazily built collections when you chain more than one higher-order method:

v1.view zip v2 map { case (a,b) => a+b }

If you don't use a view or an iterator, each method will create a new immutable collection, even when the intermediate result is not needed.

Immutable code probably won't be as fast as mutable code, but the lazy collection will improve the execution time of your code a lot.
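A minimal sketch of the iterator variant, assuming the result is wanted as a strict Vector again (the view expression above likewise has to be converted back to a strict collection before it can be used as a Vector). The names LazyAdd and add are hypothetical:

```scala
object LazyAdd {
  // Element-wise sum of two vectors. The iterators pair and add the
  // elements on demand, so no intermediate Vector of tuples is built;
  // only the final toVector materializes a collection.
  def add(v: Vector[Double], w: Vector[Double]): Vector[Double] =
    v.iterator.zip(w.iterator).map { case (a, b) => a + b }.toVector
}
```

Each pair still goes through a tuple and boxed Doubles, so this removes the intermediate collection but not the per-element overhead.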


Arrays are not type-erased; Vectors are. Basically, the JVM gives Array an advantage over other collections when handling primitives, and that advantage cannot be overcome. Scala's specialization might decrease that advantage, but, given its cost in code size, it can't be used everywhere.
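The difference can be made visible with a small sketch. The question does not show its Array loop, so the add method below is only an assumption about what such a baseline might look like, and the names ArrayAdd, arrayElementClass and vectorElementClass are hypothetical:

```scala
object ArrayAdd {
  // Element-wise sum over primitive arrays; nothing is boxed inside the loop.
  def add(v: Array[Double], w: Array[Double]): Array[Double] = {
    require(v.length == w.length, "arrays must have the same length")
    val r = new Array[Double](v.length)
    var i = 0
    while (i < v.length) {
      r(i) = v(i) + w(i)
      i += 1
    }
    r
  }

  // Runtime element class of an Array[Double]: the JVM primitive double.
  def arrayElementClass: String =
    Array(1.0).getClass.getComponentType.getName

  // Runtime class of an element stored in a Vector: the boxed java.lang.Double.
  def vectorElementClass: String =
    (Vector(1.0): Vector[Any]).head.getClass.getName
}
```

The array keeps its elements as raw doubles, while every element read from or written to the Vector passes through a java.lang.Double object.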
