What operations are performed in bulk when using parallel collections? Strange behavior here

Enter the following small sequential program and its parallelized version in the Scala REPL:

/* Activate time measurement in "App" class. Prints [total <X> ms] on exit. */
util.Properties.setProp("scala.time", "true")
/* Define sequential program version. */
object X extends App { for (x <- (1 to 10)) {Thread.sleep(1000);println(x)}}
/* Define parallel program version. Note '.par' selector on Range here. */
object Y extends App { for (y <- (1 to 10).par) {Thread.sleep(1000);println(y)}}

Executing X with X.main(Array.empty) gives:

1
2
3
4
5
6
7
8
9
10
[total 10002ms]

Whereas Y with Y.main(Array.empty) gives:

1
6
2
7
3
8
4
9
10
5
[total 5002ms]

So far so good. But what about the following two variations of the program:

object X extends App {(1 to 10).foreach{Thread.sleep(1000);println(_)}}
object Y extends App {(1 to 10).par.foreach{Thread.sleep(1000);println(_)}}

They give me runtimes of [total 1002ms] and [total 1002ms] respectively. How can this be?


This has nothing to do with parallel collections. The problem is hidden in the function literal. You can see it if you let the compiler print the AST (with the option -Xprint:typer):

for (x <- (1 to 10)) {Thread.sleep(1000);println(x)}

produces

scala.this.Predef.intWrapper(1).to(10).foreach[Unit](((x: Int) => {
  java.this.lang.Thread.sleep(1000L);
  scala.this.Predef.println(x)
}))

whereas

(1 to 10).foreach{Thread.sleep(1000);println(_)}

produces

scala.this.Predef.intWrapper(1).to(10).foreach[Unit]({
  java.this.lang.Thread.sleep(1000L);
  ((x$1: Int) => scala.this.Predef.println(x$1))
})

There is a small but important difference: in the second tree, the call to Thread.sleep sits outside the function literal. If you want the expected result, you have to change the foreach expression to

(1 to 10).foreach{x => Thread.sleep(1000);println(x)}

But what is the difference? In your code you pass a block to foreach. The block is evaluated once: it runs Thread.sleep(1000) and then yields its last expression, the function println(_). Only this returned function is handed to foreach, not the block that contains it, so the sleep happens a single time, before the iteration starts.
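To see this without waiting on Thread.sleep, here is a minimal sketch that counts how often the leading statement runs. The object name BlockVsLambda, the counter setupRuns and the buffer seen are made up for illustration; they are not part of the original program.

```scala
import scala.collection.mutable.ListBuffer

object BlockVsLambda {
  /** Block with a trailing placeholder function: returns (runs of the leading statement, elements visited). */
  def blockThenPlaceholder(): (Int, List[Int]) = {
    var setupRuns = 0                  // hypothetical counter standing in for Thread.sleep
    val seen = ListBuffer.empty[Int]   // hypothetical buffer standing in for println
    // The block is evaluated once; only its last expression, the function `seen += _`, reaches foreach.
    (1 to 10).foreach { setupRuns += 1; seen += _ }
    (setupRuns, seen.toList)
  }

  /** Explicit parameter: both statements belong to the function body. */
  def explicitLambda(): (Int, List[Int]) = {
    var setupRuns = 0
    val seen = ListBuffer.empty[Int]
    // Here `x =>` opens the function, so the counter is bumped for every element.
    (1 to 10).foreach { x => setupRuns += 1; seen += x }
    (setupRuns, seen.toList)
  }
}
```

Both variants visit all ten elements, but blockThenPlaceholder() reports a count of 1 while explicitLambda() reports 10, which mirrors the 1002 ms versus 10002 ms runtimes in the question.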

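This is a common mistake. It has to do with how the underscore placeholder is expanded: println(_) becomes x => println(x) on its own, without the statement in front of it. Maybe this question helps you.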


An interesting way of thinking about it is that because Scala is call-by-value (see "Call by name vs call by value in Scala, clarification needed"), when you hand {Thread.sleep(1000);println(_)} to foreach, the block {Thread.sleep(1000);println(_)} is evaluated only once, and only the resulting println(_) function is handed to foreach. When you write foreach{x => Thread.sleep(1000); println(x)}, you hand Thread.sleep(1000) as well as println(x) to foreach as part of the function. This is just another way of saying what sschaef already said.
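The call-by-value point can be made visible by contrasting it with a by-name parameter. This is only a sketch: eachByValue and eachByName are hypothetical helpers written for this illustration, not methods of the Scala collections, and the real foreach takes its argument by value.

```scala
object ByNameContrast {
  // By-value parameter: the argument expression is evaluated once, before the call.
  def eachByValue(xs: Seq[Int])(f: Int => Unit): Unit = xs.foreach(x => f(x))

  // By-name parameter: the argument expression is re-evaluated every time `f` is referenced.
  def eachByName(xs: Seq[Int])(f: => Int => Unit): Unit = xs.foreach(x => f(x))

  /** How often the block's leading statement runs when the block is passed by value. */
  def runsByValue(): Int = {
    var runs = 0
    eachByValue(1 to 3) { runs += 1; (x: Int) => () }
    runs
  }

  /** The same block passed by name: it is evaluated once per element. */
  def runsByName(): Int = {
    var runs = 0
    eachByName(1 to 3) { runs += 1; (x: Int) => () }
    runs
  }
}
```

With three elements, runsByValue() yields 1 and runsByName() yields 3. Since foreach behaves like eachByValue, the Thread.sleep in the block from the question runs only once.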
