开发者

What is the motivation for Scala assignment evaluating to Unit rather than the value assigned?

What is the motivation for Scala assignment evaluating to Unit rather than the value assigned?

A common pattern in I/O pro开发者_如何学编程gramming is to do things like this:

while ((bytesRead = in.read(buffer)) != -1) { ...

But this is not possible in Scala because...

bytesRead = in.read(buffer)

.. returns Unit, not the new value of bytesRead.

Seems like an interesting thing to leave out of a functional language. I am wondering why it was done so?


I advocated for having assignments return the value assigned rather than unit. Martin and I went back and forth on it, but his argument was that putting a value on the stack just to pop it off 95% of the time was a waste of byte-codes and have a negative impact on performance.


I'm not privy to inside information on the actual reasons, but my suspicion is very simple. Scala makes side-effectful loops awkward to use so that programmers will naturally prefer for-comprehensions.

It does this in many ways. For instance, you don't have a for loop where you declare and mutate a variable. You can't (easily) mutate state on a while loop at the same time you test the condition, which means you often have to repeat the mutation just before it, and at the end of it. Variables declared inside a while block are not visible from the while test condition, which makes do { ... } while (...) much less useful. And so on.

Workaround:

while ({bytesRead = in.read(buffer); bytesRead != -1}) { ... 

For whatever it is worth.

As an alternate explanation, perhaps Martin Odersky had to face a few very ugly bugs deriving from such usage, and decided to outlaw it from his language.

EDIT

David Pollack has answered with some actual facts, which are clearly endorsed by the fact that Martin Odersky himself commented his answer, giving credence to the performance-related issues argument put forth by Pollack.


This happened as part of Scala having a more "formally correct" type system. Formally-speaking, assignment is a purely side-effecting statement and therefore should return Unit. This does have some nice consequences; for example:

class MyBean {
  private var internalState: String = _

  def state = internalState

  def state_=(state: String) = internalState = state
}

The state_= method returns Unit (as would be expected for a setter) precisely because assignment returns Unit.

I agree that for C-style patterns like copying a stream or similar, this particular design decision can be a bit troublesome. However, it's actually relatively unproblematic in general and really contributes to the overall consistency of the type system.


Perhaps this is due to the command-query separation principle?

CQS tends to be popular at the intersection of OO and functional programming styles, as it creates an obvious distinction between object methods that do or do not have side-effects (i.e., that alter the object). Applying CQS to variable assignments is taking it further than usual, but the same idea applies.

A short illustration of why CQS is useful: Consider a hypothetical hybrid F/OO language with a List class that has methods Sort, Append, First, and Length. In imperative OO style, one might want to write a function like this:

func foo(x):
    var list = new List(4, -2, 3, 1)
    list.Append(x)
    list.Sort()
    # list now holds a sorted, five-element list
    var smallest = list.First()
    return smallest + list.Length()

Whereas in more functional style, one would more likely write something like this:

func bar(x):
    var list = new List(4, -2, 3, 1)
    var smallest = list.Append(x).Sort().First()
    # list still holds an unsorted, four-element list
    return smallest + list.Length()

These seem to be trying to do the same thing, but obviously one of the two is incorrect, and without knowing more about the behavior of the methods, we can't tell which one.

Using CQS, however, we would insist that if Append and Sort alter the list, they must return the unit type, thus preventing us from creating bugs by using the second form when we shouldn't. The presence of side effects therefore also becomes implicit in the method signature.


I'd guess this is in order to keep the program / the language free of side effects.

What you describe is the intentional use of a side effect which in the general case is considered a bad thing.


It is not the best style to use an assignment as a boolean expression. You perform two things at the same time which leads often to errors. And the accidential use of "=" instead of "==" is avoided with Scalas restriction.


By the way: I find the initial while-trick stupid, even in Java. Why not somethign like this?

for(int bytesRead = in.read(buffer); bytesRead != -1; bytesRead = in.read(buffer)) {
   //do something 
}

Granted, the assignment appears twice, but at least bytesRead is in the scope it belongs to, and I'm not playing with funny assignment tricks...


You can have a workaround for this as long as you have a reference type for indirection. In a naïve implementation, you can use the following for arbitrary types.

case class Ref[T](var value: T) {
  def := (newval: => T)(pred: T => Boolean): Boolean = {
    this.value = newval
    pred(this.value)
  }
}

Then, under the constraint that you’ll have to use ref.value to access the reference afterwards, you can write your while predicate as

val bytesRead = Ref(0) // maybe there is a way to get rid of this line

while ((bytesRead := in.read(buffer)) (_ != -1)) { // ...
  println(bytesRead.value)
}

and you can do the checking against bytesRead in a more implicit manner without having to type it.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜