Relax ordering constraints in monadic computation
here is some food for thought.
When I write monadic code, the monad imposes ordering on the operations done. For example, If I write in the IO monad:
do a <- doSomething
b <- doSomethingElse
return (a + b)
I know doSomething
will be executed before doSomethingElse
.
Now, consider the equivalent code in a language like C:
return (doSomething() + doSomethingElse());
The semantics of C don't actually specify what order these two function calls will be evaluated, so the compiler is free to move things around as it pleases.
My question is this: How would I write开发者_开发百科 monadic code in Haskell that also leaves this evaluation order undefined? Ideally, I would reap some benefits when my compiler's optimizer looks at the code and starts moving things around.
Here are some possible techniques that don't get the job done, but are in the right "spirit":
- Write the code in functorial style, that is, write
plus doSomething doSomethingElse
and letplus
schedule the monadic calls. Drawbacks: You lose sharing on the results of the monadic actions, andplus
still makes a decision about when things end up being evaluated. - Use lazy IO, that is,
unsafeInterleaveIO
, which defers the scheduling to the demands lazy of evaluation. But lazy is different from strict with undefined order: in particular I do want all of my monadic actions to get executed. - Lazy IO, combined with immediately seq'ing all of the arguments. In particular,
seq
does not impose ordering constraints.
In this sense, I want something more flexible than monadic ordering but less flexible than full-out laziness.
This problem of over-sequentializing monad code is known as the "commutative monads problem".
Commutative monads are monads for which the order of actions makes no difference (they commute), that is when following code:
do a <- f x
b <- g y
m a b
is the same as:
do b <- g y
a <- f x
m a b
there are many monads that commute (e.g. Maybe
, Random
). If the monad is commutative, then the operations captured within it can be computed in parallel, for example. They are very useful!
However, we don't have a good syntax for monads that commute, though a lot of people have asked for such a thing -- it is still an open research problem.
As an aside, applicative functors do give us such freedom to reorder computations, however, you have to give up on the notion of bind
(as suggestions for e.g. liftM2
show).
This is a deep dirty hack, but it seems like it should do the trick to me.
{-# OPTIONS_GHC -fglasgow-exts #-}
{-# LANGUAGE MagicHash #-}
module Unorder where
import GHC.Types
unorder :: IO a -> IO b -> IO (a, b)
unorder (IO f) (IO g) = IO $ \rw# ->
let (# _, x #) = f rw#
(# _, y #) = g rw#
in (# rw# , (x,y) #)
Since this puts non-determinism in the hands of the compiler, it should behave "correctly" (i.e. nondeterministically) with regards to control flow issues (i.e. exceptions) as well.
On the other hand, we can't pull the same trick most standard monads such as State
and Either a
since we're really relying on spooky action at a distance available via messing with the RealWorld
token. To get the right behavior, we'd need some annotation available to the optimizer that indicated we were fine with nondeterministic choice between two non-equivalent alternatives.
The semantics of C don't actually specify what order these two function calls will be evaluated, so the compiler is free to move things around as it pleases.
But what if doSomething()
causes a side-effect that will change the behavior of doSomethingElse()
? Do you really want the compiler to mess with the order? (Hint: no) The fact that you are in a monad at all suggests that such may be the case. Your note that "you lose sharing on the results" also suggests this.
However, note that monadic does not always mean sequenced. It's not exactly what you describe, but you may be interested in the Par monad, which allows you to run your actions in parallel.
You are interested in leaving the order undefined so that the compiler can magically optimize it for you. Perhaps instead you should use something like the Par monad to indicate the dependencies (some things inevitably have to happen before others) and then let the rest run in parallel.
Side note: don't confuse Haskell's return
to be anything like C's return
精彩评论