What is Weak Head Normal Form?
What does Weak Head Normal Form (WHNF) mean? What does Head Normal form (HNF) and Normal Form (NF) mean?
Real World Haskell states:
The familiar
seq
function evaluates an expression to what we call head normal form (abbreviated HNF). It stops once it reaches the outermost constructor (the "head"). This is distinct from normal form (NF), in whic开发者_如何学Ch an expression is completely evaluated.You will also hear Haskell programmers refer to weak head normal form (WHNF). For normal data, weak head normal form is the same as head normal form. The difference only arises for functions, and is too abstruse to concern us here.
I have read a few resources and definitions (Haskell Wiki and Haskell Mail List and Free Dictionary) but I don't get it. Can someone perhaps give an example or provide a layman definition?
I am guessing it would be similar to:
WHNF = thunk : thunk
HNF = 0 : thunk
NF = 0 : 1 : 2 : 3 : []
How do seq
and ($!)
relate to WHNF and HNF?
Update
I am still confused. I know some of the answers say to ignore HNF. From reading the various definitions it seems that there is no difference between regular data in WHNF and HNF. However, it does seem like there is a difference when it comes to a function. If there was no difference, why is seq
necessary for foldl'
?
Another point of confusion is from the Haskell Wiki, which states that seq
reduces to WHNF, and will do nothing to the following example. Then they say that they have to use seq
to force the evaluation. Is that not forcing it to HNF?
Common newbie stack overflowing code:
myAverage = uncurry (/) . foldl' (\(acc, len) x -> (acc+x, len+1)) (0,0)
People who understand seq and weak head normal form (whnf) can immediately understand what goes wrong here.
(acc+x, len+1)
is already in whnf, so theseq
(in the definition offoldl'
), which reduces a value to whnf, does nothing to this. This code will build up thunks just like the originalfoldl
example, they'll just be inside a tuple. The solution is just to force the components of the tuple, e.g.myAverage = uncurry (/) . foldl' (\(acc, len) x -> acc `seq` len `seq` (acc+x, len+1)) (0,0)
-Haskell Wiki on Stackoverflow
I'll try to give an explanation in simple terms. As others have pointed out, head normal form does not apply to Haskell, so I will not consider it here.
Normal form
An expression in normal form is fully evaluated, and no sub-expression could be evaluated any further (i.e. it contains no un-evaluated thunks).
These expressions are all in normal form:
42
(2, "hello")
\x -> (x + 1)
These expressions are not in normal form:
1 + 2 -- we could evaluate this to 3
(\x -> x + 1) 2 -- we could apply the function
"he" ++ "llo" -- we could apply the (++)
(1 + 1, 2 + 2) -- we could evaluate 1 + 1 and 2 + 2
Weak head normal form
An expression in weak head normal form has been evaluated to the outermost data constructor or lambda abstraction (the head). Sub-expressions may or may not have been evaluated. Therefore, every normal form expression is also in weak head normal form, though the opposite does not hold in general.
To determine whether an expression is in weak head normal form, we only have to look at the outermost part of the expression. If it's a data constructor or a lambda, it's in weak head normal form. If it's a function application, it's not.
These expressions are in weak head normal form:
(1 + 1, 2 + 2) -- the outermost part is the data constructor (,)
\x -> 2 + 2 -- the outermost part is a lambda abstraction
'h' : ("e" ++ "llo") -- the outermost part is the data constructor (:)
As mentioned, all the normal form expressions listed above are also in weak head normal form.
These expressions are not in weak head normal form:
1 + 2 -- the outermost part here is an application of (+)
(\x -> x + 1) 2 -- the outermost part is an application of (\x -> x + 1)
"he" ++ "llo" -- the outermost part is an application of (++)
Stack overflows
Evaluating an expression to weak head normal form may require that other expressions be evaluated to WHNF first. For example, to evaluate 1 + (2 + 3)
to WHNF, we first have to evaluate 2 + 3
. If evaluating a single expression leads to too many of these nested evaluations, the result is a stack overflow.
This happens when you build up a large expression that does not produce any data constructors or lambdas until a large part of it has been evaluated. These are often caused by this kind of usage of foldl
:
foldl (+) 0 [1, 2, 3, 4, 5, 6]
= foldl (+) (0 + 1) [2, 3, 4, 5, 6]
= foldl (+) ((0 + 1) + 2) [3, 4, 5, 6]
= foldl (+) (((0 + 1) + 2) + 3) [4, 5, 6]
= foldl (+) ((((0 + 1) + 2) + 3) + 4) [5, 6]
= foldl (+) (((((0 + 1) + 2) + 3) + 4) + 5) [6]
= foldl (+) ((((((0 + 1) + 2) + 3) + 4) + 5) + 6) []
= (((((0 + 1) + 2) + 3) + 4) + 5) + 6
= ((((1 + 2) + 3) + 4) + 5) + 6
= (((3 + 3) + 4) + 5) + 6
= ((6 + 4) + 5) + 6
= (10 + 5) + 6
= 15 + 6
= 21
Notice how it has to go quite deep before it can get the expression into weak head normal form.
You may wonder, why does not Haskell reduce the inner expressions ahead of time? That is because of Haskell's laziness. Since it cannot be assumed in general that every subexpression will be needed, expressions are evaluated from the outside in.
(GHC has a strictness analyzer that will detect some situations where a subexpression is always needed and it can then evaluate it ahead of time. This is only an optimization, however, and you should not rely on it to save you from overflows).
This kind of expression, on the other hand, is completely safe:
data List a = Cons a (List a) | Nil
foldr Cons Nil [1, 2, 3, 4, 5, 6]
= Cons 1 (foldr Cons Nil [2, 3, 4, 5, 6]) -- Cons is a constructor, stop.
To avoid building these large expressions when we know all the subexpressions will have to be evaluated, we want to force the inner parts to be evaluated ahead of time.
seq
seq
is a special function that is used to force expressions to be evaluated. Its semantics are that seq x y
means that whenever y
is evaluated to weak head normal form, x
is also evaluated to weak head normal form.
It is among other places used in the definition of foldl'
, the strict variant of foldl
.
foldl' f a [] = a
foldl' f a (x:xs) = let a' = f a x in a' `seq` foldl' f a' xs
Each iteration of foldl'
forces the accumulator to WHNF. It therefore avoids building up a large expression, and it therefore avoids overflowing the stack.
foldl' (+) 0 [1, 2, 3, 4, 5, 6]
= foldl' (+) 1 [2, 3, 4, 5, 6]
= foldl' (+) 3 [3, 4, 5, 6]
= foldl' (+) 6 [4, 5, 6]
= foldl' (+) 10 [5, 6]
= foldl' (+) 15 [6]
= foldl' (+) 21 []
= 21 -- 21 is a data constructor, stop.
But as the example on HaskellWiki mentions, this does not save you in all cases, as the accumulator is only evaluated to WHNF. In the example below, the accumulator is a tuple, so it will only force evaluation of the tuple constructor, and not acc
or len
.
f (acc, len) x = (acc + x, len + 1)
foldl' f (0, 0) [1, 2, 3]
= foldl' f (0 + 1, 0 + 1) [2, 3]
= foldl' f ((0 + 1) + 2, (0 + 1) + 1) [3]
= foldl' f (((0 + 1) + 2) + 3, ((0 + 1) + 1) + 1) []
= (((0 + 1) + 2) + 3, ((0 + 1) + 1) + 1) -- tuple constructor, stop.
To avoid this, we must make it so that evaluating the tuple constructor forces evaluation of acc
and len
. We do this by using seq
.
f' (acc, len) x = let acc' = acc + x
len' = len + 1
in acc' `seq` len' `seq` (acc', len')
foldl' f' (0, 0) [1, 2, 3]
= foldl' f' (1, 1) [2, 3]
= foldl' f' (3, 2) [3]
= foldl' f' (6, 3) []
= (6, 3) -- tuple constructor, stop.
The section on Thunks and Weak Head Normal Form in the Haskell Wikibooks description of laziness provides a very good description of WHNF along with this helpful depiction:
Evaluating the value (4, [1, 2]) step by step. The first stage is completely unevaluated; all subsequent forms are in WHNF, and the last one is also in normal form.
Haskell programs are expressions and they are run by performing evaluation.
To evaluate an expression, replace all function applications by their definitions. The order in which you do this does not matter much, but it's still important: start with the outermost application and proceed from left to right; this is called lazy evaluation.
Example:
take 1 (1:2:3:[])
=> { apply take }
1 : take (1-1) (2:3:[])
=> { apply (-) }
1 : take 0 (2:3:[])
=> { apply take }
1 : []
Evaluation stops when there are no more function applications left to replace. The result is in normal form (or reduced normal form, RNF). No matter in which order you evaluate an expression, you will always end up with the same normal form (but only if the evaluation terminates).
There is a slightly different description for lazy evaluation. Namely, it says that you should evaluate everything to weak head normal form only. There are precisely three cases for an expression to be in WHNF:
- A constructor:
constructor expression_1 expression_2 ...
- A built-in function with too few arguments, like
(+) 2
orsqrt
- A lambda-expression:
\x -> expression
In other words, the head of the expression (i.e. the outermost function application) cannot be evaluated any further, but the function argument may contain unevaluated expressions.
Examples of WHNF:
3 : take 2 [2,3,4] -- outermost function is a constructor (:)
(3+1) : [4..] -- ditto
\x -> 4+5 -- lambda expression
Notes
- The "head" in WHNF does not refer to the head of a list, but to the outermost function application.
- Sometimes, people call unevaluated expressions "thunks", but I don't think that's a good way to understand it.
- Head normal form (HNF) is irrelevant for Haskell. It differs from WHNF in that the bodies of lambda expressions are also evaluated to some extent.
A good explanation with examples is given at http://foldoc.org/Weak+Head+Normal+Form Head normal form simplifies even the bits of an expression inside of a function abstraction, while "weak" head normal form stops at function abstractions.
From the source, if you have:
\ x -> ((\ y -> y+x) 2)
that is in weak head normal form, but not head normal form... because the possible application is stuck inside of a function that can't be evaluated yet.
Actual head normal form would be difficult to implement efficiently. It would require poking around inside of functions. So the advantage of weak head normal form is that you can still implement functions as an opaque type, and hence it's more compatible with compiled languages and optimization.
The WHNF does not want the body of lambdas to be evaluated, so
WHNF = \a -> thunk
HNF = \a -> a + c
seq
wants its first argument to be in WHNF, so
let a = \b c d e -> (\f -> b + c + d + e + f) b
b = a 2
in seq b (b 5)
evaluates to
\d e -> (\f -> 2 + 5 + d + e + f) 2
instead of, what would be using HNF
\d e -> 2 + 5 + d + e + 2
Basically, suppose you have some sort of thunk, t
.
Now, if we want to evaluate t
to WHNF or NHF, which are the same except for functions, we would find that we get something like
t1 : t2
where t1
and t2
are thunks. In this case, t1
would be your 0
(or rather, a thunk to 0
given no extra unboxing)
seq
and $!
evalute WHNF. Note that
f $! x = seq x (f x)
In an implementation of graph reduction, lazy evaluation to HNF forces you to deal with the name capture problem of lambda calculus, whereas lazy evaluation to WHNF lets you avoid it.
This is explained in Chapter 11 of The Implementation of Functional Programming Languages by Simon Peyton Jones.
I realize this is an old question, but here is an explicit mathematical definition of WHNF, HNF, and NF. In the pure lambda calculus:
A term is in NF if it is of the form
λ x1. λ x2. ... λ xn. (x t1 t2 ... tm)
where
x
is a variable, andt1, t2, ..., tm
are in NF.A term is in HNF if it is of the form
λ x1. λ x2. ... λ xn. (x e1 e2 ... em)
where
x
is a variable, ande1, e2, ..., em
are arbitrary terms.A term is in WHNF if it is either a lambda term
λ x. e
for any terme
or if it is of the formx e1 e2 ... em
where
x
is a variable ande1, e2, ..., em
are arbitrary terms.
Now consider a programming language with constructors a,b,c...
of arity na, nb, nc...
, which means that whenever t1, t2, ..., tm
are in NF, then the term a t1 t2 ... tm
where m = na
is a redex and can be evaluated. For example, the addition constructor +
in Haskell has arity 2
, because it only evaluates when it is given two arguments in normal form (in this case integers, which can themselves be considered as nullary constructors).
A term is in NF if it is of the form
λ x1. λ x2. ... λ xn. (x t1 t2 ... tm)
where
x
is either a variable or a constructor of arityn
withm < n
, andt1, t2, ..., tm
are in NF.A term is in HNF if it is of the form
λ x1. λ x2. ... λ xn. (x e1 e2 ... em)
where
x
is either a variable or a constructor of arityn
, ande1, e2, ... em
are arbitrary terms so long as the firstn
arguments are not all in NF.A term is in WHNF if it is either a lambda term
λ x. e
for any terme
or if it is of the formx e1 e2 ... em
where
x
is either a variable or a constructor of arityn
, ande1, e2, ... em
are arbitrary terms so long as the firstn
arguments are not all in NF.
In particular, any term in NF is in HNF, and any term in HNF is in WHNF, but not conversely.
Head normal form means there is no head redex
(λx.((λy.y+x)b))b
Reduces to: R, for redex (and yes there's another redex inside it but this is irrelevant). This is a head redex because it's the leftmost redex (the only redex), and there are no lambda terms before it (variables or lambda expressions (applications or abstractions)), only 0 to n abstractors (if R is a redex (λx.A)B
the abstractor of R is λx
), in this case 0.
Because there's a head redex it is not in HNF and it is therefore also not in NF because there's a redex.
WHNF means that it is a lambda abstraction or in HNF. The above is not in HNF and it is not a lambda abstraction, but an application, and is therefore not in WHNF.
λx.((λy.y+x)b)b
is in WHNF
It is a lambda abstraction, but not in HNF because there is a head λx.Rb
Reduce to λx.((b+x)b)
. There is no redex therefore it is in normal form.
Consider λx.((λy.zyx)b)
, it simplifies to λx.R
, so it is not in HNF. λx.(k(λy.zyx)b)
simplifies to λx.kR
therefore it is in HNF but not NF.
All NF are in HNF and WHNF. All HNF are WHNF.
精彩评论