This article is about lazy evaluation, an evaluation strategy, that is, an operational semantics: it specifies the steps a computer takes to compute answers. Lazy evaluation is a popular way to implement most of Haskell, but not the only way.

Haskell itself is not lazy; for most things it only specifies non-strictness, not what steps a computer takes. Non-strictness is a term from denotational semantics, which is only about relating expressions to values. Non-strictness means, for example,

const True ⊥ = True

(⊥ is a theoretical value to stand for “no information” or “no answer”. I cannot say “non-termination” because that would require talking about steps.) In other words, non-strictness means that an expression can have an answer even if some of its arguments have no answer.
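In running code, `undefined` plays the role of ⊥; a quick check (my example) in any Haskell implementation:

```haskell
-- Non-strictness in action: const never needs its second argument,
-- so passing a bottom (undefined) still yields an answer.
main :: IO ()
main = print (const True undefined)  -- prints True
```
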

Denotational semantics says whether you get an answer and what answer, but does not explain or control the amount of time and memory used to get it. This is when you need the operational semantics given by compilers and interpreters; fortunately, most of them give lazy evaluation, coarsely speaking. Lazy evaluation suffices to explain asymptotic complexities such as why some code uses Θ(n) memory and why an alternative uses Θ(1) memory. Even such a coarse understanding without real cycle counts and byte counts already solves a lot of mysteries because most programmers do not even get this far.

If actual cycle counts and byte counts matter, if you need to explain why some code uses 40 MB memory and why an alternative uses just 10 MB, then lazy evaluation does not suffice, and you need the low-level operational semantics of your compiler or interpreter, including code optimizations. This is outside the scope of this article. But seeing that the low-level operational semantics is a refinement of lazy evaluation, you should probably know lazy evaluation first.

Knowing lazy evaluation does not help explain what answer you get, if that is all you want regardless of complexity. It can be done, and most people do it (out of poor education), but it is too low-level and tedious, missing the forest for the trees, nay, tree leaves. Denotational semantics is better, but still not high-level enough. Axiomatic semantics combined with program derivation is the most suitable, and I have triumphant examples of the axiomatic, derivational style: `scanl` examples, `fix` examples, a primitive recursion example.

The main points of lazy evaluation are pretty straightforward, just unfamiliar: short-circuit evaluation, and sharing the result (if the thing evaluated has multiple parents). This is of course vague; I illustrate below and fully clarify in subsequent sections.

To illustrate short-circuit evaluation, I evaluate `(\c _ -> c) True (3*4+3*4)`:

●

`(\c _ -> c) True (3*4+3*4)`

→{function application}

`True`

□

In particular, the expression `3*4+3*4` is left alone because the rule for function application doesn't evaluate it.

A longer example, to show what pattern-matching evaluates:

●

`case (\c _ -> c) (Just (3*4+3*4)) 4 of { Just _ -> True; Nothing -> False }`

→{evaluate until match}

●

`(\c _ -> c) (Just (3*4+3*4)) 4`

→{function application}

`Just (3*4+3*4)`

…

`case Just (3*4+3*4) of { Just _ -> True; Nothing -> False }`

→{match pattern}

`True`

□

The sub-expression is evaluated only far enough to tell which pattern matches; `3*4+3*4` is again unevaluated.

But sometimes we are also interested in evaluating `3*4+3*4`, for example if we expressly want it done, or if the pattern above were `Just 0`. Here is how.

The operators `(+)` and `(*)` for built-in numeric types (Int, Integer, Float, Double) are eager, not lazy: to evaluate `e+f` (say), both `e` and `f` are evaluated first. Let us also pick the evaluation order so that we evaluate `e`, then `f`, then get back to finish `e+f`; similarly for `e*f`. I now evaluate `3*4+3*4`:

●

`3*4+3*4`

→{first operand not done}

●

`3*4`

→{both operands done, arithmetic}

`12`

…

`12+3*4`

→{second operand not done}

●

`3*4`

→{both operands done, arithmetic}

`12`

…

`12+12`

→{both operands done, arithmetic}

`24`

□

Now I illustrate sharing. I evaluate `(\x->x+x) (3*4)`:

●

`(\x->x+x) (3*4)`

→{function application}

`x+x  where x = 3*4`

→{need first operand}

●

`3*4`

→{both operands done, arithmetic}

`12`

…

`x+x  where x = 12`

→{both operands done, arithmetic}

`24`

□

By mentioning the same formal argument `x` multiple times, lazy evaluation requires sharing the actual argument `3*4` as well as everything that happens to it subsequently, such as it becoming `12`. The same sharing applies to bindings in pattern-matching, `let`, `where`, and top-level definitions. (For example, `let x=3*4 in x+x` is almost identical to the above.) However, sharing does not apply to `3*4+3*4` per se, unless the compiler applies an optimization called common-subexpression elimination.
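One way to observe this sharing (a sketch using `Debug.Trace.trace`, which prints its message to stderr when the thunk it wraps is forced; run without optimization, since `-O` may itself perform common-subexpression elimination):

```haskell
import Debug.Trace (trace)

main :: IO ()
main = do
  -- shared: one thunk for 3*4, forced once, so "forced" prints once
  print ((\x -> x + x) (trace "forced" (3 * 4 :: Int)))
  -- not shared: two separate 3*4 thunks, so "forced" prints twice
  print (trace "forced" (3 * 4 :: Int) + trace "forced" (3 * 4 :: Int))
```

Both lines print `24`; the difference is only in how many times "forced" appears.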

This whirlwind tour gives a glimpse of lazy evaluation but leaves the precise rules unsaid. These are spelled out in subsequent sections.

The notation I use to depict expressions is designed to make explicit what is shared (and what is not shared); this is very important because it makes the whole world of difference in lazy evaluation. The notation also happens to be close to how most compilers and interpreters represent expressions.

A typical expression consists mostly of function applications and constructors, like `let x=blah in f x : g x`. In the extreme, I may draw a graph in which a node stands for each function application or constructor, with explicit pointers to functions and arguments, like this:

This extreme takes a lot of space and a lot of work. More often, I like to merge most nodes into one, drawing separate nodes for shared things only (in this example `blah`), like this:

Furthermore, I don't draw separate nodes for lambda-bound variables, though they may occur multiple times. So for example, I leave `(\x -> f x : g x)` as it is. Lambda-bound variables represent holes to be filled in, rather than pointers to more nodes.

I call these Inlined Expression Graphs: they are like expression graphs, but I inline (merge) unshared nodes to save space and work.

This notation is similar to that in the book Introduction to Functional Programming using Haskell, 2nd edition, by Richard Bird. I add boxes around nodes; I also have slightly more separate nodes.

There is a text-only notation in the paper A Natural Semantics for Lazy Evaluation, by John Launchbury. It is like using a huge let-expression to represent an expression graph, except that it does away with the keywords “let” and “in”. It binds names to nodes (except the root), so pointers can be serialized as target node names. It remains to write down the binding of names to node contents, and to distinguish the root. So for example, `let x=3*4 in x+x` is written as:

{x↦t*f, t↦3, f↦4} : x+x

(It uses the convention that every argument must be a separate node.)

Likewise, the book The Haskell School of Expression, by Paul Hudak, uses a text-only notation based on where-clauses and bindings when sharing is important, e.g., x+x where x=3*4. I used to use this notation when contributing to the Haskell Wiki a long time ago.

I now decide against a text-only notation because I think that most readers like to see pictures when sharing happens.

I write evaluation steps from start to end like this:

●stuff0

→{why}

stuff1

→{why}

stuff2

□

Sometimes, evaluation requires zooming into a sub-expression and processing it, before zooming out and continuing with the parent expression. I indicate this by an indented, nested evaluation, like this:

●stuff0 blah

→{why zoom in}

●blah

→{why}

yak

…stuff0 yak

→{why}

stuff1

□

This format is a fragment of Structured Derivation by Ralph-Johan Back. The most important feature I use here is nesting, which is what “structured” refers to. Most other presentations are unnested (including Bird's, Hudak's, all webpages before this one; fortunately, Launchbury's has nesting), i.e.,

stuff0 blah

→{why blah becomes yak}

stuff0 yak

→{why}

stuff1

A nested presentation is superior in several ways. It separates the two concerns—which rule triggers zooming into blah, which rule transforms blah—and explains them separately. It cuts the bloat and confusion of copying the parent expression over and over again when the sub-expression takes many steps to process. It is closer to what most compilers and interpreters do: when zooming in, one stack unit is reserved; when zooming out, one stack unit is freed. Thus, nesting depth is stack depth, and it completely explains stack overflows.

The evaluation rule for function application: evaluate the function until it is a lambda term, then plug in the arguments. Multiple occurrences of a formal argument must all point to the same actual argument node (sharing).

Example: `(\c -> (c, 0)) blah`

●

`(\c -> (c, 0)) blah`

→{function application}

`(blah, 0)`

□

Example: `(\c -> (c, c)) blah`

●

`(\c -> (c, c)) blah`

→{function application}

`(c, c)  where c = blah`

□

The above examples are easy because of two simplifications: the function is already a lambda term, and there is only one argument. The following example shows the first complication, where the function needs its own processing first; it uses an evaluation rule covered later.

Example: `(if True then (\c -> (c, 0)) else (\c -> (c, 1))) blah`

●

`(if True then (\c -> (c, 0)) else (\c -> (c, 1))) blah`

→{need function}

●

`(if True then (\c -> (c, 0)) else (\c -> (c, 1)))`

→{if-then-else}

`(\c -> (c, 0))`

…

`(\c -> (c, 0)) blah`

→{function application}

`(blah, 0)`

□

The second complication is about “multiple” arguments like `f x y`. Technically, this is `(f x) y`, i.e., a function application with one argument `y`, but the function is yet another function application.
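A small illustration of this currying (the names are mine):

```haskell
-- f x y is really (f x) y: applying one argument yields another function.
add :: Int -> Int -> Int
add a b = a + b

addOne :: Int -> Int
addOne = add 1            -- partial application: the function (add 1)

main :: IO ()
main = print (addOne 41)  -- same as (add 1) 41, prints 42
```
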

Example: `(\c _ -> c) True (3*4+3*4)`

●

`(\c _ -> c) True (3*4+3*4)`

→{need function}

●

`(\c _ -> c) True`

→{function application}

`(\_ -> True)`

…

`(\_ -> True) (3*4+3*4)`

→{function application}

`True`

□

It is a hassle to add N-1 nesting levels for what can be casually read as 1 function and N arguments. In practice, I will shorten the evaluation steps to plugging in all N arguments at once. This shortcut is safe for asymptotic complexity because the Haskell type system limits the number of arguments below program size. So this shortcut does not mistake a Θ(n) computation for a Θ(1) computation; it only hides a constant factor. This shortcut is also taken by most compilers and interpreters.

Example: `(\c _ -> c) True (3*4+3*4)`

●

`(\c _ -> c) True (3*4+3*4)`

→{function application}

`True`

□

If the function is a name with a definition elsewhere, I have two choices: I could spend a step turning the name into the corresponding lambda term, and then plug in the arguments; or I could just jump to plugging in the arguments. So here is another shortcut: I will jump to plugging in the arguments. It is less clutter, especially when later I cover functions defined by pattern matching. It is safe for asymptotic complexity. It is also close to most compilers and interpreters.

Example: `const True (3*4+3*4)`

given: `const c _ = c`

●

`const True (3*4+3*4)`

→{function application}

`True`

□

A later section covers function applications in which the functions are defined by pattern matching.

The evaluation rule for comparison and arithmetic of built-in numeric types (Int, Integer, Float, Double): evaluate the operands (in operand order, usually) until they are final answers, then evaluate the expression in question until it is a final answer. For example: to evaluate `e+f`, evaluate `e`, then evaluate `f`, then evaluate `e+f`.

(For user-defined types, evaluation order and eagerness depend on implementation, since comparison and arithmetic for them are also user-defined via the type classes Eq, Ord, Enum, Num, etc. User-defined code can do anything it likes. There are well-known user-defined naturals with partly lazy comparison and fully lazy `succ`, for example.)
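A sketch of such user-defined naturals (the names are mine): comparison peels off only as many `Succ`s as needed, so it can answer even against an infinite natural.

```haskell
data Nat = Zero | Succ Nat

leq :: Nat -> Nat -> Bool
leq Zero     _        = True   -- answers without inspecting the right operand further
leq (Succ _) Zero     = False
leq (Succ m) (Succ n) = leq m n

infinity :: Nat
infinity = Succ infinity       -- fully lazy succ makes this definition productive

main :: IO ()
main = print (leq (Succ (Succ Zero)) infinity)  -- prints True
```
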

Example: `3*4+3*4`

●

`3*4+3*4`

→{first operand not done}

●

`3*4`

→{both operands done, arithmetic}

`12`

…

`12+3*4`

→{second operand not done}

●

`3*4`

→{both operands done, arithmetic}

`12`

…

`12+12`

→{both operands done, arithmetic}

`24`

□

Example: `0*(3+4)`

●

`0*(3+4)`

→{second operand not done}

●

`3+4`

→{arithmetic}

`7`

…

`0*7`

→{arithmetic}

`0`

□

The evaluation rule for most operations on Char (comparison, most Enum operations, conversion from/to Int) is similar: evaluate the operands (in operand order, usually) until they are final answers, then evaluate the expression in question until it is a final answer.

Example: `succ 'g' == pred (pred 'j')`

●

`succ 'g' == pred (pred 'j')`

→{first operand not done}

●

`succ 'g'`

→{Char succ}

`'h'`

…

`'h' == pred (pred 'j')`

→{second operand not done}

●

`pred (pred 'j')`

→{operand not done}

●

`pred 'j'`

→{Char pred}

`'i'`

…

`pred 'i'`

→{Char pred}

`'h'`

…

`'h' == 'h'`

→{Char comparison}

`True`

□

if-then-else can be regarded as a special case of pattern matching. I cover if-then-else first—it's easy.

The evaluation rule for `if b then et else ef`: evaluate `b` until it becomes `True` or `False`, then go with `et` or `ef`, respectively.

Example: `if 2>2 then () else ()`

●

`if 2>2 then () else ()`

→{need condition}

●

`2>2`

→{arithmetic}

`False`

…

`if False then () else ()`

→{if-then-else}

`()`

□

The evaluation rule for `case e of {…}`: evaluate the argument `e` enough to find the first pattern that matches (and enough to see that it does not match the patterns before it), then bind variables in the chosen pattern, and go with the chosen branch; if no pattern matches, it's a runtime error.

There are a lot of variations in how much evaluation is enough and how much binding and sharing happens, depending on the patterns. In this section and several case studies shortly after, all patterns have at most one level, e.g., there are `Just x` and `x:xs`, but none nested like `Left (Just x : (Nothing : ys))`. There are also no guards. This is why the section title says “Level 1”; we shall accomplish higher levels later! So this section just focuses on level 1 patterns and sharing.

With level 1 patterns, the argument `e` is evaluated just until its outermost constructor is exposed, and then we know which branch to choose.
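This can be checked by hiding a bottom inside the constructor (my example):

```haskell
-- A level 1 match forces only the outermost constructor; the payload
-- stays an unevaluated thunk, so an undefined payload is harmless.
isJust' :: Maybe a -> Bool
isJust' m = case m of
  Just _  -> True
  Nothing -> False

main :: IO ()
main = print (isJust' (Just undefined))  -- prints True
```
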

This is right for types defined by `data`, but not quite right for those defined by `newtype`. I will cover `newtype` in the future.

Example: `case const (Just a) b of { Just _ -> True; Nothing -> False }`

●

`case const (Just a) b of { Just _ -> True; Nothing -> False }`

→{evaluate argument for pattern `Just _`}

●

`const (Just a) b`

→{function application}

`Just a`

…

`case Just a of { Just _ -> True; Nothing -> False }`

→{match pattern}

`True`

□

A level 1 pattern could also be a Char or numeric literal. Matching against it boils down to comparison using `(==)`, i.e., the argument is evaluated as much as `(==)` specifies. Thus, for Char and built-in numbers, evaluate until the final answer; for user-defined numbers, it depends on the user-defined `(==)`.

Example: `case 3*4 of { 0 -> True; _ -> False }`

●

`case 3*4 of { 0 -> True; _ -> False }`

→{evaluate argument for pattern `0`}

●

`3*4`

→{arithmetic}

`12`

…

`case 12 of { 0 -> True; _ -> False }`

→{match pattern}

`False`

□

There are also level 0 patterns: just a variable or the `_` wildcard. The argument is not evaluated at all to match them.

Example: `case 3*4 of { x -> True }`

●

`case 3*4 of { x -> True }`

→{match pattern}

`True`

□
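Since the argument is not evaluated at all, even a bottom argument works (my example):

```haskell
-- A level 0 pattern (here the wildcard) matches without evaluating
-- the argument, so undefined causes no trouble.
main :: IO ()
main = print (case (undefined :: Int) of _ -> True)  -- prints True
```
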

Variables in patterns are similar to lambda-bound variables: They are holes to be filled with pointers, and when you fill them in and go with the branch, multiple occurrences point to the same node (sharing).

Example: `case Just (3*4) of { Just x -> (x,x); Nothing -> (0,0) }`

●

`case Just (3*4) of { Just x -> (x,x); Nothing -> (0,0) }`

→{match pattern}

`(x,x)  where x = 3*4`

□

Example: `case Just (3*4) of { v@(Just x) -> (x,v); Nothing -> undefined }`

●

`case Just (3*4) of { v@(Just x) -> (x,v); Nothing -> undefined }`

→{match pattern}

`(x, v)  where v = Just x; x = 3*4`

□

In the graph above, I draw some nodes explicitly to subtly indicate this process. We have a `Just` node, and it has a pointer to a `3*4` node. We give the `Just` node to `case`, and that's what `v` receives. Whatever our `Just` node points to, that's what `x` receives. This explains why `3*4` is shared. The following example further emphasizes that `v` also shares existing nodes rather than pointing to new nodes.

Example: `let j = Just (3*4) in case j of { v@(Just x) -> (x, v, j) }`

●

`case j of { v@(Just x) -> (x, v, j) }  where j = Just (3*4)`

→{match pattern}

`(x, v, j)  where v = j; j = Just x; x = 3*4`

□

The evaluation rule for function application, where the function is defined elsewhere by pattern-matching with 1 argument: similar to a `case` expression, evaluate the argument enough to find the first match, then bind variables and go with the chosen RHS; if no equation matches, abort.

Example: `null (iterate id 'c')`

given these definitions:

```
null [] = True
null (_:_) = False

iterate f x = x : iterate f (f x)
```

●

`null (iterate id 'c')`

→{evaluate argument for pattern `[]`}

●

`iterate id 'c'`

→{function application}

`'c' : iterate id 'c'`

…

`null ('c' : iterate id 'c')`

→{match pattern, function application}

`False`

□

(Technically I should draw a graph to show the sharing of `'c'`.)

The evaluation rule for function application, where the function takes several arguments and is defined by pattern-matching, is fairly delicate to describe because the order of evaluating the arguments is delicate.

Presumably, the function definition consists of a sequence of equations. At function application, the equations are tried in the sequence order.

- To try the first equation:
  - Evaluate the 1st argument until we know it matches or it doesn't match. If it doesn't match, jump below to try the second equation.
  - Evaluate the 2nd argument until we know it matches or it doesn't match. If it doesn't match, jump below to try the second equation.
  - Etc.
  - If all arguments match, bind variables.
  - If the equation comes with no guard, go with its RHS. Otherwise…
  - If there are guards, evaluate them in sequence until the first guard that evaluates to `True`, and go with its RHS. If all evaluate to `False`, jump below to try the second equation.
- To try the second equation:
  - Evaluate the 1st argument until we know it matches or it doesn't match. If it doesn't match, jump below to try the third equation.
  - Evaluate the 2nd argument until we know it matches or it doesn't match. If it doesn't match, jump below to try the third equation.
  - …
- To try the third equation: …
- If none of the equations is chosen, it's a runtime error.

An important aspect is that arguments could be more evaluated than the chosen equation requires, because there are equations before it, and they have more specific patterns or more demanding guards, and we need that much evaluation to decide to skip them.

Example: `f [] (r 'y')`

given:

```
r x = x : r x

f [] _ = 0
f _ [] = 1
```

●

`f [] (r 'y')`

→{match pattern, function application}

`0`

□

Example: `f (r 'y') []`

●

`f (r 'y') []`

→{evaluate 1st argument for pattern `[]`}

●

`r 'y'`

→{function application}

`'y' : r 'y'`

…

`f ('y' : r 'y') []`

→{match pattern, function application}

`1`

□

Example: `g (r 'x') (r 'y')`

given:

```
r x = x : r x

g (x:xs) (y:ys) | x==y = 0
g _ _ = 1
```

●

`g (r 'x') (r 'y')`

→{evaluate arguments for 1st `g` equation}

●

`r 'x'`

→{function application}

`'x' : r 'x'`

●

`r 'y'`

→{function application}

`'y' : r 'y'`

●

`'x' == 'y'`

→{Char equality}

`False`

…

`g ('x' : r 'x') ('y' : r 'y')`

→{match pattern, function application}

`1`

□

Function definitions that not only have patterns and guards but also `where` bindings can have very elaborate sharing structures. I will show them in the future when I find a good notation.

The most important purpose of knowing lazy evaluation is to explain asymptotic complexity, especially exhaustions such as stack overflow and heap leak.

Asymptotic time complexity is the number of steps in the evaluation sequence.

Asymptotic stack complexity is the depth of nesting in the evaluation sequence (how many times I zoom into sub-expressions).

Asymptotic heap complexity can be gleaned from expression sizes in the evaluation sequence, but it may take some work when there are zoom-in steps.

When there is no zooming in, every step has the complete expression graph, and so the graph size is heap size. Of course, I inline some nodes, so count both node numbers and string lengths to be sure. Richard Bird's book has another good way to count: count actual arguments. All are asymptotically accurate counts, differing in constant overheads only.

When there is zooming in, in some cases none of the steps shows the complete expression graph at the time. So you have to imagine a flattened presentation for the complete graph. Here is an example excerpted from the `foldr` case study later:

●

`f (1:2:3:[])`

→{match pattern, function application}

`1 + f (2:3:[])`

→{need right operand}

●

`f (2:3:[])`

→{match pattern, function application}

`2 + f (3:[])`

→{need right operand}

●

`f (3:[])`

→{match pattern, function application}

`3 + f []`

→{need right operand}

●

`f []`

→{match pattern, function application}

`0`

At this point, the complete graph is `1 + (2 + (3 + 0))`. The total number of actual arguments is 6 (3 binary operators). In the `foldr` case study, if you start with a list of length n, then in the middle the complete graph has 2n actual arguments, i.e., Θ(n) heap space.

It is a good exercise to sum numbers (and do other tasks) with `foldr`, though we know it is fairly expensive.

```
foldr f z [] = z
foldr f z (x:xs) = f x (foldr f z xs)
```

●

`foldr (+) 0 (1:2:3:[])`

→{match pattern, function application}

`1 + foldr (+) 0 (2:3:[])`

→{need right operand}

●

`foldr (+) 0 (2:3:[])`

→{match pattern, function application}

`2 + foldr (+) 0 (3:[])`

→{need right operand}

●

`foldr (+) 0 (3:[])`

→{match pattern, function application}

`3 + foldr (+) 0 []`

→{need right operand}

●

`foldr (+) 0 []`

→{match pattern, function application}

`0`

…

`3 + 0`

→{arithmetic}

`3`

…

`2 + 3`

→{arithmetic}

`5`

…

`1 + 5`

→{arithmetic}

`6`

□

This takes Θ(n) stack space; a long list will cause a stack overflow. Moreover, at the darkest moment, the whole graph for `1+(2+(3+0))` exists in the heap. This takes Θ(n) heap space.

Lest you think `foldr` is always a hog, here is a different example.

```
True  || _ = True
False || x = x
```

●

`foldr (||) False (True:False:False:[])`

→{match pattern, function application}

`True || foldr (||) False (False:False:[])`

→{match pattern, function application}

`True`

□

If the binary operation is lazy in the 2nd argument, `foldr` is not a hog.
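So much so that the rest of the list need never exist (my example):

```haskell
-- (||) ignores its second argument when the first is True, so foldr
-- stops after one cell; even an undefined tail is never touched.
main :: IO ()
main = print (foldr (||) False (True : undefined))  -- prints True
```
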

You have seen how `foldr` fails at summing numbers. Lest you think `foldl` is any better:

```
foldl f z [] = z
foldl f z (x:xs) = foldl f (f z x) xs
```

●

`foldl (+) 0 (1:2:3:[])`

→{match pattern, function application}

`foldl (+) (0+1) (2:3:[])`

→{match pattern, function application}

`foldl (+) ((0+1)+2) (3:[])`

→{match pattern, function application}

`foldl (+) (((0+1)+2)+3) []`

→{match pattern, function application}

`((0+1)+2)+3`

→{need left operand}

●

`(0+1)+2`

→{need left operand}

●

`0+1`

→{arithmetic}

`1`

…

`1+2`

→{arithmetic}

`3`

…

`3+3`

→{arithmetic}

`6`

□

This uses Θ(n) stack space and Θ(n) heap space. So-called “tail call” is utterly irrelevant. After reading this example, please do me a favour: never ask about “tail call” again.

A better way to sum numbers is covered in the future, involving eagerness annotations.
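As a preview, the standard library already packages such an annotation as `Data.List.foldl'`, which forces the accumulator at every step so the chain `((0+1)+2)+3` never builds up:

```haskell
import Data.List (foldl')

main :: IO ()
main = print (foldl' (+) 0 [1 .. 1000000 :: Int])  -- runs in constant stack and heap
```
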

Even if we have an efficient function for summing (call it `goodsum`), the next surprise comes when computing averages using `goodsum 0 xs / length xs`.

Assume that `goodsum` behaves like this in general:

`goodsum 5 (10:xs)`

→{match pattern, function application}

`goodsum (5+10) xs`

→{magic}

`goodsum 15 xs`

etc

and these definitions for the rest:

```
average xs = goodsum 0 xs / length xs

genlist 0 = []
genlist n = n : genlist (n-1)
```

I won't need a definition for `length` because I won't get that far.

●

`average (genlist 3)`

→{function application}

`goodsum 0 xs / length xs  where xs = genlist 3`

→{evaluate left operand}

●

`goodsum 0 (genlist 3)`

→{evaluate operand for pattern}

●

`genlist 3`

→{match pattern, function application}

`3 : genlist (3-1)`

…

`goodsum 0 (3 : genlist (3-1))`

→{magic}

`goodsum 3 (genlist (3-1))`

→{evaluate operand for pattern}

●

`genlist (3-1)`

→{evaluate operand, match pattern, function application}

`2 : genlist (2-1)`

…

`goodsum 3 (2 : genlist (2-1))`

→{magic}

`goodsum 5 (genlist (2-1))`

→{evaluate operand for pattern}

●

`genlist (2-1)`

→{evaluate operand, match pattern, function application}

`1 : genlist (1-1)`

…

`goodsum 5 (1 : genlist (1-1))`

→{magic}

`goodsum 6 (genlist (1-1))`

→{evaluate operand for pattern}

●

`genlist (1-1)`

→{evaluate operand, match pattern, function application}

`[]`

…

`goodsum 6 []`

→{match pattern, function application}

`6`

…

`6 / length xs  where xs = 3 : 2 : 1 : []`

etc

The effect is that due to sharing, although `goodsum` tries to forget pieces of the list as it goes, some other part of the parent expression has a pointer to the list and keeps it. Before `goodsum` begins, the pointer points to just a short `genlist 3`; when `goodsum` finishes, the pointer points to a long list, which occupies Θ(n) heap space. Some people consider it a heap leak.

To show this effect more clearly, I give a flat presentation:

●

`average (genlist 3)`

→{function application}

`goodsum 0 xs / length xs  where xs = genlist 3`

→{evaluate `genlist`}

`goodsum 0 xs / length xs  where xs = 3 : genlist (3-1)`

→{evaluate `goodsum`}

`goodsum 3 ys / length xs  where xs = 3 : ys; ys = genlist (3-1)`

→{evaluate `genlist`}

`goodsum 3 ys / length xs  where xs = 3 : ys; ys = 2 : genlist (2-1)`

→{evaluate `goodsum`}

`goodsum 5 zs / length xs  where xs = 3 : ys; ys = 2 : zs; zs = genlist (2-1)`

→{evaluate `genlist`}

`goodsum 5 zs / length xs  where xs = 3 : ys; ys = 2 : zs; zs = 1 : genlist (1-1)`

→{evaluate `goodsum`}

`goodsum 6 ws / length xs  where xs = 3 : ys; ys = 2 : zs; zs = 1 : ws; ws = genlist (1-1)`

→{evaluate `genlist`}

`goodsum 6 ws / length xs  where xs = 3 : ys; ys = 2 : zs; zs = 1 : ws; ws = []`

→{evaluate `goodsum`}

`6 / length xs  where xs = 3 : 2 : 1 : []`

etc

Thus heap size has grown to Θ(n).

Lest you think all lazy evaluation does is take up a lot of memory, the following examples show how to use lazy evaluation properly to obtain compositional, asymptotically optimal programs.

Again, assume that `goodsum` behaves like this in general:

`goodsum 5 (10:xs)`

→{match pattern, function application}

`goodsum (5+10) xs`

→{magic}

`goodsum 15 xs`

etc

and this definition:

```
genlist 0 = []
genlist n = n : genlist (n-1)
```

Evaluate `goodsum 0 (genlist 3)`:

●

`goodsum 0 (genlist 3)`

→{evaluate operand for pattern}

●

`genlist 3`

→{match pattern, function application}

`3 : genlist (3-1)`

…

`goodsum 0 (3 : genlist (3-1))`

→{magic}

`goodsum 3 (genlist (3-1))`

→{evaluate operand for pattern}

●

`genlist (3-1)`

→{evaluate operand, match pattern, function application}

`2 : genlist (2-1)`

…

`goodsum 3 (2 : genlist (2-1))`

→{magic}

`goodsum 5 (genlist (2-1))`

→{evaluate operand for pattern}

●

`genlist (2-1)`

→{evaluate operand, match pattern, function application}

`1 : genlist (1-1)`

…

`goodsum 5 (1 : genlist (1-1))`

→{magic}

`goodsum 6 (genlist (1-1))`

→{evaluate operand for pattern}

●

`genlist (1-1)`

→{evaluate operand, match pattern, function application}

`[]`

…

`goodsum 6 []`

→{match pattern, function application}

`6`

For a while, it goes like the Average example. The important point this time is that, as no one else holds a pointer to the generated list, its cells are short-lived, like virtual particles. This program takes Θ(1) memory. Some compilers even optimize away the short-lived cells.

This shows an important idiom of using lazy evaluation: We have a producer, `genlist 3`, and we have a consumer, `goodsum 0`, and when put together, they behave like a pair of co-routines, passing control to the producer for one cell, then passing control to the consumer to use and lose that cell, then passing control to the producer for the next cell… We say that in Haskell we use lists as our for-loops.

The above example still does not do justice to the idiom. The idiom really shines when the consumer stops consuming early for whatever reason it feels like. The producer is not burdened with the stopping condition of the consumer. We get a cleaner separation. To be continued.
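As a small taste of that (my example, not one of the case studies): the producer below is infinite, and only the consumer decides to stop.

```haskell
-- iterate produces cells on demand; take 5 stops consuming after five,
-- so only five cells are ever built.
main :: IO ()
main = print (sum (take 5 (iterate (* 2) (1 :: Int))))  -- 1+2+4+8+16 = 31
```
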

I have more Haskell Notes and Examples.