
Equational reasoning aids in understanding functional programs. One important principle of equational reasoning is β-*substitution*: if *f*(*x*) = *B* for some function body *B*, then any application *f*(*E*) to an expression *E* is equivalent to *B* with every occurrence of *x* replaced by *E*:

*f*(*x*) = *B* implies that *f*(*E*) ≡ *B*[*x* ↦ *E*]

But consider the following two PureFunJava program fragments:

```
/* left */                           
{int loop (int z) {
   return if (z>0) z
          else loop(z);
 }
 int f (int x) {
   return if (y>8) x
          else -y;
 }
 return f(loop(y));
}
```

```
/* right */
{int loop (int z) {
   return if (z>0) z
          else loop(z);
 }
 int f (int x) {
   return if (y>8) x
          else -y;
 }
 return if (y>8) loop(y)
        else -y;
}
```

If the expression *B* is `if (y>8) x else -y`, and the expression *E* is `loop(y)`, then clearly the program on the left contains *f*(*E*) and the program on the right contains *B*[*x* ↦ *E*]. So, by equational reasoning, these programs are equivalent. However, *the programs do not always behave the same!* If *y* = 0, the program on the right returns 0, but the program on the left first gets stuck in a call to *loop*(0), which infinite-loops.

Clearly, if we want to claim that two programs are equivalent, then they must behave the same. In PureFunJava, if we obtain program *A* by performing substitution on program *B*, then *A* and *B* will never give different results *if they both halt*; but *A* and *B* might not halt on the same set of inputs.

To remedy this (partial) failure of equational reasoning, we can introduce *lazy evaluation* into the programming language. Haskell is the most widely used lazy language. A program compiled with lazy evaluation will not evaluate any expression unless its value is demanded by some other part of the computation. In contrast, *strict* languages such as MiniJava, PureFunJava, ML, C, and Java evaluate each expression as the control flow of the program reaches it.

To explore the compilation of lazy functional languages, we will use the LazyJava language. Its syntax is identical to PureFunJava, and its semantics are almost identical, except that lazy evaluation is used in compiling it.
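The difference between the two programs can be reproduced in ordinary Java by passing the argument as a thunk; this sketch (the class and method names are mine, not the book's) uses `java.util.function.IntSupplier` to stand in for an unevaluated argument:

```java
import java.util.function.IntSupplier;

public class Subst {
    static int y = 0;

    // Like the book's loop: diverges for z <= 0.
    static int loop(int z) {
        while (z <= 0) { /* spin forever */ }
        return z;
    }

    // Left-hand program: the argument is evaluated before the call (strict).
    static int fStrict(int x) { return (y > 8) ? x : -y; }

    // Right-hand behavior: the argument is passed unevaluated, like B[x -> E]
    // where E is only evaluated if the (y>8) branch demands it.
    static int fLazy(IntSupplier x) { return (y > 8) ? x.getAsInt() : -y; }

    public static void main(String[] args) {
        System.out.println(fLazy(() -> loop(y)));  // prints 0: loop(0) never runs
        // fStrict(loop(y)) would evaluate loop(0) first and hang forever.
    }
}
```

The thunked version terminates for *y* = 0 precisely because the divergent `loop(0)` is never demanded.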

Most programming languages (Pascal, C, ML, Java, MiniJava, PureFunJava) use *call-by-value* to pass function arguments: to compute *f*(*g*(*x*)), first *g*(*x*) is computed, and this value is passed to *f*. But if *f* does not actually need to use its argument, then computing *g*(*x*) will have been unnecessary.

To avoid computing expressions before their results are needed, we can use *call-by-name* evaluation. Essentially, each variable is not a simple value, but a *thunk*: a function that computes the value on demand. The compiler replaces each expression of type `int` with a function value of type `()->int`, and similarly for all other types. At each place where a variable is created, the compiler creates a function value; and everywhere a variable is used, the compiler puts a function application. Thus the LazyJava program

{int a = 5+7; return a + 10; }

is automatically transformed to

{int a() {return 5+7;} return a() + 10; }
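In real Java, which has no nested function declarations of quite this form, the same call-by-name transformation can be sketched with `java.util.function.Supplier` standing in for the `()->int` type (the class name is my own, not the book's):

```java
import java.util.function.Supplier;

public class CallByName {
    // Call-by-name: 'a' is a thunk, re-evaluated at every use.
    static int example() {
        Supplier<Integer> a = () -> 5 + 7;   // int a() { return 5+7; }
        return a.get() + 10;                 // return a() + 10;
    }

    public static void main(String[] args) {
        System.out.println(example());       // prints 22
    }
}
```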

Where are variables created? At variable declarations and at function-parameter bindings. Thus, each variable turns into a function, and at each function-call site, we need a little function declaration for each actual-parameter expression. Program 15.14 illustrates this transformation applied to the `look` function of Program 15.3a.

```
type c_void_int = () -> int;
type c_void_tree = () -> tree;
type c_void_String = () -> String;
class tree {
  c_void_String key;
  c_void_int binding;
  c_void_tree left;
  c_void_tree right;
}
public c_void_int look(c_void_tree t, c_void_String k) {
  int c() { return t().key().compareTo(k()); }
  if (c() < 0) return look(t().left, k);
  else if (c() > 0) return look(t().right, k);
  else return t().binding;
}
```

The problem with call-by-name is that each thunk may be executed many times, each time (redundantly) yielding the same value. For example, suppose there is a tree represented by a thunk `t1`. Each time `look(t1,k)` is called, `t1()` is evaluated, which rebuilds the (identical) tree every time!
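The redundancy is easy to observe; in this hypothetical Java sketch, a counter records how many times a plain (memo-less) thunk body runs under call-by-name:

```java
import java.util.function.Supplier;

public class ByName {
    static int buildCount = 0;

    // Stands in for an expensive computation, e.g. rebuilding a tree.
    static int expensive() { buildCount++; return 12; }

    public static void main(String[] args) {
        // Call-by-name: a plain thunk with no memo slot.
        Supplier<Integer> t1 = () -> expensive();
        int a = t1.get();   // runs expensive()
        int b = t1.get();   // runs it again, redundantly
        System.out.println(a + " " + b + " " + buildCount);  // 12 12 2
    }
}
```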

Lazy evaluation, also called *call-by-need*, is a modification of call-by-name that never evaluates the same thunk twice. Each thunk is equipped with a *memo* slot to store the value. When the thunk is first created, the memo slot is empty. Each evaluation of the thunk checks the memo slot: if full, it returns the *memo-ized* value; if empty, it calls the thunk function.

To streamline this process, we will represent a lazy thunk as an object with a *thunk function*, a *memo slot*, and (as with any closure object) instance variables to represent free variables as necessary for use by the thunk function. An *unevaluated* thunk has an empty memo slot, and the thunk function, when called, computes a value and puts it in the memo slot. An *evaluated* thunk has the previously computed value in its memo slot, and its thunk function just returns the memo-slot value.

For example, the LazyJava declaration `int twenty = addFive(15)` (in Program 15.1) is compiled in a context where the environment pointer EP will point to a record containing the `addFive` function. The representation of `addFive(15)` is not a function call that will go and compute the answer *now*, but a thunk that will remember how to compute it on demand, *later*. We might translate this fragment of the LazyJava program into FunJava as follows:

```
/* this already points to a record containing addFive */
c_void_int twenty = new twentyThunk(this.addFive);
```

which is supported by the auxiliary declarations

```
class intThunk {public int eval(); int memo; boolean done;}
class c_int_int {public int exec(int x);}
class intFuncThunk {public c_int_int eval(); c_int_int memo; boolean done;}

class twentyThunk extends intThunk {
  intFuncThunk addFive;
  public int eval() {
    if (!done) { memo = addFive.eval().exec(15); done = true; }
    return memo;
  }
  twentyThunk(intFuncThunk addFive) { this.addFive = addFive; }
}
twentyThunk twenty = new twentyThunk(...);
```

To create a thunk such as `twenty`, it must be given values for its free variables (in this case, `addFive`) so that when it later evaluates, it has all the information it needs; this is just the same as closure conversion. To *touch* a lazy thunk `t`, we just compute `t.eval()`. The first time `t.eval()` is executed, it will see that `done` is false, and it will calculate the result and put it in `memo`. Any subsequent time that `t` is touched, `t.eval()` will simply return the `memo` field.
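A minimal runnable Java sketch of this evaluated/unevaluated thunk protocol (the class `IntThunk` and the counter are my own illustration, not the book's FunJava classes) looks like this:

```java
import java.util.function.Supplier;

public class Memo {
    // A call-by-need thunk: the body runs at most once; the memo slot caches it.
    static class IntThunk {
        private final Supplier<Integer> body;
        private int memo;
        private boolean done = false;       // memo slot starts empty
        IntThunk(Supplier<Integer> body) { this.body = body; }
        int eval() {
            if (!done) { memo = body.get(); done = true; }
            return memo;
        }
    }

    static int evalCount = 0;

    public static void main(String[] args) {
        IntThunk twenty = new IntThunk(() -> { evalCount++; return 5 + 15; });
        int a = twenty.eval();   // first touch: computes and fills the memo slot
        int b = twenty.eval();   // second touch: returns the memo-ized value
        System.out.println(a + " " + b + " " + evalCount);  // 20 20 1
    }
}
```

Touching the thunk twice still runs the body only once, which is exactly the difference between call-by-need and call-by-name.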

Here is a program that uses the `enter` function of Program 15.3b to build a tree mapping [three ↦ 3!, -one ↦ (−1)!]:

```
{int fact (int i) {
   return if (i==0) 1 else i * fact(i-1);
 }
 tree t0 = new tree("", 0, null, null);
 tree t1 = t0.enter("-one", fact(-1));
 tree t2 = t1.enter("three", fact(3));
 return putInt(t2.look("three"), exit);
}
```

A curious thing about this program is that `fact(-1)` is undefined. Thus, if this program is compiled by a (strict) PureFunJava compiler, it will infinite-loop (or will eventually overflow the machine's arithmetic as it keeps subtracting 1 from a negative number). But if compiled by a LazyJava compiler, the program will succeed, printing 3! = 6.

First, variable `t1` is defined; but this does not actually call `enter` - it merely makes a thunk which will do so on demand. Then `t2` is defined, which also does nothing but make a thunk. Then a thunk is created for `look(t2,"three")` (but `look` is not actually called). Finally, a thunk for the expression `putInt(…,exit)` is created. This is the result of the program.

But the runtime system then "demands" an answer from this program, which can be computed only by calling the outermost thunk. So the body of `putInt` executes, which immediately demands the integer value of its first argument; this causes the `look(t2,"three")` thunk to evaluate. The body of `look` needs to compare `k` with `t.key`. Since `k` and `t` are each thunks, we can obtain a string by evaluating `k.eval()` and a tree by evaluating `t.eval()`. From the tree we can extract the `key` field; but each field is also a thunk, so we must actually evaluate `t.eval().key` to get the string.

The `t.key` value will turn out to be `"-one"`, which is less than `k`, so `look(t().right,k)` is called. *The program never evaluates the* `binding` *thunk in the* `-one` *node*, so `fact(-1)` is never given a chance to infinite-loop.

Lazy functional programs are subject to many of the same kinds of optimizations as strict functional programs, or even imperative programs. Loops can be identified (these are simply tail-recursive functions), induction variables can be identified, common subexpressions can be eliminated, and so on. In addition, lazy compilers can do some kinds of optimizations that strict functional or imperative compilers cannot, using equational reasoning. For example, given a loop

```
type intfun = int->int;
intfun f (int i) {
  public int g (int j) { return h(i) * j; }
  return g;
}
```

an optimizer might like to hoist the invariant computation `h(i)` out of the function `g`. After all, `g` may be called thousands of times, and it would be better not to recompute `h(i)` each time. Thus we obtain

```
type intfun = int->int;
intfun f (int i) {
  int hi = h(i);
  public int g (int j) { return hi * j; }
  return g;
}
```

and now each time `g` is called, it runs faster. This transformation is valid in a lazy language. But in a strict language, it is invalid! Suppose that after `intfun a = f(8)` the function `a` is never called at all, and suppose `h(8)` infinite-loops: before the "optimization" the program would have terminated successfully, but afterward we get a nonterminating program. Of course, the transformation is also invalid in an impure functional language, because `h(8)` might have side effects, and we are changing the number of times `h(8)` is executed.

**Dead-code removal** Another subtle problem with strict programming languages is the removal of *dead code*. Suppose we have

int f(int i) { int d = g(x); return i+2; }

The variable `d` is never used; it is *dead* at its definition, so the call to `g(x)` should be removed. In a conventional programming language, such as MiniJava or FunJava, we cannot remove `g(x)`, because it might have side effects that are necessary to the operation of the program.

In a strict, purely functional language such as PureFunJava, removing the computation `g(x)` could optimize a nonterminating computation into a terminating one! Though this seems benign, it can be very confusing to the programmer: we do not want programs to change their input/output behavior when compiled with different levels of optimization. In a lazy language, however, it is perfectly safe to remove dead computations such as `g(x)`.

**Deforestation** In any language, it is common to break a program into one module that produces a data structure and another module that consumes it. Program 15.15 is a simple example: `range(i,j)` generates a list of the integers from `i` to `j`, `squares(l)` returns the square of each number, and `sum(l)` adds up all the numbers.

```
class intList {int head; intList tail; intList(head,tail){...}}
type intfun = int->int;
type int2fun = (int,int) -> int;

public int sumSq(intfun inc, int2fun mul, int2fun add) {
  public intList range(int i, int j) {
    return if (i>j) null else new intList(i, range(inc(i),j));
  }
  public intList squares(intList l) {
    return if (l==null) null
           else new intList(mul(l.head,l.head), squares(l.tail));
  }
  public int sum(int accum, intList l) {
    return if (l==null) accum else sum(add(accum,l.head), l.tail);
  }
  return sum(0, squares(range(1,100)));
}
```

First `range` builds a list of 100 integers; then `squares` builds another list of 100 integers; finally, `sum` traverses this list. It is wasteful to build these lists. A transformation called *deforestation* removes intermediate lists and trees (hence the name) and does everything in one pass. The deforested `sumSq` program looks like this:

```
public int sumSq(intfun inc, int2fun mul, int2fun add) {
  public int f(int accum, int i, int j) {
    return if (i>j) accum else f(add(accum,mul(i,i)), inc(i), j);
  }
  return f(0,1,100);
}
```

In impure functional languages (where functions can have side effects) deforestation is not usually valid. Suppose, for example, that the functions `inc`, `mul`, and `add` alter global variables, or print on an output file. The deforestation transformation has rearranged the order of calling these functions; instead of

inc(1), inc(2), ... inc(100), mul(1, 1), mul(2, 2),... mul(100, 100), add(0, 1), add(1, 4),... add(328350, 10000)

the functions are called in the order

mul(1, 1), add(0, 1), inc(1), mul(2, 2), add(1, 4), inc(2), ⋮ mul(100, 100), add(328350, 10000), inc(100)

Only in a pure functional language is this transformation always legal.
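That deforestation preserves results (in the absence of side effects) can be checked concretely; this Java sketch (mine, not the book's) computes the sum of squares of 1..100 both by building intermediate lists and in one fused pass:

```java
import java.util.ArrayList;
import java.util.List;

public class Deforest {
    // Version 1: build intermediate lists, as in range/squares/sum.
    static int sumSqLists(int lo, int hi) {
        List<Integer> range = new ArrayList<>();
        for (int i = lo; i <= hi; i++) range.add(i);
        List<Integer> squares = new ArrayList<>();
        for (int x : range) squares.add(x * x);
        int sum = 0;
        for (int s : squares) sum += s;
        return sum;
    }

    // Version 2: deforested - one accumulating loop, no intermediate lists.
    static int sumSqFused(int lo, int hi) {
        int accum = 0;
        for (int i = lo; i <= hi; i++) accum += i * i;
        return accum;
    }

    public static void main(String[] args) {
        System.out.println(sumSqLists(1, 100));  // 338350
        System.out.println(sumSqFused(1, 100));  // 338350
    }
}
```

Both versions produce 338350; only the order and number of intermediate allocations differ, which is why purity is essential to the transformation's correctness.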

Although laziness allows certain new optimizations, the overhead of thunk creation and thunk evaluation is very high. If no attention is paid to this problem, then the lazy program will run slowly no matter what other optimizations are enabled. The solution is to put thunks only where they are needed. If a function *f*(*x*) is certain to evaluate its argument *x*, then there is no need to pass a thunk for *x*; we can just pass an evaluated *x* instead. We are trading an evaluation now for a certain eventual evaluation.

**Definition of strictness** We say a function *f*(*x*) is *strict in x* if, whenever some actual parameter *a* would fail to terminate, then *f*(*a*) would also fail to terminate. A multi-argument function *f*(*x*_{1},…, *x*_{n}) is strict in *x*_{i} if, whenever an actual parameter *a* would fail to terminate, then *f*(*x*_{1},…, *a*,…, *x*_{n}) would also fail to terminate. Consider these functions:

```
int f(int x, int y) { return x + x + y; }
int g(int x, int y) { return if (x>0) y else x; }
tree h(String x, int y) { return new tree(x,y,null,null); }
int j(int x) { return j(0); }
```

The function `f` is *strict* in its argument `x`, since if the result `f(x,y)` is demanded, then `f` will certainly touch (demand the value of) `x`. Similarly, `f` is strict in argument `y`, and `g` is strict in `x`. But `g` is not strict in its second argument, because `g` can sometimes compute its result without touching `y`. The function `h` is not strict in either argument. Even though it appears to "use" both `x` and `y`, it does not demand (string or integer) values from them; instead it just puts them into a data structure, and it could be that no other part of the program will ever demand values from the `key` or `binding` fields of that particular `tree`.

Curiously, by our definition of strictness, the function `j` is strict in `x` even though it never uses `x`. But the purpose of strictness analysis is to determine whether it is safe to evaluate `x` before passing it to the function `j`: will this cause a terminating program to become nonterminating? In this case, if `j` is going to be called, it will infinite-loop anyway, so it doesn't matter if we perform a (possibly nonterminating) evaluation of `x` beforehand.

**Using the result of strictness analysis** Program 15.16 shows the result of transforming the `look` function (of Program 15.3a) using strictness information. A call-by-name transformation has been applied here, as in Program 15.14, but the result would be similar using call-by-need. Function `look` is strict in both its arguments `t` and `k`. Thus, when comparing `k < t.key`, it does not have to *touch* `k` and `t`. However, the `t.key` field still points to a thunk, so it must be touched.

```
bindingThunk look(tree t, key k) {
return if (k < t.key.eval()) look(t.left.eval(), k)
else if (k > t.key.eval()) look(t.right.eval(), k)
else t.binding;
}
```

Since `look` is strict, callers of `look` are expected to pass evaluated values, not thunks. This is illustrated by the recursive calls, which must explicitly *touch* `t.left` and `t.right` to turn them from thunks into values.

**Approximate strictness analysis** In some cases, such as the functions `f`, `g`, and `h` above, the strictness or nonstrictness of a function is obvious - and easily determined by an optimizing compiler. But in general, exact strictness analysis is not computable - like exact dynamic liveness analysis (see page 210) and many other dataflow problems. Thus, compilers must use a conservative approximation: where the exact strictness of a function argument cannot be determined, the argument must be assumed nonstrict. Then a thunk will be created for it; this slows down the program a bit, but at least the optimizer will not have turned a terminating program into an infinite-looping program.

Algorithm 15.17 shows an algorithm for computing strictness. It maintains a set *H* of tuples of the form (`f`, (*b*_{1},…, *b*_{n})), where each *b*_{i} is 0 or 1; such a tuple records the possibility that a call to `f` may terminate when just those arguments *x*_{i} with *b*_{i} = 1 terminate.

[Algorithm 15.17 itself, the definition of the auxiliary function *M* and the iterative calculation of *H* (repeated **until** *done*), is not legibly reproduced here. After the calculation of *H* terminates, `f` is strict in its *i*th argument if the tuple whose *i*th component is 0 (and all other components 1) is never put into *H*.]

If (`f`, (1, 1, 0)) is in the set *H*, then `f` might not be strict in its third argument. If (`f`, (1, 1, 0)) is never put into *H*, then `f` must be strict in its third argument.

We also need an auxiliary function to calculate whether an *expression* may terminate. Given an expression *E* and a set of variables σ, we say that *M*(*E*, σ) means "*E* may terminate if all the variables in σ may terminate." If *E*_{1} is `i+j`, and there is some possibility that the thunks `i` and `j` may halt, then it is also possible that *E*_{1} will halt too: *M*(*i* + *j*, σ) = *M*(*i*, σ) ∧ *M*(*j*, σ).

Algorithm 15.17 will not work on the full LazyJava language, because it does not handle functions passed as arguments or returned as results. But for *first-order* programs (without higher-order functions), it does a good job of computing (static) strictness. More powerful algorithms for strictness analysis handle higher-order functions.
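The definition of strictness from earlier in this section can be observed concretely; in this Java sketch (the names are mine), `g`'s second argument is passed as a thunk, and a flag records whether it is ever touched:

```java
import java.util.function.Supplier;

public class Strict {
    static boolean touched = false;

    // g is strict in x but not in y: y is demanded only when x > 0.
    static int g(int x, Supplier<Integer> y) {
        return (x > 0) ? y.get() : x;
    }

    public static void main(String[] args) {
        Supplier<Integer> y = () -> { touched = true; return 42; };
        System.out.println(g(-5, y));   // -5: the else branch never touches y
        System.out.println(touched);    // false
        System.out.println(g(1, y));    // 42: now y is demanded
        System.out.println(touched);    // true
    }
}
```

Because `g` is not strict in `y`, a strictness-driven compiler must keep passing `y` as a thunk; for `x` it could safely pass an evaluated value.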

Church [1941] developed the λ-calculus, a "programming language" of nested functions that can be passed as arguments and returned as results. He was hampered by having no machines to compile for.

**Closures** Landin [1964] showed how to interpret λ-calculus on an abstract machine, using closures allocated on a heap. Steele [1978] used closure representations specialized to different patterns of function usage, so that in many cases nonlocal variables are passed as extra arguments to an inner function to avoid heap-allocating a record. Cousineau et al. [1985] showed how closure conversion can be expressed as a transformation back into the source language, so that closure analysis can be cleanly separated from other phases of code generation. Static links are actually not the best basis for doing closure conversion; for many reasons it is better to consider each nonlocal variable separately, instead of always grouping together all the variables at the same nesting level. Kranz et al. [1986] performed *escape analysis* to determine which closures can be stack-allocated because they do not outlive their creating function, and also integrated closure analysis with register allocation to make a high-performance optimizing compiler. Shao and Appel [1994] integrate closures with the use of callee-save registers to minimize the load/store traffic caused by accessing local and nonlocal variables. Appel [1992, Chapters 10 and 12] has a good overview of closure conversion.

**Continuations** Tail calls are particularly efficient and easy to analyze. Strachey and Wadsworth [1974] showed that the control flow of any program (even an imperative one) can be expressed as function calls, using the notion of *continuations*. Steele [1978] transformed programs into *continuation-passing style* early in compilation, turning all function calls into tail calls, to simplify all the analysis and optimization phases of the compiler. Kranz et al. [1986] built an optimizing compiler for Scheme using continuation-passing style; Appel [1992] describes a continuation-based optimizing compiler for ML.

**Inline expansion** Cocke and Schwartz [1970] describe inline expansion of function bodies; Scheifler [1977] shows that it is particularly useful for languages supporting data abstraction, where there tend to be many tiny functions implementing operations on an abstract data type. Appel [1992] describes practical heuristics for controlling code explosion.

**Continuation-based I/O** Wadler [1995] describes the use of monads to generalize the notion of continuation-based interaction.

**Lazy evaluation** Algol-60 [Naur et al. 1963] used call-by-name evaluation for function arguments, implemented using thunks - but also permitted side effects, so programmers needed to know what they were doing! Most of its successors use call-by-value. Henderson and Morris [1976] and Friedman and Wise [1976] independently invented lazy evaluation (call-by-need). Hughes [1989] argues that lazy functional languages permit clearer and more modular coding than imperative languages.

Several lazy pure functional languages were developed in the 1980s; the community of researchers in this area designed and adopted the language Haskell [Hudak et al. 1992] as a standard. Peyton Jones [1987; 1992] describes many implementation and optimization techniques for lazy functional languages; Peyton Jones and Partain [1993] describe a practical algorithm for higher-order strictness analysis. Wadler [1990] describes deforestation.
