A FUNCTIONAL INTERMEDIATE FORM
A functional coding language is one in which (as discussed in ) execution proceeds by binding variables to values, and never modifying a variable once it is initialized. This permits equational reasoning, which is useful to the programmer. But equational reasoning is even more useful to the compiler - many compiler optimizations involve the rewriting of a slow program into an equivalent faster program. When the compiler doesn't have to worry about x's value now versus x's value later, then these transformations are easier to express. This single-assignment property is at the heart of both functional coding and SSA form. There is a close relationship between the functional intermediate representations used by functional-language compilers and the SSA form used by imperative-language compilers. Image 19.18 shows the abstract syntax of the kind of intermediate representation used in modern functional-language compilers. It aspires to the best qualities of quadruples, SSA form, and lambda-calculus. As in quadruple notation, expressions are broken down into primitive operations whose order of evaluation is specified, every intermediate result is an explicitly named temporary, and every argument of an operator or function is an atom (variable or constant). As in SSA form and lambda-calculus, every variable has a single assignment (or binding), and every use of the variable is within the scope of the binding. As in lambda-calculus, scope is a simple syntactic notion, not requiring calculation of dominators.
![]() |
atom → c Constant integer atom → s Constant string pointer atom → v Variable exp → let fundefs in exp Function declaration exp → let v = atom in exp Copy exp → let v = binop(atom, atom) in exp Arithmetic operator exp → let v = M[atom] in exp Fetch from memory exp → M[atom]:=atom; exp Store to memory exp → if atom relop atom then exp else exp Conditional branch exp → atom(args) Tail call exp → let v = atom(args) in exp Non-tail call exp → return atom Return args → args → atom args fundefs → fundefs → fundefs function v(formals) = exp formals → formals → v formals binop → plus | minus | mul | ... relop → eq | ne | lt | ...
![]() |
Image 19.18: Functional intermediate representation. Binding occurrences of variables are underlined.
Scope No variable name can be used in more than one binding. Every binding of a variable has a scope within which all the uses of that variable must occur. For a variable bound by let v = … in exp, the scope of v is just the
exp. The scope of a function variable fi bound in let function f1(...) = exp1 : : function fk(...) = expk in exp
includes all the expj (to allow for mutually recursive functions) as well as the exp. For a variable bound as the formal parameter of a function, the scope is the body of that function. These scope rules make many optimizations easy to reason about; we will take inline expansion of functions as an example. As discussed in , when we have a definition f(x) = E and a use f(z) we can replace the use by a copy of E but with all the x's replaced by z's. In the Tree language of this is difficult to express because there are no functions; in the functional notation of the substitution can get complicated if z is a nonatomic expression (as shown in Algorithm 15.8b). But in the functional intermediate form of Image 19.18, where all actual parameters are atomic, inline expansion becomes very simple, as shown in Algorithm 15.8a. Translating SSA into functional form Any SSA program can be translated into this functional form, as shown in Algorithm 19.20. Each control-flow node with more than one predecessor becomes a function. The arguments of that function are precisely the variables for which there are φ-functions at the node. If node f dominates node g, then the function for g will be nested inside the body of the function for f . Instead of jumping to a node, a control-flow edge into a φ-containing node is represented by a function call. Program 19.19 shows how a translated program looks.SSA program of Image 19.4g converted to functional intermediate form.
Translate(node)= let C be the children of node in the dominator tree let p1,..., pn be the nodes of C that have more than one predecessor for i ← 1 to n let a1,..., ak be the targets of φ functions in pi (possibly k = 0) let Si = Translate(pi) let Fi = "function f pi (a1,..., ak) = Si" let F = F1 F2 ... Fn return Statements(node, 1, F) Statements(node, j, F)= if there are < j statements in node then let s be the successor of node if s has only one predecessor then return Statements(s, 1, F) else s has m predecessors suppose node is the ith predecessor of s suppose the φ-functions in s are a1 ← φ(a11,..., a1m), ... ak ← φ(ak1,..., akm) return "let F in fs(a1i,..., aki )" else if the jth statement of node is a φ-function then return Statements(node, j + 1, F) else if the jth statement of node is "return a" then return "let F in return a" else if the jth statement of node is a ← b ⊕ c The cases for a ← b, then let S = Statements(node, j + 1, F) a ← M[b], and M[a] ← b return "let a = b ⊕ c in S" are similar. else if the jth statement of node is "if a < b goto s1 else s2" then (in edge-split SSA form) s1 has only one predecessor, as does s2 let S1 = Translate(s1) let S2 = Translate(s2) return "let F in if a < b then S1 else S2"
Translating functional programs into functional intermediate form A functional program in a language such as PureFun-MiniJava starts in a form that obeys all the scope rules, but arguments are not atomic and variables are not unique. It is a simple matter to introduce well-scoped intermediate temporaries by a recursive walk of expression trees; dominator and SSA calculations are unnecessary.
All of the SSA-based optimization algorithms work equally well on a functional intermediate form; so will the optimizations and transformations on functional programs described in . Functional intermediate forms can also be made explicitly typed, type-checkable, and polymorphic as described in . All in all, this kind of intermediate representation has much to recommend it.