Previous Next |
Let us define canonical trees as having these properties:
No SEQ or ESEQ.
How can the ESEQ nodes be eliminated? The idea is to lift them higher and higher in the tree, until they can become SEQ nodes. Image 8.1 gives some useful identities on trees.
Identity (1) is obvious. So is identity (2): Statement s is to be evaluated; then e_{1}; then e_{2}; then the sum of the expressions is returned. If s has side effects that affect e_{1} or e_{2}, then either the left-hand side or the right-hand side of the first equation will execute those side effects before the expressions are evaluated. Identity (3) is more complicated, because of the need not to interchange the evaluations of s and e_{1}. For example, if s is MOVE(MEM(x), y) and e_{1} is BINOP(PLUS, MEM(x), z), then the program will compute a different result if s is evaluated before e_{1} instead of after. Our goal is simply to pull s out of the BINOP expression; but now (to preserve the order of evaluation) we must pull e_{1} out of the BINOP with it. To do so, we assign e_{1} into a new temporary t, and put t inside the BINOP. It may happen that s causes no side effects that can alter the result produced by e_{1}. This will happen if the temporaries and memory locations assigned by s are not referenced by e_{1} (and s and e_{1} don't both perform external I/O). In this case, identity (4) can be used. We cannot always tell if two expressions commute. For example, whether MOVE(MEM(x), y) commutes with MEM(z) depends on whether x = z, which we cannot always determine at compile time. So we conservatively approximate whether statements commute, saying either "they definitely do commute" or "perhaps they don't commute." For example, we know that any statement "definitely commutes" with the expression CONST(n), so we can use identity (4) to justify special cases like
The commute function estimates (very naively) whether a statement commutes with an expression:
static boolean commute(Tree.Stm a, Tree.Exp b) { return isNop(a) || b instanceof Tree.NAME || b instanceof Tree.CONST; } static boolean isNop(Tree.Stm a) { return a instanceof Tree.EXP && ((Tree.EXP)a).exp instanceof Tree.CONST; }
A constant commutes with any statement, and the empty statement commutes with any expression. Anything else is assumed not to commute.
In general, for each kind of Tree statement or expression we can identify the subexpressions. Then we can make rewriting rules, similar to the ones in Image 8.1, to pull the ESEQs out of the statement or expression. For example, in [e_{1}, e_{2}, ESEQ(s, e_{3})], the statement s must be pulled leftward past e_{2} and e_{1}. If they commute, we have (s; [e_{1}, e_{2}, e_{3}]). But suppose e_{2} does not commute with s; then we must have
Or if e_{2} commutes with s but e_{1} does not, we have
The reorder function takes a list of expressions and returns a pair of (statement, expression-list). The statement contains all the things that must be executed before the expression-list. As shown in these examples, this includes all the statement-parts of the ESEQs, as well as any expressions to their left with which they did not commute. When there are no ESEQs at all we will use EXP(CONST 0), which does nothing, as the statement. Algorithm Step one is to make a "subexpression-extraction" method for each kind. Step two is to make a "subexpression-insertion" method: Given an ESEQ-clean version of each subexpression, this builds a new version of the expression or statement. These will be methods of the Tree.Exp and Tree.Stm classes:
package Tree; abstract public class Exp { abstract public ExpList kids(); abstract public Exp build(ExpList kids); } abstract public class Stm { abstract public ExpList kids(); abstract public Stm build(ExpList kids); }
Each subclass Exp or Stm must implement the methods; for example,
package Tree; public class BINOP extends Exp { public int binop; public Exp left, right; public BINOP(int b, Exp l, Exp r) {binop=b; ...} public final static int PLUS=0, MINUS=1, MUL=2, DIV=3, AND=4,OR=5,LSHIFT=6,RSHIFT=7,ARSHIFT=8,XOR=9; public ExpList kids() {return new ExpList(left, new ExpList(right,null));} public Exp build(ExpList kids) { return new BINOP(binop,kids.head,kids.tail.head); } }
Other subclasses have similar (or even simpler) kids and build methods. Using these build methods, we can write functions
static Tree.Stm do_stm(Tree.Stm s) static Tree.ESEQ do_exp (Tree.Exp e)
that pull all the ESEQs out of a statement or expression, respectively. That is, do_stm uses s.kids() to get the immediate subexpressions of s, which will be an expression-list l. It then pulls all the ESEQs out of l recursively, yielding a clump of side-effecting statements s_{1} and a cleaned-up list l′. Then SEQ(s_{1}, s.build(l′)) constructs a new statement, like the original s but with no ESEQs. These functions rely on auxiliary functions reorder_stm and reorder_exp for help; see also Exercise 8.3. The left-hand operand of the MOVE statement is not considered a subexpression, because it is the destination of the statement - its value is not used by the statement. However, if the destination is a memory location, then the address acts like a source. Thus we have,
public class MOVE extends Stm { public Exp dst, src; public MOVE(Exp d, Exp s) {dst=d; src=s;} public ExpList kids() { if (dst instanceof MEM) return new ExpList(((MEM)dst).exp, new ExpList(src,null)); else return new ExpList(src,null); } public Stm build(ExpList kids) { if (dst instanceof MEM) return new MOVE(new MEM(kids.head), kids.tail.head); else return new MOVE(dst, kids.head); } }
Now, given a list of "kids", we pull the ESEQs out, from right to left.
The Tree language permits CALL nodes to be used as subexpressions. However, the actual implementation of CALL will be that each function returns its result in the same dedicated return-value register TEMP(RV). Thus, if we have
the second call will overwrite the RV register before the PLUS can be executed. We can solve this problem with a rewriting rule. The idea is to assign each return value immediately into a fresh temporary register, that is
Now the ESEQ-eliminator will percolate the MOVE up outside of its containing BINOP (etc.) expressions. This technique will generate a few extra MOVE instructions, which the register allocator () can clean up. The rewriting rule is implemented as follows: reorder replaces any occurrence of CALL(f, args) by
and calls itself again on the ESEQ. But do_stm recognizes the pattern
and does not call reorder on the CALL node in that case, but treats the f and args as the children of the MOVE node. Thus, reorder never "sees" any CALL that is already the immediate child of a MOVE. Occurrences of the pattern EXP(CALL(f, args)) are treated similarly.
Once an entire function body s_{0} is processed with do_stm, the result is a tree s_{0}′ where all the SEQ nodes are near the top (never underneath any other kind of node). The linearize function repeatedly applies the rule
The result is that s′_{0} is linearized into an expression of the form
Here the SEQ nodes provide no structuring information at all, and we can just consider this to be a simple list of statements,
where none of the s_{i} contain SEQ or ESEQ nodes. These rewrite rules are implemented by linearize, with an auxiliary function linear:
static Tree.StmList linear(Tree.SEQ s, Tree.StmList l) { return linear(s.left,linear(s.right,l)); } static Tree.StmList linear(Tree.Stm s, Tree.StmList l) { if (s instanceof Tree.SEQ) return linear((Tree.SEQ)s,l); else return new Tree.StmList(s,l); } static public Tree.StmList linearize(Tree.Stm s) { return linear(do_stm(s), null); }
JaVa | Previous Next |